Likewise, pushing half-solutions like profiles, which are still pretty much a paper idea beyond what already exists in static analysers, might decrease C++'s relevance in some domains, and eventually those pushing for them might find themselves realizing that adopting Safe C++ (Circle's design) would have been a much better decision.
The problem with ISO-driven languages is who's in the room when the voting takes place.
This is not a jab against static analyzers, by all means use them, but I don't think they are a good fit as part of the language.
Further, the clang-tidy and VC++ analyses based on some of the previous work, e.g. the lifetime analysis paper from 2015, barely work and are full of false positives.
I was looking forward to it in VC++, and to this day, in the latest VC++, it still leaves too much on the table.
[0]: https://doc.rust-lang.org/reference/lifetime-elision.html
The above is enforced by Rust, which would be nice, but the conventions are easy enough if you try at all. The trouble is that most developers refuse to write anything newer than C++98.
I think the bigger mistake is equating memory safety with C++11 smart pointers. They buy you a little, but not the whole buffet. There are a lot of C++ developers who think memory safety is a skill issue, and that if you just use "best practices with C++11 or higher" then you get it, when the evidence proves the contrary.
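A minimal sketch (hypothetical snippet) of the kind of bug those "best practices" don't catch: smart pointers manage ownership, but do nothing for the references you take along the way:

#include <memory>
#include <vector>

int main() {
    std::vector<std::unique_ptr<int>> v;
    v.push_back(std::make_unique<int>(1));
    int& r = *v[0];   // reference into the owned object
    v.clear();        // the unique_ptr destroys the int
    return r;         // use-after-free, with C++11 "best practices" throughout
}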
Which is why safety profiles are so interesting: they are something I should be able to turn on/off on a file-by-file basis and thus easily force the issue.
Of course, profiles don't exist yet (and what is proposed is very different from what this article is arguing against), so it remains to be seen whether they will be adopted and, if so, how useful they will be.
I swear I'm not trying to be snarky or rude here, but is it actually a "convention" if almost nobody follows it? This seems like one example of my general issue with C++, in that it could be great if everyone agreed to a restricted subset, but of course nobody can coordinate such agreement and it doesn't exist outside companies large and important enough to enforce their own in-house C++ standards (e.g. Google).
Every once in a while someone who writes a lot of Rust will blog about some code they discovered that was `unsafe`, and after looking closely they realized it wasn't doing something that fundamentally required unsafe (and often fixing the code to be safe fixed real bugs). C++ and Rust have to leave people enough rope to hang themselves in order to solve the problems they want to solve, but that means people will find a way to do stupid things.
That’s not a human problem. It’s like saying: “this motorway is pitch black, frequently wet and slippery, and has no safety barriers between the sides, so crashes are frequent and fatal. What we have is a human problem: drivers should follow the convention of driving at 10mph, only when it doesn’t rain, and should make sure they are on the right side of the road at all times”.
In other words: you can try limiting all cars to 10mph, closing the road, automatically switching out all car tyres with skid-proof versions while in motion, or anything else.
But… just turn the god damn lights on and put up a barrier between lanes. It works on every other road.
https://devblogs.microsoft.com/oldnewthing/20241023-00/?p=11...
Notice the use of C's memcpy() function.
This is exactly the kind of post where showing best practices would be quite helpful, as education.
And don't get me started on C itself. Jesus Christ.
They are the only C++ compiler that properly supports all C++20 module use cases, while clang still doesn't do the Parallel STL from C++17, for example.
They support C17 nowadays, while many embedded folks are still slowly adopting C99.
And the UNIX story outside clang and GCC is quite lame: most compilers are still stuck on C++14, catching up to C++17.
Likewise, consoles are on C++17.
https://learn.microsoft.com/en-us/cpp/code-quality/understan...
Debugging a proc macro failure is miles easier than debugging template errors.
Not annotating is not making anything easier.
pub struct Step<'a, 'b> {
    pub name: &'a str,
    pub stage: &'b str,
    pub is_last: bool,
}
use std::cell::Cell; // for the size field below

struct Request<'a, 'b, 'c, 'd, 'e> {
    step: &'a Step<'d, 'e>,
    destination: &'c mut [u8],
    size: &'b Cell<Option<usize>>,
}
To be sure, they were seeking advice on how to simplify it, but I imagine those with a more worse-is-better technical sensibility would argue that a language simply should not allow code like that to ever be written. I also hear that higher-ranked trait bounds can get scary even within a single function signature, but I haven't had cause to actually work with them.
I think you two are ultimately talking about slightly different things: your parent is trying to point out that, even if this signature is complex, it can’t get more complex than this; one lifetime per reference means the complexity has an upper bound.
The article states "A C++ compiler can infer nothing about aliasing from a function declaration." That is true, but it assumes the compiler only looks at the function declaration. In the examples given, an analyzer could look at the function bodies and propagate the aliasing requirements upward, attaching them to the function declaration in some internal data structure. Then the analyzer ensures that those functions are used correctly at every call site. Start at leaf functions and walk your way back up the program until you're done. If you run into a situation where there is an ambiguity, you throw an error and let the developer know. Do the same for lifetimes. Heck, we just got 'auto' type inference working in C++11, shouldn't we be able to do this too?
I like not having to see and think about lifetimes and aliasing problems most of the time, and it would be nice if the compiler (or borrow checker) just kept track of those without requiring me to explicitly annotate them everywhere.
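To make that concrete, here's a sketch (names hypothetical) of the kind of requirement such an analyzer would have to dig out of a function body and check at every call site:

#include <cstddef>

// The body implies a requirement the declaration never states:
// dst and src must not overlap. The imagined analyzer would attach
// that summary to copy_n_ints and verify it at each call site.
void copy_n_ints(int* dst, const int* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

int main() {
    int buf[4] = {1, 2, 3, 4};
    copy_n_ints(buf + 1, buf, 3); // overlapping call: should be flagged
}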
From P1179: "This paper ... shows how to efficiently diagnose many common cases of dangling (use-after-free) in C++ code, using only local analysis to report them as deterministic readable errors at compile time."
Local analysis only: it's not looking into the definitions of the functions you call.
Whole program analysis is extremely complicated and costly to compute. It's not comparable to return type deduction or something like that.
This assumes no recursive functions, no virtual functions or function pointers, no external functions, etc.
> Heck, we just got 'auto' type inference working in C++11, shouldn't we be able to do this too?
Aliasing is much trickier than type inference.
For example aliasing can change over time (i.e. some variables may alias at some point but not at a later point, while types are always the same) and you want any analysis to reflect it because you will likely rely on that.
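A tiny illustration (hypothetical snippet) of that time dependence:

int main() {
    int a[2] = {0, 0};
    int* p = &a[0];
    int* q = p;    // p and q alias at this point...
    q = &a[1];     // ...and no longer alias here; an analysis must track both states
    return *p + *q;
}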
Granularity is also much more important: does a pointer alias with every element of a vector or only one? The former is surely easier to represent, but it may propagate unnecessarily and result in errors.
So effectively you have an infinite domain of places that can alias, while type inference is limited to locals, parameters, functions, and so on. And even then, aliasing is quadratic, because you want to know which pairs of places alias.
I hope you can see how this can quickly get impractical, both due to the complexity of the analysis and the fact that small imprecisions can result in very big false positives.
Even if a sufficiently advanced proof assistant could internally maintain and propagate constraints up through functions (e.g. 'vec must not alias x'), your point about small imprecisions cascading into large false positives is well made.
Bottom up constraints become increasingly difficult to untangle the further away they get from their inception, whereas top down rules such as "no mutable aliasing" are much easier to reason about locally.
Your comment will be more interesting if you expand upon it.
The profiles proposal focuses on a lack of annotations (I think there’s reasonable criticism that this isn’t achieved by it though…), and believing they can get 80% of the benefit for 20% of the effort (at least conceptually, obviously not those exact numbers). They aren’t shooting for full memory safety.
The Safe C++ proposal asks “how do we achieve 100% memory safety by default?”. And then asks what is needed to achieve that goal.
I hesitate to answer your question, but my impression is the answer is that they’re just not shooting for 100% safety, and so it’s acceptable to miss this kind of case.
I don't think they did think that. Having listened to a few podcasts with the safety profile advocates I've gotten the impression that their answer to any question about "right, but how would you actually do that?" is "well, we'll see, and in general there's other problems to think about, too!".
C++ is so complex that it's hard to think through all the implications of design proposals like this.
So practically speaking, the only way to prove a design change is to implement it and get lots of people to take it for a test drive.
But it's hard to find enough people willing to do that in earnest, so the only real way to test the idea is to make it part of the language standard.
In Sean’s “Safe C++” proposal, he extends C++ so that new code can embed new assumptions, then subsets that extension, rejecting code that would violate those assumptions, so that new safety conclusions can be drawn.
WG21 hasn't been able to standardize the restrict type qualifier, or come up with a better alternative, in over twenty years. IMO, hoping that WG21 adequately solves Safe C++ is nothing more than wishful thinking, to put it charitably.
I never said it would be easy, or probable. But I’m also the kind who hopes for the best.
This is a very political subject, and WG21 doesn't have a core team; rather, everything goes through votes.
It suffices to have the wrong headcount in the room when it is time to vote.
Safe C++ looks excellent - its adoption would go a long way toward validating his steadfast belief that C++ can evolve to keep up with the world.
What makes you say this? It seems to me like we already have a lower-overhead approach to reach the same goal (a low-level language with substantially improved semantic specificity, memory safety, etc.); namely, we have Rust, which has already improved substantially over the safety properties of C++, and offers a better-designed platform for further safety research.
Google's recent analysis of their own experience transitioning toward memory safety provides even more evidence that you don't need to fully transition to get strong safety benefits. They incentivized moving new code to memory-safe languages, and continued working to actively assure the existing memory-unsafe code they had. In practice, they found that vulnerability density in a stable codebase decays exponentially as you continue to fix bugs. So you can reap the benefits of built-in memory safety for new code while driving down latent memory unsafety in existing code to great effect. [2]
[1]: https://www.alilleybrinker.com/blog/cpp-must-become-safer/
[2]: https://security.googleblog.com/2024/09/eliminating-memory-s...
Eventually everything will be rewritten in Rust or successors thereof. It's the only approach that works, and the only approach that can work, and as the cost of bugs continues to increase, continuing to use memory-unsafe code will cease to be a viable option.
Yet the idea that a project no longer actively developed will be rewritten in Rust is not?
Rewriting it in Rust while continuing to actively develop the project is a lot more plausible than keeping it in C++ and being able to "maintain a stable codebase" but somehow still fix bugs.
(Keeping it in C++ and continuing active development is plausible, but means the project will continue to have major vulnerabilities)
The above issue is why my code is nearly all C++. C++ was the best choice we had 15 years ago, and mixing languages is hard unless you limit yourself to C interfaces (unreasonably simple, IMO). D is the only language I'm aware of that has a good C++ interoperability story (I haven't worked with D, so I don't know how it works in practice). Rust is really interesting, but it is hard to go from finishing a "hello world" tutorial in Rust to putting Rust in a multi-million-line C++ program.
I'm not happy with my situation, but I need a good way out. Plain C interfaces are terrible; C++, for all its warts, is much better (std::string has a length, so no need for strlen all over).
There is of course the "small" matter that Safe C++ doesn't exist yet, but Google's analysis, showing that requiring only new code to be safe is good enough, is a strong reason for developing a Safe C++.
I know it's intended just to express disagreement, but this comes across as extremely dismissive (to me, anyway).
Yeah, but it's also not going to be rewritten in safe C++.
Meanwhile, throwing everything away and rewriting it from scratch in another language has never been an option for any of those projects. Furthermore, even when there has been interest and buy-in to incrementally move to Rust in principle, in practice most of the time we evaluate using Rust for new features, the amount of existing code it must touch and the difficulty integrating Rust and C++ meant that we usually ended up using C++ instead.
If features of Circle C++ were standardized, or at least stabilized with wider support, we would certainly start adopting them as well.
[1] https://github.com/duneroadrunner/SaferCPlusPlus-AutoTransla...
Basically, I want a variety of approaches, not a Rust monoculture.
Not that this invalidates your broader point about Safe C++, but this particular issue could also be solved by Rust shipping clang / a frontend that can also compile C and C++.
It was so much easier (for me; I am bad at build systems) that I plan to do that for future projects.
There’s just something about `cargo run`…
pip install rust
Would be awesome! If you were worried about clang's flags not being as stable as Rust's, you could also include clang as part of llvm-tools. This would add an extra step to set up, but it is still easier than today.
Of course, in both cases there's still the work of having rustup (or rustc, depending on the strategy) set up the sysroot. I'm not saying this is trivial to do, but it would make cross-compilation so much better than today, and bring Rust to parity with Zig and Go on this front.
Improving C++'s safety means that the C++ code underlying several JVM implementations, the CLR, V8, GCC and LLVM, CUDA, Unreal, Godot, Unity, ... also gets a way to be improved without a full rewrite, which, while possible, might not be economically feasible.
But, realistically, C++ will survive for as long as global technological civilization does. There are still people out there maintaining Fortran codebases.
(also, IDK if you already realized this, but it's funny that the person you're replying to is one of the most famous Rust boosters out there, in fact probably the most famous, at least on HN).
I became a Rust fan because of its innovations in the space. That its innovations may spread elsewhere is a good thing, not a bad thing. If a language comes along that speaks to me more than Rust does, I’ll switch to that. I’m not a partisan, even if it may feel that way from the outside.
Rewriting all the existing C++ code in Rust is extremely high-cost. Practically speaking, that means it won't happen in many, many cases.
I think we want to find a more efficient way to achieve memory safety in C++.
Not to mention, Rust's safety model isn't that great. It does memory safety, which is good, but it's overly restrictive, disallowing various safe patterns. I suspect there are better safe alternatives out there for most cases, or at least could be. It would make sense to consider the alternatives before anyone rewrites something in Rust.
The "safe" patterns Rust disallows tend to not account for safe modularity - as in, they impose complex, hard-to-verify requirements on outside code if "safety" is to be preserved. This kind of thing is essentially what the "unsafe" feature in Rust is intended to address.
The problem with such a proposal is that the cost is impossibly high for many, many cases. Effectively, across the entire existing C++ code base, you get "X% rewrite it in Rust plus (1-X)% do nothing at all", where X is probably a lot closer to 0 than 1.
If your goal is to address as many vulnerabilities as possible, you might want to look for a better plan.
I don't have a ready plan, but the general approach of incrementally improving the safety of existing C++ seems likely to be more effective than rewrites to me -- it could let the X in my formula move a lot closer to 1. Possibly one of the existing mechanisms for this is already better than "RIIR".
Edit, I meant to add:
For many, many things it's not the eleventh hour. For a lot of existing C++ code, no one has reached a final decision point. Many haven't really started at all and are at the 0th hour.
Do you mind if we have more than one approach?
There were many similar issues when it came to the earlier attempts to add concepts to C++ (which would improve template dispatch), although the outcome was more about improving C++ programmer's lives, not safety.
It turned out that trying to encapsulate everything C++ functions require, even in the standard library, as a list of concepts was basically impossible. There are so many little corner cases in C++ which need representing as a concept that the list of 'concepts' a function needed often ended up being longer than the function itself.
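Even with the concepts C++20 eventually shipped, a fully constrained signature can rival its body (illustrative sketch using standard-library concepts, not from the original discussion):

#include <algorithm>
#include <iterator>

// Two constraints for a one-line body, and this is a simple case.
template <std::random_access_iterator It>
    requires std::sortable<It>
void sort2(It a, It b) {
    if (*b < *a) std::iter_swap(a, b);
}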
I've never been that big into C, although I do know it relatively well, as much as anyone can claim to, because it is a key language in anything UNIX/POSIX, and on Windows anyway.
One of the appealing things back then was the C++ frameworks provided alongside C++ compilers, pre-ISO C++98, all of them with more security consideration than what ended up landing in the standard library, e.g. bounds checking by default on collection types.
Nowadays I'd rather spend my time in other languages, and reach for C++ on a per-need basis, as other language communities take the security discussion more seriously.
However, I still love the language itself; it is one of those I usually reach for in side projects, where I can freely turn on 100% of the safety features available to me, without the usual drama from some C++ circles.
Their argument then was that iterators are just simple pointers, not a struct of two values, base + cur. You don't want to pass two values in two registers, or even on the stack. OK, but then don't call them iterators; call them mere pointers. With safe iterators, you could even add the end or the size, and you wouldn't need to pass begin() and end() to a function to iterate over a container or range. Same for ranges.
An iterator should just have been a range (with a base), so all checks could be done safely, the API would look sane, and the calls could be optimized when some values are known at compile time. Now we have the unsafe iterators, with the aliasing mess, plus ranges, which are still unsafe and ill-designed. Thankfully I'm not in the library working group, because I would have had heart attacks a long time ago over their incompetence.
My CTL (the STL in C) uses safe iterators, and is still comparable in performance and size to C++ containers. Wrong aliasing and API usage is detected, in many cases also at compile time.
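To make the shape concrete, a minimal sketch of such a fat, range-carrying iterator in C++ (not CTL's actual layout, just the idea):

#include <cassert>

// A checked iterator that knows its bounds, so dereference and
// advance can be validated, and the checks elided when the
// compiler can prove the position is in range.
template <typename T>
struct safe_iter {
    T* base; // start of the underlying storage
    T* cur;  // current position
    T* end;  // one past the last element

    T& operator*() const {
        assert(cur >= base && cur < end && "dereference out of range");
        return *cur;
    }
    safe_iter& operator++() {
        assert(cur < end && "advancing past the end");
        ++cur;
        return *this;
    }
};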
They have two goals:
1. Make primitives in the language as safe as they can.
2. Be as fast as corresponding completely unsafe C code.
These goals are obviously in opposition. Sometimes, if you're lucky, you can improve safety completely at compile time and after the safety is proven, the compiler eliminates everything with no overhead. But often you can't. And when you can't, C++ folks tend to prioritize 2 over 1.
You could definitely argue that that's the wrong choice. At the same time, that choice is arguably the soul of C++. Making a different choice there would fundamentally change the identity of the language.
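The standard library's own indexing is the canonical example of choosing 2 over 1:

#include <vector>

int main() {
    std::vector<int> v = {1, 2, 3};
    // int a = v[3];  // unchecked: silent undefined behaviour (goal 2 wins)
    int b = v.at(3);  // checked: throws std::out_of_range (goal 1 wins)
    return b;
}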
But I suspect that the larger issue here is cultural. Every organization has some foundational experiences that help define the group's identity and culture. For C++, the fact that the language was able to succeed at all instead of withering away like so many other C competitors did is because it ruthlessly prioritized performance and C compatibility over all other factors.
Back in the early days of C++, C programmers wouldn't sacrifice an ounce of performance to get onto a "better" language. Their identity as close-to-the-metal programmers was based in part on being able to squeeze more out of a CPU than anyone else could. And, certainly, at the time, that really was valuable when computers were three orders of magnitude slower than they are today.
That culture still pervades C++ where everyone is afraid of a performance death of a thousand cuts.
So the language has sort of wedged itself into an untenable space where it refuses to be any slower than completely guardrail-less machine code, but where it's also trying to be safer.
I suspect that long-term, it's an evolutionary dead end. Given the state of hardware (fast) and computer security failures (catastrophically harmful), it's worth paying some amount of runtime cost for safer languages. If you need to pay an extra buck or two for a slightly faster chip, but you don't leak national security secrets and go to jail, or leak personal health information and get sued for millions... buy the damn chip.
That's why the C++11 move is not very good. The safe "destructive" move you see in Rust wasn't some novelty that had never been imagined previously; it isn't slower or more complicated; it's exactly what programmers wanted at the time. However, C++ could not deliver it compatibly, so they got the C++11 move (which is more expensive and leaves a trail of empty husk objects behind) instead.
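The husk is easy to see (illustrative snippet): the moved-from object still exists, is merely in a "valid but unspecified" state, and still has its destructor run:

#include <string>
#include <utility>

int main() {
    std::string s = "a reasonably long string";
    std::string t = std::move(s); // s is now a valid-but-unspecified husk
    s.clear();                    // still legal to poke at it
    // both s and t run their destructors: the husk never really goes away
}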
You're correct that the big issue is culture. Rust's safety culture is why Rust is safe, Rust's safety technology merely† enables that culture to thrive and produce software with excellent performance. The "Safe C++" proposal would grant C++ the same technology but cannot gift it the same culture.
However, I think in many and perhaps even most cases you're wrong to think C++ is preferring better performance over safety; instead, the committee has learned to associate unsafe outcomes with performance and has falsely concluded that unsafe outcomes somehow enable or engender performance when that's often not so. The ISO documents do not specify a faster language; they specify a less safe language and just hope that's faster.
In practice this has a perverse effect. Knowing the language is so unsafe, programmers write paranoid software in an attempt to handle the many risks haunting them. So you will find some Rust code which has six run-time checks to deliver safety - from safe library code, and then the comparable C++ code has fourteen run-time checks written by the coder, but they missed two, so it's still unsafe but it's also slower.
I read a piece of Rust documentation for an unsafe method defined on the integers the other day which stuck with me for these conversations. The documentation points out that instead of laboriously checking if you're in a case where the unsafe code would be correct but faster, and if so calling the unsafe function, you can just call the safe function - which already does that for you.
† It's very impressive technology, but I say "merely" here only to emphasise that the technology is worth nothing without the culture. The technology has no problem with me labelling unsafe things (functions, traits, attributes now) as safe, it's just a label, the choice to ensure they're labelled unsafe is cultural.
However, GC loses determinism, so if you have non-memory resources where determinism matters you need the same mechanism anyway, and something like a "defer" statement is a poor substitute for the deterministic destruction in languages which have that.
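A minimal sketch of that determinism in C++ (hypothetical type, plain RAII):

#include <cstdio>

// The file is closed at a deterministic point (end of scope),
// not whenever a collector gets around to it.
struct File {
    std::FILE* f;
    explicit File(const char* path) : f(std::fopen(path, "r")) {}
    ~File() { if (f) std::fclose(f); }
};

int main() {
    {
        File log("data.txt");
        // ... use log.f ...
    } // fclose runs exactly here, every time
}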
Determinism can be much more important than peak performance for some problems. When you see people crowing about "lock free" or even "wait free" algorithms, the peak performance of these algorithms is often terrible, but that's not why we want them. They are deterministic, which means we can say definite things about what will happen and not just hand-wave.
I wonder what "comparable" means there? Because, for instance, MSVC, libstdc++, and libc++ all support some kind of safe iterators, but they are definitely not usable in production due to the heavy performance cost incurred.
This is a fundamental problem in C++ where a range is specified by the starting point and the ending point. This is because iterators in C++ are abstractions of a pointer.
D took a different approach. A range in D is an abstraction of an array. An array is specified by its starting point and its length. This inherently solves points one and two (not sure about three).
Sort then has a prototype of:
Range sort(Range);
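For comparison, C++20's std::span has the same start-plus-length shape, so the idea ports over (sketch, not from the original comment):

#include <algorithm>
#include <span>

// The view carries its own length, so no separate end pointer
// needs to be passed around; cf. D's `Range sort(Range);`.
std::span<int> sort_span(std::span<int> r) {
    std::sort(r.begin(), r.end());
    return r;
}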
My thought (which is apparently wrong) is that the `const int& x` refers to memory that might be freed during `vector::push_back`. Then, when we go to construct the new element that is a copy of `x`, it might be invalid to read `x`. No?
Is this related to how a reference-to-const on the stack can extend the lifetime of a temporary to which it's bound? I didn't think that function parameters had this property (i.e. an implicit copy).
I believe the reason f3 is safe is that the standard says that the iterators/references are invalidated after push_back – so push_back needs to be carefully written to accept aliasing pointers.
I am pretty sure if I were writing my own push_back it would do something like "reserve(size() + 1), copy element into the new place", and it would have different aliasing requirements...
(For me this is a good example of how subtle such things are)
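A simplified sketch of the care required (hypothetical code, not any real implementation): grow-then-copy breaks under v.push_back(v[0]), so the element must be copied while its storage is still alive:

#include <cstddef>
#include <utility>

// Naive order: reallocate, then copy x -- but if x points into the
// old buffer, it's dangling by the time we copy. Careful order:
// copy x into the new buffer first, then migrate the old elements.
template <typename T>
void careful_push_back(T*& data, std::size_t& size, std::size_t& cap, const T& x) {
    if (size == cap) {
        std::size_t new_cap = cap ? cap * 2 : 1;
        T* new_data = new T[new_cap];
        new_data[size] = x;                     // copy x while it's still valid
        for (std::size_t i = 0; i < size; ++i)
            new_data[i] = std::move(data[i]);   // then move old elements over
        delete[] data;
        data = new_data;
        cap = new_cap;
    } else {
        data[size] = x;
    }
    ++size;
}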
The difference is what these APIs do when we're making small incremental growth choices. Vec::reserve_exact and std::vector's reserve both grow to the exact size asked for, so if we do this for each of eight entries inserted, we grow 1, 2, 3, 4, 5, 6, 7, 8 -- we're always paying to grow.
However, when Vec::reserve needs to grow, it either doubles or grows to the exact size asked for if that's bigger than a double. So for the same pattern we grow 1, 2, 4, no growth, 8, no growth, no growth, no growth.
There's no way to fix C++ std::vector without providing a new API, it's just a design goof in this otherwise very normal growable array type. You can somewhat hack around it, but in practice people will just advise you to never bother using reserve except once up front in C++, whereas you will get a performance win in Rust by using reserve to reserve space.
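You can watch the anti-pattern happen via capacity() (illustrative; exact capacities are implementation-defined):

#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v;
    for (int i = 0; i < 8; ++i) {
        v.reserve(v.size() + 1); // defeats geometric growth: can reallocate every time
        v.push_back(i);
        std::printf("size=%zu capacity=%zu\n", v.size(), v.capacity());
    }
}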
Ah, I'll have to check the standard. Thanks.