Likewise, pushing half-solutions like profiles, which are still pretty much a paper idea beyond what already exists in static analysers, might decrease C++'s relevance in some domains, and eventually those pushing for them might find themselves realizing that adopting Safe C++ (Circle's design) would have been a much better decision.
The problem with ISO-driven languages is who's in the room when the voting takes place.
This is not a jab against static analyzers, by all means use them, but I don't think they are a good fit as part of the language.
Further, the clang-tidy and VC++ analyses based on some of the previous work, e.g. the lifetime analysis paper from 2015, barely work and are full of false positives.
I was looking forward to it in VC++, and to this day, in the latest VC++, it still leaves too much on the table.
[0]: https://doc.rust-lang.org/reference/lifetime-elision.html
The above is enforced by Rust, which would be nice, but the conventions are easy enough if you try at all. The trouble is that most developers refuse to write anything newer than C++98.
I think the bigger mistake is equating memory safety with C++11 smart pointers. They buy you a little, but not the whole buffet. There are a lot of C++ developers who think memory safety is a skill issue, and that if you just use "best practices with C++11 or higher" then you get it, when the evidence proves the contrary.
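A minimal sketch (hypothetical snippet) of the kind of bug those "best practices" don't catch: smart pointers manage ownership, but do nothing for the references you take along the way:

#include <memory>
#include <vector>

int main() {
    std::vector<std::unique_ptr<int>> v;
    v.push_back(std::make_unique<int>(1));
    int& r = *v[0];   // reference into the owned object
    v.clear();        // the unique_ptr destroys the int
    return r;         // use-after-free, with C++11 "best practices" throughout
}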
Which is why safety profiles are so interesting: they are something I should be able to turn on/off on a file-by-file basis and thus easily force the issue.
Of course, profiles don't exist yet (and what is proposed is very different from what this article is arguing against), so it remains to be seen whether they will be adopted and, if so, how useful they will be.
I swear I'm not trying to be snarky or rude here, but is it actually a "convention" if almost nobody follows it? This seems like one example of my general issue with C++, in that it could be great if everyone agreed to a restricted subset, but of course nobody can coordinate such agreement and it doesn't exist outside companies large and important enough to enforce their own in-house C++ standards (e.g. Google).
Every once in a while someone who writes a lot of Rust will blog about some code they discovered that was `unsafe`, and after looking closely they realized it wasn't doing something that fundamentally required unsafe (and often fixing the code to be safe fixed real bugs). C++ and Rust have to leave people enough rope to hang themselves in order to solve the problems they want to solve, but that means people will find a way to do stupid things.
That’s not a human problem. It’s like saying: “this motorway is pitch black, frequently wet and slippery, and has no safety barriers between the sides, so crashes are frequent and fatal. What we have is a human problem: drivers should follow the convention of driving at 10mph, only when it doesn’t rain, and should make sure they are on the right side of the road at all times”.
In other words: you can try limiting all cars to 10mph, closing the road, automatically switching out all car tyres with skid-proof versions while in motion, or anything else.
But… just turn the god damn lights on and put up a barrier between lanes. It works on every other road.
https://devblogs.microsoft.com/oldnewthing/20241023-00/?p=11...
Notice the use of C's memcpy() function.
This is exactly the kind of post where showing best practices would be quite helpful, as education.
And don't get me started on C itself. Jesus Christ.
They are the only C++ compiler that properly supports all C++20 module use cases, while clang still doesn't do the Parallel STL from C++17, for example.
They support C17 nowadays, while many embedded folks are still slowly adopting C99.
And the UNIX story outside clang and GCC is quite lame: most compilers are still stuck on C++14, catching up to C++17.
Likewise, consoles are on C++17.
https://learn.microsoft.com/en-us/cpp/code-quality/understan...
Debugging a proc macro failure is miles easier than debugging template errors.
Not annotating is not making anything easier.
pub struct Step<'a, 'b> {
    pub name: &'a str,
    pub stage: &'b str,
    pub is_last: bool,
}
use std::cell::Cell; // for the size field below

struct Request<'a, 'b, 'c, 'd, 'e> {
    step: &'a Step<'d, 'e>,
    destination: &'c mut [u8],
    size: &'b Cell<Option<usize>>,
}
To be sure, they were seeking advice on how to simplify it, but I imagine those with a more worse-is-better technical sensibility would argue that a language simply should not allow code like that to ever be written. I also hear that higher-ranked trait bounds can get scary even within a single function signature, but I haven't had cause to actually work with them.
I think you two are ultimately talking about slightly different things: your parent is trying to point out that, even if this signature is complex, it can’t get more complex than this; one lifetime per reference means the complexity has an upper bound.
The article states "A C++ compiler can infer nothing about aliasing from a function declaration." That is true, but it assumes the compiler only looks at the function declaration. In the examples given, an analyzer could look at the function bodies and propagate the aliasing requirements upward, attaching them to the function declaration in some internal data structure. Then the analyzer ensures that those functions are used correctly at every call site. Start at leaf functions and walk your way back up the program until you're done. If you run into a situation where there is an ambiguity, you throw an error and let the developer know. Do the same for lifetimes. Heck, we just got 'auto' type inference working in C++11, shouldn't we be able to do this too?
I like not having to see and think about lifetimes and aliasing problems most of the time, and it would be nice if the compiler (or borrow checker) just kept track of those without requiring me to explicitly annotate them everywhere.
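To make that concrete, here's a sketch (names hypothetical) of the kind of requirement such an analyzer would have to dig out of a function body and check at every call site:

#include <cstddef>

// The body implies a requirement the declaration never states:
// dst and src must not overlap. The imagined analyzer would attach
// that summary to copy_n_ints and verify it at each call site.
void copy_n_ints(int* dst, const int* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

int main() {
    int buf[4] = {1, 2, 3, 4};
    copy_n_ints(buf + 1, buf, 3); // overlapping call: should be flagged
}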
From P1179: "This paper ... shows how to efficiently diagnose many common cases of dangling (use-after-free) in C++ code, using only local analysis to report them as deterministic readable errors at compile time."
Local analysis only: it's not looking into the definitions of the functions you call.
Whole program analysis is extremely complicated and costly to compute. It's not comparable to return type deduction or something like that.
This assumes no recursive functions, no virtual functions or function pointers, no external functions, etc.
> Heck, we just got 'auto' type inference working in C++11, shouldn't we be able to do this too?
Aliasing is much trickier than type inference.
For example aliasing can change over time (i.e. some variables may alias at some point but not at a later point, while types are always the same) and you want any analysis to reflect it because you will likely rely on that.
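A tiny illustration (hypothetical snippet) of that time dependence:

int main() {
    int a[2] = {0, 0};
    int* p = &a[0];
    int* q = p;    // p and q alias at this point...
    q = &a[1];     // ...and no longer alias here; an analysis must track both states
    return *p + *q;
}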
Granularity is also much more important: does a pointer alias with every element of a vector or only one? The former is surely easier to represent, but it may propagate unnecessarily and result in errors.
So effectively you have an infinite domain of places that can alias, while type inference is limited to locals, parameters, functions, and so on. And even then, aliasing is quadratic, because you want to know which pairs of places alias.
I hope you can see how this can quickly get impractical, both due to the complexity of the analysis and the fact that small imprecisions can result in very big false positives.
Even if a sufficiently advanced proof assistant could internally maintain and propagate constraints up through functions (e.g. 'vec must not alias x'), your point about small imprecisions cascading into large false positives is well made.
Bottom up constraints become increasingly difficult to untangle the further away they get from their inception, whereas top down rules such as "no mutable aliasing" are much easier to reason about locally.
Your comment will be more interesting if you expand upon it.
The profiles proposal focuses on a lack of annotations (I think there’s reasonable criticism that this isn’t achieved by it though…), and believing they can get 80% of the benefit for 20% of the effort (at least conceptually, obviously not those exact numbers). They aren’t shooting for full memory safety.
The Safe C++ proposal asks “how do we achieve 100% memory safety by default?”. And then asks what is needed to achieve that goal.
I hesitate to answer your question, but my impression is the answer is that they’re just not shooting for 100% safety, and so it’s acceptable to miss this kind of case.
I don't think they did think that. Having listened to a few podcasts with the safety profile advocates I've gotten the impression that their answer to any question about "right, but how would you actually do that?" is "well, we'll see, and in general there's other problems to think about, too!".
C++ is so complex that it's hard to think through all the implications of design proposals like this.
So practically speaking, the only way to prove a design change is to implement it and get lots of people to take it for a test drive.
But it's hard to find enough people willing to do that in earnest, so the only real way to test the idea is to make it part of the language standard.
In Sean’s “Safe C++” proposal, he extends C++ so that new code can embed new assumptions, then subsets that extension, rejecting code that would violate those assumptions, so that new safety conclusions can be drawn.
WG21 hasn't been able to standardize the restrict type qualifier, or come up with a better alternative, in over twenty years. IMO, hoping that WG21 adequately solves Safe C++ is nothing more than wishful thinking, to put it charitably.
I never said it would be easy, or probable. But I’m also the kind who hopes for the best.
This is a very political subject, and WG21 doesn't have a core team; rather, everything goes through votes.
It suffices to have the wrong headcount in the room when it is time to vote.
Safe C++ looks excellent - its adoption would go a long way toward validating his steadfast belief that C++ can evolve to keep up with the world.
What makes you say this? It seems to me like we already have a lower-overhead approach to reach the same goal (a low-level language with substantially improved semantic specificity, memory safety, etc.); namely, we have Rust, which has already improved substantially over the safety properties of C++, and offers a better-designed platform for further safety research.
Google's recent analysis of their own experience transitioning toward memory safety provides even more evidence that you don't need to fully transition to get strong safety benefits. They incentivized moving new code to memory-safe languages, and continued working to actively assure the existing memory-unsafe code they had. In practice, they found that vulnerability density in a stable codebase decays exponentially as you continue to fix bugs. So you can reap the benefits of built-in memory safety for new code while driving down latent memory unsafety in existing code to great effect. [2]
[1]: https://www.alilleybrinker.com/blog/cpp-must-become-safer/
[2]: https://security.googleblog.com/2024/09/eliminating-memory-s...
Eventually everything will be rewritten in Rust or successors thereof. It's the only approach that works, and the only approach that can work, and as the cost of bugs continues to increase, continuing to use memory-unsafe code will cease to be a viable option.
Yet the idea that a project no longer actively developed will be rewritten in Rust is not?
Rewriting it in Rust while continuing to actively develop the project is a lot more plausible than keeping it in C++ and being able to "maintain a stable codebase" but somehow still fix bugs.
(Keeping it in C++ and continuing active development is plausible, but means the project will continue to have major vulnerabilities)
The above issue is why my code is nearly all C++. C++ was the best choice we had 15 years ago, and mixing languages is hard unless you limit yourself to C interfaces (unreasonably simple, IMO). D is the only language I'm aware of that has a good C++ interoperability story (I haven't worked with D, so I don't know how it works in practice). Rust is really interesting, but it is hard to go from finishing a "hello world" tutorial in Rust to putting Rust in a multi-million-line C++ program.
I'm not happy with my situation, but I need a good way out. Plain C interfaces are terrible; C++, for all its warts, is much better (std::string has a length, so no need for strlen all over).
There is of course the "small" matter that Safe C++ doesn't exist yet, but Google's analysis, showing that requiring only new code to be safe is good enough, is a strong reason for developing a Safe C++.
I know it's intended just to express disagreement, but this comes across as extremely dismissive (to me, anyway).
Yeah, but it's also not going to be rewritten in safe C++.
Meanwhile, throwing everything away and rewriting it from scratch in another language has never been an option for any of those projects. Furthermore, even when there has been interest and buy-in to incrementally move to Rust in principle, in practice most of the time we evaluate using Rust for new features, the amount of existing code it must touch and the difficulty integrating Rust and C++ meant that we usually ended up using C++ instead.
If features of Circle C++ were standardized, or at least stabilized with wider support, we would certainly start adopting them as well.
[1] https://github.com/duneroadrunner/SaferCPlusPlus-AutoTransla...
Basically, I want a variety of approaches, not a Rust monoculture.
Not that this invalidates your broader point about Safe C++, but this particular issue could also be solved by Rust shipping clang / a frontend that can also compile C and C++.
It was so much easier (for me; I am bad at build systems) that I plan to do that for future projects.
There’s just something about `cargo run`…
pip install rust
Would be awesome! If you were worried about clang's flags not being as stable as Rust's, you could also include clang as part of llvm-tools. This would add an extra step to set up, but it is still easier than today.
Of course, in both cases there's still the work of having rustup (or rustc, depending on the strategy) set up the sysroot. I'm not saying this is trivial to do, but it would make cross-compilation so much better than today, and bring Rust to parity with Zig and Go on this front.
Improving C++'s safety means that the C++ code underlying several JVM implementations, the CLR, V8, GCC and LLVM, CUDA, Unreal, Godot, Unity, ... also gets a way to be improved without a full rewrite, which, while possible, might not be economically feasible.
But, realistically, C++ will survive for as long as global technological civilization does. There are still people out there maintaining Fortran codebases.
(also, IDK if you already realized this, but it's funny that the person you're replying to is one of the most famous Rust boosters out there, in fact probably the most famous, at least on HN).
I became a Rust fan because of its innovations in the space. That its innovations may spread elsewhere is a good thing, not a bad thing. If a language comes along that speaks to me more than Rust does, I’ll switch to that. I’m not a partisan, even if it may feel that way from the outside.
Rewriting all the existing C++ code in Rust is extremely high-cost. Practically speaking, that means it won't happen in many, many cases.
I think we want to find a more efficient way to achieve memory safety in C++.
Not to mention, Rust's safety model isn't that great. It does memory safety, which is good, but it's overly restrictive, disallowing various safe patterns. I suspect there are better safe alternatives out there for most cases, or at least could be. It would make sense to consider the alternatives before anyone rewrites something in Rust.
The "safe" patterns Rust disallows tend to not account for safe modularity - as in, they impose complex, hard-to-verify requirements on outside code if "safety" is to be preserved. This kind of thing is essentially what the "unsafe" feature in Rust is intended to address.
The problem with such a proposal is that the cost is impossibly high for many, many cases. Effectively, across the entire existing C++ code base, you get "X% rewrite it in Rust plus (1-X)% do nothing at all", where X is probably a lot closer to 0 than 1.
If your goal is to address as many vulnerabilities as possible, you might want to look for a better plan.
I don't have a ready plan, but the general approach of incrementally improving the safety of existing C++ seems likely to be more effective than rewrites to me -- it could let the X in my formula move a lot closer to 1. Possibly one of the existing mechanisms for this is already better than "RIIR".
Edit, I meant to add:
For many, many things it's not the eleventh hour. For a lot of existing C++ code, no one has reached a final decision point. Many haven't really started at all and are at the 0th hour.
Do you mind if we have more than one approach?
There were many similar issues when it came to the earlier attempts to add concepts to C++ (which would improve template dispatch), although the outcome was more about improving C++ programmer's lives, not safety.
It turned out that trying to encapsulate everything C++ functions require, even in the standard library, as a list of concepts was basically impossible. There are so many little corner cases in C++ which need representing as a concept that the list of 'concepts' a function needed often ended up being longer than the function itself.
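Even with the concepts C++20 eventually shipped, a fully constrained signature can rival its body (illustrative sketch using standard-library concepts, not from the original discussion):

#include <algorithm>
#include <iterator>

// Two constraints for a one-line body, and this is a simple case.
template <std::random_access_iterator It>
    requires std::sortable<It>
void sort2(It a, It b) {
    if (*b < *a) std::iter_swap(a, b);
}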
I've never been that big into C, although I do know it relatively well, as much as anyone can claim to, because it is a key language in anything UNIX/POSIX, and on Windows anyway.
One of the appealing things back then was the C++ frameworks provided alongside C++ compilers, pre-ISO C++98, all of them with more security consideration than what ended up landing in the standard library, e.g. bounds checking by default on collection types.
Nowadays I'd rather spend my time in other languages, and reach for C++ on a per-need basis, as other language communities take the security discussion more seriously.
However, I still love the language itself; it is one of those I usually reach for in side projects, where I can freely turn on 100% of the safety features available to me, without the usual drama from some C++ circles.
Their argument then was that iterators are just simple pointers, not a struct of two values, base + cur. You don't want to pass two values in two registers, or even on the stack. OK, but then don't call them iterators; call them mere pointers. With safe iterators, you could even add the end or the size, and you wouldn't need to pass begin() and end() to a function to iterate over a container or range. Same for ranges.
An iterator should just have been a range (with a base), so all checks could be done safely, the API would look sane, and the calls could be optimized when some values are known at compile time. Now we have the unsafe iterators, with the aliasing mess, plus ranges, which are still unsafe and ill-designed. Thankfully I'm not in the library working group, because I would have had heart attacks a long time ago over their incompetence.
My CTL (the STL in C) uses safe iterators, and is still comparable in performance and size to C++ containers. Wrong aliasing and API usage is detected, in many cases also at compile time.
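To make the shape concrete, a minimal sketch of such a fat, range-carrying iterator in C++ (not CTL's actual layout, just the idea):

#include <cassert>

// A checked iterator that knows its bounds, so dereference and
// advance can be validated, and the checks elided when the
// compiler can prove the position is in range.
template <typename T>
struct safe_iter {
    T* base; // start of the underlying storage
    T* cur;  // current position
    T* end;  // one past the last element

    T& operator*() const {
        assert(cur >= base && cur < end && "dereference out of range");
        return *cur;
    }
    safe_iter& operator++() {
        assert(cur < end && "advancing past the end");
        ++cur;
        return *this;
    }
};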
They have two goals:
1. Make primitives in the language as safe as they can.
2. Be as fast as corresponding completely unsafe C code.
These goals are obviously in opposition. Sometimes, if you're lucky, you can improve safety completely at compile time and after the safety is proven, the compiler eliminates everything with no overhead. But often you can't. And when you can't, C++ folks tend to prioritize 2 over 1.
You could definitely argue that that's the wrong choice. At the same time, that choice is arguably the soul of C++. Making a different choice there would fundamentally change the identity of the language.
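The standard library's own indexing is the canonical example of choosing 2 over 1:

#include <vector>

int main() {
    std::vector<int> v = {1, 2, 3};
    // int a = v[3];  // unchecked: silent undefined behaviour (goal 2 wins)
    int b = v.at(3);  // checked: throws std::out_of_range (goal 1 wins)
    return b;
}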
But I suspect that the larger issue here is cultural. Every organization has some foundational experiences that help define the group's identity and culture. For C++, the fact that the language was able to succeed at all instead of withering away like so many other C competitors did is because it ruthlessly prioritized performance and C compatibility over all other factors.
Back in the early days of C++, C programmers wouldn't sacrifice an ounce of performance to get onto a "better" language. Their identity as close-to-the-metal programmers was based in part on being able to squeeze more out of a CPU than anyone else could. And, certainly, at the time, that really was valuable when computers were three orders of magnitude slower than they are today.
That culture still pervades C++ where everyone is afraid of a performance death of a thousand cuts.
So the language has sort of wedged itself into an untenable space where it refuses to be any slower than completely guardrail-less machine code, but where it's also trying to be safer.
I suspect that long-term, it's an evolutionary dead end. Given the state of hardware (fast) and computer security failures (catastrophically harmful), it's worth paying some amount of runtime cost for safer languages. If you need to pay an extra buck or two for a slightly faster chip, but you don't leak national security secrets and go to jail, or leak personal health information and get sued for millions... buy the damn chip.
That's why the C++11 move is not very good. The safe "destructive" move you see in Rust wasn't some novelty that had never been imagined previously; it isn't slower or more complicated; it's exactly what programmers wanted at the time. However, C++ could not deliver it compatibly, so they got the C++11 move (which is more expensive and leaves a trail of empty husk objects behind) instead.
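The husk is easy to see (illustrative snippet): the moved-from object still exists, is merely in a "valid but unspecified" state, and still has its destructor run:

#include <string>
#include <utility>

int main() {
    std::string s = "a reasonably long string";
    std::string t = std::move(s); // s is now a valid-but-unspecified husk
    s.clear();                    // still legal to poke at it
    // both s and t run their destructors: the husk never really goes away
}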
You're correct that the big issue is culture. Rust's safety culture is why Rust is safe, Rust's safety technology merely† enables that culture to thrive and produce software with excellent performance. The "Safe C++" proposal would grant C++ the same technology but cannot gift it the same culture.
However, I think in many and perhaps even most cases you're wrong to think C++ is preferring better performance over safety; instead, the committee has learned to associate unsafe outcomes with performance and has falsely concluded that unsafe outcomes somehow enable or engender performance when that's often not so. The ISO documents do not specify a faster language; they specify a less safe language and just hope that's faster.
In practice this has a perverse effect. Knowing the language is so unsafe, programmers write paranoid software in an attempt to handle the many risks haunting them. So you will find some Rust code which has six run-time checks to deliver safety - from safe library code, and then the comparable C++ code has fourteen run-time checks written by the coder, but they missed two, so it's still unsafe but it's also slower.
I read a piece of Rust documentation for an unsafe method defined on the integers the other day which stuck with me for these conversations. The documentation points out that instead of laboriously checking if you're in a case where the unsafe code would be correct but faster, and if so calling the unsafe function, you can just call the safe function - which already does that for you.
† It's very impressive technology, but I say "merely" here only to emphasise that the technology is worth nothing without the culture. The technology has no problem with me labelling unsafe things (functions, traits, attributes now) as safe, it's just a label, the choice to ensure they're labelled unsafe is cultural.
However, GC loses determinism, so if you have non-memory resources where determinism matters you need the same mechanism anyway, and something like a "defer" statement is a poor substitute for the deterministic destruction in languages which have that.
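A minimal sketch of that determinism in C++ (hypothetical type, plain RAII):

#include <cstdio>

// The file is closed at a deterministic point (end of scope),
// not whenever a collector gets around to it.
struct File {
    std::FILE* f;
    explicit File(const char* path) : f(std::fopen(path, "r")) {}
    ~File() { if (f) std::fclose(f); }
};

int main() {
    {
        File log("data.txt");
        // ... use log.f ...
    } // fclose runs exactly here, every time
}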
Determinism can be much more important than peak performance for some problems. When you see people crowing about "lock free" or even "wait free" algorithms, the peak performance of these algorithms is often terrible, but that's not why we want them. They are deterministic, which means we can say definite things about what will happen and not just hand-wave.
I wonder what "comparable" means there? Because, for instance, MSVC, libstdc++, and libc++ all support some kind of safe iterators, but they are definitely not usable in production due to the heavy performance cost incurred.
This is a fundamental problem in C++ where a range is specified by the starting point and the ending point. This is because iterators in C++ are abstractions of a pointer.
D took a different approach. A range in D is an abstraction of an array. An array is specified by its starting point and its length. This inherently solves points one and two (not sure about three).
Sort then has a prototype of:
Range sort(Range);
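For comparison, C++20's std::span has the same start-plus-length shape, so the idea ports over (sketch, not from the original comment):

#include <algorithm>
#include <span>

// The view carries its own length, so no separate end pointer
// needs to be passed around; cf. D's `Range sort(Range);`.
std::span<int> sort_span(std::span<int> r) {
    std::sort(r.begin(), r.end());
    return r;
}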
My thought (which is apparently wrong) is that the `const int& x` refers to memory that might be freed during `vector::push_back`. Then, when we go to construct the new element that is a copy of `x`, it might be invalid to read `x`. No?
Is this related to how a reference-to-const on the stack can extend the lifetime of a temporary to which it's bound? I didn't think that function parameters had this property (i.e. an implicit copy).
I believe the reason f3 is safe is that the standard says that the iterators/references are invalidated after push_back – so push_back needs to be carefully written to accept aliasing pointers.
I am pretty sure if I were writing my own push_back it would do something like "reserve(size() + 1), copy element into the new place", and it would have different aliasing requirements...
(For me this is a good example of how subtle such things are)
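A simplified sketch of the care required (hypothetical code, not any real implementation): grow-then-copy breaks under v.push_back(v[0]), so the element must be copied while its storage is still alive:

#include <cstddef>
#include <utility>

// Naive order: reallocate, then copy x -- but if x points into the
// old buffer, it's dangling by the time we copy. Careful order:
// copy x into the new buffer first, then migrate the old elements.
template <typename T>
void careful_push_back(T*& data, std::size_t& size, std::size_t& cap, const T& x) {
    if (size == cap) {
        std::size_t new_cap = cap ? cap * 2 : 1;
        T* new_data = new T[new_cap];
        new_data[size] = x;                     // copy x while it's still valid
        for (std::size_t i = 0; i < size; ++i)
            new_data[i] = std::move(data[i]);   // then move old elements over
        delete[] data;
        data = new_data;
        cap = new_cap;
    } else {
        data[size] = x;
    }
    ++size;
}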
The difference is what these APIs do when we're making small incremental growth choices. Vec::reserve_exact and std::vector's reserve both grow to the exact size asked for, so if we do this for each of eight entries inserted, we grow 1, 2, 3, 4, 5, 6, 7, 8 -- we're always paying to grow.
However, when Vec::reserve needs to grow, it either doubles or grows to the exact size asked for if that's bigger than a double. So for the same pattern we grow 1, 2, 4, no growth, 8, no growth, no growth, no growth.
There's no way to fix C++ std::vector without providing a new API, it's just a design goof in this otherwise very normal growable array type. You can somewhat hack around it, but in practice people will just advise you to never bother using reserve except once up front in C++, whereas you will get a performance win in Rust by using reserve to reserve space.
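You can watch the anti-pattern happen via capacity() (illustrative; exact capacities are implementation-defined):

#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v;
    for (int i = 0; i < 8; ++i) {
        v.reserve(v.size() + 1); // defeats geometric growth: can reallocate every time
        v.push_back(i);
        std::printf("size=%zu capacity=%zu\n", v.size(), v.capacity());
    }
}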
Ah, I'll have to check the standard. Thanks.