Posted by jandeboevrie 1 day ago
The C++11 thread-safety guarantee on static initialization is explicitly scoped to block-local statics. That's not an implementation detail, that's the guarantee.
The __cxa_guard_acquire/release machinery in the assembly is the compiler fulfilling that contract. Move to a private static data member and you're outside that guarantee entirely. You've quietly handed that responsibility back to yourself.
Then there's the static initialization order fiasco, which is the whole reason the Meyers singleton with a local static became canonical. A block-local static initializes on first use: lazily, deterministically, thread-safely. A static data member with dynamic initialization is initialized at startup, in an order that is not specified across translation units. If anything touches Instance() during its own static initialization from a different TU, you're in UB territory. The article doesn't mention this.
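For anyone who hasn't seen the pattern spelled out, here is a minimal sketch of the block-local version. Registry, Register() and kPluginId are hypothetical names made up for illustration; nothing here is taken from the article.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical Registry type, used only to illustrate the pattern.
class Registry {
public:
    static Registry& Instance() {
        // Block-local static: constructed the first time control passes
        // through this declaration. Since C++11, concurrent first calls
        // are synchronized by the implementation.
        static Registry instance;
        return instance;
    }

    Registry(const Registry&) = delete;
    Registry& operator=(const Registry&) = delete;

    // Stores the name and returns the index assigned to it.
    std::size_t Register(const std::string& name) {
        names_.push_back(name);
        return names_.size() - 1;
    }

    std::size_t Count() const { return names_.size(); }

private:
    Registry() = default;
    std::vector<std::string> names_;
};

// A namespace-scope object with dynamic initialization. Even if this lived
// in a different translation unit, the call would be fine, because
// Instance() constructs the Registry on demand.
static const std::size_t kPluginId = Registry::Instance().Register("plugin");
```

If the Registry were instead a static data member defined in another TU, the kPluginId initializer could run before that member had been constructed.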
Real-world singleton designs also need deferred or configuration-driven initialization, optional instantiation, state recycling, and controlled teardown. A block-local static keeps those doors open. A static data member initializes unconditionally at startup: you've lost lazy init, you've lost the option not to initialize it at all, and configuration-based instantiation becomes awkward by design.
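A rough sketch of what configuration-driven, deferred initialization can look like with a block-local static. The Logger class and the APP_LOG_LEVEL variable name are assumptions for the sake of the example, not something from the article.

```cpp
#include <cstdlib>
#include <string>
#include <utility>

// Hypothetical Logger whose level comes from an environment variable.
class Logger {
public:
    static Logger& Instance() {
        // The initializer runs on the first call only, so the environment
        // is read at first use rather than before main() starts. If nobody
        // ever calls Instance(), the Logger is never constructed.
        static Logger instance(LevelFromEnv());
        return instance;
    }

    const std::string& level() const { return level_; }

private:
    explicit Logger(std::string level) : level_(std::move(level)) {}

    // APP_LOG_LEVEL is an assumed variable name, chosen for illustration.
    static std::string LevelFromEnv() {
        const char* value = std::getenv("APP_LOG_LEVEL");
        return value ? std::string(value) : std::string("info");
    }

    std::string level_;
};
```

This only covers the lazy and optional parts; state recycling and controlled teardown would need more than a plain local static.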
Honestly, if you're bottlenecking on singleton access, that's a design smell worth addressing, not the guard variable.
There's a large group of engineers who are totally unaware of Amdahl's law and are consequently obsessed with the performance implications of what are usually the least important parts of the codebase.
I learned that being in the opposite group has become (or maybe always was) somewhat unpopular, because it breaks many of the myths we have been taught for years and on which many people have built their careers. This article may or may not be an example of that. I am not reading too much into it, but profiling and identifying the actual bottlenecks seems to be a scarce skill nowadays.
I feel like the mindset you are describing is kind of an intermediate-to-senior-level thing. Sadly, a lot of programmers can get stuck there for their whole career. It's even worse when they get promoted to staff/principal level and start spreading dogma.
I 100 percent agree. If you can't show me a real world performance difference you are just spinning your wheels and wasting time.
https://web.archive.org/web/20200920132133/https://blogs.bla...
Focusing on micro-"optimizations" like this one does absolutely nothing for performance (how many times are you actually calling Instance() per frame?) and skips over the absolutely mandatory PROFILE BEFORE YOU OPTIMIZE rule.
If a coworker asked me to review this CL, my comment would be "Why are you wasting both my time and yours?"
In my view, the article is not about optimizing but about understanding how things work under the hood, which is interesting for some.
If a coworker submitted a patch to existing code, I'd be right there with you. If they submitted new code, and it just so happened to be using this more optimal strategy, I wouldn't blink twice before accepting it.
A bit like how Java people insisted on writing naive getFoo() and setFoo() methods to pretend that was different from making foo public.
I ended up using std::call_once for those cases. More boilerplate but at least you're not debugging init order at 2am.
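Roughly what that looks like, as a sketch rather than a drop-in. Service and its members are hypothetical names. Both statics are constant-initialized (std::once_flag and a default std::unique_ptr have constexpr constructors), so they are not subject to the cross-TU ordering problem themselves.

```cpp
#include <memory>
#include <mutex>

// Hypothetical Service type; the names here are illustrative.
class Service {
public:
    static Service& Instance() {
        // std::call_once runs the callable exactly once, even when several
        // threads reach this point at the same time.
        std::call_once(init_flag_, [] { instance_.reset(new Service()); });
        return *instance_;
    }

    int value() const { return value_; }

private:
    Service() : value_(42) {}
    int value_;

    static std::once_flag init_flag_;
    static std::unique_ptr<Service> instance_;
};

// Out-of-class definitions; both are constant-initialized.
std::once_flag Service::init_flag_;
std::unique_ptr<Service> Service::instance_;
```

On some toolchains you need to link with -pthread for std::call_once to work. The unique_ptr is still destroyed during static destruction, so teardown order is something you would have to handle separately.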