Posted by HeliumHydride 12/3/2025
Wouldn't it be nice if we could transform back from the canonical form into the most readable code? In the example, that would convert all the functions into x + y.
https://www.embeddedrelated.com/thread/4749/when-and-how-to-...
It's super cool to see this in practice, and for me it helps build trust that the compiler does the right thing, rather than me trying to micro-optimize my code and peppering inline qualifiers everywhere.
In the OP's examples, instead of optimization, what I would prefer is a separate analysis tool that reports what optimizations are possible, together with a compiler that makes it easy to write both high-level and machine code as necessary. Then, instead of the compiler opaquely rewriting your code for you, it helps guide you into writing optimal code at the source level. This, for me, leads to a better equilibrium where you can express your intent at a high level and then, as needed, perform lower-level optimizations in a transparent and deterministic way.
For me, the big value of existing optimizing compilers is that I can use them to figure out which instructions might be optimal for my use case, and then directly write those instructions where the highest performance is needed. That way I don't need to subject myself to slow compilation times (a cost that compounds as the compiler reoptimizes the same function thousands of times over the course of development, repeated on every single compilation of the file), nor to the possibility that the optimizer breaks my code in an opaque way that I won't notice until something bad and inscrutable happens at runtime.
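A minimal sketch of that workflow, assuming GCC or Clang on x86-64 (add_pinned is a hypothetical name): once Compiler Explorer has shown that the hot function reduces to a single add, you can pin that instruction down by hand with extended inline asm.

unsigned add_pinned(unsigned x, unsigned y) {
    unsigned res = x;
    asm("addl %1, %0"   // the one instruction the optimizer was emitting anyway
        : "+r"(res)     // res is read and written in place
        : "r"(y));
    return res;
}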
#include <vector>

unsigned add(unsigned x, unsigned y) {
    std::vector vx{x};   // CTAD (C++17) deduces std::vector<unsigned>
    std::vector vy{y};
    auto res = vx[0] + vy[0];
    return res;
}
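For what it's worth, when the optimizer manages to elide the heap allocations (Clang at -O2 is often able to here), Compiler Explorer shows the whole function collapsing to something like:

add(unsigned int, unsigned int):
        lea     eax, [rdi + rsi]
        ret

i.e. the same single add as the plain x + y version.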
If you look at the GAS manual https://ftp.gnu.org/old-gnu/Manuals/gas-2.9.1/html_chapter/a... almost every other architecture has architecture-specific syntax notes, in many cases for something as trivial as comments. If they couldn't even decide on a single symbol for comments, there is no hope for everything else.
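For instance, going by the per-target notes in the GAS manual, the line-comment character alone varies across targets:

# x86:    '#' starts a comment
@ ARM:    '@' starts a comment (32-bit ARM)
! SPARC:  '!' starts a comment
| m68k:   '|' starts a comment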
ARM isn't the only architecture where GAS uses the same syntax as the developers of the corresponding CPU architecture. It just doesn't do the same for x86, due to historical choices inherited from the Unix software ecosystem and thus AT&T. If you play around on Godbolt with compilers for different architectures, it seems like x86's use of AT&T syntax is the exception; there are a few others which use a similar syntax, but they're a minority.
Why not use the same syntax for all architectures? I don't really know all the historical reasoning, but I have a few guesses, and each arch probably has its own historic baggage. Being consistent with the manufacturer's docs and the rest of the ecosystem has obvious benefits for the people who need to read it. Assembly is architecture-specific by definition, so being consistent across different architectures has little value. GAS is consistent with GCC output. Did GCC add support for some architectures early, with the help of the manufacturer's assembler, and only later in GAS?

There are also a lot of custom syntax quirks which don't easily fit into the Intel/AT&T model and are related to the various addressing modes used by different architectures. For example, ARM has register post-increment/pre-increment and zero-cost shifts, but doesn't have sub-register access like x86 (RAX/EAX/AX/AH/AL), and non-word access is more or less limited to load/store instructions, unlike x86 where it can show up in more places (see the sketch below). You would need to invent quite a few extensions to AT&T syntax for it to cover all the non-x86 architectures, or you could just use the syntax made by the developer of the architecture.
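A sketch of those ARM-specific forms in GAS syntax (assuming 32-bit ARM), none of which have a natural spelling in the AT&T model:

ldr   r0, [r1], #4          @ post-increment: load from [r1], then r1 += 4
ldr   r0, [r1, #4]!         @ pre-increment: bump r1 by 4 first, then load
add   r2, r3, r4, lsl #2    @ shift folded into the add: r2 = r3 + (r4 << 2)
ldrb  r5, [r6]              @ byte access goes through a load, not a sub-register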
My question is more: why even try to use the same syntax for all architectures? I thought that was GAS's approach: that they took AT&T syntax, which historically was a unified syntax for several PDPs (and some other ISAs, I believe? VAX?), and made it fit every other ISA they supported. Except apparently no, they didn't: they adopted the vendors' syntaxes for other ISAs, but not for Intel's x86? Why? It just boggles my mind.
For amd64 there isn't really much of an excuse: Intel syntax was already supported in GAS.
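For reference, GAS accepts both spellings on x86 via a directive; a minimal sketch (x86-64):

    movl    $1, 8(%rsp)             # AT&T: source first, % sigils, disp(base)
    .intel_syntax noprefix
    mov     DWORD PTR [rsp+8], 1    # Intel: destination first, [base+disp]
    .att_syntax prefix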
The extent to which you can "fool the optimizer" is highly dependent on the language and the code you're talking about. Python is a great example of a language that is devilishly hard to optimize, precisely because of the language semantics. C and C++ are entirely different examples with entirely different optimization issues, which usually have to do with pointers and references and what the compiler is allowed to infer.
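One sketch of the pointer problem (hypothetical function names; __restrict__ is a GCC/Clang extension, not standard C++):

// The compiler must reload *n on every iteration, because writes
// through p might change *n: the two pointers could alias.
void add_n(int *p, const int *n, int len) {
    for (int i = 0; i < len; ++i)
        p[i] += *n;
}

// Promising no aliasing lets the compiler hoist the load of *n out of the loop.
void add_n_noalias(int *__restrict__ p, const int *__restrict__ n, int len) {
    for (int i = 0; i < len; ++i)
        p[i] += *n;
}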
The point? Don't just assume your compiler will magically make all your performance issues go away and produce optimal code. Maybe it will, maybe it won't.
As always, the main performance lessons are: 1) don't prematurely optimize, and 2) if you do see perf issues, run profilers to try to definitively nail down where they are.
The most common issues are in function calls involving array operations and pointers, but a lot of it has to do with the C/C++ header and linker setup as well. C and C++ authors should not blithely assume the compiler is doing an awesome job, and in my experience, it often isn't.
Agree. And I'm sure the author agrees as well. That's why compiler-explorer exists in the first place.