Posted by ingve 4/15/2025

What the hell is a target triple?(mcyoung.xyz)
171 points | 133 comments | page 2
jkelleyrtp 4/15/2025|
The author's blog is a FANTASTIC source of information. I recommend checking out some of their other posts:

- https://mcyoung.xyz/2021/06/01/linker-script/

- https://mcyoung.xyz/2023/08/09/yarns/

- https://mcyoung.xyz/2023/08/01/llvm-ir/

eqvinox 4/15/2025||
Given TFA's bias against GCC, I'm not so sure. e.g. looking at the linker script article… it's also missing the __start_XYZ and __stop_XYZ symbols automatically created by the linker.
matheusmoreira 4/15/2025|||
It also focuses exclusively on sections. I wish it had at least mentioned segments, also known as program headers. Linux kernel's ELF loader does not care about sections, it only cares about segments.

Sections and segments are more or less the same concept: metadata that tells the loader how to map each part of the file into the correct memory regions with the correct memory protection attributes. Biggest difference is segments don't have names. Also they aren't neatly organized into logical blocks like sections are, they're just big file extents. The segments table is essentially a table of arguments for the mmap system call.
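
To illustrate the "table of arguments for mmap" point, here is a minimal Python sketch (not taken from the linked article) that reads the program header table of a little-endian ELF64 image and lists its PT_LOAD entries. The function name `load_segments` is made up for this example, and it assumes the standard Elf64_Ehdr and Elf64_Phdr layouts; it does not handle 32-bit or big-endian files.

```python
import struct

PT_LOAD = 1                      # p_type value for loadable segments
PF_X, PF_W, PF_R = 1, 2, 4       # p_flags permission bits

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian, 56 bytes


def load_segments(image: bytes):
    """Return (p_offset, p_vaddr, p_filesz, p_memsz, prot) per PT_LOAD entry.

    Hypothetical helper for illustration only.
    """
    ehdr = EHDR.unpack_from(image, 0)
    ident = ehdr[0]
    # ident[4] == 2 means ELFCLASS64, ident[5] == 1 means little-endian data.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    e_phoff, e_phentsize, e_phnum = ehdr[5], ehdr[9], ehdr[10]

    segments = []
    for i in range(e_phnum):
        (p_type, p_flags, p_offset, p_vaddr, _p_paddr,
         p_filesz, p_memsz, _p_align) = PHDR.unpack_from(
            image, e_phoff + i * e_phentsize)
        if p_type != PT_LOAD:
            continue  # NOTE, DYNAMIC, etc. are not mapped by themselves
        prot = "".join(c if p_flags & bit else "-"
                       for c, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X)))
        segments.append((p_offset, p_vaddr, p_filesz, p_memsz, prot))
    return segments
```

On a little-endian 64-bit Linux system, `load_segments(open("/proc/self/exe", "rb").read())` should list the file extents, addresses and protections of the running interpreter's own LOAD segments. Each tuple lines up fairly directly with the offset, address, length and protection arguments of an mmap call.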

Learning this stuff from scratch was pretty tough. Linker script has commands to manipulate the program header table but I couldn't figure those out. In the end I asked developers to add command line options instead and the maintainer of mold actually obliged.

Looks like very few people know about stuff like this. One can use it to do some heavy wizardry though. I leveraged this machinery into a cool mechanism for embedding arbitrary data into ELF files. The kernel just memory maps the data in before the program has even begun execution. Typical solutions involve the program finding its own executable on the file system, reading it into memory and then finding some embedded data section. I made the kernel do almost all of that automatically.

https://www.matheusmoreira.com/articles/self-contained-lone-...

o11c 4/15/2025|||
I wouldn't call them "same concept" at all. Segments (program headers) are all about the runtime (executables and shared libraries) and are low-cost. Sections are all about development (.o files) and are detailed.

Generally there are many sections combined into a single segment, other than special-purpose ones. Unless you are reimplementing ld.so, you almost certainly don't want to touch segments; sections are far easier to work with.

Also, normally you just call `getauxval`, but if needed the type is already named `ElfW(auxv_t)*`.

matheusmoreira 4/16/2025||
> I wouldn't call them "same concept" at all.

They are both metadata about file extents and their memory images.

> sections are far easier to work with

Yes. They are not, however, loaded into memory by default. Linkers do not generate LOAD segments for section metadata since they are not needed for execution. Thus it's impossible for a program to introspect its own sections without additional logic and I/O to read them into memory.

> Also, normally you just call `getauxval`, but if needed the type is already named `ElfW(auxv_t)*`.

True. I didn't use it because it was not available. I wrote my article in the context of a freestanding nolibc program.

o11c 4/16/2025||
Right, but you can just use the section start/end symbols for a section that already goes into a mapped segment.
matheusmoreira 4/16/2025||
Can you show me how that would work?

It's trivial to put arbitrary files into sections:

  objcopy --add-section program.files.1=file.1.dat \
          --add-section program.files.2=file.2.dat \
          program program+files
The problem is the program.files.* sections do not get mapped in by a LOAD segment. I ended up having to write my own tool to patch in a LOAD segment into the segments table because objcopy does not have the ability to do it.
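
A minimal sketch of that check, in Python and with made-up offsets: a section is only visible in memory at startup if its file extent falls inside the file extent of some LOAD segment. The helper name `mapped_by_load` and the numbers in the comments are hypothetical, not taken from any real binary.

```python
def mapped_by_load(section, load_segments):
    """Return True if the section's file extent lies inside a LOAD segment.

    section:       (file_offset, size) of the section
    load_segments: iterable of (p_offset, p_filesz) for each PT_LOAD entry
    Hypothetical helper for illustration only.
    """
    start, size = section
    end = start + size
    return any(off <= start and end <= off + filesz
               for off, filesz in load_segments)


# Hypothetical layout: two LOAD segments, plus one section appended to
# the end of the file (as objcopy --add-section would do).
loads = [(0x0000, 0x0200), (0x1000, 0x0800)]
inside = (0x1100, 0x40)     # lies within the second LOAD segment
appended = (0x3000, 0x40)   # past every LOAD segment, so never mapped
```

With these values, `mapped_by_load(inside, loads)` is true and `mapped_by_load(appended, loads)` is false, which is the situation described above for the program.files.* sections.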

Even asked a Stack Overflow question about this two years ago:

https://stackoverflow.com/q/77468641

The only answer I got told me to simply read the sections into memory via /proc/self/exe or edit the segments table and make it so that the LOAD segments cover the whole file. I eventually figured out ways to add LOAD segments to the table. By that point I didn't need sections anymore, just a custom segment type.

o11c 4/16/2025||
The whole point of section names is that they mean something. If you give it a name that matches `.rodata.*` it will be part of the existing read-only LOADed segments, or `.data.*` for (private) read-write.

Use `ld --verbose` to see what sections are mapped by default (it is impossible for a linker to work without having such a linker script; we're just lucky that GNU ld exposes it in a sane form rather than hard-coding it as C code). In modern versions of the linker (there is still old documentation found by search engines), you can specify multiple SECTIONS commands (likely from multiple scripts, i.e. just files passed on the command line), but why would you when you can conform to the default one?

You should pick a section name that won't collide with the section names generated by `-fdata-sections` (or `-ffunction-sections` if that's ever relevant for you).

matheusmoreira 4/16/2025||
That requires relinking the executable. That is not always desirable or possible. Unless the dynamic linker ignores the segments table in favor of doing this on the fly... Even if that's the case, it won't work for statically linked executables. Only the dynamic linker can assign meaning to section names at runtime and the dynamic linker isn't involved at all in the case of statically linked programs.
eqvinox 4/16/2025|||
Absolutely agree. Had my own fun dealings with ELF, and to be clear, on plain mainline shipping products (amd64 Linux), not toys/exercise/funky embedded. (Wouldn't have known about section start/stop symbols otherwise)
sramsay 4/15/2025|||
I was really struck by the antipathy toward GCC. I'm not sure I quite understand where it's coming from.
forrestthewoods 4/15/2025||
What a great article.

Every time I deal with target triples I get confused and have to refresh my memory. This article makes me feel better in knowing that target triples are an unmitigated cluster fuck of cruft and bad design.

> Go does the correct thing and distributes a cross compiler.

Yes but also no. AFAIK Zig is the only toolchain to provide native cross compiling out of the box without bullshit.

Missing from this discussion is the ability to specify and target different versions of glibc. Something that I think only Zig even attempts to do because Linux’s philosophy of building against local system globals is an incomprehensibly bad choice. So all these target triples are woefully underspecified.

I like that at least Rust defines its own clear list of target triples that are more rational than LLVM’s. At this point I feel like the whole concept of target triples needs to be thrown away. Everything about it is bad.

SAI_Peregrinus 4/16/2025|
I'd say it's less bad design, & more a near-total lack of design. Someone needed a "good enough" way to specify target properties for a few targets they had to support, and picked a string format they could easily parse. Worked fine. Then more systems had to be added, and special cases happened, and nobody wanted to break backwards-compatibility so the system just grew. And nobody can agree on names, so people added alias support, and the system grew. And people started releasing OSes instead of just organizations so the "vendor" concept grew fuzzy, and the system grew. Now it is a hyphen-separated variable-length monster of confusion.

Ideally each component in the target "triple" would be a separate argument.
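
A rough Python sketch of why the single string is awkward, assuming only the common `arch-vendor-os[-env]` shape. The `Target` type and `split_triple` function are invented for this example; real tools such as `config.sub` apply many more alias and special-case rules.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    arch: str
    vendor: Optional[str]
    os: str
    env: Optional[str]


def split_triple(triple: str) -> Target:
    """Naively split a target string on hyphens (illustrative only)."""
    parts = triple.split("-")
    if len(parts) == 4:
        return Target(*parts)
    if len(parts) == 3:
        # Ambiguous: this guesses arch-vendor-os, but a string such as
        # "x86_64-linux-gnu" is really arch-os-env with the vendor omitted.
        arch, vendor, os_ = parts
        return Target(arch, vendor, os_, None)
    raise ValueError(f"unsupported target string: {triple!r}")
```

`split_triple("x86_64-unknown-linux-gnu")` comes out as expected, but `split_triple("x86_64-linux-gnu")` reports `linux` as the vendor and `gnu` as the OS. Passing each component as a separate argument would remove that guesswork.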

throw0101d 4/15/2025||
Noticed endians listed in the table. It seems like little-endian has basically taken over the world in 2025:

* https://en.wikipedia.org/wiki/Endianness#Hardware

Is there anything that is used a lot that is not little? IBM's stuff?

Network byte order is BE:

* https://en.wikipedia.org/wiki/Endianness#Networking

forrestthewoods 4/15/2025||
BE isn’t technically dead but it’s practically dead for almost all projects. You can static_assert byte order and then never think about BE ever again.

All of my custom network serialization formats use LE because there’s literally no reason to use BE for network byte order. It’s pure legacy cruft.
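
As a rough illustration in Python rather than C, with a made-up 32-bit field: the wire format fixes the byte order explicitly, so the encoded bytes are the same on any host. The names `encode_u32_le` and `decode_u32_le` are hypothetical.

```python
import struct

VALUE = 0x0A0B0C0D


def encode_u32_le(value: int) -> bytes:
    """Serialize an unsigned 32-bit integer in little-endian order."""
    return struct.pack("<I", value)


def decode_u32_le(buf: bytes) -> int:
    """Parse an unsigned 32-bit little-endian integer from 4 bytes."""
    return struct.unpack("<I", buf)[0]


little = encode_u32_le(VALUE)        # bytes 0d 0c 0b 0a
big = struct.pack(">I", VALUE)       # bytes 0a 0b 0c 0d
network = struct.pack("!I", VALUE)   # "!" is network order, i.e. big-endian
```

Here `network == big`, and `decode_u32_le(little)` round-trips to `VALUE`. In Python, `sys.byteorder` reports the host order, for code that wants a runtime check loosely comparable to a static_assert.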

f_devd 4/16/2025||
...Until you find yourself having to work around legacy code to support some weird target that does still use BE. Speaking from experience (tbf usually lower level than anything actually networked, more like RS485 and friends).
forrestthewoods 4/16/2025||
18 years in my career and I’m still waiting for that BE target to rear its ugly head.

I’m more than happy to static_assert little endian. If any platform needs BE support then I’ll add support to the minimum amount of libraries necessary to do so. Super easy.

Here’s the thing. If you wrote BE compatible code today you probably don’t even have a way to test it. So you’re adding a bunch of complexity and doing a bunch of work that you can’t even verify is correct! Complete and total waste of time.

dharmab 4/15/2025|||
LEON, used by the European Space Agency, is big endian.
naruhodo 4/16/2025||
Should have been called BEON.
Palomides 4/15/2025|||
IBM's Power chips can run in either little or big modes, but "used a lot" is a stretch
inferiorhuman 4/15/2025||
Most PowerPC related stuff (e.g. Freescale MPC5xx found in a bunch of automotive applications) can run in either big or little endian mode, as can most ARM and MIPS (routers, IP cameras) stuff. Can't think of the last time I've seen any of them configured to run in big endian mode tho.
classichasclass 4/15/2025||
For the large Power ISA machines, it's most commonly when running AIX or IBM i these days, though the BSDs generally run big too.
rv3392 4/15/2025|||
Apart from IBM Power/AIX systems, SPARC/Solaris is another one. I wouldn't say either of these are used a lot, but there's a reasonable amount of legacy systems out there that are still being supported by IBM and Oracle.
formerly_proven 4/15/2025|||
10 years ago the fastest BE machines that were practical were then-ten-year-old powermacs. This hasn’t really changed. I guess they’re more expensive now.
eqvinox 4/15/2025||
e6500/T4240 are faster than powermacs. Not sure how rare they are nowadays, we didn't have any trouble buying some (on eBay). 12×2 cores, 48GB RAM, for BE that's essentially heaven…
richardwhiuk 4/15/2025|||
Some ARM stuff.
thro3838484848 4/15/2025||
Java VM is BE.
kbolino 4/15/2025||
This is misleading at best. The JVM only exposes multibyte values to ordinary applications in such a way that byte order doesn't matter. You can't break out a pointer and step through the bytes of a long field to see what order it's in, at least not without the unsafe memory APIs.

In practice, any real JVM implementation will simply use native byte order as much as possible. While bytecode and other data in class files is serialized in big endian order, it will be converted to native order whenever it's actually used. If you do pull out the unsafe APIs, you can see that e.g. values are little endian on x86(-64). The JVM would suffer from major performance issues if it tried to impose a byte order different from the underlying platform.

PhilipRoman 4/15/2025||
One relatively commonly used class which exposes this is ByteBuffer and its Int/Long variants, but there you can specify the endianness explicitly (or set it to match the native one).
cwood-sdf 4/15/2025||
"And no, a “target quadruple” is not a thing and if I catch you saying that I’m gonna bonk you with an Intel optimization manual. "

https://github.com/ziglang/zig/issues/20690

debugnik 4/15/2025||
The argument is that they're called triples even when they've got more or fewer than 3 components. They should have simply been called target tuples or target monikers.
o11c 4/15/2025||
"gnu tuple" and "gnu type" are also common names.

The comments in `config.guess` and `config.sub`, which are the origin of triples, use a large variety of terms, at least the following:

  configuration name
  configuration type
  [machine] specification
  system name
  triplet
  tuple
pie_flavor 4/15/2025||
Sorry, going to keep typing x64. Unlike the article's recommendation of x86, literally everyone knows exactly what it means at all times.
qu4z-2 4/15/2025||
If someone tells me x86, I am certainly thinking 32-bit protected mode not 64-bit long mode... Granted I'm in the weird space where I know enough to be dangerous but not enough to keep me up-to-date with idiomatic naming conventions.
kevin_thibedeau 4/15/2025||
You mean AMD64?
cestith 4/16/2025||
> “i386” (the first Intel microarchitecture that implemented protected mode)

This is technically incorrect. The 286 had protected mode. It was a 16-bit protected mode, being a 16-bit processor. It was also incompatible with the later protected mode of the 386 through today’s processors. It did, however, exist.

IAmLiterallyAB 4/16/2025||
> However, due to the runaway popularity of LLVM, virtually all compilers now use target triples.

That's a wild take. I think it's pretty universally accepted that GCC and the GNU toolchain are what made this ubiquitous.

Also, the x32 ABI is still around, support is still around, I don't know where the author got that notion.

therein 4/15/2025||
I like the code editor style preview on the right. Enough to forgive the slightly clunky scroll.
SrslyJosh 4/15/2025||
It looks nice, but I find the choppy scrolling (on an M1 MBP, no less!) to be distracting.

It also doesn't really tell me anything about the content, except where I'm going to see tables or code blocks, so I'm not sure what the benefit is.

Given the really janky scrolling, I'd like to have a way to hide it.

tiffanyh 4/15/2025|||
FYI - to see this you need to have your browser at least 1435px wide.
Starlevel004 4/15/2025||
Unfortunately the text in the preview shows up in ctrl+f.
ycombinatrix 4/15/2025||
>There’s a few variants. wasm32-unknown-unknown (here using unknown instead of none as the system, oops)

Why isn't it called wasm32-none-none?

pie_flavor 4/15/2025|
As far as I can tell, it's because libstd exists (but is full of do-nothing stubs). There is another `wasm32-none` target which is no_std.
psyclobe 4/15/2025|
Sounds like what we use with vcpkg to define the system's tooling; still trying to make sense of it all these years later, but we define things like x64-Linux-static to imply target architecture, platform, and linkage style to the runtime.