Posted by todsacerdoti 22 hours ago

I write type-safe generic data structures in C(danielchasehooper.com)
328 points | 121 comments | page 3
asplake 21 hours ago|
Interesting! I’m working on a toy/educational generator of ML-style tagged variants and associated functions in C (for a compiler), and when I’m a bit further along I will see if they’re compatible.
ryao 16 hours ago||
uint64_t data[] in level 2 violates the strict aliasing rule. Use the char type instead to avoid the violation.
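A minimal sketch of the char-typed storage suggested here, not the article's code: the names ByteNode, byte_node_new, byte_node_get and byte_node_roundtrip are hypothetical, and the payload is copied in and out with memcpy so it is never read through a mismatched lvalue type.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node layout: the payload lives in a character-typed
   flexible array member, aligned for any object type. */
typedef struct ByteNode {
    struct ByteNode *next;
    alignas(max_align_t) unsigned char data[];
} ByteNode;

/* Allocate a node and copy `size` payload bytes into it.
   Returns NULL if the allocation fails. */
static ByteNode *byte_node_new(const void *payload, size_t size) {
    ByteNode *node = malloc(sizeof *node + size);
    if (node == NULL) return NULL;
    node->next = NULL;
    memcpy(node->data, payload, size);
    return node;
}

/* Copy the payload back out into caller-provided storage. */
static void byte_node_get(const ByteNode *node, void *out, size_t size) {
    memcpy(out, node->data, size);
}

/* Round trip: store a double in a node and read it back. */
static double byte_node_roundtrip(double value) {
    ByteNode *node = byte_node_new(&value, sizeof value);
    if (node == NULL) return 0.0;
    double out;
    byte_node_get(node, &out, sizeof out);
    free(node);
    return out;
}
```

The alignas is an assumption added so that a caller could also point a typed pointer at data if desired; the memcpy path works without it.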
ape4 20 hours ago||
Or write in CFront and have it translated to C
zabzonk 18 hours ago|
And where are you going to get a cfront compiler these days?
mingodad 6 hours ago||
https://github.com/mingodad/cfront-3
WalterBright 18 hours ago||
Here's how to do it in D:

    struct ListNode(T) {
        ListNode* next;
        T data;
    }

    ListNode!int node;
Why suffer the C preprocessor? Using preprocessor macros is like using a hammer for finish carpentry, rather than a nail gun. A nail gun is 10x faster, drives the nail perfectly every time, and no half moon dents in your work.
dhooper 18 hours ago|
Thanks, this post is about C.

On some projects you must use C.

WalterBright 17 hours ago||
If I may be provocative :-) this post isn't about C. It's about layering on a custom language using C preprocessor macros.

My compilers were originally written in C. I started using the C preprocessor to do metaprogramming. After some years I got fed up with it and removed nearly all of the preprocessor use, and never looked back. My code was much easier to understand.

An amusing story: long ago, a friend of mine working for Microsoft was told by a team leader that a 50K program had a bug in it, and sadly the developer was long gone. He'd assigned programmer after programmer to it, who could not fix it. My friend said he'd give it a try, and had it fixed in 2 hours.

The source code was written in Microsoft MASM, where the M stood for "Macro". You can guess where this is going. The developer had invented his own high level language using the macro system (which was much more powerful than C's). Unfortunately, he neglected to document it, and the other programmers spent weeks studying it and could not figure it out.

The leader, astonished, asked him how he had figured it out in 2 hours. My friend said it was simple. He assembled it to object code, then disassembled the object code with obj2asm (a disassembler I wrote that converts object code back to source code). He then immediately found and fixed the bug, and checked in the "new" source code, which was the disassembled version.

I've seen many very smart and clever uses of C macros, and the article is one of them. But isn't it time to move on?

uecker 8 hours ago|||
I could tell a similar story (many, in fact) about C++'s templates. It is not entirely clear to me what exactly makes the preprocessor a bad choice. One could argue that it is too flexible, so it is possible to create a mess with it. But this seems a rather weak argument for inventing another monomorphization layer; such layers often evolve into their own mess.
ryao 16 hours ago|||
If the C compiler accepts it, it is C.
WalterBright 15 hours ago|||
Pedantically, the preprocessor is an entirely separate language. The lexing, parsing, expressions, and semantics are totally distinct. The preprocessor is usually implemented as a completely independent program. My first C compiler did integrate the preprocessor with the C compiler, but that was for performance reasons.

Currently, ImportC runs cpp and then lexes/parses the resulting C code for use in D.

ryao 14 hours ago||
It is part of the C standard. Whether it is part of a separate binary is an implementation choice.
WalterBright 13 hours ago||
True on both counts. But they are still separate and distinct languages.
zabzonk 15 hours ago|||
There is no one "the C compiler".
ryao 14 hours ago||
Pragmatically, the only C compiler that matters for what is or is not C is the one you are using.
zabzonk 13 hours ago||
Only if you are lucky enough to only use one compiler, or only one version of the same one.
JacksonAllan 16 hours ago||
I think the idea of using a union to store the element type without any extra run-time memory cost might have some use, specifically in cases where the container struct wouldn't typically store a variable of the element type (or, more likely, a pointer to the element's type) but we want to slip that type information into the struct anyway.

However, the problem that I have with this idea as a general solution for generics is that it doesn't seem to solve any of the problems posed by the most similar alternative: just having a macro that defines a struct. The example shown in the article:

    #define List(type) union { \
        ListNode *head; \
        type *payload; \
    }
could just as easily be:

    #define List(type) struct { \
        type *head; \
        /* Other data, such as node/element count... */ \
    }
(As long as our nodes are maximally aligned - which they will be if they're dynamically allocated - it doesn't matter whether the pointer we store to the list head is ListNode *, type *, void *, or any other regular pointer type.)
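A rough sketch of the struct form in use. As a simplification that is not in the article, the element type here (the hypothetical IntItem) carries its own next pointer, and size_t count stands in for the "other data". All function names are made up for illustration.

```c
#include <stddef.h>

/* The struct form of the macro, with a count as example "other data". */
#define List(type) struct { \
    type *head; \
    size_t count; \
}

/* Hypothetical element type that carries its own link (an assumption
   made to keep the sketch short). */
typedef struct IntItem {
    struct IntItem *next;
    int value;
} IntItem;

/* Each expansion of List(IntItem) is a distinct anonymous struct type,
   so the instantiation is named once for use in function signatures. */
typedef List(IntItem) IntItemList;

/* Push an item onto the front of the list. */
static void int_list_push(IntItemList *list, IntItem *item) {
    item->next = list->head;
    list->head = item;
    list->count++;
}

/* Sum all values by walking the list from the head. */
static int int_list_sum(const IntItemList *list) {
    int total = 0;
    for (const IntItem *it = list->head; it != NULL; it = it->next)
        total += it->value;
    return total;
}

/* Build a three-element list on the stack and sum it. */
static int int_list_demo(void) {
    IntItem a = { NULL, 1 }, b = { NULL, 2 }, c = { NULL, 3 };
    IntItemList list = { NULL, 0 };
    int_list_push(&list, &a);
    int_list_push(&list, &b);
    int_list_push(&list, &c);
    return int_list_sum(&list);
}
```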

The union approach has the same drawback as the struct approach: untagged unions are not compatible with each other, so we have to typedef the container in advance in order to pass in and out of functions (as noted in the article). This is broadly similar to the drawback from which the "generic headers" approach (which I usually call the "pseudo-template" approach) suffers, namely the need for boilerplate from the user. However, the generic-headers/pseudo-template approach is guaranteed to generate the most optimized code thanks to function specialization[1], and it can be combined with another technique to provide a non-type-prefixed API, as I discuss here[2] and demonstrate in practice here[3].
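To illustrate the specialization point, here is a single-file stand-in for the generic-headers idea. A real generic header would be included once per element type after defining the type parameters; this sketch replaces that with one hypothetical DEFINE_VEC macro so it fits in one block. The names DEFINE_VEC, int_vec, double_vec and vec_demo are assumptions, not taken from any of the linked libraries.

```c
#include <stdlib.h>

/* Stand-in for a generic header: each use stamps out a struct plus
   functions specialized for one element type T, prefixed with NAME. */
#define DEFINE_VEC(T, NAME) \
    typedef struct { T *items; size_t len, cap; } NAME; \
    /* Append one item; returns 1 on success, 0 if allocation fails. */ \
    static int NAME##_push(NAME *v, T item) { \
        if (v->len == v->cap) { \
            size_t cap = v->cap ? v->cap * 2 : 4; \
            T *p = realloc(v->items, cap * sizeof *p); \
            if (p == NULL) return 0; \
            v->items = p; \
            v->cap = cap; \
        } \
        v->items[v->len++] = item; \
        return 1; \
    } \
    /* Release the storage and reset the vector to empty. */ \
    static void NAME##_free(NAME *v) { \
        free(v->items); \
        v->items = NULL; \
        v->len = v->cap = 0; \
    }

/* The per-type boilerplate the user has to write. */
DEFINE_VEC(int, int_vec)
DEFINE_VEC(double, double_vec)

/* Use both instantiations; returns -1 if any allocation fails. */
static int vec_demo(void) {
    int_vec iv = {0};
    double_vec dv = {0};
    int ok = int_vec_push(&iv, 1) && int_vec_push(&iv, 2)
          && double_vec_push(&dv, 0.5);
    int result = ok ? iv.items[0] + iv.items[1] + (int)(dv.items[0] * 2.0) : -1;
    int_vec_free(&iv);
    double_vec_free(&dv);
    return result;
}
```

Because int_vec_push and double_vec_push are separate functions, the compiler sees the concrete element type and size in each one, at the cost of type-prefixed names and one instantiation line per type.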

I'd also like to point to my own approach to generics[4] that is similar to the one described here in that it hides extra type information in the container handle's type - information that is later extracted by the API macros and passed into the relevant functions. My approach is different in that rather than exploiting unions, it exploits function pointers' ability to hold multiple types (i.e. the return type and argument types) in one pointer. Because function pointers are "normal" C types, this approach doesn't suffer from the aforementioned typedef/boilerplate problem (and it allows for API macros that are agnostic to both element type/s and container type). However, the cost is that the code inside the library becomes rather complex, so I usually recommend the generic-headers/pseudo-template approach as the one that most people ought to take when implementing their own generic containers.
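A heavily simplified sketch of the general idea, not CC's actual implementation: the handle is a pointer to a function pointer whose return type is the element type, and the API macros recover that type with typeof. It assumes the GNU `__typeof__` extension (C23 `typeof` would also work), carries only one type rather than several, and aborts on allocation failure. Every name in it is hypothetical.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Handle type: pointer to a function pointer returning T. No function
   is ever called through it; the type exists only to carry T. */
#define Vec(T) __typeof__(T (**)(void))

/* Recover T from a handle. The operand of typeof is not evaluated. */
#define VEC_EL_TY(v) __typeof__((**(v))())

/* Header stored in front of the elements; over-aligned so the element
   array that follows is suitably aligned for any T. */
typedef struct {
    alignas(max_align_t) size_t len;
    size_t cap;
} VecHeader;

/* Type-erased push: grows the allocation when full and copies one
   element in. Returns the (possibly moved) allocation. */
static void *vec_push_raw(void *handle, const void *el, size_t el_size) {
    VecHeader *hdr = handle;
    if (hdr == NULL || hdr->len == hdr->cap) {
        size_t cap = hdr ? hdr->cap * 2 : 4;
        VecHeader *grown = realloc(hdr, sizeof *grown + cap * el_size);
        if (grown == NULL) abort(); /* sketch only: no error reporting */
        if (hdr == NULL) grown->len = 0;
        grown->cap = cap;
        hdr = grown;
    }
    memcpy((unsigned char *)(hdr + 1) + hdr->len * el_size, el, el_size);
    hdr->len++;
    return hdr;
}

/* API macros: agnostic to the element type, which they read off the handle. */
#define vec_push(v, value) \
    ((v) = vec_push_raw((v), &(VEC_EL_TY(v)){ value }, sizeof(VEC_EL_TY(v))))
#define vec_len(v) \
    ((v) ? ((const VecHeader *)(const void *)(v))->len : (size_t)0)
#define vec_get(v, i) \
    (((VEC_EL_TY(v) *)((VecHeader *)(void *)(v) + 1))[i])
#define vec_free(v) (free(v), (v) = NULL)

/* No typedef is needed before declaring or using the handle. */
static int fnptr_vec_demo(void) {
    Vec(int) v = NULL;
    vec_push(v, 10);
    vec_push(v, 20);
    vec_push(v, 12);
    int total = 0;
    for (size_t i = 0; i < vec_len(v); i++)
        total += vec_get(v, i);
    vec_free(v);
    return total;
}
```

Two declarations of Vec(int) name the same type, so such a handle can be passed between functions without a prior typedef, which is the property the comment highlights.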

[1] https://gist.github.com/attractivechaos/6815764c213f38802227...

[2] https://github.com/JacksonAllan/CC/blob/main/articles/Better...

[3] https://github.com/JacksonAllan/Verstable

[4] https://github.com/JacksonAllan/CC

notnmeyer 21 hours ago||
pretty sure C is the new Go.
qustrolabe 17 hours ago||
pretty sure C has to go
revskill 20 hours ago||
Without the concurrency part.
oflebbe 20 hours ago|||
OpenMP to the rescue
sltkr 19 hours ago|||
Or garbage collection. Or interfaces. Or packages. Or actual generics.