Posted by todsacerdoti 18 hours ago

I write type-safe generic data structures in C(danielchasehooper.com)
270 points | 102 comments | page 2
WalterBright 14 hours ago||
Here's how to do it in D:

    struct ListNode(T) {
        ListNode* next;
        T data;
    }

    ListNode!int node;
Why suffer the C preprocessor? Using preprocessor macros is like using a hammer for finish carpentry, rather than a nail gun. A nail gun is 10x faster, drives the nail perfectly every time, and no half moon dents in your work.
dhooper 14 hours ago|
Thanks, this post is about C.

On some projects you must use C.

WalterBright 13 hours ago||
If I may be provocative :-) this post isn't about C. It's about layering on a custom language using C preprocessor macros.

My compilers were originally written in C. I started using the C preprocessor to do metaprogramming. After some years I got fed up with it and removed nearly all of the preprocessor use, and never looked back. My code was much easier to understand.

An amusing story: long ago, a friend of mine working for Microsoft was told by a team leader that a 50K program had a bug in it, and sadly the developer was long gone. He'd assigned programmer after programmer to it, who could not fix it. My friend said he'd give it a try, and had it fixed in 2 hours.

The source code was written in Microsoft MASM, where the M stood for "Macro". You can guess where this is going. The developer had invented his own high level language using the macro system (which was much more powerful than C's). Unfortunately, he neglected to document it, and the other programmers spent weeks studying it and could not figure it out.

The leader, astonished, asked him how he had figured it out in 2 hours. My friend said it was simple. He assembled it to object code, then disassembled the object code with obj2asm (a disassembler I wrote that converts object code back to source code). He then immediately found and fixed the bug, and checked in the "new" source code, which was the disassembled version.

I've seen many very smart and clever uses of the C macros, and the article is one of them. But isn't it time to move on?

uecker 4 hours ago|||
I could tell a similar story (many, in fact) about C++'s templates. It is not entirely clear to me what exactly makes the preprocessor a bad choice. One could argue that it is too flexible, so it is possible to create a mess with it. But somehow this seems a rather weak argument for inventing other monomorphization layers, which often evolve into their own mess.
ryao 12 hours ago|||
If the C compiler accepts it, it is C.
WalterBright 11 hours ago|||
Pedantically, the preprocessor is an entirely separate language. The lexing, parsing, expressions, and semantics are totally distinct. The preprocessor is usually implemented as a completely independent program. My first C compiler did integrate the preprocessor with the C compiler, but that was for performance reasons.

Currently, ImportC runs cpp and then lexes/parses the resulting C code for use in D.

ryao 10 hours ago||
It is part of the C standard. Whether it is part of a separate binary is an implementation choice.
WalterBright 9 hours ago||
True on both counts. But they are still separate and distinct languages.
zabzonk 11 hours ago|||
There is no one "the C compiler".
ryao 10 hours ago||
Pragmatically, the only C compiler that matters for what is or is not C is the one you are using.
zabzonk 9 hours ago||
Only if you are lucky enough to only use one compiler, or only one version of the same one.
david2ndaccount 12 hours ago||
The “typeof on old compilers” section contains the code:

         (list)->payload = (item); /* just for type checking */\
That is not a no-op. That is overwriting the list head with your (item). Did you mean to wrap it in an `if(0)`?
josephg 11 hours ago|
In that example they also had to replace the union with a struct, presumably to work around this issue. But that seems wasteful to me too. Doing it within an if(0) seems strictly better.
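A minimal sketch of the `if (0)` idea, to make the difference concrete. The `IntList` type and both macro names are hypothetical and are not taken from the article; the only assumption is a `payload` member that exists to carry the element type. The compiler still type-checks the assignment inside the dead branch, but nothing is stored at run time.

```c
#include <assert.h>

/* Hypothetical list head: `payload` is there only to carry the element type. */
typedef struct {
    int payload;
} IntList;

/* Unguarded check: the assignment is type-checked, but it also runs,
   so whatever was in `payload` gets overwritten with `item`. */
#define CHECK_TYPE_UNGUARDED(list, item) ((list)->payload = (item))

/* Guarded check: the assignment is still compiled (and so still diagnosed
   if the types are incompatible), but the branch is never taken. */
#define CHECK_TYPE_GUARDED(list, item) \
    do { if (0) { (list)->payload = (item); } } while (0)

/* Returns the payload after the unguarded check has run. */
static int payload_after_unguarded(void) {
    IntList l = { 42 };
    CHECK_TYPE_UNGUARDED(&l, 7);
    return l.payload;
}

/* Returns the payload after the guarded check has run. */
static int payload_after_guarded(void) {
    IntList l = { 42 };
    CHECK_TYPE_GUARDED(&l, 7);
    return l.payload;
}
```

Note that either form only rejects what plain C assignment rejects. Arithmetic types convert implicitly, so passing a `float` where the payload is an `int` would still compile; a struct or an unrelated pointer type would be diagnosed.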
hgs3 16 hours ago||
I'm curious what a hashmap looks like with this approach. It's one thing to pass through or hold onto a generic value, but another to perform operations on it. Think computing the hash value or comparing equality of generic keys in a generic hashmap.
lhearachel 11 hours ago|
I first would question what a user wants to do with a hashmap that uses polymorphic key-values of unknowable type at compile-time.

As a thought experiment, you could certainly have users define their own hash and equality functions and attach them to the table-entries themselves. On first thought, that sounds like it would be rife with memory safety issues.

At the end of the day, it is all just bytes. You could simply say that you will only key based on raw memory sequences.
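A rough sketch of the "key on raw memory sequences" option, assuming a tiny fixed-size open-addressing table. Every name here (`ByteMap`, `map_put`, `map_get`, the slot count and key limit) is made up for illustration; FNV-1a is simply a convenient byte hash, not something the article prescribes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_SLOTS 16   /* hypothetical capacity, must be a power of two */
#define KEY_MAX   16   /* hypothetical maximum key length in bytes */

typedef struct {
    unsigned char key[KEY_MAX];
    size_t key_len;    /* 0 marks an empty slot */
    int value;
} Slot;

typedef struct {
    Slot slots[MAP_SLOTS];
} ByteMap;

/* 64-bit FNV-1a over an arbitrary byte sequence. */
static uint64_t hash_bytes(const void *key, size_t len) {
    const unsigned char *p = key;
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;              /* FNV prime */
    }
    return h;
}

/* Linear probing: returns the slot holding `key`, the first empty slot
   on its probe path, or NULL if the table is full. */
static Slot *map_find_slot(ByteMap *m, const void *key, size_t len) {
    size_t i = (size_t)(hash_bytes(key, len) & (MAP_SLOTS - 1));
    for (size_t probes = 0; probes < MAP_SLOTS; probes++) {
        Slot *s = &m->slots[i];
        if (s->key_len == 0) return s;
        if (s->key_len == len && memcmp(s->key, key, len) == 0) return s;
        i = (i + 1) & (MAP_SLOTS - 1);
    }
    return NULL;
}

/* Inserts or updates; returns 1 on success, 0 if the key is unusable or the table is full. */
static int map_put(ByteMap *m, const void *key, size_t len, int value) {
    if (len == 0 || len > KEY_MAX) return 0;
    Slot *s = map_find_slot(m, key, len);
    if (s == NULL) return 0;
    memcpy(s->key, key, len);
    s->key_len = len;
    s->value = value;
    return 1;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
static const int *map_get(ByteMap *m, const void *key, size_t len) {
    if (len == 0 || len > KEY_MAX) return NULL;
    Slot *s = map_find_slot(m, key, len);
    return (s != NULL && s->key_len != 0) ? &s->value : NULL;
}
```

The catch with byte-wise keys is that equality becomes `memcmp` equality. Structs with padding bytes, or floating-point values such as `0.0` and `-0.0`, can compare unequal as bytes even when they are equal as values, which is where user-supplied hash and equality callbacks come back into the picture.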

b0a04gl 16 hours ago||
What happens if two types have the same size and alignment but different semantics, like `int` vs `float` or `struct { int a; }` vs `int`? Does the type tag system catch accidental reuse, i.e. does it defend against structural collisions?
ryao 12 hours ago||
uint64_t data[] in level 2 violates the strict aliasing rule. Use the char type instead to avoid the violation.
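A sketch of what the character-typed variant could look like. The `Node` layout and the `node_new`/`node_read` helpers are hypothetical, not the article's code. Values go in and out through `memcpy`, which is defined behaviour for any object type and also avoids relying on the alignment of `data`.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: the payload lives in a character-typed flexible array member. */
typedef struct Node {
    struct Node *next;
    unsigned char data[];
} Node;

/* Allocates a node and copies `size` bytes of `value` into its payload.
   Returns NULL if allocation fails. The caller frees the node with free(). */
static Node *node_new(const void *value, size_t size) {
    Node *n = malloc(sizeof *n + size);
    if (n == NULL) return NULL;
    n->next = NULL;
    memcpy(n->data, value, size);
    return n;
}

/* Copies the payload back out into a caller-provided object of the same size. */
static void node_read(const Node *n, void *out, size_t size) {
    memcpy(out, n->data, size);
}
```

Casting `data` to a `T *` and dereferencing it directly would additionally need the payload to be suitably aligned for `T`; the copy-based accessors above sidestep that at the cost of an extra copy, which compilers typically optimise well for small sizes.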
asplake 17 hours ago||
Interesting! I’m working on a toy/educational generator of ML-style tagged variants and associated functions in C (for a compiler) and when I’m a bit further along I will see if they’re compatible.
monkeyelite 16 hours ago|
Another way is to not try to write generic data structures. When you tailor them to the use case you can simplify.

The #1 data structure in any program is array.

dwattttt 12 hours ago|
When all you have are arrays, everything looks like a problem you solve with arrays.

There are quite a few problems that specialised containers are suited for, that's why they were created.

monkeyelite 10 hours ago||
And you can write them when you need them.

The situation where you need a red black tree with 10 different key/value combos isn’t real.

el_pollo_diablo 5 hours ago|||
If, by "situation", you mean the development of a small program with so many constraints that using existing libraries is out of the question, then yes.

Otherwise, that seems unwise to me. Not every user of a generic type has to be generic. A major selling point of generic types is that you write a library once, then everyone can instantiate it. Even if that is the only instance they need in their use case, you have saved them the trouble of reinventing the wheel.

No colleague of mine may need 10 different instances of any of my generic libraries, but I bet that all of them combined do, and that our bosses are happy that we don't have to debug and maintain 10+ different implementations.

dwattttt 7 hours ago|||
You could take away anything you use and say "but we could make it ourselves", that doesn't mean it's helpful.
monkeyelite 5 hours ago||
Except it’s very common for C programs to contain one-off data structures, so it’s not a hypothetical. It’s a concrete programming style.
dwattttt 2 hours ago|||
Do you mean a data structure they only use once? Or one that's never been done elsewhere? If they only use it once, that seems like the worst effort/pay-off ratio you can get writing it yourself. And I don't think there's that many fundamental data structures out there... and even then, why would it be good to be forced to make your bespoke structure out of only arrays, when things like maps exist?
el_pollo_diablo 4 hours ago|||
Sure, but it is also very common for C programs to contain data structures that have one use in the program, and could still be instances of a generic type. You mentioned red black trees, which are a perfect example of that.