Top
Best
New

Posted by pizlonator 9/5/2025

Fil's Unbelievable Garbage Collector(fil-c.org)
598 points | 278 commentspage 2
AndyKelley 9/5/2025|
Super cool project. Sorry if you explained this already, I don't know what "Dijkstra accurate" means. How does it know if an object is truly available to be reclaimed, given that pointers can be converted to integers?
pizlonator 9/5/2025||
> given that pointers can be converted to integers?

Because if they get converted to integers and then stored to the heap then they lose their capability. So accesses to them will trap and the GC doesn’t need to care about them.

Also it’s not “Dijkstra accurate”. It’s a Dijkstra collector in the sense that it uses a Dijkstra barrier. And it’s an accurate collector. But these are orthogonal things

AndyKelley 9/5/2025|||
Hmm, I'm still not understanding the bit of information that I'm trying to ask about.

Let's say I malloc(42) then print the address to stdout, and then do not otherwise do anything with the pointer. Ten minutes later I prompt the user for an integer, they type back the same address, and then I try to write 42 bytes to that address.

What happens?

Edit: ok I read up on GC literature briefly and I believe I understand the situation.

"conservative" means the garbage collector does not have access to language type system information and is just guessing that every pointer sized thing in the stack is probably a pointer.

"accurate" means the compiler tells the GC about pointer types, so it knows about all the pointers the type system knows about.

Neither of these are capable of correctly modeling the C language semantics, which allows ptrtoint / inttoptr. So if there are any tricks being used like xor linked lists, storing extra data inside unused pointer alignment bits, or a memory allocator implementation, these will be incompatible even with an "accurate" garbage collector such as this.

I should add, this is not a criticism, I'm just trying to understand the design space. It's a pretty compelling trade offer: give up ptrtoint, receive GC.

dan-robertson 9/5/2025|||
I think the answer in your example is that when you cast the int into a pointer, it won’t have any capabilities (the other big Fil-C feature?) and therefore you can’t access memory through it.
pizlonator 9/5/2025||
Yes!
cgh 7 days ago|||
To expand on the capabilities thing: https://fil-c.org/invisicaps_by_example

In particular, check out the sections called "Laundering Pointers As Integers" and "Laundering Integers As Pointers".

AndyKelley 7 days ago||
Thanks!

> This is because the capability is not stored at any addresses that are accessible to the Fil-C program.

How are they stored? Is the GC running in a different process?

charleslmunger 9/5/2025|||
Out of curiosity, does this idiom work in fil-c?

https://github.com/protocolbuffers/protobuf/blob/cb873c8987d...

      // This somewhat silly looking add-and-subtract behavior provides provenance
      // from the original input buffer's pointer. After optimization it produces
      // the same assembly as just casting `(uintptr_t)ptr+input_delta`
      // https://godbolt.org/z/zosG88oPn
      size_t position =
      (uintptr_t)ptr + e->input_delta - (uintptr_t)e->buffer_start;
      return e->buffer_start + position;
It does use the implementation defined behavior that a char pointer + 1 casted to uintptr is the same as casting to uintptr then adding 1.
pizlonator 9/5/2025||
Yeah that should just work

Code that strives to preserve provenance works in Fil-C

charleslmunger 9/5/2025||
Very cool. Hardware asan did not catch the pointer provenance bug in the previous implementation of that code because it relies on tag bits, and the produced pointer was bit-identical to the intended one. It sounds like fil-c would have caught it because the pointer capabilities are stored elsewhere.
kragen 9/5/2025||
What hardware do you need for hardware Asan? I'm so out of the loop that I haven't heard of it before.
saagarjha 9/5/2025||
TBI: https://clang.llvm.org/docs/HardwareAssistedAddressSanitizer...
kragen 9/5/2025||
Thanks!
shim__ 9/5/2025||
Pointers are always integers, which can be interpreted as pointers.
xd1936 9/5/2025||
Great name. Big RUNK energy[1].

1. https://twitter.com/6thgrade4ever/status/1433519577892327424

system2 9/5/2025||
> Fil-C uses a parallel concurrent on-the-fly grey-stack Dijkstra accurate non-moving garbage collector called FUGC

Half of my hair turned white while trying to understand this.

o11c 9/5/2025||
Note that the "safepointing" logic is exactly the same thing that's needed in refcounting to atomic replace a field.

This article glosses over what I consider the hardest part - the enter/exit functionality around native functions may that block (but which must touch the allocator).

pizlonator 9/5/2025|
> Note that the "safepointing" logic is exactly the same thing that's needed in refcounting to atomic replace a field.

No it's not, not even close.

> This article glosses over what I consider the hardest part - the enter/exit functionality around native functions may that block (but which must touch the allocator).

Yeah, that part is hard, and maybe I'll describe it in another post.

Look for `filc_enter` and `filc_exit` in https://github.com/pizlonator/fil-c/blob/deluge/libpas/src/l...

gleenn 9/5/2025||
> The only "pause" threads experience is the callback executed in response to the soft handshake, which does work bounded by that thread's stack height.

So this is probably not great for functional/deeply-recursive code I guess?

pizlonator 9/5/2025|
Meh.

The stack scan is really fast. There's not a lot of logic in there. If you max out the stack height limit (megabytes of stack?) then maybe that means milliseconds of work to scan that stack. That's still not bad.

adastra22 9/5/2025||
That's a very long time. Milliseconds of work is an entire frame update-render cycle in a modern game.
pizlonator 9/5/2025|||
Would your modern game have a stack that is so deep that it brushes up against the stack height limit?

Probably not. Your game would be inches of stack away from crashing

debugnik 9/5/2025||
You're missing the point, they're giving an example of an entire workload that fits into your technique's worst-case overhead. It's could be the right trade-off and rarely be hit, but that worst-case does sound bad.
pizlonator 9/5/2025|||
Stacks tend to be small enough that the cost of scanning them is minuscule.

(I’m not trying to BS my way here - I’m explaining the reason why on the fly GC optimization almost never involves doing stuff about the time it takes to scan stack. It’s just not worth it. If it was, we’d be seeing a lot of stacklet type optimizations.)

torginus 9/5/2025||||
From what he describes, he uses stack maps to tell which stack values are pointers. He can skip over everything that's not a pointer.

On x86_64 you need about 10k function deep stack, all of them with the 14 GPs filled with pointers -to have an 1MB stack.

pizlonator 7 days ago||
To play devil's advocate, the suckiest part about stack scanning is that it's a linked list walk. It's not a linear scan. So it's all pointer chasing. And it's very likely to find previously unmarked pointers, which involves CAS and other work.

(It would be a linear scan if I was just conservatively scanning, but then I'd have other problems.)

This is one of the most counterintuitive parts of GC perf! You'd think that the stack scan had to be a bottleneck, and it might even be one in some corner cases. But it's just not the longest pole in the tent most of the time, because you're so unlikely to actually have a 1MB stack, and programs that do have a 1MB stack tend to also have ginormous heaps (like many gigabytes), which then means that even if the stack scan is a problem it's not the problem.

kragen 7 days ago||
You're writing the compiler, though, so you can define the stack layout. If the stack-scanning linked-list walk were the long pole, it wouldn't be hard to eliminate the pointer chasing: your procedure prologue could add a pointer to each newly pushed stack frame to something like a std::deque, then pop it off in the epilogue.

I don't know, maybe the fact that I'm disagreeing with someone who knows a lot more than I do about the issue should be a warning sign that I'm probably wrong?

adastra22 9/5/2025||||
^ this was the intent of the example.
kristofferc 9/5/2025|||
Actually, it sounds quite ok.
munificent 9/5/2025||||
Games don't tend to have very deep callstacks. And if a game cared about performance also wanted to use GC, it would probably try to run the GC at the end of a frame when there is little on the stack.
pizlonator 9/5/2025|||
Yeah UE GC safepoints at end of tick where there is no stack. That’s a common trick in systems that have both GC and ticking.

To be fair, FUGC doesn’t currently let you do that. The GC runs in a separate thread and soft handshakes at various points, which cause your game thread to react at poll checks and exits that might not be at end of tick.

But I could add a feature that lets you to force handshake responses to be at end of tick! That sounds like a good idea

adastra22 9/5/2025|||
FUGC runs the GC in a separate thread and you don’t have a lot of control over when it interrupts.
kragen 9/5/2025|||
Latency-sensitive programs like games are usually careful to avoid deep recursion.
Quitschquat 9/5/2025||
I love garbage collector design and impl. It’s one of those “go to” thing to do when learning a new language.

Never heard of this one, looking forward to diving in this weekend.

jcul 7 days ago||
This looks pretty amazing, I'm surprised I haven't heard of it before. Looking forward to trying it out. Seems like a good way to verify the safety of some programs, even if not feasible for production due to performance constraints. Though we have sanitizers for tests, this seems more complete.
synergy20 9/5/2025||
will fil-c support ARM soon,for that matter, risc-v? safety is super important for embedded devices running c or c++
dan-robertson 9/5/2025||
I’m curious how expensive the write barrier is in practice? IIRC, it is often write barriers that most affect performance sensitive programs running on non-concurrent garbage collectors (and perhaps safepoints that can cause problems for performance sensitive threads when running with a concurrent gc).
aidenn0 7 days ago|
How can the GC be precise when C and C++ allow casting between pointers and integers?
More comments...