Top
Best
New

Posted by robin_reala 9/13/2025

RIP pthread_cancel(eissing.org)
238 points | 106 commentspage 2
hacker_homie 9/14/2025|
what's old is new again, I loved java in the early 2000's trying to remotely stop a thread

Thread.destroy() Thread.stop() Thread.suspend()

so much potential for corrupted state.

Aardwolf 9/13/2025||
Maybe this is naive, but could there just be some amount of worker threads that run forever, wait for and take jobs when needed, and message when the jobs are done? Don't need to be canceled, don't block
danappelxx 9/13/2025||
If the DNS resolution call blocks the thread, then you need N worker threads to perform N DNS calls. Threads aren’t free, so this is suboptimal. OTOH some thread pools e.g. libdispatch on Apple operating systems will spawn new threads on demand to prevent starvation, so this _can_ be viable. Though of course this can lead to thread explosion which may be even more problematic depending on the use case. In libcurl’s situation, spawning a million threads is probably even worse than a memory leak, which is worse than long timeouts.

In general, what you really want is for the API call to be nonblocking so you’re not forced to burn a thread.

variadix 9/13/2025|||
Yeah, I’m not sure I see the problem (other than that threads are more expensive than e.g. file descriptors, but this is a moot point without a better API). You define how many requests in flight you want to allow and that sets the cap on how many worker threads you spawn/use, you could also support an unbounded number in flight this way by lazily spawning worker threads per requests. Cancellation/kill interfaces for multithreading are pretty much always a footgun. Even for multiprocessing on modern machines, if you’re doing something non-trivial and decide to use SIGKILL to terminate a worker process, it’s easy to leave e.g. file system resources in a bad state.
ComputerGuru 9/13/2025||
This is, essentially, what the previous (largely pathetic) excuse for true asynchronous I/O on Linux did with the libc aio(7) interface to essentially fake support for truly asynchronous file IO. It wasn’t great.
Someone 9/13/2025||
> Then it needs to sort them if there is more than one address. And in order to do that it needs to read /etc/gai.conf

I don’t see why glibc would have to do that inside a call to getaddrinfo. can’t it do that once at library initialization? If it has to react to changes to that file while a process is running, couldn’t it have a separate thread for polling that file for changes, or use inotify for a separate thread to be called when it changes? Swapping in the new config atomically might be problematic, but I would think that is solvable.

Even ignoring the issue mentioned it seems wasteful to open, parse, and close that file repeatedly.

loeg 9/13/2025||
I think the libc people might argue this level of functionality is just outside the scope of libc. (Arguably, it is a mistake for DNS to be part of libc, given how complicated it is.)
Someone 9/16/2025|||
https://sourceware.org/glibc/ says

“The GNU C Library is designed to be a backwards compatible, portable, and high performance ISO C library. It aims to follow all relevant standards including ISO C11, POSIX.1-2008, and IEEE 754-2008.”

⇒ I don’t think they make that argument.

loeg 9/16/2025||
The standards make no requirement that an implementation be good, performant, or even useful.
ComputerGuru 9/13/2025|||
To be sure, complexity isn’t the determinator for whether something is or isn’t in scope for libc though.
loeg 9/14/2025||
Historically libcs have been leery of imposing threading on otherwise singlethreaded applications; and for similar reasons, try to minimize startup costs.
cesarb 9/13/2025|||
> I don’t see why glibc would have to do that inside a call to getaddrinfo. can’t it do that once at library initialization?

If it were a library dedicated to DNS, sure, but glibc is used by nearly every process in the system, including many which will never call getaddrinfo.

NewJazz 9/13/2025||
You want libc to start a thread whenever it is loaded?
Someone 9/14/2025||
If it’s the only alternative to being broken: yes.

It could read and parse the file the first time a thread gets created, too.

A cheaper alternative is to check the modification time of the config file, and only reparse it in pthread_cancel when that changes. That doesn’t 10% fix the problem in theory, but would do it in practice.

rurban 9/14/2025||
Just use c-ares. Threads with signals are evil
whatifitoldyou 9/14/2025||
pthread_cancel is never the answer. Abrupt thread cancelation is always a risk for leaks and/or even worse, deadlocks. No matter of it was blocked in one libc function. The function is not one instruction you don't know what the poor thread was doing when it was killed.
jart 9/13/2025||
Why can't they help fix the C library in question? Cancelation is really tricky to implement for the C library author. It's one of those concepts that, like fork, has implications that pervade everything. Please give your C library maintainers a little leeway if they get cancelation wrong. Especially if it's just a memory leak.
RedShift1 9/13/2025|
I'm betting this code is so old and its behavior so ingrained everywhere else that nobody dares touching it.
jart 9/14/2025||
No it sounds like they just need to add a pthread_cleanup_push() call somewhere in the getaddrinfo() implementation.

C libraries are not black magic. Nor are they holy code. We needn't fear them. It's the simplest part of the software stack.

nly 9/13/2025||
Why is running the DNS resolution thread a problem? It should be dequeuing resolution requests and pushing responses and sleeping when there is nothing to do

When someone kills off the curl context surely you simply set a suicide flag on the thread and wake it up so it can be joined.

foota 9/13/2025||
The thread started sounds like it's single use, not a thread handling requests in a loop. Anyway, a single thread handling requests in a loop would serialize these DNS lookups which if they're hanging would be problematic.
loeg 9/13/2025||
Yes, but why? As GP notes, the thread doesn't have to be single-use.
rwmj 9/13/2025||
One problem may be that fork() kills background threads, so now any program that uses libcurl + fork has to have a new API to restart the DNS thread (or use posix_atfork which is a big PITA), and that might break existing programs using curl.
ComputerGuru 9/13/2025|||
It’s not too much of an exaggeration to say that everything about using fork() instead of vfork() plus exec() is essentially fundamentally broken in modern osdev without a whole stack of hacks to try and patch individual issues one-by-one.
EPWN3D 9/14/2025||
It's not an exaggeration in any sense. fork(2) basically cannot be done correctly in modern userspace stacks.
loeg 9/13/2025|||
A surmountable problem, sure.
rwmj 9/13/2025||
Sometimes. To give one counterexample, golang doesn't have a way to restart the threads it uses for I/O (apparently a decision the golang developers made), so if you're embedding golang code in another binary, it better not call fork. (The reason for this warning: https://gitlab.com/nbdkit/nbdkit/-/commit/2589e6da40939af9ae...)
loeg 9/14/2025||
This is a policy choice of Golang, but a C library like Curl (the topic of this thread) is not constrained by the policy choices of the Go developers. Curl could use MADV_WIPEONFORK or other primitives to restart its DNS thread automatically.
enva2712 9/14/2025||
i’ve run into this issue with pthread_cancel

it makes cleanup logic convoluted when you can’t easily pthread_cleanup_push, forcing you to either block on the join or signal cancellation and detach

gary_0 9/13/2025|
[deleted]
okl 9/13/2025|
https://c-ares.org/