Posted by signa11 7 days ago
Fact is that by the time small allocations are failing you are almost no better off handling the null than you would be handling segfaults and the sigterms from the killer.
Often for servers performance will fall off a cliff long before the oom killer is needed, too.
Has malloc ever returned zero since then? Or has somebody undone this, erm, feature at times?
I want to agree with the locality of errors argument, and while in simple cases, yes, it holds true, it isn't necessarily true. If we don't overcommit, the allocation that kills us is simply the one that fails. Whether this allocation is the problematic one is a different question: if we have a slow leak that, every 10k allocation allocs and leaks, we're probably (9999 / 10k, assuming spherical allocations) going to fail on one that isn't the problem. We get about as much info as the oom-killer would have, anyways: this program is allocating too much.
When overcommit is enabled, the kernel is allowed to engage in fractional reserve banking.
which is elegant and completely legitimate
A forked process would assume memory is already allocated, but I guess it would fail when writing to it as if vm.overcommit is set to 0 or 1.
On the other hand, if you have a pretty good idea that the child will finish persisting and exit before the cache is fully rewritten, double is too much. There's not really a mechanism for that though. Even if you could set an optimistic multiplier for multiple mapped CoW pages, you're back to demand paging failures --- although maybe it's still worthwhile.
It's wrong 99.99999% of the time. Because alternative is either "make it take double and waste half the RAM" or "write in memory data in a way that allows for snapshotting, throwing a bunch of performance into the trash"
the author says it's bad design, but has entirely missed WHY it wants overcommit
POSIX not so much.
One would be where you have frequent large mallocs which get freed fast. Another would be where you have written a garbage collected language in C/C++.
When calling free, delete or letting your GC do that for you the memory isn't actually given back immediately, glibc has malloc_trim(0) for that, which tries it's best to give back as much unused memory to the OS as possible.
Then you can retry your call to malloc and see if it fails and then just let your supervisor restart your service/host/whatever or not.
Two reasons why overcommit is a good idea:
- It lets you reserve memory and use the dirtying of that memory to be the thing that commits it. Some algorithms and data structures rely on this strongly (i.e. you would have to use a significantly different algorithm, which is demonstrably slower or more memory intensive, if you couldn't rely on overcommit).
- Many applications have no story for out-of-memory other halting. You can scream and yell at them to do better, but that won't help, because those apps that find themselves in that supposedly-bad situation ended up there for complex and well-considered reasons. My favorite: having complex OOM error handling paths is the worst kind of attack surface, since it's hard to get test coverage for it. So, it's better to just have the program killed instead, because that nixes the untested code path. For those programs, there's zero value in having the memory allocator be able to report OOM conditions other than by asserting in prod that mmap/madvise always succeed, which then means that the value of not overcommitting is much smaller.
Are there server apps where the value of gracefully handling out of memory errors outweighs the perf benefits of overcommit and the attack surface mitigation of halting on OOM? Yeah! But I bet that not all server apps fall into that bucket
I use zram exclusively as swap, and while doing memory intensive tasks in the background, it's usually something interactive like the browser that ends up being killed. I would turn on swap that's isn't zram if this happened more often. I might also turn off overcommit if apps handled OOM situations gracefully, but as you say, the complexity might not be worth it.