Posted by ferriswil 14 hours ago
I think (respectfully) the LLM that probably wrote this overshot the mark here because busy-polling a select does not actually sound better to me than a "kernel file watcher".
This reminds me of the teenager who told her dad that she was just a tiny little bit pregnant.
I wonder if using a tiny Redis instance, or even something like LevelDB would be even more efficient.
(read that in the way of "think of the children!")
It does seem weird though even for sqlite. I wonder how oban does it. I also wonder if OP knows oban can run on sqlite.
Love Fly.
And if you are making changes, don't you have to poll regardless after the file watcher wakes you?
For WAL mode, SQLite can probably satisfy this query just by inspecting some shared memory. But it is busy waiting, sure.
This has a thread running in the background trying to catch changes made by other connections, potentially (I'm not sure here, but I suspect as much) in different processes that are modifying the same database.
On my crappy old i5 with the db file on /dev/shm, it can do ~150k writes a second with the wal_hook callback called on every write. And this is using JS bindings to C++, so it has some unnecessary overhead.
k3s has been running on my home server for about three years now (using the default SQLite backend), and there doesn't seem to be excessive CPU usage despite dozens of watches existing in the simulated etcd. Of course, this doesn't say much about Honker, but it's nonetheless worth pointing out that sometimes the choice of database forces one towards a certain design.
[1] https://github.com/k3s-io/kine/blob/648a2daa/pkg/logstructur...
I had a manual fs polling thing a while back. It was ugly (low time budget, didn't wanna mess with the native watchers), just scanned the whole thing once per second. It averaged out to like 0.3% CPU.
Not elegant, but acceptable for my purposes! (Small-ish directory, and "ping me within a second or two" was realtime enough for this use case.)
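For anyone curious what that kind of "scan everything once per second" poller looks like, here is a minimal sketch in Python. It is not the code described above; `snapshot`, `diff`, and `watch` are hypothetical names, and comparing (mtime, size) pairs is just one assumed way to detect changes.

```python
import os
import time


def snapshot(root):
    """Map every file under root to its (mtime_ns, size) pair."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except FileNotFoundError:
                continue  # file vanished between listing and stat
            state[full] = (st.st_mtime_ns, st.st_size)
    return state


def diff(old, new):
    """Return sorted lists of added, removed, and modified paths."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, removed, modified


def watch(root, on_change, interval=1.0):
    """Rescan root every `interval` seconds and report differences (runs forever)."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        changes = diff(previous, current)
        if any(changes):
            on_change(*changes)
        previous = current
```

The cost scales with the number of files rather than the number of changes, which is why it only makes sense for a small-ish directory and a relaxed latency target.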
Wake ups are death for mobile form factors, even if not really doing much work.
Either way this does seem like a very large overhead due to the fact that there's just no other way to do it without a deeper kernel integration which might be outside the scope of what sqlite is trying to do.
For the low, low cost of $1 per minute, you can also lease a supercar.
Are they joking? SQLite is usually used for single-process (multiple threads) applications. The proper way to communicate between threads/processes is a ring buffer, where you allocate structs (allocation typically is incrementing a pointer), and futex/eventfd for notifications (+ some spinlocking to avoid going to the kernel when tasks arrive quickly). Why do you need Redis for that? If you need persistent tasks, then you can store them in a table and still use futex for notifications. This polling is inefficient, and they should not make it a library, which will cause other lazy developers to add it to their apps.
> honker polls SQLite’s PRAGMA data_version every millisecond. That’s a monotonic counter SQLite increments on every commit from any connection, journal mode, or process — a ~3 µs read for a precise wake signal
That's 3 ms per second = 0.3% CPU time wasted for every waiting thread.
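Spelling out that arithmetic, using only the figures from the quote (1000 polls per second, roughly 3 µs per poll). The function name is made up for illustration, and the result assumes each poll really costs the quoted 3 µs of CPU and ignores the cost of the wake-up itself.

```python
def polling_cpu_fraction(polls_per_second: float, seconds_per_poll: float) -> float:
    """Fraction of one core spent polling: (polls per second) * (CPU seconds per poll)."""
    return polls_per_second * seconds_per_poll


# Figures from the quoted description: one poll per millisecond, ~3 microseconds each.
fraction = polling_cpu_fraction(1000, 3e-6)
print(f"{fraction * 1000:.1f} ms of CPU per second = {fraction:.1%} of one core")
```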
Like Electron, this feels like it was written by a web developer and not a real programmer.
I suspect that's actually "per process, per database (usually 1)", and not based on number of threads or tables. `data_version` semantics mean there's no need for more than one connection polling it, and it's being used as a relatively lightweight "DB has changed, check queues" check (that's pretty much its whole purpose).
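A rough sketch of that "one connection watches `data_version`" idea, using Python's built-in `sqlite3` module. This is not honker's code; the helper names and the `jobs` table are invented for the example. It relies on the documented behaviour that `PRAGMA data_version` changes when another connection commits, and stays the same for commits made on the polling connection itself.

```python
import os
import sqlite3
import tempfile


def read_data_version(conn):
    """Return this connection's current PRAGMA data_version value."""
    return conn.execute("PRAGMA data_version").fetchone()[0]


def changed_since(conn, last_seen):
    """Return (changed, current_version) relative to the last value seen."""
    current = read_data_version(conn)
    return current != last_seen, current


def demo():
    path = os.path.join(tempfile.mkdtemp(), "queue.db")
    # isolation_level=None puts the connections in autocommit mode.
    watcher = sqlite3.connect(path, isolation_level=None)
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT)")

    last = read_data_version(watcher)
    idle_changed, last = changed_since(watcher, last)   # nothing committed in between
    writer.execute("INSERT INTO jobs (payload) VALUES ('hello')")
    busy_changed, last = changed_since(watcher, last)   # another connection committed

    watcher.close()
    writer.close()
    return idle_changed, busy_changed
```

A real watcher would call `changed_since` in a loop on a single dedicated connection and only run the more expensive "check queues" query when it reports a change.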
Also I believe this is mostly intended for multi-process use, e.g. out-of-process workers, so an in-process dirty tracker (e.g. just check after insert/update/delete) isn't sufficient.
So I do think it's somewhat crazy, but it is at least very simple. fsnotify-like monitoring seems like a fairly obvious improvement tho, not sure why that isn't part of it. Maybe it's slower? I haven't tried to do anything actually-performant-or-reliable with fs notifications, dunno what dragons lie in wait.
Key difference vs SQL polling is that we’re touching metadata instead of data pages. I have work in progress to make this work without any polling (inotify, kqueue, mmap’d shm file check) after the original stat(2) direction proved unreliable if lightweight.
Would love your feedback and/or contributions in the repo - still figuring out the end shape.
> How it works: honker polls SQLite’s PRAGMA data_version every millisecond. That’s a monotonic counter SQLite increments on every commit from any connection, journal mode, or process — a ~3 µs read for a precise wake signal.
BEGIN IMMEDIATE TRANSACTION; ROLLBACK;
Otherwise the new changes weren't guaranteed to be visible to the process. I'm sure there's a more targeted approach that would work instead - maybe flock on a particular byte in the `-shm` file.
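For reference, the mechanics of that `BEGIN IMMEDIATE TRANSACTION; ROLLBACK;` pair look roughly like this from Python's `sqlite3` module. `force_refresh` and the `events` table are hypothetical names. The sketch only shows how to issue the pair; it does not demonstrate that the refresh is required, since in a simple setup like this one the reader would see the row anyway.

```python
import os
import sqlite3
import tempfile


def force_refresh(conn):
    """Briefly take the write lock, then release it without changing anything.

    Raises sqlite3.OperationalError if another connection holds the write
    lock for longer than this connection's busy timeout.
    """
    conn.execute("BEGIN IMMEDIATE TRANSACTION")
    conn.execute("ROLLBACK")


def demo():
    path = os.path.join(tempfile.mkdtemp(), "app.db")
    # Autocommit mode, so the explicit BEGIN/ROLLBACK is not wrapped by the driver.
    writer = sqlite3.connect(path, isolation_level=None)
    reader = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE events (id INTEGER PRIMARY KEY)")
    writer.execute("INSERT INTO events DEFAULT VALUES")

    force_refresh(reader)  # the empty write transaction from the comment above
    count = reader.execute("SELECT COUNT(*) FROM events").fetchone()[0]

    reader.close()
    writer.close()
    return count
```

Note that this contends for the write lock, so running it on every wake-up competes with real writers.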