Posted by HunOL 11/1/2025

SQLite concurrency and why you should care about it (jellyfin.org)
382 points | 176 comments
Leherenn 11/1/2025|
A bit off topic, but there seem to be quite a few SQLite experts here.

We're having trouble with memory usage when using SQLite in-memory DBs with "a lot" of inserts and deletes. Think inserting up to 100k rows in 5 minutes, deleting them all after 5 minutes, and doing this for days on end. We see memory usage slowly creeping up over hours/days when doing that.

Any settings that would help with that? It's particularly bad on macOS; we've had instances where we reached 1GB of memory usage according to Activity Monitor after a week or so.

asa400 11/1/2025||
Are you running vacuums at all? Is auto_vacuum enabled?

https://sqlite.org/lang_vacuum.html

porridgeraisin 11/1/2025||
In-memory DBs don't have anything to vacuum.

However... what you (and OP) are looking for might be pragma shrink_memory [1].

[1] https://sqlite.org/pragma.html#pragma_shrink_memory
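A minimal sketch of using that pragma from Python, assuming an in-memory DB with a periodic purge (the table name and purge cadence here are invented):

```python
import sqlite3

# In-memory database that accumulates rows and is purged periodically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

def purge_and_shrink(conn: sqlite3.Connection) -> None:
    # Delete everything, then ask SQLite to release as much heap
    # memory as it can instead of keeping it cached for reuse.
    conn.execute("DELETE FROM events")
    conn.commit()
    conn.execute("PRAGMA shrink_memory")
```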

asa400 11/1/2025||
Ah, you're correct. I read too fast and missed that it was in-memory databases specifically!
pstuart 11/1/2025|||
If you're deleting all rows you can also just drop the table and recreate it.
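A sketch of that alternative (same invented schema as above); dropping the table releases its pages outright rather than leaving them on the freelist:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

# Instead of DELETE FROM events:
conn.execute("DROP TABLE events")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.commit()
```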
kachapopopow 11/1/2025||
Sounds like the normal behavior of buffers being adjusted to better fit the use case; not sure whether that applies to SQLite, or whether SQLite even implements dynamic buffers.
rpcope1 11/1/2025||
Do these guys really not understand that WAL is still single-writer, multi-reader? You can do concurrent (but not parallel) write DML in both the normal and WAL journaling modes. WAL keeps read transactions from being blocked by writers, but you still have to lock it down to a single writer. It would be nice if SQLite3 had full-blown MVCC, but it still works if you understand it.
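To see the single-writer behavior concretely, a small Python sketch (file and table names invented): with WAL enabled, a reader proceeds while a write transaction is open, but a second writer times out with "database is locked".

```python
import sqlite3

DB = "demo.db"  # hypothetical file

setup = sqlite3.connect(DB)
setup.execute("PRAGMA journal_mode=WAL")
setup.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
setup.commit()

# isolation_level=None lets us issue BEGIN/COMMIT ourselves.
writer = sqlite3.connect(DB, isolation_level=None)
writer.execute("BEGIN IMMEDIATE")                  # takes the write lock
writer.execute("INSERT INTO t VALUES (1)")

reader = sqlite3.connect(DB)
# Readers are not blocked in WAL mode; this sees the pre-write snapshot.
print(reader.execute("SELECT count(*) FROM t").fetchone())

second = sqlite3.connect(DB, isolation_level=None, timeout=0.1)
try:
    second.execute("BEGIN IMMEDIATE")              # second writer must wait...
except sqlite3.OperationalError as e:
    print(e)                                       # ...and fails: database is locked

writer.execute("COMMIT")
```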
yread 11/1/2025||
I'm a bit confused. The point of this article is that the author used .NET interceptors and TagWith to tag his EF Core operations so that they set their own busy_timeout (which the EF Core devs think is unnecessary: https://github.com/dotnet/efcore/issues/28135) or take a horrible global lock? No data is presented on how it improved things, if it did, nor is it described which operations were tagged with what. The only interesting thing here is the interceptors, but somehow that isn't discussed in the HN comments at all.
tombert 11/1/2025||
Does this mean I can finally load-balance with multiple Jellyfin instances?

A million years ago, back when I still used Emby, I was annoyed that I couldn't run it across multiple instances in Docker Swarm due to SQLite's locking. It annoyed me enough that I started (but never completed) a driver to change the DB to Postgres [1]. I ended up moving everything over to a single server, which is mostly fine unless I have multiple people transcoding at the same time.

If this is actually fixed then I might have an excuse to rearchitect my home server setup again.

[1] https://github.com/Tombert/embypostgres

Yodel0914 11/1/2025|
Jellyfin has just gone through a massive refactor and pulled all of its data access code into EF Core. This opens the path for supporting different RDBMSs, which I think is next on their list.
dv35z 11/1/2025||
Curious if anyone has strategies on how to perform parallel writes to an SQLite database using Python's `multiprocessing` Pool.

I am using it to loop through a database of 11,000 words, hit an HTTP API (ChatGPT) for each, and generate example sentences for the word. I would love to asynchronously launch these API calls and have them come back and update the database row when ready, but I'm not sure how to handle the database getting hit by all these writes from (as I understand it) multiple instances of the same Python program/function.

zie 11/1/2025||
Technically SQLite can only have one writer at any given moment, but it can appear to work across multiple writers if you let it serialize the calls for you.

By default SQLite will not do what you want out of the box. You have to turn on some feature flags (PRAGMAs) to get it to behave. You need WAL mode, etc. Read:

* https://kerkour.com/sqlite-for-servers
* https://zeroclarkthirty.com/2024-10-19-sqlite-database-is-lo...
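A sketch of the usual settings those links recommend, applied from Python (the specific values are common choices, not requirements):

```python
import sqlite3

conn = sqlite3.connect("app.db", timeout=5.0)  # wait up to 5s on a locked DB
conn.execute("PRAGMA journal_mode=WAL")        # readers no longer block on the writer
conn.execute("PRAGMA synchronous=NORMAL")      # safe in WAL mode, faster than FULL
conn.execute("PRAGMA busy_timeout=5000")       # retry for 5s instead of failing with SQLITE_BUSY
conn.execute("PRAGMA foreign_keys=ON")         # off by default for historical reasons
```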

My larger question is: why multiprocessing? This looks like an IO-heavy workload, not a CPU-bound one, so Python asyncio or Python threads would probably serve you better.

multiprocessing is for when your resource hog is CPU (probably one Python process per core), not IO.
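For the workload described above, a thread-based sketch (the schema and fetch_sentence are stand-ins for the real table and ChatGPT call): threads absorb the slow HTTP, and each worker writes through its own connection with WAL and a generous timeout.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

DB = "words.db"  # hypothetical; assumes a table words(word, sentence)

def fetch_sentence(word: str) -> str:
    # Placeholder for the real HTTP call to the API.
    return f"An example sentence using {word}."

def process(word: str) -> None:
    # One connection per thread; WAL + a timeout make the short
    # waits for the write lock invisible to the application.
    conn = sqlite3.connect(DB, timeout=10.0)
    conn.execute("PRAGMA journal_mode=WAL")
    sentence = fetch_sentence(word)  # the slow part; no lock held here
    conn.execute("UPDATE words SET sentence = ? WHERE word = ?", (sentence, word))
    conn.commit()
    conn.close()

words = [row[0] for row in sqlite3.connect(DB).execute("SELECT word FROM words")]
with ThreadPoolExecutor(max_workers=16) as pool:
    pool.map(process, words)
```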

dv35z 11/1/2025||
I will check out `asyncio` and Python threads. I used multiprocessing as my first foray into asynchronous programming. The previous use case was using Python + multiprocessing to run macOS `say` (via Python subprocess) so I could invoke it 10-20 times simultaneously, rather than waiting for each to complete. I experimented a bit with how many concurrent processes to run (using `time` to clock how long the runs took).
sethev 11/1/2025|||
Have you tried it?

What you're describing sounds like it would work fine to me. The blog post is misleading imho; it implies that SQLite doesn't handle concurrency at all. In reality, you can perform a bunch of writes in parallel and SQLite will handle running them one after the other internally. This works across applications and processes; you just need to use SQLite to interact with the database. The post is also misleading when it implies that the application has to manage access to the database file in some way.

Yes, it's correct that only one of those writes will execute at a time, but it's not like you have to account for that in your code, especially in a batch-style process like you're describing. In your Python code, you'll just update a row and it will look like that happens concurrently with other updates.

I'll bet that your call to ChatGPT will take far longer than updating the row, even accounting for time when the write is waiting for its turn in SQLite.

Use WAL-mode for the best performance (and to reduce SQLITE_BUSY errors).

dv35z 11/1/2025||
I haven't tried it yet. Async processing (and even using SQLite) is new to me, so I'm trying to figure out solution patterns that work for now and that I can keep investing knowledge into for future problems.

I will look into WAL mode. I am enjoying using SQLite (while aware that it's not the solution for everything), and I have several upcoming tasks for which I'm planning to use async stuff; yes, I'm trying to find the right balance in how to handle those async tasks (network-y HTTP calls being different from running `ffmpeg` locally).

crazygringo 11/2/2025|||
It should just work.

If one thread is writing and another thread tries to write, the first thread will hold the file write lock, and the second thread will wait to write until that lock is released.

I've written code using the pattern you describe and it's totally fine.

mickeyp 11/1/2025||
Edit: disregard. I read it as he'd done it and had contention problems.

You can't. You have a single writer; it's one of the many reasons SQLite is terrible for serious work.

You'll need a multiprocessing Queue and a single writer that picks off sentences one by one and commits them.
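A sketch of that pattern (names invented, same hypothetical words table as above): pool workers only compute, and the parent process does all the writing serially as results stream back.

```python
import sqlite3
from multiprocessing import Pool

def make_sentence(word: str) -> tuple[str, str]:
    # Worker: API/CPU work only, no database access.
    return word, f"An example sentence using {word}."

if __name__ == "__main__":
    conn = sqlite3.connect("words.db")  # the single writer lives in the parent
    words = [row[0] for row in conn.execute("SELECT word FROM words")]
    with Pool(processes=8) as pool:
        # Results come back as they finish; all writes happen serially here.
        for word, sentence in pool.imap_unordered(make_sentence, words):
            conn.execute("UPDATE words SET sentence = ? WHERE word = ?",
                         (sentence, word))
    conn.commit()
```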

hruk 11/1/2025||
This is just untrue - the naive implementation (make the API call, write a single row to the db) will work fine, as transactions are quite fast on modern hardware.

What do you consider "serious" work? We've served a SaaS product from SQLite (roughly 300-500 queries per second at peak) for several years without much pain. Plus, it's not like PG and MySQL are pain-free, either - they all have their quirks.

mickeyp 11/1/2025||
Edit: disregard. I read it as he'd done it and had contention problems.

I mean, it's not fine if he's got lock contention from BUSY signals, as he implies, now is it? Much of his trouble will stem from transactions blocking each other; maybe they are long-lived, maybe they are not. And those 300-500 queries: are they writes or reads? Because reads are not a problem.

hruk 11/1/2025||
Roughly 80/20 read to write. On the instance's gp3 EBS volume (which is pretty slow), we've pushed ~700 write transactions per second without much problem.
mickeyp 11/1/2025||
For small OLTP workloads the locking is not going to be a problem. But anything that holds the write lock for even a measurable fraction of a second will gum things up fast. Transactions that need it for many seconds? You'll quickly be dead in the water.
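The practical upshot, sketched in Python (slow_work and the jobs table are invented): do the slow part outside the transaction so the write lock is held only for the UPDATE itself.

```python
import sqlite3

conn = sqlite3.connect("app.db", isolation_level=None, timeout=5.0)

def slow_work() -> str:
    return "result"  # placeholder for the expensive part (HTTP call, crunching, ...)

# Bad: the write lock is held for the whole slow computation.
conn.execute("BEGIN IMMEDIATE")
result = slow_work()  # seconds pass; every other writer is stuck behind us
conn.execute("UPDATE jobs SET result = ? WHERE id = 1", (result,))
conn.execute("COMMIT")

# Better: compute first, then hold the lock only for milliseconds.
result = slow_work()
conn.execute("BEGIN IMMEDIATE")
conn.execute("UPDATE jobs SET result = ? WHERE id = 1", (result,))
conn.execute("COMMIT")
```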
slashdave 11/1/2025||
I am a little confused, but maybe I am missing some context? Wouldn't using a proper database be a lot easier than all of this transaction hacking? I mean, is Postgres that hard to use?
fitsumbelay 11/1/2025||
Very helpful and a model for how technical posts should be written: clarity, concision, anchor links that summarize the top lines. It was a pleasure to read.