Posted by ajayvk 10/27/2024
Using SQLite for static content might have other benefits, but I don't think this is the main one.
I've tried various solutions like litestream and even xdelta3 (which generates patches in excess of the actual changeset size), but I haven't found a solution I'm confident in other than backing up complete snapshots.
I can see how you'd use it for replication though.
Seeing as you need some kind of layer between the web and SQLite anyway, you might as well keep a layer between the web and the FS, which nets you most or all of the benefits.
This used to be done hackily by overwriting symlinks, but now there's https://manpages.debian.org/testing/manpages-dev/renameat2.2...
Use git to manage versions of the static files.
Use [git worktree][1] to have multiple separate working trees in different directories.
Use [openat()][2] in your app whenever serving a request for a static file. On each request, open the directory, i.e. `/var/www/mysite.com`, and then use `openat()` with the directory's file descriptor when looking up any files under that directory.
`/var/www/mysite.com` is a symbolic link to some working tree.
When modifying the site, make modifications to the git repository. Then check out the modified working tree into a new directory.
Create a symbolic link to the new directory and `mv` (rename) it to `/var/www/mysite.com`.
Then some time later, `rm -r` the old working tree.
[1]: https://git-scm.com/docs/git-worktree
[2]: https://pubs.opengroup.org/onlinepubs/9799919799/functions/o...
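The recipe above can be sketched in Python (a minimal sketch -- the paths and function names are hypothetical; `os.open` with `dir_fd` is the `openat()` lookup, and the rename over the live symlink is the atomic swap):

```python
import os

def serve_file(site_link: str, relative_path: str) -> bytes:
    """Open the directory the symlink currently points at, then resolve
    the requested file relative to that directory fd (openat()), so a
    concurrent deploy can't serve a mix of old and new files."""
    dir_fd = os.open(site_link, os.O_RDONLY | os.O_DIRECTORY)
    try:
        fd = os.open(relative_path, os.O_RDONLY, dir_fd=dir_fd)
        with os.fdopen(fd, "rb") as f:
            return f.read()
    finally:
        os.close(dir_fd)

def publish(site_link: str, new_worktree: str) -> None:
    """Create a symlink to the new worktree under a temporary name,
    then rename() it over the live symlink -- the swap is atomic."""
    tmp = site_link + ".tmp"
    os.symlink(new_worktree, tmp)
    os.rename(tmp, site_link)
```

Requests that already opened the old directory fd keep serving the old tree, which is why the later `rm -r` of the stale worktree should wait a while.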
Bad match -- git stores every version in the history forever. Do you really need every revision of a binary file, completely duplicated? The bigger the files, the worse it gets.
[1]: https://www.reddit.com/r/git/comments/wk2kqy/delete_files_un...
1. I should make blog engine using flat files. That way I can diff the plain text.
2. But flat files are hard to read/write structured data from, so I should use SQLite instead.
3. But SQLite is hard to diff. GOTO 1.
There's no reason you can't do the same but including the content too.
That way you're always committing a text-friendly version of the database rather than the database's binary encoding, and your diffs will work great.
diff <(sqlite3 db 'select text from posts where id = 1') <(sqlite3 db 'select text from posts where id = 2')
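Producing that text-friendly version can come straight from the stdlib -- a minimal sketch using `sqlite3.Connection.iterdump()` (file names are hypothetical):

```python
import sqlite3

def dump_to_sql(db_path: str, out_path: str) -> None:
    """Write the database out as SQL text, so git diffs show row-level
    changes instead of opaque binary deltas."""
    con = sqlite3.connect(db_path)
    try:
        with open(out_path, "w") as f:
            for line in con.iterdump():  # yields CREATE/INSERT statements
                f.write(line + "\n")
    finally:
        con.close()
```

Committing the `.sql` dump (and optionally regenerating the binary database from it on deploy) keeps the repository history readable.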
On the other hand, the state of the art when it comes to performance is zero-copy from disk to NIC using io_uring and DMA, combined with NICs that support TLS and checksum offloading, with separate header and payload buffers that can be filled independently by the OS and user space.
I wonder if the authors of these projects ask themselves: what reasons are there to not do it like this? This thread has a few answers.
I am not claiming a SQLite database is always the right approach for serving static files. In my particular use case for Clace, I found a database to work better.
Otherwise the file system is far superior for every "benefit" they list, far easier to use with other software, and more robust over time (nothing to change/break).
From what I remember, this was an atomic operation, and if there was a download in progress it would continue using the old file, because the data was still on disk and the filename was simply a pointer to the inode.
This may behave differently on other file systems; this was done on an old Linux server with ext3.
Seems like a simpler solution than using a db
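A minimal sketch of that pattern, assuming POSIX rename semantics (names are hypothetical): write the new version to a temp file and rename it into place, and any already-open descriptor keeps reading the old data.

```python
import os

def replace_file(path: str, new_content: bytes) -> None:
    """Write the new version under a temporary name, fsync it, then
    rename() it into place. The rename is atomic; an in-progress
    download holding an fd to the old file keeps reading the old
    inode's data until it closes."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(new_content)
        f.flush()
        os.fsync(f.fileno())
    os.rename(tmp, path)
```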
I think the bigger benefit, though, is the increased read performance on many small files (saving system call overhead). To what extent that applies to static files that a smart server might keep in cache, I don't know.
Something I want to try is using SQLite as my filesystem index, storing just content hashes that point into an S3-compatible object store. You get the atomicity and rollback-ability and all, but you also get the massive parallelization and multi-region support of an object store.
Edit: I googled it again and found that multiple processes can read SQLite concurrently, so that shouldn't be a problem.
https://stackoverflow.com/questions/4060772/sqlite-concurren...
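A minimal sketch of such an index, assuming a table mapping paths to SHA-256 keys (the object-store upload itself is elided, and all names here are hypothetical):

```python
import hashlib
import sqlite3

def init_index(con: sqlite3.Connection) -> None:
    con.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
    )

def put_file(con: sqlite3.Connection, path: str, content: bytes) -> str:
    """Hash the content and record path -> hash. The bytes themselves
    would be uploaded to the object store under key=digest, e.g. via
    an S3 put_object call (omitted to keep the sketch self-contained)."""
    digest = hashlib.sha256(content).hexdigest()
    with con:  # one transaction: publish or rollback is just a row flip
        con.execute(
            "INSERT OR REPLACE INTO files (path, sha256) VALUES (?, ?)",
            (path, digest),
        )
    return digest
```

Since the object-store blobs are content-addressed and immutable, rolling back a deploy only means restoring the old rows in the SQLite index.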