Litestream VFS - Hacker News

Posted by emschwartz 12/11/2025

Litestream VFS(fly.io)

377 points | 83 comments

psanford 12/11/2025|

Oh hey this is using my go sqlite vfs module[0]. I love it when I find out some code I wrote is useful to others!

[0]: https://github.com/psanford/sqlite3vfs

benbjohnson 12/11/2025||

It worked great! Thanks for your work on it.

fragmede 12/11/2025||

that's all we really want in life.

bencornia 12/11/2025||

> What we’re doing here is instantaneous point-in-time recovery (PITR), expressed simply in SQL and SQLite pragmas.

> Ever wanted to do a quick query against a prod dataset, but didn’t want to shell into a prod server and fumble with the sqlite3 terminal command like a hacker in an 80s movie? Or needed to do a quick sanity check against yesterday’s data, but without doing a full database restore? Litestream VFS makes that easy. I’m so psyched about how it turned out.

Man this is cool. I love the unix ethos of Litestream's design. SQLite works as normal and Litestream operates transparently on that process.

simonw 12/11/2025||

This is such a clean interface design:

  export LITESTREAM_REPLICA_URL="s3://my-bucket/my.db"
  export AWS_ACCESS_KEY_ID="your-access-key"
  export AWS_SECRET_ACCESS_KEY="your-secret-key"

  sqlite3

  .load litestream.so
  .open file:///my.db?vfs=litestream
  PRAGMA litestream_time = '5 minutes ago'; 
  select * from sandwich_ratings limit 3;

zackify 12/11/2025|

For macos users,

brew install sqlite3, then change the bottom part:

  /opt/homebrew/opt/sqlite/bin/sqlite3
  .load litestream sqlite3_litestreamvfs_init
  .open file:///my.db?vfs=litestream

you have to manually pass in the init function name

zackify 12/11/2025||

This is great... just got it working using bun:sqlite! Just need to have "LITESTREAM_REPLICA_URL" and the key id and secret env vars set when running the script.

  import { Database } from "bun:sqlite";
  Database.setCustomSQLite("/opt/homebrew/opt/sqlite/lib/libsqlite3.dylib");

  // Load extension first with a temp db
  const temp = new Database(":memory:");
  temp.loadExtension("/path/to/litestream.dylib", "sqlite3_litestreamvfs_init");

  // Now open with litestream VFS
  const db = new Database("file:my.db?vfs=litestream");

  const fruits = db.query("SELECT * FROM fruits;").all();
  console.log(fruits);

koeng 12/12/2025||

Neat! Would this mainly be used for JavaScript servers running bun (ie, not end users)?

zackify 12/14/2025||

Correct in this case if you wanted many small edge nodes on fly.io or elsewhere to do read queries

seigel 12/11/2025|||

Cool that you got this to work! How did you get the "dylib" location or file.

ricardobeat 12/12/2025|||

    brew list sqlite

gives you the installed path, works for any formula.

seigel 12/12/2025||

Neat. What I wasn't able to find was the dynamic library, just the `litestream` executable. Was there some secret you used for the litestream dylib? Thanks in advance!

ricardobeat 12/12/2025|||

Looks like you need to build it yourself: https://litestream.io/guides/vfs/

seigel 12/12/2025||

Got it....you beat me to it. I had just figured that part out!! I didn't understand that part. ALSO, and this might be helpful for others, in the releases section, if you expand the little blue link at the bottom that says something like "see the other 35 assets", then the VFS extension will be downloadable from there!

Thanks for humouring me! :D

zackify 12/12/2025|||

The litestream one, from the litestream github releases page!

seigel 12/12/2025||

For anyone following this example, one thing to note that I figured out the hard way is that this line from @zackify SHOULD BE HIGHTLIGHTED...

* "Just need to have "LITESTREAM_REPLICA_URL" and the key id and secret env vars set when running the script"

... and that attempting to load the variables using `dotenv` will not work!!

indigodaddy 12/11/2025||

This is awesome. Especially for sqlite db’s that are read only from a website user perspective. My use case would be an sqlite DB that would live on S3 and get updated by cron or some other task runner/automation means (eg some other facility independent of the website that is using the db), and the website would use litestream vfs and just make use of that “read only” (the website will never change or modify the db) db straightup. Can it be used in this described fashion? Also/if so, how will litestream vfs react to the remote db updating itself within this scenario? Will it be cool with that? Also I’m assuming there is or will be Python modules/integration for doing the needful around Litestream VFS?

Currently on this app, I have the Python/flask app just refreshing the sqlite db from a Google spreadsheet as the auth source (via dataframe then convert to sqlite) for the sqlite db on a daily scheduled basis done within the app.

For reference this is the current app: (yes the app is kinda shite but I’m just a sysadmin trying to learn Python!) https://github.com/jgbrwn/my-upc/blob/main/app.py

Eikon 12/11/2025||

Perhaps you would find ZeroFS [0] useful. It works great out the box with SQLite [1] and only depends on S3 as an external service.

[0] https://github.com/Barre/ZeroFS

[1] https://github.com/Barre/ZeroFS?#sqlite-performance

indigodaddy 12/11/2025|||

Forgot to say, thanks for posting this, looks quite useful for various projects that have been on my mind. At one point I was looking for a git vfs for Python (I did find one for caddy static serving specifically, but I needed it for Python) but couldn’t find much that wasn’t abandoned—- an s3 vfs might do the trick for a lot of use cases though.

Eikon 12/11/2025||

My pleasure, this project has been a lot of fun :).

indigodaddy 12/11/2025|||

Yes this approach might be better. Sounds like litestream vfs won’t really do what I wanted in my described scenario

benbjohnson 12/11/2025||

Author here. Litestream VFS will automatically poll for new back up data every second so it keeps itself up to date with any changes made by the original database.

You don't need any additional code (Python or otherwise) to use the VFS. It will work on the SQLite CLI as is.

indigodaddy 12/11/2025||

Ok, yeah I think litestream vfs isn’t suitable to do as I described in the intended scenario.

ncruces 12/11/2025||

I also have this implemented and ready to go in my Go SQLite driver: https://github.com/ncruces/go-sqlite3/blob/main/litestream/e...

Slightly different API (programmatic, no env variables, works with as many databases as you may want), but otherwise, everything should work.

Note that PRAGMA litestream_time is per connection, so some care is necessary when using a connection pool.

apitman 12/12/2025|

ncruces/go-sqlite3 was the first thing I thought of when I saw .load litestream.so. That's awesome that you've implemented this. Was it a big lift to make it work with wasm?

ncruces 12/12/2025||

Not really, no.

Litestream is made in Go, and I have the VFS API well covered.

The bigger issue is Litestream is not really meant to be used as a library.

It depends on the modernc driver, and some bits on mattn, APIs not very stable, just got updated to require Go 1.25 when 1.24 is still a supported version, brings a bunch of non optional dependencies for monitoring, etc.

Eventually I had to fork to make these more manageable. I'd still hope Litestream can be made more modular, and I can depend directly on upstream.

dkam 12/12/2025||

Love the progress being made here. I've been really enjoying learning about another embedded database - DuckDB - the OLAP to SQLite's OLTP.

DuckDB has a lakehouse extension called "DuckLake" which generates "snapshots" for every transaction and lets you "time travel" through your database. Feels kind of analogous to LiteStream VFS PITR - but it's fascinating to see the nomenclature used for similar features. The OLTP world calls it Point In Time Recovery, while in the OLAP/data lake world, they call it Time Travel and it feels like a first-class feature.

In SQLite Litestream VFS, you use `PRAGMA litestream_time = ‘5 minutes ago’` ( or a timestamp ) - and in DuckLake, you use `SELECT * FROM tbl AT (VERSION => 3);` ( or a time stamp ).

DuckDB (unlike SQLite) doesn't allow other processes to read while one process is writing to the same file - all processes get locked out during writes. DuckLake solves this by using an external catalog database (PostgreSQL, MySQL, or SQLite) to coordinate concurrent access across multiple processes, while storing the actual data as Parquet files. It's a clever architecture for "multiplayer DuckDB.” - deliciously dependent on an OLTP to manage their distributed multiple user OLAP. Delta Lake uses uploaded JSON files to manage the metadata skipping the OLTP.

Another interesting comparison is the Parquet files used in the OLAP world - they’re immutable, column oriented and contain summaries of the content in the footers. LTX seems analogous - they’re immutable, stored on shared storage s3, allowing multiple database readers. No doubt they’re row oriented, being from the OLTP world.

Parquet files (in DuckLake) can be "merged" together - with DuckLake tracking this in its PostgreSQL/SQLite catalog - and in SQLite Litestream, the LTX files get “compacted” by the Litestream daemon, and read by the LitestreamVFS client. They both use range requests on s3 to retrieve the headers so they can efficiently download only the needed pages.

Both worlds are converging on immutable files hosted on shared storage + metadata + compaction for handling versioned data.

I'd love to see more cross-pollination between these projects!

chickensong 12/11/2025||

As a sandwich enthusiast, I would like to know more about these sandwich ratings.

skybrian 12/11/2025||

This sounds pretty cool, but I’m confused about what software being announced. Is there a new release of Litestream?

benbjohnson 12/11/2025||

Author here. Yes, Litestream v0.5.3 has been released with a new read-only VFS option: https://github.com/benbjohnson/litestream/releases/tag/v0.5....

zackify 12/11/2025||

so how would i connect from a separate machine, i can't figure out from the post or release notes or current litestream website docs, how would i use the extension to do that?

Edit:

need to set LITESTREAM_ACCESS_KEY_ID, LITESTREAM_SECRET_ACCESS_KEY, LITESTREAM_REPLICA_URL

then the module works

wim 12/11/2025||

I noticed the new release also includes "directory replication support for multi-tenant databases", great addition as well!

darintay 12/11/2025|

Does this work with sqlite extensions? If I were using e.g. sqlite-vec or -vss or some other vector search extension would I be able to use litestream to back it up to S3 live, and then litestream-vfs to query it remotely without downloading the whole thing?

andersmurphy 12/11/2025||

Yeah I wonder how it would work with things like custome application functions etc. But I guess the query is run locally and it's the pages that are fetched from S3? So it just works? That would be awesome.

I guess there's only one way to find out.

JaggerJo 12/11/2025||

because liters works at the lowest level (virtual file system) all other things should just work.

ncruces 12/11/2025||

Yes. (that's all, really)

More comments...