SQLite JSON at full index speed using generated columns

Posted by upmostly 12/12/2025

SQLite JSON at full index speed using generated columns(www.dbpro.app)

381 points | 114 commentspage 2

AlexErrant 12/12/2025|

I was looking for a way to index a JSON column that contains a JSON array, like a list of tags. AFAIK this method won't work for that; you'll either need to use FTS or a separate "tag" table that you index.

garaetjjte 12/12/2025|||

I would want that too. It's possible in MySQL: https://dev.mysql.com/doc/refman/8.4/en/create-index.html#cr...

MyOutfitIsVague 12/12/2025|||

Yeah, SQLite doesn't have any true array datatype. I think you could probably do it with a virtual table, but that would be adding a native extension, and it would have to pack its own index.

rini17 12/12/2025||

You can use triggers to keep the tag table synchronized automatically.

simonw 12/12/2025||

Tiny bug report: I couldn't edit text in those SQL editor widgets from my iPhone, and I couldn't scroll them to see text that extended past the width of the page either.

hamburglar 12/12/2025||

The examples also needed a “drop table if exists” so they could be run more than once without errors.

upmostly 12/12/2025||

Great catch, I'll add that now!

upmostly 12/12/2025||

Thanks Simon! Looking into that now. Big fan. Hope you enjoyed the post.

Edit: This should now be fixed for you.

rrmdp 12/12/2025||

The fact the DB is portable is amazing, I use it for all my projects now but didn't know about this JSON feature

Seattle3503 12/12/2025||

It says full speed, but no benchmarks were performed to verify if performance was really equivalent.

pawelduda 12/12/2025||

I've been coding a lot of small apps recently, and going from local JSON file storage to SQLite has been a very natural path of progression, as data's order of magnitude ramps up. A fully performant database which still feels as simple as opening and reading from a plain JSON file. The trick you describe in the article is actually an unexpected performance buffer that'll come in handy when I start hitting next bottleneck :) Thank you

dmezzetti 12/12/2025||

I love this feature. I've long used json_extract to create dynamic columns with txtai sql: https://neuml.github.io/txtai/embeddings/query/#dynamic-colu...

You can do the same with DuckDB and Postgres too.

focusgroup0 12/12/2025||

Would this be a good fit for migrating from mongo --> sqlite? A task I am dreading

zffr 12/12/2025|

Just curious, why do you want to migrate from mongo (document database server) to sqlite (relational database library)?

That migration would be making two changes: document-based -> relational, and server -> library.

Have you considered migrating to Postgres instead? By using another DB server you won't need to change your application as much.

focusgroup0 12/12/2025||

Thanks for the feedback. The document model in mongo was slopped together by a junior engineer, so perhaps an unorthodox approach. It is basically flat, and already used in a pseudo-relational manner via in-app join to the existing sqlite store. This blog post inspired me to think, what if we just chucked all the json from mongo into sqlite and used the generated indices? Then we can gradually "strangler fig" endpoint by endpoint

rglynn 12/13/2025||

This sounds roughly on-track, but I agree with GP; Postgres would probably be better (also has great JSON(B) support).

mcluck 12/12/2025||

Very cool article. To really drill it home, I would have loved to see how the query plan changes. It _looks_ like it should Just Work(tm) but my brain refuses to believe that it's able to use those new indexes so flawlessly

meindnoch 12/12/2025||

In the 2nd section you're using a CREATE TABLE plus three separate ALTER TABLE calls to add the virtual columns. In the 3rd section you're using a single CREATE TABLE with the virtual columns included from the get go.

Why?

hamburglar 12/12/2025||

I think the intent is to separate the virtual column creation out when it’s introduced in order to highlight that it’s a very lightweight operation. When moving onto the 3rd example, the existence of the virtual columns is just a given.

ralferoo 12/12/2025|||

Depending on the amount of inserts, it might be more efficient to create all the indexes in one go. I think this is certainly true for normal columns.

But I suspect with JSON the overhead of parsing it each time might make it more efficient to update all the indices with every insert.

Then again, it's probably quicker still to insert the raw SQL into a temporary table in memory and then insert all of the new rows into the indexed table as a single query.

hiccuphippo 12/12/2025||

In 2 they show how to add virtual columns to an existing table, in 3 how to add indexes to existing virtual columns so they are pre-cooked. Like a cooking show.

meindnoch 12/12/2025|||

>In 2 they show how to add virtual columns to an existing table

No, in section 2 the table is created afresh. All 3 sections start with a CREATE TABLE.

hiccuphippo 12/12/2025||

Yes, it seems each section has its own independent database so you have to create everything on each of them.

upmostly 12/12/2025|||

Literally exactly as I meant it. I watch a lot of cooking shows, too, so this analogy holds up.

moregrist 12/12/2025|

Generated columns are pretty great, but what I would really love is a Postgres-style gin index, which dramatically speeds up json queries for unanticipated keys.

More comments...