Show HN: JSON Query - Hacker News

Posted by wofo 10/27/2025

I'm working on a tool that will probably involve querying JSON documents and I'm asking myself how to expose that functionality to my users.

I like the power of `jq` and the fact that LLMs are proficient at it, but I find it outright impossible to come up with the right `jq` incantations myself. Has anyone here been in a similar situation? Which tool / language did you end up exposing to your users?

154 points | 70 comments

nothrabannosir 10/28/2025|

crucial jq insight which unlocked the tool for me: it's jsonl, not json.

it's a pipeline operating on a stream of independent json terms. The filter is reapplied to every element from the stream. Streams != lists; the latter are just a data type. `.` always points at the current element of the stream. Functions like `select` operate on separate items of the stream, while `map` operates on individual elements of a list. If you want a `map` over all elements of the stream: that's just what jq is, naturally :)

stream of a single element which is a list:

    echo '[1,2,3,4]' | jq .
    # [1,2,3,4]

unpack the list into a stream of separate elements:

    echo '[1,2,3,4]' | jq '.[]'
    # 1
    # 2
    # 3
    # 4
    echo '[1,2,3,4]' | jq '.[] | .' # same output: piping into `.` is a no-op

only keep elements 2 and 4 from the stream, not from the array--there is no array left after .[] :

    echo '[1,2,3,4]' | jq '.[] | select(. % 2 == 0)'
    # 2
    # 4

keep the array:

    echo '[1,2,3,4]' | jq 'map(. * 2)'
    # [2,4,6,8]

map over individual elements of a stream instead:

    echo '[1,2,3,4]' | jq '.[] | . * 2'
    # 2
    # 4
    # 6
    # 8
    printf '1\n2\n3\n4\n' | jq '. * 2' # same

This is how you can do things like

    printf '{"a":{"b":1}}\n{"a":{"b":2}}\n{"a":{"b":3}}\n' | jq 'select(.a.b % 2 == 0) | .a'
    # {"b": 2}

select creates a nested "scope" for the current element in its parens, but restores the outer scope when it exits.

Hope this helps someone else!
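
For readers who think more easily in a general-purpose language, the stream-versus-list distinction above can be modeled with Python generators. This is only a rough sketch, not how jq is implemented: the helper names (`read_stream`, `unpack`, `select`, `each`, `map_list`) are made up for illustration, and the input is assumed to be one JSON value per line.

```python
import json
from typing import Any, Callable, Iterable, Iterator


def read_stream(text: str) -> Iterator[Any]:
    """Yield one JSON value per non-empty line (the "jsonl" view of the input)."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)


def unpack(stream: Iterable[Any]) -> Iterator[Any]:
    """Rough analogue of jq's `.[]`: turn each list into separate stream items."""
    for value in stream:
        yield from value


def select(stream: Iterable[Any], pred: Callable[[Any], bool]) -> Iterator[Any]:
    """Rough analogue of jq's `select(...)`: keep or drop whole stream items."""
    return (value for value in stream if pred(value))


def each(stream: Iterable[Any], fn: Callable[[Any], Any]) -> Iterator[Any]:
    """Apply a one-output filter (like `. * 2`) to every stream item."""
    return (fn(value) for value in stream)


def map_list(stream: Iterable[Any], fn: Callable[[Any], Any]) -> Iterator[Any]:
    """Rough analogue of jq's `map(...)`: transform elements inside each list."""
    return ([fn(x) for x in value] for value in stream)


array_input = "[1,2,3,4]\n"

# A stream with a single item, which happens to be a list.
single_item = list(read_stream(array_input))

# `.[] | select(. % 2 == 0)`: filter the unpacked stream; no list is left.
evens = list(select(unpack(read_stream(array_input)), lambda v: v % 2 == 0))

# `map(. * 2)`: the list survives as one stream item.
doubled_list = list(map_list(read_stream(array_input), lambda v: v * 2))

# `. * 2` applied to a stream of four separate numbers.
doubled_stream = list(each(read_stream("1\n2\n3\n4\n"), lambda v: v * 2))

print(single_item, evens, doubled_list, doubled_stream)
```

The point of the sketch is that `select` and `each` never see a list at all; only `map_list` reaches inside one.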

tcdent 10/27/2025||

Doesn't the command-line utility `jq` already define a protocol for this? How do the syntaxes compare?

(LLMs are already very adept at using `jq` so I would think it was preferable to be able to prompt a system that implements querying inside of source code as "this command uses the same format as `jq`")

jonny_eh 10/27/2025||

For convenience: https://en.wikipedia.org/wiki/Jq_(programming_language)

cryptonector 10/27/2025||

Oh wow, it got undeleted. Some editor insisted on deleting it because it was a "personal project" (Stephen Dolan's) even though it has a huge user base. I guess now that it has a proper "org" in GitHub it's different. What nonsense.

ancarda 10/28/2025|||

It's likely because there's a citation in a paper. That's apparently the bar you need to reach to get Wikipedia to see something as significant enough. I tried to get a draft article about SourceHut ( https://sourcehut.org/ ) to be published after extensive improvements and they refused because there weren't enough third party links. This is despite the fact there's like a dozen pages in Wikipedia about software that is hosted on SourceHut, so it seems notable enough?

jonny_eh 10/27/2025||||

Maybe it helped that they called it a "programming language"? It helps make it sound super serious.

rendall 10/27/2025|||

Wikipedia is such a disappointment

millerm 10/28/2025|||

How could you even type such a ridiculous statement?

lioeters 10/28/2025||

Seriously, Wikipedia has been of immense value to society and education.

Yes there are issues with ideologically motivated moderators, poorly cited articles, etc. But even with its flaws, it's an amazing resource provided to the public for free (as in coffee and maybe as in speech also).

rendall 10/31/2025||

Not at all free as in free speech. It is entirely captured by motivated gangs of collaborators that make the unwary who read it stupider. Try to make reasonable changes and these people collaborate to outvote you.

inlined 10/27/2025||

Mongo also has a good query language and a mongo DB can be seen as an array of documents

cryptonector 10/27/2025||

You just have to wrap your mind around jq. It a) is functional, and b) has pervasive generators and backtracking. So when you write `.a[].b`, which is a lot like `(.a | .[] | .b)`, what you get is three generators strung together in an `and_then` fashion: `.a`, then `.[]`, and then `.b`. Here `.a` generates exactly one value per input, as does `.b`, but `.[]` generates as many values as there are in the value produced by `.a`. So `.b` won't run at all if `.a[]` produces no values, and `.b` will run once for _each_ value of `.a[]`. Once you begin to see the generators and the backtracking, everything begins to make sense.
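
A minimal Python sketch of that "generators strung together" idea, assuming nothing about jq's real internals: `field`, `iterate`, and `pipe` are hypothetical helpers, and each filter is modeled as a function from one input value to a generator of outputs.

```python
from typing import Any, Callable, Iterator

Filter = Callable[[Any], Iterator[Any]]


def field(name: str) -> Filter:
    """Model of `.name`: exactly one output per input.

    Unlike jq, this sketch raises KeyError for a missing key.
    """
    def run(value: Any) -> Iterator[Any]:
        yield value[name]
    return run


def iterate() -> Filter:
    """Model of `.[]`: one output per element, so possibly zero outputs."""
    def run(value: Any) -> Iterator[Any]:
        yield from value
    return run


def pipe(*filters: Filter) -> Filter:
    """Model of `f | g | ...`: run the rest of the pipeline for each output of f."""
    def run(value: Any) -> Iterator[Any]:
        if not filters:
            yield value
            return
        first, rest = filters[0], pipe(*filters[1:])
        for intermediate in first(value):
            # When `rest` is exhausted, control returns here and `first`
            # is asked for its next output: the "backtracking" step.
            yield from rest(intermediate)
    return run


a_each_b = pipe(field("a"), iterate(), field("b"))  # stands in for `.a[].b`

results = list(a_each_b({"a": [{"b": 1}, {"b": 2}, {"b": 3}]}))
empty = list(a_each_b({"a": []}))  # `.b` never runs: `.[]` produced nothing

print(results, empty)
```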

movpasd 10/28/2025|

I think this is a paradigm known as concatenative programming: https://en.wikipedia.org/wiki/Concatenative_programming_lang...

mrtimo 10/27/2025||

DuckDB can read JSON - you can query JSON with normal SQL.[1] I prefer the Malloy data language for querying, as it is 10x simpler than SQL.[2]

[1] - https://duckdb.org/docs/stable/data/json/overview

[2] - https://www.malloydata.dev/

zie 10/27/2025|

So can Postgres. I tend to just use PG, since I have instances running basically everywhere, even locally, but DuckDB works well too.

gcr 10/27/2025||

I read the man page of `jq` and learned how to use it. It's quite well-written and contains a good introduction.

I've observed that too many users of jq aren't willing to take a few minutes to understand how stream programming works. That investment pays off in spades.

MrApathy 10/28/2025||

Plugging a previous personal project for learning jq interactively: https://jqjake.com/

gcr 10/30/2025||

THIS IS SO COOL! Thanks for sharing!

Are you interested in having help writing more scenarios? I’ve had a couple ideas for similar kata-like exercises that I haven’t shared publicly. Happy to send a PR or something if it would provide value

penguin_booze 10/28/2025|||

I'm a big fan of jq but won't credit its man page with much. There were (ineffable) insights that I picked up through my own usage over time, that I couldn't glean from reading the man page alone. In other words, it's not doing its best to put the correct mental model out for a newish user.

wpm 10/27/2025||

Also, LLMs are good at spitting out filters, and you can learn what those filters do by looking them up in the docs afterwards. LLMs often apply things in far more interesting and complex ways than the docs at jqlang.org do; the examples there are often far too “foo bar baz” tier to truly convey the power of things.

jawns 10/27/2025||

I'd like to know how it compares to https://jsonata.org

gnarlouse 10/27/2025||

JSONata looks to be more general purpose with its support for variables/statements, and custom functions. I'd probably still stick with JSONata

aeberhart 10/28/2025|||

We wrote an article on this: "JQ vs. JSONata: Language and Tooling Compared". https://dashjoin.medium.com/jq-vs-jsonata-language-and-tooli...

Alifatisk 10/27/2025||

Can't you just visit both pages, build an understanding and compare them?

OrderlyTiamat 10/27/2025||

Maybe the author would be in a better place to do that, having the expertise already. Also, as a user I'm quite happy with jq already, so why expend the effort?

HatchedLake721 10/27/2025||

https://jsonpath.com/ or https://jsonata.org/

wofo 10/27/2025|

Would you mind sharing a bit more? Have you used them? How did that go?

gnarlouse 10/27/2025||

I use `jsonata` currently at work. I think it's excellent. There's even a limited-functionality rustlib (https://github.com/Stedi/jsonata-rs). What I particularly like about `jsonata` is its support for variables; they're super useful in a pinch when a pure expression becomes ugly, unwieldy, or redundant. It also lets you "bring your own functions", which lets you do things like:

```
$sum($myArrayExtractor($.context))
```

where `$myArrayExtractor` is your custom code.

---

Re: "how did it go"

We had a situation where we needed to generate EDI from json objects, which routinely required us to make small tweaks to data, combine data, loop over data, etc. JSONata provided a backend framework for data transformations that reduced the scope and complexity of the project drastically.

I think JSONata is an excellent fit for situations where companies need to do data transforms, for example for integrations with 3rd-party sources: all the data is there, it just needs to be mapped. Instead of having potentially buggy code for each integration, you can have a pseudo-declarative JSONata spec that describes the transform for each integration source, and then just keep a single unified "JSONata runner" as the integration handler.
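
The "one spec per source, one shared runner" shape can be sketched in a few lines of Python. To be clear about the assumptions: this does not use JSONata at all. The dotted-path lookup is a deliberately tiny stand-in for a real expression engine, and `SPECS`, `lookup`, `run_transform`, and the vendor names are all invented for illustration.

```python
from typing import Any, Dict

# Hypothetical specs: one declarative mapping per integration source.
# Each output field maps to a dotted path in the incoming document.
# A real JSONata expression would be far more expressive than a path.
SPECS: Dict[str, Dict[str, str]] = {
    "vendor_a": {"order_id": "id", "total": "amounts.total"},
    "vendor_b": {"order_id": "order.number", "total": "order.sum"},
}


def lookup(document: Any, path: str) -> Any:
    """Follow a dotted path such as "order.sum" through nested dicts."""
    current = document
    for key in path.split("."):
        current = current[key]
    return current


def run_transform(source: str, document: Dict[str, Any]) -> Dict[str, Any]:
    """The single shared runner: pick the spec for `source` and apply it."""
    spec = SPECS[source]
    return {out_field: lookup(document, path) for out_field, path in spec.items()}


from_a = run_transform("vendor_a", {"id": "A-1", "amounts": {"total": 10}})
from_b = run_transform("vendor_b", {"order": {"number": "B-7", "sum": 25}})

print(from_a)
print(from_b)
```

Adding a new source then means adding data to `SPECS`, not writing new handler code, which is the property described above.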

mediaman 10/27/2025|||

We've had a great experience with JSONata too.

It's nice because we can just put the JSONata expression into a db field, and so you can have arbitrary data transforms for different customers for different data structures coming or going, and they can be set up just by editing the expression via the site, without having to worry about sandboxing it (other than resource exhaustion for recursive loops). It really sped up the iteration process for configuring transforms.

montekristooGDB 10/28/2025|||

I confirm. At first I was trying to write that "buggy" code, until I got jsonata, and started working with the queries it supports.

It made my life a lot easier

arccy 10/27/2025||

In the k8s world there's a random collection of json path, json query, some random expression language.

Just use jq. None of the other ones are as flexible or widespread and you just end up with frustrated users.

voidfunc 10/27/2025|

This. jq is the de facto standard and anytime I come across something else I am annoyed.

Which isn't to say jq is the best or even good, but it's battle-tested and just about every conceivable query problem has been thrown at it by now.

pscanf 10/27/2025||

I have a similar use case in the app I'm working on. Initially I went with JSONata, which worked, but resulted in queries that indeed felt more like incantations and were difficult even for me to understand (let alone my users).

I then switched to JavaScript / TypeScript, which I found much better overall: it's understandable to basically every developer, and LLMs are very good at it. So now in my app I have a button wherever a TypeScript snippet is required that asks the LLM for its implementation, and even "weak" models one-shot it correctly 99% of the time.

It's definitely more difficult to set up, though, as it requires a sandbox where you can run the code without fear. In my app I use QuickJS, which works very well for my use case, but might not be performant enough in other contexts.

cweagans 10/27/2025|

"JSON Query" is kind of a long name. You should find a way to shorten it. Maybe "jQuery" or something along those lines :P
