ggsql: A Grammar of Graphics for SQL

Posted by thomasp85 14 hours ago

ggsql: A Grammar of Graphics for SQL (opensource.posit.co)

354 points | 73 comments

radarsat1 14 hours ago

Wow, love this idea.

data_ders 13 hours ago

ok, this is definitely up my alley. color me nerd-sniped and forgive the onslaught of questions.

my questions are less about the syntax, which i'm largely familiar with knowing both SQL and ggplot.

i'm more interested in the backend architecture. Looking at the Cargo.toml [1], I was surprised to not see a visualization dependency like D3 or Vega. Is this intentional?

I'm certainly going to take this for a spin and I think this could be incredible for agentic analytics. I'm mostly curious right now what "deployment" looks like both currently and in a utopian future.

utopia is easier -- what if databases supported it directly?!? but even then I think I'd rather have databases spit out an intermediate representation (IR) that could be handed to a viz engine, similar to how vega works. or perhaps the SQL is the IR?!

another question that arises from the question of composability: how distinct would a ggplot IR be from a metrics layer spec? could i use ggsql to create an IR that I then use R's ggplot to render (or vice versa maybe?)

as for the deployment story today, I'll likely learn most by doing (with agents). My experiment will be to kick off an agent to do something like: extract this dataset to S3 using dlt [2], model it using dbt [3], then use ggsql to visualize.

p.s. @thomasp85, I was a big fan of tidygraph back in the day [4]. love how small our data world is.

[1]: https://github.com/posit-dev/ggsql/blob/main/Cargo.toml

[2]: https://github.com/dlt-hub/dlt

[3]: https://github.com/dbt-labs/dbt-fusion

[4]: https://stackoverflow.com/questions/46466351/how-to-hide-unc...

thomasp85 13 hours ago

Let me try to not miss any of the questions :-)

ggsql is modular by design. It consists of various reader modules that take care of connecting with different data backends (currently we have a DuckDB, an SQLite, and an ODBC reader), a central plot module, and various writer modules that take care of the rendering (currently only Vega-Lite but I plan to write my own renderer from scratch).
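
A minimal sketch of that reader / plot / writer split, written in Python rather than ggsql's Rust. Every name here (`PlotSpec`, `sqlite_reader`, `build_plot`, `vegalite_writer`, `demo`) is hypothetical and not taken from the ggsql codebase; only the shape of the output follows the public Vega-Lite JSON format. It illustrates the idea of swappable ends around a backend-agnostic middle, nothing more.

```python
import json
import sqlite3
from dataclasses import dataclass


@dataclass
class PlotSpec:
    """Backend-agnostic plot description (hypothetical, not ggsql's real type)."""
    rows: list  # one dict per data row
    mark: str   # e.g. "point", "line", "bar"
    x: str      # column mapped to the x axis
    y: str      # column mapped to the y axis


def sqlite_reader(conn, query):
    """Reader: run a SQL query and return the rows as plain dicts."""
    cur = conn.execute(query)
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]


def build_plot(rows, mark, x, y):
    """Central plot step: combine data with aesthetic mappings."""
    return PlotSpec(rows=rows, mark=mark, x=x, y=y)


def vegalite_writer(spec):
    """Writer: turn the plot description into a Vega-Lite style dict."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": spec.rows},
        "mark": spec.mark,
        "encoding": {
            "x": {"field": spec.x, "type": "quantitative"},
            "y": {"field": spec.y, "type": "quantitative"},
        },
    }


def demo():
    """Wire one reader and one writer together on an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a REAL, b REAL)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 2), (3, 4)])
    rows = sqlite_reader(conn, "SELECT a, b FROM t ORDER BY a")
    return vegalite_writer(build_plot(rows, "point", "a", "b"))


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2))
```

Swapping the backend or the renderer would then mean replacing only `sqlite_reader` or `vegalite_writer`, with `PlotSpec` staying untouched in the middle.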

As for deployment I can only talk about a utopian future since this alpha-release doesn't provide much that is tangible in that area. The ggsql Jupyter kernel already allows you to execute ggsql queries in Jupyter and Quarto notebooks, so deployment of reports should kinda work already, though we are still looking at making it as easy as possible to move database credentials along with the deployment. I also envision deployment of single .ggsql files that result in embeddable visualisations you can reference on websites etc. Our focus in this area will be Posit Connect in the short term.

I'm afraid I don't know what IR stands for - can you elaborate?

stevedh 11 hours ago

Intermediate Representation

thomasp85 10 hours ago

Ah - yes, in theory you could create a "ggplot2 writer" which renders the plot object to an R file you can execute. It is not too far away from the current Vega-Lite writer we use. The other direction (ggplot2->ggsql) is not really feasible.
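
To make the "ggplot2 writer" idea concrete, here is a rough Python sketch of a writer that emits R source text instead of Vega-Lite JSON. It is an assumption about what such a writer could look like, not something that exists in ggsql: the function name `ggplot2_writer`, its parameters, and the `GEOMS` lookup table are all hypothetical. The emitted R code only uses standard ggplot2 calls (`ggplot`, `aes`, `geom_point`, `geom_line`, `geom_col`).

```python
# Hypothetical mapping from a generic mark name to a ggplot2 geom function.
GEOMS = {"point": "geom_point", "line": "geom_line", "bar": "geom_col"}


def ggplot2_writer(data_name, mark, x, y):
    """Render a simple plot description as executable R/ggplot2 source."""
    if mark not in GEOMS:
        raise ValueError(f"unsupported mark: {mark}")
    return "\n".join([
        "library(ggplot2)",
        f"ggplot({data_name}, aes(x = {x}, y = {y})) +",
        f"  {GEOMS[mark]}()",
    ])


if __name__ == "__main__":
    # Prints R code that plots columns a and b of a data frame named df.
    print(ggplot2_writer("df", "point", "a", "b"))
```

The sketch only goes one way, from a structured plot description to R text, which matches the direction described above as feasible.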

persedes 9 hours ago

Soo can I put this on top of e.g. grafana?

estetlinus 7 hours ago

I like it. I can see a world where these visuals become my serving layer in dbt. Small, clean and versioned .sql-files.

Please, for the love of god and in the name of everything holy, kill the Jupyter Notebook.

breakfastduck 12 hours ago

This is fantastic. Feels like something that should've been in there from the start!

hei-lima 13 hours ago

Really cool!

rvba 12 hours ago

1) does this allow exporting to Excel?

2) how to make manual adjustments?

thomasp85 12 hours ago

My answers will probably disappoint

1) No (unless you count 'render to image and insert that into your Excel document')

2) This is not possible - manual adjustments are not reproducible and we live by that ethos

tonyarkles 9 hours ago

> 2) This is not possible - manual adjustments are not reproducible and we live by that ethos

Just want to give you a high-five on that one. I've dealt with so many hand-adjusted plots in the past where they work until either the dataset changes just a little bit or the plot library itself gets upgraded... in both cases, the plots completely fall apart when you're not expecting it.

i000 8 hours ago

What makes ggplot great is that it allows manual adjustments AND has a nice declarative grammar. Hard for me to see the value of a plotting library without being able to adjust plots.

hadley 54 minutes ago

You’ll be able to adjust plots. But you have to do it with code, not UI.

dartharva 13 hours ago

Would be awesome if somehow coupled into Evidence.dev