Posted by lumpa 6 days ago
“Datasette is a tool for exploring and publishing data. It helps people take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API.
Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of 44 tools and 154 plugins dedicated to making working with structured data as productive as possible.”
I'm curious where the pattern will go. My sense is there is a split between cathedrals vs bazaar for approach here, where cathedrals are quite rigid app builders, think framer/wix, while bazaars focus a layer below for more flexibility but less integrated.
So I imagine we could now load some data in to sqlite, design some HTML also loaded in to the db, and deploy. Although looking at the source, it seems like stored apps are expected to be managed by the plugin itself, but I'm sure there's a way around that
[0] Eg from one of the examples - https://datasette.io/legislators/-/query.json?sql=select+*+f... . If you strip the '.json' you get the html view. For what it's worth there's also a '.csv' version.
I have an idea for a way to edit them through Datasette and have them backed up to Git via a separate mechanism, but having them on disk would be a whole lot more convenient.
Filed an issue here: https://github.com/datasette/datasette-apps/issues/30
[0] https://sqlite.org/src/file/ext/misc/fileio.c, it allows you to read a directory recursively in the cli (`select * from fsdir("./");`)
Edit: It allows upwards traversals (`select * from fsdir("../../../../etc/passwd");`), so beware
I'm sticking with the Python bundled sqlite3 though so I'm not in a good place to take advantage of that one.
CORS headers?
I remember writing code in the bad old days to parse HTML tags and allowlist specific attributes. Now browsers have a much better solution baked in.
But it still makes me a bit nervous. Seems like a very small bug could sneak in. This is a good example of where I would reach for Fable to double check the implementation and have a lot of extra tests.
(nit: would be nice if the chat box treated Enter and Shift+Enter the way these other companies have trained my brain, but maybe that is a deliberate choice.)
Thankfully GPT-5.5 is really strong on security stuff too. I wouldn't have dared build this without a whole lot of Opus/GPT-assisted prototyping and testing along the way.
The 'write' part would technically be very doable and not that different from other back-ends.
You almost never need just a basic list of all the data in your table, even if you're able to filter and sort it. There's no moat there at all. People need serious BI tools, and that throws simplicity out of the window (PowerBI, QuickSight, etc.).
In reality, what most people need is much simpler, a mini app with some curated datasets and simple filters, maybe some AI querying if we want to get fancy. There's some companies out there that work with big data, but for the rest of us small data is ok.
My phone has 1TB of storage.
My more recent prototype shrinks that to 10.47 MB transferred: https://simonw.github.io/research/pyodide-asgi-browser/datas...
> a Datasette-style backend to a self-contained HTML frontend is an astonishingly powerful combination.
1000% agree. And datasette is a terrific framework to build any kind of data exploration or visualization on.
This sounds like an awesome feature and a good excuse for me to dive back into playing with Datasette.
The design keeps data and presentation together and even maps do not rely on external services.
I have called it Pihka: https://ghentcdh.github.io/Pihka/ https://github.com/GhentCDH/Pihka
although I'm coming from a different starting point, it seems like some of our thoughts have aligned. I'm building https://caipi.ai/ as a workspace for agents to build simple data driven apps. The agent edits through MCP and the user gets an interactive app in the browser.
If you're interested picking each others brains around this topic, I'd be psyched to have a chat. gh:pietz.
/-/apps/iframe-content/timeline.html
You can protect it with CSP headers, but you can't also protect it with the sandbox="" attribute (should a user visit it directly)If you want both sandbox= restrictions and CSP headers at the same time the only way I've found that works cross all major borders is the iframe plus srcdoc="" with injected CSP meta headers patterns.
Note that a lot of sandbox implementations serve their iframe content from a separate domain, to ensure cookies and localStorage and other same origin things are robustly protected.
I can't do that easily for Datasette because it's open source software that people can run on their own laptops, so I didn't want to block people on "now register a domain/subdomain and set this up in DNS".
Not for untrusted content living on the same origin to prevent it from exercising any of the powers that it would ordinarily have to be able to access sensitive data. It's a misleading name and shouldn't have been chosen. There is no combination of CSP or the iframe sandbox attribute that can be relied upon for that purpose. This is a fundamental limitation of the way the specs were written.
(There needs to be a big warning about this on MDN, but moving from the old wiki to a wiki with GitHub for login to the GitHub-based pull request process really didn't help the there's-a-problem-on-this-page-but-limited-resources-to-make-things-better problem.)
And I serve the content in srcdoc= to ensure there's no URL a user can visit which would directly execute the content outside of that iframe sandbox.
It doesn't matter. I just said there is no combination of CSP or the iframe sandbox attribute that can be relied upon here.
I'm not convinced it's true - I've been thinking about this for months, and building experimental prototypes to help me get to the combination that I think makes sense.
Can you describe an exploit that the combination I'm using of iframe sandbox= srcdoc= with an injected meta CSP tag doesn't handle?
Would moving the untrusted content to be served from a separate domain entirely close the hole?
(In case it's not clear the iframe sandbox= is the bit that's doing most of the work here - the CSP stuff is there mainly to protect against malicious apps that deliberately exfiltrate stolen private data.)
Yes.