Posted by jakobem 19 hours ago
It opens up absolutely bonkers capabilities.
I've done exactly that with Filestash [1] using its virtual filesystem plugin [2], which exposes arbitrary systems as a filesystem. It turns out the filesystem abstraction works extremely well even for systems that are not filesystems at all. There are connectors for pretty much every kind of storage (SFTP, S3, GDrive, Dropbox, FTP, Sharepoint, GCP, Azure Cloud, IPFS, ...), but also for things like MySQL and Postgres (where the first-level folders are the databases, the second level is the tables that belong to a database, and each row is represented as a form file generated from the schema) and LDAP (where tree nodes are represented as folders and leaves as form files).
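To make the database layout concrete, here is a minimal sketch of how a virtual path could be mapped to a database, table, or row purely by its depth. This is not Filestash's actual code: the `Node` type, the `resolve` function, and the `.form` file extension are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """What a virtual path points at (hypothetical type, not a Filestash API)."""
    kind: str                      # "root", "database", "table" or "row"
    database: Optional[str] = None
    table: Optional[str] = None
    row_id: Optional[str] = None


def resolve(path: str) -> Node:
    """Classify a virtual path by how many segments it has."""
    parts = [p for p in path.split("/") if p]
    if len(parts) == 0:
        return Node("root")                        # listing it yields databases
    if len(parts) == 1:
        return Node("database", parts[0])          # listing it yields tables
    if len(parts) == 2:
        return Node("table", parts[0], parts[1])   # listing it yields rows
    if len(parts) == 3:
        name = parts[2]
        # Assumption: each row is exposed as "<primary key>.form".
        row_id = name[:-len(".form")] if name.endswith(".form") else name
        return Node("row", parts[0], parts[1], row_id)
    raise ValueError(f"path too deep for this layout: {path!r}")


if __name__ == "__main__":
    print(resolve("/shop/orders/42.form"))
```

A real connector would then turn a "root" node into a list-databases query, a "table" node into a select, and so on; the sketch only covers the path-to-object mapping.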
The whole filesystem is available to agents via MCP [3]; it was published to the OpenAI marketplace around Christmas and is currently pending review.
ref:
[1]: https://github.com/mickael-kerjean/filestash
[2]: https://www.filestash.app/docs/guide/virtual-filesystem.html
[3]: https://www.filestash.app/docs/guide/mcp-gateway.html https://github.com/mickael-kerjean/filestash/tree/master/ser...
[0] https://github.com/Barre/ZeroFS
[1] https://github.com/Barre/ZeroFS?tab=readme-ov-file#why-nfs-a...
- agents tend to need (already have) a filesystem anyway to be useful (not technically required but generally true, they’re already running somewhere with a filesystem)
- LLMs have a ton of CLI/filesystem stuff in their training data, while MCP is still pretty new (FUSE is old and boring)
- MCP tends to bloat context (not necessarily true but generally true)
The UNIX philosophy is really compelling (more so than MCP being bad). If you can turn your context into files, agents likely “just work” for your use case.
Yes, it should be able to generically use a filesystem, but there has to be a better way to find an email than grepping through each email as a file.
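As a rough illustration of "turning context into files", the sketch below writes a few records into a directory and then searches them with nothing but ordinary file operations. The function names `materialize` and `grep` and the one-file-per-record `.txt` layout are assumptions for the example, not something any particular agent framework prescribes.

```python
import tempfile
from pathlib import Path


def materialize(records: dict[str, str], root: Path) -> None:
    """Write each record to root/<name>.txt (hypothetical layout)."""
    for name, body in records.items():
        (root / f"{name}.txt").write_text(body, encoding="utf-8")


def grep(root: Path, needle: str) -> list[str]:
    """Return the sorted names of files whose contents contain needle."""
    return sorted(
        p.name
        for p in root.glob("*.txt")
        if needle in p.read_text(encoding="utf-8")
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        materialize({"invoice": "total due: 40 EUR", "notes": "call Bob"}, root)
        print(grep(root, "Bob"))
```

Note that this search reads every file, so its cost grows linearly with the number of records.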
So, I see merit in the idea in theory, I’m just skeptical in practice.
There is a ton more complexity to sandboxing, I agree!
https://en.wikipedia.org/wiki/Sandbox_(computer_security)
Notably, a sandbox exists to separate one thing from other things. Limiting, filtering, or monitoring what the sandboxed thing can do is often a component of that, but the underlying premise is about separation.
Containers, VMs, etc. are 100% examples of sandboxing based on the actual industry definition of the term.
You are incorrect.
If you’re going to get fired up about people you feel are misusing this term, and then ignore citations about its actual definition, I think the ball’s in your court to back up your claim.
I’ve asked what background leads to your conclusion, because if you have, e.g., written some sandboxing tooling, I’d be curious to give it a look. Always up to learn things, and I am more than a little baffled by how upset the comments I’m replying to here sound. You’ve linked me to Wikipedia, and another commenter asserts I can ‘just look it up on google scholar’. That seems pretty dismissive and reductive overall.
Firecracker kind of ends up in the VM category, and I would place gVisor in a similar category too, under VMs.
So in my opinion, VMs are sandboxes.
Of course there is also libriscv https://github.com/libriscv/libriscv which is a sandbox (The fastest RISC-V sandbox)
There is also https://github.com/Zouuup/landrun Run any Linux process in a secure, unprivileged sandbox using Landlock. Think firejail, but lightweight, user-friendly, and baked into the kernel.
Your mileage may vary, but I usually consider Firecracker to be the AI sandbox. Other times they abstract over a cloud provider and spin up servers there or similar (I feel E2B does this on top of GCP).
Furthermore, running lots of random 3rd party programs in the same instance, be it a container, an EC2 VM, or a Firecracker VM, has the same issues: it is inherently totally unsafe. If you want to "sandbox" something you need to detail what exactly you are wanting to isolate.
A lot of people might suggest not being able to write to the filesystem, read env vars, or talk over the network, but these capabilities are table stakes for a lot of the workloads that people want to "isolate" to begin with.
So not only is there this incorrect view that you are isolating anything at all, but I'm not convinced that the most important things, like being able to run arbitrary 3rd party programs, are even being considered.