Posted by justincormack 3 days ago
This is why I have been avoiding the word sandbox for exe.dev. I don’t think developers’ agents need something “sandbox” shaped.
They do spike on different features like:
- snapshotting and forking
- good SSH and VPN access for end-users
- agent-friendly features, like obscuring secrets at network layer
Then there's also the option to use libkrun to run local sandboxes on your own computer. That doesn't scratch the itch for hosted services, but works if your goal is to run agents inside isolated environments for your own work.
I've been working on some open-core stuff[1] to coordinate sandboxes, and we're making changes to have a library that lets people coordinate any number of remote or local sandboxes using any provider, kinda like how the Docker CLI works for managing containers, git repos, and coding agents. Flue[2] is another player in this space, and is more of a pure framework, while we're building it as an interactive product for using sandboxed agents and workflows.
[1] https://github.com/gofixpoint/amika/blob/main/ROADMAP.md
My personal belief is that the future of an "app" is a combo:
1. micro VM
2. agent on the VM
3. software bundled into the VM
So, it should be stupid simple to run these local sandboxed apps/agents. Right now, not too hard for technical users (esp. with things like https://smolmachines.com/ and https://microsandbox.dev/), but not as easy as clicking an app icon or typing `/path/to/binary` in the CLI.
You'd have to build more of that with libkrun.
The core tech of both are great though.
I am quite sure I'm not the only person working on post-firecracker KVM.
The startups in this space right now don't provide much value on top of the cloud providers they're wrapping. They don't tend to be run by experienced infra people either so they seem very vibecoded, insecure, janky, etc. They're also significantly overpriced because they're marking up already expensive providers.
Something surprising from my own experience is that while there's certainly a huge role for async agents in cloud sandboxes, async agents running locally seem more useful in many cases.
Most of the startups are just wrappers around AWS and significantly more expensive.
Agents need sandboxes that are cheaper so that they can run thousands of them.
I feel that AWS, GCP and all the other cloud providers can provide this natively.
But still it would be nice to self host.
The best part of self hosting is that you own it as well: no rug pulls from the laundry list of reselling providers that could go away at any time.
It would be nice to have a one-click sandbox agent on a self-hosted instance that is free, fast (can pay a bit more for more intensive operations), and open source.
Part of it might just be that I am old and inflation is catching up with my understanding of prices.
But as far as AWS goes, I still have to say no thanks. Imagine some group actually started using my hosted AI agent service for something compute and network intensive. It could turn into $2000 overnight, and if I didn't account for one of the numerous types of AWS charges, I might have only collected $500 from credit purchases.
Or it could easily be ten times that. But who am I kidding. No one is going to use my agents. So it doesn't matter if it's gVisor or Firecracker or whatever.
Firecracker just exposes a defined ReSTful API over a unix socket and launches KVM VMs with limited options.
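To make the unix-socket API concrete, here is a minimal sketch in Python of the calls that configure and start one VM. The endpoint paths and JSON field names follow Firecracker's published API as I understand it; the socket path, kernel path, and rootfs path in the usage note are hypothetical, and the helper names (`UnixHTTPConnection`, `boot_requests`, `send_all`) are made up for illustration.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP client that connects to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        # The host name is a placeholder; only the socket path is used.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def boot_requests(kernel, rootfs, vcpus=1, mem_mib=128):
    """Return the (method, path, body) calls that configure and start one VM."""
    return [
        ("PUT", "/machine-config",
         {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        ("PUT", "/boot-source",
         {"kernel_image_path": kernel,
          "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"}),
        ("PUT", "/drives/rootfs",
         {"drive_id": "rootfs", "path_to_host": rootfs,
          "is_root_device": True, "is_read_only": False}),
        ("PUT", "/actions", {"action_type": "InstanceStart"}),
    ]


def send_all(socket_path, requests):
    """Send each request in order and fail on the first non-2xx response."""
    conn = UnixHTTPConnection(socket_path)
    try:
        for method, path, body in requests:
            conn.request(method, path, body=json.dumps(body),
                         headers={"Content-Type": "application/json"})
            resp = conn.getresponse()
            resp.read()  # drain the body so the connection can be reused
            if resp.status >= 300:
                raise RuntimeError(f"{method} {path} failed: {resp.status}")
    finally:
        conn.close()
```

With a firecracker process already listening on a socket, something like `send_all("/tmp/firecracker.socket", boot_requests("/tmp/vmlinux", "/tmp/rootfs.ext4"))` would be the whole launch. Keeping the request list separate from the transport makes it easy to inspect or log what would be sent.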
For custom SMB I still think libvirt is a lower entry cost and may have transferable use cases to longer lived VMs, so you can just launch a qemu microvm[0] and use virsh and/or libvirt xml to set up the networking.
The ~400ms boot time of a qemu microvm vs ~120ms for firecracker may not be an issue for some loads, but qemu will also allow you a bit more density of placement than firecracker. qemu microvms will use a bit more memory individually, but they will also tend to use less real system memory with a larger number of microVMs.
It is all tradeoffs, and kata containers are yet another option that may apply depending on your use case.
You can run your own firecracker or qemu/kvm microvms on most instances that allow nested hypervisors, or on a local host. If cost containment is critical to you this is one possible way forward.
Really it just depends on whether you want/need ReSTful control, or need to support short-lived serverless functions, or if CLIs fit better and you may want to support full VMs.
They both are just Virtual Machine Monitors that targeted different use cases and decided on different tradeoffs.
Just be careful about hosting traditional containers and microVMs on the same system; that config is going to be problematic due to fundamental reasons that are too complex to properly address here.
[0] https://www.qemu.org/docs/master/system/i386/microvm.html
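For anyone who wants to try the qemu microvm machine type from [0] before wiring up libvirt, here is a rough Python sketch that builds a direct `qemu-system-x86_64` command line. The flags mirror the example in the QEMU microvm documentation as best I recall it; this bypasses virsh and libvirt XML entirely, networking is left out, and the function name and file names are hypothetical.

```python
import shlex


def microvm_argv(kernel, rootfs, mem_mib=512, vcpus=2):
    """Build an argv list for a KVM-backed qemu microvm with one virtio disk."""
    return [
        "qemu-system-x86_64",
        "-M", "microvm",                 # the minimal machine type
        "-enable-kvm", "-cpu", "host",
        "-m", f"{mem_mib}m",
        "-smp", str(vcpus),
        "-kernel", kernel,
        "-append", "earlyprintk=ttyS0 console=ttyS0 root=/dev/vda",
        "-nodefaults", "-no-user-config", "-nographic",
        "-serial", "stdio",
        # microvm has no PCI by default, so the disk is a virtio-mmio device
        "-drive", f"id=root,file={rootfs},format=raw,if=none",
        "-device", "virtio-blk-device,drive=root",
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command; pass the list to subprocess.run to launch.
    print(shlex.join(microvm_argv("vmlinux", "rootfs.img")))
```

Returning a list rather than a string keeps it safe to hand to `subprocess.run` without shell quoting problems.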
Daytona, E2B, OpenComputer, Freestyle, Blaxel, Vercel, Modal, Cloudflare, Tensorlake, Superserve, etc. etc.
Some of them work by pre-purchasing credits, so you can control the blast radius of spend.
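The prepaid model is easy to picture as a hard cap: work is refused once the balance runs out, so the worst case is bounded by what was purchased. A minimal sketch in Python, not any provider's actual billing logic; the class names and the amounts in the usage note are made up.

```python
from dataclasses import dataclass


class InsufficientCredits(Exception):
    """Raised when a charge would take the balance below zero."""


@dataclass
class CreditAccount:
    balance_cents: int  # prepaid credits, stored as integer cents

    def charge(self, cost_cents: int) -> int:
        """Deduct cost_cents and return the new balance, or refuse the charge."""
        if cost_cents < 0:
            raise ValueError("cost must be non-negative")
        if cost_cents > self.balance_cents:
            # Refuse instead of going negative: this is the spend cap.
            raise InsufficientCredits(
                f"need {cost_cents} cents, have {self.balance_cents}"
            )
        self.balance_cents -= cost_cents
        return self.balance_cents
```

For example, an account created with `CreditAccount(50_000)` accepts a `charge(20_000)` and then rejects a `charge(40_000)`, leaving the remaining balance untouched. The check has to happen before the sandbox work starts, not after the provider's bill arrives.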
Also, if you want a more embedded sandbox runtime as a library instead of a daemon + REST API, you can check out libkrun (and friendly layers on top of it like https://microsandbox.dev/ and https://smolmachines.com/)
We run quite a few Slicer instances on mini PCs and Ryzen builds - also on Hetzner (and yes ouch 120 EUR / mo up to ~ 550 EUR / mo for 16core / 128GB RAM feels almost unfair)
> Containers launch in seconds, yet their shared-kernel architecture requires significant custom hardening to safely contain untrusted code
That's literally why they made Fargate. It's managed firecracker VMs with containers. They invented firecracker for this purpose. This new product is competing with Fargate, but they don't mention Fargate at all in the announcement.
> you create a MicroVM Image by supplying a Dockerfile and code packaged as a zip artifact in Amazon S3
>
> MicroVMs support up to 8 hours of total runtime
So you're already using containers with this new thing, same as Fargate! And not only that, it's more limited in runtime than Fargate! The only thing different with this service is stateful file storage, which is actually a problem you later have to engineer around, which is why containers are stateless.
This smells like a competing team building something to capitalize on AI hype, but the product isn't differentiated enough for this to make sense long term. If this was a service called managed AI agents, and you added features specific to AI agents, that has value. But "here's Fargate with a different name" isn't gonna last.
https://aws.amazon.com/blogs/aws/firecracker-lightweight-vir... says
> Battle-Tested – Firecracker has been battled-tested and is already powering multiple high-volume AWS services including AWS Lambda and AWS Fargate.
https://engine.build/lab/agent-sandboxes
Will add MicroVMs there today (and any others that are missing if you let me know!)
When we did AWS AgentCore Runtime last year we introduced session isolation, with MicroVMs per session. You can think of Lambda MicroVMs as the same stack, but generalized to fit a larger number of application patterns.
Does this mean you effectively can't use them as long-lived developer environments? It sounds like even if you suspend them, this is the hard limit on the total time it can run.
Using this for a long-lived "developer environment" would be extraordinarily expensive anyhow. Scaling the vCPU + RAM cost of these to the shape of a compute-optimized Graviton On-Demand EC2 instance (16 vCPU x 32 GB RAM) shows about 4x the cost.
So don't do that. Just use an EC2 instance.
But I think the point is that they should be cheap to set up, and because of the short life, never really contain anything except the potential to compute when needed, not important data.
You just have to finish development in 8 hours.
Then when you launch the next one, it's like you are still there?