MicroVMs: Run isolated sandboxes with full lifecycle control

Posted by justincormack 4 days ago

MicroVMs: Run isolated sandboxes with full lifecycle control (aws.amazon.com)

377 points | 203 comments

patabyte 1 day ago|

This seems roughly similar to Google's Cloud Run gen2 instance types. My understanding is with the second generation, they are running microvms which are bootstrapped from a container image.

9294 1 day ago||

No one talks about the new Railway Sandboxes - https://docs.railway.com/sandboxes

I think they have one of the best sandbox environments on the market, with pay-per-utilized-resources pricing. It's a huge cost reduction for agentic workloads when you have 95%+ idle CPU time and occasional spikes of CPU-heavy work (e.g. an agent running tests or something like that).

I use Railway to host my OpenClaw-like personal agent for friends and family (9 instances) and it costs like $1-2/mo with scale to zero.

dj0k3r 1 day ago|

Have you tried Unikraft? I think it might be cheaper. Worth a try.

skybrian 1 day ago||

Does anyone understand the pricing? The pricing page says “Lambda MicroVMs are priced per instance-second” but MicroVMs aren’t otherwise mentioned.

otterley 1 day ago|

Click on the "MicroVMs" tab of the pricing page: https://aws.amazon.com/lambda/pricing/

skybrian 1 day ago||

Thanks! These tabs render badly on mobile, but you can click on “Functions” to hide it and then click the “MicroVMs” tab to show it.

This pricing model looks very complicated and unfriendly for hobbyists. Maybe it’s cheaper than exe.dev’s $20/month, but I have no idea. I’d have to do a complicated calculation based on guesses to tell.

otterley 1 day ago||

I don't think it's that complicated, but yeah, it's not as simple as $X/month.

The primary difference is that with Lambda you pay by the second, not by the month. According to my math, the break-even point for an 8GB allocation (the minimum exe.dev supplies) would be about 1.65 days of continuous runtime per month. Less than that, and you're better off with Lambda. More than that, and you're better off with exe.dev (assuming we're just talking about money and not opportunity cost). Lambda allows you to use just 2GB of memory, though, so being more memory efficient would change the break-even point to 6.61 days.
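The break-even reasoning above can be sketched as a small calculation. This is only an illustration: the function names are made up for the example, and the per-second rates are backed out of the comment's own figures ($20/month, 1.65 days, 6.61 days) rather than taken from any published AWS price list.

```python
SECONDS_PER_DAY = 86_400


def break_even_days(flat_monthly_usd: float, per_second_usd: float) -> float:
    """Days of continuous runtime per month at which per-second billing
    costs the same as a flat monthly fee."""
    if per_second_usd <= 0:
        raise ValueError("per_second_usd must be positive")
    return flat_monthly_usd / (per_second_usd * SECONDS_PER_DAY)


def implied_per_second_rate(flat_monthly_usd: float, break_even: float) -> float:
    """Back out the per-second rate implied by a quoted break-even point."""
    return flat_monthly_usd / (break_even * SECONDS_PER_DAY)


# Rates implied by the comment's figures (derived, NOT official prices).
rate_8gb = implied_per_second_rate(20.0, 1.65)
rate_2gb = implied_per_second_rate(20.0, 6.61)

# The ratio comes out close to 4, which suggests the quoted numbers assume
# cost scales roughly with the memory allocation (8GB vs 2GB).
ratio = rate_8gb / rate_2gb
```

To redo the comparison with real numbers, plug the per-second rate from the pricing page into `break_even_days` together with whatever flat monthly price you are comparing against.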

skybrian 21 hours ago||

I’m running a web server in a VM and I use it every day. It’s mostly idle, but it’s continually available. I wonder how much “continuous runtime” that is?

otterley 18 hours ago||

The stopwatch starts when a request arrives and stops after your processor sends the response. You’re not charged for idle time. For low-demand services, it’s a bargain. The tradeoff is a bit of extra latency for cold starts (i.e. when a request hasn’t been processed in a while). Nowhere near a full classic VM launch though—typically under a second.
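If billing works the way this comment describes (only request-handling time counts, idle time is free), a mostly idle web server can be estimated roughly as below. The traffic numbers are hypothetical, and the sketch assumes requests do not overlap; it is a back-of-the-envelope estimate, not a description of how AWS meters usage.

```python
SECONDS_PER_DAY = 86_400


def billed_days_per_month(requests_per_day: float,
                          avg_seconds_per_request: float,
                          days_in_month: int = 30) -> float:
    """Estimate billed runtime per month, expressed in days, assuming only
    request-handling time is billed and requests do not overlap."""
    billed_seconds = requests_per_day * avg_seconds_per_request * days_in_month
    return billed_seconds / SECONDS_PER_DAY


# Hypothetical traffic: 2,000 requests a day at 0.2 s each.
usage = billed_days_per_month(2_000, 0.2)

# Compare against the 1.65-day break-even figure quoted earlier in the thread.
cheaper_than_flat_fee = usage < 1.65
```

Under those assumed numbers the billed time is a small fraction of a day per month, well under the break-even point quoted above.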

TacticalCoder 1 day ago||

What's the point of microVMs for running agents?

Are you guys literally spinning up agents where a 100 ms boot time vs a 3-second boot time makes a difference?

I'm asking because I understand the appeal of micro VMs but every time the subject comes up people talk about "isolating agents": what's wrong with isolating agents in a regular VM (or in a container which, itself, is in a VM)?

FWIW I've got my stuff nicely isolated in regular VMs that are regularly up for hours and hours.

It's like the microVM boots in 100 ms, then the agent does... What? And exits after another 100 ms and now you need to launch another one?

What's the use case of "microVMs to isolate agents"?

sonink 14 hours ago||

I don't get it either - I was going to ask the same question but found this.

We have been doing the exact opposite - instead of micro VMs we are giving agents larger VMs.

Previously we were giving them 1 GB RAM VMs - now we have upped to 4 GB RAM VMs. When the agent is working, the real cost is in the inference. There is no reason to keep the agent waiting because your VM is too damn slow. So we moved to larger and faster VMs.

The agent might install a package, or run a script - and now it just moves along faster. Not to mention that if the agent is installing a 'fat' SDK, like maybe the Android SDK, more RAM keeps everything moving smoothly without breakages. The incremental amount we pay for the bigger VM is more than justified by the increase in agent performance.

And all the tooling that has already been built up for standard human-operated VMs just works pretty well out of the box. We are able to spin up VMs pretty much on demand and purge them clean once the work is done.

We are moving to 8 GB RAM / 4 CPUs sometime this year, and GPUs hopefully sometime next.

victorbjorklund 1 day ago|||

I imagine you can have a situation where you let an agent run in a shared env but to access certain tools you spin up a VM just for the tool call duration and then shut it down again. Let’s say you wanna allow the agent to write and run code then you need it to run it somewhere safe

vmg12 1 day ago|||

Microvms are better for the VM provider. They use less memory and have a smaller attack surface. Also starting in 100ms means you don't need to add a bunch of async machinery when launching the vms.

0xbadcafebee 1 day ago|||

This is for people who want both faster execution, and better security isolation for agents/subagents. It is a different use case than yours

TacticalCoder 1 day ago||

I understand that but micro VMs don't provide better security isolation than regular VMs.

So that leaves faster boot times.

Faster boot times and then the agent does what? And at how many tokens/s? And what's the "time to first token" anyway?

How do the time to first token and the inherent tokens/s limits of LLMs not totally dominate the running time?

I just don't get the use case.

nok22kon 1 day ago||

imagine installing an agent in slack at a company with 1000 employees, and you want each request to have its own VM for data analysis, downloading repos and working on them, ...

regular VMs just use too much memory, a typical Ubuntu install uses 512 MB as a baseline

0xbadcafebee 1 day ago||

^ this. a single long session may use 20 subagents, each of which needs its own VM, on top of the parent agent's VM, all of which may need separate security credentials and isolation, in addition to the spinup time and resources used. each user might do 100 sessions a week. so that's 2,000 VMs per week per user. each regular VM takes, let's say, 10s to boot up. that's about 5.5 hours per week just waiting for VMs to start (for a single user).

then there's the disk iops used for spinning up all these VMs (loading and booting a whole distro), the security attack vectors of an entire VM vs microVM, the maintenance of the images, the hypervisor abstraction to handle all this automation, ssh for the agent to run in the VM, etc.

compared to mounting an extracted container image to a folder, starting a microVM kernel with folder mount, with specific credentials attached. minimum memory and CPU allocated, minimum possible system resource use, fastest operation, least maintenance. you get more time, more resources, more security.

(micro VMs do provide better security isolation. they have kernels with fewer built-in vulnerabilities, fewer hardware drivers to exploit, a more locked-down network, and they lack a full OS's applications and filesystem permissions to exploit)
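The boot-time arithmetic in this comment can be checked with a few lines. The 20 subagents, 100 sessions a week and 10 s boot time come from the comment; the 100 ms microVM figure is the one quoted elsewhere in the thread. The sketch assumes VMs boot one after another, so it is an upper bound if some of them start in parallel.

```python
def boot_wait_hours(sessions_per_week: int, vms_per_session: int,
                    boot_seconds: float) -> float:
    """Total hours per week spent waiting for VMs to boot, assuming
    every VM boots sequentially."""
    total_vms = sessions_per_week * vms_per_session
    return total_vms * boot_seconds / 3600


# 100 sessions x 20 subagent VMs = 2,000 boots per week per user.
regular_vm = boot_wait_hours(100, 20, 10.0)   # 10 s per regular VM boot
micro_vm = boot_wait_hours(100, 20, 0.1)      # 100 ms per microVM boot
```

With those inputs the regular-VM case works out to roughly five and a half hours a week, while the microVM case is a few minutes.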

sublimefire 17 hours ago||

This example is a bit over the top and is more of an edge case. Subagents of the same session can use the same VM, because what is the point of isolating them from each other? If at least one subagent is trying to hack you, then I would consider the whole session compromised anyway, since you cannot guarantee the agents aren't leaking things among themselves.

tastyeffectco 20 hours ago|||

in so many cases, docker is more than sufficient for major agent workloads... with no hostile users of course

lysecret 1 day ago||

I don’t get it. We are paying at least hundreds or maybe thousands per month on AI costs. Just get a regular VM?

mjb 1 day ago||

You absolutely can run agents on a regular VM. But if you want to build multi-tenant and multi-agent systems with strong security boundaries, then having a VM or MicroVM per agent session (or session with a group of agents) really simplifies things.

When we did AWS AgentCore Runtime last year we introduced session isolation, with MicroVMs per session. You can think of Lambda MicroVMs as the same stack, but generalized to fit a larger number of application patterns.

retinaros 1 day ago||

why use AgentCore Runtime then?

skybrian 1 day ago|||

You don’t have to pay that much. I did pay a couple hundred for a while, but not since I switched to Chinese models along with a $20 ChatGPT subscription.

Also, a single VM is pretty limiting.

victorbjorklund 1 day ago||

Isn’t the point that you wanna be able to spin up and down thousands of VMs on demand (literally a VM just to run a tool and then shut it down until the next tool call)?

rbbydotdev 1 day ago||

Anyone have a price chart comparing all the sandbox providers (MicroVMs included)?

robmccoll 1 day ago||

What does the actual startup latency look like? Does it depend on the size of the resulting image?

simonw 1 day ago|

I tried this a few days ago. Once you have an image built and ready, startup time is fast, but building that original image took 5-10 minutes.

I think it's designed for building an image once and then reusing it many, many times.

colesantiago 1 day ago||

How does this compare to Fly.io?

Which is cheaper for me?

Ideally maybe self hosting would be better?

simonw 1 day ago|

Fly.io doesn't set a maximum of 8 hours of alive time on your instance.

Also, MicroVMs can't be exposed directly to the web. Your code running in them can only be executed via API calls with attached auth tokens - so if you wanted to host a public facing API or website with them you'd need to implement your own additional layer in front.
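As a rough illustration of the "additional layer in front" idea, the sketch below shows only the core step such a layer would perform: turning a public request into an authenticated API call. Everything specific here is an assumption made for the example, including the base URL, the bearer-token header scheme and the function name; none of it is taken from the AWS API.

```python
import urllib.request


def build_upstream_request(base_url: str, path: str, body: bytes,
                           token: str) -> urllib.request.Request:
    """Wrap a public request as an authenticated call to a private API.

    base_url and the "Bearer" auth scheme are hypothetical placeholders.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )


# Placeholder endpoint and token, for illustration only.
req = build_upstream_request(
    "https://sandbox.example.invalid/invoke", "/hello", b"{}", "TOKEN"
)
# A real front layer would send this with urllib.request.urlopen(req)
# and relay the response to the public client.
```

A production version would also need rate limiting, timeouts and its own authentication policy for public callers, which this sketch leaves out.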

Something I appreciate about Fly (disclaimer: they support my work) is that the pricing is fixed - you pay $1.94/month (less if you suspend your machine) for the smallest instance, up to $976.25/month for the largest (16 CPUs, 128GB) plus predictable costs for volume storage.

The only variable outside your control is bandwidth, and that's unlikely to cause a nasty shock.

Contrast with any of the more "elastic" hosting providers - Vercel, Cloud Run - and you're much less likely to get a horrifying bill if something gets overly-crawled or goes viral.

tptacek 1 day ago|||

I'm pretty proud of this:

https://fly.io/blog/accident-forgiveness/

A way we simply suck at business: we didn't keep beating the drum about this after we wrote the policy up. We just sort of figured everyone read the blog post and moved on. We probably should have been continuously making noise about it.

What you get from having a company made almost entirely of engineers.

anamexis 1 day ago|||

Fly.io's Sprites [1] do offer public web access as an option. They also have dynamic pricing.

[1] https://sprites.dev

tptacek 1 day ago||

To a first approximation everything in this space has dynamic pricing. If it's not priced dynamically, you're presumably paying a premium either on a commit or in gym pricing.

anamexis 1 day ago||

I don't know what the right term is, but maybe "deterministic" pricing (this is not the right term, but maybe closer). That is, I'm not going to know how much a sprite cost until I see the bill (or look up the live usage report), whereas if I spin up a Fly Machine, I know exactly how much I'm going to pay per unit of time.

(Both make sense for their respective use cases.)

tptacek 23 hours ago||

Ah, that makes sense. Yeah, that's a technical limitation! I'm sure we'll work through it at some point this year, but it's a consequence of the fact that for most people, most of their Sprites are dormant most of the time; it's how you comfortably get to having 20-30 Sprites (making a new one any time you do something new) for every user.

It's a good callout, a genuine difference between Sprites and Fly Machines. Believe it or not, it's intended to make Sprites cheaper than Machines.

anamexis 21 hours ago||

I absolutely believe it! And feel the pricing model makes perfect sense for the use case.

dev_l1x_be 1 day ago||

I am not sure how much this changes the landscape.

praveenhm 1 day ago|

What is the trend right now for running microVMs on a Mac? I am using OrbStack. Is there anything more lightweight than this?

bkircher 17 hours ago|

Yes. On macOS in particular you can use sandbox-exec(1) with custom, per-task SBPL profiles, combined with strict control over the environment variables that are passed into the agent process, plus an outbound firewall like Little Snitch.

The important thing is to isolate tasks from each other. Example: for work-related tasks I let the agent access Datadog or the Docker socket. Everything else does not have access to these.
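A minimal sketch of that setup, assuming a per-task profile file already exists: it builds an allowlisted environment and a `sandbox-exec -f <profile>` command line. The profile path, agent command and variable names are hypothetical, and the code only constructs the arguments, it does not launch anything. Note that the sandbox-exec man page marks the tool as deprecated.

```python
from __future__ import annotations

import os
from typing import Mapping


def filtered_env(allowlist: set[str],
                 environ: Mapping[str, str] | None = None) -> dict[str, str]:
    """Keep only explicitly allowed environment variables."""
    source = os.environ if environ is None else environ
    return {k: v for k, v in source.items() if k in allowlist}


def sandboxed_command(profile_path: str, agent_argv: list[str]) -> list[str]:
    """Prefix the agent command with sandbox-exec and a profile file (-f)."""
    return ["sandbox-exec", "-f", profile_path, *agent_argv]


# Hypothetical inputs: the API key is dropped because it is not allowlisted.
env = filtered_env(
    {"PATH", "HOME"},
    {"PATH": "/usr/bin", "HOME": "/Users/me", "DD_API_KEY": "secret"},
)
cmd = sandboxed_command("profiles/default.sb", ["my-agent", "--task", "lint"])
# On macOS this could then be launched with:
#   subprocess.run(cmd, env=env, check=True)
```

Using a separate profile file and allowlist per task is what keeps, say, a work task with Datadog credentials apart from everything else.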
