
Posted by usrme 2 days ago

Code and Let Live(fly.io)
https://sprites.dev/
432 points | 171 comments | page 4
jFriedensreich 11 hours ago|
i don't think i really get what this gives me over docker. everything i read is how i've worked for years
vulcan01 11 hours ago|
Docker does not and cannot offer full isolation. A sandboxed VM on someone else's computer is less likely to be problematic for running untrusted code than a container on your system.
jFriedensreich 10 hours ago||
that doesn't seem to justify submitting to a proprietary single-vendor solution where users are locked into opaque checkpoints they've forgotten how to migrate away from. this is not something made for users, let's be clear. there are tens or hundreds of vm layers for defense in depth with docker, so that's a non-argument. no one says docker has to provide security; it's for tooling and common practices that allow vendor independence and moving to self-hosted stacks as needed!
a_lanfranco 1 day ago||
sprites.dev looks very interesting to me. Is there a way to set up a limit to how much scaling a sprite can get, or to set a spending limit? I wouldn't want to spin something up, and then be surprised by an unexpectedly high bill.
CGamesPlay 1 day ago||
I spun one up, started a server on port 8080, ran `sprite url`, it gave me a URL, that URL just has `{ "error": "unauthorized" }`. How am I supposed to access it?
mrkurt 1 day ago|
sprite url update --auth public

It requires your api token by default.
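
A minimal sketch of what "requires your api token by default" means for a client. The header format is an assumption (a standard `Authorization: Bearer` header), and the URL and token below are placeholders, not real Sprites values:

```python
import urllib.request
from typing import Optional


def build_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, attaching the token only when one is given."""
    req = urllib.request.Request(url)
    if token:
        # Assumption: the token travels as a standard Bearer header.
        req.add_header("Authorization", f"Bearer {token}")
    return req


# Placeholder values, for illustration only; nothing is sent over the network.
req = build_request("https://example.invalid/", token="YOUR_API_TOKEN")
print(req.get_header("Authorization"))
```

With the URL switched to public auth, the same request without a token would be expected to succeed.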

indigodaddy 1 day ago|||
Do we handle our own certs or do you have a proxy in front of the sprites that can do auto ssl stuff?
tptacek 1 day ago||
We handle all the SSL stuff. Sprites run on the same Anycast network with the same control plane as Fly Machines, which are built for srs bzns.
CGamesPlay 1 day ago|||
Oh, thanks, that works. ([edit] rewrote this whole post) I guess I need to install my own tunneling into the VM to do web development on it, but that's not so bad. The lack of regional support is crippling, because whatever region you put me in is ~200ms from me and the typing lag is terrible.

I'd love to adopt this for all my development (which I currently do using rented cloud instances, so I'm pretty comfortable with the remote development paradigm). I'm especially excited about the snapshot/clone pattern, and have (this past week) been researching solutions for exactly this problem.

Hope you launch multiple regions for this ASAP. Will be watching.

mrkurt 1 day ago||
If you `sprite console` to it, it'll forward any ports you open to localhost. You can tunnel almost everything through the CLI with the `sprite proxy` command.
zaptheimpaler 22 hours ago||
The sprite installer got stuck after "Installed to ..." for me. After waiting a few minutes I just Ctrl+C'd, looked at what it does next, and manually ran "sprite auth setup --token <token>", and that seems to just hang for me.
aostiles 21 hours ago||
This seems cool but maybe not for a production setting requiring concurrency? I just signed up on PAYG which offers 3 concurrent sprites. I only see an option to upgrade to 10 concurrent sprites.
tptacek 20 hours ago|
Without getting into Kurt's galaxy-brained take on the declining importance of "production" in a post-AI world, I'd say: yeah, run prod apps on Fly Machines, for more predictable performance, scaling, and pricing. Do exploratory computing --- "figuring out what you'd run on a Fly Machine" --- in Sprites.
siliconc0w 23 hours ago||
It'd be cool to create an MCP for this so you can have your agents run persistent code/other agents.

This is a large pain point today if you aren't technical; most of the chat interfaces just let you create frontend-only apps.

tptacek 23 hours ago|
You can do this now without an MCP, by auth'ing the `sprite` command inside of a Sprite and telling Claude to go document it for you. You can do things like "make me three versions of this feature on three different Sprites so I can compare them". It is spooky how easy it is to teach agents this stuff.
dangoodmanUT 22 hours ago||
I thought fly.io snapshots weren't guaranteed to stick around? I can't find the docs mentioning it, but i checked within the last few months... maybe they changed it?
tptacek 21 hours ago|
More complicated than that, but with respect to Sprites --- this is a totally new stack.
dangoodmanUT 12 hours ago||
it seems like when you snapshot, you snapshot memory AND the filesystem (immutable ftw), that's pretty awesome

i am dying to know: firecracker still? I know you have an upcoming post abt it, but i'm incredibly impatient when it comes to cool new infra

dangoodmanUT 11 hours ago|||
Alright nerd-snipe snooping research post happening now!

Seems like they are using JuiceFS under the hood, with an overlay root for your CoW semantics. JuiceFS gives them instant clone (because they're not cloning the whole rootfs), while the changes to the overlay are done as an overlayfs and probably synced back to S3 via a custom block device they have mounted into firecracker.

You can also see they are using juicefs directly for the "policy" mount (which I'm assuming is the network policy functionality). iirc juicefs has support for block devices too, so maybe they are using that to back the rootfs overlay.

One concerning thing is the `/var/lib/docker` mount - i ran this in an ubuntu container, did they... attach it? Maybe that's a coincidence, but docker is not installed on the sprite by default. (the terminal is also super busted when used through an ubuntu container)

https://pastebin.com/raw/kt6q9fuA (edit: moved terminal output to pastebin because it was so ugly here)

I played with a similar stack recently, my guess is they are:
1. making some base vm, snapshotting it
2. when you create a vm, they just restore a copy and push metadata to it (probably via one of the mounts)
3. any changes that you make to the rootfs are stored on the juicefs block device (the overlay), which is relatively minimal compared to the base os

JuiceFS also supports snapshotting, so that's probably how they support memory + filesystem snapshot and restore so quickly

interestingly, seems they provision maybe a max disk size of 100GB for total checkpoints?

```
NAME  TYPE SIZE FSTYPE MOUNTPOINTS
loop0 loop 100G        /.sprite/checkpoints/active
```

fuse is definitely being used within the VMM, i can see a fuse mount and id being assigned. They're probably using juicefs directly for the policy mount because that doesn't need to be local nvme-cached, just consistent. The local-nvme -> s3 write-through runs on the hypervisor through a custom block device they attach to the firecracker vmm. This might just be the --cache-dir + --writeback cache option in juicefs. Wild guess is just 1 file per block.

guessing the "s3" here is tigris, since fly.io seems to have a relationship with them, and that probably keeps latency down for the filesystem
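
The "instant clone" guess above comes down to copy-on-write layering. Here is a toy sketch of that idea only: a shared read-only base plus a small private upper layer. The class and method names are made up for illustration, and this is not a claim about how Sprites or JuiceFS actually implement it (deletions/whiteouts are left out):

```python
from typing import Dict, Optional


class OverlayFS:
    """Toy copy-on-write filesystem: shared read-only base + private upper layer."""

    def __init__(self, base: Dict[str, bytes],
                 upper: Optional[Dict[str, bytes]] = None) -> None:
        self.base = base                # shared by reference, never mutated
        self.upper = dict(upper or {})  # per-instance writes only

    def read(self, path: str) -> bytes:
        # The upper layer shadows the base, as in overlayfs.
        if path in self.upper:
            return self.upper[path]
        return self.base[path]

    def write(self, path: str, data: bytes) -> None:
        # Writes never touch the base image.
        self.upper[path] = data

    def clone(self) -> "OverlayFS":
        # Cheap: only the (small) upper layer is copied.
        return OverlayFS(self.base, self.upper)


base_image = {"/etc/os-release": b"base os"}
vm_a = OverlayFS(base_image)
vm_a.write("/app/main.py", b"v1")
vm_b = vm_a.clone()
vm_b.write("/app/main.py", b"v2")
print(vm_a.read("/app/main.py"), vm_b.read("/app/main.py"))
```

The cost of `clone` scales with the size of the changes, not the size of the base OS, which is the property the guess relies on.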

dangoodmanUT 12 hours ago|||
i think firecracker: just snooping around a sprite i see a lot of virtio-mmio, whereas afaik CHV (Cloud Hypervisor) would be using PCI in those instances
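
One way to do this kind of snooping, as a rough sketch: the Linux kernel accepts `virtio_mmio.device=<size>@<baseaddr>:<irq>` parameters on its command line, so counting them gives a hint about MMIO-attached virtio devices. The sample string below is made up (it is not output from a Sprite), and finding such entries is a hint about the VMM, not proof:

```python
import re
from typing import List


def virtio_mmio_devices(cmdline: str) -> List[str]:
    """Return the values of all virtio_mmio.device=... kernel parameters."""
    return re.findall(r"\bvirtio_mmio\.device=(\S+)", cmdline)


# Made-up sample; on a real guest you would read /proc/cmdline instead.
sample = (
    "console=ttyS0 reboot=k panic=1 "
    "virtio_mmio.device=4K@0xd0000000:5 virtio_mmio.device=4K@0xd0001000:6"
)
for dev in virtio_mmio_devices(sample):
    print(dev)
```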
skybrian 2 days ago||
This sounds great and it's roughly what exe.dev is doing too. Coincidence?
tptacek 2 days ago||
This has been in the works for quite a while here. We put a long bet on "slow create fast start/stop" --- which is a really interesting and useful shape for execution environments --- but it didn't make sense to sandboxers, so "fast create" has been the White Whale at Fly.io for over a year.
HumanOstrich 1 day ago|||
Not really. One of the primary features of sprites.dev that I don't see anywhere on exe.dev is a fast way to create and restore checkpoints, like a git repo for your entire VM.

This is needed for sandboxes if you don't want to throw them away and start over when something goes wrong.

With sprites.dev you can create an additional checkpoint and then turn Claude Code (or your preferred agent) loose to do anything. Even if it burns down the sandbox you can just restore a checkpoint in about a second.

crawshaw 1 day ago|||
[exe.dev co-founder here] If you are curious, we have a `clone` command coming soon for sub-second creation of a new VM out of an existing VM. This is our first pass at checkpointing: rather than introducing an independent `snapshot` noun, you can keep a VM around as the snapshot.

We realize that is not going to cover all the business cases we have been discussing with customers and plan to introduce a snapshot concept (in particular for rewinding the state of a VM to an automatic backup), but we have a lot of FS work underway before we can launch it. There are some other things we want out of our VMs that we cannot do using conventional cloud techniques, so we have code to write.

tptacek 1 day ago||
Exe.dev is very cool.
skybrian 1 day ago|||
Yes that’s certainly a great feature and they don’t have it currently. For what it’s worth, they do have a teaser about “Persistent disks with some really interesting work coming soon.”

https://blog.exe.dev/meet-exe.dev

memset 1 day ago||
I have just now learned about exe.dev and it looks awesome.

I really hate that modern development means not having persistent disk. I’m glad there are new options coming out which let you do this in an easier way than managing my own EC2 instances!

psanford 1 day ago||
What are the criteria for a sprite being "idle"? Is it no network activity or is it cpu based?
mrkurt 1 day ago||
It stays awake if you have an open connection (like sprite console) or an exec session is running and producing stdout.

You can specify a max exec time for a process when you launch it via the API.

simonw 1 day ago||
Looks like it's no network activity for 30 seconds.
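
Putting the two answers together, the idle rule might look roughly like the sketch below. Everything here is an assumption for illustration: the function name, the inputs, and the 30-second constant (taken from the observation above, not from any documented value):

```python
from typing import Optional

IDLE_TIMEOUT_S = 30.0  # assumption: the observed ~30 s, not a documented constant


def is_idle(open_connections: int,
            last_activity_at: Optional[float],
            now: float,
            timeout_s: float = IDLE_TIMEOUT_S) -> bool:
    """Guess at the idle rule: no open connections and no recent activity."""
    if open_connections > 0:
        # An open connection (e.g. a console session) keeps it awake.
        return False
    if last_activity_at is None:
        # No recorded network activity or exec output at all.
        return True
    return now - last_activity_at >= timeout_s


print(is_idle(open_connections=0, last_activity_at=100.0, now=131.0))
```

Here `last_activity_at` stands in for the timestamp of the most recent network traffic or stdout from an exec session.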
resonious 17 hours ago|
Would LOVE a Termux build of the CLI. I ran the linux install script and got an incompatible binary.