Posted by ozkatz 3 days ago

Show HN: Tilde.run – Agent sandbox with a transactional, versioned filesystem(tilde.run)
199 points | 132 comments
docheinestages 3 days ago|
Just my two cents: less is more and the first impression matters a lot. I'm saying this because we see a new agent sandbox tool on the front-page almost every day. Most of them have an AI-made landing page design, lots of animations, lots of words. This has become a bad sign for me. I can tell that you put time into it, made a video, and everything, but I guess I'm suffering from some kind of fatigue of having to go through all these tools. So, the less I have to process to get to the meat of exactly what I'm looking at, what sets this apart from others, why and when I would need to use it, then the more likely I am to actually engage with the product.
ozkatz 3 days ago||
That's fair. What makes this unique is the versioned, composable filesystem. It's built on top of lakeFS (https://github.com/treeverse/lakeFS) so it scales really well, unlike other solutions that try and do this with Git directly.
hamandcheese 2 days ago|||
Is lakeFS an FS....? Zero mention of FUSE or a kernel module at all in the README.
rendaw 2 days ago||
The title says it's a new filesystem, so you either need to use FUSE or a kernel module.
cbsks 2 days ago||
I mean not really. There is a FUSE implementation, but you need an enterprise account https://docs.lakefs.io/v1.60/reference/mount/

I’m not seeing a kernel module anywhere..

doctorpangloss 3 days ago|||
LLM authored comments are against the rules. I don't think file versioning is differentiated anyway.
nateb2022 2 days ago||
OP is actually one of the co-creators of lakeFS, for context.
messh 2 days ago|||
Sadly this is what sells. For one that stands out in this regard, check out https://shellbox.dev - though maybe it swings too far?
whalesalad 3 days ago|||
Agreed. All of these tools promise the world and are so incredibly vague. Actually show me what I can do with it, like hands on.
ozkatz 3 days ago||
https://www.youtube.com/watch?v=fDR8tmes020 - a 2 minute hands-on demo!
lifty 2 days ago|||
I see a lot of negative feedback here, but I don't agree with it. This is really fantastic what you have built, especially for longer running agents that are used repeatedly, in which case the initial investment of giving only the permissions it needs is worth the effort. To that end, ability to combine several agents which have different roles, which are narrowly scoped in terms of permissions, would be a very useful feature. Perhaps you could even have an agent or UI overlay driven by AI, which can quickly scope the permissions for a new agent, so that users don't need to do it manually.
whalesalad 2 days ago|||
Being brutally honest - terrible demo. 80% of this is baseline stuff, setting up permissions (annoying), and the last few seconds we see a file was deleted and we can approve it. This is not selling your product.
ozkatz 2 days ago||
Appreciate the honest feedback. I agree there's a lot to improve there.
dev360 2 days ago|||
As someone who is building an AI tool in this category, can you give examples? :)

I've tried to focus more on end-user use-cases in my own product positioning, even though security is absolutely at the top of my list. This was hard to watch because it felt like it demonstrated a security feature that is really secondary to the purpose of an agent.

What would be a spin in this AI category that would excite or surprise you?

debarshri 2 days ago||
Anthropic is probably looking at this trend and building something. When released, it will kill a couple of startups.
jFriedensreich 2 days ago||
I had to dig hard to find that this is a SaaS sandbox offering, not an actual sandbox (software I can use locally). It's just wasting people's time; no one needs a non-open-source sandbox. There are now at least 3 Apache 2 projects (smolmachines, microsandbox, boxlite) working on sandboxes, and at least one of them should be ready for prime time soon.
alexellisuk 2 days ago||
It's interesting to see this one launch (yes yet another sandbox.. I was getting worried we'd not seen one for a few days)

SlicerVM (est. 2022) is already used for prime time, not "free as in beer" but has pretty reasonable individual plans that include all features. Shares the core code with actuated. (Creator of both speaking here)

Feel free to take a look and see if it gives you a little more than the others you mentioned. If not, no problem; I realise some folks prefer free stuff.

jFriedensreich 2 days ago||
What do you mean "not free as in beer"? It's not free as in anything? Sandboxes need to be open source, nothing else is acceptable.
kjok 2 days ago||
Why should they be open source?
jFriedensreich 1 day ago||
Same reason Linux or databases need to be open source. A sandbox is not a nice-to-have or a feature anymore; it is a fundamental building block for running any software. You cannot depend on closed source building blocks, not as a closed source product and especially not as an open source product.
HatchedLake721 2 days ago||
It’s like saying no one needs Dropbox because rsync exists, or no one needs HubSpot because Salesforce exists.
grim_io 2 days ago|||
No, it's like saying almost no one wants a saaszsh. Which is probably an accurate statement.
jFriedensreich 2 days ago|||
Not really, it's more like saying no one needs another Windows when Linux exists. By "no one needs" I mean the world needs open source sandbox building blocks that are up to the challenges of the current age. No closed source solution can be a fundamental building block for the world to become better and more secure. No non-local building block can be at the foundation of anything that makes the world better and more robust.
HatchedLake721 2 days ago||
That's a very narrow and technical person's point of view.

You might need it open source, the majority of the world doesn't care, like they don't care Windows is closed source, or like AWS is a "cloud" running somewhere else. Both of them are building blocks that made "the world better and more robust".

suprjami 19 hours ago||
Users might not care but the people building such things (us) do care and must care so that we can provide a product for those users which does what it says.

If you don't care that's fine. You go run the Claude Code "sandbox" and let it put your entire home directory on a public pastebin. Anthropic guarantee it will exfiltrate your data in the most secure way possible.

The rest of us want verifiable sandboxes which we can fix if they are wrong.

skeledrew 3 days ago||
I made something pretty similar to this a couple months ago, when I was just getting into using coding agents. Has 2 parts that work individually but are better together: a change tracking FS and an agent sandbox. Haven't really used it though as it's a pain to get Claude Code working in that - Docker-based - sandbox without baking it in, and I really want something that's fully configurable. And then I didn't really need it to because I'm a very interactive user; I'm almost constantly watching the agent and never use YOLO... except for 1 codebase where it's frustratingly failing to fix a single particular bug and I really don't want to deal with it myself.
jmull 3 days ago||
This is an excellent idea whose time has come.

But this is too vague for me. I'm not seeing my questions answered in the landing page or FAQ either.

E.g.,... what's the pricing?

How does atomic commit really work? E.g., if one write to S3 succeeds but the update to a git repo fails?

Does this use optimistic locking or something else? What happens if I commit changes to a resource that was updated since it was imported?

Where/how is it hosted?

ozkatz 3 days ago|
Regarding pricing - that's indeed a great question and we don't have an answer yet. It will very likely be based on consumption and should be competitive with similar solutions.

Atomic commits are based on snapshotting done by lakeFS under the hood. Each sandbox run produces a new atomic commit to a hidden "main" branch. Updating that branch is optimistically concurrent, with lakeFS checking for conflicts - multiple writers updating the same object.
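
For readers unfamiliar with the pattern, here is a minimal sketch of an optimistically concurrent commit to a single branch. It is not lakeFS or Tilde code: the `Branch` class, `ConflictError`, and the in-memory snapshot dictionaries are all hypothetical, chosen only to illustrate "check for conflicting writers, then publish everything at once".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class ConflictError(Exception):
    """Raised when another writer changed an object we also changed."""


@dataclass
class Branch:
    # Hypothetical model: each commit is an immutable snapshot, path -> content id.
    commits: List[Dict[str, str]] = field(default_factory=lambda: [{}])

    @property
    def head(self) -> int:
        return len(self.commits) - 1

    def commit(self, base: int, changes: Dict[str, Optional[str]]) -> int:
        """Apply `changes` made against commit `base`; None marks a deletion."""
        base_snap = self.commits[base]
        head_snap = self.commits[self.head]
        # Conflict: a path we touched differs between our base and the current head,
        # meaning some other writer committed a change to the same object.
        clashes = sorted(p for p in changes if base_snap.get(p) != head_snap.get(p))
        if clashes:
            raise ConflictError(f"changed by another writer: {clashes}")
        new_snap = dict(head_snap)
        for path, value in changes.items():
            if value is None:
                new_snap.pop(path, None)
            else:
                new_snap[path] = value
        # A single append publishes all changes together, or none of them.
        self.commits.append(new_snap)
        return self.head
```

In this sketch, two sandbox runs that start from the same commit and touch different paths both succeed, while the second of two runs that change the same path gets a `ConflictError` and would have to be rerun or resolved.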

aussieguy1234 2 days ago||
Nice project, but saying "Run AI agents in production without the risk" isn't quite accurate.

Even if some tool makes it impossible for an AI agent to delete things in a way that isn't recoverable, there are other risks such as data exfiltration that need to be managed separately.

SachitRafa 9 hours ago||
Will definitely be trying this. How do you handle rolling back API calls that are not part of a transaction?
_pdp_ 2 days ago||
Git is already versioned, S3 supports versioning, and any file copied into the sandbox is, well, a copy, so I am not sure what the angle is here.

Other than that it looks cool!

gatvol 2 days ago|
Doesn't s3 now have versioning + POSIX mounts?
ozkatz 2 days ago|||
S3 offers versioning at the single file level.

Imagine an agent dropping a directory with 1m images in it. Just figuring out what happened and what got dropped, restoring them one by one, etc. is doable, but the ergonomics are a bit lacking.

sudb 2 days ago||||
Yep!

https://aws.amazon.com/blogs/aws/launching-s3-files-making-s...

_pdp_ 2 days ago||
Thanks. This is actually interesting. The only downside is that it only works within AWS.
revv00 2 days ago||
JuiceFS for POSIX, but it lacks versioning.
otterley 2 days ago|||
S3 Files is not POSIX compliant and doesn’t claim to be so. For example, atomic renames aren’t supported.
kushalpatil07 3 days ago||
I was trying to build an agent. None of the sandboxes out there had solved the filesystem problem. I want my agent to have persistent storage that stays forever. Like a human with a computer. When the agent spins up again, it has access to the computer with the same files.

I had to create my own setup using aws s3 filesystem and docker for this.

Does Tilde solve for this?

thepoet 3 days ago||
Hey, this is exactly what we do at https://instavm.io Agents get persistent storage that outlives the sandbox, and when the agent spins up again you get access to the computer with the same files.
theaniketmaurya 2 days ago|||
This is something solved by a bunch of new sandboxes including ours - SmolVM

https://github.com/CelestoAI/smolVM

Galanwe 3 days ago|||
Snapshotting a filesystem is trivial with e.g. btrfs. You can hook snapshot creation in your agent.

That is a single one-liner of btrfs subvolume snapshot, in a single hook configuration file, ready to be valued at $10B as a quantum agentic versioned sandbox startup.
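
As a rough illustration of the hook idea, the sketch below wraps `btrfs subvolume snapshot -r <source> <dest>` in a small Python helper that an agent hook could call. The paths and function names are hypothetical, and it assumes the working directory is a btrfs subvolume and that the caller has sufficient privileges to create snapshots.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import PurePosixPath
from typing import List


def snapshot_command(subvolume: str, snapshot_dir: str, label: str) -> List[str]:
    """Build the btrfs invocation for a read-only snapshot named `label`."""
    dest = PurePosixPath(snapshot_dir) / label
    # -r makes the snapshot read-only, so the agent cannot modify it later.
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, str(dest)]


def take_snapshot(subvolume: str, snapshot_dir: str) -> str:
    """Snapshot `subvolume` under a UTC timestamp label; returns the snapshot path."""
    label = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    cmd = snapshot_command(subvolume, snapshot_dir, label)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if btrfs fails
    return cmd[-1]
```

A hook would call something like `take_snapshot("/work", "/snapshots")` (both paths are placeholders) before each agent step. Note that this covers point-in-time recovery only, not the approve-and-merge workflow discussed in the reply below.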

ozkatz 3 days ago||
Part of the appeal (subjective, I know) of versioning is stuff like human-in-the-loop approvals. Think of a pull request: a change is requested by an agent, a human approves, changes get merged atomically. Even if other changes were applied since creation.
empath75 2 days ago|||
Agent Sandboxes is the official k8s solution for this.
gitaarik 2 days ago|||
Isn't that like working on a codebase with an agent?
gavmor 3 days ago|||
Nanoclaw mounts each agent's folder to the ephemeral container.
zuzululu 3 days ago|||
just get a $5 VPS or hetzner and you are good.
keepamovin 2 days ago|||
Just run it on your GitHub actions minutes
stronglikedan 3 days ago|||
infosec would like a word...
zuzululu 3 days ago||
Which is the bare minimum that I hope people are doing; nothing about trusting a third party is any less or more secure.
ozkatz 3 days ago||
Exactly that!
seamossfet 3 days ago||
Does this provide gitflow to handle conflicts from multiple agents touching the same file system or is it purely for single-branch sequential iterations on the filesystem?

I have a use case that could use this if it supports handling branching and merging file systems.

ozkatz 3 days ago|
It uses lakeFS under the hood, so the unit of conflict would be a single file (object, under the hood). Resolving conflicts requires "picking" a winning side, or rerunning a conflicting job. Would you see a use case for merging changes into the same file? Interested to hear about your use case!
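
To make "the unit of conflict is a single file" concrete, here is a minimal three-way merge sketch over whole-file content hashes. It is not the lakeFS algorithm or API: the `merge` function, the `prefer` parameter, and the snapshot dictionaries are hypothetical and only illustrate detecting file-level conflicts and picking a winning side.

```python
from typing import Dict, Optional, Set, Tuple

Snapshot = Dict[str, str]  # path -> content hash (hypothetical representation)


def merge(base: Snapshot, ours: Snapshot, theirs: Snapshot,
          prefer: Optional[str] = None) -> Tuple[Snapshot, Set[str]]:
    """Three-way merge at file granularity.

    Returns the merged snapshot and the set of paths both sides changed
    differently. `prefer` may be "ours" or "theirs" to pick a winning side;
    if it is None, conflicting paths keep their base version.
    """
    merged: Snapshot = {}
    conflicts: Set[str] = set()
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or t == b:
            winner = o  # both sides agree, or only our side changed
        elif o == b:
            winner = t  # only their side changed
        else:
            conflicts.add(path)  # both sides changed the same file differently
            winner = {"ours": o, "theirs": t}.get(prefer, b)
        if winner is not None:  # None means the file is deleted in the result
            merged[path] = winner
    return merged, conflicts
```

Because files are compared as opaque hashes, two edits to different parts of the same file still count as a conflict here, which is exactly the limitation the question about merging changes into the same file is probing.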
seamossfet 3 days ago||
We're building a CAD for drug design, we often have to handle large and highly varied file formats. Protein structures, compounds, python scripts, lab notebook entries, instrumentation data, etc.

From a data structure and file ergonomics perspective, think of it as similar to Unity or UE4 for drug design. We have a huge variety of assets to manage alongside their relationships to each other, and the project files are local on the user's machine (with a collaboration / sync over the network between scientists working on the same project, hence where something like this would come in for us).

Many of those files are fine with a winning side strategy, but some of them might not be that clean. Take a protein structure defined by an `mmcif` file, for example: if we clean the file by removing hydrogen atoms and another scientist repairs a side chain on that same file, then we'd need a way to reconcile those differences.

On the agent side, our agents will generate small python scripts that manipulate the proteins, then cache and re-use those scripts as tools when possible. So preserving those scripts alongside the mutated asset and conversation history is something we've been working on.

anonymousiam 3 days ago|
Back in the 1970s when versioned filesystems were invented, they provided a recovery path for when a file was improperly changed or deleted. Now, in the age of LLMs that go rogue, I can see why they would become popular again.
ozkatz 3 days ago|
Oh VMS, How I miss thee