Posted by atombender 16 hours ago
https://github.com/hsaliak/sacre_bleu very rough around the edges, but it works. In the past, apps either behaved well or had malicious intent; with these LLM-backed apps, you are going to see apps that want to behave well but cannot guarantee it. We are going to see a lot of experimentation in this space until the UX settles!
been watching microsandbox but it's pretty early. landlock is the linux kernel primitive that could theoretically enable something like this, but nobody's built the nice policy layer on top yet.
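for the curious, the raw primitive is small enough to poke at directly. a minimal sketch (assumptions: x86_64 Linux, kernel 5.13+; syscall number and flag values are hardcoded from the uapi headers) that probes the Landlock ABI version and builds a ruleset handling write-type filesystem access, which is exactly the kind of plumbing a policy layer would have to wrap:

```python
import ctypes

libc = ctypes.CDLL(None, use_errno=True)

# syscall number on x86_64 (assumption: adjust for other architectures)
SYS_landlock_create_ruleset = 444
LANDLOCK_CREATE_RULESET_VERSION = 1 << 0

# filesystem access flags from <linux/landlock.h>
LANDLOCK_ACCESS_FS_WRITE_FILE = 1 << 1
LANDLOCK_ACCESS_FS_REMOVE_FILE = 1 << 5
LANDLOCK_ACCESS_FS_MAKE_REG = 1 << 8

class RulesetAttr(ctypes.Structure):
    # ABI v1 layout: a single u64 of handled filesystem accesses
    _fields_ = [("handled_access_fs", ctypes.c_uint64)]

def landlock_abi_version():
    """Return the kernel's Landlock ABI version, or None if unsupported."""
    v = libc.syscall(SYS_landlock_create_ruleset, None, 0,
                     LANDLOCK_CREATE_RULESET_VERSION)
    return v if v >= 0 else None

def create_deny_writes_ruleset():
    """Create a ruleset fd that *handles* (i.e. will deny unless explicitly
    allowed per-path) write-type accesses. Caller would then add rules with
    landlock_add_rule and apply it via landlock_restrict_self."""
    attr = RulesetAttr(LANDLOCK_ACCESS_FS_WRITE_FILE
                       | LANDLOCK_ACCESS_FS_REMOVE_FILE
                       | LANDLOCK_ACCESS_FS_MAKE_REG)
    fd = libc.syscall(SYS_landlock_create_ruleset,
                      ctypes.byref(attr), ctypes.sizeof(attr), 0)
    if fd < 0:
        raise OSError(ctypes.get_errno(), "landlock_create_ruleset failed")
    return fd
```

everything above "works" but is per-process, per-arch, and allowlist-only — the missing nice layer is the part that maps a human-readable policy onto these rulesets.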
curious if anyone has a good solution for the "agent running on a remote linux server" case. the threat model is a bit different anyway (no iMessage/keychain to protect) but filesystem and network containment still matter a lot
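one building block i've leaned on for the server case: systemd's per-unit sandboxing directives get you surprisingly far without extra tooling. a sketch of a hardened unit — the unit layout is real systemd syntax, but the service name, binary path, and allowed subnet are all assumptions to adapt:

```ini
# agent.service (sketch)
[Service]
ExecStart=/usr/local/bin/my-agent
# filesystem containment: read-only /usr and /etc, one writable workdir
ProtectSystem=strict
ProtectHome=read-only
ReadWritePaths=/home/agent/workspace
PrivateTmp=yes
NoNewPrivileges=yes
# network containment: deny everything except the internal subnet
IPAddressDeny=any
IPAddressAllow=10.0.0.0/8
```

not a complete answer (no syscall filtering shown, and IPAddress* needs cgroup v2), but it covers the filesystem/network containment part of the threat model.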
[1] https://github.com/anthropic-experimental/sandbox-runtime [2] https://github.com/carderne/sandbox-runtime
Taking more of an automated supervisor approach with limited manual approval for edge cases.
Grith.ai
However, the challenge is that sandbox profiles (rules) are always workload-specific. How do you define “least privilege” for a workload and then enforce it through the sandbox?
Which is why general-purpose sandboxes won't be useful or even feasible. The value is in observing a given workload and auto-generating a baseline policy for it.
Wrong or overly relaxed policies would make the sandbox ineffective against the real threats it is expected to protect against.
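to make the observe-then-generate idea concrete, a toy sketch — the trace format mimics `strace -f -e trace=openat` output, and the policy shape (ro/rw path allowlists) is made up for illustration:

```python
import re
from collections import defaultdict

# matches lines like: openat(AT_FDCWD, "/etc/ssl/cert.pem", O_RDONLY|O_CLOEXEC) = 3
OPEN_RE = re.compile(r'openat\(\w+, "([^"]+)", ([A-Z_|]+)')

def baseline_policy(trace_lines):
    """Fold observed file opens into a read-only / read-write allowlist."""
    policy = defaultdict(set)
    for line in trace_lines:
        m = OPEN_RE.search(line)
        if not m:
            continue
        path, flags = m.groups()
        # any write-capable open flag promotes the path to read-write
        mode = "rw" if ("O_WRONLY" in flags or "O_RDWR" in flags) else "ro"
        policy[mode].add(path)
    return {mode: sorted(paths) for mode, paths in policy.items()}

trace = [
    'openat(AT_FDCWD, "/etc/ssl/cert.pem", O_RDONLY|O_CLOEXEC) = 3',
    'openat(AT_FDCWD, "/tmp/out.log", O_WRONLY|O_CREAT, 0644) = 4',
]
print(baseline_policy(trace))
# → {'ro': ['/etc/ssl/cert.pem'], 'rw': ['/tmp/out.log']}
```

the hard part, per the parent, is that this baseline is only as good as the observation run — a workload that didn't exercise a path during recording gets denied later, and an over-broad recording bakes the looseness in.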
I’m assuming it’s similar to why people run plex, web servers, file sharing, etc
Also personally I’d rather not pay monthly fees for stuff if it can be avoided.
It supports running on a TrueNAS SCALE server, or via Incus (local or remote). I'm still working on tightening the security posture, but for many types of AI workflows it will be more than sufficient.
p.s. thanks for making this; timely, as I am playing whack-a-mole with sandboxing right now.
That's a 200 OK the whole way down. "Prevent bad actions" and "detect wrong-but-permitted actions" are completely different problems.