Posted by atombender 16 hours ago
https://github.com/hsaliak/sacre_bleu very rough around the edges, but it works. In the past, apps either behaved well or had malicious intent; with these LLM-backed apps, you are going to see apps that want to behave well but cannot guarantee it. We are going to see a lot of experimentation in this space until the UX settles!
been watching microsandbox but it's pretty early. landlock is the linux kernel primitive that could theoretically enable something like this, but nobody's built the nice policy layer on top yet.
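for the curious, the raw primitive is small enough to poke at directly. a minimal sketch (assumptions: x86_64 Linux, kernel 5.13+; syscall number and flag values are hardcoded from the uapi headers) that probes the Landlock ABI version and builds a ruleset handling write-type filesystem access, which is exactly the kind of plumbing a policy layer would have to wrap:

```python
import ctypes

libc = ctypes.CDLL(None, use_errno=True)

# syscall number on x86_64 (assumption: adjust for other architectures)
SYS_landlock_create_ruleset = 444
LANDLOCK_CREATE_RULESET_VERSION = 1 << 0

# filesystem access flags from <linux/landlock.h>
LANDLOCK_ACCESS_FS_WRITE_FILE = 1 << 1
LANDLOCK_ACCESS_FS_REMOVE_FILE = 1 << 5
LANDLOCK_ACCESS_FS_MAKE_REG = 1 << 8

class RulesetAttr(ctypes.Structure):
    # ABI v1 layout: a single u64 of handled filesystem accesses
    _fields_ = [("handled_access_fs", ctypes.c_uint64)]

def landlock_abi_version():
    """Return the kernel's Landlock ABI version, or None if unsupported."""
    v = libc.syscall(SYS_landlock_create_ruleset, None, 0,
                     LANDLOCK_CREATE_RULESET_VERSION)
    return v if v >= 0 else None

def create_deny_writes_ruleset():
    """Create a ruleset fd that *handles* (i.e. will deny unless explicitly
    allowed per-path) write-type accesses. Caller would then add rules with
    landlock_add_rule and apply it via landlock_restrict_self."""
    attr = RulesetAttr(LANDLOCK_ACCESS_FS_WRITE_FILE
                       | LANDLOCK_ACCESS_FS_REMOVE_FILE
                       | LANDLOCK_ACCESS_FS_MAKE_REG)
    fd = libc.syscall(SYS_landlock_create_ruleset,
                      ctypes.byref(attr), ctypes.sizeof(attr), 0)
    if fd < 0:
        raise OSError(ctypes.get_errno(), "landlock_create_ruleset failed")
    return fd
```

everything above "works" but is per-process, per-arch, and allowlist-only — the missing nice layer is the part that maps a human-readable policy onto these rulesets.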
curious if anyone has a good solution for the "agent running on a remote linux server" case. the threat model is a bit different anyway (no iMessage/keychain to protect) but filesystem and network containment still matter a lot
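one building block i've leaned on for the server case: systemd's per-unit sandboxing directives get you surprisingly far without extra tooling. a sketch of a hardened unit — the unit layout is real systemd syntax, but the service name, binary path, and allowed subnet are all assumptions to adapt:

```ini
# agent.service (sketch)
[Service]
ExecStart=/usr/local/bin/my-agent
# filesystem containment: read-only /usr and /etc, one writable workdir
ProtectSystem=strict
ProtectHome=read-only
ReadWritePaths=/home/agent/workspace
PrivateTmp=yes
NoNewPrivileges=yes
# network containment: deny everything except the internal subnet
IPAddressDeny=any
IPAddressAllow=10.0.0.0/8
```

not a complete answer (no syscall filtering shown, and IPAddress* needs cgroup v2), but it covers the filesystem/network containment part of the threat model.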
[1] https://github.com/anthropic-experimental/sandbox-runtime [2] https://github.com/carderne/sandbox-runtime
Taking more of an automated supervisor approach with limited manual approval for edge cases.
Grith.ai
However, the challenge is that sandbox profiles (rules) are always workload-specific. How do you define “least privilege” for a workload and then enforce it through the sandbox?
Which is why general-purpose sandboxes won't be useful or even feasible. The value is in observing a given workload and auto-generating a baseline policy for it.
Wrong or overly relaxed policies would make the sandbox ineffective against the real threats it is expected to protect against.
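to make the observe-then-generate idea concrete, a toy sketch — the trace format mimics `strace -f -e trace=openat` output, and the policy shape (ro/rw path allowlists) is made up for illustration:

```python
import re
from collections import defaultdict

# matches lines like: openat(AT_FDCWD, "/etc/ssl/cert.pem", O_RDONLY|O_CLOEXEC) = 3
OPEN_RE = re.compile(r'openat\(\w+, "([^"]+)", ([A-Z_|]+)')

def baseline_policy(trace_lines):
    """Fold observed file opens into a read-only / read-write allowlist."""
    policy = defaultdict(set)
    for line in trace_lines:
        m = OPEN_RE.search(line)
        if not m:
            continue
        path, flags = m.groups()
        # any write-capable open flag promotes the path to read-write
        mode = "rw" if ("O_WRONLY" in flags or "O_RDWR" in flags) else "ro"
        policy[mode].add(path)
    return {mode: sorted(paths) for mode, paths in policy.items()}

trace = [
    'openat(AT_FDCWD, "/etc/ssl/cert.pem", O_RDONLY|O_CLOEXEC) = 3',
    'openat(AT_FDCWD, "/tmp/out.log", O_WRONLY|O_CREAT, 0644) = 4',
]
print(baseline_policy(trace))
# → {'ro': ['/etc/ssl/cert.pem'], 'rw': ['/tmp/out.log']}
```

the hard part, per the parent, is that this baseline is only as good as the observation run — a workload that didn't exercise a path during recording gets denied later, and an over-broad recording bakes the looseness in.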
I’m assuming it’s similar to why people run plex, web servers, file sharing, etc
Also personally I’d rather not pay monthly fees for stuff if it can be avoided.
It supports running on a TrueNAS SCALE server, or via Incus (local or remote). I'm still working on tightening the security posture, but for many types of AI workflows it will be more than sufficient.
p.s. thanks for making this; timely, as I am playing whack-a-mole with sandboxing right now.
That's a 200 OK the whole way down. "Prevent bad actions" and "detect wrong-but-permitted actions" are completely different problems.