Running Claude Code dangerously (safely)

Posted by emilburzo 1/20/2026

Running Claude Code dangerously (safely)(blog.emilburzo.com)

351 points | 258 comments

runekaagaard 1/20/2026|

It's impossible to not get decision-fatique and just mash enter anyway after a couple of months with Claude not messing anything important up, so a sandboxed approach in YOLO mode feels much safer.

It takes the stress about needing to monitor all the agents all the time too, which is great and creates incentives to learn how to build longer tasks for CC with more feedback loops.

I'm on Ubuntu 22.04 and it was surprisingly pleasant to create a layered sandbox approach with bubblewrap and Landlock LSM: Landlock for filesystem restrictions (deny-first, only whitelisted paths accessible) and TCP port control (API, git, local dev servers), bubblewrap for mount namespace isolation (/tmp per-project, hiding secrets), and dnsmasq for DNS whitelisting (only essential domains resolve - everything else gets NXDOMAIN).

tptacek 1/20/2026||

I've been working for the past several weeks in an environment where it's easy and safe to give different claudes yolo-mode, but yesterday I needed to build an Emacs TRAMP plugin, and I had to do that on my local development NUC. I am extremely spoiled for yolo-mode, because even just yes-ok'ing all the elisp fragments claude came up with was exasperating, the whole experience was draining, and that was me not being especially careful (just making sure it didn't run random bash commands to, like, install a different Emacs or something).

runekaagaard 1/20/2026|||

Configuring Claude Code ... the new init.el ;)

runekaagaard 1/21/2026|||

... also interested. What would one build an Emacs TRAMP plugin for? :)

tptacek 1/21/2026||

Directly editing files on a remote VM that happens to have an API for directly accessing files.

Nition 1/20/2026||

I'm currently stuck on Windows, but I thought sandboxing was built in to Claude Code as a feature on Linux with the /sandbox command?

hu3 1/20/2026|||

For Windows a quick win is to install VMware Workstation Pro (which is free) and install Ubuntu 24.04 LTS as a VM.

Broadcom bought VMware then released Workstation Pro for free and I don't think they kept the download link but you can get from TechPowerUp:

https://www.techpowerup.com/download/vmware-workstation-pro/

You can then let LLMs on YOLO mode inside it.

dragonwriter 1/20/2026|||

What is the advantage of using VMware Workstation Pro for this as opposed to using WSL2?

TheTaytay 1/20/2026|||

I think it has default access to your c drive via a mount, for one. You could add layers/sandboxes, but it’s not isolated.

tracker1 1/21/2026|||

Funny, but I wrote some environment initialization and setup scripts that you just unzip to a new dev desktop, and run the first powershell script, and it will work through (have to reboot after a couple installs), but it goes through, then once WSL is up, it'll rely on the /mnt/c/ paths to run bash scripts to initialize the wsl environment too... was pretty handy.

dragonwriter 1/20/2026||||

Yeah, I do most Linux stuff on Windows in containers using podman leveraging WSL2, but that's a good point.

bt1a 1/21/2026|||

I wouldn't put it past Opus 4.5 in yolo mode to vm escape if it felt like it haha

UltraSane 1/21/2026|||

Stronger isolation and choice of OS

Tossrock 1/20/2026|||

Windows has the WSL for native Linux vms, these days (and also the past ~decade)

hu3 1/21/2026||

I can rm -rf Windows files from WSL2. And so can LLMs.

Meanwhile a VM isolates by default.

jassmith87 1/21/2026||

You can turn all the interop and mounting of the windows FS with ease. I run claude in yolo mode using this exact setup. Just role out a new WSL env for each claude I want yoloing and away it goes. I suppose we could try to theorize how this is still dangerous buts its getting into extremely silly territory.

hu3 1/21/2026||

That's great to know! And important to clarify because by default WSL has access to all disks.

runekaagaard 1/21/2026||||

/sandbox AFAIK uses https://github.com/anthropic-experimental/sandbox-runtime under the hood.

It's still experimental and if you dive into the issues I would call its protection light. Many users experiences erratic issues with perms not being enforced, etc.

For me the largest limitation was that it's read-mode is deny-only, meaning that with an empty deny-list it can read all files on your laptop.

Restricting to specific domains have worked fine for me, but it can't block on specific ports, so you can't say for instance you may access these dev-server ports, but not dev-server ports belonging to another sandbox.

It feels as though the primary usecase is running inside an already network and filesystem sandboxed container.

cedws 1/22/2026|||

It’s pretty weak sandboxing. It still grants full read only access to the file system so any secrets in your home directory can still be exfiltrated. I’m pretty sure it could also be deceptive and use a script to write where it shouldn’t be able to as well. That’s not really sandboxing in my opinion. It should be something like unveil, the process gets a working space at startup, and it cannot ever do anything outside of that directory.

lucasluitjes 1/20/2026||

> What you’re NOT protecting against:

> a malicious AI trying to escape the VM (VM escape vulnerabilities exist, but they’re rare and require deliberate exploitation)

No VM escape vulns necessary. A malicious AI could just add arbitrary code to your Vagrantfile and get host access the first time you run a vagrant command.

If you're only worried about mistakes, Claude could decide to fix/improve something by adding a commit hook. If that contains a mistake, the mistake gets executed on your host the first time you git commit/push.

(Yes, it's unpleasantly difficult to truly isolate dev environments without inconveniencing yourself.)

johndough 1/20/2026||

    > A malicious AI could just add arbitrary code to your Vagrantfile
    > [...]
    > Claude could decide to fix/improve something by adding a commit hook.

You can fix this by confining Claude to a subdirectory (with Docker volume mounts, for example):

    repository/
    ├── sandbox <--- Claude lives in here
    │   └── main.py <--- Claude can edit this
    └── .git <--- Claude can not touch this

embedding-shape 1/20/2026|||

Doesn't this assume you bi-directionally share directories between the host or the VM? Or how would the AI inside the VM be able to write to your .git repository or Vagrantfile? That's not the default setup with VMs (AFAIK, you need to explicitly use "shared directories" or similar), nor should you do that if you're trying to use VM for containment of something.

I basically do something like "take snapshot -> run tiny vm -> let agent do what it does -> take snapshot -> look at diff" for each change, restarting if it doesn't give me what I wanted, or I misdirected it somehow. But there is no automatic sync of files, that'd defeat the entire point of putting it into a VM in the first place, wouldn't it?

lucasluitjes 1/20/2026||

It's the default behaviour for Vagrant. You put a Vagrantfile in your repo, run `vagrant up` and it creates a VM with the repo folder shared r+w to `/vagrant` in the VM.

embedding-shape 1/20/2026||

That's because Vagrant isn't "VM", it's a developer tool you use locally that happens to use VMs, and it was created in a era where 1) containers didn't exist as they do today, 2) packaging and distribution for major languages wasn't infected with malware and 3) LLM agents now runs on our computers and they are kind of dumb sometimes and delete stuff.

With new realities, new workflows have to be adopted. Once malware started to appear on npm/pypi, I started running all my stuff in VMs unless it's something really common and presumed vetted. I do my banking on the same computer I do programming, so it's either that or get another computer.

lucasluitjes 1/21/2026|||

Agree with all of that, especially modern supply chain risk (imho the more important reason to opt for VM isolation rather than containerization). But the original article specifically talks Vagrant as an isolation solution, and describes it as not protecting against VM escape, but also that guest-to-host 0day is rare.

Hence pointing out that VM escape is a lot easier than that if your VM management tool syncs folders the way that Vagrant does by default.

pluralmonad 1/21/2026|||

Or just use a separate user?

embedding-shape 1/21/2026||

That's another option. Or containers. Or cloud hosts. Point is to stop doing bidirectional syncing of directories when you're trying to do isolation.

dist-epoch 1/20/2026|||

Another way is malicious code gets added to the repo, if you ever run the repo code outside the VM you get infected.

martinflack 1/21/2026|||

Maybe before 'vagrant up' you run 'sudo chattr +i Vagrantfile' to make it immutable. Seems to disallow removal of the attribute inside the VM, but allow it outside.

redactsureAI 1/20/2026|||

ec2 node?

eli 1/20/2026||

Or just a VM that doesn't share so much with your host. Just makes for a more annoying dev experience.

dist-epoch 1/20/2026||

Why do you need to share anything? Code goes through GitHub - VM has it's own repo clone, if you need data files, you mount them read-only in the VM, have a read-write mount for output data.

eli 1/21/2026||

I'd like to be able to see and edit the code in an IDE

redactsureAI 1/22/2026||

I work every day in a remote node with an IDE. VS code has a really simple extension you can run a full ide with file system control in a remote server. Git clone your files, open up VS code.

boppo1 1/21/2026||

Eh, I stuck it in a docker container with pass-thru to my repo directory and I feel pretty safe about letting it fly.

Then again I dont work on anything serious.

corv 1/20/2026||

I'm pursuing a different approach: instead of isolating where Claude runs, intercept what it wants to do.

Shannot[0] captures intent before execution. Scripts run in a PyPy sandbox that intercepts all system calls - commands and file writes get logged but don't happen. You review in a TUI, approve what's safe, then it actually executes.

The trade-off vs VMs: VMs let Claude do anything in isolation, Shannot lets Claude propose changes to your real system with human approval. Different use cases - VMs for agentic coding, whereas this is for "fix my server" tasks where you want the changes applied but reviewed first.

There's MCP integration for Claude, remote execution via SSH, checkpoint/rollback for undoing mistakes.

Feedback greatly appreciated!

[0] https://github.com/corv89/shannot

horsawlarway 1/20/2026||

I'm struggling to see how this resolves the problem the author has. I still think there's value in this approach, but it feels to be in the same thrust as the built in controls that already exist in claude code.

The problem with this approach (unless I'm misunderstanding - entirely possible!) is that it still blocks the agent on the first need for approval.

What I think most folks actually want (or at least what I want) is to allow the agent to explore a space, including exploring possible dead ends that require permissions/access, without stopping until the task is finished.

So if the agent is trying to "fix a server" it might suggest installing or removing a package. That suggestion blocks future progress.

Until a human comes in and says "yes - do it" or "no - try X instead" it will sit there doing nothing.

If instead it can just proceed, observe that the package doesn't resolve the issue, and continue exploring other solutions immediately, you save a whole lot of time.

corv 1/20/2026||

You're right that blocking on every operation would defeat the purpose! Shannot is able to auto-approve safe operations for this reason (e.g. read-only, immutable)

So the agent can freely explore, check logs, list files, inspect service status. It only blocks when it wants to change something (install a package, write a config, restart a service).

Also worth noting: Shannot operates on entire scripts, not individual commands. The agent writes a complete program, the sandbox captures everything it wants to do during a dry run, then you review the whole batch at once. Claude Code's built-in controls interrupt at each command whereas Shannot interrupts once per script with a full picture of intent.

That said, you're pointing at a real limitation: if the fix genuinely requires a write to test a hypothesis, you're back to blocking. The agent can't speculatively install a package, observe it didn't help, and roll back autonomously.

For that use case, the OP's VM approach is probably better. Shannot is more suited to cases where you want changes applied to the real system but reviewed first.

Definitely food for thought though. A combined approach might be the right answer. VM/scratch space where the agent can freely test hypotheses, then human-in-the-loop to apply those conclusions to production systems.

horsawlarway 1/20/2026||

yeah, I think the combo approach definitely has the most appeal:

- Spin up a vm with an image of the real target device.

- Let the agent act freely in the vm until the task is resolved, but capture and record all dangerous actions

- Review & replay those actions on the real machine

My issue is that for any real task, an agent without feedback mechanisms is essentially worthless. You have to have some sort of structured "this is what success looks like, here's how you check" target for it. A human in the loop can act as that feedback, which is in line with how claude code works by default (you define success by approving actions and giving feedback on status), but requiring a human in the loop also slows it down a bunch - you can end up ping-ponging between terminals trying to approve actions and review the current status.

charcircuit 1/20/2026|||

>commands and file writes get logged but don't happen. You review in a TUI, approve what's safe, then it actually executes.

This what claude already does out of the box.

bigwheels 1/20/2026|||

Very cool, this sounds similar in spirit to Leash (https://github.com/strongdm/leash), especially the mac-native system extension mode of Leash (although AFAIU Leash doesn't currently have full interactive-approval mode).

Nice work!

Retr0id 1/20/2026||

Very clever name!

corv 1/20/2026||

Thank you, good to know it landed :)

molson8472 1/20/2026||

Once approval fatigue and ongoing permission management kicks in, the temptation is strong to run `--dangerously-skip-permissions`. I think that's what we all want - run agents in a locked-down sandbox where the blast radius of mistakes and/or prompt injection attacks is minimal/acceptable.

I started running Claude Code in a devcontainer with limited file access (repo only) and limited outbound network access (allowlist only) for that reason.

This weekend, I generalized this to work with docker compose. Next up is support for additional agents (Codex, OpenCode, etc). After that, I'd like to force all network access through a proxy running on the host for greater control and logging (currently it uses iptables rules).

This workflow has been working well for me so far.

Still fresh, so may be rough around the edges, but check it out: https://github.com/mattolson/agent-sandbox

holoduke 1/21/2026||

I wanted to vibe code an app in an evening with some friends including setting up coolify for production and testing environments. Ended up with giving Claude root access to a cluster of servers. Vibe coded the entire application with 3 people. Did not touch a line of code. The only shell command given was claude. It spend couple hours to self configure the system. Result was remarkable good. Amazing how far we are already in the ai race.

asabla 1/20/2026|||

Very nice!

I've been experimenting with a similar setup. And I'll probably implement some of the things you've been doing.

For the proxy part I've been running https://www.mitmproxy.org/ It's not fully working for all workflows yet. But it's getting close

dingnuts 1/20/2026||

[dead]

srini-docker 1/20/2026||

Hey - Srini from Docker here. We’ve seen a lot of developers turn to Docker for this use case and heard some mentions of the Docker-in-Docker block. We put out Docker Sandboxes in experimental preview as a potential answer. Still early but we're working on the next iteration based on MicroVMs and avoids Docker-in-Docker.

adriaanmulder 1/21/2026|

How does docker sandbox solve the docker-in-docker issue? Can Claude running in docker sandbox spin up other docker containers, without having privileged access?

ejpir 1/21/2026||

micro-vms, not DinD

kernc 1/20/2026||

Since everyone tends to present their own solution, I bid you mine:

    sandbox-run npx @anthropic-ai/claude-code

This runs npx (...) transparently inside a Bubblewrap sandbox, exposing only the $PWD. Contrary to many other solutions, it is a few lines of pure POSIX shell.

https://github.com/sandbox-utils/sandbox-run

corv 1/20/2026|

I like the bubblewrap approach, it just happens to be Linux-only unfortunately. And once privileges are dropped for a process it doesn't appear to be possible to reinstate them.

kernc 1/20/2026||

> Linux-only

What other dev OSs are there?

> once privileges are dropped [...] it doesn't appear to be possible to reinstate them

I don't understand. If unprivileged code could easily re-elevate itself, privilege dropping would be meaningless ... If you need to communicate with the outside, you can do so via sockets (such as the bind-mounted X11 socket in one of the readme Examples).

corv 1/20/2026||

I happen to use a Mac, even when targeting Linux so I'd have to use a container or VM anyways. It's nice how lightweight bubblewrap would be however.

Consider one wanted to replicate the human-approval workflow that most agent harnesses offer. It's not obvious to me how that could be accomplished by dropping privileges without an escape hatch.

kernc 1/20/2026||

It being deprecated and all, didn't feel like wrapping it, but macOS supposedly has a similar `sandbox-exec` command ...

nowahe 1/20/2026||

IIRC from a comment in another thread, it's marked as deprecated to stop people from using it directly and to use the offical macOS tools directly. But it's still used internally by macOS.

And I think that what CC's /sandbox uses on a Mac

crabmusket 1/20/2026||

What is the consensus on Claude Code's built-in sandboxing?

https://code.claude.com/docs/en/sandboxing#sandboxing

> Claude Code includes an intentional escape hatch mechanism that allows commands to run outside the sandbox when necessary. When a command fails due to sandbox restrictions (such as network connectivity issues or incompatible tools), Claude is prompted to analyze the failure and may retry the command with the dangerouslyDisableSandbox parameter.

The ability for the agent itself to decide to disable the sandbox seems like a flaw. But do I understand correctly that this would cause a pause to ask for the user's approval?

shakna 1/20/2026||

Afraid that it regularly bypasses requests for confirmation...

[0] https://github.com/anthropics/claude-code/issues/14268

[1] https://github.com/anthropics/claude-code/issues/13583

[2] https://github.com/anthropics/claude-code/issues/10089

prodigycorp 1/20/2026||

It's trivially easy to get Claude Code to go out of its sandbox using prompting alone.

Side note: I wish Anthropic would open source claude code. filing an issue is like tossing toilet paper into the wind.

TZubiri 1/21/2026||

Don't depend on the thing to protect you from the thing

loloquwowndueo 1/20/2026||

Shellbox.dev and sprites.dev were discussed recently on hacker news, they give you a sandbox machine where it’s likely safe to run coding agents in dangerous mode. Filesystem checkpoint and restore make it easy to recover from even catastrophic mistakes.

thruflo 1/20/2026||

I made a little tool for Ralphing on Sprites: https://github.com/thruflo/wisp

I’ve found the sprites just work for claude. Pull how a repo (or repos) and run dangerously.

gcr 1/20/2026||

What about API calls? What about GitHub trusted CI deploys?

One frustrating thing about these solutions is that they’re great to prevent Claude from breaking a machine, but there’s no pervasive sandbox for third party services

tptacek 1/20/2026|||

This is a fun open problem. We've got stuff coming for it (don't want to hijack the thread, though).

jermaustin1 1/20/2026||||

Rollback? Its the same as all dev work. Use a dev endpoint for APIs, and thankfully git is a great tool to undo fuckups.

loloquwowndueo 1/20/2026|||

What about them?

nunez 1/20/2026||

Vagrant is great for Claude!

You can also use Lima, a lightweight VM control plane, as it natively works with qemu and Virtualization.Framework. (I think Vagrant does too; it's been a minute since I've tried.) This has traditionally been used for running container engines, but it's great for narrowly-scoped use cases like this.

Just need to be careful about how the directory Claude is working with is shared. I copy my Git repo to a container volume to use with Claude (DinD is an issue unless you do something like what Kind did) and rsync my changes back and verify before pushing. This way, I don't have to worry if Claude decides to rewind the reflog or something.

bonsai_spool 1/20/2026|

How are you configuring Lima? Do you have any scripts you use to set up the environments or is this done ad hoc?

metachris 1/20/2026||

I recently wrote a blog post about just that - how to run LLMs in Lima VMs: https://www.metachris.dev/2025/11/sandbox-your-ai-dev-tools-...

ejia 1/20/2026|

PM for Docker Sandboxes here.

Our next version of Docker Sandboxes will have MicroVM isolation and a Docker instance within for this exact reason. It'll let you use Claude Code + Containers without Docker-in-Docker.

More comments...