Agentic Orchestration Part 5: Docker Sandboxing and YOLO Mode
Over the past two weeks, we have broken down our Orchestrator framework piece by piece. We covered defining agent profiles, chaining them together with Handoff Documents, grouping minor dependency updates (Part 2), automatically migrating code for major upgrades (Part 3), and gracefully rolling back failures (Part 4).
But with great power comes a glaring security risk.
The Risk of Root Access
When you give an LLM the ability to run bash commands directly on your host machine, you are walking a tightrope. Even with strict Orchestrator configurations, what happens if the agent mistakenly runs rm -rf /? Or worse, what if a compromised npm package runs a malicious postinstall script while the agent is executing pnpm install?
You shouldn’t hand an AI the keys to your entire system.
Enter Docker Sandboxes & MicroVMs
To solve this, we draw inspiration from Microsoft’s recent work on GitHub Copilot Workspace. The goal is to run both the Orchestrator and the agent inside a highly isolated MicroVM provided by a Docker Sandbox.
Instead of running /orchestrate dependency-updates on your Mac, you execute it inside a temporary sandbox container.
Parallel Worktrees
When working across multiple agents and complex repositories, doing everything in a single directory is slow and error-prone. Our architecture uses an orchestrate-worktrees.js script to manage execution in the sandbox:
docker exec -it claude-sandbox node scripts/orchestrate-worktrees.js .claude/plan/workflow-e2e.json --execute
This command takes a JSON plan, creates an isolated git worktree inside the MicroVM for each agent worker, and runs those workers in parallel (using tmux panes under the hood). If you need to monitor the operation without interrupting it, you can export a live control-plane snapshot:
docker exec -it claude-sandbox node scripts/orchestration-status.js .claude/plan/workflow-e2e.json
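To make the worktree fan-out more concrete, here is a minimal sketch of how a plan could be turned into one git worktree per worker. It is not the actual orchestrate-worktrees.js: the plan shape (a workers array with a name field), the /workspace/worktrees base directory, the agent/ branch prefix, and the minor-updater worker name are all assumptions made for illustration. The sketch only builds the git argument lists and runs nothing.

```javascript
const path = require("node:path");

// Hypothetical plan shape: { workers: [{ name: "..." }] }.
// The real schema read by orchestrate-worktrees.js is not shown in this series.
function planWorktreeCommands(plan, baseDir = "/workspace/worktrees") {
  const seen = new Set();
  return plan.workers.map((worker) => {
    // Keep names filesystem- and branch-safe so they cannot escape baseDir.
    if (!/^[a-z0-9][a-z0-9-]*$/.test(worker.name)) {
      throw new Error(`Unsafe worker name: ${worker.name}`);
    }
    // Two workers sharing a directory would defeat the isolation.
    if (seen.has(worker.name)) {
      throw new Error(`Duplicate worker name: ${worker.name}`);
    }
    seen.add(worker.name);

    const branch = `agent/${worker.name}`; // assumed branch naming scheme
    const dir = path.posix.join(baseDir, worker.name);
    // `git worktree add -b <branch> <path>` creates a new branch checked out at <path>.
    return { worker: worker.name, branch, dir, argv: ["git", "worktree", "add", "-b", branch, dir] };
  });
}

// Usage: print the commands a runner would execute inside the sandbox.
const examplePlan = { workers: [{ name: "minor-updater" }, { name: "major-upgrader" }] };
for (const command of planWorktreeCommands(examplePlan)) {
  console.log(command.argv.join(" "));
}
```

Giving each worker its own branch and directory means parallel agents never write to the same checkout, which is the property the worktree approach relies on.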
Because the worktrees are fully isolated inside the sandbox, the agent can mangle files, install untrusted packages, and break compilers without ever touching your host machine’s state. Furthermore, a proxy strictly filters outgoing network traffic, so even if a malicious package is pulled in, it cannot reach arbitrary hosts to exfiltrate your source code.
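The filtering decision such a proxy makes can be sketched as a simple allowlist check. This is an illustration only, not the sandbox's actual proxy: the allowed hosts, the HTTPS-only rule, and the function name isEgressAllowed are assumptions.

```javascript
// Hypothetical allowlist; the real proxy rules are not shown in this article.
const ALLOWED_HOSTS = ["registry.npmjs.org", "github.com"];

function isEgressAllowed(rawUrl, allowedHosts = ALLOWED_HOSTS) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable destinations are rejected outright
  }
  // Assumed policy: only encrypted traffic may leave the sandbox.
  if (url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  // Allow an exact match or a true subdomain. Requiring the leading dot stops
  // lookalikes such as "registry.npmjs.org.evil.example" from passing.
  return allowedHosts.some((allowed) => host === allowed || host.endsWith(`.${allowed}`));
}

// Usage: a package registry request passes, an unknown upload endpoint does not.
console.log(isEgressAllowed("https://registry.npmjs.org/left-pad"));
console.log(isEgressAllowed("https://evil.example/upload"));
```

A default-deny list like this is what limits exfiltration: a malicious script can still run inside the sandbox, but it has nowhere outside the allowed hosts to send what it finds.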
Entering YOLO Mode
When an agent is running directly on your laptop, you naturally want to approve every single action. “Agent wants to run npm install. Allow? (y/n)” Prompts like this cause “approval fatigue” and bottleneck the entire modernization process.
But once your agent is secured inside a network-filtered, isolated MicroVM where it literally cannot destroy your host system or leak data… you can flip the switch to YOLO Mode.
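One way to encode that rule is a small guard that only skips approval prompts when both isolation guarantees hold. This is a sketch under assumptions: the environment fields (sandboxed, networkFiltered, yoloRequested) and the returned mode names are hypothetical, not part of any real agent CLI.

```javascript
// Hypothetical environment descriptor; the field names are illustrative only.
function approvalMode({ sandboxed, networkFiltered, yoloRequested }) {
  // Without an explicit opt-in, keep asking the human.
  if (!yoloRequested) return "prompt";
  // Asking for YOLO Mode is not enough: both isolation guarantees must hold.
  if (!sandboxed || !networkFiltered) return "prompt";
  return "auto-approve";
}

// Usage: the same request is honored in the sandbox and refused on a laptop.
console.log(approvalMode({ sandboxed: true, networkFiltered: true, yoloRequested: true }));
console.log(approvalMode({ sandboxed: false, networkFiltered: false, yoloRequested: true }));
```

Tying the switch to the environment rather than to user preference alone keeps YOLO Mode from being enabled by accident on a host machine.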
You can spin up 100 Docker Sandboxes in parallel across a fleet of servers, feed them your .claude/plan/ configurations, and let your major-upgrader agents autonomously upgrade 100 different legacy repositories simultaneously.
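Fanning out over that many repositories usually calls for a concurrency cap so a server is not asked to start every sandbox at once. The helper below is a minimal sketch of such a cap. The name runWithLimit, the repository names, and the stub task are assumptions; a real task would launch a sandbox for the repository instead of returning a string.

```javascript
// Run task(item) for every item, with at most `limit` tasks in flight at once.
// Failures are recorded per item so one broken repository does not stop the rest.
async function runWithLimit(items, limit, task) {
  const results = new Array(items.length);
  let next = 0;

  async function lane() {
    // Each lane keeps pulling the next unclaimed index until none remain.
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { status: "fulfilled", value: await task(items[index]) };
      } catch (error) {
        results[index] = { status: "rejected", reason: error };
      }
    }
  }

  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}

// Usage with a stub task standing in for "upgrade this repository in a sandbox".
const repos = Array.from({ length: 100 }, (_, i) => `legacy-repo-${i + 1}`);
runWithLimit(repos, 10, async (repo) => `${repo}: upgraded`).then((results) => {
  const succeeded = results.filter((r) => r.status === "fulfilled").length;
  console.log(`${succeeded} of ${results.length} repositories finished`);
});
```

Results come back in the same order as the input list, which makes it straightforward to pair each outcome with its repository when reporting or retrying.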
Conclusion
Agentic refactoring is no longer a futuristic concept. By combining carefully designed local Claude pipelines, structured Handoff Documents, three-strike fallback logic, and Docker Sandbox isolation, we achieve the best of both worlds: the speed of autonomous AI and the safety of enterprise infrastructure.
Thank you for following along with this 5-part series! Go forth and automate responsibly.
