# Remote Docker VM as an isolation upgrade for claude-bottle

Note on the cheapest practical path to stronger isolation than local
Docker: run claude-bottle unchanged on a remote Linux VM that has
dockerd. Complements `stronger-isolation-alternatives.md` (which
surveys runtime swaps like gVisor, Kata, Firecracker, Apple Container)
and `local-vs-remote-agent-execution.md` (which surveys the
local-vs-remote decision broadly).

## Summary

If the goal is "stronger isolation than Docker-on-my-laptop without
rewriting the runtime," the cleanest answer is to keep claude-bottle
exactly as it is and run it on a remote Linux VM where you can install
dockerd. The v1 design (pipelock as a separate container on a
`--internal` network, ephemeral agent containers, OAuth-token
forwarding) works as-is. The only thing that changes is that the
"host" is now a disposable VM you provisioned for the session, not your
laptop.

This is structurally equivalent to a Firecracker rewrite (Rung 3 in
`stronger-isolation-alternatives.md`), but the cloud provider operates
the runtime for you. It is also strictly cheaper than adopting a cloud
sandbox SDK (Vercel Sandbox, E2B, Cloudflare Sandbox SDK) because you
keep the existing Docker-shaped abstractions instead of swapping them
for a vendor API.

## The argument

### What changes in the threat model

The agent's blast radius shrinks from "developer laptop + everything
on the LAN" to "this disposable VM." Concretely, what's no longer
reachable on container escape:

- `~/.ssh/`, `~/.aws/credentials`, `~/.config/gh`, the macOS Keychain
- Browser cookies and session state
- Other dev machines on the home/office LAN
- NAS, printers, smart-home devices, anything else on the local network

What replaces it on the remote side: only what the operator chose to
ship to the VM for the session. Typically the OAuth token, optional SSH
keys for the bottle, the manifest, and the workspace if the agent needs
one. None of which are on the laptop after the VM is destroyed.

### Why the boundary is equivalent to v1, not weaker

A natural objection, raised in the design discussion that produced
this note, is that running pipelock and the agent on the same VM
collapses a network boundary into a kernel-namespace boundary, which
sounds weaker. It is not, *if you reuse Docker for the inner topology.*

Docker on the remote VM gives the agent and pipelock their own network
namespaces by default, with the agent attached to a `--internal`
network and pipelock straddling it and an egress bridge. That is the
same v1 topology. Bypassing pipelock from the agent requires the same
class of attack as bypassing it on a laptop: a kernel-level netns
escape inside the VM. The only difference is that the kernel under
attack belongs to a disposable VM, not the developer's machine.

In other words: the "weaker because colocated" framing only applies if
you naively run agent and pipelock as two processes in the same
namespace. With Docker on the VM, you don't.

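
The topology described above can be written down as a handful of plain `docker` invocations. The following is a minimal sketch, not claude-bottle's actual launcher code: the network names, container names, and image names are hypothetical placeholders, and the function only builds argument lists without running anything.

```python
"""Sketch of the v1-style topology as docker CLI argument lists.

All network, container, and image names are hypothetical placeholders.
"""

INTERNAL_NET = "bottle-internal"  # hypothetical name
EGRESS_NET = "bottle-egress"      # hypothetical name


def topology_commands(agent_image, pipelock_image):
    """Return the docker commands, in order, as argv lists (nothing is executed)."""
    return [
        # --internal: containers on this network get no route to the outside.
        ["docker", "network", "create", "--internal", INTERNAL_NET],
        # Ordinary bridge network with outbound connectivity.
        ["docker", "network", "create", EGRESS_NET],
        # pipelock starts on the internal network...
        ["docker", "run", "-d", "--name", "pipelock",
         "--network", INTERNAL_NET, pipelock_image],
        # ...and is then attached to the egress bridge, so it straddles both.
        ["docker", "network", "connect", EGRESS_NET, "pipelock"],
        # The agent joins the internal network only.
        ["docker", "run", "-d", "--name", "agent",
         "--network", INTERNAL_NET, agent_image],
    ]
```

Printing `" ".join(argv)` for each entry gives a script that could be run by hand on the VM. The point of the sketch is that the agent's only network peer is pipelock, whether the Docker host is a laptop or a disposable VM.
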
### Why this is cheaper than the alternatives

| Path | Effort | Where the VM-grade boundary comes from |
| --- | --- | --- |
| gVisor (`runsc`) per bottle | ~1–2 days | Userspace syscall barrier; not a full VM |
| Kata Containers per bottle | ~1–2 days, Linux-only | Kata's microVM-per-container |
| Firecracker rewrite | 2–4 weeks | Self-operated Firecracker |
| Apple Container (macOS) | ~1 week spike + integration | Apple's Virtualization.framework, per-container |
| Cloud sandbox SDK (Vercel, E2B, …) | Days–weeks of API rewrite + lock-in | Provider-operated Firecracker / equivalent |
| **Remote Docker VM (this note)** | **0 lines of code** | **Cloud-provider hypervisor under the VM** |

The "stronger isolation alternatives" doc concludes that gVisor is the
right today-step and Apple Container is probably the right v2.
This note adds a third option that sits orthogonal to both: don't
change the runtime, change the host. Use it when the failure mode you
care about is "agent compromises my laptop" specifically, rather than
"agent escapes Docker into a kernel I share with other workloads."

## What the provider has to give you

Not every cloud sandbox is suitable. The minimum for this approach to
work:

- Root or rootless-Docker capability inside the VM. Rules out
  Fargate-style locked-down container hosts and most "function" tier
  FaaS. Verify before committing: Vercel Sandbox specifically may or
  may not allow installing dockerd depending on tier; Fly Machines,
  EC2, GCE, Hetzner, Linode, and self-hosted hypervisors give you full
  control.
- Enough disk + RAM to host the claude-bottle image, the agent
  container, and the pipelock sidecar. Headroom of ~2–4 GB RAM and
  ~5 GB disk is comfortable; less works for short sessions.
- An interactive reach path. SSH is fine. The launcher uses
  `docker exec -it`, so any TTY-capable session works.

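
A quick preflight script on a freshly provisioned VM can catch an unsuitable provider before any time is spent on setup. The sketch below is an illustration only: the function names are hypothetical, the thresholds are simply the lower ends of the headroom figures above, and finding the `docker` CLI on `PATH` does not prove the daemon is running.

```python
import os
import shutil

MIN_RAM_BYTES = 2 * 1024**3   # lower end of the ~2-4 GB RAM headroom
MIN_DISK_BYTES = 5 * 1024**3  # ~5 GB of free disk


def preflight_problems(docker_path, ram_bytes, disk_free_bytes):
    """Pure check: return a list of problems; an empty list means the VM looks usable."""
    problems = []
    if docker_path is None:
        problems.append("docker CLI not found on PATH")
    if ram_bytes < MIN_RAM_BYTES:
        problems.append("less than 2 GiB of RAM")
    if disk_free_bytes < MIN_DISK_BYTES:
        problems.append("less than 5 GiB of free disk")
    return problems


def probe_host(path="/"):
    """Gather the three inputs on a Linux host."""
    ram = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    return shutil.which("docker"), ram, shutil.disk_usage(path).free
```

On the VM, `preflight_problems(*probe_host())` returns the list of issues to print. Keeping the decision logic separate from the probing makes the thresholds easy to adjust for short sessions, where less headroom is acceptable.
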
## What you give up

- **Typing latency.** Interactive Claude sessions over SSH have visible
  per-keystroke latency; usually fine on wired/fiber, less fine on
  Wi-Fi-to-cloud. Mosh helps if it's bothersome.
- **Token shipping.** `CLAUDE_BOTTLE_OAUTH_TOKEN` has to live on the
  remote box for the launcher to forward it into containers. Use the
  provider's secret-injection path (cloud-init user-data,
  `flyctl secrets`, Tailscale-served local file, etc.). Never echo the
  token onto the SSH command line; it ends up in the local shell
  history, is visible in process listings, and possibly lands in the
  SSH server's auth log.
- **Idle cost.** Unless the VM is torn down between sessions, you pay
  for it sitting idle. Ephemeral provisioning (one VM per session,
  destroyed on exit) is the cheaper and more secure pattern; see
  `local-vs-remote-agent-execution.md` on why ephemeral is also
  recommended for credential-concentration reasons.
- **Source code goes to the VM.** Same as any remote-execution
  topology. If the project is under NDA, the VM provider matters.
- **Provider trust.** Multi-tenancy side channels, supply-chain
  compromise of the provider, insider risk. Generally smaller than
  laptop-kernel-CVE risk, but the failure mode (provider-wide breach)
  is correlated across all your sandboxes.

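
For the token-shipping point, one way to keep the token off any command line is to have the provider's secrets path write it to a file on the VM and pass it to the launcher through the environment. The following is a sketch under that assumption; the helper names and the file-based delivery are illustrative, not part of claude-bottle.

```python
import os
import stat


def read_token_file(path):
    """Return the token stored at `path`, refusing files accessible to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} has mode {mode:o}; expected 600 or stricter")
    with open(path, encoding="utf-8") as fh:
        return fh.read().strip()


def launcher_env(token, base_env=None):
    """Copy the environment and add the variable the launcher forwards."""
    env = dict(os.environ if base_env is None else base_env)
    env["CLAUDE_BOTTLE_OAUTH_TOKEN"] = token
    return env
```

The resulting mapping can be handed to `subprocess.run([...], env=launcher_env(token))`, so the token never appears in `argv` or in shell history.
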
## Operational shape

The minimum-viable workflow, no claude-bottle code changes:

1. `terraform apply` / `flyctl machine run` /
   `gcloud compute instances create`: provision a fresh Linux VM.
2. Install dockerd via the provider's image or a one-liner
   (`curl -fsSL https://get.docker.com | sh`).
3. SSH in.
4. `git clone` claude-bottle on the VM, drop a manifest in place,
   inject `CLAUDE_BOTTLE_OAUTH_TOKEN` via the provider's secrets path.
5. `./cli.py start <agent>`: the existing launcher handles the rest.
6. On exit: destroy the VM. No host artifacts persist.

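
The steps above can be captured as an ordered plan, which makes it harder to forget the teardown. This is a sketch only: `Step` and `session_plan` are hypothetical names, the provision and destroy commands are supplied by the caller because they are provider-specific, and the manifest and token injection from step 4 is left out for the same reason.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    where: str    # "laptop" or "vm" (the latter means: run inside the SSH session)
    command: str


def session_plan(provision_cmd, destroy_cmd, repo_url, agent):
    """Return the session as an ordered list of steps; nothing is executed."""
    return [
        Step("laptop", provision_cmd),
        Step("vm", "curl -fsSL https://get.docker.com | sh"),
        Step("vm", f"git clone {shlex.quote(repo_url)} claude-bottle"),
        # Manifest placement and token injection go here (provider-specific).
        Step("vm", f"cd claude-bottle && ./cli.py start {shlex.quote(agent)}"),
        Step("laptop", destroy_cmd),
    ]
```

For example, `session_plan("terraform apply", "terraform destroy", repo_url, "my-agent")` yields five steps that start and end on the laptop, with the VM existing only in between.
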
For the "VPN pivot" failure mode, see
`local-vs-remote-agent-execution.md`. Short version: never VPN the
remote VM back to your LAN. If the agent needs LAN resources, expose
those through a narrow API instead.

## Optional ergonomics direction

A future addon (not architecturally necessary, just nicer):

- `cli.py start --remote=user@host <agent>` that:
  - rsyncs the manifest and (optionally) cwd to the remote
  - SSHes in with the OAuth token forwarded via `SendEnv`
  - runs `cli.py start <agent>` on the remote
  - forwards the TTY for the interactive session
  - on exit, optionally tears down the remote VM via a provider hook
    (`flyctl machine destroy`, `terraform destroy`, etc.)

This is roughly a day of work and would make the remote pattern feel
like a single launcher invocation. It is the only piece of remote
support that would benefit from being upstreamed; everything else is
operator workflow.

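
A rough sketch of what the `--remote` addon might assemble is shown below. Nothing here exists in claude-bottle today: the function name, the remote directory, and the manifest path are hypothetical, and the commands are only built, not run. Note that `SendEnv` only takes effect if the VM's sshd is configured with a matching `AcceptEnv` entry.

```python
import shlex

TOKEN_VAR = "CLAUDE_BOTTLE_OAUTH_TOKEN"


def remote_commands(target, agent, manifest, remote_dir="claude-bottle"):
    """Return (rsync_argv, ssh_argv) for a hypothetical --remote=user@host flow."""
    # Copy the manifest into the remote checkout (archive mode, compressed).
    rsync_argv = ["rsync", "-az", manifest, f"{target}:{remote_dir}/"]
    remote_cmd = f"cd {shlex.quote(remote_dir)} && ./cli.py start {shlex.quote(agent)}"
    ssh_argv = [
        "ssh",
        "-t",                          # allocate a TTY for the interactive session
        "-o", f"SendEnv={TOKEN_VAR}",  # forward the token via the environment
        target,
        remote_cmd,
    ]
    return rsync_argv, ssh_argv
```

Because the token travels as an environment variable and not as part of `remote_cmd`, it stays out of the command line on both ends. The optional teardown hook would run after the `ssh` process exits.
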
## Recommendation

For users who want stronger isolation than local Docker without
rewriting the runtime, this is probably the right answer. Cleaner than
gVisor (which only adds a syscall barrier on the same kernel), cleaner
than a Firecracker rewrite (which is weeks of work), cleaner than
adopting a cloud-sandbox SDK (which trades the v1 design for a vendor
API). The pre-existing `local-vs-remote-agent-execution.md` decision
heuristics still apply for *whether* this is worth the operational
overhead in any given setting.

If we wanted to land this as a real project direction:

1. Add a short "Running claude-bottle on a remote Docker VM" section
   to the README pointing at this doc.
2. Optionally: prototype the `--remote=user@host` launcher flag.
3. Update `stronger-isolation-alternatives.md` to mention the remote
   Docker VM as a fourth path, since the survey is otherwise
   incomplete.

## Caveats

- "Just install Docker" isn't free on every provider; some lock down
  what kernel modules and caps the VM has. Spike-test before committing.
- Multi-tenant cloud hypervisors (EC2, GCE, Vercel) have their own
  side-channel and supply-chain risk surfaces, separately bounded from
  the laptop-kernel risk this approach addresses.
- The remote-VM topology still does not protect source code or secrets
  from the cloud provider; it protects them from a kernel exploit
  reaching the developer's laptop. Different fear, different fix.
- Research conducted 2026-05-10.