docs: add research note on remote Docker VM as an isolation upgrade

Argues that running claude-bottle unchanged on a remote Linux VM with dockerd is the cheapest practical path to stronger isolation than local Docker: it preserves the v1 pipelock topology, requires zero code changes, and shrinks the agent's blast radius from the developer laptop to a disposable VM. Cross-references the existing stronger-isolation-alternatives and local-vs-remote-agent-execution notes so the research set composes cleanly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 01:07:17 -04:00
parent e3f5a5907a
commit 43453c66ea
1 changed files with 189 additions and 0 deletions
@@ -0,0 +1,189 @@
+# Remote Docker VM as an isolation upgrade for claude-bottle
+
+Note on the cheapest practical path to stronger isolation than local
+Docker: run claude-bottle unchanged on a remote Linux VM that has
+dockerd. Complements `stronger-isolation-alternatives.md` (which
+surveys runtime swaps like gVisor, Kata, Firecracker, Apple Container)
+and `local-vs-remote-agent-execution.md` (which surveys the
+local-vs-remote decision broadly).
+
+## Summary
+
+If the goal is "stronger isolation than Docker-on-my-laptop without
+rewriting the runtime," the cleanest answer is to keep claude-bottle
+exactly as it is and run it on a remote Linux VM where you can install
+dockerd. The v1 design — pipelock as a separate container on a
+`--internal` network, ephemeral agent containers, OAuth-token
+forwarding — works as-is. The only thing that changes is that the
+"host" is now a disposable VM you provisioned for the session, not your
+laptop.
+
+This gets its VM-grade boundary the same way a Firecracker rewrite
+(Rung 3 in `stronger-isolation-alternatives.md`) would, from a
+hypervisor under the agent's kernel, but the cloud provider operates
+that layer for you and the boundary wraps the whole Docker host rather
+than each bottle. It is also cheaper in engineering effort than
+adopting a cloud sandbox SDK (Vercel Sandbox, E2B, Cloudflare Sandbox
+SDK) because you keep the existing Docker-shaped abstractions instead
+of swapping them for a vendor API.
+
+## The argument
+
+### What changes in the threat model
+
+The agent's blast radius shrinks from "developer laptop + everything
+on the LAN" to "this disposable VM." Concretely, what's no longer
+reachable on container escape:
+
+- `~/.ssh/`, `~/.aws/credentials`, `~/.config/gh`, the macOS Keychain
+- Browser cookies and session state
+- Other dev machines on the home/office LAN
+- NAS, printers, smart-home devices, anything else on the local network
+
+What replaces it on the remote side: only what the operator chose to
+ship to the VM for the session. Typically that is the OAuth token,
+optional SSH keys for the bottle, the manifest, and the workspace if
+the agent needs one. The copies on the VM disappear when the VM is
+destroyed.
+
+### Why the boundary is equivalent to v1, not weaker
+
+A natural objection — raised in the design discussion that produced
+this note — is that running pipelock and the agent on the same VM
+collapses a network boundary into a kernel-namespace boundary, which
+sounds weaker. It is not, *if you reuse Docker for the inner topology.*
+
+Docker gives every container its own network namespace by default.
+Reusing the v1 setup on the remote VM then attaches the agent to a
+`--internal` network and has pipelock straddle that network and an
+egress bridge, which is the same topology as v1 on a laptop. Bypassing
+pipelock from the agent requires the same class of attack as bypassing
+it on a laptop: a kernel-level netns escape inside the VM. The only
+difference is that the kernel under attack belongs to a disposable VM,
+not the developer's machine.
+
+In other words: the "weaker because colocated" framing only applies if
+you naively run agent and pipelock as two processes in the same
+namespace. With Docker on the VM, you don't.
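+
+As a minimal sketch of that topology, the Docker CLI calls could be
+assembled like this. The network names, container name, and image tags
+are placeholders rather than claude-bottle's actual values, and the
+function only builds the argument lists instead of running them.
+
+```python
+import shlex
+from typing import List
+
+
+def topology_commands(
+    internal_net: str = "bottle-internal",     # hypothetical network name
+    egress_net: str = "bottle-egress",         # hypothetical network name
+    pipelock_image: str = "pipelock:local",    # hypothetical image tag
+    agent_image: str = "claude-bottle:local",  # hypothetical image tag
+) -> List[List[str]]:
+    """Return docker CLI invocations for an agent-behind-proxy layout."""
+    return [
+        # Network with no route off the host.
+        ["docker", "network", "create", "--internal", internal_net],
+        # Ordinary bridge network with outbound access.
+        ["docker", "network", "create", egress_net],
+        # The proxy starts on the internal network...
+        ["docker", "run", "-d", "--name", "pipelock",
+         "--network", internal_net, pipelock_image],
+        # ...and is then attached to the egress bridge as well.
+        ["docker", "network", "connect", egress_net, "pipelock"],
+        # The agent is attached to the internal network only.
+        ["docker", "run", "--rm", "-it",
+         "--network", internal_net, agent_image],
+    ]
+
+
+if __name__ == "__main__":
+    for argv in topology_commands():
+        print(shlex.join(argv))
+```
+
+The agent's `docker run` never names the egress network, so its only
+route out is through the container that sits on both networks.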
+
+### Why this is cheaper than the alternatives
+
+| Path | Effort | Where the VM-grade boundary comes from |
+| --- | --- | --- |
+| gVisor (`runsc`) per bottle | ~1–2 days | Userspace syscall barrier; not a full VM |
+| Kata Containers per bottle | ~1–2 days, Linux-only | Kata's microVM-per-container |
+| Firecracker rewrite | 2–4 weeks | Self-operated Firecracker |
+| Apple Container (macOS) | ~1 week spike + integration | Apple's Virtualization.framework, per-container |
+| Cloud sandbox SDK (Vercel, E2B, …) | Days–weeks of API rewrite + lock-in | Provider-operated Firecracker / equivalent |
+| **Remote Docker VM (this note)** | **0 lines of code** | **Cloud-provider hypervisor under the VM** |
+
+`stronger-isolation-alternatives.md` concludes that gVisor is the
+right step to take today and Apple Container is probably the right v2.
+This note adds a third option that sits orthogonal to both: don't
+change the runtime, change the host. Use it when the failure mode you
+care about is "agent compromises my laptop" specifically, rather than
+"agent escapes Docker into a kernel I share with other workloads."
+
+## What the provider has to give you
+
+Not every cloud sandbox is suitable. The minimum for this approach to
+work:
+
+- Root or rootless-Docker capability inside the VM. Rules out
+  Fargate-style locked-down container hosts and most "function" tier
+  FaaS. Verify before committing — Vercel Sandbox specifically may or
+  may not allow installing dockerd depending on tier; Fly Machines,
+  EC2, GCE, Hetzner, Linode, and self-hosted hypervisors give you full
+  control.
+- Enough disk + RAM to host the claude-bottle image, the agent
+  container, and the pipelock sidecar. Headroom of ~2–4 GB RAM and
+  ~5 GB disk is comfortable; less works for short sessions.
+- An interactive reach path. SSH is fine. The launcher uses
+  `docker exec -it`, so any TTY-capable session works.
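+
+A rough preflight check for the first two requirements might look like
+the sketch below. The function names are made up for this note, the
+thresholds are the lower ends of the headroom figures above, and finding
+the `docker` CLI on `PATH` does not prove that dockerd is actually
+running. The probe assumes a Linux VM.
+
+```python
+import os
+import shutil
+from typing import List
+
+GIB = 1024 ** 3
+
+
+def host_warnings(
+    has_docker: bool,
+    ram_bytes: int,
+    free_disk_bytes: int,
+    min_ram_gib: float = 2.0,   # assumption: low end of "~2-4 GB RAM"
+    min_disk_gib: float = 5.0,  # assumption: "~5 GB disk"
+) -> List[str]:
+    """Return human-readable warnings; an empty list means no concerns."""
+    warnings = []
+    if not has_docker:
+        warnings.append("docker CLI not found on PATH")
+    if ram_bytes < min_ram_gib * GIB:
+        warnings.append(f"less than {min_ram_gib:g} GiB RAM")
+    if free_disk_bytes < min_disk_gib * GIB:
+        warnings.append(f"less than {min_disk_gib:g} GiB free disk")
+    return warnings
+
+
+def probe_local_host(path: str = "/") -> List[str]:
+    """Gather the three inputs from the machine this runs on."""
+    ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
+    free = shutil.disk_usage(path).free
+    return host_warnings(shutil.which("docker") is not None, ram, free)
+
+
+if __name__ == "__main__":
+    for line in probe_local_host() or ["host looks adequate"]:
+        print(line)
+```
+
+Because less headroom still works for short sessions, the results are
+treated as warnings rather than hard failures.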
+
+## What you give up
+
+- **Typing latency.** Interactive Claude sessions over SSH have visible
+  per-keystroke latency; usually fine on wired/fiber, less fine on
+  Wi-Fi-to-cloud. Mosh helps if it's bothersome.
+- **Token shipping.** `CLAUDE_BOTTLE_OAUTH_TOKEN` has to live on the
+  remote box for the launcher to forward it into containers. Use the
+  provider's secret-injection path (cloud-init user-data,
+  `flyctl secrets`, Tailscale-served local file, etc.). Never echo the
+  token onto the SSH command line; it ends up in the local shell
+  history, is visible in process listings while the command runs, and
+  may be captured by command logging on the remote side.
+- **Idle cost.** Unless the VM is torn down between sessions, you pay
+  for it sitting idle. Ephemeral provisioning (one VM per session,
+  destroyed on exit) is the cheaper and more secure pattern; see
+  `local-vs-remote-agent-execution.md` on why ephemeral is also
+  recommended for credential-concentration reasons.
+- **Source code goes to the VM.** Same as any remote-execution
+  topology. If the project is under NDA, the VM provider matters.
+- **Provider trust.** Multi-tenancy side channels, supply-chain
+  compromise of the provider, insider risk. Generally smaller than
+  laptop-kernel-CVE risk, but the failure mode (provider-wide breach)
+  is correlated across all your sandboxes.
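+
+Where no provider secrets path is available, one way to at least keep
+the token off the SSH command line is to send it over stdin. This is a
+sketch only: the helper names and the remote file path are hypothetical,
+and the provider's own secret-injection mechanism is still the
+preferred route.
+
+```python
+import shlex
+import subprocess
+from typing import List
+
+
+def token_push_argv(host: str, remote_path: str) -> List[str]:
+    """Build an ssh command that writes stdin to a private remote file.
+
+    The token is deliberately not a parameter here, so it cannot end up
+    in argv, shell history, or process listings.
+    """
+    remote_cmd = f"umask 077 && cat > {shlex.quote(remote_path)}"
+    return ["ssh", host, remote_cmd]
+
+
+def push_token(host: str, remote_path: str, token: str) -> None:
+    """Send the token to the remote file through ssh's stdin."""
+    subprocess.run(
+        token_push_argv(host, remote_path),
+        input=token.encode(),
+        check=True,
+    )
+
+
+if __name__ == "__main__":
+    # Only prints the command; running push_token needs a reachable host.
+    print(token_push_argv("user@host", ".claude-bottle-token"))
+```
+
+`umask 077` makes the file readable by the remote user only. The token
+still has to be loaded into `CLAUDE_BOTTLE_OAUTH_TOKEN` on the VM before
+the launcher runs.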
+
+## Operational shape
+
+The minimum-viable workflow, no claude-bottle code changes:
+
+1. `terraform apply` / `flyctl machine run` / `gcloud compute
+   instances create` — provision a fresh Linux VM.
+2. Install dockerd via the provider's image or a one-liner
+   (`curl -fsSL https://get.docker.com | sh`).
+3. SSH in.
+4. `git clone` claude-bottle on the VM, drop a manifest in place,
+   inject `CLAUDE_BOTTLE_OAUTH_TOKEN` via the provider's secrets path.
+5. `./cli.py start <agent>` — the existing launcher handles the rest.
+6. On exit: destroy the VM. No host artifacts persist.
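+
+If the workflow is scripted, the provision and destroy steps can be
+paired so that teardown runs even when the session fails. The sketch
+below assumes the operator supplies two callables wrapping whichever
+provider tool is in use. `ephemeral_vm` and both callables are
+illustrative names, not part of claude-bottle.
+
+```python
+from contextlib import contextmanager
+from typing import Callable, Iterator
+
+
+@contextmanager
+def ephemeral_vm(
+    provision: Callable[[], str],     # returns a VM identifier
+    destroy: Callable[[str], None],   # tears that VM down
+) -> Iterator[str]:
+    """Yield a fresh VM id and always destroy the VM afterwards."""
+    vm_id = provision()
+    try:
+        yield vm_id
+    finally:
+        # Runs on normal exit and on exceptions alike.
+        destroy(vm_id)
+
+
+if __name__ == "__main__":
+    # Stand-ins for real provider calls, for demonstration only.
+    def fake_provision() -> str:
+        print("provisioning")
+        return "vm-demo"
+
+    def fake_destroy(vm_id: str) -> None:
+        print(f"destroying {vm_id}")
+
+    with ephemeral_vm(fake_provision, fake_destroy) as vm:
+        print(f"session running on {vm}")
+```
+
+A failure inside `provision` itself is not covered, so a half-created VM
+would still need manual cleanup.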
+
+For the "VPN pivot" failure mode, see
+`local-vs-remote-agent-execution.md`. Short version: never VPN the
+remote VM back to your LAN. If the agent needs LAN resources, expose
+those through a narrow API instead.
+
+## Optional ergonomics direction
+
+A future addon — not architecturally necessary, just nicer:
+
+- `cli.py start --remote=user@host <agent>` that:
+  - rsyncs the manifest and (optionally) cwd to the remote
+  - SSHes in with the OAuth token forwarded via `SendEnv` (the remote
+    sshd must list the variable in `AcceptEnv`)
+  - runs `cli.py start <agent>` on the remote
+  - forwards the TTY for the interactive session
+  - on exit, optionally tears down the remote VM via a provider hook
+    (`flyctl machine destroy`, `terraform destroy`, etc.)
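+
+A sketch of how such an option could parse its arguments and plan the
+remote calls follows. It is not the launcher's real argument handling:
+the manifest filename, the remote checkout directory, and the function
+names are assumptions, and the teardown hook is left out.
+
+```python
+import argparse
+import shlex
+from typing import List
+
+TOKEN_VAR = "CLAUDE_BOTTLE_OAUTH_TOKEN"
+
+
+def parse_args(argv: List[str]) -> argparse.Namespace:
+    """Parse `start [--remote=USER@HOST] <agent>`."""
+    parser = argparse.ArgumentParser(prog="cli.py")
+    sub = parser.add_subparsers(dest="command", required=True)
+    start = sub.add_parser("start")
+    start.add_argument("--remote", metavar="USER@HOST", default=None)
+    start.add_argument("agent")
+    return parser.parse_args(argv)
+
+
+def remote_plan(
+    remote: str,
+    agent: str,
+    manifest: str = "manifest.yaml",    # hypothetical filename
+    remote_dir: str = "claude-bottle",  # hypothetical checkout path
+) -> List[List[str]]:
+    """Return the rsync and ssh invocations, without running them."""
+    remote_cmd = (
+        f"cd {shlex.quote(remote_dir)} && "
+        f"./cli.py start {shlex.quote(agent)}"
+    )
+    return [
+        # Copy the manifest into the remote checkout.
+        ["rsync", "-a", manifest, f"{remote}:{remote_dir}/"],
+        # -t allocates a TTY; SendEnv forwards the token variable by name.
+        ["ssh", "-t", "-o", f"SendEnv={TOKEN_VAR}", remote, remote_cmd],
+    ]
+
+
+if __name__ == "__main__":
+    ns = parse_args(["start", "--remote=user@host", "my-agent"])
+    for argv in remote_plan(ns.remote, ns.agent):
+        print(shlex.join(argv))
+```
+
+`SendEnv` passes only the variable's name on the command line; the value
+is read from the local environment and is dropped unless the remote sshd
+accepts it.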
+
+This is roughly a day of work and would make the remote pattern feel
+like a single launcher invocation. It is the only piece of remote
+support that would benefit from being upstreamed; everything else is
+operator workflow.
+
+## Recommendation
+
+For users who want stronger isolation than local Docker without
+rewriting the runtime, this is probably the right answer. It is cleaner
+than gVisor (which only adds a syscall barrier on the same kernel),
+cleaner than a Firecracker rewrite (which is weeks of work), and
+cleaner than adopting a cloud-sandbox SDK (which trades the v1 design
+for a vendor API). The pre-existing
+`local-vs-remote-agent-execution.md` decision heuristics still apply
+for *whether* this is worth the operational overhead in any given
+setting.
+
+If we wanted to land this as a real project direction:
+
+1. Add a short "Running claude-bottle on a remote Docker VM" section
+   to the README pointing at this doc.
+2. Optionally: prototype the `--remote=user@host` launcher option.
+3. Update `stronger-isolation-alternatives.md` to mention the remote
+   Docker VM as an additional path, since the survey is otherwise
+   incomplete.
+
+## Caveats
+
+- "Just install Docker" isn't free on every provider; some lock down
+  what kernel modules and caps the VM has. Spike-test before committing.
+- Multi-tenant cloud platforms (EC2, GCE, Vercel) have their own
+  side-channel and supply-chain risk surfaces, which are separate from
+  the laptop-kernel risk this approach addresses.
+- The remote-VM topology still does not protect source code or secrets
+  from the cloud provider — it protects them from a kernel exploit
+  reaching the developer's laptop. Different fear, different fix.
+- Research conducted 2026-05-10.