From a2ac124d5c245e42fff49f8f25cacd0c50ddae4c Mon Sep 17 00:00:00 2001
From: didericis
Date: Tue, 26 May 2026 23:19:08 -0400
Subject: [PATCH] docs(prd-0023): smolmachines bottle backend

Specs a second concrete BottleBackend selectable via
CLAUDE_BOTTLE_BACKEND=smolmachines: per-agent libkrun microVM on macOS,
sidecars relocated to host-side loopback ports plumbed via Smolfile env,
PRD 0022's sandbox-escape suite as the acceptance gate (the env-var flip
is the only change required). Docker backend ships unchanged and remains
default.

Co-Authored-By: Claude Opus 4.7
---
 docs/prds/0023-smolmachines-backend.md | 427 +++++++++++++++++++++++++
 1 file changed, 427 insertions(+)
 create mode 100644 docs/prds/0023-smolmachines-backend.md

diff --git a/docs/prds/0023-smolmachines-backend.md b/docs/prds/0023-smolmachines-backend.md
new file mode 100644
index 0000000..dd602ff
--- /dev/null
+++ b/docs/prds/0023-smolmachines-backend.md
@@ -0,0 +1,427 @@
+# PRD 0023: smolmachines bottle backend
+
+- **Status:** Draft
+- **Author:** didericis
+- **Created:** 2026-05-26
+
+## Summary
+
+Ship a second concrete `BottleBackend` — `SmolmachinesBottleBackend`,
+selected via `CLAUDE_BOTTLE_BACKEND=smolmachines` — that runs a
+bottle inside a per-agent libkrun microVM on macOS (and KVM on Linux,
+opportunistically). The egress topology moves out of an internal
+Docker network and onto libkrun's TSI ("Transparent Socket
+Impersonation") allowlist plus a host-side
+pipelock/egress/git-gate/supervise stack listening on per-bottle
+loopback ports. The Docker backend ships unchanged; this is opt-in
+via the existing env-var selector.
+
+The acceptance gate is PRD 0022's `tests/integration/test_sandbox_escape.py`
+running green against `CLAUDE_BOTTLE_BACKEND=smolmachines`.
+
+## Problem
+
+`agent-vm-isolation.md` argues for hardware-isolated microVMs over
+container-based bottles on macOS; `smolmachines-as-vm-backend.md`
+concludes that smolmachines is the most plausible concrete VMM for
+this project.
+Today, the only backend in the registry is Docker
+(`claude_bottle/backend/__init__.py:_BACKENDS = {"docker": ...}`),
+and three things motivate a second one now:
+
+- **Isolation ceiling.** On macOS the Docker backend's agent
+  container shares Docker Desktop's host VM with every other bottle.
+  Container escape from claude-code lands the agent inside that
+  shared VM. A per-bottle libkrun microVM gets hardware page tables
+  via `Hypervisor.framework`; cross-bottle isolation becomes
+  enforced by the CPU's MMU instead of namespace bookkeeping.
+- **PRD 0022 is backend-agnostic by design** but currently only
+  exercises the Docker backend. The suite was written with
+  `CLAUDE_BOTTLE_BACKEND` selection in mind precisely so the
+  smolmachines path could be validated against the same five
+  attacks. Until a second backend exists, the abstraction is
+  unproven.
+- **CI carve-outs.** Most bottle-bringup integration tests skip
+  under `GITEA_ACTIONS=true` because act_runner shares the host
+  Docker socket but not the host filesystem. A smolmachines path
+  doesn't share that constraint shape (it has its own constraints,
+  but different ones), so adding the backend forces the abstraction
+  to be clean in places where Docker-specific assumptions have been
+  tolerated.
+
+The smolmachines research note's `## Recommendation` ("adopt
+smolmachines as the bottle VM backend on macOS; keep pipelock DIY")
+is the design hypothesis under test here.
+
+## Goals / Success Criteria
+
+The feature works when all of the following are observable on a
+macOS host with smolmachines installed:
+
+- `CLAUDE_BOTTLE_BACKEND=smolmachines python3 cli.py start `
+  brings up a microVM, runs claude-code inside it, and tears it
+  down on exit. Same y/N preflight UX as Docker — only the
+  resolved-runtime line differs.
+- The sandbox-escape suite in `tests/integration/test_sandbox_escape.py`
+  runs green against the smolmachines backend (all five attack
+  categories blocked).
+- Selecting the backend on a host without `smolvm` installed dies
+  at startup with an install pointer; no silent fall-through to
+  Docker.
+- Active bottles show up under
+  `python3 cli.py list-bottles` regardless of backend.
+- `python3 cli.py stop ` and orphan cleanup work for both
+  Docker bottles and smolmachines bottles via the same CLI surface.
+
+The feature is **done** when all of the following ship:
+
+- A new `claude_bottle/backend/smolmachines/` subpackage exists,
+  mirroring the layout of `claude_bottle/backend/docker/`
+  (`backend.py`, `bottle.py`, `bottle_plan.py`,
+  `bottle_cleanup_plan.py`, `prepare.py`, `launch.py`,
+  `cleanup.py`, `util.py`, and a `provision/` subpackage for the
+  five `provision_*` methods).
+- `SmolmachinesBottleBackend` registered under the
+  `"smolmachines"` key in `claude_bottle/backend/__init__.py:_BACKENDS`.
+- Per-bottle Smolfile generation: a runtime-rendered TOML written
+  to the bottle's stage dir, analogous to the compose file the
+  Docker backend writes today. The Smolfile pins `command`,
+  `env`, `--outbound-localhost-only`, and the per-bottle DNS
+  allowlist.
+- Host-side sidecar relocation: pipelock, egress, git-gate, and
+  supervise each run as host processes (one set per bottle),
+  bound to `127.0.0.1` on per-bottle dynamically-allocated ports.
+  The agent's environment carries the resolved URLs (e.g.
+  `HTTPS_PROXY=http://127.0.0.1:`).
+- The agent guest image is produced from the existing `Dockerfile`
+  (or a thin variant), exported as an OCI archive, and consumed by
+  `smolvm machine create`. The image build step is part of `prepare`,
+  analogous to `docker_mod.build_image`.
+- The PRD 0022 sandbox-escape suite, run with
+  `CLAUDE_BOTTLE_BACKEND=smolmachines`, passes locally on a
+  smolmachines-capable host. The suite is updated to skip cleanly
+  on hosts that can't reach smolmachines (same shape as the
+  existing `GITEA_ACTIONS == "true"` skip), not to fail.
+- README + `CLAUDE.md` updated to document the env-var selection,
+  the macOS-only scope for v1, and the `smolvm` install
+  prerequisite.
+
+## Non-goals
+
+- **No Linux KVM support shipped in this PRD.** smolmachines works
+  on Linux via KVM, but the abstraction win is biggest on macOS
+  where Docker's shared-VM topology hurts most. Linux can come
+  later behind the same selector.
+- **No removal of the Docker backend.** Both backends ship side by
+  side. Selection stays env-driven; the manifest does not gain a
+  `backend` field.
+- **No default-backend change.** `docker` remains the default
+  value of `CLAUDE_BOTTLE_BACKEND`; smolmachines is strictly
+  opt-in until it has been load-bearing on at least one operator's
+  workflow for a release cycle.
+- **No host bind mounts.** The smolmachines research note flagged
+  that `-v HOST:GUEST` mounts via virtiofs would defeat the
+  isolation goal. The manifest already has no concept of host
+  mounts; this PRD does not introduce one. If a future PRD wants
+  agent-side access to host files, it must come through a
+  controlled channel (vsock relay, OCI overlay, supervise sidecar
+  endpoint).
+- **No HTTP API mode.** `smolvm serve` is the long-term-clean
+  control plane, but v1 drives smolmachines via CLI subprocess
+  invocations — the lower-overhead first iteration the research
+  note already endorses.
+- **No custom kernel / initrd.** smolmachines uses libkrunfw
+  only; the agent image is an OCI ref, not a kernel + rootfs pair.
+- **No warm-pool or snapshot/restore.** Each bottle gets a fresh
+  microVM; cold-start cost is paid up front.
+- **No supervise/agent-credential rewrites for the new backend.**
+  Provisioning logic ports as-is; only the *transport* (host-side
+  port URLs instead of in-network DNS names) changes.
+
+## Scope
+
+### In scope
+
+- New `claude_bottle/backend/smolmachines/` subpackage with the
+  full set of `BottleBackend` overrides.
+- Smolfile generator (TOML), analogous to
+  `backend/docker/compose.py`'s `bottle_plan_to_compose`.
+- A host-side sidecar process manager that owns the lifecycle of
+  pipelock + egress + git-gate + supervise for one bottle, binding
+  them to per-bottle loopback ports and tearing them down with the
+  bottle. This is the smolmachines-specific replacement for
+  `docker compose up`/`down`.
+- Per-bottle CA install path: the egress sidecar's CA cert lands
+  inside the microVM via `smolvm machine exec` after start
+  (analogous to the existing `provision_ca` for Docker).
+- DNS allowlist plumbing: every host in `bottle.egress.allowlist`
+  goes into the Smolfile's DNS filter section (vsock port 6002),
+  so the VMM-layer DNS filter and the bottle's policy stay in
+  sync. The agent can't resolve hostnames off the allowlist (the
+  DNS filter denies them) and can't bypass DNS with raw IP
+  literals (TSI plus the CIDR allowlist blocks those).
+- Preflight `smolvm` check: if the user selects this backend and
+  `smolvm` isn't on `$PATH`, die with an install pointer (brew tap +
+  version pin TBD in implementation; see open question 2).
+- Manifest validation: refuse any bottle field this backend can't
+  honor (today there are none, since the Docker backend already
+  rejects host mounts; this is a forward-compat check).
+- Tests:
+  - Smoke unit-level test: Smolfile renderer produces the
+    expected TOML for a fixture bottle.
+  - Integration test: `prepare → launch → exec("echo hi") →
+    teardown` on a smolmachines-capable host (skips otherwise
+    via the same env/platform gate the Docker integration tests
+    use).
+  - PRD 0022 suite, re-run with the env var flipped, passes.
+
+### Out of scope
+
+- VM image caching across bottles (each prepare rebuilds from the
+  OCI archive; layer reuse is whatever smolmachines provides).
+- Cross-host bottle relocation (the OCI archive is local-only).
+- Operator-facing knobs for vCPU / memory / overlay size (use
+  sensible defaults; expose as manifest fields in a later PRD if
+  needed).
+- Integration with the `supervise` plane's permission-prompt UX
+  beyond port plumbing — supervise already speaks HTTP and binds
+  to whatever loopback the backend hands it.
+
+## Proposed Design
+
+### Backend layout
+
+```
+claude_bottle/backend/smolmachines/
+  __init__.py             re-exports SmolmachinesBottleBackend
+  backend.py              SmolmachinesBottleBackend façade
+  bottle.py               SmolmachinesBottle (exec_claude / exec / cp_in / close)
+  bottle_plan.py          SmolmachinesBottlePlan + .print()
+  bottle_cleanup_plan.py  SmolmachinesBottleCleanupPlan
+  prepare.py              resolve_plan(spec, stage_dir, ...) -> SmolmachinesBottlePlan
+  launch.py               @contextmanager launch(plan) -> SmolmachinesBottle
+  cleanup.py              prepare_cleanup / cleanup / list_active
+  smolfile.py             bottle_plan_to_smolfile(...) -> dict + render
+  sidecars.py             host-side pipelock/egress/git-gate/supervise lifecycle
+  smolvm.py               thin subprocess wrapper: machine create/start/exec/stop
+  util.py                 slugify, port allocation, OCI archive helpers
+  provision/              ca.py, prompt.py, skills.py, git.py, supervise.py
+```
+
+### Network + egress topology
+
+```
+  ┌── macOS host ─────────────────────────────────────────────┐
+  │                                                           │
+  │   ┌── per-bottle host sidecars (one set per microVM) ─┐   │
+  │   │  pipelock   127.0.0.1:                            │   │
+  │   │  egress     127.0.0.1:                            │   │
+  │   │  git-gate   127.0.0.1:                            │   │
+  │   │  supervise  127.0.0.1:                            │   │
+  │   └───────────────────────────────────────────────────┘   │
+  │                            ▲                              │
+  │                            │ TSI passthrough (localhost)  │
+  │                            │                              │
+  │   ┌── libkrun microVM (per bottle) ───────────────────┐   │
+  │   │  env: HTTPS_PROXY=http://127.0.0.1:               │   │
+  │   │       EGRESS_URL=http://127.0.0.1:                │   │
+  │   │       GIT_GATE_URL=http://127.0.0.1:              │   │
+  │   │       MCP_SUPERVISE_URL=http://127.0.0.1:         │   │
+  │   │  --outbound-localhost-only                        │   │
+  │   │  DNS filter (vsock:6002) → host allowlist         │   │
+  │   └───────────────────────────────────────────────────┘   │
+  │                                                           │
+  └───────────────────────────────────────────────────────────┘
+```
+
+Two changes vs. the Docker backend:
+
+1. **Sidecars are host processes, not sibling containers.** No
+   internal Docker network; isolation comes from TSI plus the
+   per-bottle loopback port set.
+2. **The "internal" allowlist becomes localhost-only.** Egress out
+   to the public internet still happens through pipelock + egress
+   — the same scanning + DLP + auth-injection chain — but the
+   agent's first hop is `127.0.0.1:` reached via TSI, not a
+   sidecar's IP on a Docker-managed bridge.
+
+### Lifecycle
+
+`SmolmachinesBottleBackend.prepare(spec, stage_dir)`:
+
+1. Cross-backend validation via `BottleBackend._validate` (skills,
+   git identity files).
+2. Allocate four loopback ports (bind, get free port, release;
+   record on plan).
+3. Resolve the agent OCI archive path (build if missing, cache by
+   Dockerfile + agent-name hash).
+4. Render the per-bottle Smolfile to `stage_dir/smolfile.toml`,
+   pinning command/env/`--outbound-localhost-only` + DNS allowlist.
+5. Resolve the in-VM CA paths so launch knows where to copy
+   pipelock's CA after start.
+6. Return a `SmolmachinesBottlePlan` carrying the slug, port map,
+   OCI archive path, Smolfile path, and host sidecar specs.
+
+`SmolmachinesBottleBackend.launch(plan)`:
+
+1. Start the four host sidecars in dependency order (pipelock →
+   egress → git-gate → supervise), bound to the plan's allocated
+   ports. Register teardown callbacks in reverse order.
+2. `smolvm machine create --smolfile ` and
+   `smolvm machine start `.
+3. Provisioning: CA install → prompt → skills → git → supervise
+   config, each via `smolvm machine exec` (analogous to
+   `docker exec`).
+4. Yield a `SmolmachinesBottle` whose `exec_claude` / `exec` /
+   `cp_in` all funnel through `smolvm machine exec` /
+   `smolvm machine cp`.
+5. Teardown: stop and remove the VM, then stop the sidecars (in
+   reverse start order).
+
+### Data model
+
+No manifest schema change.
+`bottles[]` continues to carry
+`egress.allowlist`, `env`, `git`, `skills` references, etc.; the
+smolmachines backend reads the same fields as the Docker backend.
+The DNS allowlist plumbed into the Smolfile is just
+`bottle.egress.allowlist` re-encoded as TOML.
+
+The `BottleSpec` dataclass and the `Bottle` ABC do not change.
+
+### Selection wiring
+
+In `claude_bottle/backend/__init__.py`:
+
+```python
+from .docker import DockerBottleBackend
+from .smolmachines import SmolmachinesBottleBackend
+
+_BACKENDS: dict[str, BottleBackend[Any, Any]] = {
+    "docker": DockerBottleBackend(),
+    "smolmachines": SmolmachinesBottleBackend(),
+}
+```
+
+The existing "unknown backend" `die()` path stays as-is.
+
+### External dependencies
+
+- `smolvm` CLI binary on `$PATH` (one new external dep, gated by
+  the preflight check). Pinned version policy is deferred to the
+  open questions; v1 reads `smolvm --version` and refuses to launch
+  outside a known-good range.
+- No new Python packages. Stdlib `subprocess` drives `smolvm`.
+  Stdlib `tomllib` only parses TOML, and the usual writer,
+  `tomli_w`, is a third-party package, so the Smolfile is rendered
+  by hand from a `dict[str, Any]`; the Smolfile shape is small.
+
+### Acceptance test plan
+
+- **Unit:** `tests/unit/test_smolfile.py` verifies the renderer
+  produces the expected TOML for a fixture bottle (allowlist →
+  DNS rules, env → `env =`, command line, outbound-localhost
+  flag).
+- **Integration smoke:** `tests/integration/test_smolmachines_smoke.py`
+  with `prepare → launch → exec → teardown`, guarded by a
+  `smolvm` presence check + macOS / KVM platform check.
+- **PRD 0022 re-run:** with `CLAUDE_BOTTLE_BACKEND=smolmachines`,
+  all five attack categories return sandbox-block markers and the
+  suite passes. The test code does not change beyond the env-var
+  flip — that's the contract the PRD 0022 abstraction was
+  designed for.
+
+## Sizing — into chunks
+
+1.
+   **Backend skeleton + selection + Smolfile renderer.** Subpackage
+   layout, `_resolve_plan` stub that emits a TOML file but doesn't
+   launch anything, `_BACKENDS` registration, preflight `smolvm`
+   check. Unit test on the renderer. No VM bringup yet.
+2. **VM lifecycle + OCI archive build.** `smolvm.py` subprocess
+   wrapper, prepare-time image build (existing Dockerfile → OCI
+   archive), launch path that creates + starts + stops a VM with
+   no sidecars wired. Smoke integration test: `exec("echo hi")`
+   inside a started VM.
+3. **Host-side sidecar relocation.** `sidecars.py`: per-bottle
+   pipelock + egress + git-gate + supervise as host processes on
+   loopback. Port allocator. Teardown ordering. No provisioning
+   yet beyond what the sidecars need.
+4. **Provisioning parity with Docker.** CA install via
+   `smolvm machine exec`, prompt/skills/.git copy-in, supervise
+   MCP config. End-to-end `start` works for a real agent manifest.
+5. **PRD 0022 sandbox-escape suite green.** Skip-guard update,
+   small adjustments to test helpers if any (the test uses
+   `bottle.exec(script)` and inspects `returncode` + body for
+   sandbox markers — should be transport-agnostic, but verify).
+   Document the macOS-only scope in README.
+
+## Open questions
+
+1. **Sidecar locality: host process vs in-VM init.** This PRD
+   defaults to host-process sidecars (proposed design above). The
+   alternative — bake pipelock + egress + git-gate + supervise
+   into the OCI image and start them via init in the same VM —
+   would simplify port plumbing (the agent reaches sidecars over
+   localhost inside the VM, not over TSI) but expands the trust
+   boundary of the agent VM. Default to host-process sidecars
+   unless someone identifies a TSI loopback edge case during
+   chunk 3.
+2. **`smolvm` install policy.** Pin via brew formula version, or
+   build-from-source step, or vendored binary checked into the
+   repo.
+   v1 most likely runs `smolvm --version` at preflight and
+   accepts a documented range; vendoring is heavier but reduces
+   "works on my Mac" drift.
+3. **CA install inside the OCI overlay.** Two paths: bake at
+   prepare time (one OCI archive per CA fingerprint, big cache
+   key) vs. inject at start time via `smolvm machine exec` after
+   the VM is up. PRD 0006 chose the runtime path for Docker
+   (docker-cp + `update-ca-certificates`); smolvm has the same
+   shape via `machine exec`. Default to runtime injection unless
+   it conflicts with `--outbound-localhost-only` start order.
+4. **DNS filter granularity.** smolmachines's vsock-6002 filter
+   accepts an allowlist of hostnames; we want to enforce both
+   "agent can only resolve names on the bottle's allowlist" *and*
+   "agent can only egress via TSI to 127.0.0.1." Confirm
+   empirically (smoke test in chunk 2) that the allowlist applies
+   to *guest-initiated* DNS only and doesn't accidentally NXDOMAIN
+   the host-side pipelock's upstream lookups.
+5. **`bottle.exec(script)` exit-code fidelity.** The PRD 0022 test
+   suite reads `returncode` + stdout + stderr from
+   `ExecResult`. Confirm `smolvm machine exec` propagates exit
+   codes and separated streams — the research note's
+   "external integration is the CLI" implies yes, but the
+   embedded SDK bug it flagged suggests we should verify before
+   coding around it.
+6. **CI gating.** Gitea's act_runner is Linux without nested KVM,
+   so smolmachines integration tests will skip there for the same
+   structural reason the Docker bringup tests do (no real
+   isolation primitive available on the runner). The skip
+   predicate becomes `not (smolvm_available() and
+   (platform.system() == "Darwin" or kvm_available()))`. CI
+   coverage for this backend will come from local runs on the
+   maintainer's macOS host until a Darwin runner is wired up;
+   ack that as a known gap.
+7.
+   **Active bottle discovery.** Docker uses container labels to
+   enumerate active bottles (`list_active` queries the daemon).
+   smolmachines's enumeration story is `smolvm machine list`; the
+   plan is to mirror the label scheme via Smolfile metadata
+   (`labels = { "claude-bottle" = "1" }`-style entries, if the
+   format supports it; otherwise via a deterministic name prefix
+   `claude-bottle-`).
+
+## References
+
+- `docs/research/smolmachines-as-vm-backend.md` — primary research
+  note recommending this adoption; PRD 0023's design hypothesis.
+- `docs/research/agent-vm-isolation.md` — the broader microVM /
+  gvproxy / pipelock landscape this PRD lands inside of.
+- `docs/research/agent-sandbox-landscape.md` — identifies
+  `"runtime": "microvm"`-style opt-in as the borrowable idea;
+  smolmachines is the concrete implementation.
+- PRD 0003 (`docs/prds/0003-bottle-backend-abstraction.md`) — the
+  backend abstraction this PRD is the first non-Docker consumer
+  of.
+- PRD 0017 (`docs/prds/0017-egress-proxy-via-mitmproxy.md`) — the
+  egress sidecar the host-side relocation reuses verbatim, only
+  with a different transport.
+- PRD 0022
+  (`docs/prds/0022-sandbox-escape-integration-test.md`) — the
+  acceptance gate for this PRD; the suite already runs through
+  `get_bottle_backend()` so the env-var flip is the only change
+  needed to exercise the smolmachines path.
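
As a minimal sketch of step 2 of `prepare` in the Lifecycle section (allocate four loopback ports: bind, get free port, release; record on plan), the allocation might look roughly like the following. The names `SIDECARS` and `allocate_loopback_ports` are hypothetical and do not come from the codebase; the sketch assumes only the four sidecar names the PRD lists and the stdlib `socket` module.

```python
import socket
from contextlib import ExitStack

# Hypothetical constant: the four per-bottle sidecars named in the PRD.
SIDECARS = ("pipelock", "egress", "git-gate", "supervise")


def allocate_loopback_ports(names=SIDECARS):
    """Return {sidecar name: free 127.0.0.1 port}, one port per sidecar.

    Assumption: this mirrors the "bind, get free port, release" step in
    prose only; it is not the project's actual allocator.
    """
    ports = {}
    with ExitStack() as stack:
        for name in names:
            sock = stack.enter_context(
                socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            )
            # Port 0 asks the kernel for any free port on loopback.
            sock.bind(("127.0.0.1", 0))
            ports[name] = sock.getsockname()[1]
        # All sockets stay open until here, so the ports are distinct.
    return ports
```

Because the ports are released before the sidecars bind them, another process could claim one in the gap; a real allocator would probably need to retry when a sidecar fails to bind.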