# PRD 0023: smolmachines bottle backend

- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-26

## Summary

Ship a second concrete `BottleBackend`, `SmolmachinesBottleBackend`, selected via `CLAUDE_BOTTLE_BACKEND=smolmachines`, that runs a bottle inside a per-agent libkrun microVM on macOS (and KVM on Linux, opportunistically). The egress topology moves out of an internal Docker network and onto libkrun's TSI ("Transparent Socket Impersonation") allowlist plus a host-side pipelock/egress/git-gate/supervise stack listening on per-bottle loopback ports. The Docker backend ships unchanged; this is opt-in via the existing env-var selector. The acceptance gate is PRD 0022's `tests/integration/test_sandbox_escape.py` running green against `CLAUDE_BOTTLE_BACKEND=smolmachines`.

## Problem

`agent-vm-isolation.md` argues for hardware-isolated microVMs over container-based bottles on macOS; `smolmachines-as-vm-backend.md` concludes that smolmachines is the most plausible concrete VMM for this project. Today, the only backend in the registry is Docker (`claude_bottle/backend/__init__.py:_BACKENDS = {"docker": ...}`), and three things motivate a second one now:

- **Isolation ceiling.** On macOS the Docker backend's agent container shares Docker Desktop's host VM with every other bottle. Container escape from claude-code lands the agent inside that shared VM. A per-bottle libkrun microVM gets hardware page tables via `Hypervisor.framework`; cross-bottle isolation becomes enforced by the CPU's MMU instead of namespace bookkeeping.
- **PRD 0022 is backend-agnostic by design** but currently only exercises the Docker backend. The suite was written with `CLAUDE_BOTTLE_BACKEND` selection in mind precisely so the smolmachines path could be validated against the same five attacks. Until a second backend exists, the abstraction is unproven.
- **CI carve-outs.** Most bottle-bringup integration tests skip under `GITEA_ACTIONS=true` because act_runner shares the host Docker socket but not the host filesystem. A smolmachines path doesn't share that constraint (it has its own, different constraints), so adding the backend forces the abstraction to be clean in places where Docker-specific assumptions have been tolerated.

The smolmachines research note's `## Recommendation` ("adopt smolmachines as the bottle VM backend on macOS; keep pipelock DIY") is the design hypothesis under test here.

## Goals / Success Criteria

The feature works when all of the following are observable on a macOS host with smolmachines installed:

- `CLAUDE_BOTTLE_BACKEND=smolmachines python3 cli.py start ` brings up a microVM, runs claude-code inside it, and tears it down on exit. Same y/N preflight UX as Docker; only the resolved-runtime line differs.
- The sandbox-escape suite in `tests/integration/test_sandbox_escape.py` runs green against the smolmachines backend (all five attack categories blocked).
- Selecting the backend on a host without `smolvm` installed dies at startup with an install pointer; no silent fall-through to Docker.
- Active bottles show up under `python3 cli.py list-bottles` regardless of backend.
- `python3 cli.py stop ` and orphan cleanup work for both Docker bottles and smolmachines bottles via the same CLI surface.

The feature is **done** when all of the following ship:

- A new `claude_bottle/backend/smolmachines/` subpackage exists, mirroring the layout of `claude_bottle/backend/docker/` (`backend.py`, `bottle.py`, `bottle_plan.py`, `bottle_cleanup_plan.py`, `prepare.py`, `launch.py`, `cleanup.py`, `util.py`, and a `provision/` subpackage for the five `provision_*` methods).
- `SmolmachinesBottleBackend` registered under the `"smolmachines"` key in `claude_bottle/backend/__init__.py:_BACKENDS`.
- Per-bottle Smolfile generation: a runtime-rendered TOML written to the bottle's stage dir, analogous to the compose file the Docker backend writes today. The Smolfile pins `command`, `env`, `--outbound-localhost-only`, and the per-bottle DNS allowlist.
- Host-side sidecar relocation: pipelock, egress, git-gate, and supervise each run as host processes (one set per bottle), bound to `127.0.0.1` on per-bottle dynamically-allocated ports. The agent's environment carries the resolved URLs (e.g. `HTTPS_PROXY=http://127.0.0.1:`).
- The agent guest image is produced from the existing `Dockerfile` (or a thin variant), exported as an OCI archive, and consumed by `smolvm machine create`. The image build step is part of `prepare`, analogous to `docker_mod.build_image`.
- The PRD 0022 sandbox-escape suite, run with `CLAUDE_BOTTLE_BACKEND=smolmachines`, passes locally on a smolmachines-capable host. The suite is updated to skip cleanly on hosts that can't reach smolmachines (same shape as the existing `GITEA_ACTIONS == "true"` skip), not to fail.
- README + `CLAUDE.md` updated to document the env-var selection, the macOS-only scope for v1, and the `smolvm` install prerequisite.

## Non-goals

- **No Linux KVM support shipped in this PRD.** smolmachines works on Linux via KVM, but the abstraction win is biggest on macOS where Docker's shared-VM topology hurts most. Linux can come later behind the same selector.
- **No removal of the Docker backend.** Both backends ship side by side. Selection stays env-driven; the manifest does not gain a `backend` field.
- **No default-backend change.** `docker` remains the default value of `CLAUDE_BOTTLE_BACKEND`; smolmachines is strictly opt-in until it has been load-bearing on at least one operator's workflow for a release cycle.
- **No host bind mounts.** The smolmachines research note flagged that `-v HOST:GUEST` mounts via virtiofs would defeat the isolation goal.
  The manifest already has no concept of host mounts; this PRD does not introduce one. If a future PRD wants agent-side access to host files, it must come through a controlled channel (vsock relay, OCI overlay, supervise sidecar endpoint).
- **No HTTP API mode.** `smolvm serve` is the long-term-clean control plane, but v1 drives smolmachines via CLI subprocess invocations, the lower-overhead first iteration the research note already endorses.
- **No custom kernel / initrd.** smolmachines uses libkrunfw only; the agent image is an OCI ref, not a kernel + rootfs pair.
- **No warm-pool or snapshot/restore.** Each bottle gets a fresh microVM; cold-start cost is paid up front.
- **No supervise/agent-credential rewrites for the new backend.** Provisioning logic ports as-is; only the *transport* (host-side port URLs instead of in-network DNS names) changes.

## Scope

### In scope

- New `claude_bottle/backend/smolmachines/` subpackage with the full set of `BottleBackend` overrides.
- Smolfile generator (TOML), analogous to `backend/docker/compose.py`'s `bottle_plan_to_compose`.
- A host-side sidecar process manager that owns the lifecycle of pipelock + egress + git-gate + supervise for one bottle, binding them to per-bottle loopback ports and tearing them down with the bottle. This is the smolmachines-specific replacement for `docker compose up`/`down`.
- Per-bottle CA install path: the egress sidecar's CA cert lands inside the microVM via `smolvm machine exec` after start (analogous to the existing `provision_ca` for Docker).
- DNS allowlist plumbing: every host in `bottle.egress.allowlist` goes into the Smolfile's DNS filter section (vsock port 6002), so the VMM-layer DNS filter and the bottle's policy stay in sync. The agent can neither `dig` its way out nor connect to raw IP literals: the DNS filter denies resolution of hostnames off the allowlist, and the TSI + CIDR allowlist blocks direct IP connections.
- Preflight `smolvm` check: if the user selects this backend and `smolvm` isn't on `$PATH`, die with an install pointer (brew tap + version pin TBD in implementation; see open question 2).
- Manifest validation: refuse any bottle field this backend can't honor (today there are none, since the Docker backend already rejects host mounts; this is a forward-compat check).
- Tests:
  - Smoke unit-level test: Smolfile renderer produces the expected TOML for a fixture bottle.
  - Integration test: `prepare → launch → exec("echo hi") → teardown` on a smolmachines-capable host (skips otherwise via the same env/platform gate the Docker integration tests use).
  - PRD 0022 suite, re-run with the env var flipped, passes.

### Out of scope

- VM image caching across bottles (each prepare rebuilds from the OCI archive; layer reuse is whatever smolmachines provides).
- Cross-host bottle relocation (the OCI archive is local-only).
- Operator-facing knobs for vCPU / memory / overlay size (use sensible defaults; expose as manifest fields in a later PRD if needed).
- Integration with the `supervise` plane's permission-prompt UX beyond port plumbing; supervise already speaks HTTP and binds to whatever loopback the backend hands it.

## Proposed Design

### Backend layout

```
claude_bottle/backend/smolmachines/
  __init__.py             re-exports SmolmachinesBottleBackend
  backend.py              SmolmachinesBottleBackend façade
  bottle.py               SmolmachinesBottle (exec_claude / exec / cp_in / close)
  bottle_plan.py          SmolmachinesBottlePlan + .print()
  bottle_cleanup_plan.py  SmolmachinesBottleCleanupPlan
  prepare.py              resolve_plan(spec, stage_dir, ...) -> SmolmachinesBottlePlan
  launch.py               @contextmanager launch(plan) -> SmolmachinesBottle
  cleanup.py              prepare_cleanup / cleanup / list_active
  smolfile.py             bottle_plan_to_smolfile(...) -> dict + render
  sidecars.py             host-side pipelock/egress/git-gate/supervise lifecycle
  smolvm.py               thin subprocess wrapper: machine create/start/exec/stop
  util.py                 slugify, port allocation, OCI archive helpers
  provision/
    ca.py, prompt.py, skills.py, git.py, supervise.py
```

### Network + egress topology

```
┌── macOS host ─────────────────────────────────────────────┐
│                                                           │
│  ┌── per-bottle host sidecars (one set per microVM) ─┐    │
│  │  pipelock   127.0.0.1:                            │    │
│  │  egress     127.0.0.1:                            │    │
│  │  git-gate   127.0.0.1:                            │    │
│  │  supervise  127.0.0.1:                            │    │
│  └───────────────────────────────────────────────────┘    │
│                          ▲                                │
│                          │  TSI passthrough (localhost)   │
│                          │                                │
│  ┌── libkrun microVM (per bottle) ───────────────────┐    │
│  │  env: HTTPS_PROXY=http://127.0.0.1:               │    │
│  │       EGRESS_URL=http://127.0.0.1:                │    │
│  │       GIT_GATE_URL=http://127.0.0.1:              │    │
│  │       MCP_SUPERVISE_URL=http://127.0.0.1:         │    │
│  │  --outbound-localhost-only                        │    │
│  │  DNS filter (vsock:6002) → host allowlist         │    │
│  └───────────────────────────────────────────────────┘    │
│                                                           │
└───────────────────────────────────────────────────────────┘
```

Two changes vs. the Docker backend:

1. **Sidecars are host processes, not sibling containers.** No internal Docker network; isolation comes from TSI plus the per-bottle loopback port set.
2. **The "internal" allowlist becomes localhost-only.** Egress out to the public internet still happens through pipelock + egress (the same scanning + DLP + auth-injection chain), but the agent's first hop is `127.0.0.1:` reached via TSI, not a sidecar's IP on a Docker-managed bridge.

### Lifecycle

`SmolmachinesBottleBackend.prepare(spec, stage_dir)`:

1. Cross-backend validation via `BottleBackend._validate` (skills, git identity files).
2. Allocate four loopback ports (bind, get free port, release; record on plan).
3. Resolve the agent OCI archive path (build if missing, cache by Dockerfile + agent-name hash).
4.
   Render the per-bottle Smolfile to `stage_dir/smolfile.toml`, pinning command/env/`--outbound-localhost-only` + DNS allowlist.
5. Resolve the in-VM CA paths so launch knows where to copy pipelock's CA after start.
6. Return a `SmolmachinesBottlePlan` carrying the slug, port map, OCI archive path, Smolfile path, and host sidecar specs.

`SmolmachinesBottleBackend.launch(plan)`:

1. Start the four host sidecars in dependency order (pipelock → egress → git-gate → supervise), bound to the plan's allocated ports. Register teardown callbacks in reverse order.
2. `smolvm machine create --smolfile ` and `smolvm machine start `.
3. Provisioning: CA install → prompt → skills → git → supervise config, each via `smolvm machine exec` (analogous to `docker exec`).
4. Yield a `SmolmachinesBottle` whose `exec_claude` / `exec` / `cp_in` all funnel through `smolvm machine exec` / `smolvm machine cp`.
5. Teardown: stop and remove the VM, then stop the sidecars (in reverse start order).

### Data model

No manifest schema change. `bottles[]` continues to carry `egress.allowlist`, `env`, `git`, `skills` references, etc.; the smolmachines backend reads the same fields as the docker backend. The DNS allowlist plumbed into the Smolfile is just `bottle.egress.allowlist` re-encoded as TOML.

The `BottleSpec` dataclass and the `Bottle` ABC do not change.

### Selection wiring

In `claude_bottle/backend/__init__.py`:

```python
from .docker import DockerBottleBackend
from .smolmachines import SmolmachinesBottleBackend

_BACKENDS: dict[str, BottleBackend[Any, Any]] = {
    "docker": DockerBottleBackend(),
    "smolmachines": SmolmachinesBottleBackend(),
}
```

The existing "unknown backend" `die()` path stays as-is.

### External dependencies

- `smolvm` CLI binary on `$PATH` (one new external dep, gated by the preflight check). Pinned version policy is deferred to the open questions; v1 reads `smolvm --version` and refuses to launch outside a known-good range.
- No new Python packages.
  Subprocess + TOML handling for Smolfile authoring. (Stdlib `tomllib` only parses TOML; `tomli_w` is the only candidate writer module, and it is a third-party package rather than stdlib. If it isn't available in the target Python, render TOML by hand from a `dict[str, Any]`; the Smolfile shape is small.)

### Acceptance test plan

- **Unit:** `tests/unit/test_smolfile.py` verifies the renderer produces the expected TOML for a fixture bottle (allowlist → DNS rules, env → `env =`, command line, outbound-localhost flag).
- **Integration smoke:** `tests/integration/test_smolmachines_smoke.py` with `prepare → launch → exec → teardown`, guarded by a `smolvm` presence check + macOS / KVM platform check.
- **PRD 0022 re-run:** with `CLAUDE_BOTTLE_BACKEND=smolmachines`, all five attack categories return sandbox-block markers and the suite passes. The test code does not change beyond the env-var flip; that's the contract the PRD 0022 abstraction was designed for.

## Sizing: into chunks

1. **Backend skeleton + selection + Smolfile renderer.** Subpackage layout, `_resolve_plan` stub that emits a TOML file but doesn't launch anything, `_BACKENDS` registration, preflight `smolvm` check. Unit test on the renderer. No VM bringup yet.
2. **VM lifecycle + OCI archive build.** `smolvm.py` subprocess wrapper, prepare-time image build (existing Dockerfile → OCI archive), launch path that creates + starts + stops a VM with no sidecars wired. Smoke integration test: `exec("echo hi")` inside a started VM.
3. **Host-side sidecar relocation.** `sidecars.py`: per-bottle pipelock + egress + git-gate + supervise as host processes on loopback. Port allocator. Teardown ordering. No provisioning yet beyond what the sidecars need.
4. **Provisioning parity with Docker.** CA install via `smolvm machine exec`, prompt/skills/.git copy-in, supervise MCP config. End-to-end `start` works for a real agent manifest.
5.
   **PRD 0022 sandbox-escape suite green.** Skip-guard update, small adjustments to test helpers if any (the test uses `bottle.exec(script)` and inspects `returncode` + body for sandbox markers; this should be transport-agnostic, but verify). Document the macOS-only scope in README.

## Open questions

1. **Sidecar locality: host process vs in-VM init.** This PRD defaults to host-process sidecars (proposed design above). The alternative (bake pipelock + egress + git-gate + supervise into the OCI image and start them via init in the same VM) would simplify port plumbing (the agent reaches sidecars over localhost inside the VM, not over TSI) but expands the trust boundary of the agent VM. Default to host-process sidecars unless someone identifies a TSI loopback edge case during chunk 3.
2. **`smolvm` install policy.** Pin via brew formula version, or build-from-source step, or vendored binary checked into the repo. v1 most likely runs `smolvm --version` at preflight and accepts a documented range; vendoring is heavier but reduces "works on my Mac" drift.
3. **CA install inside the OCI overlay.** Two paths: bake at prepare time (one OCI archive per CA fingerprint, big cache key) vs. inject at start time via `smolvm machine exec` after the VM is up. PRD 0006 chose the runtime path for Docker (docker-cp + `update-ca-certificates`); smolvm has the same shape via `machine exec`. Default to runtime injection unless it conflicts with `--outbound-localhost-only` start order.
4. **DNS filter granularity.** smolmachines's vsock-6002 filter accepts an allowlist of hostnames; we want to enforce both "agent can only resolve names on the bottle's allowlist" *and* "agent can only egress via TSI to 127.0.0.1." Confirm empirically (smoke test in chunk 2) that the allowlist applies to *guest-initiated* DNS only and doesn't accidentally NXDOMAIN the host-side pipelock's upstream lookups.
5. **`bottle.exec(script)` exit-code fidelity.** The PRD 0022 test suite reads `returncode` + stdout + stderr from `ExecResult`.
   Confirm `smolvm machine exec` propagates exit codes and separated streams; the research note's "external integration is the CLI" implies yes, but the embedded SDK bug it flagged suggests we should verify before coding around it.
6. **CI gating.** Gitea's act_runner is Linux without nested KVM, so smolmachines integration tests will skip there for the same structural reason the Docker bringup tests do (no real isolation primitive available on the runner). The skip predicate becomes `not (smolvm_available() and (platform.system() == "Darwin" or kvm_available()))`. CI coverage for this backend will come from local runs on the maintainer's macOS host until a Darwin runner is wired up; ack that as a known gap.
7. **Active bottle discovery.** Docker uses container labels to enumerate active bottles (`list_active` queries the daemon). smolmachines's enumeration story is `smolvm machine list`; the plan is to mirror the label scheme via Smolfile metadata (`labels = { "claude-bottle" = "1" }`-style entries, if the format supports it; otherwise via a deterministic name prefix `claude-bottle-`).

## References

- `docs/research/smolmachines-as-vm-backend.md`: primary research note recommending this adoption; PRD 0023's design hypothesis.
- `docs/research/agent-vm-isolation.md`: the broader microVM / gvproxy / pipelock landscape this PRD lands inside of.
- `docs/research/agent-sandbox-landscape.md`: identifies `"runtime": "microvm"`-style opt-in as the borrowable idea; smolmachines is the concrete implementation.
- PRD 0003 (`docs/prds/0003-bottle-backend-abstraction.md`): the backend abstraction this PRD is the first non-Docker consumer of.
- PRD 0017 (`docs/prds/0017-egress-proxy-via-mitmproxy.md`): the egress sidecar the host-side relocation reuses verbatim, only with a different transport.
- PRD 0022 (`docs/prds/0022-sandbox-escape-integration-test.md`): the acceptance gate for this PRD; the suite already runs through `get_bottle_backend()` so the env-var flip is the only change needed to exercise the smolmachines path.
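
## Appendix: loopback port allocation sketch

Step 2 of `prepare` under Lifecycle ("bind, get free port, release; record on plan") could look roughly like the following. This is a minimal sketch, not the planned implementation: the function name `allocate_loopback_ports`, the `SIDECARS` tuple, and the `port_map` shape are all hypothetical, chosen only to illustrate the technique with stdlib sockets.

```python
import socket


def allocate_loopback_ports(count: int = 4) -> list[int]:
    """Pick `count` distinct free TCP ports on 127.0.0.1, then release them.

    Hypothetical helper. All sockets stay open until every port has been
    picked, so the kernel cannot hand out the same port twice in one call.
    """
    sockets: list[socket.socket] = []
    try:
        for _ in range(count):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.bind(("127.0.0.1", 0))  # port 0 asks the kernel for any free port
            sockets.append(sock)
        return [sock.getsockname()[1] for sock in sockets]
    finally:
        for sock in sockets:
            sock.close()


# Hypothetical port map keyed by sidecar name, as the plan might record it.
SIDECARS = ("pipelock", "egress", "git-gate", "supervise")
port_map = dict(zip(SIDECARS, allocate_loopback_ports(len(SIDECARS))))
```

One caveat of bind-then-release: another process can claim a port between the release here and the sidecar's own bind at launch, so the sidecar start path should treat "address already in use" as a retryable failure rather than assume the port is still free.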
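
## Appendix: sidecar start and teardown ordering sketch

Step 1 of `launch` starts the sidecars in dependency order and registers teardown callbacks in reverse. One way to get that ordering from the stdlib is `contextlib.ExitStack`, which runs registered callbacks last-in, first-out. The sketch below assumes a hypothetical `start(name)` hook that launches one sidecar and returns a callable that stops it; `running_sidecars` and `fake_start` are illustrative names, not part of the codebase.

```python
from contextlib import ExitStack, contextmanager
from typing import Callable, Iterator

# Dependency order taken from the Lifecycle section.
START_ORDER = ("pipelock", "egress", "git-gate", "supervise")


@contextmanager
def running_sidecars(
    start: Callable[[str], Callable[[], None]],
) -> Iterator[None]:
    """Start sidecars in START_ORDER; stop them in reverse order on exit.

    `start(name)` is a hypothetical hook: it launches one sidecar and
    returns a zero-argument callable that stops it.
    """
    with ExitStack() as stack:
        for name in START_ORDER:
            stop = start(name)
            stack.callback(stop)  # callbacks run last-in, first-out
        yield


# Demonstration with a fake starter that only records events.
events: list[str] = []


def fake_start(name: str) -> Callable[[], None]:
    events.append(f"start {name}")
    return lambda: events.append(f"stop {name}")


with running_sidecars(fake_start):
    events.append("vm running")  # the VM create/start/stop would live here
```

Because each stop callback is registered immediately after its sidecar starts, a failure partway through startup still unwinds the sidecars that did come up, in reverse order.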
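
## Appendix: skip predicate sketch

Open question 6 gives the skip predicate as `not (smolvm_available() and (platform.system() == "Darwin" or kvm_available()))` without defining the two helpers. The sketch below fills them in under stated assumptions: that "available" means the `smolvm` binary is on `$PATH`, and that KVM is usable when `/dev/kvm` is readable and writable by the current user. Both definitions, and the name `should_skip_smolmachines`, are guesses for illustration.

```python
import os
import platform
import shutil


def smolvm_available() -> bool:
    # Assumption: "available" means the `smolvm` binary is on $PATH.
    return shutil.which("smolvm") is not None


def kvm_available() -> bool:
    # Assumption: KVM is usable when /dev/kvm is readable and writable.
    return os.access("/dev/kvm", os.R_OK | os.W_OK)


def should_skip_smolmachines() -> bool:
    """Skip predicate quoted in open question 6."""
    return not (
        smolvm_available()
        and (platform.system() == "Darwin" or kvm_available())
    )
```

If the suite runs under pytest (not confirmed by this PRD), the predicate could feed a `pytest.mark.skipif(should_skip_smolmachines(), reason=...)` marker, the same shape as the existing `GITEA_ACTIONS` skip.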