PRD 0023: smolmachines bottle backend
- Status: Draft
- Author: didericis
- Created: 2026-05-26
Summary
Ship a second concrete BottleBackend — SmolmachinesBottleBackend,
selected via CLAUDE_BOTTLE_BACKEND=smolmachines — that runs a
bottle inside a per-agent libkrun microVM on macOS (and KVM on Linux,
opportunistically). The egress topology moves out of an internal
Docker network and onto libkrun's TSI ("Transparent Socket Impersonation")
allowlist plus a host-side pipelock/egress/git-gate/supervise stack
listening on per-bottle loopback ports. The Docker backend ships
unchanged; this is opt-in via the existing env-var selector.
The acceptance gate is PRD 0022's tests/integration/test_sandbox_escape.py
running green against CLAUDE_BOTTLE_BACKEND=smolmachines.
Problem
agent-vm-isolation.md argues for hardware-isolated microVMs over
container-based bottles on macOS; smolmachines-as-vm-backend.md
concludes that smolmachines is the most plausible concrete VMM for
this project. Today, the only backend in the registry is Docker
(claude_bottle/backend/__init__.py:_BACKENDS = {"docker": ...}),
and three things motivate a second one now:
- Isolation ceiling. On macOS the Docker backend's agent container shares Docker Desktop's host VM with every other bottle. Container escape from claude-code lands the agent inside that shared VM. A per-bottle libkrun microVM gets hardware page tables via Hypervisor.framework; cross-bottle isolation becomes enforced by the CPU's MMU instead of namespace bookkeeping.
- PRD 0022 is backend-agnostic by design but currently only exercises the Docker backend. The suite was written with CLAUDE_BOTTLE_BACKEND selection in mind precisely so the smolmachines path could be validated against the same five attacks. Until a second backend exists, the abstraction is unproven.
- CI carve-outs. Most bottle-bringup integration tests skip under GITEA_ACTIONS=true because act_runner shares the host Docker socket but not the host filesystem. A smolmachines path doesn't share that constraint shape (it has its own, but different), so adding the backend forces the abstraction to be clean in places where Docker-specific assumptions have been tolerated.
The smolmachines research note's ## Recommendation ("adopt
smolmachines as the bottle VM backend on macOS; keep pipelock DIY")
is the design hypothesis under test here.
Goals / Success Criteria
The feature works when all of the following are observable on a macOS host with smolmachines installed:
- CLAUDE_BOTTLE_BACKEND=smolmachines python3 cli.py start <agent> brings up a microVM, runs claude-code inside it, and tears it down on exit. Same y/N preflight UX as Docker; only the resolved-runtime line differs.
- The sandbox-escape suite in tests/integration/test_sandbox_escape.py runs green against the smolmachines backend (all five attack categories blocked).
- Selecting the backend on a host without smolvm installed dies at startup with an install pointer; no silent fall-through to Docker.
- Active bottles show up under python3 cli.py list-bottles regardless of backend.
- python3 cli.py stop <bottle> and orphan cleanup work for both Docker bottles and smolmachines bottles via the same CLI surface.
The feature is done when all of the following ship:
- A new claude_bottle/backend/smolmachines/ subpackage exists, mirroring the layout of claude_bottle/backend/docker/ (backend.py, bottle.py, bottle_plan.py, bottle_cleanup_plan.py, prepare.py, launch.py, cleanup.py, util.py, and a provision/ subpackage for the five provision_* methods).
- SmolmachinesBottleBackend registered under the "smolmachines" key in claude_bottle/backend/__init__.py:_BACKENDS.
- Per-bottle Smolfile generation: a runtime-rendered TOML written to the bottle's stage dir, analogous to the compose file the Docker backend writes today. The Smolfile pins command, env, --outbound-localhost-only, and the per-bottle DNS allowlist.
- Host-side sidecar relocation: pipelock, egress, git-gate, and supervise each run as host processes (one set per bottle), bound to 127.0.0.1 on per-bottle dynamically-allocated ports. The agent's environment carries the resolved URLs (e.g. HTTPS_PROXY=http://127.0.0.1:<pipelock-port>).
- The agent guest image is produced from the existing Dockerfile (or a thin variant), exported as an OCI archive, and consumed by smolvm machine create. The image build step is part of prepare, analogous to docker_mod.build_image.
- The PRD 0022 sandbox-escape suite, run with CLAUDE_BOTTLE_BACKEND=smolmachines, passes locally on a smolmachines-capable host. The suite is updated to skip cleanly on hosts that can't reach smolmachines (same shape as the existing GITEA_ACTIONS == "true" skip), not to fail.
- README + CLAUDE.md updated to document the env-var selection, the macOS-only scope for v1, and the smolvm install prerequisite.
Non-goals
- No Linux KVM support shipped in this PRD. smolmachines works on Linux via KVM, but the abstraction win is biggest on macOS where Docker's shared-VM topology hurts most. Linux can come later behind the same selector.
- No removal of the Docker backend. Both backends ship side by side. Selection stays env-driven; the manifest does not gain a backend field.
- No default-backend change. docker remains the default value of CLAUDE_BOTTLE_BACKEND; smolmachines is strictly opt-in until it has been load-bearing on at least one operator's workflow for a release cycle.
- No host bind mounts. The smolmachines research note flagged that -v HOST:GUEST mounts via virtiofs would defeat the isolation goal. The manifest already has no concept of host mounts; this PRD does not introduce one. If a future PRD wants agent-side access to host files, it must come through a controlled channel (vsock relay, OCI overlay, supervise sidecar endpoint).
- No HTTP API mode. smolvm serve is the long-term-clean control plane, but v1 drives smolmachines via CLI subprocess invocations, the lower-overhead first iteration the research note already endorses.
- No custom kernel / initrd. smolmachines uses libkrunfw only; the agent image is an OCI ref, not a kernel + rootfs pair.
- No warm-pool or snapshot/restore. Each bottle gets a fresh microVM; cold-start cost is paid up front.
- No supervise/agent-credential rewrites for the new backend. Provisioning logic ports as-is; only the transport (host-side port URLs instead of in-network DNS names) changes.
Scope
In scope
- New claude_bottle/backend/smolmachines/ subpackage with the full set of BottleBackend overrides.
- Smolfile generator (TOML), analogous to backend/docker/compose.py's bottle_plan_to_compose.
- A host-side sidecar process manager that owns the lifecycle of pipelock + egress + git-gate + supervise for one bottle, binding them to per-bottle loopback ports and tearing them down with the bottle. This is the smolmachines-specific replacement for docker compose up/down.
- Per-bottle CA install path: the egress sidecar's CA cert lands inside the microVM via smolvm machine exec after start (analogous to the existing provision_ca for Docker).
- DNS allowlist plumbing: every host in bottle.egress.allowlist goes into the Smolfile's DNS filter section (vsock port 6002), so the VMM-layer DNS filter and the bottle's policy stay in sync. The agent can't dig its way out via unlisted hostnames (the DNS filter denies resolution) or via raw IP literals (the TSI + CIDR allowlist enforces this).
- Preflight smolvm check: if the user selects this backend and smolvm isn't on $PATH, die with an install pointer (brew tap + version pin TBD in implementation; see the open question on smolvm install policy).
- Manifest validation: refuse any bottle field this backend can't honor (today there are none, since the Docker backend already rejects host mounts; this is a forward-compat check).
- Tests:
  - Smoke unit-level test: Smolfile renderer produces the expected TOML for a fixture bottle.
  - Integration test: prepare → launch → exec("echo hi") → teardown on a smolmachines-capable host (skips otherwise via the same env/platform gate the Docker integration tests use).
  - PRD 0022 suite, re-run with the env var flipped, passes.
Out of scope
- VM image caching across bottles (each prepare rebuilds from the OCI archive; layer reuse is whatever smolmachines provides).
- Cross-host bottle relocation (the OCI archive is local-only).
- Operator-facing knobs for vCPU / memory / overlay size (use sensible defaults; expose as manifest fields in a later PRD if needed).
- Integration with the supervise plane's permission-prompt UX beyond port plumbing. supervise already speaks HTTP and binds to whatever loopback the backend hands it.
Proposed Design
Backend layout
claude_bottle/backend/smolmachines/
  __init__.py              re-exports SmolmachinesBottleBackend
  backend.py               SmolmachinesBottleBackend façade
  bottle.py                SmolmachinesBottle (exec_claude / exec / cp_in / close)
  bottle_plan.py           SmolmachinesBottlePlan + .print()
  bottle_cleanup_plan.py   SmolmachinesBottleCleanupPlan
  prepare.py               resolve_plan(spec, stage_dir, ...) -> SmolmachinesBottlePlan
  launch.py                @contextmanager launch(plan) -> SmolmachinesBottle
  cleanup.py               prepare_cleanup / cleanup / list_active
  smolfile.py              bottle_plan_to_smolfile(...) -> dict + render
  sidecars.py              host-side pipelock/egress/git-gate/supervise lifecycle
  smolvm.py                thin subprocess wrapper: machine create/start/exec/stop
  util.py                  slugify, port allocation, OCI archive helpers
  provision/               ca.py, prompt.py, skills.py, git.py, supervise.py
Network + egress topology
┌── macOS host ─────────────────────────────────────────────┐
│                                                           │
│  ┌── per-bottle host sidecars (one set per microVM) ─┐    │
│  │  pipelock   127.0.0.1:<p1>                        │    │
│  │  egress     127.0.0.1:<p2>                        │    │
│  │  git-gate   127.0.0.1:<p3>                        │    │
│  │  supervise  127.0.0.1:<p4>                        │    │
│  └───────────────────────────────────────────────────┘    │
│                         ▲                                 │
│                         │ TSI passthrough (localhost)     │
│                         │                                 │
│  ┌── libkrun microVM (per bottle) ───────────────────┐    │
│  │  env: HTTPS_PROXY=http://127.0.0.1:<p1>           │    │
│  │       EGRESS_URL=http://127.0.0.1:<p2>            │    │
│  │       GIT_GATE_URL=http://127.0.0.1:<p3>          │    │
│  │       MCP_SUPERVISE_URL=http://127.0.0.1:<p4>     │    │
│  │  --outbound-localhost-only                        │    │
│  │  DNS filter (vsock:6002) → host allowlist         │    │
│  └───────────────────────────────────────────────────┘    │
│                                                           │
└───────────────────────────────────────────────────────────┘
Two changes vs. the Docker backend:
- Sidecars are host processes, not sibling containers. No internal Docker network; isolation comes from TSI plus the per-bottle loopback port set.
- The "internal" allowlist becomes localhost-only. Egress out to the public internet still happens through pipelock + egress (the same scanning + DLP + auth-injection chain), but the agent's first hop is 127.0.0.1:<p1> reached via TSI, not a sidecar's IP on a Docker-managed bridge.
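The env block in the diagram can be derived mechanically from the per-bottle port map. A minimal sketch, assuming a hypothetical PortMap record (the plan's real port-map type is not specified here); the four variable names are the ones shown in the diagram:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortMap:
    """Hypothetical record of the four per-bottle loopback ports."""

    pipelock: int
    egress: int
    git_gate: int
    supervise: int


def agent_env(ports: PortMap) -> dict[str, str]:
    """Build the sidecar URLs the guest sees; every one is on 127.0.0.1."""

    def url(port: int) -> str:
        return f"http://127.0.0.1:{port}"

    return {
        "HTTPS_PROXY": url(ports.pipelock),
        "EGRESS_URL": url(ports.egress),
        "GIT_GATE_URL": url(ports.git_gate),
        "MCP_SUPERVISE_URL": url(ports.supervise),
    }
```

Keeping this a pure function of the port map means the Smolfile renderer and the sidecar manager can both consume the same record without re-deriving URLs.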
Lifecycle
SmolmachinesBottleBackend.prepare(spec, stage_dir):
- Cross-backend validation via BottleBackend._validate (skills, git identity files).
- Allocate four loopback ports (bind, get free port, release; record on plan).
- Resolve the agent OCI archive path (build if missing, cache by Dockerfile + agent-name hash).
- Render the per-bottle Smolfile to stage_dir/smolfile.toml, pinning command / env / --outbound-localhost-only + DNS allowlist.
- Resolve the in-VM CA paths so launch knows where to copy pipelock's CA after start.
- Return a SmolmachinesBottlePlan carrying the slug, port map, OCI archive path, Smolfile path, and host sidecar specs.
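The "bind, get free port, release" step in the list above can be sketched with the stdlib socket module. This is an illustration, not the planned util.py helper; the function name is hypothetical:

```python
import socket


def allocate_loopback_ports(count: int = 4) -> list[int]:
    """Ask the kernel for `count` distinct free ports on 127.0.0.1."""
    sockets: list[socket.socket] = []
    try:
        for _ in range(count):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sockets.append(sock)
            # Port 0 tells the kernel to pick any currently free port.
            sock.bind(("127.0.0.1", 0))
        # All sockets are still bound here, so the ports cannot collide.
        return [sock.getsockname()[1] for sock in sockets]
    finally:
        # Release the ports so the sidecars can bind them at launch.
        for sock in sockets:
            sock.close()
```

Because the ports are released before the sidecars bind them, another process could claim one in between. Launch would likely need to treat a bind failure as retryable (re-allocate and re-render the Smolfile) rather than fatal.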
SmolmachinesBottleBackend.launch(plan):
- Start the four host sidecars in dependency order (pipelock → egress → git-gate → supervise), bound to the plan's allocated ports. Register teardown callbacks in reverse order.
- smolvm machine create --smolfile <path> and smolvm machine start <name>.
- Provisioning: CA install → prompt → skills → git → supervise config, each via smolvm machine exec (analogous to docker exec).
- Yield a SmolmachinesBottle whose exec_claude / exec / cp_in all funnel through smolvm machine exec / smolvm machine cp.
- Teardown: stop and remove the VM, then stop the sidecars (in reverse start order).
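The start and teardown ordering above maps naturally onto contextlib.ExitStack, which runs registered callbacks last-in, first-out. A sketch under stated assumptions: start_sidecar and start_vm are hypothetical hooks that each start one resource and return the function that stops it; the real launch.py takes a plan instead.

```python
from contextlib import ExitStack, contextmanager
from typing import Callable, Iterator

SIDECAR_ORDER = ("pipelock", "egress", "git-gate", "supervise")

StopFn = Callable[[], None]


@contextmanager
def launch(
    start_sidecar: Callable[[str], StopFn],
    start_vm: Callable[[], StopFn],
) -> Iterator[None]:
    """Start sidecars, then the VM; unwind in strict reverse order."""
    with ExitStack() as stack:
        for name in SIDECAR_ORDER:
            # Register each stop function as soon as its start succeeds.
            stack.callback(start_sidecar(name))
        stack.callback(start_vm())
        # On exit (normal or exceptional) the VM stops first, then
        # supervise, git-gate, egress, and pipelock.
        yield
```

If a later start step raises, only the resources already started are torn down, which is the behaviour orphan cleanup would otherwise have to reconstruct.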
Data model
No manifest schema change. bottles[] continues to carry
egress.allowlist, env, git, skills references, etc.; the
smolmachines backend reads the same fields as the docker backend.
The DNS allowlist plumbed into the Smolfile is just
bottle.egress.allowlist re-encoded as TOML.
The BottleSpec dataclass and the Bottle ABC do not change.
Selection wiring
In claude_bottle/backend/__init__.py:
from .docker import DockerBottleBackend
from .smolmachines import SmolmachinesBottleBackend

_BACKENDS: dict[str, BottleBackend[Any, Any]] = {
    "docker": DockerBottleBackend(),
    "smolmachines": SmolmachinesBottleBackend(),
}
The existing "unknown backend" die() path stays as-is.
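For illustration, the env-driven lookup with a docker default and a hard failure on unknown names could look like the sketch below. The backend classes and die() here are stand-ins so the block runs on its own, and the environ parameter is an assumption for testability; the real get_bottle_backend() signature may differ.

```python
import os
import sys
from typing import Mapping, NoReturn


def die(message: str) -> NoReturn:
    """Stand-in for the project's die(): report and exit non-zero."""
    print(f"error: {message}", file=sys.stderr)
    raise SystemExit(1)


class DockerBottleBackend:
    """Stand-in for the real Docker backend."""


class SmolmachinesBottleBackend:
    """Stand-in for the backend this PRD adds."""


_BACKENDS = {
    "docker": DockerBottleBackend(),
    "smolmachines": SmolmachinesBottleBackend(),
}


def get_bottle_backend(environ: Mapping[str, str] = os.environ):
    """Resolve CLAUDE_BOTTLE_BACKEND, defaulting to docker."""
    name = environ.get("CLAUDE_BOTTLE_BACKEND", "docker")
    backend = _BACKENDS.get(name)
    if backend is None:
        # No fall-through: an unknown name is an operator error.
        die(f"unknown backend {name!r}; expected one of {sorted(_BACKENDS)}")
    return backend
```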
External dependencies
- smolvm CLI binary on $PATH (one new external dep, gated by the preflight check). Pinned version policy is deferred to the open questions; v1 reads smolvm --version and refuses to launch outside a known-good range.
- No new Python packages. Subprocess plus a hand-rolled renderer for Smolfile authoring: stdlib tomllib (Python 3.11+) only parses TOML, and the usual writer, tomli_w, is a third-party package rather than stdlib. Render TOML by hand from a dict[str, Any]; the Smolfile shape is small.
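A sketch of the preflight check described above. The version bounds are placeholders (the real range is an open question), and the parser assumes smolvm --version prints a MAJOR.MINOR.PATCH triple somewhere in its output, which has not been verified against the actual CLI.

```python
import re
import shutil
import subprocess

# Hypothetical known-good range; the real pin is still undecided.
MIN_VERSION = (0, 1, 0)  # inclusive
MAX_VERSION = (1, 0, 0)  # exclusive


def parse_version(text: str) -> tuple[int, int, int]:
    """Extract the first MAJOR.MINOR.PATCH triple from version output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return (int(match.group(1)), int(match.group(2)), int(match.group(3)))


def preflight_smolvm() -> str:
    """Return the resolved smolvm path or exit with an actionable message."""
    path = shutil.which("smolvm")
    if path is None:
        raise SystemExit(
            "smolvm not found on $PATH; install it before selecting "
            "CLAUDE_BOTTLE_BACKEND=smolmachines"
        )
    proc = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    try:
        version = parse_version(proc.stdout + proc.stderr)
    except ValueError as exc:
        raise SystemExit(f"could not read smolvm version: {exc}") from exc
    if not (MIN_VERSION <= version < MAX_VERSION):
        raise SystemExit(f"smolvm {version} is outside the supported range")
    return path
```

Comparing versions as integer tuples avoids the string-ordering trap where "0.10.0" sorts before "0.9.0".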
Acceptance test plan
- Unit: tests/unit/test_smolfile.py verifies the renderer produces the expected TOML for a fixture bottle (allowlist → DNS rules, env → env =, command line, outbound-localhost flag).
- Integration smoke: tests/integration/test_smolmachines_smoke.py with prepare → launch → exec → teardown, guarded by a smolvm presence check + macOS / KVM platform check.
- PRD 0022 re-run: with CLAUDE_BOTTLE_BACKEND=smolmachines, all five attack categories return sandbox-block markers and the suite passes. The test code does not change beyond the env-var flip; that's the contract the PRD 0022 abstraction was designed for.
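The suite depends on exec returning an exit code plus separated stdout and stderr. A transport-agnostic capture helper might look like the sketch below. ExecResult here is a stand-in with the three fields the suite reads; argv_prefix is a hypothetical parameter, and the exact smolvm machine exec argument form is not assumed.

```python
import subprocess
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ExecResult:
    """Stand-in for the project's ExecResult."""

    returncode: int
    stdout: str
    stderr: str


def run_exec(
    argv_prefix: Sequence[str], script: str, timeout: float = 60.0
) -> ExecResult:
    """Run `script` behind a transport prefix, keeping streams separate."""
    proc = subprocess.run(
        [*argv_prefix, script],
        capture_output=True,  # separate pipes for stdout and stderr
        text=True,
        timeout=timeout,
        check=False,  # a non-zero exit is data here, not an error
    )
    return ExecResult(proc.returncode, proc.stdout, proc.stderr)
```

Pointing argv_prefix at a local interpreter (for example [sys.executable, "-c"]) gives a baseline; running the same script through the smolvm transport and comparing the two results is one way to check exit-code and stream fidelity before relying on it.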
Sizing — into chunks
1. Backend skeleton + selection + Smolfile renderer. Subpackage layout, _resolve_plan stub that emits a TOML file but doesn't launch anything, _BACKENDS registration, preflight smolvm check. Unit test on the renderer. No VM bringup yet.
2. VM lifecycle + OCI archive build. smolvm.py subprocess wrapper, prepare-time image build (existing Dockerfile → OCI archive), launch path that creates + starts + stops a VM with no sidecars wired. Smoke integration test: exec("echo hi") inside a started VM.
3. Host-side sidecar relocation. sidecars.py: per-bottle pipelock + egress + git-gate + supervise as host processes on loopback. Port allocator. Teardown ordering. No provisioning yet beyond what the sidecars need.
4. Provisioning parity with Docker. CA install via smolvm machine exec, prompt/skills/.git copy-in, supervise MCP config. End-to-end start works for a real agent manifest.
5. PRD 0022 sandbox-escape suite green. Skip-guard update, small adjustments to test helpers if any (the test uses bottle.exec(script) and inspects returncode + body for sandbox markers; this should be transport-agnostic, but verify). Document the macOS-only scope in README.
Open questions
1. Sidecar locality: host process vs in-VM init. This PRD defaults to host-process sidecars (proposed design above). The alternative is to bake pipelock + egress + git-gate + supervise into the OCI image and start them via init in the same VM. That would simplify port plumbing (the agent reaches sidecars over localhost inside the VM, not over TSI) but expand the trust boundary of the agent VM. Default to host-process sidecars unless someone identifies a TSI loopback edge case during chunk 3.
2. smolvm install policy. Pin via brew formula version, or build-from-source step, or vendored binary checked into the repo. v1 most likely runs smolvm --version at preflight and accepts a documented range; vendoring is heavier but reduces "works on my Mac" drift.
3. CA install inside the OCI overlay. Two paths: bake at prepare time (one OCI archive per CA fingerprint, big cache key) vs. inject at start time via smolvm machine exec after the VM is up. PRD 0006 chose the runtime path for Docker (docker-cp + update-ca-certificates); smolvm has the same shape via machine exec. Default to runtime injection unless it conflicts with --outbound-localhost-only start order.
4. DNS filter granularity. smolmachines's vsock-6002 filter accepts an allowlist of hostnames; we want to enforce both "agent can only resolve names on the bottle's allowlist" and "agent can only egress via TSI to 127.0.0.1." Confirm empirically (smoke test in chunk 2) that the allowlist applies to guest-initiated DNS only and doesn't accidentally NXDOMAIN the host-side pipelock's upstream lookups.
5. bottle.exec(script) exit-code fidelity. The PRD 0022 test suite reads returncode + stdout + stderr from ExecResult. Confirm smolvm machine exec propagates exit codes and separated streams. The research note's "external integration is the CLI" implies yes, but the embedded SDK bug it flagged suggests we should verify before coding around it.
6. CI gating. Gitea's act_runner is Linux without nested KVM, so smolmachines integration tests will skip there for the same structural reason the Docker bringup tests do (no real isolation primitive available on the runner). The skip predicate becomes not (smolvm_available() and (platform.system() == "Darwin" or kvm_available())). CI coverage for this backend will come from local runs on the maintainer's macOS host until a Darwin runner is wired up; acknowledge that as a known gap.
7. Active bottle discovery. Docker uses container labels to enumerate active bottles (list_active queries the daemon). smolmachines's enumeration story is smolvm machine list; the plan is to mirror the label scheme via Smolfile metadata (labels = { "claude-bottle" = "1" }-style entries, if the format supports it; otherwise via a deterministic name prefix claude-bottle-<slug>).
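The skip predicate from the CI gating question can be kept as a pure function so its truth table is unit-testable without a real host. A sketch: smolvm_available and kvm_available are named in the predicate but their bodies below are assumptions (a $PATH lookup and a read-write access check on /dev/kvm).

```python
import os
import platform
import shutil


def smolvm_available() -> bool:
    """Assumed probe: smolvm is resolvable on $PATH."""
    return shutil.which("smolvm") is not None


def kvm_available() -> bool:
    """Assumed probe: /dev/kvm exists and this user may open it."""
    return os.access("/dev/kvm", os.R_OK | os.W_OK)


def should_skip(has_smolvm: bool, system: str, has_kvm: bool) -> bool:
    """Pure form of the predicate: skip unless smolvm can actually run."""
    return not (has_smolvm and (system == "Darwin" or has_kvm))


def should_skip_here() -> bool:
    """Evaluate the predicate against the current host."""
    return should_skip(smolvm_available(), platform.system(), kvm_available())
```

On a Linux runner without nested KVM this evaluates to True even when smolvm is installed, which matches the act_runner case described above.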
References
- docs/research/smolmachines-as-vm-backend.md: primary research note recommending this adoption; PRD 0023's design hypothesis.
- docs/research/agent-vm-isolation.md: the broader microVM / gvproxy / pipelock landscape this PRD lands inside of.
- docs/research/agent-sandbox-landscape.md: identifies "runtime": "microvm"-style opt-in as the borrowable idea; smolmachines is the concrete implementation.
- PRD 0003 (docs/prds/0003-bottle-backend-abstraction.md): the backend abstraction this PRD is the first non-Docker consumer of.
- PRD 0017 (docs/prds/0017-egress-proxy-via-mitmproxy.md): the egress sidecar the host-side relocation reuses verbatim, only with a different transport.
- PRD 0022 (docs/prds/0022-sandbox-escape-integration-test.md): the acceptance gate for this PRD; the suite already runs through get_bottle_backend() so the env-var flip is the only change needed to exercise the smolmachines path.