a2ac124d5c
Specs a second concrete BottleBackend selectable via CLAUDE_BOTTLE_BACKEND=smolmachines: per-agent libkrun microVM on macOS, sidecars relocated to host-side loopback ports plumbed via Smolfile env, PRD 0022's sandbox-escape suite as the acceptance gate (the env-var flip is the only change required). Docker backend ships unchanged and remains default. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
428 lines
21 KiB
Markdown
# PRD 0023: smolmachines bottle backend

- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-26

## Summary

Ship a second concrete `BottleBackend`, `SmolmachinesBottleBackend`,
selected via `CLAUDE_BOTTLE_BACKEND=smolmachines`, that runs a
bottle inside a per-agent libkrun microVM on macOS (Linux via KVM
may work opportunistically but is not a v1 target; see Non-goals).
The egress topology moves out of an internal
Docker network and onto libkrun's TSI ("Transparent Socket
Impersonation") allowlist plus a host-side
pipelock/egress/git-gate/supervise stack
listening on per-bottle loopback ports. The Docker backend ships
unchanged; this is opt-in via the existing env-var selector.

The acceptance gate is PRD 0022's `tests/integration/test_sandbox_escape.py`
running green against `CLAUDE_BOTTLE_BACKEND=smolmachines`.

## Problem

`agent-vm-isolation.md` argues for hardware-isolated microVMs over
container-based bottles on macOS; `smolmachines-as-vm-backend.md`
concludes that smolmachines is the most plausible concrete VMM for
this project. Today, the only backend in the registry is Docker
(`claude_bottle/backend/__init__.py:_BACKENDS = {"docker": ...}`),
and three things motivate a second one now:

- **Isolation ceiling.** On macOS the Docker backend's agent
  container shares Docker Desktop's host VM with every other bottle.
  A container escape from claude-code lands the agent inside that
  shared VM. A per-bottle libkrun microVM gets hardware page tables
  via `Hypervisor.framework`; cross-bottle isolation is then
  enforced by the CPU's MMU instead of namespace bookkeeping.
- **PRD 0022 is backend-agnostic by design** but currently only
  exercises the Docker backend. The suite was written with
  `CLAUDE_BOTTLE_BACKEND` selection in mind precisely so the
  smolmachines path could be validated against the same five
  attacks. Until a second backend exists, the abstraction is
  unproven.
- **CI carve-outs.** Most bottle-bringup integration tests skip
  under `GITEA_ACTIONS=true` because act_runner shares the host
  Docker socket but not the host filesystem. A smolmachines path
  has constraints of its own, but not that one, so adding the
  backend forces the abstraction to be clean in places where
  Docker-specific assumptions have been tolerated.

The smolmachines research note's `## Recommendation` ("adopt
smolmachines as the bottle VM backend on macOS; keep pipelock DIY")
is the design hypothesis under test here.

## Goals / Success Criteria

The feature works when all of the following are observable on a
macOS host with smolmachines installed:

- `CLAUDE_BOTTLE_BACKEND=smolmachines python3 cli.py start <agent>`
  brings up a microVM, runs claude-code inside it, and tears it
  down on exit. Same y/N preflight UX as Docker; only the
  resolved-runtime line differs.
- The sandbox-escape suite in `tests/integration/test_sandbox_escape.py`
  runs green against the smolmachines backend (all five attack
  categories blocked).
- Selecting the backend on a host without `smolvm` installed dies
  at startup with an install pointer; no silent fall-through to
  Docker.
- Active bottles show up under
  `python3 cli.py list-bottles` regardless of backend.
- `python3 cli.py stop <bottle>` and orphan cleanup work for both
  Docker bottles and smolmachines bottles via the same CLI surface.

The feature is **done** when all of the following ship:

- A new `claude_bottle/backend/smolmachines/` subpackage exists,
  mirroring the layout of `claude_bottle/backend/docker/`
  (`backend.py`, `bottle.py`, `bottle_plan.py`,
  `bottle_cleanup_plan.py`, `prepare.py`, `launch.py`,
  `cleanup.py`, `util.py`, and a `provision/` subpackage for the
  five `provision_*` methods).
- `SmolmachinesBottleBackend` registered under the
  `"smolmachines"` key in `claude_bottle/backend/__init__.py:_BACKENDS`.
- Per-bottle Smolfile generation: a runtime-rendered TOML written
  to the bottle's stage dir, analogous to the compose file the
  Docker backend writes today. The Smolfile pins `command`,
  `env`, `--outbound-localhost-only`, and the per-bottle DNS
  allowlist.
- Host-side sidecar relocation: pipelock, egress, git-gate, and
  supervise each run as host processes (one set per bottle),
  bound to `127.0.0.1` on per-bottle dynamically-allocated ports.
  The agent's environment carries the resolved URLs (e.g.
  `HTTPS_PROXY=http://127.0.0.1:<pipelock-port>`).
- The agent guest image is produced from the existing `Dockerfile`
  (or a thin variant), exported as an OCI archive, and consumed by
  `smolvm machine create`. The image build step is part of `prepare`,
  analogous to `docker_mod.build_image`.
- The PRD 0022 sandbox-escape suite, run with
  `CLAUDE_BOTTLE_BACKEND=smolmachines`, passes locally on a
  smolmachines-capable host. The suite is updated to skip cleanly
  on hosts that can't reach smolmachines (same shape as the
  existing `GITEA_ACTIONS == "true"` skip), not to fail.
- README + `CLAUDE.md` updated to document the env-var selection,
  the macOS-only scope for v1, and the `smolvm` install
  prerequisite.

## Non-goals

- **No Linux KVM support shipped in this PRD.** smolmachines works
  on Linux via KVM, but the abstraction win is biggest on macOS
  where Docker's shared-VM topology hurts most. Linux can come
  later behind the same selector.
- **No removal of the Docker backend.** Both backends ship side by
  side. Selection stays env-driven; the manifest does not gain a
  `backend` field.
- **No default-backend change.** `docker` remains the default
  value of `CLAUDE_BOTTLE_BACKEND`; smolmachines is strictly
  opt-in until it has been load-bearing on at least one operator's
  workflow for a release cycle.
- **No host bind mounts.** The smolmachines research note flagged
  that `-v HOST:GUEST` mounts via virtiofs would defeat the
  isolation goal. The manifest already has no concept of host
  mounts; this PRD does not introduce one. If a future PRD wants
  agent-side access to host files, it must come through a
  controlled channel (vsock relay, OCI overlay, supervise sidecar
  endpoint).
- **No HTTP API mode.** `smolvm serve` is the long-term-clean
  control plane, but v1 drives smolmachines via CLI subprocess
  invocations, the lower-overhead first iteration the research
  note already endorses.
- **No custom kernel / initrd.** smolmachines uses libkrunfw
  only; the agent image is an OCI ref, not a kernel + rootfs pair.
- **No warm-pool or snapshot/restore.** Each bottle gets a fresh
  microVM; cold-start cost is paid up front.
- **No supervise/agent-credential rewrites for the new backend.**
  Provisioning logic ports as-is; only the *transport* (host-side
  port URLs instead of in-network DNS names) changes.

## Scope

### In scope

- New `claude_bottle/backend/smolmachines/` subpackage with the
  full set of `BottleBackend` overrides.
- Smolfile generator (TOML), analogous to
  `backend/docker/compose.py`'s `bottle_plan_to_compose`.
- A host-side sidecar process manager that owns the lifecycle of
  pipelock + egress + git-gate + supervise for one bottle, binding
  them to per-bottle loopback ports and tearing them down with the
  bottle. This is the smolmachines-specific replacement for
  `docker compose up`/`down`.
- Per-bottle CA install path: the egress sidecar's CA cert lands
  inside the microVM via `smolvm machine exec` after start
  (analogous to the existing `provision_ca` for Docker).
- DNS allowlist plumbing: every host in `bottle.egress.allowlist`
  goes into the Smolfile's DNS filter section (vsock port 6002),
  so the VMM-layer DNS filter and the bottle's policy stay in
  sync. The agent can't resolve hostnames outside the allowlist
  (the DNS filter denies them), and it can't sidestep DNS with raw
  IP literals (TSI plus the CIDR allowlist blocks those).
- Preflight `smolvm` check: if the user selects this backend and
  `smolvm` isn't on `$PATH`, die with an install pointer (brew tap
  + version pin TBD in implementation; see open question 2).
- Manifest validation: refuse any bottle field this backend can't
  honor (today there are none, since the Docker backend already
  rejects host mounts; this is a forward-compat check).
- Tests:
  - Smoke unit-level test: Smolfile renderer produces the
    expected TOML for a fixture bottle.
  - Integration test:
    `prepare → launch → exec("echo hi") → teardown`
    on a smolmachines-capable host (skips otherwise
    via the same env/platform gate the Docker integration tests
    use).
  - PRD 0022 suite, re-run with the env var flipped, passes.
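
The preflight `smolvm` check in the list above could look roughly like the following. This is a minimal sketch, not the project's implementation: `require_smolvm` and `INSTALL_HINT` are hypothetical names, the hint text is a placeholder (the real install pointer is still TBD), and `SystemExit` stands in for the project's own `die()` helper.

```python
import shutil
from typing import Callable, Optional

# Placeholder text: the real pointer (brew tap, version pin) is still TBD.
INSTALL_HINT = (
    "smolvm not found on $PATH; install smolmachines before selecting "
    "CLAUDE_BOTTLE_BACKEND=smolmachines"
)


def require_smolvm(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return the resolved smolvm path, or exit with an install pointer."""
    path = which("smolvm")
    if path is None:
        # Fail loudly: never fall through to the Docker backend.
        raise SystemExit(INSTALL_HINT)
    return path
```

Passing `which` as a parameter keeps the check unit-testable on hosts that have no `smolvm` binary.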

### Out of scope

- VM image caching across bottles (each prepare rebuilds from the
  OCI archive; layer reuse is whatever smolmachines provides).
- Cross-host bottle relocation (the OCI archive is local-only).
- Operator-facing knobs for vCPU / memory / overlay size (use
  sensible defaults; expose as manifest fields in a later PRD if
  needed).
- Integration with the `supervise` plane's permission-prompt UX
  beyond port plumbing; supervise already speaks HTTP and binds
  to whatever loopback the backend hands it.

## Proposed Design

### Backend layout

```
claude_bottle/backend/smolmachines/
  __init__.py             re-exports SmolmachinesBottleBackend
  backend.py              SmolmachinesBottleBackend façade
  bottle.py               SmolmachinesBottle (exec_claude / exec / cp_in / close)
  bottle_plan.py          SmolmachinesBottlePlan + .print()
  bottle_cleanup_plan.py  SmolmachinesBottleCleanupPlan
  prepare.py              resolve_plan(spec, stage_dir, ...) -> SmolmachinesBottlePlan
  launch.py               @contextmanager launch(plan) -> SmolmachinesBottle
  cleanup.py              prepare_cleanup / cleanup / list_active
  smolfile.py             bottle_plan_to_smolfile(...) -> dict + render
  sidecars.py             host-side pipelock/egress/git-gate/supervise lifecycle
  smolvm.py               thin subprocess wrapper: machine create/start/exec/stop
  util.py                 slugify, port allocation, OCI archive helpers
  provision/              ca.py, prompt.py, skills.py, git.py, supervise.py
```
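
As an illustration of the "thin subprocess wrapper" role listed for `smolvm.py`, here is a minimal sketch under stated assumptions. `run_cli` is a hypothetical helper name, and the `ExecResult` fields follow the `returncode` / stdout / stderr triple the PRD 0022 suite is described as reading; the exact `smolvm` argument shapes are not assumed here, so the demo runs the Python interpreter instead.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ExecResult:
    returncode: int
    stdout: str
    stderr: str


def run_cli(argv: Sequence[str], timeout: float = 60.0) -> ExecResult:
    """Run one CLI invocation, keeping stdout and stderr separate."""
    proc = subprocess.run(
        list(argv),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # callers inspect returncode themselves
    )
    return ExecResult(proc.returncode, proc.stdout, proc.stderr)


if __name__ == "__main__":
    # Stand-in for a real `smolvm machine ...` call, so this runs anywhere.
    result = run_cli([sys.executable, "-c", "import sys; print('hi'); sys.exit(3)"])
    print(result.returncode, result.stdout.strip())
```

Keeping `check=False` means a non-zero exit from the guest command is reported as data, which is what a sandbox-escape test needs in order to inspect a blocked attack.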

### Network + egress topology

```
┌── macOS host ─────────────────────────────────────────────┐
│                                                           │
│  ┌── per-bottle host sidecars (one set per microVM) ─┐    │
│  │  pipelock   127.0.0.1:<p1>                        │    │
│  │  egress     127.0.0.1:<p2>                        │    │
│  │  git-gate   127.0.0.1:<p3>                        │    │
│  │  supervise  127.0.0.1:<p4>                        │    │
│  └───────────────────────────────────────────────────┘    │
│            ▲                                              │
│            │  TSI passthrough (localhost)                 │
│            │                                              │
│  ┌── libkrun microVM (per bottle) ───────────────────┐    │
│  │  env: HTTPS_PROXY=http://127.0.0.1:<p1>           │    │
│  │       EGRESS_URL=http://127.0.0.1:<p2>            │    │
│  │       GIT_GATE_URL=http://127.0.0.1:<p3>          │    │
│  │       MCP_SUPERVISE_URL=http://127.0.0.1:<p4>     │    │
│  │  --outbound-localhost-only                        │    │
│  │  DNS filter (vsock:6002) → host allowlist         │    │
│  └───────────────────────────────────────────────────┘    │
│                                                           │
└───────────────────────────────────────────────────────────┘
```

Two changes vs. the Docker backend:

1. **Sidecars are host processes, not sibling containers.** No
   internal Docker network; isolation comes from TSI plus the
   per-bottle loopback port set.
2. **The "internal" allowlist becomes localhost-only.** Egress out
   to the public internet still happens through pipelock + egress
   (the same scanning + DLP + auth-injection chain), but the
   agent's first hop is `127.0.0.1:<p1>` reached via TSI, not a
   sidecar's IP on a Docker-managed bridge.
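
To make the env plumbing concrete, the sketch below derives the agent's environment from a per-bottle port map. The four env var names are taken from the topology diagram; the helper name `agent_env` and the port-map keys are assumptions made for illustration only.

```python
from typing import Dict

# Env var names come from the topology diagram; the port-map keys are assumed.
_ENV_BY_SIDECAR = {
    "pipelock": "HTTPS_PROXY",
    "egress": "EGRESS_URL",
    "git-gate": "GIT_GATE_URL",
    "supervise": "MCP_SUPERVISE_URL",
}


def agent_env(ports: Dict[str, int]) -> Dict[str, str]:
    """Map each sidecar's loopback port to the URL the agent should use."""
    missing = sorted(set(_ENV_BY_SIDECAR) - set(ports))
    if missing:
        raise ValueError(f"no port allocated for: {', '.join(missing)}")
    return {
        env_name: f"http://127.0.0.1:{ports[sidecar]}"
        for sidecar, env_name in _ENV_BY_SIDECAR.items()
    }
```

Refusing a partial port map up front avoids booting a VM whose agent would silently lack one of its four endpoints.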

### Lifecycle

`SmolmachinesBottleBackend.prepare(spec, stage_dir)`:

1. Cross-backend validation via `BottleBackend._validate` (skills,
   git identity files).
2. Allocate four loopback ports (bind, get free port, release;
   record on plan).
3. Resolve the agent OCI archive path (build if missing, cache by
   Dockerfile + agent-name hash).
4. Render the per-bottle Smolfile to `stage_dir/smolfile.toml`,
   pinning command/env/`--outbound-localhost-only` + DNS allowlist.
5. Resolve the in-VM CA paths so launch knows where to copy
   pipelock's CA after start.
6. Return a `SmolmachinesBottlePlan` carrying the slug, port map,
   OCI archive path, Smolfile path, and host sidecar specs.
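
Step 2's "bind, get free port, release" pattern can be sketched as follows. This is an illustrative helper, not the project's port allocator: `allocate_loopback_ports` is a hypothetical name, and the sidecar names passed in are whatever keys the plan uses.

```python
import socket
from typing import Dict, Iterable, List


def allocate_loopback_ports(names: Iterable[str]) -> Dict[str, int]:
    """Ask the kernel for one free 127.0.0.1 port per name."""
    held: List[socket.socket] = []
    ports: Dict[str, int] = {}
    try:
        for name in names:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            held.append(sock)
            sock.bind(("127.0.0.1", 0))  # port 0: the kernel picks a free port
            ports[name] = sock.getsockname()[1]
    finally:
        # Release only after every port is chosen, so all of them are distinct.
        for sock in held:
            sock.close()
    return ports
```

One caveat with this pattern: between release and the sidecar's own bind, another process can claim the port, so the sidecar start path should treat "address in use" as a retryable failure.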

`SmolmachinesBottleBackend.launch(plan)`:

1. Start the four host sidecars in dependency order (pipelock →
   egress → git-gate → supervise), bound to the plan's allocated
   ports. Register teardown callbacks in reverse order.
2. `smolvm machine create --smolfile <path>` and
   `smolvm machine start <name>`.
3. Provisioning: CA install → prompt → skills → git → supervise
   config, each via `smolvm machine exec` (analogous to
   `docker exec`).
4. Yield a `SmolmachinesBottle` whose `exec_claude` / `exec` /
   `cp_in` all funnel through `smolvm machine exec` /
   `smolvm machine cp`.
5. Teardown: stop and remove the VM, then stop the sidecars (in
   reverse start order).
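
The start and teardown ordering in the launch steps maps naturally onto `contextlib.ExitStack`, whose callbacks run last-in, first-out. The sketch below shows only the ordering; `start` and `stop` are hypothetical callables standing in for the real sidecar and VM operations, and provisioning is omitted.

```python
from contextlib import ExitStack, contextmanager
from typing import Callable, Iterator

SIDECAR_ORDER = ("pipelock", "egress", "git-gate", "supervise")


@contextmanager
def launch(start: Callable[[str], None], stop: Callable[[str], None]) -> Iterator[None]:
    """Start sidecars, then the VM; undo everything in reverse on exit."""
    with ExitStack() as stack:
        for name in SIDECAR_ORDER:
            start(name)
            stack.callback(stop, name)  # still runs if a later start fails
        start("vm")
        stack.callback(stop, "vm")  # registered last, so it stops first
        yield
```

Because each teardown is registered immediately after its start succeeds, a failure halfway through bringup still stops whatever was already running.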

### Data model

No manifest schema change. `bottles[]` continues to carry
`egress.allowlist`, `env`, `git`, `skills` references, etc.; the
smolmachines backend reads the same fields as the docker backend.
The DNS allowlist plumbed into the Smolfile is just
`bottle.egress.allowlist` re-encoded as TOML.

The `BottleSpec` dataclass and the `Bottle` ABC do not change.

### Selection wiring

In `claude_bottle/backend/__init__.py`:

```python
from typing import Any

from .docker import DockerBottleBackend
from .smolmachines import SmolmachinesBottleBackend

# Registry keyed by the value of CLAUDE_BOTTLE_BACKEND.
_BACKENDS: dict[str, BottleBackend[Any, Any]] = {
    "docker": DockerBottleBackend(),
    "smolmachines": SmolmachinesBottleBackend(),
}
```

The existing "unknown backend" `die()` path stays as-is.
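
For readers unfamiliar with the selector, the lookup can be pictured roughly as below. This is a self-contained stand-in rather than the project's `get_bottle_backend()`: `select_backend` is a hypothetical name, `SystemExit` substitutes for `die()`, and the error wording is invented.

```python
import os
from typing import Mapping, TypeVar

T = TypeVar("T")

DEFAULT_BACKEND = "docker"  # the default stated in Non-goals


def select_backend(backends: Mapping[str, T], environ: Mapping[str, str] = os.environ) -> T:
    """Pick the backend named by CLAUDE_BOTTLE_BACKEND, defaulting to docker."""
    name = environ.get("CLAUDE_BOTTLE_BACKEND", DEFAULT_BACKEND)
    if name not in backends:
        known = ", ".join(sorted(backends))
        # Stand-in for the project's die(); no fallback to another backend.
        raise SystemExit(f"unknown CLAUDE_BOTTLE_BACKEND {name!r} (known: {known})")
    return backends[name]
```

The important property is the absence of any fallback: an unrecognized or unavailable selection stops the run instead of quietly using Docker.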

### External dependencies

- `smolvm` CLI binary on `$PATH` (one new external dep, gated by
  the preflight check). Pinned version policy is deferred to the
  open questions; v1 reads `smolvm --version` and refuses to launch
  outside a known-good range.
- No new Python packages: subprocess plus the stdlib. Note that
  stdlib `tomllib` (Python 3.11+) only parses TOML, and the writer
  `tomli_w` is a third-party package, so the Smolfile is rendered
  by hand from a `dict[str, Any]`; the Smolfile shape is small.

### Acceptance test plan

- **Unit:** `tests/unit/test_smolfile.py` verifies the renderer
  produces the expected TOML for a fixture bottle (allowlist →
  DNS rules, env → `env =`, command line, outbound-localhost
  flag).
- **Integration smoke:** `tests/integration/test_smolmachines_smoke.py`
  with `prepare → launch → exec → teardown`, guarded by a
  `smolvm` presence check + macOS / KVM platform check.
- **PRD 0022 re-run:** with `CLAUDE_BOTTLE_BACKEND=smolmachines`,
  all five attack categories return sandbox-block markers and the
  suite passes. The test code does not change beyond the env-var
  flip; that is the contract the PRD 0022 abstraction was
  designed for.

## Sizing into chunks

1. **Backend skeleton + selection + Smolfile renderer.** Subpackage
   layout, `_resolve_plan` stub that emits a TOML file but doesn't
   launch anything, `_BACKENDS` registration, preflight `smolvm`
   check. Unit test on the renderer. No VM bringup yet.
2. **VM lifecycle + OCI archive build.** `smolvm.py` subprocess
   wrapper, prepare-time image build (existing Dockerfile → OCI
   archive), launch path that creates + starts + stops a VM with
   no sidecars wired. Smoke integration test: `exec("echo hi")`
   inside a started VM.
3. **Host-side sidecar relocation.** `sidecars.py`: per-bottle
   pipelock + egress + git-gate + supervise as host processes on
   loopback. Port allocator. Teardown ordering. No provisioning
   yet beyond what the sidecars need.
4. **Provisioning parity with Docker.** CA install via
   `smolvm machine exec`, prompt/skills/.git copy-in, supervise
   MCP config. End-to-end `start` works for a real agent manifest.
5. **PRD 0022 sandbox-escape suite green.** Skip-guard update,
   small adjustments to test helpers if any (the test uses
   `bottle.exec(script)` and inspects `returncode` + body for
   sandbox markers, so it should be transport-agnostic, but verify).
   Document the macOS-only scope in README.

## Open questions

1. **Sidecar locality: host process vs in-VM init.** This PRD
   defaults to host-process sidecars (option A, the proposed design
   above). The alternative (option B) is to bake pipelock + egress +
   git-gate + supervise into the OCI image and start them via init
   in the same VM. That would simplify port plumbing (the agent
   reaches sidecars over localhost inside the VM, not over TSI) but
   expands the trust boundary of the agent VM. Default to option A
   unless someone identifies a TSI loopback edge case during chunk 3.
2. **`smolvm` install policy.** Pin via brew formula version, or
   build-from-source step, or vendored binary checked into the
   repo. v1 most likely runs `smolvm --version` at preflight and
   accepts a documented range; vendoring is heavier but reduces
   "works on my Mac" drift.
3. **CA install inside the OCI overlay.** Two paths: bake at
   prepare time (one OCI archive per CA fingerprint, big cache
   key) vs. inject at start time via `smolvm machine exec` after
   the VM is up. PRD 0006 chose the runtime path for Docker
   (docker-cp + `update-ca-certificates`); smolvm has the same
   shape via `machine exec`. Default to runtime injection unless
   it conflicts with `--outbound-localhost-only` start order.
4. **DNS filter granularity.** smolmachines's vsock-6002 filter
   accepts an allowlist of hostnames; we want to enforce both
   "agent can only resolve names on the bottle's allowlist" *and*
   "agent can only egress via TSI to 127.0.0.1." Confirm
   empirically (smoke test in chunk 2) that the allowlist applies
   to *guest-initiated* DNS only and doesn't accidentally NXDOMAIN
   the host-side pipelock's upstream lookups.
5. **`bottle.exec(script)` exit-code fidelity.** The PRD 0022 test
   suite reads `returncode` + stdout + stderr from
   `ExecResult`. Confirm `smolvm machine exec` propagates exit
   codes and separated streams. The research note's
   "external integration is the CLI" implies yes, but the
   embedded SDK bug it flagged suggests we should verify before
   coding around it.
6. **CI gating.** Gitea's act_runner is Linux without nested KVM,
   so smolmachines integration tests will skip there too. The cause
   differs from the Docker bringup skips (no KVM on the runner,
   instead of a shared Docker socket without a shared filesystem),
   but the effect is the same: the runner cannot exercise the real
   isolation primitive. The skip predicate becomes
   `not (smolvm_available() and (platform.system() == "Darwin" or kvm_available()))`.
   CI coverage for this backend will come from local runs on the
   maintainer's macOS host until a Darwin runner is wired up;
   ack that as a known gap.
7. **Active bottle discovery.** Docker uses container labels to
   enumerate active bottles (`list_active` queries the daemon).
   smolmachines's enumeration story is `smolvm machine list`; the
   plan is to mirror the label scheme via Smolfile metadata
   (`labels = { "claude-bottle" = "1" }`-style entries, if the
   format supports it; otherwise via a deterministic name prefix
   `claude-bottle-<slug>`).
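
The skip predicate from question 6 can be written so that its logic is testable without a VM host. The sketch below is one possible shape: `smolvm_available` and `kvm_available` are named in the predicate above but their bodies here are assumptions (a `$PATH` lookup and a `/dev/kvm` access check), and `should_skip` is a hypothetical pure helper.

```python
import os
import platform
import shutil


def smolvm_available() -> bool:
    # Assumption: "available" means the binary resolves on $PATH.
    return shutil.which("smolvm") is not None


def kvm_available() -> bool:
    # Assumption: KVM is usable when /dev/kvm is readable and writable.
    return os.access("/dev/kvm", os.R_OK | os.W_OK)


def should_skip(smolvm: bool, system: str, kvm: bool) -> bool:
    """Pure form of the predicate, unit-testable on any host."""
    return not (smolvm and (system == "Darwin" or kvm))


def should_skip_here() -> bool:
    """Evaluate the predicate against the current host."""
    return should_skip(smolvm_available(), platform.system(), kvm_available())
```

A test module could then pass `should_skip_here()` to its skip marker, while the truth table of `should_skip` is checked separately in a unit test.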

## References

- `docs/research/smolmachines-as-vm-backend.md`: primary research
  note recommending this adoption; PRD 0023's design hypothesis.
- `docs/research/agent-vm-isolation.md`: the broader microVM /
  gvproxy / pipelock landscape this PRD lands inside of.
- `docs/research/agent-sandbox-landscape.md`: identifies
  `"runtime": "microvm"`-style opt-in as the borrowable idea;
  smolmachines is the concrete implementation.
- PRD 0003 (`docs/prds/0003-bottle-backend-abstraction.md`): the
  backend abstraction this PRD is the first non-Docker consumer
  of.
- PRD 0017 (`docs/prds/0017-egress-proxy-via-mitmproxy.md`): the
  egress sidecar the host-side relocation reuses verbatim, only
  with a different transport.
- PRD 0022
  (`docs/prds/0022-sandbox-escape-integration-test.md`): the
  acceptance gate for this PRD; the suite already runs through
  `get_bottle_backend()` so the env-var flip is the only change
  needed to exercise the smolmachines path.