bot-bottle/docs/prds/0023-smolmachines-backend.md
didericis-claude 2edc1abb9a
feat(smolmachines): per-bottle loopback alias scopes TSI to single /32
PR #74's Docker-Desktop fix routed the agent through
`127.0.0.1:<random>` loopback forwards, but TSI filters by IP
only — so the allowlist `127.0.0.1/32` let the agent VM reach
**any** host service on macOS loopback (postgres, dev servers,
other bottles' published ports, mDNSResponder, ...). Real
downgrade vs the docker backend's `--internal` network.

Resolution: per-bottle loopback alias.

- New `loopback_alias` module manages a pool of
  `127.0.0.16` .. `127.0.0.31` on `lo0`. macOS only routes
  `127.0.0.1` by default; the extras need `sudo ifconfig lo0
  alias`. `ensure_pool()` lazily adds the missing entries via
  one sudo prompt on first launch per reboot — aliases persist
  on `lo0` until reboot, so subsequent launches skip the
  prompt entirely.
- `allocate(slug)` picks the lowest-numbered unused alias by
  inspecting running bundle containers' port-binding HostIps.
  No on-disk reservation — docker is the source of truth.
- Bundle bringup binds published ports to the allocated alias
  (`docker run -p <alias>::<port>`) instead of `127.0.0.1`.
- TSI allowlist becomes the alias's /32 — narrows reachability
  to this bottle's bundle only.
- Linux native daemons share the host's network namespace;
  `127.0.0.0/8` works without aliases, so the module no-ops on
  non-Darwin and returns `127.0.0.1` from `allocate`.

Tracking issue closed: gitea/issues/75.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 16:23:17 -04:00


# PRD 0023: smolmachines bottle backend
- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-26
## Summary
Ship a second concrete `BottleBackend`, `SmolmachinesBottleBackend`
(selected via `CLAUDE_BOTTLE_BACKEND=smolmachines`), that runs each
bottle inside a per-agent libkrun microVM via `smolvm`. Egress is
enforced by
libkrun's TSI ("Transport Socket Interface") allowlist set to a
**single /32** — the docker IP of the per-bottle sidecar bundle
(PRD 0024) on a dedicated docker bridge. Everything else — host
loopback, LAN, public internet directly — is denied at the VMM
layer, before a host-side socket is ever opened.
The sidecar bundle is the same image PRD 0024 ships for the docker
backend; this PRD consumes it. Inside the bundle, pipelock /
git-gate / supervise bind `0.0.0.0:<port>` so the agent (reaching
the bundle via the allowed /32) can talk to them; egress (the
internal upstream of pipelock) binds `127.0.0.1:9099` so it's only
reachable from pipelock within the bundle — the agent can't dial
it directly even though TSI's allowlist is IP-granular rather than
port-granular.
The Docker backend ships unchanged; this is opt-in via the existing
env-var selector. The acceptance gate is PRD 0022's
`tests/integration/test_sandbox_escape.py` running green against
`CLAUDE_BOTTLE_BACKEND=smolmachines`.
### Design pivot from the first draft
The original PRD landed (PR #53) calling for **gvproxy** as the
network primitive — a userspace TCP/IP stack the guest's virtio-net
device would hook into via `VZFileHandleNetworkDeviceAttachment`,
with explicit `port_forwards` controlling what the guest could
reach. That design was built around the smolmachines research
note's claim that libkrun supports a virtio-net mode separate
from TSI.
Chunk 1's empirical spike against `smolvm 0.8.0`'s actual CLI
contradicted that claim: smolvm exposes only TSI-style egress
filters (`--allow-host`, `--allow-cidr`, `--outbound-localhost-only`),
with no documented option to attach virtio-net to a custom unixgram
socket. The gvproxy path would have required dropping smolvm
entirely and driving `Virtualization.framework` via PyObjC.
Re-examining the "why gvproxy" argument against smolvm's real surface,
the loopback gap the first draft of this PRD worried about only exists
with `--outbound-localhost-only`. With `--allow-cidr <bundle-ip>/32`
instead — and no `--outbound-localhost-only` — the agent can reach
exactly one IP (the bundle) and nothing else: not host loopback,
not LAN, not public internet. That's the same security property
the gvproxy design was chasing, enforced one layer lower (VMM
socket interception, not a userspace TCP/IP stack we maintain),
with significantly less code.
## Problem
`agent-vm-isolation.md` argues for hardware-isolated microVMs over
container-based bottles on macOS; `smolmachines-as-vm-backend.md`
evaluates smolmachines as the lifecycle wrapper. Today, the only
backend in the registry is Docker
(`claude_bottle/backend/__init__.py:_BACKENDS = {"docker": ...}`),
and four things motivate a second one now:
- **Network reach beyond pipelock.** The threat model is a malicious
agent attempting to dial something on the operator's *local
network* (`192.168.x.x` services, the home router, a coworker's
laptop on the same Wi-Fi) or *host's loopback* (a local database,
an editor plugin, another bottle's sidecar) without traversing
pipelock. The Docker backend's `--internal` network blocks the
first; nothing in the current stack blocks the second cleanly.
This PRD's design closes both gaps via TSI's
`--allow-cidr <bundle-ip>/32`: the guest can only dial that one
IP, period. Host loopback, LAN, and the public internet are
refused at the VMM layer.
- **Isolation ceiling.** On macOS the Docker backend's agent
container shares Docker Desktop's host VM with every other
bottle. Container escape from claude-code lands the agent inside
that shared VM. A per-bottle microVM gets hardware page tables
via `Hypervisor.framework`; cross-bottle isolation becomes
enforced by the CPU's MMU instead of namespace bookkeeping.
- **PRD 0022 is backend-agnostic by design** but currently only
exercises the Docker backend. The suite was written with
`CLAUDE_BOTTLE_BACKEND` selection in mind precisely so the
smolmachines path could be validated against the same five
attacks. Until a second backend exists, the abstraction is
unproven.
- **CI carve-outs.** Most bottle-bringup integration tests skip
under `GITEA_ACTIONS=true` because act_runner shares the host
Docker socket but not the host filesystem. A microVM path
has different constraints of its own, so adding the backend
forces the abstraction to be clean in places where
Docker-specific assumptions have been tolerated.
## How TSI's single-IP allowlist achieves the property
libkrun's TSI hijacks guest socket syscalls inside the VMM and
opens the actual sockets from the host process, gated by a CIDR
allowlist. Three flags expose the allowlist:
- `--outbound-localhost-only` — opens up the whole `127.0.0.0/8`
range, all ports. This is the flag the first draft of this PRD
rejected, and we still reject it: it would let the agent dial
any host-loopback service (local Postgres, IDE plugins, another
bottle's sidecar).
- `--allow-cidr CIDR` — IP/CIDR allowlist with no port filter.
- `--allow-host HOSTNAME` — resolves the host on the host's DNS
at VM-start time, stores the result as `/32` CIDRs, and also
enables guest-side DNS filtering (only the allowed hostname
resolves).
This backend uses `--allow-cidr <bundle-ip>/32` (single host) and
nothing else. With the bundle running as a docker container with a
known IP on a dedicated docker bridge, the agent can reach exactly
one address: the bundle. Host loopback is denied (not in the
allowlist). LAN is denied. Public internet directly is denied. DNS
inside the guest is denied (no resolver in the allowlist) — the
agent uses an IP literal for `HTTPS_PROXY`.
The one wrinkle TSI doesn't directly handle is **port granularity
within the allowed IP**. The bundle runs four daemons; pipelock /
git-gate / supervise are agent-facing, egress is pipelock's
internal upstream. If egress were bound to `0.0.0.0:9099` inside
the bundle, the agent could dial `<bundle-ip>:9099` and bypass
pipelock's DLP. We mitigate by binding egress to `127.0.0.1:9099`
*inside* the bundle so only pipelock — also in the bundle, on the
same localhost — can reach it. The bind-address strategy gives us
port-level isolation that TSI's IP-only allowlist doesn't.
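To make the IP-only semantics concrete, here is a minimal Python model of an allowlist check. It is not libkrun's implementation; `tsi_allows` and the `192.168.42.2` bundle address are hypothetical names chosen for illustration. What it shows is that the port never enters the decision, which is why the bind-address mitigation above is needed.

```python
import ipaddress


def tsi_allows(dest_ip: str, allow_cidrs: list[str]) -> bool:
    """Return True if dest_ip falls inside any allowed CIDR.

    Illustrative model only: like the allowlist described above, the
    decision is made on the IP alone and the port is never consulted.
    """
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allow_cidrs)


# Hypothetical bundle address on a per-bottle bridge.
BUNDLE_ONLY = ["192.168.42.2/32"]
# The range the text says --outbound-localhost-only would open.
LOCALHOST_ONLY = ["127.0.0.0/8"]

if __name__ == "__main__":
    for dest in ("192.168.42.2", "127.0.0.1", "192.168.42.1", "8.8.8.8"):
        print(dest, tsi_allows(dest, BUNDLE_ONLY))
```

Under this model `<bundle-ip>:8888` and `<bundle-ip>:9099` are indistinguishable to the allowlist, so only the `127.0.0.1` bind inside the bundle keeps egress out of the agent's reach.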
Net result: same security property the first draft chased with
gvproxy, enforced at the VMM layer rather than via a userspace
TCP/IP stack, with significantly less code (no gvproxy lifecycle,
no `VZFileHandleNetworkDeviceAttachment` plumbing, no Smolfile
virtio-net carve-out smolvm doesn't expose anyway).
## Goals / Success Criteria
The feature works when all of the following are observable on a
macOS host with smolmachines installed:
- `CLAUDE_BOTTLE_BACKEND=smolmachines python3 cli.py start <agent>`
brings up a microVM, runs claude-code inside it, and tears it
down on exit. Same y/N preflight UX as Docker — only the
resolved-runtime line differs.
- The sandbox-escape suite in `tests/integration/test_sandbox_escape.py`
runs green against the smolmachines backend (all five attack
categories blocked).
- Selecting the backend on a host without `smolvm` installed dies
at startup with an install pointer; no silent fall-through to
Docker.
- Active bottles show up under
`python3 cli.py list-bottles` regardless of backend.
- `python3 cli.py stop <bottle>` and orphan cleanup work for both
Docker bottles and smolmachines bottles via the same CLI surface.
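The "dies at startup with an install pointer" criterion could look roughly like the sketch below. `preflight_smolvm` and the hint text are hypothetical; the real preflight may word the message differently and will also need the version check discussed later. The `which` parameter is injected only so the behaviour can be exercised without touching the real `$PATH`.

```python
import shutil
import sys

# Hypothetical wording; the actual install pointer is still an open question.
INSTALL_HINT = (
    "smolvm not found on PATH; install it before selecting "
    "CLAUDE_BOTTLE_BACKEND=smolmachines"
)


def preflight_smolvm(which=shutil.which) -> str:
    """Return the resolved smolvm path, or exit with an install pointer.

    There is deliberately no fallback to another backend here: a missing
    binary ends the process instead of silently switching to Docker.
    """
    path = which("smolvm")
    if path is None:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(INSTALL_HINT)
    return path
```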
The feature is **done** when all of the following ship:
- A new `claude_bottle/backend/smolmachines/` subpackage exists,
mirroring the layout of `claude_bottle/backend/docker/`
(`backend.py`, `bottle.py`, `bottle_plan.py`,
`bottle_cleanup_plan.py`, `prepare.py`, `launch.py`,
`cleanup.py`, `util.py`, and a `provision/` subpackage for the
five `provision_*` methods).
- `SmolmachinesBottleBackend` registered under the
`"smolmachines"` key in `claude_bottle/backend/__init__.py:_BACKENDS`.
- Per-bottle Smolfile generation: a runtime-rendered TOML written
to the bottle's stage dir using smolvm 0.8.0's actual schema
(`image`, `entrypoint`, `cmd`, `env = ["K=V", …]`, `[network]
allow_cidrs = ["<bundle-ip>/32"]`). The renderer chunk 1
shipped emits the wrong shape (built around the gvproxy
unixgram attachment) — it gets rewritten in this chunk plan as
the cost of the design pivot.
- Per-bottle docker bridge for the bundle: the sidecar bundle
runs as a docker container on a dedicated per-bottle bridge
network with a pinned IP (`--ip <bundle-ip>` against a
per-slug `/24` derived from the slug hash). The pinned IP is
what TSI's allowlist points at; without pinning we'd need to
inspect the running container's IP and feed it back into the
Smolfile, which is a race.
- Per-bottle sidecar bundle: one container per bottle running the
bundle image defined in PRD 0024. pipelock / git-gate /
supervise bind `0.0.0.0:<port>` so the agent (reaching the
bundle via the allowed /32) can reach them. egress binds
`127.0.0.1:9099` inside the bundle so only pipelock can reach
it — the agent sees `<bundle-ip>:9099` refuse the connection
even though TSI's allowlist permits the IP. The agent's
environment carries IP-literal URLs (e.g.
`HTTPS_PROXY=http://<bundle-ip>:8888`).
- The agent guest image is produced from the existing `Dockerfile`
via `smolvm pack create` → `.smolmachine` artifact, then loaded
into smolvm via `machine create --from <path>`. The image build
step is part of `prepare`, analogous to
`docker_mod.build_image`.
- The PRD 0022 sandbox-escape suite, run with
`CLAUDE_BOTTLE_BACKEND=smolmachines`, passes locally on a
smolmachines-capable host. The suite is updated to skip cleanly
on hosts that can't reach smolmachines (same shape as the
existing `GITEA_ACTIONS == "true"` skip), not to fail.
- README + `CLAUDE.md` updated to document the env-var selection,
the macOS-only scope for v1, and the `smolvm` install
prerequisite.
## Non-goals
- **No Linux KVM support shipped in this PRD.** smolmachines works
on Linux via KVM, but the abstraction win is biggest on macOS
where Docker's shared-VM topology hurts most. Linux can come
later behind the same selector.
- **No removal of the Docker backend.** Both backends ship side by
side. Selection stays env-driven; the manifest does not gain a
`backend` field.
- **No default-backend change.** `docker` remains the default
value of `CLAUDE_BOTTLE_BACKEND`; smolmachines is strictly
opt-in until it has been load-bearing on at least one operator's
workflow for a release cycle.
- **No `--outbound-localhost-only`.** That TSI flag opens the
entire `127.0.0.0/8` range and is the loopback gap the original
draft of this PRD called out. Use `--allow-cidr <bundle-ip>/32`
instead so the agent reaches one IP and one IP only.
- **No gvproxy.** Rejected after the chunk-1 spike against the
real smolvm CLI: smolvm 0.8.0 exposes no virtio-net-over-unixgram
attachment. Adopting gvproxy would have required dropping smolvm
and driving Virtualization.framework via PyObjC; the TSI
single-IP approach gives the same property at a fraction of the
cost.
- **No host bind mounts.** The smolmachines research note flagged
that `-v HOST:GUEST` mounts via virtiofs would defeat the
isolation goal. The manifest already has no concept of host
mounts; this PRD does not introduce one. If a future PRD wants
agent-side access to host files, it must come through a
controlled channel (vsock relay, OCI overlay, supervise sidecar
endpoint).
- **No HTTP API mode.** `smolvm serve` is the long-term-clean
control plane, but v1 drives smolmachines via CLI subprocess
invocations — the lower-overhead first iteration the research
note already endorses.
- **No custom kernel / initrd.** smolmachines uses libkrunfw
only; the agent image is an OCI ref, not a kernel + rootfs pair.
- **No warm-pool or snapshot/restore.** Each bottle gets a fresh
microVM; cold-start cost is paid up front.
- **No supervise/agent-credential rewrites for the new backend.**
Provisioning logic ports as-is; only the *transport* (IP-literal
bundle URLs instead of in-network DNS names) changes.
## Scope
### In scope
- New `claude_bottle/backend/smolmachines/` subpackage with the
full set of `BottleBackend` overrides.
- Smolfile generator (TOML) emitting the smolvm 0.8.0 schema:
top-level `image`, `entrypoint`, `cmd`, `env = [...]`,
`[network] allow_cidrs = ["<bundle-ip>/32"]`. (The renderer
that chunk 1 shipped under the gvproxy design — `name=`,
`[[net]]` — gets rewritten as part of this chunk plan.)
- A host-side sidecar-bundle lifecycle manager that brings up
one container per bottle on a dedicated per-bottle docker
bridge with a pinned IP (`--ip <bundle-ip>`), waits for the
daemons to bind their ports, and tears it down with the bottle.
This backend depends on PRD 0024's bundle image; it does not
own the bundle's Dockerfile or init.
- Per-bottle CA install path: the bundle's CA cert lands inside
the microVM via `smolvm machine exec` after start
(analogous to the existing `provision_ca` for Docker).
- Per-bottle docker bridge: a `claude-bottle-bundle-<slug>`
network with a /24 subnet derived from the slug hash; the
bundle gets a pinned IP at `.2` (gateway is `.1`). Pinning the
IP at start time avoids a race between the bundle's IP being
assigned and the Smolfile being written.
- TSI policy: the Smolfile sets `[network] allow_cidrs =
["<bundle-ip>/32"]` and nothing else. The agent can reach the
bundle's IP (any port) and nothing else; no DNS resolution is
available inside the guest, so the agent uses IP-literal URLs.
- Bundle bind addresses: egress binds `127.0.0.1:9099` inside
the bundle (pipelock-only); pipelock / git-gate / supervise
bind `0.0.0.0` so the agent can reach them. This is the
port-granularity TSI's IP-only allowlist doesn't provide.
PRD 0024's bundle init may need a config knob for this;
raised as open question 4.
- Preflight `smolvm` check: if the user selects this backend and
`smolvm` isn't on `$PATH`, die with an install pointer (brew tap
+ version pin TBD in implementation; see open question 2).
- Manifest validation: refuse any bottle field this backend can't
honor (today there are none, since the Docker backend already
rejects host mounts; this is a forward-compat check).
- Tests:
  - Smoke unit-level test: Smolfile renderer produces the
    expected TOML for a fixture bottle (smolvm 0.8.0 shape).
  - Integration test: `prepare → launch → exec("echo hi") →
    teardown` on a smolmachines-capable host (skips otherwise
    via the same env/platform gate the Docker integration tests
    use).
  - PRD 0022 suite, re-run with the env var flipped, passes.
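As a rough illustration of the Smolfile generator in scope, the sketch below emits the keys this PRD names (`image`, `entrypoint`, `cmd`, `env`, `[network] allow_cidrs`). `render_smolfile` is a hypothetical function, and treating `entrypoint` and `cmd` as string arrays is an assumption: the PRD lists the keys but not their value types. String quoting leans on `json.dumps`, which is adequate for plain ASCII values but is not a general TOML encoder.

```python
import json


def _toml_str(value: str) -> str:
    # json.dumps yields a double-quoted, backslash-escaped string, which is
    # also a valid TOML basic string for plain ASCII values.
    return json.dumps(value)


def _toml_array(items) -> str:
    return "[" + ", ".join(_toml_str(item) for item in items) + "]"


def render_smolfile(image, entrypoint, cmd, env, bundle_ip) -> str:
    """Render a Smolfile with a single /32 under [network] (sketch only)."""
    env_pairs = [f"{key}={value}" for key, value in env.items()]
    lines = [
        f"image = {_toml_str(image)}",
        f"entrypoint = {_toml_array(entrypoint)}",  # assumed to be an array
        f"cmd = {_toml_array(cmd)}",                # assumed to be an array
        f"env = {_toml_array(env_pairs)}",
        "",
        "[network]",
        f"allow_cidrs = {_toml_array([bundle_ip + '/32'])}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_smolfile(
        "claude-bottle:test",
        ["/bin/sh"],
        ["-c", "echo hi"],
        {"HTTPS_PROXY": "http://192.168.42.2:8888"},
        "192.168.42.2",
    ))
```

Keeping the renderer a pure function of its inputs makes the unit test a plain string comparison against a fixture.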
### Out of scope
- VM image caching across bottles (each prepare rebuilds from the
OCI archive; layer reuse is whatever smolmachines provides).
- Cross-host bottle relocation (the OCI archive is local-only).
- Operator-facing knobs for vCPU / memory / overlay size (use
sensible defaults; expose as manifest fields in a later PRD if
needed).
- Integration with the `supervise` plane's permission-prompt UX
beyond port plumbing; supervise already speaks HTTP and binds
to whatever address the backend hands it (`0.0.0.0` inside the
bundle for this backend).
## Proposed Design
### Backend layout
```
claude_bottle/backend/smolmachines/
  __init__.py             re-exports SmolmachinesBottleBackend
  backend.py              SmolmachinesBottleBackend façade
  bottle.py               SmolmachinesBottle (exec_claude / exec / cp_in / close)
  bottle_plan.py          SmolmachinesBottlePlan + .print()
  bottle_cleanup_plan.py  SmolmachinesBottleCleanupPlan
  prepare.py              resolve_plan(spec, stage_dir, ...) -> SmolmachinesBottlePlan
  launch.py               @contextmanager launch(plan) -> SmolmachinesBottle
  cleanup.py              prepare_cleanup / cleanup / list_active
  smolfile.py             bottle_plan_to_smolfile(...) -> dict + render
  sidecar_bundle.py       host-side bundle lifecycle (per-bottle docker bridge + pinned IP)
  smolvm.py               thin subprocess wrapper: machine create/start/exec/stop, pack create
  util.py                 slugify, subnet derivation, OCI archive helpers
  provision/              ca.py, prompt.py, skills.py, git.py, supervise.py
```
Note what's NOT here vs. the original draft: `gvproxy.py`,
`vfkit_attach.py`. The gvproxy design needed both; the TSI single-IP
design needs neither.
### Network + egress topology
```
┌── macOS host ─────────────────────────────────────────────────────┐
│                                                                   │
│  ┌── per-bottle docker bridge claude-bottle-bundle-<slug> ──┐     │
│  │  subnet: 192.168.X.0/24 (X = hash(slug) mod 254)         │     │
│  │                                                          │     │
│  │ ┌── bundle container (pinned --ip 192.168.X.2) ────────┐ │     │
│  │ │ init.py (PRD 0024 Python supervisor)                 │ │     │
│  │ │ ├─ pipelock            (binds 0.0.0.0:8888)          │ │     │
│  │ │ ├─ egress (mitmproxy)  (binds 127.0.0.1:9099)        │ │     │
│  │ │ ├─ git-gate            (binds 0.0.0.0:9418)          │ │     │
│  │ │ └─ supervise           (binds 0.0.0.0:9100)          │ │     │
│  │ │ Internal-only egress is unreachable from outside     │ │     │
│  │ │ the bundle even though TSI permits the IP.           │ │     │
│  │ └──────────────────────────────────────────────────────┘ │     │
│  └──────────────────────────────────────────────────────┬───┘     │
│                                                         │         │
│  ┌── microVM (per bottle, libkrun via smolvm) ──────────▼─┐       │
│  │ Smolfile: [network] allow_cidrs = ["192.168.X.2/32"]   │       │
│  │ env: HTTPS_PROXY=http://192.168.X.2:8888               │       │
│  │      GIT_GATE_URL=git://192.168.X.2:9418 (cond.)       │       │
│  │      MCP_SUPERVISE_URL=http://192.168.X.2:9100 (cond)  │       │
│  │ No other host reachable: TSI denies any connect()      │       │
│  │ that isn't to 192.168.X.2. No DNS inside the guest     │       │
│  │ (no resolver in the allowlist).                        │       │
│  └────────────────────────────────────────────────────────┘       │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘
```
What the guest can reach, exhaustively: **only `<bundle-ip>` on
ports the bundle binds to 0.0.0.0**. Egress's 127.0.0.1-only bind
makes it bundle-internal; host loopback / LAN / public internet
direct are all refused by TSI's allowlist.
Three changes vs. the Docker backend:
1. **One sidecar container per bottle, not four.** Same bundle
image PRD 0024 ships for the docker backend.
2. **Sidecar container is on a per-bottle docker bridge with a
pinned IP**, reached directly by the smolvm guest's allowed
/32 — no localhost port allocation, no userspace TCP/IP stack
in the middle.
3. **The agent dials IP literals, not hostnames.** TSI doesn't
filter DNS at the protocol level, and we don't put DNS
resolvers in the allowlist, so name resolution is denied by
construction.
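The IP-literal requirement in item 3 can be enforced when the agent environment is assembled. The sketch below is illustrative: `agent_env` is a hypothetical helper, while the variable names and ports are the ones shown in the topology diagram. Passing a hostname raises `ValueError`, since a name the guest cannot resolve would only fail later and less obviously.

```python
import ipaddress


def agent_env(bundle_ip: str, git_gate: bool = False, supervise: bool = False) -> dict:
    """Build the agent's service URLs from the bundle's pinned IP.

    ipaddress.ip_address raises ValueError for anything that is not an IP
    literal, so a hostname can never slip into the guest environment.
    """
    ip = str(ipaddress.ip_address(bundle_ip))
    env = {"HTTPS_PROXY": f"http://{ip}:8888"}
    if git_gate:       # conditional in the diagram
        env["GIT_GATE_URL"] = f"git://{ip}:9418"
    if supervise:      # conditional in the diagram
        env["MCP_SUPERVISE_URL"] = f"http://{ip}:9100"
    return env


if __name__ == "__main__":
    print(agent_env("192.168.42.2", git_gate=True))
```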
### Lifecycle
`SmolmachinesBottleBackend.prepare(spec, stage_dir)`:
1. Cross-backend validation via `BottleBackend._validate` (skills,
git identity files).
2. Derive a per-bottle docker subnet from `sha256(slug) % 254`
(skipping the docker-default 17): `192.168.X.0/24`. The bundle
IP is always `192.168.X.2` (gateway is `.1`).
3. Resolve the agent guest image: `docker build` the existing
`Dockerfile`, then convert the resulting image into a
`.smolmachine` artifact. Empirically `smolvm pack create` only
reads OCI registry refs — it rejects `docker-daemon://`,
`oci-layout://`, `docker-archive:` tarballs, and every other
transport tested. The conversion path is a registry hop: bring
up an ephemeral `registry:2.8.3` container bound to
`127.0.0.1:<random>`, `docker tag` + `docker push` into it,
`smolvm pack create --image localhost:<port>/claude-bottle:<id>`,
tear down the registry. The `.smolmachine` is cached under
`~/.cache/claude-bottle/smolmachines/` keyed by the docker
image ID, so Dockerfile changes invalidate the cache and
unchanged rebuilds skip the whole pipeline.
4. Render the per-bottle Smolfile to `stage_dir/smolfile.toml`
using smolvm 0.8.0's schema:
   - `image` / `entrypoint` / `cmd`: bundled into the
     `.smolmachine` from the previous step (one Smolfile, one
     artifact).
   - `env = [...]`: `HTTPS_PROXY`, `NO_PROXY`, `NODE_EXTRA_CA_CERTS`,
     etc., all pointing at IP-literal URLs (`http://192.168.X.2:8888`).
   - `[network] allow_cidrs = ["192.168.X.2/32"]`: TSI's single
     /32 allowlist.
5. Resolve the in-VM CA paths so launch knows where to copy
pipelock's CA after start.
6. Return a `SmolmachinesBottlePlan` carrying the slug, bundle
subnet/IP, `.smolmachine` artifact path, Smolfile path, and
bundle run spec.
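Step 2's subnet derivation might be sketched as follows. `bundle_subnet` is a hypothetical name, and how the value 17 is skipped is an assumption (here it is bumped to 18); the shipped helper in `util.py` also handles collisions between live bottles, which this sketch does not attempt.

```python
import hashlib
import ipaddress


def bundle_subnet(slug: str) -> tuple[str, str]:
    """Derive (subnet, bundle_ip) for a bottle slug, deterministically.

    Follows the sha256(slug) % 254 rule from step 2. Bumping 17 to 18 is
    an assumed way of skipping that value, not the documented behaviour.
    """
    digest = hashlib.sha256(slug.encode("utf-8")).digest()
    octet = int.from_bytes(digest, "big") % 254
    if octet == 17:
        octet = 18
    network = ipaddress.ip_network(f"192.168.{octet}.0/24")
    bundle_ip = network.network_address + 2  # .1 is the gateway, .2 the bundle
    return str(network), str(bundle_ip)


if __name__ == "__main__":
    print(bundle_subnet("my-agent"))
```

Because the result depends only on the slug, `prepare` can write the Smolfile before any docker network exists, which is the point of pinning the IP.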
`SmolmachinesBottleBackend.launch(plan)`:
1. Create the per-bottle docker bridge network
(`claude-bottle-bundle-<slug>` with the resolved subnet) and
start the sidecar bundle container with `docker run --network
... --ip <bundle-ip> ...`. Wait for its daemons to bind:
pipelock on 8888, git-gate on 9418 (conditional), supervise
on 9100 (conditional). Register teardown callbacks.
2. `smolvm machine create --from <stage>/agent.smolmachine
--smolfile <stage>/smolfile.toml <name>` and
`smolvm machine start --name <name>`. The Smolfile's TSI
allowlist gates outbound to the bundle's /32; libkrun's TSI
layer enforces it.
3. Provisioning: CA install → prompt → skills → git → supervise
config, each via `smolvm machine exec` / `smolvm machine cp`.
4. Yield a `SmolmachinesBottle` whose `exec_claude` / `exec` /
`cp_in` all funnel through `smolvm machine exec` /
`smolvm machine cp`.
5. Teardown: stop and delete the VM → stop + remove the bundle
container → remove the per-bottle docker network.
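The bringup and teardown ordering above maps naturally onto `contextlib.ExitStack`, which unwinds callbacks in reverse registration order. The sketch below is a hedged illustration: the `plan` dictionary keys and the injected `run` callable are hypothetical, provisioning and port-waiting are omitted, and the `smolvm machine stop --name` argv is an assumption (this PRD names a `stop` subcommand but not its flags, and VM deletion is left out for the same reason).

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def launch(plan: dict, run):
    """Bring up network, bundle, then VM; tear down in reverse on exit."""
    with ExitStack() as stack:
        run(["docker", "network", "create",
             "--subnet", plan["subnet"], plan["network"]])
        stack.callback(run, ["docker", "network", "rm", plan["network"]])

        run(["docker", "run", "-d", "--name", plan["bundle"],
             "--network", plan["network"], "--ip", plan["bundle_ip"],
             plan["bundle_image"]])
        stack.callback(run, ["docker", "rm", "-f", plan["bundle"]])

        run(["smolvm", "machine", "create", "--from", plan["artifact"],
             "--smolfile", plan["smolfile"], plan["vm"]])
        run(["smolvm", "machine", "start", "--name", plan["vm"]])
        # Assumed argv for stopping the VM; verify against the real CLI.
        stack.callback(run, ["smolvm", "machine", "stop", "--name", plan["vm"]])

        yield plan["vm"]
    # Leaving the ExitStack runs callbacks LIFO: VM, bundle, network.


if __name__ == "__main__":
    demo = {
        "subnet": "192.168.42.0/24", "network": "claude-bottle-bundle-demo",
        "bundle": "claude-bottle-bundle-demo", "bundle_ip": "192.168.42.2",
        "bundle_image": "claude-bottle-sidecars:dev",
        "artifact": "agent.smolmachine", "smolfile": "smolfile.toml",
        "vm": "claude-bottle-demo",
    }
    with launch(demo, run=lambda argv: print(" ".join(argv))):
        print("# bottle is up")
```

Passing `subprocess.run` (wrapped to check return codes) as `run` would turn the dry run into a real one; a failure partway through bringup still unwinds whatever was already registered.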
### Data model
No manifest schema change. `bottles[]` continues to carry
`egress.allowlist`, `env`, `git`, `skills` references, etc.; the
smolmachines backend reads the same fields as the docker backend.
`egress.allowlist` is enforced by pipelock inside the bundle
(unchanged from the docker backend); the guest has no DNS resolver
in TSI's allowlist, so an agent that tries to dial an arbitrary
hostname can't resolve it in the first place — the DNS-exfil
attack from PRD 0022 test 4 is blocked at the resolver step.
The `BottleSpec` dataclass and the `Bottle` ABC do not change.
### Selection wiring
In `claude_bottle/backend/__init__.py`:
```python
from .docker import DockerBottleBackend
from .smolmachines import SmolmachinesBottleBackend
_BACKENDS: dict[str, BottleBackend[Any, Any]] = {
"docker": DockerBottleBackend(),
"smolmachines": SmolmachinesBottleBackend(),
}
```
The existing "unknown backend" `die()` path stays as-is.
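For readers unfamiliar with the selector, a self-contained approximation of the lookup is shown below. The two backend classes are empty stand-ins and `SystemExit` stands in for the project's `die()` helper; only the registry keys, the `CLAUDE_BOTTLE_BACKEND` variable, the `docker` default, and the `get_bottle_backend()` name come from this PRD.

```python
import os


class DockerBottleBackend:          # stand-in for the real class
    pass


class SmolmachinesBottleBackend:    # stand-in for the real class
    pass


_BACKENDS = {
    "docker": DockerBottleBackend(),
    "smolmachines": SmolmachinesBottleBackend(),
}


def get_bottle_backend(environ=os.environ):
    """Resolve the backend from CLAUDE_BOTTLE_BACKEND, defaulting to docker."""
    name = environ.get("CLAUDE_BOTTLE_BACKEND", "docker")
    try:
        return _BACKENDS[name]
    except KeyError:
        # The real code calls die(); SystemExit approximates that here.
        known = ", ".join(sorted(_BACKENDS))
        raise SystemExit(f"unknown backend {name!r}; known backends: {known}")
```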
### External dependencies
- `smolvm` CLI binary on `$PATH` (one new external dep, gated by
the preflight check). Pinned version policy is deferred to the
open questions; v1 reads `smolvm --version` and refuses to launch
outside a known-good range (currently 0.8.x).
- No `gvproxy` dep (the original draft listed it; dropped after
the chunk-1 spike).
- No `pyobjc-framework-Virtualization` dep (dropped from the
original draft for the same reason).
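- No new pure-Python packages: subprocess plus the stdlib. Note
that `tomllib` only parses TOML and has no writer, so Smolfile
authoring needs a small hand-written emitter; `tomllib` remains
useful for round-trip checks in tests.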
### Acceptance test plan
- **Unit (smolfile):** `tests/unit/test_smolfile.py` verifies the
renderer produces the expected TOML for a fixture bottle in
smolvm 0.8.0's schema — top-level `image` / `entrypoint` /
`cmd` / `env`, plus `[network] allow_cidrs = ["<bundle-ip>/32"]`
and nothing else under `[network]`.
- **Unit (subnet derivation):** the existing
`test_smolmachines_util.py` covers the per-bottle subnet hash
+ collision-avoidance and stays as-is.
- **Integration smoke:** `tests/integration/test_smolmachines_smoke.py`
with `prepare → launch → exec → teardown`, guarded by a
`smolvm` presence check + macOS / KVM platform check.
- **Localhost-reach probe:** a focused integration test that
brings up a bottle, has the host bind a test service on
`127.0.0.1:<unused-port>`, and asserts the in-bottle agent
cannot connect to it. This is the regression test for the
exact gap `--outbound-localhost-only` would have introduced —
with `--allow-cidr <bundle-ip>/32` only, the probe must fail.
- **Egress-port-bypass probe:** also brings up a bottle and
asserts the in-bottle agent's connect to `<bundle-ip>:9099`
(egress's port) is refused — confirming the bundle-internal
bind of egress to `127.0.0.1` works as the port-granularity
layer TSI doesn't provide.
- **PRD 0022 re-run:** with `CLAUDE_BOTTLE_BACKEND=smolmachines`,
all five attack categories return sandbox-block markers and the
suite passes. The test code does not change beyond the env-var
flip — that's the contract the PRD 0022 abstraction was
designed for.
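Both probes in the plan reduce to "can this TCP connect succeed?". A minimal sketch of the two halves is below; `can_connect` and `listen_on_loopback` are hypothetical helper names. In the real tests `can_connect` would run inside the bottle (for example through `bottle.exec`) and is expected to return `False`, whereas the demo here runs both halves on one machine purely to show the helpers work.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def listen_on_loopback():
    """Bind a throwaway listener on 127.0.0.1 and return (socket, port)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick an unused port
    server.listen(1)
    return server, server.getsockname()[1]


if __name__ == "__main__":
    server, port = listen_on_loopback()
    try:
        # On the host itself this succeeds; from inside a correctly
        # configured bottle the same call should fail.
        print("host-side connect:", can_connect("127.0.0.1", port))
    finally:
        server.close()
```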
## Sizing — into chunks
PRD 0024's bundle image is a prerequisite — this PRD assumes
`claude-bottle-sidecars:<pinned>` is available when chunk 3 lands.
1. **Backend skeleton + selection + Smolfile/gvproxy renderers.**
*Shipped (PR #62), but under the now-rejected gvproxy design.*
The Smolfile renderer emits `name = …` / `[[net]]` instead of
smolvm 0.8.0's `image` / `[network] allow_cidrs`. The gvproxy
renderer is dead. Chunk 2 rewrites the Smolfile renderer and
deletes `gvproxy_config.py` / its tests.
2. **VM lifecycle + bundle bringup + Smolfile rewrite.**
`smolvm.py` subprocess wrapper, prepare-time image conversion
(`smolvm pack create` → `.smolmachine`), per-bottle docker
bridge + bundle container with pinned IP, launch path that
starts the bundle and brings up the VM (`smolvm machine create
--from --smolfile`), exec into the VM, tear everything down.
Smoke integration test: `exec("echo hi")` inside a started
VM. Includes the localhost-reach probe + egress-port-bypass
probe from the acceptance plan. The chunk-1 Smolfile renderer
gets rewritten to the smolvm 0.8.0 schema; `gvproxy_config.py`
and `gvproxy.py` (if any) get deleted.
3. **Bundle bind-address mitigation.** Update PRD 0024's bundle
init to bind egress on `127.0.0.1:9099` instead of `0.0.0.0`
(or expose a config knob — open question 4). Reverify the
egress-port-bypass probe. Pipelock / git-gate / supervise
continue to bind `0.0.0.0`.
4. **Provisioning parity with Docker.** CA install via
`smolvm machine cp`, prompt/skills/.git copy-in, supervise
MCP config. End-to-end `start` works for a real agent
manifest.
5. **PRD 0022 sandbox-escape suite green.** Skip-guard update,
small adjustments to test helpers if any (the test uses
`bottle.exec(script)` and inspects `returncode` + body for
sandbox markers — should be transport-agnostic, but verify).
Document the macOS-only scope in README.
## Open questions
1. **~~VMM choice~~ Resolved.** Chunk-1 spike against `smolvm
0.8.0` confirmed there's no virtio-net-over-unixgram option;
the gvproxy design isn't viable on top of smolvm. Resolved
by switching to TSI `--allow-cidr <bundle-ip>/32` + bundle
bind-address mitigation; smolvm stays as the VMM. See the
"Design pivot from the first draft" section.
2. **`smolvm` install policy.** Pin via brew / `curl install.sh`,
or vendor a binary in the repo. v1 likely runs
`smolvm --version` at preflight and accepts a documented range
(currently 0.8.x). The
`curl -sSL https://smolmachines.com/install.sh | sh` path is
what the operator used; document it in the README.
3. **CA install inside the agent guest.** Two paths: bake at
prepare time (one `.smolmachine` artifact per CA fingerprint,
big cache key) vs. inject at start time via `smolvm machine
cp` after the VM is up. PRD 0006 chose the runtime path for
Docker (docker-cp + `update-ca-certificates`); smolvm has the
same shape via `machine cp` + `machine exec`. Default to
runtime injection.
4. **Bundle bind-address knob.** PRD 0024's bundle currently runs
all four daemons under one supervisor with daemon argv
hardcoded. To make egress bind `127.0.0.1:9099` instead of
`0.0.0.0:9099`, either: (a) edit the supervisor's
`_DAEMONS` entry to pass a `--listen-host 127.0.0.1` flag to
mitmdump, OR (b) introduce a per-daemon `bind_localhost`
knob the renderer can set. Option (a) is simpler and matches
that egress is bundle-internal regardless of backend; resolve
in chunk 3.
5. **`bottle.exec(script)` exit-code fidelity.** The PRD 0022 test
suite reads `returncode` + stdout + stderr from
`ExecResult`. Confirm `smolvm machine exec` propagates exit
codes and separated streams. The CLI help mentions a
`--stream` flag for streaming output; behavior under default
(non-stream) mode is what we want — verify in chunk 2.
6. **CI gating.** Gitea's act_runner is Linux without nested KVM,
so this backend's integration tests will skip there for the
same structural reason the Docker bringup tests do. The skip
predicate becomes `not (smolvm_available() and
platform.system() == "Darwin")`. CI coverage for this backend
will come from local runs on the maintainer's macOS host
until a Darwin runner is wired up; ack that as a known gap.
7. **Active bottle discovery.** Docker uses container labels to
enumerate active bottles (`list_active` queries the daemon).
The microVM enumeration story is `smolvm machine ls --json`;
the plan is to filter on a deterministic name prefix
`claude-bottle-<slug>` + cross-reference with on-disk metadata
under `state/<slug>/`.
8. **~~Loopback scoping (Docker Desktop pivot).~~ Resolved.**
Each bottle now allocates a per-bottle loopback alias from a
pool of `127.0.0.16` .. `127.0.0.31`, binds the bundle's
port-forwards to that alias, and sets TSI's allowlist to the
alias's /32. So a smolmachines bottle can only reach its own
bundle's published ports — not other bottles' ports, and not
unrelated host services on `127.0.0.1`. macOS loopback
aliases need `sudo ifconfig lo0 alias`; the launcher lazily
adds missing pool entries on first launch per reboot (sudo
prompts once, aliases persist until reboot). Linux native
daemons share the host's network namespace and skip the
alias dance.
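The allocation described in item 8 can be illustrated with a few lines of Python. This is a sketch under assumptions: `allocate` here takes the set of aliases already in use as an argument, whereas the real module would have to discover them itself, and picking the lowest unused address is one reasonable policy rather than a documented guarantee.

```python
import ipaddress

# 127.0.0.16 .. 127.0.0.31 inclusive: sixteen addresses.
POOL = [str(ipaddress.ip_address("127.0.0.16") + offset) for offset in range(16)]


def allocate(in_use: set[str]) -> str:
    """Return the lowest-numbered pool alias that is not already in use."""
    for alias in POOL:
        if alias not in in_use:
            return alias
    raise RuntimeError("loopback alias pool exhausted")


def tsi_allowlist(alias: str) -> list[str]:
    """The single /32 a bottle bound to this alias would be allowed to reach."""
    return [f"{alias}/32"]


if __name__ == "__main__":
    taken = {"127.0.0.16", "127.0.0.17"}
    alias = allocate(taken)
    print(alias, tsi_allowlist(alias))
```

With sixteen addresses the pool caps concurrent bottles at sixteen per host, so exhaustion raises rather than falling back to a shared `127.0.0.1`.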
## References
- `docs/research/agent-vm-isolation.md` — describes the
gvproxy + `VZFileHandleNetworkDeviceAttachment` path. The
current design no longer needs that recipe (the TSI single-IP
approach replaced it after the chunk-1 spike); kept for
historical context if a future operator needs to drop smolvm
and own the VM lifecycle directly.
- `docs/research/smolmachines-as-vm-backend.md` — evaluation of
smolmachines as the VM lifecycle wrapper. The research note's
TSI-bad-due-to-loopback-gap argument turned out to apply only
to `--outbound-localhost-only`, not to TSI generally; this PRD
uses `--allow-cidr <bundle-ip>/32` instead, sidestepping the
gap.
- `docs/research/agent-sandbox-landscape.md` — identifies
`"runtime": "microvm"`-style opt-in as the borrowable idea;
smolmachines is the concrete implementation.
- PRD 0003 (`docs/prds/0003-bottle-backend-abstraction.md`) — the
backend abstraction this PRD is the first non-Docker consumer
of.
- PRD 0017 (`docs/prds/0017-egress-proxy-via-mitmproxy.md`) — the
egress sidecar the bundle reuses verbatim as pipelock's internal
upstream.
- PRD 0022
(`docs/prds/0022-sandbox-escape-integration-test.md`) — the
acceptance gate for this PRD; the suite already runs through
`get_bottle_backend()` so the env-var flip is the only change
needed to exercise the smolmachines path.
- PRD 0024
(`docs/prds/0024-consolidate-sidecar-bundle.md`) — defines the
single bundle image (`claude-bottle-sidecars`) this PRD
consumes. Prerequisite for chunk 3 of this PRD.