Merge pull request 'docs(prd-0023): pivot to smolvm + TSI single-IP allowlist' (#63) from prd-0023-revise-option-b into main

This commit was merged in pull request #63.
2026-05-27 03:54:11 -04:00
+324 -304
@@ -9,36 +9,54 @@
 Ship a second concrete `BottleBackend`
 `SmolmachinesBottleBackend`, selected via
 `CLAUDE_BOTTLE_BACKEND=smolmachines` — that runs each bottle inside
-a per-agent microVM on macOS. The egress topology is enforced by
-**gvproxy** (gvisor-tap-vsock), a userspace TCP/IP stack the guest's
-virtio-net device is wired into via `VZFileHandleNetworkDeviceAttachment`.
-gvproxy's only outbound configuration is an explicit per-bottle
-port-forward set into a **single per-bottle sidecar container** that
-bundles pipelock + egress + git-gate + supervise behind one supervised
-init. Everything else — the host's LAN, the host's loopback
-services, the public internet — is unreachable from the guest by
-construction.
+a per-agent libkrun microVM via `smolvm`. Egress is enforced by
+libkrun's TSI ("Transport Socket Interface") allowlist set to a
+**single /32** — the docker IP of the per-bottle sidecar bundle
+(PRD 0024) on a dedicated docker bridge. Everything else — host
+loopback, LAN, public internet directly — is denied at the VMM
+layer, before a host-side socket is ever opened.

-The sidecar bundle is the same image PRD 0024 introduces for the
-docker backend; this PRD consumes it. Inside the bundle, egress is
-pipelock's internal upstream over localhost and is not exposed
-externally. gvproxy port-forwards three external ports into the
-bundle: pipelock (for `HTTPS_PROXY`), git-gate (for git push), and
-supervise (for MCP).
-
-This explicitly rejects libkrun's TSI ("Transport Socket Interface")
-allowlist as the network primitive. TSI's `--outbound-localhost-only`
-is permissive on the entire `127.0.0.0/8` range with no
-destination-port filter — the agent can dial any host-side service
-bound to loopback (a local Postgres, an IDE plugin, a different
-bottle's pipelock). That's the wrong default for a malicious-agent
-threat model; see "Why gvproxy, not TSI" below.
+The sidecar bundle is the same image PRD 0024 ships for the docker
+backend; this PRD consumes it. Inside the bundle, pipelock /
+git-gate / supervise bind `0.0.0.0:<port>` so the agent (reaching
+the bundle via the allowed /32) can talk to them; egress (the
+internal upstream of pipelock) binds `127.0.0.1:9099` so it's only
+reachable from pipelock within the bundle — the agent can't dial
+it directly even though TSI's allowlist is IP-granular rather than
+port-granular.

 The Docker backend ships unchanged; this is opt-in via the existing
 env-var selector. The acceptance gate is PRD 0022's
 `tests/integration/test_sandbox_escape.py` running green against
 `CLAUDE_BOTTLE_BACKEND=smolmachines`.

+### Design pivot from the first draft
+
+The original PRD landed (PR #53) calling for **gvproxy** as the
+network primitive — a userspace TCP/IP stack the guest's virtio-net
+device would hook into via `VZFileHandleNetworkDeviceAttachment`,
+with explicit `port_forwards` controlling what the guest could
+reach. That design was built around the smolmachines research
+note's claim that libkrun supports a virtio-net mode separate
+from TSI.
+
+Chunk 1's empirical spike against `smolvm 0.8.0`'s actual CLI
+contradicted that claim: smolvm exposes only TSI-style egress
+filters (`--allow-host`, `--allow-cidr`, `--outbound-localhost-only`),
+with no documented option to attach virtio-net to a custom unixgram
+socket. The gvproxy path would have required dropping smolvm
+entirely and driving `Virtualization.framework` via PyObjC.
+
+Re-examining the "why gvproxy" argument with smolvm's real surface,
+the loopback gap PRD 0023 worried about only exists with
+`--outbound-localhost-only`. With `--allow-cidr <bundle-ip>/32`
+instead — and no `--outbound-localhost-only` — the agent can reach
+exactly one IP (the bundle) and nothing else: not host loopback,
+not LAN, not public internet. That's the same security property
+the gvproxy design was chasing, enforced one layer lower (VMM
+socket interception, not a userspace TCP/IP stack we maintain),
+with significantly less code.
+
 ## Problem

 `agent-vm-isolation.md` argues for hardware-isolated microVMs over
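The single-/32 argument in the hunk above can be illustrated with a small model of an IP-only allowlist. This is a sketch of the policy as the PRD describes it, not libkrun's implementation; `tsi_allows` and the example address `192.168.42.2` are hypothetical names chosen for illustration.

```python
import ipaddress


def tsi_allows(dest_ip: str, allow_cidrs: list[str]) -> bool:
    """Model of an IP-only allowlist: the destination port is never consulted."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allow_cidrs)


# Hypothetical pinned bundle IP; the PRD derives the third octet from the slug.
BUNDLE_IP = "192.168.42.2"
ALLOW = [f"{BUNDLE_IP}/32"]

if __name__ == "__main__":
    # Bundle, host loopback, bridge gateway, a LAN host, a public resolver.
    for dest in (BUNDLE_IP, "127.0.0.1", "192.168.42.1", "192.168.1.10", "1.1.1.1"):
        verdict = "allow" if tsi_allows(dest, ALLOW) else "deny"
        print(f"{dest:>14} -> {verdict}")
```

Under this model only the bundle address passes. Swapping the allowlist for `["127.0.0.0/8"]`, which is how the text characterises `--outbound-localhost-only`, admits every loopback address on every port, and that is the gap the pivot avoids.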
@@ -55,8 +73,10 @@ and four things motivate a second one now:
 an editor plugin, another bottle's sidecar) without traversing
 pipelock. The Docker backend's `--internal` network blocks the
 first; nothing in the current stack blocks the second cleanly.
-This PRD's gvproxy-based design closes both gaps: the guest can
-only reach the explicit port-forward list, period.
+This PRD's design closes both gaps via TSI's
+`--allow-cidr <bundle-ip>/32`: the guest can only dial that one
+IP, period. Host loopback, LAN, and the public internet are
+refused at the VMM layer.
 - **Isolation ceiling.** On macOS the Docker backend's agent
 container shares Docker Desktop's host VM with every other
 bottle. Container escape from claude-code lands the agent inside
@@ -77,30 +97,46 @@ and four things motivate a second one now:
 clean in places where Docker-specific assumptions have been
 tolerated.

-## Why gvproxy, not TSI
+## How TSI's single-IP allowlist achieves the property

 libkrun's TSI hijacks guest socket syscalls inside the VMM and
-opens the actual sockets from the host process, with a CIDR
-allowlist gate. That works fine for blocking LAN reach (don't
-allowlist `192.168.0.0/16`, agent can't dial it). But TSI's
-`--outbound-localhost-only` permits the *entire* `127.0.0.0/8`
-range across all ports — there is no destination-port filter at
-the TSI layer (`smolmachines-as-vm-backend.md` flags this in the
-"`--allow-host` semantics" caveat). For our threat model that
-means any host-loopback service is reachable from the guest.
+opens the actual sockets from the host process, gated by a CIDR
+allowlist. Three flags expose the allowlist:

-gvproxy implements a full userspace TCP/IP stack on the host side
-of a `VZFileHandleNetworkDeviceAttachment` unixgram socket. The
-guest has a real virtio-net device; gvproxy is its gateway. The
-guest can only reach what gvproxy is configured to forward —
-typically a single port forward to the per-bottle pipelock —
-and DNS resolves NXDOMAIN by default. There is no "permissive
-loopback" mode to mis-configure; if it's not in `port_forwards`,
-the guest cannot reach it.
+- `--outbound-localhost-only` — opens up the whole `127.0.0.0/8`
+range, all ports. This is the flag the first draft of this PRD
+rejected, and we still reject it: it would let the agent dial
+any host-loopback service (local Postgres, IDE plugins, another
+bottle's sidecar).
+- `--allow-cidr CIDR` — IP/CIDR allowlist with no port filter.
+- `--allow-host HOSTNAME` — resolves the host on the host's DNS
+at VM-start time, stores the result as `/32` CIDRs, and also
+enables guest-side DNS filtering (only the allowed hostname
+resolves).

-That property — *explicit allowlist by port forward, not CIDR*
-is the load-bearing reason this PRD chooses gvproxy. TSI shows up
-once more in this doc, under Non-goals, where it is closed off.
+This backend uses `--allow-cidr <bundle-ip>/32` (single host) and
+nothing else. With the bundle running as a docker container with a
+known IP on a dedicated docker bridge, the agent can reach exactly
+one address: the bundle. Host loopback is denied (not in the
+allowlist). LAN is denied. Public internet directly is denied. DNS
+inside the guest is denied (no resolver in the allowlist) — the
+agent uses an IP literal for `HTTPS_PROXY`.
+
+The one wrinkle TSI doesn't directly handle is **port granularity
+within the allowed IP**. The bundle runs four daemons; pipelock /
+git-gate / supervise are agent-facing, egress is pipelock's
+internal upstream. If egress were bound to `0.0.0.0:9099` inside
+the bundle, the agent could dial `<bundle-ip>:9099` and bypass
+pipelock's DLP. We mitigate by binding egress to `127.0.0.1:9099`
+*inside* the bundle so only pipelock — also in the bundle, on the
+same localhost — can reach it. The bind-address strategy gives us
+port-level isolation that TSI's IP-only allowlist doesn't.
+
+Net result: same security property the first draft chased with
+gvproxy, enforced at the VMM layer rather than via a userspace
+TCP/IP stack, with significantly less code (no gvproxy lifecycle,
+no `VZFileHandleNetworkDeviceAttachment` plumbing, no Smolfile
+virtio-net carve-out smolvm doesn't expose anyway).

 ## Goals / Success Criteria
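The bind-address strategy from the hunk above (egress on loopback, everything agent-facing on `0.0.0.0`) lends itself to a mechanical check. The following is a minimal sketch, assuming the port numbers shown in this PR's topology diagram; `BUNDLE_BINDS` and `agent_reachable` are illustrative names, not part of the bundle's actual init.

```python
import ipaddress

# Bind table as described in the PRD text; the dict itself is illustrative.
BUNDLE_BINDS: dict[str, tuple[str, int]] = {
    "pipelock": ("0.0.0.0", 8888),
    "egress": ("127.0.0.1", 9099),
    "git-gate": ("0.0.0.0", 9418),
    "supervise": ("0.0.0.0", 9100),
}


def agent_reachable(binds: dict[str, tuple[str, int]]) -> dict[str, int]:
    """Return the daemons a peer on the bridge could connect to.

    A listener bound to a loopback address only accepts connections that
    originate inside the same network namespace, so it is excluded here.
    """
    return {
        name: port
        for name, (host, port) in binds.items()
        if not ipaddress.ip_address(host).is_loopback
    }


if __name__ == "__main__":
    print(agent_reachable(BUNDLE_BINDS))
```

A check like this could run in a unit test against whatever configuration the bundle init consumes, so that an accidental `0.0.0.0` bind for egress fails before it ships.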
@@ -133,33 +169,33 @@ The feature is **done** when all of the following ship:
 - `SmolmachinesBottleBackend` registered under the
 `"smolmachines"` key in `claude_bottle/backend/__init__.py:_BACKENDS`.
 - Per-bottle Smolfile generation: a runtime-rendered TOML written
-to the bottle's stage dir, analogous to the compose file the
-Docker backend writes today. The Smolfile pins `command`,
-`env`, and a virtio-net device backed by a unixgram socket
-pointed at the per-bottle gvproxy. There is no TSI
-`--allow-cidr` / `--outbound-localhost-only` / `--allow-host`
-in the Smolfile — TSI is not used.
-- Per-bottle gvproxy: one `gvproxy` process per bottle, started
-before the VM, listening on a unixgram socket the VM's
-virtio-net device hooks into. The gvproxy config has up to
-three `port_forwards` entries (pipelock / git-gate / supervise
-— git-gate and supervise only when the bottle uses them) all
-pointing at the per-bottle sidecar bundle's exposed ports, plus
-a DNS section that resolves only `proxy.internal`. Every other
-hostname returns NXDOMAIN; every other destination is
-unreachable.
+to the bottle's stage dir using smolvm 0.8.0's actual schema
+(`image`, `entrypoint`, `cmd`, `env = ["K=V", …]`, `[network]
+allow_cidrs = ["<bundle-ip>/32"]`). The renderer chunk 1
+shipped emits the wrong shape (built around the gvproxy
+unixgram attachment) — it gets rewritten in this chunk plan as
+the cost of the design pivot.
+- Per-bottle docker bridge for the bundle: the sidecar bundle
+runs as a docker container on a dedicated per-bottle bridge
+network with a pinned IP (`--ip <bundle-ip>` against a
+per-slug `/24` derived from the slug hash). The pinned IP is
+what TSI's allowlist points at; without pinning we'd need to
+inspect the running container's IP and feed it back into the
+Smolfile, which is a race.
 - Per-bottle sidecar bundle: one container per bottle running the
-bundle image defined in PRD 0024. The bundle exposes up to
-three host ports (pipelock for `HTTPS_PROXY`, git-gate for git
-push, supervise for MCP), bound to `127.0.0.1` on dynamically
-allocated ports. egress runs *inside* the bundle as pipelock's
-upstream over localhost and is not exposed externally. The
-agent's environment carries the resolved URLs (e.g.
-`HTTPS_PROXY=http://proxy.internal:<pipelock-gateway-port>`).
+bundle image defined in PRD 0024. pipelock / git-gate /
+supervise bind `0.0.0.0:<port>` so the agent (reaching the
+bundle via the allowed /32) can reach them. egress binds
+`127.0.0.1:9099` inside the bundle so only pipelock can reach
+it — the agent sees `<bundle-ip>:9099` refuse the connection
+even though TSI's allowlist permits the IP. The agent's
+environment carries IP-literal URLs (e.g.
+`HTTPS_PROXY=http://<bundle-ip>:8888`).
 - The agent guest image is produced from the existing `Dockerfile`
-(or a thin variant), exported as an OCI archive, and consumed by
-`smolvm machine create`. The image build step is part of `prepare`,
-analogous to `docker_mod.build_image`.
+via `smolvm pack create` → `.smolmachine` artifact, then loaded
+into smolvm via `machine create --from <path>`. The image build
+step is part of `prepare`, analogous to
+`docker_mod.build_image`.
 - The PRD 0022 sandbox-escape suite, run with
 `CLAUDE_BOTTLE_BACKEND=smolmachines`, passes locally on a
 smolmachines-capable host. The suite is updated to skip cleanly
@@ -182,15 +218,16 @@ The feature is **done** when all of the following ship:
 value of `CLAUDE_BOTTLE_BACKEND`; smolmachines is strictly
 opt-in until it has been load-bearing on at least one operator's
 workflow for a release cycle.
-- **No TSI for network policy.** libkrun's TSI mode is rejected
-for this backend — it lacks per-port filtering on `127.0.0.0/8`
-and would expose every host-loopback service to the guest. The
-Smolfile must select libkrun's virtio-net mode and attach to
-the per-bottle gvproxy unixgram socket; if that combination is
-not supported by the pinned smolmachines version (see open
-question 1), the implementation falls back to driving
-Virtualization.framework directly via PyObjC and reuses the
-same gvproxy attachment.
+- **No `--outbound-localhost-only`.** That TSI flag opens the
+entire `127.0.0.0/8` range and is the loopback gap the original
+draft of this PRD called out. Use `--allow-cidr <bundle-ip>/32`
+instead so the agent reaches one IP and one IP only.
+- **No gvproxy.** Rejected after the chunk-1 spike against the
+real smolvm CLI: smolvm 0.8.0 exposes no virtio-net-over-unixgram
+attachment. Adopting gvproxy would have required dropping smolvm
+and driving Virtualization.framework via PyObjC; the TSI
+single-IP approach gives the same property at a fraction of the
+cost.
 - **No host bind mounts.** The smolmachines research note flagged
 that `-v HOST:GUEST` mounts via virtiofs would defeat the
 isolation goal. The manifest already has no concept of host
@@ -216,30 +253,35 @@ The feature is **done** when all of the following ship:
 - New `claude_bottle/backend/smolmachines/` subpackage with the
 full set of `BottleBackend` overrides.
-- Smolfile generator (TOML), analogous to
-`backend/docker/compose.py`'s `bottle_plan_to_compose`.
+- Smolfile generator (TOML) emitting the smolvm 0.8.0 schema:
+top-level `image`, `entrypoint`, `cmd`, `env = [...]`,
+`[network] allow_cidrs = ["<bundle-ip>/32"]`. (The renderer
+that chunk 1 shipped under the gvproxy design — `name=`,
+`[[net]]` — gets rewritten as part of this chunk plan.)
 - A host-side sidecar-bundle lifecycle manager that brings up
-one container per bottle (the bundle image defined in PRD 0024),
-publishes its one to three host ports, waits for readiness,
-and tears it down with the bottle. This backend depends on
-PRD 0024's bundle image; it does not own the bundle's
-Dockerfile or init.
+one container per bottle on a dedicated per-bottle docker
+bridge with a pinned IP (`--ip <bundle-ip>`), waits for the
+daemons to bind their ports, and tears it down with the bottle.
+This backend depends on PRD 0024's bundle image; it does not
+own the bundle's Dockerfile or init.
 - Per-bottle CA install path: the bundle's CA cert lands inside
 the microVM via `smolvm machine exec` after start
 (analogous to the existing `provision_ca` for Docker).
-- gvproxy lifecycle: per-bottle `gvproxy` started by the backend
-before VM bringup, torn down after VM teardown, configured with
-up to three `port_forwards` entries (gateway port → host
-bundle port for each of pipelock / git-gate / supervise) and a
-DNS section that resolves only `proxy.internal`. Subnet and
-gateway IP are derived from the bottle slug so two concurrent
-bottles don't collide.
-- DNS policy: the bottle's `egress.allowlist` does *not* go into
-gvproxy's DNS — the agent resolves only `proxy.internal`, and
-pipelock on the host enforces the egress allowlist against
-the actual upstream connect target. This keeps the DNS-exfil
-attack (PRD 0022 test 4) blocked because gvproxy answers
-NXDOMAIN for every name except `proxy.internal`.
+- Per-bottle docker bridge: a `claude-bottle-bundle-<slug>`
+network with a /24 subnet derived from the slug hash; the
+bundle gets a pinned IP at `.2` (gateway is `.1`). Pinning the
+IP at start time avoids a race between the bundle's IP being
+assigned and the Smolfile being written.
+- TSI policy: the Smolfile sets `[network] allow_cidrs =
+["<bundle-ip>/32"]` and nothing else. The agent can reach the
+bundle's IP (any port) and nothing else; no DNS resolution is
+available inside the guest, so the agent uses IP-literal URLs.
+- Bundle bind addresses: egress binds `127.0.0.1:9099` inside
+the bundle (pipelock-only); pipelock / git-gate / supervise
+bind `0.0.0.0` so the agent can reach them. This is the
+port-granularity TSI's IP-only allowlist doesn't provide.
+PRD 0024's bundle init may need a config knob for this;
+raised as open question 4.
 - Preflight `smolvm` check: if the user selects this backend and
 `smolvm` isn't on `$PATH`, die with an install pointer (brew tap
 + version pin TBD in implementation; see open question 3).
@@ -248,7 +290,7 @@ The feature is **done** when all of the following ship:
 rejects host mounts; this is a forward-compat check).
 - Tests:
 - Smoke unit-level test: Smolfile renderer produces the
-expected TOML for a fixture bottle.
+expected TOML for a fixture bottle (smolvm 0.8.0 shape).
 - Integration test: `prepare → launch → exec("echo hi") →
 teardown` on a smolmachines-capable host (skips otherwise
 via the same env/platform gate the Docker integration tests
@@ -282,80 +324,65 @@ claude_bottle/backend/smolmachines/
 launch.py          @contextmanager launch(plan) -> SmolmachinesBottle
 cleanup.py         prepare_cleanup / cleanup / list_active
 smolfile.py        bottle_plan_to_smolfile(...) -> dict + render
-gvproxy.py         per-bottle gvproxy config render + process lifecycle
-sidecar_bundle.py  host-side lifecycle for the PRD 0024 bundle container
-smolvm.py          thin subprocess wrapper: machine create/start/exec/stop
-vfkit_attach.py    VZFileHandleNetworkDeviceAttachment + VFKT handshake
-util.py            slugify, port allocation, OCI archive helpers
+sidecar_bundle.py  host-side bundle lifecycle (per-bottle docker bridge + pinned IP)
+smolvm.py          thin subprocess wrapper: machine create/start/exec/stop, pack create
+util.py            slugify, subnet derivation, OCI archive helpers
 provision/         ca.py, prompt.py, skills.py, git.py, supervise.py
 ```

+Note what's NOT here vs. the original draft: `gvproxy.py`,
+`vfkit_attach.py`. The gvproxy design needed both; the TSI single-IP
+design needs neither.
+
 ### Network + egress topology
 ```
 ┌── macOS host ─────────────────────────────────────────────────────┐
 │                                                                   │
-│  ┌── per-bottle sidecar bundle (one container per microVM) ─┐    │
-│  │ init.py (Python supervisor)                               │    │
-│  │  ├─ pipelock           (binds 0.0.0.0:8888 in container)  │    │
-│  │  ├─ egress (mitmproxy) (binds 127.0.0.1:p_internal)       │    │
-│  │  ├─ git-gate           (binds 0.0.0.0:8889)               │    │
-│  │  └─ supervise (MCP)    (binds 0.0.0.0:8890)               │    │
-│  │ pipelock's upstream is 127.0.0.1:p_internal (egress);     │    │
-│  │ egress is not exposed outside the bundle.                 │    │
-│  └─────────────────────────────────────────────────────┬─────┘    │
-│    Host ports published (loopback, dynamic):                      │
-│      pipelock   127.0.0.1:<p1>                                    │
-│      git-gate   127.0.0.1:<p2>  (conditional)                     │
-│      supervise  127.0.0.1:<p3>  (conditional)                     │
-│      ▲ host TCP, reached via gvproxy port-forward                 │
-│      │                                                            │
-│  ┌── gvproxy (per bottle) ─────────────────────────────┐          │
-│  │ subnet: 192.168.127.X/24 (X derived from slug)      │          │
-│  │ gateway: 192.168.127.X.1                            │          │
-│  │ port_forwards:                                      │          │
-│  │   - gateway 8888 → host 127.0.0.1:<p1>              │          │
-│  │   - gateway 8889 → host 127.0.0.1:<p2> (cond)       │          │
-│  │   - gateway 8890 → host 127.0.0.1:<p3> (cond)       │          │
-│  │   # nothing else                                    │          │
-│  │ DNS: proxy.internal → gateway IP; * → NXDOMAIN      │          │
-│  └─────────────────────────────────────────────────────┘          │
-│      ▲ unixgram socket (VFKT handshake)                           │
-│      │                                                            │
-│  ┌── microVM (per bottle) ─────────────────────────────┐          │
-│  │ virtio-net device backed by VZFileHandle...         │          │
-│  │ env: HTTPS_PROXY=http://proxy.internal:8888         │          │
-│  │      GIT_GATE_URL=http://proxy.internal:8889        │          │
-│  │      MCP_SUPERVISE_URL=http://proxy.internal:8890   │          │
-│  │ no other host visible                               │          │
-│  └─────────────────────────────────────────────────────┘          │
+│  ┌── per-bottle docker bridge claude-bottle-bundle-<slug> ───┐    │
+│  │ subnet: 192.168.X.0/24 (X = hash(slug) mod 254)           │    │
+│  │                                                           │    │
+│  │  ┌── bundle container (pinned --ip 192.168.X.2) ───────┐  │    │
+│  │  │ init.py (PRD 0024 Python supervisor)                │  │    │
+│  │  │  ├─ pipelock           (binds 0.0.0.0:8888)         │  │    │
+│  │  │  ├─ egress (mitmproxy) (binds 127.0.0.1:9099)       │  │    │
+│  │  │  ├─ git-gate           (binds 0.0.0.0:9418)         │  │    │
+│  │  │  └─ supervise          (binds 0.0.0.0:9100)         │  │    │
+│  │  │ Internal-only egress is unreachable from outside    │  │    │
+│  │  │ the bundle even though TSI permits the IP.          │  │    │
+│  │  └─────────────────────────────────────────────────────┘  │    │
+│  └─────────────────────────────────────────────────────┬─────┘    │
+│                                                        │          │
+│  ┌── microVM (per bottle, libkrun via smolvm) ─────────▼─┐        │
+│  │ Smolfile: [network] allow_cidrs = ["192.168.X.2/32"]  │        │
+│  │ env: HTTPS_PROXY=http://192.168.X.2:8888              │        │
+│  │      GIT_GATE_URL=git://192.168.X.2:9418 (cond.)      │        │
+│  │      MCP_SUPERVISE_URL=http://192.168.X.2:9100 (cond) │        │
+│  │ No other host reachable — TSI denies any connect()    │        │
+│  │ that isn't to 192.168.X.2. No DNS inside the guest    │        │
+│  │ (no resolver in the allowlist).                       │        │
+│  └───────────────────────────────────────────────────────┘        │
 │                                                                   │
 └───────────────────────────────────────────────────────────────────┘
 ```
-What the guest can reach, exhaustively: **only `proxy.internal`
-on the gateway-port set we configured.** Everything else —
-host LAN, host loopback (Postgres, IDE plugins, other bottles'
-sidecars), public internet directly — is gone, enforced at the
-gvproxy userspace stack rather than relying on guest cooperation.
+What the guest can reach, exhaustively: **only `<bundle-ip>` on
+ports the bundle binds to 0.0.0.0**. Egress's 127.0.0.1-only bind
+makes it bundle-internal; host loopback / LAN / public internet
+direct are all refused by TSI's allowlist.

 Three changes vs. the Docker backend:

-1. **One sidecar container per bottle, not four.** The bundle
-defined in PRD 0024 is the unit of sidecar lifecycle on both
-backends. egress is internal to the bundle as pipelock's
-upstream, never directly addressed.
-2. **Sidecar container is on the host, not a sibling on a Docker
-internal network.** Isolation primitive is gvproxy's explicit
-port-forward list, not Docker's `--internal` flag.
-3. **The agent's first hop is `proxy.internal`, not a sidecar's
-container hostname.** Same scanning + DLP + auth-injection
-chain, but the first hop crosses a userspace TCP/IP stack we
-own, not a Docker-managed bridge.
+1. **One sidecar container per bottle, not four.** Same bundle
+image PRD 0024 ships for the docker backend.
+2. **Sidecar container is on a per-bottle docker bridge with a
+pinned IP**, reached directly by the smolvm guest's allowed
+/32 — no localhost port allocation, no userspace TCP/IP stack
+in the middle.
+3. **The agent dials IP literals, not hostnames.** TSI doesn't
+filter DNS at the protocol level, and we don't put DNS
+resolvers in the allowlist, so name resolution is denied by
+construction.

-git-gate and supervise are conditional port forwards: only
-emitted into gvproxy's config when the bottle actually uses
-them, narrowing the attack surface for bottles that don't.
-
 ### Lifecycle
@@ -363,61 +390,59 @@ them, narrowing the attack surface for bottles that don't.
 1. Cross-backend validation via `BottleBackend._validate` (skills,
 git identity files).
-2. Allocate one to three host loopback ports for the sidecar
-bundle (pipelock always; git-gate and supervise conditional on
-manifest — egress is internal to the bundle and gets no host
-port).
-3. Resolve the agent OCI archive path (build if missing, cache by
-Dockerfile + agent-name hash). The sidecar-bundle image
-(`claude-bottle-sidecars:<pinned>`) is pulled or built per
-PRD 0024; this backend does not own its build.
-4. Pick a per-bottle gvproxy subnet (e.g. `192.168.127.X/24` where
-`X` is derived from the slug) and render
-`stage_dir/gvproxy.yaml`: one DNS entry for `proxy.internal`
-and one `port_forwards` entry per active sidecar port
-(gateway port → host loopback port on the bundle).
-5. Render the per-bottle Smolfile to `stage_dir/smolfile.toml`,
-pinning command / env / a virtio-net device backed by the
-gvproxy unixgram socket path. No TSI flags.
-6. Resolve the in-VM CA paths so launch knows where to copy
+2. Derive a per-bottle docker subnet from `sha256(slug) % 254`
+(skipping the docker-default 17): `192.168.X.0/24`. The bundle
+IP is always `192.168.X.2` (gateway is `.1`).
+3. Resolve the agent guest image: convert the existing
+`Dockerfile` into a `.smolmachine` artifact via
+`smolvm pack create --image <name> -o <stage>/agent.smolmachine`
+(idempotent, layer-cached).
+4. Render the per-bottle Smolfile to `stage_dir/smolfile.toml`
+using smolvm 0.8.0's schema:
+- `image` / `entrypoint` / `cmd` — bundled into the
+`.smolmachine` from the previous step (one Smolfile, one
+artifact).
+- `env = [...]` — `HTTPS_PROXY`, `NO_PROXY`, `NODE_EXTRA_CA_CERTS`,
+etc., all pointing at IP-literal URLs (`http://192.168.X.2:8888`).
+- `[network] allow_cidrs = ["192.168.X.2/32"]` — TSI's single
+/32 allowlist.
+5. Resolve the in-VM CA paths so launch knows where to copy
 pipelock's CA after start.
-7. Return a `SmolmachinesBottlePlan` carrying the slug, port map,
-OCI archive path, Smolfile path, gvproxy config path, and
-the bundle's container/run spec.
+6. Return a `SmolmachinesBottlePlan` carrying the slug, bundle
+subnet/IP, `.smolmachine` artifact path, Smolfile path, and
+bundle run spec.

 `SmolmachinesBottleBackend.launch(plan)`:

-1. Start the sidecar bundle container with `docker run` (still
-using the local Docker daemon for sidecars; the VM is what's
-moving off Docker). Wait for its three readiness signals:
-pipelock listening, git-gate listening (if enabled), supervise
-listening (if enabled). Register the teardown callback.
-2. Start the per-bottle `gvproxy` against the unixgram socket
-path the Smolfile references, with `port_forwards` pointed at
-the bundle's published host ports. Wait for the socket to
-appear (the spike-style poll loop from `agent-vm-isolation.md`).
-3. `smolvm machine create --smolfile <path>` and
-`smolvm machine start <name>`. The Smolfile's virtio-net
-device handshakes (`VFKT` magic) with gvproxy on start.
-4. Provisioning: CA install → prompt → skills → git → supervise
-config, each via `smolvm machine exec` (analogous to
-`docker exec`).
-5. Yield a `SmolmachinesBottle` whose `exec_claude` / `exec` /
+1. Create the per-bottle docker bridge network
+(`claude-bottle-bundle-<slug>` with the resolved subnet) and
+start the sidecar bundle container with `docker run --network
+... --ip <bundle-ip> ...`. Wait for its daemons to bind:
+pipelock on 8888, git-gate on 9418 (conditional), supervise
+on 9100 (conditional). Register teardown callbacks.
+2. `smolvm machine create --from <stage>/agent.smolmachine
+--smolfile <stage>/smolfile.toml <name>` and
+`smolvm machine start --name <name>`. The Smolfile's TSI
+allowlist gates outbound to the bundle's /32; libkrun's TSI
+layer enforces it.
+3. Provisioning: CA install → prompt → skills → git → supervise
+config, each via `smolvm machine exec` / `smolvm machine cp`.
+4. Yield a `SmolmachinesBottle` whose `exec_claude` / `exec` /
 `cp_in` all funnel through `smolvm machine exec` /
 `smolvm machine cp`.
-6. Teardown: stop and remove the VM → stop gvproxy → stop +
-remove the sidecar bundle container.
+5. Teardown: stop and delete the VM → stop + remove the bundle
+container → remove the per-bottle docker network.

 ### Data model

 No manifest schema change. `bottles[]` continues to carry
 `egress.allowlist`, `env`, `git`, `skills` references, etc.; the
 smolmachines backend reads the same fields as the docker backend.
-`egress.allowlist` is enforced by pipelock on the host side
-(unchanged from the docker backend); gvproxy's DNS resolves only
-`proxy.internal` regardless of the allowlist's contents, so an
-agent that bypasses pipelock by raw IP cannot resolve any name
-gvproxy doesn't know about.
+`egress.allowlist` is enforced by pipelock inside the bundle
+(unchanged from the docker backend); the guest has no DNS resolver
+in TSI's allowlist, so an agent that tries to dial an arbitrary
+hostname can't resolve it in the first place — the DNS-exfil
+attack from PRD 0022 test 4 is blocked at the resolver step.

 The `BottleSpec` dataclass and the `Bottle` ABC do not change.
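The subnet derivation in step 2 of `prepare` can be sketched as below. This is an illustration under stated assumptions, not the project's `util.py`: the PRD only says `sha256(slug) % 254` "skipping the docker-default 17", so both the choice to interpret the whole digest as an integer and the way the skip is resolved (bump to the next octet) are guesses, and `bundle_subnet` is a hypothetical name.

```python
import hashlib
import ipaddress

SKIPPED_OCTET = 17  # the PRD says this value is skipped


def bundle_subnet(
    slug: str,
) -> tuple[ipaddress.IPv4Network, ipaddress.IPv4Address, ipaddress.IPv4Address]:
    """Derive (subnet, gateway, bundle_ip) for a bottle slug.

    Assumption: the full SHA-256 digest is read as a big-endian integer.
    Assumption: a hash landing on the skipped octet moves to the next one.
    """
    digest = hashlib.sha256(slug.encode("utf-8")).digest()
    octet = int.from_bytes(digest, "big") % 254
    if octet == SKIPPED_OCTET:
        octet += 1
    subnet = ipaddress.ip_network(f"192.168.{octet}.0/24")
    gateway = subnet.network_address + 1    # .1, the bridge gateway
    bundle_ip = subnet.network_address + 2  # .2, pinned via `docker run --ip`
    return subnet, gateway, bundle_ip


if __name__ == "__main__":
    subnet, gateway, bundle_ip = bundle_subnet("demo-bottle")
    print(subnet, gateway, bundle_ip)
```

With only 254 buckets, two slugs can hash to the same subnet. The sketch does not handle that case; the PRD assigns collision avoidance to the existing `test_smolmachines_util.py` coverage.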
@@ -442,37 +467,38 @@ The existing "unknown backend" `die()` path stays as-is.
 - `smolvm` CLI binary on `$PATH` (one new external dep, gated by
   the preflight check). Pinned version policy is deferred to the
   open questions; v1 reads `smolvm --version` and refuses to launch
-  outside a known-good range.
-- `gvproxy` binary on `$PATH`
-  (`go install github.com/containers/gvisor-tap-vsock/cmd/gvproxy@latest`,
-  or vendored). Same preflight pattern as `smolvm`.
-- `pyobjc-framework-Virtualization` *only* if smolmachines does
-  not expose a way to attach virtio-net to a unixgram socket and
-  we fall back to driving Virtualization.framework directly (see
-  open question 1). Default path is "no PyObjC needed."
+  outside a known-good range (currently 0.8.x).
+- No `gvproxy` dep (the original draft listed it; dropped after
+  the chunk-1 spike).
+- No `pyobjc-framework-Virtualization` dep (dropped from the
+  original draft for the same reason).
 - No new pure-Python packages. Subprocess + stdlib `tomllib` for
-  Smolfile authoring; the gvproxy YAML is small enough to render
-  by hand from a `dict[str, Any]`.
+  Smolfile authoring.
 ### Acceptance test plan
 - **Unit (smolfile):** `tests/unit/test_smolfile.py` verifies the
-  renderer produces the expected TOML for a fixture bottle
-  command line, env entries, virtio-net device referencing the
-  expected unixgram socket path, no TSI flags.
-- **Unit (gvproxy config):** `tests/unit/test_gvproxy_config.py`
-  verifies the per-bottle YAML has exactly one DNS entry
-  (`proxy.internal`), one `port_forwards` entry per active
-  sidecar pointed at the resolved host loopback port, and a
-  per-bottle subnet/gateway derived from the slug.
+  renderer produces the expected TOML for a fixture bottle in
+  smolvm 0.8.0's schema — top-level `image` / `entrypoint` /
+  `cmd` / `env`, plus `[network] allow_cidrs = ["<bundle-ip>/32"]`
+  and nothing else under `[network]`.
+- **Unit (subnet derivation):** the existing
+  `test_smolmachines_util.py` covers the per-bottle subnet hash
+  + collision-avoidance and stays as-is.
 - **Integration smoke:** `tests/integration/test_smolmachines_smoke.py`
   with `prepare → launch → exec → teardown`, guarded by a
-  `smolvm` + `gvproxy` presence check + macOS / KVM platform check.
+  `smolvm` presence check + macOS / KVM platform check.
 - **Localhost-reach probe:** a focused integration test that
   brings up a bottle, has the host bind a test service on
   `127.0.0.1:<unused-port>`, and asserts the in-bottle agent
   cannot connect to it. This is the regression test for the
-  exact gap that motivated choosing gvproxy over TSI.
+  exact gap `--outbound-localhost-only` would have introduced —
+  with `--allow-cidr <bundle-ip>/32` only, the probe must fail.
+- **Egress-port-bypass probe:** also brings up a bottle and
+  asserts the in-bottle agent's connect to `<bundle-ip>:9099`
+  (egress's port) is refused — confirming the bundle-internal
+  bind of egress to `127.0.0.1` works as the port-granularity
+  layer TSI doesn't provide.
 - **PRD 0022 re-run:** with `CLAUDE_BOTTLE_BACKEND=smolmachines`,
   all five attack categories return sandbox-block markers and the
   suite passes. The test code does not change beyond the env-var
@@ -484,28 +510,32 @@ The existing "unknown backend" `die()` path stays as-is.
 PRD 0024's bundle image is a prerequisite — this PRD assumes
 `claude-bottle-sidecars:<pinned>` is available when chunk 3 lands.
-1. **Backend skeleton + selection + Smolfile + gvproxy renderers.**
-   Subpackage layout, `_resolve_plan` stub that emits both a
-   TOML Smolfile and a gvproxy YAML but doesn't launch anything,
-   `_BACKENDS` registration, preflight `smolvm` + `gvproxy`
-   checks. Unit tests on both renderers. No VM bringup yet.
+1. **Backend skeleton + selection + Smolfile/gvproxy renderers.**
+   *Shipped (PR #62), but under the now-rejected gvproxy design.*
+   The Smolfile renderer emits `name = …` / `[[net]]` instead of
+   smolvm 0.8.0's `image` / `[network] allow_cidrs`. The gvproxy
+   renderer is dead. Chunk 2 rewrites the Smolfile renderer and
+   deletes `gvproxy_config.py` / its tests.
-2. **gvproxy + VM lifecycle + OCI archive build.** `smolvm.py`
-   and `gvproxy.py` subprocess wrappers, prepare-time image
-   build (existing Dockerfile → OCI archive), launch path that
-   starts gvproxy, brings up the VM attached to gvproxy's socket
-   via VFKT handshake, exec into the VM, tear everything down.
+2. **VM lifecycle + bundle bringup + Smolfile rewrite.**
+   `smolvm.py` subprocess wrapper, prepare-time image conversion
+   (`smolvm pack create` → `.smolmachine`), per-bottle docker
+   bridge + bundle container with pinned IP, launch path that
+   starts the bundle and brings up the VM (`smolvm machine create
+   --from --smolfile`), exec into the VM, tear everything down.
    Smoke integration test: `exec("echo hi")` inside a started
-   VM. Includes the localhost-reach probe test from the
-   acceptance plan.
+   VM. Includes the localhost-reach probe + egress-port-bypass
+   probe from the acceptance plan. The chunk-1 Smolfile renderer
+   gets rewritten to the smolvm 0.8.0 schema; `gvproxy_config.py`
+   and `gvproxy.py` (if any) get deleted.
-3. **Sidecar bundle lifecycle.** `sidecar_bundle.py`: per-bottle
-   bundle container brought up via `docker run`, with one to
-   three published host ports, gvproxy `port_forwards` pointed
-   at them, and teardown integrated into the bottle's lifecycle.
-   Port allocator. No provisioning yet beyond what the bundle
-   needs.
+3. **Bundle bind-address mitigation.** Update PRD 0024's bundle
+   init to bind egress on `127.0.0.1:9099` instead of `0.0.0.0`
+   (or expose a config knob — open question 4). Reverify the
+   egress-port-bypass probe. Pipelock / git-gate / supervise
+   continue to bind `0.0.0.0`.
 4. **Provisioning parity with Docker.** CA install via
-   `smolvm machine exec`, prompt/skills/.git copy-in, supervise
-   MCP config. End-to-end `start` works for a real agent manifest.
+   `smolvm machine cp`, prompt/skills/.git copy-in, supervise
+   MCP config. End-to-end `start` works for a real agent
+   manifest.
 5. **PRD 0022 sandbox-escape suite green.** Skip-guard update,
    small adjustments to test helpers if any (the test uses
    `bottle.exec(script)` and inspects `returncode` + body for
@@ -514,78 +544,68 @@ PRD 0024's bundle image is a prerequisite — this PRD assumes
 ## Open questions
-1. **VMM choice: smolmachines vs PyObjC + Virtualization.framework.**
-   The network design requires libkrun's virtio-net mode attached
-   to a unixgram socket (so gvproxy is the gateway). The
-   smolmachines research note says libkrun *has* a virtio-net
-   mode but says it "does not support policy" — meaning libkrun
-   itself enforces no allowlist in that mode, which is exactly
-   what we want (gvproxy is the policy). What's unverified is
-   whether the Smolfile surface lets us point virtio-net at a
-   custom unixgram socket. If yes: this is a smolmachines backend
-   verbatim. If no: chunk 2 drops `smolvm` and drives
-   `Virtualization.framework` via PyObjC directly (the recipe in
-   `agent-vm-isolation.md` § "gvisor-tap-vsock + PyObjC +
-   Pipelock"), keeping the backend name "smolmachines" because
-   the operator-facing UX is unchanged. Resolve in chunk 1 via a
-   spike against the pinned smolmachines version.
+1. **~~VMM choice~~ Resolved.** Chunk-1 spike against `smolvm
+   0.8.0` confirmed there's no virtio-net-over-unixgram option;
+   the gvproxy design isn't viable on top of smolvm. Resolved
+   by switching to TSI `--allow-cidr <bundle-ip>/32` + bundle
+   bind-address mitigation; smolvm stays as the VMM. See the
+   "Design pivot from the first draft" section.
-2. **`smolvm` + `gvproxy` install policy.** Pin via brew /
-   `go install` versions, or vendor binaries in the repo. v1
-   likely runs `smolvm --version` / `gvproxy --help` at preflight
-   and accepts a documented range; vendoring is heavier but
-   reduces "works on my Mac" drift.
+2. **`smolvm` install policy.** Pin via brew / `curl install.sh`,
+   or vendor a binary in the repo. v1 likely runs
+   `smolvm --version` at preflight and accepts a documented range
+   (currently 0.8.x). The
+   `curl -sSL https://smolmachines.com/install.sh | sh` path is
+   what the operator used; document it in the README.
-3. **CA install inside the OCI overlay.** Two paths: bake at
-   prepare time (one OCI archive per CA fingerprint, big cache
-   key) vs. inject at start time via `smolvm machine exec` after
-   the VM is up. PRD 0006 chose the runtime path for Docker
-   (docker-cp + `update-ca-certificates`); smolvm has the same
-   shape via `machine exec`. Default to runtime injection unless
-   it conflicts with VM start order.
+3. **CA install inside the agent guest.** Two paths: bake at
+   prepare time (one `.smolmachine` artifact per CA fingerprint,
+   big cache key) vs. inject at start time via `smolvm machine
+   cp` after the VM is up. PRD 0006 chose the runtime path for
+   Docker (docker-cp + `update-ca-certificates`); smolvm has the
+   same shape via `machine cp` + `machine exec`. Default to
+   runtime injection.
-4. **gvproxy subnet collision.** Two concurrent bottles must not
-   land on the same `192.168.127.X/24` subnet — they'd both want
-   the same gateway IP. Derive the third octet from a hash of
-   the slug (mod 254, skip the docker-default 17), and at launch
-   time confirm the subnet isn't already in use by another
-   bottle's gvproxy. Resolve the hash-collision policy in
-   chunk 2.
+4. **Bundle bind-address knob.** PRD 0024's bundle currently runs
+   all four daemons under one supervisor with daemon argv
+   hardcoded. To make egress bind `127.0.0.1:9099` instead of
+   `0.0.0.0:9099`, either: (a) edit the supervisor's
+   `_DAEMONS` entry to pass a `--listen-host 127.0.0.1` flag to
+   mitmdump, OR (b) introduce a per-daemon `bind_localhost`
+   knob the renderer can set. Option (a) is simpler and matches
+   that egress is bundle-internal regardless of backend; resolve
+   in chunk 3.
 5. **`bottle.exec(script)` exit-code fidelity.** The PRD 0022 test
    suite reads `returncode` + stdout + stderr from
-   `ExecResult`. Confirm the VM-exec path (`smolvm machine exec`
-   or its PyObjC equivalent) propagates exit codes and separated
-   streams. The research note's "external integration is the CLI"
-   implies yes, but the embedded SDK bug it flagged suggests we
-   should verify before coding around it.
+   `ExecResult`. Confirm `smolvm machine exec` propagates exit
+   codes and separated streams. The CLI help mentions a
+   `--stream` flag for streaming output; behavior under default
+   (non-stream) mode is what we want — verify in chunk 2.
 6. **CI gating.** Gitea's act_runner is Linux without nested KVM,
    so this backend's integration tests will skip there for the
-   same structural reason the Docker bringup tests do (no real
-   isolation primitive available on the runner). The skip
-   predicate becomes `not (smolvm_available() and gvproxy_available()
-   and platform.system() == "Darwin")`. CI coverage for this
-   backend will come from local runs on the maintainer's macOS
-   host until a Darwin runner is wired up; ack that as a known
-   gap.
+   same structural reason the Docker bringup tests do. The skip
+   predicate becomes `not (smolvm_available() and
+   platform.system() == "Darwin")`. CI coverage for this backend
+   will come from local runs on the maintainer's macOS host
+   until a Darwin runner is wired up; ack that as a known gap.
 7. **Active bottle discovery.** Docker uses container labels to
    enumerate active bottles (`list_active` queries the daemon).
-   The microVM enumeration story is `smolvm machine list`
-   (or the PyObjC backend's own bookkeeping); the plan is to
-   mirror the label scheme via Smolfile metadata
-   (`labels = { "claude-bottle" = "1" }`-style entries, if the
-   format supports it; otherwise via a deterministic name prefix
-   `claude-bottle-<slug>` + on-disk metadata under
-   `state/<slug>/`).
+   The microVM enumeration story is `smolvm machine ls --json`;
+   the plan is to filter on a deterministic name prefix
+   `claude-bottle-<slug>` + cross-reference with on-disk metadata
+   under `state/<slug>/`.
 ## References
-- `docs/research/agent-vm-isolation.md` — primary reference for
-  the gvproxy + `VZFileHandleNetworkDeviceAttachment` network
-  attachment used here. The "Full Setup: gvisor-tap-vsock +
-  PyObjC + Pipelock" section is the recipe the PyObjC fallback
-  in open question 1 would adopt verbatim.
+- `docs/research/agent-vm-isolation.md` — describes the
+  gvproxy + `VZFileHandleNetworkDeviceAttachment` path. The
+  current design no longer needs that recipe (the TSI single-IP
+  approach replaced it after the chunk-1 spike); kept for
+  historical context if a future operator needs to drop smolvm
+  and own the VM lifecycle directly.
 - `docs/research/smolmachines-as-vm-backend.md` — evaluation of
-  smolmachines as the VM lifecycle wrapper. This PRD diverges
-  from its conclusion on the *network* primitive (rejecting TSI
-  in favor of gvproxy) but keeps its VM-lifecycle conclusion
-  conditional on the libkrun-virtio-net spike in open question 1.
+  smolmachines as the VM lifecycle wrapper. The research note's
+  TSI-bad-due-to-loopback-gap argument turned out to apply only
+  to `--outbound-localhost-only`, not to TSI generally; this PRD
+  uses `--allow-cidr <bundle-ip>/32` instead, sidestepping the
+  gap.
 - `docs/research/agent-sandbox-landscape.md` — identifies
   `"runtime": "microvm"`-style opt-in as the borrowable idea;
   smolmachines is the concrete implementation.