# PRD 0023: smolmachines bottle backend
- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-26
## Summary
Ship a second concrete `BottleBackend`,
`SmolmachinesBottleBackend` (selected via
`BOT_BOTTLE_BACKEND=smolmachines`), that runs each bottle inside
a per-agent libkrun microVM via `smolvm`. Egress is enforced by
libkrun's TSI ("Transparent Socket Impersonation") allowlist set to a
**single /32** — the docker IP of the per-bottle sidecar bundle
(PRD 0024) on a dedicated docker bridge. Everything else — host
loopback, LAN, public internet directly — is denied at the VMM
layer, before a host-side socket is ever opened.
The sidecar bundle is the same image PRD 0024 ships for the docker
backend; this PRD consumes it. Inside the bundle, pipelock /
git-gate / supervise bind `0.0.0.0:<port>` so the agent (reaching
the bundle via the allowed /32) can talk to them; egress (the
internal upstream of pipelock) binds `127.0.0.1:9099` so it's only
reachable from pipelock within the bundle — the agent can't dial
it directly even though TSI's allowlist is IP-granular rather than
port-granular.
The Docker backend ships unchanged; this is opt-in via the existing
env-var selector. The acceptance gate is PRD 0022's
`tests/integration/test_sandbox_escape.py` running green against
`BOT_BOTTLE_BACKEND=smolmachines`.
### Design pivot from the first draft
The original PRD landed (PR #53) calling for **gvproxy** as the
network primitive — a userspace TCP/IP stack the guest's virtio-net
device would hook into via `VZFileHandleNetworkDeviceAttachment`,
with explicit `port_forwards` controlling what the guest could
reach. That design was built around the smolmachines research
note's claim that libkrun supports a virtio-net mode separate
from TSI.
Chunk 1's empirical spike against `smolvm 0.8.0`'s actual CLI
contradicted that claim: smolvm exposes only TSI-style egress
filters (`--allow-host`, `--allow-cidr`, `--outbound-localhost-only`),
with no documented option to attach virtio-net to a custom unixgram
socket. The gvproxy path would have required dropping smolvm
entirely and driving `Virtualization.framework` via PyObjC.
Re-examining the "why gvproxy" argument against smolvm's real
surface, the loopback gap the first draft of this PRD worried about
only exists with `--outbound-localhost-only`. With
`--allow-cidr <bundle-ip>/32`
instead — and no `--outbound-localhost-only` — the agent can reach
exactly one IP (the bundle) and nothing else: not host loopback,
not LAN, not public internet. That's the same security property
the gvproxy design was chasing, enforced one layer lower (VMM
socket interception, not a userspace TCP/IP stack we maintain),
with significantly less code.
## Problem
`agent-vm-isolation.md` argues for hardware-isolated microVMs over
container-based bottles on macOS; `smolmachines-as-vm-backend.md`
evaluates smolmachines as the lifecycle wrapper. Today, the only
backend in the registry is Docker
(`bot_bottle/backend/__init__.py:_BACKENDS = {"docker": ...}`),
and four things motivate a second one now:
- **Network reach beyond pipelock.** The threat model is a malicious
agent attempting to dial something on the operator's *local
network* (`192.168.x.x` services, the home router, a coworker's
laptop on the same Wi-Fi) or *host's loopback* (a local database,
an editor plugin, another bottle's sidecar) without traversing
pipelock. The Docker backend's `--internal` network blocks the
first; nothing in the current stack blocks the second cleanly.
This PRD's design closes both gaps via TSI's
`--allow-cidr <bundle-ip>/32`: the guest can only dial that one
IP, period. Host loopback, LAN, and the public internet are
refused at the VMM layer.
- **Isolation ceiling.** On macOS the Docker backend's agent
container shares Docker Desktop's host VM with every other
bottle. Container escape from claude-code lands the agent inside
that shared VM. A per-bottle microVM gets hardware page tables
via `Hypervisor.framework`; cross-bottle isolation becomes
enforced by the CPU's MMU instead of namespace bookkeeping.
- **PRD 0022 is backend-agnostic by design** but currently only
exercises the Docker backend. The suite was written with
`BOT_BOTTLE_BACKEND` selection in mind precisely so the
smolmachines path could be validated against the same five
attacks. Until a second backend exists, the abstraction is
unproven.
- **CI carve-outs.** Most bottle-bringup integration tests skip
under `GITEA_ACTIONS=true` because act_runner shares the host
Docker socket but not the host filesystem. A microVM path has
different constraints of its own, so adding the backend forces the
abstraction to be clean in places where Docker-specific
assumptions have been tolerated.
## How TSI's single-IP allowlist achieves the property
libkrun's TSI hijacks guest socket syscalls inside the VMM and
opens the actual sockets from the host process, gated by a CIDR
allowlist. Three flags expose the allowlist:
- `--outbound-localhost-only` — opens up the whole `127.0.0.0/8`
range, all ports. This is the flag the first draft of this PRD
rejected, and we still reject it: it would let the agent dial
any host-loopback service (local Postgres, IDE plugins, another
bottle's sidecar).
- `--allow-cidr CIDR` — IP/CIDR allowlist with no port filter.
- `--allow-host HOSTNAME` — resolves the host on the host's DNS
at VM-start time, stores the result as `/32` CIDRs, and also
enables guest-side DNS filtering (only the allowed hostname
resolves).
This backend uses `--allow-cidr <bundle-ip>/32` (single host) and
nothing else. With the bundle running as a docker container with a
known IP on a dedicated docker bridge, the agent can reach exactly
one address: the bundle. Host loopback is denied (not in the
allowlist). LAN is denied. Public internet directly is denied. DNS
inside the guest is denied (no resolver in the allowlist) — the
agent uses an IP literal for `HTTPS_PROXY`.
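
The allowlist decision described above can be modelled in a few lines. This is a minimal sketch of the matching rule only, not libkrun's implementation; the function name `tsi_allows` and the bundle IP `192.168.42.2` are hypothetical.

```python
import ipaddress


def tsi_allows(dest_ip: str, allow_cidrs: list[str]) -> bool:
    """Return True if dest_ip falls inside any allowlisted CIDR.

    Ports are deliberately absent: the allowlist is IP-granular.
    """
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allow_cidrs)


# Hypothetical bundle IP; a /32 matches exactly one address.
ALLOWLIST = ["192.168.42.2/32"]

for dest in ["192.168.42.2", "127.0.0.1", "192.168.1.1", "8.8.8.8"]:
    verdict = "allowed" if tsi_allows(dest, ALLOWLIST) else "denied"
    print(f"{dest}: {verdict}")
```

Only the bundle address passes; host loopback, a LAN address, and a public resolver all fall outside the single /32.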
The one wrinkle TSI doesn't directly handle is **port granularity
within the allowed IP**. The bundle runs four daemons; pipelock /
git-gate / supervise are agent-facing, egress is pipelock's
internal upstream. If egress were bound to `0.0.0.0:9099` inside
the bundle, the agent could dial `<bundle-ip>:9099` and bypass
pipelock's DLP. We mitigate by binding egress to `127.0.0.1:9099`
*inside* the bundle so only pipelock — also in the bundle, on the
same localhost — can reach it. The bind-address strategy gives us
port-level isolation that TSI's IP-only allowlist doesn't.
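
A small model of the bind-address strategy, assuming the daemon ports shown in the topology diagram. The `Daemon` class and `agent_reachable_ports` helper are illustrative names, not part of the bundle.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Daemon:
    name: str
    bind_host: str
    port: int


BUNDLE_DAEMONS = [
    Daemon("pipelock", "0.0.0.0", 8888),
    Daemon("egress", "127.0.0.1", 9099),
    Daemon("git-gate", "0.0.0.0", 9418),
    Daemon("supervise", "0.0.0.0", 9100),
]


def agent_reachable_ports(daemons: list[Daemon]) -> list[int]:
    """Ports a peer outside the bundle can connect to on the bundle IP.

    A socket bound to a loopback address only accepts connections that
    originate inside the bundle's own network namespace.
    """
    return sorted(
        d.port
        for d in daemons
        if not ipaddress.ip_address(d.bind_host).is_loopback
    )


print(agent_reachable_ports(BUNDLE_DAEMONS))
```

Port 9099 drops out of the result because egress binds loopback, which is the port-level filter the IP-only allowlist cannot express.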
Net result: same security property the first draft chased with
gvproxy, enforced at the VMM layer rather than via a userspace
TCP/IP stack, with significantly less code (no gvproxy lifecycle,
no `VZFileHandleNetworkDeviceAttachment` plumbing, no Smolfile
virtio-net carve-out smolvm doesn't expose anyway).
## Goals / Success Criteria
The feature works when all of the following are observable on a
macOS host with smolmachines installed:
- `BOT_BOTTLE_BACKEND=smolmachines python3 cli.py start <agent>`
brings up a microVM, runs claude-code inside it, and tears it
down on exit. Same y/N preflight UX as Docker — only the
resolved-runtime line differs.
- The sandbox-escape suite in `tests/integration/test_sandbox_escape.py`
runs green against the smolmachines backend (all five attack
categories blocked).
- Selecting the backend on a host without `smolvm` installed dies
at startup with an install pointer; no silent fall-through to
Docker.
- Active bottles show up under
`python3 cli.py list-bottles` regardless of backend.
- `python3 cli.py stop <bottle>` and orphan cleanup work for both
Docker bottles and smolmachines bottles via the same CLI surface.
The feature is **done** when all of the following ship:
- A new `bot_bottle/backend/smolmachines/` subpackage exists,
mirroring the layout of `bot_bottle/backend/docker/`
(`backend.py`, `bottle.py`, `bottle_plan.py`,
`bottle_cleanup_plan.py`, `prepare.py`, `launch.py`,
`cleanup.py`, `util.py`, and a `provision/` subpackage for the
five `provision_*` methods).
- `SmolmachinesBottleBackend` registered under the
`"smolmachines"` key in `bot_bottle/backend/__init__.py:_BACKENDS`.
- Per-bottle Smolfile generation: a runtime-rendered TOML written
to the bottle's stage dir using smolvm 0.8.0's actual schema
(`image`, `entrypoint`, `cmd`, `env = ["K=V", …]`, `[network]
allow_cidrs = ["<bundle-ip>/32"]`). The renderer chunk 1
shipped emits the wrong shape (built around the gvproxy
unixgram attachment) — it gets rewritten in this chunk plan as
the cost of the design pivot.
- Per-bottle docker bridge for the bundle: the sidecar bundle
runs as a docker container on a dedicated per-bottle bridge
network with a pinned IP (`--ip <bundle-ip>` against a
per-slug `/24` derived from the slug hash). The pinned IP is
what TSI's allowlist points at; without pinning we'd need to
inspect the running container's IP and feed it back into the
Smolfile, which is a race.
- Per-bottle sidecar bundle: one container per bottle running the
bundle image defined in PRD 0024. pipelock / git-gate /
supervise bind `0.0.0.0:<port>` so the agent (reaching the
bundle via the allowed /32) can reach them. egress binds
`127.0.0.1:9099` inside the bundle so only pipelock can reach
it — the agent sees `<bundle-ip>:9099` refuse the connection
even though TSI's allowlist permits the IP. The agent's
environment carries IP-literal URLs (e.g.
`HTTPS_PROXY=http://<bundle-ip>:8888`).
- The agent guest image is produced from the existing `Dockerfile`
via `smolvm pack create` → `.smolmachine` artifact, then loaded
into smolvm via `machine create --from <path>`. The image build
step is part of `prepare`, analogous to
`docker_mod.build_image`.
- The PRD 0022 sandbox-escape suite, run with
`BOT_BOTTLE_BACKEND=smolmachines`, passes locally on a
smolmachines-capable host. The suite is updated to skip cleanly
on hosts that can't reach smolmachines (same shape as the
existing `GITEA_ACTIONS == "true"` skip), not to fail.
- README + `CLAUDE.md` updated to document the env-var selection,
the macOS-only scope for v1, and the `smolvm` install
prerequisite.
## Non-goals
- **No Linux KVM support shipped in this PRD.** smolmachines works
on Linux via KVM, but the abstraction win is biggest on macOS
where Docker's shared-VM topology hurts most. Linux can come
later behind the same selector.
- **No removal of the Docker backend.** Both backends ship side by
side. Selection stays env-driven; the manifest does not gain a
`backend` field.
- **No default-backend change.** `docker` remains the default
value of `BOT_BOTTLE_BACKEND`; smolmachines is strictly
opt-in until it has been load-bearing on at least one operator's
workflow for a release cycle.
- **No `--outbound-localhost-only`.** That TSI flag opens the
entire `127.0.0.0/8` range and is the loopback gap the original
draft of this PRD called out. Use `--allow-cidr <bundle-ip>/32`
instead so the agent reaches one IP and one IP only.
- **No gvproxy.** Rejected after the chunk-1 spike against the
real smolvm CLI: smolvm 0.8.0 exposes no virtio-net-over-unixgram
attachment. Adopting gvproxy would have required dropping smolvm
and driving Virtualization.framework via PyObjC; the TSI
single-IP approach gives the same property at a fraction of the
cost.
- **No host bind mounts.** The smolmachines research note flagged
that `-v HOST:GUEST` mounts via virtiofs would defeat the
isolation goal. The manifest already has no concept of host
mounts; this PRD does not introduce one. If a future PRD wants
agent-side access to host files, it must come through a
controlled channel (vsock relay, OCI overlay, supervise sidecar
endpoint).
- **No HTTP API mode.** `smolvm serve` is the long-term-clean
control plane, but v1 drives smolmachines via CLI subprocess
invocations — the lower-overhead first iteration the research
note already endorses.
- **No custom kernel / initrd.** smolmachines uses libkrunfw
only; the agent image is an OCI ref, not a kernel + rootfs pair.
- **No warm-pool or snapshot/restore.** Each bottle gets a fresh
microVM; cold-start cost is paid up front.
- **No supervise/agent-credential rewrites for the new backend.**
Provisioning logic ports as-is; only the *transport* (host-side
port URLs instead of in-network DNS names) changes.
## Scope
### In scope
- New `bot_bottle/backend/smolmachines/` subpackage with the
full set of `BottleBackend` overrides.
- Smolfile generator (TOML) emitting the smolvm 0.8.0 schema:
top-level `image`, `entrypoint`, `cmd`, `env = [...]`,
`[network] allow_cidrs = ["<bundle-ip>/32"]`. (The renderer
that chunk 1 shipped under the gvproxy design — `name=`,
`[[net]]` — gets rewritten as part of this chunk plan.)
- A host-side sidecar-bundle lifecycle manager that brings up
one container per bottle on a dedicated per-bottle docker
bridge with a pinned IP (`--ip <bundle-ip>`), waits for the
daemons to bind their ports, and tears it down with the bottle.
This backend depends on PRD 0024's bundle image; it does not
own the bundle's Dockerfile or init.
- Per-bottle CA install path: the bundle's CA cert lands inside
the microVM via `smolvm machine exec` after start
(analogous to the existing `provision_ca` for Docker).
- Per-bottle docker bridge: a `bot-bottle-bundle-<slug>`
network with a /24 subnet derived from the slug hash; the
bundle gets a pinned IP at `.2` (gateway is `.1`). Pinning the
IP at start time avoids a race between the bundle's IP being
assigned and the Smolfile being written.
- TSI policy: the Smolfile sets `[network] allow_cidrs =
["<bundle-ip>/32"]` and nothing else. The agent can reach the
bundle's IP (any port) and nothing else; no DNS resolution is
available inside the guest, so the agent uses IP-literal URLs.
- Bundle bind addresses: egress binds `127.0.0.1:9099` inside
the bundle (pipelock-only); pipelock / git-gate / supervise
bind `0.0.0.0` so the agent can reach them. This is the
port-granularity TSI's IP-only allowlist doesn't provide.
PRD 0024's bundle init may need a config knob for this;
raised as open question 4.
- Preflight `smolvm` check: if the user selects this backend and
`smolvm` isn't on `$PATH`, die with an install pointer (brew tap
+ version pin TBD in implementation; see open question 2).
- Manifest validation: refuse any bottle field this backend can't
honor (today there are none, since the Docker backend already
rejects host mounts; this is a forward-compat check).
- Tests:
- Smoke unit-level test: Smolfile renderer produces the
expected TOML for a fixture bottle (smolvm 0.8.0 shape).
- Integration test: `prepare → launch → exec("echo hi") →
teardown` on a smolmachines-capable host (skips otherwise
via the same env/platform gate the Docker integration tests
use).
- PRD 0022 suite, re-run with the env var flipped, passes.
### Out of scope
- VM image caching across bottles (each prepare rebuilds from the
OCI archive; layer reuse is whatever smolmachines provides).
- Cross-host bottle relocation (the OCI archive is local-only).
- Operator-facing knobs for vCPU / memory / overlay size (use
sensible defaults; expose as manifest fields in a later PRD if
needed).
- Integration with the `supervise` plane's permission-prompt UX
beyond port plumbing; supervise already speaks HTTP and binds
to whatever address the backend hands it.
## Proposed Design
### Backend layout
```
bot_bottle/backend/smolmachines/
  __init__.py             re-exports SmolmachinesBottleBackend
  backend.py              SmolmachinesBottleBackend façade
  bottle.py               SmolmachinesBottle (exec_claude / exec / cp_in / close)
  bottle_plan.py          SmolmachinesBottlePlan + .print()
  bottle_cleanup_plan.py  SmolmachinesBottleCleanupPlan
  prepare.py              resolve_plan(spec, stage_dir, ...) -> SmolmachinesBottlePlan
  launch.py               @contextmanager launch(plan) -> SmolmachinesBottle
  cleanup.py              prepare_cleanup / cleanup / list_active
  smolfile.py             bottle_plan_to_smolfile(...) -> dict + render
  sidecar_bundle.py       host-side bundle lifecycle (per-bottle docker bridge + pinned IP)
  smolvm.py               thin subprocess wrapper: machine create/start/exec/stop, pack create
  util.py                 slugify, subnet derivation, OCI archive helpers
  provision/              ca.py, prompt.py, skills.py, git.py, supervise.py
```
Note what's NOT here vs. the original draft: `gvproxy.py`,
`vfkit_attach.py`. The gvproxy design needed both; the TSI single-IP
design needs neither.
### Network + egress topology
```
┌── macOS host ─────────────────────────────────────────────────────┐
│                                                                   │
│  ┌── per-bottle docker bridge bot-bottle-bundle-<slug> ────────┐  │
│  │  subnet: 192.168.X.0/24  (X = hash(slug) mod 254)           │  │
│  │                                                             │  │
│  │  ┌── bundle container (pinned --ip 192.168.X.2) ─────────┐  │  │
│  │  │  init.py (PRD 0024 Python supervisor)                 │  │  │
│  │  │    ├─ pipelock            (binds 0.0.0.0:8888)        │  │  │
│  │  │    ├─ egress (mitmproxy)  (binds 127.0.0.1:9099)      │  │  │
│  │  │    ├─ git-gate            (binds 0.0.0.0:9418)        │  │  │
│  │  │    └─ supervise           (binds 0.0.0.0:9100)        │  │  │
│  │  │  Internal-only egress is unreachable from outside    │  │  │
│  │  │  the bundle even though TSI permits the IP.          │  │  │
│  │  └───────────────────────────────────────────────────────┘  │  │
│  └──────────────────────────────────────────────────────┬──────┘  │
│                                                         │         │
│  ┌── microVM (per bottle, libkrun via smolvm) ──────────▼──┐      │
│  │  Smolfile: [network] allow_cidrs = ["192.168.X.2/32"]   │      │
│  │  env: HTTPS_PROXY=http://192.168.X.2:8888               │      │
│  │       GIT_GATE_URL=git://192.168.X.2:9418 (cond.)       │      │
│  │       MCP_SUPERVISE_URL=http://192.168.X.2:9100 (cond.) │      │
│  │  No other host reachable; TSI denies any connect()     │      │
│  │  that isn't to 192.168.X.2. No DNS inside the guest    │      │
│  │  (no resolver in the allowlist).                        │      │
│  └─────────────────────────────────────────────────────────┘      │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘
```
What the guest can reach, exhaustively: **only `<bundle-ip>` on
ports the bundle binds to 0.0.0.0**. Egress's 127.0.0.1-only bind
makes it bundle-internal; host loopback / LAN / public internet
direct are all refused by TSI's allowlist.
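
The per-bottle subnet in the diagram above can be derived deterministically from the slug. A minimal sketch follows, assuming the `sha256(slug) % 254` rule stated in this PRD; how the reserved octet 17 is skipped is not specified here, so the bump-by-one below is an assumption, and `bundle_subnet` is a hypothetical name (the real helper lives in `util.py`).

```python
import hashlib

RESERVED_OCTET = 17  # the "docker-default 17" this PRD says to skip


def bundle_subnet(slug: str) -> tuple[str, str]:
    """Return (subnet_cidr, bundle_ip) for a bottle slug."""
    digest = hashlib.sha256(slug.encode("utf-8")).digest()
    octet = int.from_bytes(digest, "big") % 254
    if octet == RESERVED_OCTET:
        # Assumption: step to the next octet rather than rehashing.
        octet = (octet + 1) % 254
    # The gateway takes .1, so the bundle is pinned at .2.
    return f"192.168.{octet}.0/24", f"192.168.{octet}.2"


subnet, bundle_ip = bundle_subnet("demo-bottle")
print(subnet, bundle_ip)
```

Because the mapping is a pure function of the slug, the Smolfile can be written before the bundle container exists. Two slugs can still hash to the same octet, which is why the existing util tests cover collision avoidance.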
Three changes vs. the Docker backend:
1. **One sidecar container per bottle, not four.** Same bundle
image PRD 0024 ships for the docker backend.
2. **Sidecar container is on a per-bottle docker bridge with a
pinned IP**, reached directly by the smolvm guest's allowed
/32 — no localhost port allocation, no userspace TCP/IP stack
in the middle.
3. **The agent dials IP literals, not hostnames.** TSI doesn't
filter DNS at the protocol level, and we don't put DNS
resolvers in the allowlist, so name resolution is denied by
construction.
### Lifecycle
`SmolmachinesBottleBackend.prepare(spec, stage_dir)`:
1. Cross-backend validation via `BottleBackend._validate` (skills,
git identity files).
2. Derive a per-bottle docker subnet from `sha256(slug) % 254`
(skipping the docker-default 17): `192.168.X.0/24`. The bundle
IP is always `192.168.X.2` (gateway is `.1`).
3. Resolve the agent guest image: `docker build` the existing
`Dockerfile`, then convert the resulting image into a
`.smolmachine` artifact. Empirically `smolvm pack create` only
reads OCI registry refs — it rejects `docker-daemon://`,
`oci-layout://`, `docker-archive:` tarballs, and every other
transport tested. The conversion path is a registry hop: bring
up an ephemeral `registry:2.8.3` container bound to
`127.0.0.1:<random>`, `docker tag` + `docker push` into it,
`smolvm pack create --image localhost:<port>/bot-bottle:<id>`,
tear down the registry. The `.smolmachine` is cached under
`~/.cache/bot-bottle/smolmachines/` keyed by the docker
image ID, so Dockerfile changes invalidate the cache and
unchanged rebuilds skip the whole pipeline.
4. Render the per-bottle Smolfile to `stage_dir/smolfile.toml`
using smolvm 0.8.0's schema:
- `image` / `entrypoint` / `cmd` — bundled into the
`.smolmachine` from the previous step (one Smolfile, one
artifact).
- `env = [...]` — `HTTPS_PROXY`, `NO_PROXY`, `NODE_EXTRA_CA_CERTS`,
etc., all pointing at IP-literal URLs (`http://192.168.X.2:8888`).
- `[network] allow_cidrs = ["192.168.X.2/32"]` — TSI's single
/32 allowlist.
5. Resolve the in-VM CA paths so launch knows where to copy
pipelock's CA after start.
6. Return a `SmolmachinesBottlePlan` carrying the slug, bundle
subnet/IP, `.smolmachine` artifact path, Smolfile path, and
bundle run spec.
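
The registry hop in step 3 can be sketched as an ordered list of argv vectors. This is an illustration under assumptions: the container name is hypothetical, the tag is a shortened image ID (Docker tags cannot contain `:`), and any `smolvm pack create` option for choosing the output path is omitted because it is not documented here.

```python
def registry_hop_commands(image_id: str, port: int) -> list[list[str]]:
    """Return the docker -> ephemeral registry -> smolvm command sequence."""
    registry_name = "bot-bottle-pack-registry"  # hypothetical container name
    # Drop the "sha256:" prefix so the ID can be used as a tag.
    tag = image_id.removeprefix("sha256:")[:12]
    ref = f"localhost:{port}/bot-bottle:{tag}"
    return [
        # The registry image listens on 5000; publish it on host loopback only.
        ["docker", "run", "-d", "--name", registry_name,
         "-p", f"127.0.0.1:{port}:5000", "registry:2.8.3"],
        ["docker", "tag", image_id, ref],
        ["docker", "push", ref],
        ["smolvm", "pack", "create", "--image", ref],
        # Teardown of the ephemeral registry.
        ["docker", "rm", "-f", registry_name],
    ]


for argv in registry_hop_commands("sha256:0123456789abcdef0123", 5055):
    print(" ".join(argv))
```

In a real implementation the final teardown command would run in a `finally` block so a failed push or pack does not leak the registry container.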
`SmolmachinesBottleBackend.launch(plan)`:
1. Create the per-bottle docker bridge network
(`bot-bottle-bundle-<slug>` with the resolved subnet) and
start the sidecar bundle container with `docker run --network
... --ip <bundle-ip> ...`. Wait for its daemons to bind:
pipelock on 8888, git-gate on 9418 (conditional), supervise
on 9100 (conditional). Register teardown callbacks.
2. `smolvm machine create --from <stage>/agent.smolmachine
--smolfile <stage>/smolfile.toml <name>` and
`smolvm machine start --name <name>`. The Smolfile's TSI
allowlist gates outbound to the bundle's /32; libkrun's TSI
layer enforces it.
3. Provisioning: CA install → prompt → skills → git → supervise
config, each via `smolvm machine exec` / `smolvm machine cp`.
4. Yield a `SmolmachinesBottle` whose `exec_claude` / `exec` /
`cp_in` all funnel through `smolvm machine exec` /
`smolvm machine cp`.
5. Teardown: stop and delete the VM → stop + remove the bundle
container → remove the per-bottle docker network.
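
The launch and teardown ordering above maps naturally onto `contextlib.ExitStack`, whose callbacks run in reverse registration order. The sketch below records step names in a list instead of calling docker or smolvm; every step label is a placeholder.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def launch(log: list[str]):
    """Bring resources up in order; ExitStack unwinds them last-in, first-out."""
    with ExitStack() as stack:
        log.append("create network")
        stack.callback(log.append, "remove network")

        log.append("start bundle")
        stack.callback(log.append, "stop + remove bundle")

        log.append("create + start VM")
        stack.callback(log.append, "stop + delete VM")

        log.append("provision")
        yield "bottle"  # placeholder for the SmolmachinesBottle handle


steps: list[str] = []
with launch(steps) as bottle:
    steps.append("agent runs")

for step in steps:
    print(step)
```

Registering each teardown immediately after its bring-up step means a failure halfway through launch still unwinds whatever was already created.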
### Data model
No manifest schema change. `bottles[]` continues to carry
`egress.allowlist`, `env`, `git`, `skills` references, etc.; the
smolmachines backend reads the same fields as the docker backend.
`egress.allowlist` is enforced by pipelock inside the bundle
(unchanged from the docker backend); the guest has no DNS resolver
in TSI's allowlist, so an agent that tries to dial an arbitrary
hostname can't resolve it in the first place — the DNS-exfil
attack from PRD 0022 test 4 is blocked at the resolver step.
The `BottleSpec` dataclass and the `Bottle` ABC do not change.
### Selection wiring
In `bot_bottle/backend/__init__.py`:
```python
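from typing import Any

from .docker import DockerBottleBackend
from .smolmachines import SmolmachinesBottleBackend

# Registry keyed by the value of BOT_BOTTLE_BACKEND.
# BottleBackend is the existing ABC already in scope in this module.
_BACKENDS: dict[str, BottleBackend[Any, Any]] = {
    "docker": DockerBottleBackend(),
    "smolmachines": SmolmachinesBottleBackend(),
}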
```
The existing "unknown backend" `die()` path stays as-is.
### External dependencies
- `smolvm` CLI binary on `$PATH` (one new external dep, gated by
the preflight check). Pinned version policy is deferred to the
open questions; v1 reads `smolvm --version` and refuses to launch
outside a known-good range (currently 0.8.x).
- No `gvproxy` dep (the original draft listed it; dropped after
the chunk-1 spike).
- No `pyobjc-framework-Virtualization` dep (dropped from the
original draft for the same reason).
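- No new pure-Python packages: subprocess plus the stdlib. Note
that `tomllib` only parses TOML (it has no writer), so the
Smolfile renderer emits the text itself; `tomllib` can be used to
read the result back in tests.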
### Acceptance test plan
- **Unit (smolfile):** `tests/unit/test_smolfile.py` verifies the
renderer produces the expected TOML for a fixture bottle in
smolvm 0.8.0's schema — top-level `image` / `entrypoint` /
`cmd` / `env`, plus `[network] allow_cidrs = ["<bundle-ip>/32"]`
and nothing else under `[network]`.
- **Unit (subnet derivation):** the existing
`test_smolmachines_util.py` covers the per-bottle subnet hash
+ collision-avoidance and stays as-is.
- **Integration smoke:** `tests/integration/test_smolmachines_smoke.py`
with `prepare → launch → exec → teardown`, guarded by a
`smolvm` presence check plus a macOS platform check (Linux/KVM is
a non-goal for v1).
- **Localhost-reach probe:** a focused integration test that
brings up a bottle, has the host bind a test service on
`127.0.0.1:<unused-port>`, and asserts the in-bottle agent
cannot connect to it. This is the regression test for the
exact gap `--outbound-localhost-only` would have introduced —
with `--allow-cidr <bundle-ip>/32` only, the probe must fail.
- **Egress-port-bypass probe:** also brings up a bottle and
asserts the in-bottle agent's connect to `<bundle-ip>:9099`
(egress's port) is refused — confirming the bundle-internal
bind of egress to `127.0.0.1` works as the port-granularity
layer TSI doesn't provide.
- **PRD 0022 re-run:** with `BOT_BOTTLE_BACKEND=smolmachines`,
all five attack categories return sandbox-block markers and the
suite passes. The test code does not change beyond the env-var
flip — that's the contract the PRD 0022 abstraction was
designed for.
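
The skip guard for the integration tests above could look like the following. It uses the predicate from open question 6 and the stdlib `unittest`; whether the repo's tests use `unittest` or another runner is an assumption, and the class and function names are hypothetical.

```python
import platform
import shutil
import unittest


def smolvm_available() -> bool:
    """True when a smolvm binary is on PATH."""
    return shutil.which("smolvm") is not None


def should_skip(system: str, have_smolvm: bool) -> bool:
    """Skip unless smolvm is installed and the host is macOS."""
    return not (have_smolvm and system == "Darwin")


@unittest.skipIf(
    should_skip(platform.system(), smolvm_available()),
    "smolmachines backend needs smolvm on a macOS host",
)
class SmolmachinesSmokeTest(unittest.TestCase):
    def test_placeholder(self):
        # Placeholder body; the real test would run
        # prepare -> launch -> exec("echo hi") -> teardown.
        self.assertTrue(True)
```

Keeping the predicate a pure function of its two inputs makes the guard itself unit-testable on any host, including the Linux CI runner where the suite will skip.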
## Sizing — into chunks
PRD 0024's bundle image is a prerequisite — this PRD assumes
`bot-bottle-sidecars:<pinned>` is available when chunk 3 lands.
1. **Backend skeleton + selection + Smolfile/gvproxy renderers.**
*Shipped (PR #62), but under the now-rejected gvproxy design.*
The Smolfile renderer emits `name = …` / `[[net]]` instead of
smolvm 0.8.0's `image` / `[network] allow_cidrs`. The gvproxy
renderer is dead. Chunk 2 rewrites the Smolfile renderer and
deletes `gvproxy_config.py` / its tests.
2. **VM lifecycle + bundle bringup + Smolfile rewrite.**
`smolvm.py` subprocess wrapper, prepare-time image conversion
(`smolvm pack create` → `.smolmachine`), per-bottle docker
bridge + bundle container with pinned IP, launch path that
starts the bundle and brings up the VM (`smolvm machine create
--from --smolfile`), exec into the VM, tear everything down.
Smoke integration test: `exec("echo hi")` inside a started
VM. Includes the localhost-reach probe + egress-port-bypass
probe from the acceptance plan. The chunk-1 Smolfile renderer
gets rewritten to the smolvm 0.8.0 schema; `gvproxy_config.py`
and `gvproxy.py` (if any) get deleted.
3. **Bundle bind-address mitigation.** Update PRD 0024's bundle
init to bind egress on `127.0.0.1:9099` instead of `0.0.0.0`
(or expose a config knob — open question 4). Reverify the
egress-port-bypass probe. Pipelock / git-gate / supervise
continue to bind `0.0.0.0`.
4. **Provisioning parity with Docker.** CA install via
`smolvm machine cp`, prompt/skills/.git copy-in, supervise
MCP config. End-to-end `start` works for a real agent
manifest.
5. **PRD 0022 sandbox-escape suite green.** Skip-guard update,
small adjustments to test helpers if any (the test uses
`bottle.exec(script)` and inspects `returncode` + body for
sandbox markers — should be transport-agnostic, but verify).
Document the macOS-only scope in README.
## Open questions
1. **~~VMM choice~~ Resolved.** Chunk-1 spike against `smolvm
0.8.0` confirmed there's no virtio-net-over-unixgram option;
the gvproxy design isn't viable on top of smolvm. Resolved
by switching to TSI `--allow-cidr <bundle-ip>/32` + bundle
bind-address mitigation; smolvm stays as the VMM. See the
"Design pivot from the first draft" section.
2. **`smolvm` install policy.** Pin via brew / `curl install.sh`,
or vendor a binary in the repo. v1 likely runs
`smolvm --version` at preflight and accepts a documented range
(currently 0.8.x). The
`curl -sSL https://smolmachines.com/install.sh | sh` path is
what the operator used; document it in the README.
3. **CA install inside the agent guest.** Two paths: bake at
prepare time (one `.smolmachine` artifact per CA fingerprint,
big cache key) vs. inject at start time via `smolvm machine
cp` after the VM is up. PRD 0006 chose the runtime path for
Docker (docker-cp + `update-ca-certificates`); smolvm has the
same shape via `machine cp` + `machine exec`. Default to
runtime injection.
4. **Bundle bind-address knob.** PRD 0024's bundle currently runs
all four daemons under one supervisor with daemon argv
hardcoded. To make egress bind `127.0.0.1:9099` instead of
`0.0.0.0:9099`, either: (a) edit the supervisor's
`_DAEMONS` entry to pass a `--listen-host 127.0.0.1` flag to
mitmdump, OR (b) introduce a per-daemon `bind_localhost`
knob the renderer can set. Option (a) is simpler and matches
that egress is bundle-internal regardless of backend; resolve
in chunk 3.
5. **`bottle.exec(script)` exit-code fidelity.** The PRD 0022 test
suite reads `returncode` + stdout + stderr from
`ExecResult`. Confirm `smolvm machine exec` propagates exit
codes and separated streams. The CLI help mentions a
`--stream` flag for streaming output; behavior under default
(non-stream) mode is what we want — verify in chunk 2.
6. **CI gating.** Gitea's act_runner is Linux without nested KVM,
so this backend's integration tests will skip there for the
same structural reason the Docker bringup tests do. The skip
predicate becomes `not (smolvm_available() and
platform.system() == "Darwin")`. CI coverage for this backend
will come from local runs on the maintainer's macOS host
until a Darwin runner is wired up; ack that as a known gap.
7. **Active bottle discovery.** Docker uses container labels to
enumerate active bottles (`list_active` queries the daemon).
The microVM enumeration story is `smolvm machine ls --json`;
the plan is to filter on a deterministic name prefix
`bot-bottle-<slug>` + cross-reference with on-disk metadata
under `state/<slug>/`.
8. **Loopback scoping (Docker Desktop pivot).** The original
   design pinned the bundle at a docker bridge IP and set TSI's
   allowlist to `<bundle-ip>/32`. On Docker Desktop / macOS the
   daemon runs inside its own Linux VM, so bridge IPs aren't
   reachable from macOS networking and TSI's syscall
   impersonation can't reach them. Resolution: publish each
   agent-facing bundle port on host loopback
   (`-p 127.0.0.1::<port>`) and set TSI to `127.0.0.1/32`.
   **This widens the TSI allowlist to anything bound to macOS's
   loopback**: postgres, dev servers, other bottles' published
   ports, mDNSResponder, etc.

   **Fix + smolvm 0.8.0 workaround.** Allocate each bottle a
   unique loopback alias (`127.0.0.16` .. `127.0.0.31`), bind
   bundle port-forwards to it, and set TSI's allowlist to that
   alias's /32. The agent can only reach its own bundle; other
   bottles' ports, host loopback services, and the internet are
   all denied.

   smolvm 0.8.0 silently drops `--allow-cidr` when combined
   with `--from <smolmachine>` (verified empirically:
   `agent.config.json` shows `allowed_cidrs:null` despite the
   flag). The launcher patches smolvm's persistent state DB
   (`~/Library/Application Support/smolvm/server/smolvm.db`,
   `vms.data` BLOB) between `machine create` and `machine
   start` to set the allowlist directly. smolvm reads the DB
   at start, so TSI enforces. Tested end-to-end: VM → `127.0.0.1`
   = "Permission denied"; VM → `<alias>:<bundle-port>` =
   connects.

   Other paths that were tried and didn't work: `machine update
   --allow-cidr` doesn't exist; stop-edit-`agent.config.json`-
   restart fails (file removed on stop); `--smolfile` is mutually
   exclusive with `--from`; `--image localhost:<port>/...` fails
   because smolvm's pull agent can't reach host loopback during
   pull. When smolvm honors `--allow-cidr` with `--from`
   upstream, the DB patch becomes redundant and can be removed.
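
The alias range in open question 8 gives sixteen addresses, so at most sixteen bottles can run concurrently under this scheme. The PRD does not say how an alias is chosen; the first-free scan below is an assumption, and `allocate_alias` is a hypothetical name.

```python
import ipaddress

ALIAS_FIRST = ipaddress.IPv4Address("127.0.0.16")
ALIAS_LAST = ipaddress.IPv4Address("127.0.0.31")


def allocate_alias(in_use: set[str]) -> str:
    """Return the lowest loopback alias in the range not already taken."""
    for value in range(int(ALIAS_FIRST), int(ALIAS_LAST) + 1):
        candidate = str(ipaddress.IPv4Address(value))
        if candidate not in in_use:
            return candidate
    raise RuntimeError("no free loopback alias in 127.0.0.16 .. 127.0.0.31")


taken = {"127.0.0.16", "127.0.0.17"}
alias = allocate_alias(taken)
print(alias, f"allowlist = {alias}/32")
```

The set of aliases in use would have to come from wherever active bottles are tracked; that source is left open here.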
## References
- `docs/research/agent-vm-isolation.md` — describes the
gvproxy + `VZFileHandleNetworkDeviceAttachment` path. The
current design no longer needs that recipe (the TSI single-IP
approach replaced it after the chunk-1 spike); kept for
historical context if a future operator needs to drop smolvm
and own the VM lifecycle directly.
- `docs/research/smolmachines-as-vm-backend.md` — evaluation of
smolmachines as the VM lifecycle wrapper. The research note's
TSI-bad-due-to-loopback-gap argument turned out to apply only
to `--outbound-localhost-only`, not to TSI generally; this PRD
uses `--allow-cidr <bundle-ip>/32` instead, sidestepping the
gap.
- `docs/research/agent-sandbox-landscape.md` — identifies
`"runtime": "microvm"`-style opt-in as the borrowable idea;
smolmachines is the concrete implementation.
- PRD 0003 (`docs/prds/0003-bottle-backend-abstraction.md`) — the
backend abstraction this PRD is the first non-Docker consumer
of.
- PRD 0017 (`docs/prds/0017-egress-proxy-via-mitmproxy.md`) — the
egress sidecar the bundle reuses verbatim as pipelock's internal
upstream.
- PRD 0022
(`docs/prds/0022-sandbox-escape-integration-test.md`) — the
acceptance gate for this PRD; the suite already runs through
`get_bottle_backend()` so the env-var flip is the only change
needed to exercise the smolmachines path.
- PRD 0024
(`docs/prds/0024-consolidate-sidecar-bundle.md`) — defines the
single bundle image (`bot-bottle-sidecars`) this PRD
consumes. Prerequisite for chunk 3 of this PRD.