docs(prd-0023): smolmachines bottle backend #53

Merged
didericis merged 3 commits from prd-0023-smolmachines-backend into main 2026-05-27 02:16:12 -04:00
Owner

Summary

New PRD for a second concrete BottleBackend, SmolmachinesBottleBackend, opt-in via CLAUDE_BOTTLE_BACKEND=smolmachines. macOS-first; libkrun microVMs driven through the smolvm CLI; TSI + --outbound-localhost-only + vsock DNS filter as the egress primitive. Docker stays the default and ships unchanged.

The interesting topology shift: the four sidecars (pipelock / egress / git-gate / supervise) move from sibling containers on an internal Docker network to host processes on per-bottle loopback ports, plumbed into the guest via Smolfile env (HTTPS_PROXY=http://127.0.0.1:<p1>, etc). That isolates each bottle behind hardware page tables via Hypervisor.framework instead of sharing Docker Desktop's VM.
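
A minimal sketch of the per-bottle port allocation described above, assuming the sidecars bind to kernel-assigned loopback ports and that `<p1>` is pipelock's port. The helper names `allocate_loopback_ports` and `guest_env` are hypothetical, and the Smolfile syntax itself is not shown because the PR does not spell it out.

```python
import socket
from contextlib import ExitStack

# Sidecar names taken from the PR description.
SIDECARS = ("pipelock", "egress", "git-gate", "supervise")


def allocate_loopback_ports(names):
    """Reserve one free 127.0.0.1 port per sidecar name (hypothetical helper).

    All sockets stay open until every port has been assigned, so the kernel
    cannot hand out the same port twice within one call.
    """
    ports = {}
    with ExitStack() as stack:
        for name in names:
            sock = stack.enter_context(
                socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            )
            sock.bind(("127.0.0.1", 0))  # port 0 asks the kernel for a free port
            ports[name] = sock.getsockname()[1]
    return ports


def guest_env(ports):
    """Build the env mapping a Smolfile renderer could emit for the guest.

    Assumption: HTTPS_PROXY points at pipelock's port.
    """
    return {"HTTPS_PROXY": f"http://127.0.0.1:{ports['pipelock']}"}


if __name__ == "__main__":
    ports = allocate_loopback_ports(SIDECARS)
    print(ports)
    print(guest_env(ports))
```

The ports are released before the sidecars start, so another process could claim one in between. A real allocator would have to close that window, for example by handing the bound socket to the sidecar directly.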

PRD 0022's sandbox-escape suite is the acceptance gate: it already runs through get_bottle_backend(), so flipping CLAUDE_BOTTLE_BACKEND=smolmachines is the only change required to validate the new backend against all five attack categories.
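
The env-var flip can be pictured roughly as below. This is a sketch, not the repository's code: only `get_bottle_backend()`, `BottleBackend`, `SmolmachinesBottleBackend`, and `CLAUDE_BOTTLE_BACKEND` come from the PR. The `DockerBottleBackend` class name, the `environ` parameter, and the registry dict are assumptions made for illustration.

```python
import os


class BottleBackend:
    """Stand-in for the abstract backend interface."""
    name = "abstract"


class DockerBottleBackend(BottleBackend):  # hypothetical class name
    name = "docker"


class SmolmachinesBottleBackend(BottleBackend):
    name = "smolmachines"


# Hypothetical registry mapping env-var values to backend classes.
_BACKENDS = {
    "docker": DockerBottleBackend,
    "smolmachines": SmolmachinesBottleBackend,
}


def get_bottle_backend(environ=None):
    """Return the backend named by CLAUDE_BOTTLE_BACKEND, defaulting to Docker."""
    env = os.environ if environ is None else environ
    choice = env.get("CLAUDE_BOTTLE_BACKEND", "").strip().lower() or "docker"
    if choice not in _BACKENDS:
        raise ValueError(f"unknown CLAUDE_BOTTLE_BACKEND: {choice!r}")
    return _BACKENDS[choice]()


if __name__ == "__main__":
    print(get_bottle_backend({}).name)
    print(get_bottle_backend({"CLAUDE_BOTTLE_BACKEND": "smolmachines"}).name)
```

Because a test suite written against `get_bottle_backend()` never names a concrete class, the same tests exercise whichever backend the environment selects.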

Sized

5 chunks: skeleton + selection + Smolfile renderer → VM lifecycle + OCI archive build → host-side sidecars (port allocator, teardown ordering) → provisioning parity (CA, prompt, skills, .git, supervise) → PRD 0022 green.

Open questions

Seven. Most load-bearing: sidecar locality (host process vs in-VM init, default A), CA-install timing inside the OCI overlay, exec exit-code fidelity through smolvm machine exec, and CI gating on Gitea (act_runner can't run smolmachines, so macOS coverage comes from local runs until a Darwin runner exists).

didericis added 1 commit 2026-05-26 23:19:23 -04:00
docs(prd-0023): smolmachines bottle backend
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m7s
a2ac124d5c
Specs a second concrete BottleBackend selectable via
CLAUDE_BOTTLE_BACKEND=smolmachines: per-agent libkrun microVM on
macOS, sidecars relocated to host-side loopback ports plumbed via
Smolfile env, PRD 0022's sandbox-escape suite as the acceptance
gate (the env-var flip is the only change required). Docker
backend ships unchanged and remains default.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis added 1 commit 2026-05-26 23:41:35 -04:00
docs(prd-0023): make gvproxy the network primitive; reject TSI
test / unit (pull_request) Successful in 19s
test / integration (pull_request) Successful in 1m9s
041da1d7af
TSI's --outbound-localhost-only is permissive on all of
127.0.0.0/8 with no destination-port filter, so any host
loopback service (local Postgres, IDE plugins, another bottle's
sidecar) is reachable from the guest. That's the wrong default
for the malicious-agent threat model.

Reworked the network design around gvproxy + VFKT unixgram
attachment: the guest gets a virtio-net device, gvproxy is the
userspace TCP/IP stack on the host side, and the only thing
reachable from the guest is the explicit port-forward list
(typically just pipelock). Host LAN, host loopback, and the
public internet directly are gone by construction.

VMM choice (smolmachines vs PyObjC + Virtualization.framework)
is an open question contingent on whether libkrun's virtio-net
mode lets us point at a custom unixgram socket. Backend name
stays "smolmachines" either way per the original spec.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis added 1 commit 2026-05-26 23:52:01 -04:00
docs(prd-0023): consume PRD 0024's bundle as the single sidecar
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m11s
4e00430c6e
Replace the four host-side sidecar processes (pipelock + egress +
git-gate + supervise) with a single bundled container per bottle,
defined in PRD 0024 and consumed here. egress is internal to the
bundle as pipelock's upstream; only pipelock, git-gate, and
supervise are externally addressable, and only when the bottle
uses them.

gvproxy port_forwards collapse from one-per-process to one-per-
external-port, all pointing into the one bundle container.
Sizing: chunk 3 becomes "sidecar bundle lifecycle" and depends
on PRD 0024 having landed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis merged commit bce1ea21db into main 2026-05-27 02:16:12 -04:00
Collaborator

Follow-up: the gvproxy network design that landed with this PRD has been reversed in PR #63 after chunk-1's empirical spike against smolvm 0.8.0. Short version: smolvm exposes no virtio-net-over-unixgram attachment, so gvproxy can't sit in the middle the way this PRD assumed. The TSI single-IP allowlist (--allow-cidr <bundle-ip>/32, no --outbound-localhost-only) gives the same security property (the agent can reach exactly one IP and nothing else) at a fraction of the code cost.

See PR #63 for the full revision rationale + chunk-shape adjustments. The "Why gvproxy, not TSI" section is gone, replaced by "How TSI's single-IP allowlist achieves the property".
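
To make the difference between the two TSI policies concrete, the sketch below compares the address sets involved using Python's `ipaddress` module. It models CIDR membership only and says nothing about how smolvm enforces the rule. The bundle IP `192.168.64.2` and the helper names are hypothetical.

```python
import ipaddress

# The loopback block the earlier commit describes as fully reachable
# under --outbound-localhost-only.
LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def single_ip_allowlist(bundle_ip):
    """Model the <bundle-ip>/32 allowlist as a one-address network."""
    return ipaddress.ip_network(f"{bundle_ip}/32")


def reachable(dest, allowlist):
    """True if dest falls inside the allowed network."""
    return ipaddress.ip_address(dest) in allowlist


bundle = single_ip_allowlist("192.168.64.2")  # hypothetical bundle IP

if __name__ == "__main__":
    print(LOOPBACK.num_addresses)               # 16777216
    print(bundle.num_addresses)                 # 1
    print(reachable("127.0.0.53", LOOPBACK))    # True
    print(reachable("192.168.64.3", bundle))    # False
```

A /8 leaves every loopback address in scope, which is why unrelated host services were reachable. A /32 contains exactly one address, which matches the "one IP and nothing else" property.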

didericis deleted branch prd-0023-smolmachines-backend 2026-05-27 03:48:18 -04:00