docs(prd-0023): pivot to smolvm + TSI single-IP allowlist #63

Merged
didericis merged 1 commit from prd-0023-revise-option-b into main 2026-05-27 03:54:12 -04:00

Summary

Reverses the network design in PRD 0023 after chunk-1's empirical spike against smolvm 0.8.0 contradicted the research note that motivated the gvproxy path. New design uses smolvm's TSI allowlist with a single /32 (the per-bottle sidecar bundle's pinned docker IP) instead of gvproxy + VZFileHandleNetworkDeviceAttachment.

What changed and why

The original PRD's "why gvproxy, not TSI" argument hinged on TSI's --outbound-localhost-only flag, which permits the whole 127.0.0.0/8 range. Re-examined against the actual smolvm 0.8.0 CLI surface, --allow-cidr <bundle-ip>/32 (without --outbound-localhost-only) gives the same security property the gvproxy design targeted: the agent can reach exactly one IP and nothing else. Direct access to host loopback, the LAN, and the public internet is denied at the VMM layer.
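
As a rough illustration of why a single /32 is stricter than a loopback-wide rule, the sketch below models CIDR allowlist matching with Python's standard `ipaddress` module. It only models the matching semantics and is not smolvm's or TSI's implementation. The `is_allowed` helper and the `172.30.0.2` bundle IP are hypothetical names chosen for the example.

```python
import ipaddress


def is_allowed(dest_ip: str, allow_cidrs: list[str]) -> bool:
    """Return True if dest_ip falls inside any allowlisted CIDR."""
    dest = ipaddress.ip_address(dest_ip)
    return any(dest in ipaddress.ip_network(cidr) for cidr in allow_cidrs)


# Hypothetical pinned bundle IP; the real value comes from the per-bottle bridge.
BUNDLE_IP = "172.30.0.2"
ALLOW = [f"{BUNDLE_IP}/32"]

if __name__ == "__main__":
    for target in (BUNDLE_IP, "172.30.0.3", "127.0.0.1", "192.168.1.10", "8.8.8.8"):
        print(target, "allowed" if is_allowed(target, ALLOW) else "denied")
```

Under this model only the bundle IP matches. A `127.0.0.0/8` rule, by contrast, would match every loopback address, which is the permissiveness the original argument objected to.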

Smolvm also does not expose the virtio-net-over-unixgram attachment that the gvproxy design needs, so the spike forced a choice between dropping smolvm (a DIY VM lifecycle via PyObjC + Virtualization.framework: 100+ lines of code, no machine exec, no OCI image story) and dropping gvproxy. This PRD drops gvproxy.

One concession TSI doesn't directly address

TSI's allowlist is IP-granular, not port-granular, so the agent CAN reach any port on the bundle's IP, including egress (:9099), which is supposed to be pipelock's internal upstream. Mitigation: bind egress to 127.0.0.1:9099 inside the bundle (reachable by pipelock only), and bind pipelock / git-gate / supervise to 0.0.0.0. The agent's connect to <bundle-ip>:9099 is then refused at the socket level even though TSI permits the IP. A new acceptance test (egress-port-bypass probe) locks this in.
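
A minimal sketch of what the egress-port-bypass probe could look like, assuming it runs from the agent's side of the allowlist and only needs a TCP connect check. The function names are hypothetical and not taken from the PRD. The probe treats a timeout the same as a refusal, which a real acceptance test may want to distinguish.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connect to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable networks.
        return False


def egress_port_bypass_blocked(bundle_ip: str, egress_port: int = 9099) -> bool:
    """Pass condition for the probe: the egress port must NOT accept connections."""
    return not port_is_reachable(bundle_ip, egress_port)
```

The probe should pass when egress listens on 127.0.0.1 only, and fail if egress is ever rebound to 0.0.0.0 or the bundle IP.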

Backend layout impact

  • Dropped: gvproxy_config.py, gvproxy.py, vfkit_attach.py. No gvproxy lifecycle, no VZFileHandle plumbing.
  • Dropped: gvproxy dep, pyobjc-framework-Virtualization dep.
  • Kept: everything else. Smolvm stays as the VMM.

Chunk shape impact

  • Chunk 1 (already shipped as PR #62) was built around the gvproxy design. The Smolfile renderer it landed emits name = … / [[net]] instead of smolvm 0.8.0's image / [network] allow_cidrs. The gvproxy_config.py renderer is dead. The former is rewritten and the latter deleted as part of chunk 2's work.
  • Chunk 2 now covers: smolvm subprocess wrapper, prepare-time smolvm pack create → .smolmachine artifact, per-bottle docker bridge + bundle with pinned IP, VM lifecycle, Smolfile renderer rewrite, the two acceptance probes (localhost-reach + egress-port-bypass).
  • Chunk 3 is the bundle's egress-binds-127.0.0.1 change (was previously the "host-side sidecar relocation" chunk; the bundle is already host-side via PRD 0024).
  • Chunks 4-5 unchanged in spirit.
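
To make the renderer rewrite in chunk 2 concrete, here is a hedged sketch of a Smolfile renderer that emits only the two keys this PR spells out: `image` and `[network] allow_cidrs`. The function name, the example image, and the exact TOML layout are assumptions. The real smolvm 0.8.0 schema also has entrypoint / cmd / env, which are left out here because their shapes are not given.

```python
import ipaddress
import json


def render_smolfile(image: str, bundle_ip: str) -> str:
    """Render a minimal Smolfile whose allowlist holds a single /32 entry."""
    # Validate early: raises ValueError if bundle_ip is not a valid IP address.
    addr = ipaddress.ip_address(bundle_ip)
    cidr = f"{addr}/32"
    # json.dumps yields a double-quoted, escaped string, which is also a valid
    # TOML basic string for ordinary image names and CIDRs.
    lines = [
        f"image = {json.dumps(image)}",
        "",
        "[network]",
        f"allow_cidrs = [{json.dumps(cidr)}]",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Both values are placeholders for illustration.
    print(render_smolfile("example/agent:latest", "172.30.0.2"))
```

Keeping the allowlist to exactly one rendered entry makes it easy for a unit test to assert that no second CIDR slips into the file.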

Open questions resolved

  • #1 VMM choice: resolved. Smolvm stays; gvproxy goes.
  • Two new entries: bundle bind-address knob (where to wire the egress 127.0.0.1 bind) and bottle.exec exit-code fidelity probe (verify in chunk 2).

Followup

I'll leave a comment on the merged PR #53 pointing at this revision so anyone landing on the old PR sees the design pivot.

didericis added 1 commit 2026-05-27 03:47:33 -04:00
docs(prd-0023): pivot to smolvm + TSI single-IP allowlist
test / unit (pull_request) Successful in 22s
test / integration (pull_request) Successful in 43s
5929caa219
Chunk-1's empirical spike against smolvm 0.8.0 contradicted the
research note that motivated the gvproxy network design: smolvm
exposes no virtio-net-over-unixgram attachment. The first draft's
"why gvproxy, not TSI" argument turns out to apply only to
`--outbound-localhost-only`, not to TSI generally.

New design:

- Bundle (PRD 0024) runs on a dedicated per-bottle docker bridge
  with a pinned IP. Smolfile sets `[network] allow_cidrs =
  ["<bundle-ip>/32"]` and nothing else. Agent can reach the bundle
  and nothing else — host loopback, LAN, public internet directly
  are all refused at the VMM (TSI) layer.
- Bind-address mitigation: egress binds 127.0.0.1:9099 inside the
  bundle (pipelock-internal); pipelock / git-gate / supervise
  bind 0.0.0.0 so the agent (across the TSI allowlist) can reach
  them. This is the port-granularity TSI's IP-only allowlist
  doesn't provide.
- Smolfile renderer rewritten in chunk 2 to smolvm 0.8.0's actual
  schema (image / entrypoint / cmd / env / [network] allow_cidrs).
  The chunk-1 renderer (name = … / [[net]] under the gvproxy
  design) emits the wrong shape and will be replaced.
- Drop gvproxy + VZFileHandleNetworkDeviceAttachment + the
  PyObjC fallback. Backend layout loses gvproxy_config.py,
  gvproxy.py, vfkit_attach.py.
- Acceptance plan adds an egress-port-bypass probe in addition
  to the localhost-reach probe.
- Chunks reshape: chunk 1 stays (renderer rewrite is part of
  chunk 2's cost); chunk 2 covers VM lifecycle + bundle + new
  Smolfile renderer; chunk 3 is the bundle bind-address change;
  chunks 4-5 unchanged in spirit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis merged commit b57256789f into main 2026-05-27 03:54:12 -04:00