feat(smolmachines): per-bottle loopback alias scopes TSI to single /32 #76

Merged
didericis-claude merged 4 commits from smolmachines-loopback-alias-scoping into main 2026-05-27 18:08:02 -04:00

Summary

Per-bottle loopback alias scoping for smolmachines bottles. Each bottle reserves a unique alias from the pool 127.0.0.16 .. 127.0.0.31 on lo0, binds the bundle's port-forwards to it, and sets the agent's TSI allowlist to the alias's /32. Result: the agent can reach only its own bundle, not other bottles' ports, not host loopback services (postgres, dev servers), and not the internet.

End-to-end verified on macOS / Docker Desktop:

agent.config.json after start: allowed_cidrs = ["127.0.0.16/32"]
VM → 127.0.0.1:3000      → BLOCKED (Permission denied)   [host probe]
VM → 8.8.8.8:53          → BLOCKED (Permission denied)
VM → 127.0.0.16:<bundle> → CONNECTED
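
The PR does not say how the probes above were run. One way to reproduce them from inside the VM is a plain TCP connect, sketched below; `probe` is a hypothetical helper, not part of the launcher, and the exact error text depends on how TSI surfaces the denial.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Attempt a TCP connect and report the outcome in the style of the results above."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "CONNECTED"
    except OSError as exc:
        # Covers refused, denied, unreachable, and timed-out connects alike.
        return f"BLOCKED ({exc.strerror or exc})"
```

Running `probe("127.0.0.1", 3000)`, `probe("8.8.8.8", 53)`, and a probe of the alias with the bundle's published port would cover the three cases listed.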

Smolvm 0.8.0 quirk + workaround

The CLI's --allow-cidr flag is silently dropped when combined with --from <smolmachine>. Verified empirically: the persisted agent.config.json shows allowed_cidrs: null despite the flag, and the booted VM reaches all of 127.0.0.0/8.

Workaround: smolvm stores each machine's config as a JSON BLOB in ~/Library/Application Support/smolvm/server/smolvm.db (vms.data), and reads it at machine start. The launcher patches that row between machine create and machine start to set allowed_cidrs directly. TSI enforces from the patched value.

This hack falls away once smolvm honors the flag upstream: the force_allowlist call becomes redundant and can be removed.
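
A minimal sketch of the row patch, under stated assumptions: the PR names only `vms.data`, so the `name` key column used here is a guess, and `force_allowlist_sketch` is a hypothetical stand-in for the real `force_allowlist` (which also resolves the DB path and no-ops on Linux).

```python
import json
import os
import sqlite3


def force_allowlist_sketch(db_path: str, machine_name: str, cidrs: list) -> bool:
    """Set allowed_cidrs in a machine's stored config; return False if nothing was patched."""
    if not os.path.exists(db_path):
        return False  # missing DB: nothing to patch
    conn = sqlite3.connect(db_path)
    try:
        # ASSUMPTION: rows are keyed by a `name` column; the PR only names vms.data.
        row = conn.execute(
            "SELECT data FROM vms WHERE name = ?", (machine_name,)
        ).fetchone()
        if row is None:
            return False  # missing row
        config = json.loads(row[0])
        config["allowed_cidrs"] = list(cidrs)
        # Pass bytes so SQLite stores a BLOB rather than TEXT.
        blob = json.dumps(config).encode("utf-8")
        conn.execute("UPDATE vms SET data = ? WHERE name = ?", (blob, machine_name))
        conn.commit()
        return True
    finally:
        conn.close()
```

Encoding to `bytes` before the `UPDATE` is the important detail: Python's `sqlite3` binds `str` as TEXT and `bytes` as BLOB.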

Other paths tried (all dead-ends):

  • machine update --allow-cidr doesn't exist
  • stopping the machine, editing agent.config.json, and restarting fails (the file is removed on stop)
  • --smolfile is mutually exclusive with --from
  • --image localhost:<port>/... fails because smolvm's pull agent can't reach host loopback during pull

Sudo policy (one-time per reboot)

The launcher lazily sudo-prompts to add missing lo0 aliases on first launch per reboot. Aliases persist until reboot; subsequent launches don't prompt. Linux native daemons share the host's network namespace and skip the alias dance entirely.
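
As a rough illustration of the lazy check, the sketch below works out which pool addresses are absent from `ifconfig lo0` output and builds the commands to add them. The function names and the output parsing are assumptions, not the module's actual code.

```python
import ipaddress
import re

# The 16-address pool named in the PR: 127.0.0.16 .. 127.0.0.31.
POOL = [str(ipaddress.IPv4Address("127.0.0.16") + i) for i in range(16)]


def missing_aliases(ifconfig_output: str) -> list:
    """Return pool addresses that do not yet appear as `inet` entries on lo0."""
    present = set(
        re.findall(r"^\s*inet (\d+\.\d+\.\d+\.\d+)", ifconfig_output, re.MULTILINE)
    )
    return [ip for ip in POOL if ip not in present]


def alias_commands(missing: list) -> list:
    """Build one `sudo ifconfig lo0 alias <ip>` argv per missing address."""
    return [["sudo", "ifconfig", "lo0", "alias", ip] for ip in missing]
```

If `missing_aliases` returns an empty list, no command runs and no prompt appears, which matches the "subsequent launches don't prompt" behavior.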

Code

  • New loopback_alias module: ensure_pool(), allocate(slug), force_allowlist(machine_name, cidrs). Detects macOS / Linux at runtime.
  • BundleLaunchSpec.publish_host_ip carries the alias; start_bundle binds -p <alias>::<port>.
  • bundle_host_port honors the host IP, so it picks the right entry from docker port output when ports are bound to the per-bottle alias.
  • launch.py calls ensure_pool + allocate early, then force_allowlist between machine_create and machine_start.
  • README + PRD 0023 updated; closes gitea issue #75.
  • 593 unit tests pass (+15 new across force_allowlist, alias allocation, and the in-use detection).
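
To illustrate the `docker port` disambiguation, here is a hedged sketch that selects the published port for a given host IP. It assumes the `IP:PORT` line format printed by `docker port <container> <port>`; `host_port_for` is a hypothetical name, not the signature of `bundle_host_port`.

```python
from typing import Optional


def host_port_for(docker_port_output: str, host_ip: str) -> Optional[int]:
    """Pick the published port bound to host_ip from `docker port <ctr> <port>` output."""
    for line in docker_port_output.splitlines():
        # Split on the last colon so bracketed IPv6 hosts such as [::] stay intact.
        ip, sep, port = line.strip().rpartition(":")
        if sep and ip == host_ip and port.isdigit():
            return int(port)
    return None
```

Matching on the host IP rather than taking the first line keeps the lookup correct if a container ever reports more than one binding for the same container port.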
didericis-claude added 1 commit 2026-05-27 16:23:36 -04:00
feat(smolmachines): per-bottle loopback alias scopes TSI to single /32
test / unit (pull_request) Successful in 27s
test / integration (pull_request) Successful in 41s
2edc1abb9a
PR #74's Docker-Desktop fix routed the agent through
`127.0.0.1:<random>` loopback forwards, but TSI filters by IP
only — so the allowlist `127.0.0.1/32` let the agent VM reach
**any** host service on macOS loopback (postgres, dev servers,
other bottles' published ports, mDNSResponder, ...). Real
downgrade vs the docker backend's `--internal` network.
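
The exposure can be modeled in a few lines. This is an illustrative model of an IP-only allowlist, not smolvm's TSI implementation; `tsi_allows` is a made-up name.

```python
import ipaddress


def tsi_allows(allowed_cidrs: list, dest_ip: str, dest_port: int) -> bool:
    """Model an IP-only allowlist: dest_port is accepted but never consulted."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
```

Under this model `tsi_allows(["127.0.0.1/32"], "127.0.0.1", 5432)` is true for a host postgres just as it is for the bundle's forwarded port, while `tsi_allows(["127.0.0.16/32"], "127.0.0.1", 5432)` is false.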

Resolution: per-bottle loopback alias.

- New `loopback_alias` module manages a pool of
  `127.0.0.16` .. `127.0.0.31` on `lo0`. macOS only routes
  `127.0.0.1` by default; the extras need `sudo ifconfig lo0
  alias`. `ensure_pool()` lazily adds the missing entries via
  one sudo prompt on first launch per reboot — aliases persist
  on `lo0` until reboot, so subsequent launches skip the
  prompt entirely.
- `allocate(slug)` picks the lowest-numbered unused alias by
  inspecting running bundle containers' port-binding HostIps.
  No on-disk reservation — docker is the source of truth.
- Bundle bringup binds published ports to the allocated alias
  (`docker run -p <alias>::<port>`) instead of `127.0.0.1`.
- TSI allowlist becomes the alias's /32 — narrows reachability
  to this bottle's bundle only.
- Linux native daemons share the host's network namespace;
  `127.0.0.0/8` works without aliases, so the module no-ops on
  non-Darwin and returns `127.0.0.1` from `allocate`.
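
The allocation rule can be sketched as a pure function over the set of host IPs already bound by running bundle containers. Gathering that set from docker is omitted, `allocate_sketch` is a hypothetical name, and raising on an exhausted pool is an assumption (the commit does not say what happens when all 16 aliases are in use).

```python
import platform
from typing import Iterable, Optional


def allocate_sketch(in_use: Iterable[str], system: Optional[str] = None) -> str:
    """Return the lowest-numbered free alias, or 127.0.0.1 on non-Darwin hosts."""
    system = system or platform.system()
    if system != "Darwin":
        return "127.0.0.1"  # no aliases needed off macOS
    taken = set(in_use)
    for last_octet in range(16, 32):  # 127.0.0.16 .. 127.0.0.31
        candidate = f"127.0.0.{last_octet}"
        if candidate not in taken:
            return candidate
    # ASSUMPTION: exhaustion behavior is not specified in the commit.
    raise RuntimeError("loopback alias pool exhausted")
```

Because the in-use set is derived from running containers on every call, an alias is freed as soon as its bundle stops, with no reservation file to clean up.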

Tracking issue closed: gitea/issues/75.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis added 1 commit 2026-05-27 16:38:00 -04:00
docs: honest framing of upstream smolvm 0.8.0 allowlist bug
test / unit (pull_request) Successful in 26s
test / integration (pull_request) Successful in 40s
a919268d5e
PR #76 originally claimed the per-bottle alias scoping closed
gitea#75 ("agent can reach host loopback"). Verified
empirically that's not actually true: `smolvm 0.8.0 machine
create --from <smolmachine> --net --allow-cidr X/32` silently
drops the allowlist (`agent.config.json` shows `allowed_cidrs:
null`, and the running VM reaches all of `127.0.0.0/8`
regardless).

So the alias-allocation + alias-bind infrastructure is correct
pre-work, but the actual TSI enforcement is blocked on an
upstream smolvm bug. README + PRD 0023 + the module docstring
get reworded to say so plainly. gitea#75 stays open.

Workarounds tried (all dead-ends):
- `machine update --allow-cidr` doesn't exist
- stop-edit-`agent.config.json`-restart fails (smolvm removes
  the file on stop)
- `--smolfile` is mutually exclusive with `--from`
- `--image localhost:<port>/...` fails because smolvm's agent
  process can't reach host loopback during pull

When upstream lands a fix, our existing code (alias allocation,
port-bind, --allow-cidr in launch) will scope correctly without
further changes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis-claude changed title from feat(smolmachines): per-bottle loopback alias scopes TSI to single /32 to infra(smolmachines): per-bottle loopback alias scoping (TSI enforcement upstream-blocked) 2026-05-27 16:38:22 -04:00
didericis added 1 commit 2026-05-27 16:55:07 -04:00
feat(smolmachines): patch smolvm state DB to actually enforce per-bottle allowlist
test / unit (pull_request) Successful in 26s
test / integration (pull_request) Successful in 44s
7eda2a66ec
Earlier commit framed this PR as "infrastructure landed, TSI
enforcement blocked on upstream smolvm 0.8.0." Found a clean
workaround that lets us enforce now.

Smolvm persists each machine's config (including
`allowed_cidrs`) as a JSON BLOB in
`~/Library/Application Support/smolvm/server/smolvm.db`,
`vms.data`. `machine create --allow-cidr X/32` silently writes
`allowed_cidrs: null` to that row when combined with `--from`,
but smolvm reads the row at `machine start` — so patching the
row between create and start sets the allowlist for real.

New `loopback_alias.force_allowlist(machine_name, cidrs)` opens
the SQLite DB, JSON-decodes the row, sets `allowed_cidrs`, and
writes back as BLOB (Text type silently corrupts smolvm's
later reads). launch.py calls it immediately after
`machine_create` and before `machine_start`.

Verified end-to-end on macOS / Docker Desktop:

  VM allowlist after start: ["127.0.0.16/32"]
  VM → 127.0.0.1:3000      → BLOCKED (Permission denied)
  VM → 8.8.8.8:53          → BLOCKED (Permission denied)
  VM → 127.0.0.16:<bundle> → CONNECTED

The DB-patch hack is correct only because smolvm reads
`allowed_cidrs` from the row at start time (not derived in-
process). When upstream honors `--allow-cidr` with `--from`,
the call becomes redundant — drop the call and the workaround
is gone.

Tests: 4 new for `force_allowlist` (BLOB round-trip; Linux
no-op; missing DB; missing row). Total 593 unit tests pass.

README + PRD updated to reflect the fix landed (no longer
"infrastructure pending upstream"). gitea#75 can close.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis-claude changed title from infra(smolmachines): per-bottle loopback alias scoping (TSI enforcement upstream-blocked) to feat(smolmachines): per-bottle loopback alias scopes TSI to single /32 2026-05-27 16:55:28 -04:00
didericis added 1 commit 2026-05-27 17:17:03 -04:00
fix(smolmachines): include per-bottle alias in NO_PROXY
test / unit (pull_request) Successful in 26s
test / integration (pull_request) Successful in 39s
2f143c7142
claude's HTTPS_PROXY was catching the supervise MCP URL
(`http://<alias>:<port>/`) because NO_PROXY was hardcoded to
`localhost,127.0.0.1` and didn't include the per-bottle
loopback alias. Claude proxied the MCP POST through egress,
egress had no route for the alias, and the connection failed
— `/mcp` showed "supervise · ✘ failed" inside the bottle.

Append the loopback alias to NO_PROXY in launch.py so direct
MCP calls bypass the proxy. The git-gate URL uses `git://`,
which proxies don't touch, so this only affects MCP / HTTP
paths to the bundle.
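
A minimal sketch of the fix, assuming the base value quoted above; `no_proxy_with_alias` is a hypothetical helper, not the code in launch.py.

```python
def no_proxy_with_alias(alias: str, base: str = "localhost,127.0.0.1") -> str:
    """Append the per-bottle alias to a comma-separated NO_PROXY value, without duplicates."""
    entries = [entry for entry in base.split(",") if entry]
    if alias not in entries:
        entries.append(alias)
    return ",".join(entries)
```

For a bottle on `127.0.0.16` this yields `localhost,127.0.0.1,127.0.0.16`; on Linux, where `allocate` returns `127.0.0.1`, the value is unchanged.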

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis-claude merged commit 1e82aed54b into main 2026-05-27 18:08:02 -04:00