feat(smolmachines): patch smolvm state DB to actually enforce per-bottle allowlist

Earlier commit framed this PR as "infrastructure landed, TSI
enforcement blocked on upstream smolvm 0.8.0." Found a clean
workaround that lets us enforce now.

Smolvm persists each machine's config (including
`allowed_cidrs`) as a JSON BLOB in
`~/Library/Application Support/smolvm/server/smolvm.db`,
`vms.data`. `machine create --allow-cidr X/32` silently writes
`allowed_cidrs: null` to that row when combined with `--from`,
but smolvm reads the row at `machine start` — so patching the
row between create and start sets the allowlist for real.

New `loopback_alias.force_allowlist(machine_name, cidrs)` opens
the SQLite DB, JSON-decodes the row, sets `allowed_cidrs`, and
writes back as BLOB (Text type silently corrupts smolvm's
later reads). launch.py calls it immediately after
`machine_create` and before `machine_start`.
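
The BLOB-versus-Text distinction can be checked in isolation. A minimal
sketch, assuming a scratch in-memory table with the same two columns the
tests in this PR use for `vms`; the `stored_type` helper is hypothetical
and not part of the change. Binding a `str` stores the value with SQLite
storage class `text`, while binding `bytes` stores it as `blob`:

```python
import json
import sqlite3


def stored_type(payload) -> str:
    """Insert payload into a BLOB-declared column and return the
    storage class SQLite actually used for it."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute(
            "CREATE TABLE vms (name TEXT PRIMARY KEY NOT NULL, data BLOB NOT NULL)"
        )
        con.execute(
            "INSERT INTO vms (name, data) VALUES (?, ?)", ("demo-vm", payload)
        )
        return con.execute("SELECT typeof(data) FROM vms").fetchone()[0]
    finally:
        con.close()


doc = json.dumps({"allowed_cidrs": ["127.0.0.16/32"]})
print(stored_type(doc))           # text
print(stored_type(doc.encode()))  # blob
```

A BLOB-declared column does not coerce bound values, so the Python type
passed to `execute` decides what lands in the row.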

Verified end-to-end on macOS / Docker Desktop:

  VM allowlist after start: ["127.0.0.16/32"]
  VM → 127.0.0.1:3000      → BLOCKED (Permission denied)
  VM → 8.8.8.8:53          → BLOCKED (Permission denied)
  VM → 127.0.0.16:<bundle> → CONNECTED
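
A check of this shape could be run from inside the guest. This is a
hedged sketch only: the `probe` helper is hypothetical, and the exact
commands used for the verification above are not recorded in this commit.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Attempt one TCP connect and summarize the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "CONNECTED"
    except OSError as exc:
        # A denied connect surfaces as an OSError; the results above
        # show "Permission denied" for non-allowlisted addresses.
        return f"BLOCKED ({exc.strerror or exc})"
```

Calling `probe("127.0.0.1", 3000)` in the guest would be expected to
report a blocked connect once the allowlist is the alias's /32.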

The DB-patch hack is correct only because smolvm reads
`allowed_cidrs` from the row at start time (not derived in-
process). When upstream honors `--allow-cidr` with `--from`,
the call becomes redundant — drop the call and the workaround
is gone.
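
One way to notice when the call has become redundant, sketched under
assumptions: read the row back right after `machine create` and compare
it with the requested allowlist. The helper names below are hypothetical;
the `vms` table and its `name`/`data` columns are taken from this PR, and
the DB file is assumed to exist already.

```python
import json
import sqlite3
from pathlib import Path


def allowlist_in_db(db_path: Path, machine_name: str):
    """Return the stored `allowed_cidrs` for machine_name, or None
    when the row is missing or the field is null."""
    con = sqlite3.connect(str(db_path))
    try:
        row = con.execute(
            "SELECT data FROM vms WHERE name = ?", (machine_name,)
        ).fetchone()
    finally:
        con.close()
    if row is None:
        return None
    # json.loads accepts bytes, so a BLOB value decodes directly.
    return json.loads(row[0]).get("allowed_cidrs")


def workaround_still_needed(db_path: Path, machine_name: str, wanted) -> bool:
    """True while create leaves the row without the requested allowlist."""
    return allowlist_in_db(db_path, machine_name) != list(wanted)
```

If `workaround_still_needed` returns False immediately after create,
smolvm wrote the allowlist itself and the patch step can be dropped.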

Tests: 4 new for `force_allowlist` (BLOB round-trip; Linux
no-op; missing DB; missing row). Total 593 unit tests pass.

README + PRD updated to reflect the fix landed (no longer
"infrastructure pending upstream"). gitea#75 can close.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 16:55:03 -04:00
parent a919268d5e
commit 7eda2a66ec
5 changed files with 219 additions and 52 deletions
+13 -14
@@ -205,21 +205,20 @@ each reserve a loopback alias from a pool (`127.0.0.16` ..
 `127.0.0.31`) and bind their bundle's port-forwards to it; the
 first `./cli.py start` after each reboot prompts for sudo to add
 missing aliases via `ifconfig lo0 alias`. Aliases persist until
-reboot; subsequent launches don't prompt.
+reboot; subsequent launches don't prompt. The agent's TSI
+allowlist is the alias's `/32`, so each bottle can only reach
+its own bundle's published ports — not other bottles' ports,
+not other host loopback services (postgres, dev servers, etc.).

-**Known v1 limitation — agent can reach the whole host
-loopback:** the alias-allocation infrastructure exists, but TSI
-allowlist enforcement is blocked on a smolvm 0.8.0 upstream bug:
-`smolvm machine create --from <smolmachine> --net --allow-cidr
-X/32` silently drops the allowlist (the persisted
-`agent.config.json` shows `allowed_cidrs: null`, and the running
-VM reaches `127.0.0.0/8` regardless). So while a smolmachines
-bottle is running, host-local dev services (postgres on 5432,
-dev servers, etc.) are reachable from inside the agent even
-though the launcher's `--allow-cidr` says otherwise. The docker
-backend keeps the bottle on a `--internal` docker network and
-doesn't have this issue. Tracked in gitea issue #75; will
-auto-resolve once smolvm honors the flag.
+This enforcement requires a workaround for a smolvm 0.8.0 bug:
+the CLI's `--allow-cidr` flag is silently dropped when combined
+with `--from <smolmachine>`. The launcher patches smolvm's
+persistent state DB
+(`~/Library/Application Support/smolvm/server/smolvm.db`)
+directly between `machine create` and `machine start` to set
+the allowlist. The hack falls away automatically when smolvm
+honors the flag upstream — see the `loopback_alias` module's
+docstring for the investigation trail.

 ## Manifest
@@ -202,6 +202,12 @@ def launch(
 env=plan.guest_env,
 )
 stack.callback(_smolvm.machine_delete, plan.machine_name)
+# Workaround smolvm 0.8.0: `--allow-cidr` is silently
+# dropped when combined with `--from`. Patch the persisted
+# state DB to set the allowlist before start so the booted
+# VM's TSI actually enforces. See loopback_alias's module
+# docstring for the investigation that led here.
+_loopback.force_allowlist(plan.machine_name, [f"{loopback_ip}/32"])
 _smolvm.machine_start(plan.machine_name)
 stack.callback(_smolvm.machine_stop, plan.machine_name)
@@ -1,30 +1,31 @@
-"""Per-bottle loopback alias allocation (PRD 0023, follow-up to
-the Docker-Desktop fix in PR #74).
+"""Per-bottle loopback alias allocation + TSI allowlist
+enforcement (PRD 0023, follow-up to PR #74).

 After the pivot to host-loopback port-forwards, the smolmachines
 TSI allowlist was `127.0.0.1/32` — which meant the agent VM could
 reach **any** service bound to macOS's loopback, not just the
-bundle's published ports. That's a real downgrade from the
-docker backend's `--internal` network isolation.
+bundle's published ports. Real downgrade from the docker
+backend's `--internal` network isolation.

-This module is the host-side half of the eventual fix: allocate
-each bottle a unique loopback alias (`127.0.0.16` .. `127.0.0.31`
-by default), bind the bundle's port-forwards to that alias, and
-pass the alias's /32 as smolvm's `--allow-cidr`. If TSI enforced
-the allowlist, the agent could only reach its own bundle.
+This module narrows the allowlist by allocating each bottle a
+unique loopback alias (`127.0.0.16` .. `127.0.0.31`). The
+bundle's port-forwards bind to that alias, and the alias's /32
+is what TSI allows.

-**Upstream block, smolvm 0.8.0:** verified empirically that
-`smolvm machine create --from <smolmachine> --net --allow-cidr
-X/32` silently drops the allowlist. The persisted
-`agent.config.json` shows `allowed_cidrs: null`, and the running
-VM can reach any host loopback service regardless of the
-flag. `machine update --allow-cidr` doesn't exist; stop-edit-
-start of `agent.config.json` doesn't work (the file is removed
-on stop); `--smolfile` is mutually exclusive with `--from`. So
-the alias scoping infrastructure lives here, ready, but the
-TSI enforcement is blocked on a smolvm upstream fix. Until that
-lands, the agent can still reach the whole `127.0.0.0/8`. The
-README + gitea issue #75 spell this out.
+**Smolvm 0.8.0 quirk + workaround.** `smolvm machine create
+--from <smolmachine> --net --allow-cidr X/32` silently drops the
+flag — verified empirically that the agent process's allowlist
+ends up `null` in smolvm's persistent state DB (`~/Library/
+Application Support/smolvm/server/smolvm.db`, `vms` table,
+`data` BLOB), and the booted VM reaches all of `127.0.0.0/8`
+regardless of what we passed. Workaround: after machine_create,
+open the SQLite DB and patch the row's `allowed_cidrs` field
+directly. Smolvm reads the DB at machine_start, so the patched
+value takes effect on boot. Tested: enforcement is real — the
+guest's connect to a non-allowlisted IP fails with `Permission
+denied`. Other paths we tried (machine update, stop-edit-
+agent.config.json-restart, --smolfile, --image localhost:N/...)
+were dead ends.

 macOS only configures `127.0.0.1` on `lo0` by default; the
 additional aliases require `sudo ifconfig lo0 alias`. We lazily
@@ -35,7 +36,7 @@ prompt.
 Linux native daemons share the host's network namespace; the
 whole `127.0.0.0/8` is reachable by default and aliases are
 unnecessary. The pool logic detects native-Linux and skips sudo
-entirely.
+entirely; the DB patch is also gated on macOS.

 Allocation is coordinated by inspecting running bundle
 containers' published host IPs — each bottle's bundle owns the
@@ -48,12 +49,30 @@ import json
 import os
 import platform
 import re
+import sqlite3
 import subprocess
+from pathlib import Path
 from typing import Iterable

 from ...log import die, info


+# smolvm's persistent VM state on macOS — a SQLite DB whose `vms`
+# table holds one JSON BLOB per machine. The Linux path is
+# different, but smolmachines is macOS-only in v1 (PRD 0023) so
+# we hard-code this. If the file moves under us we'll see a
+# clear FileNotFoundError; not worth defensive cross-platform
+# detection until the backend actually needs Linux.
+_SMOLVM_DB_PATH = (
+    Path.home()
+    / "Library"
+    / "Application Support"
+    / "smolvm"
+    / "server"
+    / "smolvm.db"
+)
+
+
 # Sixteen aliases by default. Tunable for hosts that want more
 # concurrent bottles (each bottle reserves one alias for its
 # bundle bringup). The range is chosen to avoid the reserved
@@ -103,6 +122,52 @@ def ensure_pool() -> None:
 )


+def force_allowlist(machine_name: str, allowed_cidrs: list[str]) -> None:
+    """Patch smolvm's persistent VM-state DB to set the machine's
+    `allowed_cidrs` to the given list. Workaround for smolvm
+    0.8.0's silent-drop of `--allow-cidr` when used with `--from`.
+
+    Must run AFTER `smolvm machine create` (the row has to
+    exist) and BEFORE `smolvm machine start` (smolvm reads the
+    row on start; in-flight VMs don't pick up changes). Once
+    smolvm honors the CLI flag upstream, this whole function is
+    redundant: rely on the flag-respecting create and remove
+    this call from launch.
+
+    No-op on non-macOS — the DB path differs and the Linux
+    smolmachines code path isn't exercised in v1."""
+    if not _is_macos():
+        return
+    if not _SMOLVM_DB_PATH.is_file():
+        die(
+            f"smolvm state DB not found at {_SMOLVM_DB_PATH}. "
+            f"smolvm 0.8.0 expected? `smolvm --version` to check."
+        )
+    con = sqlite3.connect(str(_SMOLVM_DB_PATH))
+    try:
+        cur = con.cursor()
+        row = cur.execute(
+            "SELECT data FROM vms WHERE name = ?", (machine_name,),
+        ).fetchone()
+        if row is None:
+            die(
+                f"smolvm DB has no row for machine {machine_name!r}. "
+                f"machine_create must run before force_allowlist."
+            )
+        cfg = json.loads(row[0])
+        cfg["allowed_cidrs"] = list(allowed_cidrs)
+        # Write as BLOB (the column type smolvm uses) — passing a
+        # plain str makes sqlite store it as Text and smolvm then
+        # fails to read it.
+        cur.execute(
+            "UPDATE vms SET data = ? WHERE name = ?",
+            (sqlite3.Binary(json.dumps(cfg).encode()), machine_name),
+        )
+        con.commit()
+    finally:
+        con.close()
+
+
 def allocate(slug: str) -> str:
     """Pick the lowest-numbered alias from the pool not already
     in use by a running smolmachines bundle. Bails when the pool
@@ -186,4 +251,4 @@ def _host_ips_for_container(name: str) -> Iterable[str]:
 return seen


-__all__ = ["allocate", "ensure_pool"]
+__all__ = ["allocate", "ensure_pool", "force_allowlist"]
+25 -15
@@ -611,21 +611,31 @@ PRD 0024's bundle image is a prerequisite — this PRD assumes
 bound to macOS's loopback** — postgres, dev servers, other
 bottles' published ports, mDNSResponder, etc.

-**Attempted fix + upstream block (`smolmachines-loopback-
-alias-scoping` branch).** Allocate each bottle a unique
-loopback alias (`127.0.0.16` .. `127.0.0.31`), bind bundle
-port-forwards to it, set TSI's `--allow-cidr` to that /32.
-Verified empirically that `smolvm 0.8.0 machine create --from
-<smolmachine> --net --allow-cidr X/32` **silently drops the
-allowlist** — `agent.config.json` shows `allowed_cidrs:null`
-and the VM reaches all of `127.0.0.0/8` regardless of the
-flag. Workarounds tried: `machine update --allow-cidr`
-doesn't exist; stop-edit-`agent.config.json`-restart fails
-(file is removed on stop); `--smolfile` is mutually exclusive
-with `--from`. Alias-allocation infrastructure is in place
-so the day smolvm honors `--allow-cidr` with `--from`, the
-scoping starts working. Until then the agent can reach the
-whole host loopback. Tracked in gitea issue #75.
+**Fix + smolvm 0.8.0 workaround.** Allocate each bottle a
+unique loopback alias (`127.0.0.16` .. `127.0.0.31`), bind
+bundle port-forwards to it, set TSI's allowlist to that
+alias's /32. The agent can only reach its own bundle; other
+bottles' ports, host loopback services, and the internet are
+all denied.
+
+Smolvm 0.8.0 silently drops `--allow-cidr` when combined
+with `--from <smolmachine>` (verified empirically:
+`agent.config.json` shows `allowed_cidrs:null` despite the
+flag). The launcher patches smolvm's persistent state DB
+(`~/Library/Application Support/smolvm/server/smolvm.db`,
+`vms.data` BLOB) between `machine create` and `machine
+start` to set the allowlist directly. Smolvm reads the DB
+at start, so TSI enforces. Tested end-to-end: VM → `127.0.0.1`
+= "Permission denied"; VM → `<alias>:<bundle-port>` =
+connects.
+
+Other paths tried that didn't work: `machine update
+--allow-cidr` doesn't exist; stop-edit-`agent.config.json`-
+restart fails (file removed on stop); `--smolfile` mutually
+exclusive with `--from`; `--image localhost:<port>/...` fails
+because smolvm's pull agent can't reach host loopback during
+pull. When smolvm honors `--allow-cidr` with `--from`
+upstream, the DB patch becomes redundant and can be removed.

 ## References
@@ -7,8 +7,12 @@ inspecting running bundle containers' port bindings."""
 from __future__ import annotations

+import json
+import sqlite3
 import subprocess
+import tempfile
 import unittest
+from pathlib import Path
 from unittest.mock import patch

 from claude_bottle.backend.smolmachines import loopback_alias
@@ -187,5 +191,88 @@ class TestAliasInUseDetection(unittest.TestCase):
 self.assertEqual(set(), loopback_alias._aliases_in_use())


+class TestForceAllowlist(unittest.TestCase):
+    """Smolvm 0.8.0 silently drops `--allow-cidr` with `--from`,
+    so `force_allowlist` opens the state DB directly and sets
+    the row's `allowed_cidrs` field. Round-trip tests against a
+    real SQLite DB to lock down the BLOB encoding."""
+
+    def setUp(self):
+        self._tmp = tempfile.TemporaryDirectory(prefix="smolvm-db.")
+        self.db = Path(self._tmp.name) / "smolvm.db"
+        con = sqlite3.connect(str(self.db))
+        con.execute(
+            "CREATE TABLE vms (name TEXT PRIMARY KEY NOT NULL, data BLOB NOT NULL)"
+        )
+        # Mimic smolvm's row shape (the JSON keys that exist on
+        # creation; allowed_cidrs is the field we patch).
+        cfg = {
+            "name": "demo-vm",
+            "cpus": 4,
+            "mem": 8192,
+            "network": True,
+            "allowed_cidrs": None,
+        }
+        con.execute(
+            "INSERT INTO vms (name, data) VALUES (?, ?)",
+            ("demo-vm", sqlite3.Binary(json.dumps(cfg).encode())),
+        )
+        con.commit()
+        con.close()
+
+    def tearDown(self):
+        self._tmp.cleanup()
+
+    def test_patches_allowed_cidrs_on_row(self):
+        with patch.object(loopback_alias, "_is_macos", return_value=True), \
+             patch.object(loopback_alias, "_SMOLVM_DB_PATH", self.db):
+            loopback_alias.force_allowlist("demo-vm", ["127.0.0.16/32"])
+        con = sqlite3.connect(str(self.db))
+        row = con.execute(
+            "SELECT typeof(data), data FROM vms WHERE name='demo-vm'",
+        ).fetchone()
+        con.close()
+        # Must round-trip as BLOB (the column type smolvm reads).
+        self.assertEqual("blob", row[0])
+        cfg = json.loads(row[1])
+        self.assertEqual(["127.0.0.16/32"], cfg["allowed_cidrs"])
+        # Other fields preserved verbatim.
+        self.assertEqual(4, cfg["cpus"])
+        self.assertTrue(cfg["network"])
+
+    def test_noop_on_linux(self):
+        with patch.object(loopback_alias, "_is_macos", return_value=False), \
+             patch.object(loopback_alias, "_SMOLVM_DB_PATH", self.db):
+            loopback_alias.force_allowlist("demo-vm", ["127.0.0.16/32"])
+        # DB row should be untouched.
+        con = sqlite3.connect(str(self.db))
+        cfg = json.loads(con.execute(
+            "SELECT data FROM vms WHERE name='demo-vm'",
+        ).fetchone()[0])
+        con.close()
+        self.assertIsNone(cfg["allowed_cidrs"])
+
+    def test_dies_on_missing_db(self):
+        with patch.object(loopback_alias, "_is_macos", return_value=True), \
+             patch.object(
+                 loopback_alias, "_SMOLVM_DB_PATH",
+                 Path("/nonexistent/smolvm.db"),
+             ), patch.object(
+                 loopback_alias, "die", side_effect=SystemExit("die"),
+             ):
+            with self.assertRaises(SystemExit):
+                loopback_alias.force_allowlist("demo-vm", ["127.0.0.16/32"])
+
+    def test_dies_on_missing_row(self):
+        with patch.object(loopback_alias, "_is_macos", return_value=True), \
+             patch.object(loopback_alias, "_SMOLVM_DB_PATH", self.db), \
+             patch.object(
+                 loopback_alias, "die", side_effect=SystemExit("die"),
+             ):
+            with self.assertRaises(SystemExit):
+                loopback_alias.force_allowlist("not-in-db", ["127.0.0.16/32"])
+
+
 if __name__ == "__main__":
     unittest.main()