Commit Graph

2 Commits

Author SHA1 Message Date
didericis-claude a3a9ec065e feat(cleanup): walk every backend, reap smolmachines orphans too
test / unit (pull_request) Successful in 26s
test / integration (pull_request) Successful in 43s
test / unit (push) Successful in 29s
test / integration (push) Successful in 38s
`./cli.py cleanup` previously called only the env-var-selected
backend's `prepare_cleanup` / `cleanup` — so a leftover smolvm
machine + bundle container + bundle network from a crashed
smolmachines bottle would survive a default `docker`-mode cleanup
indefinitely.

Smolmachines now has a real `cleanup` module (alongside
`enumerate.py` from issue #77) that walks:

  - smolvm machines named `claude-bottle-*` (via
    `smolvm machine ls --json`)
  - bundle containers `claude-bottle-sidecars-*`
  - bundle networks `claude-bottle-bundle-*`

Cleanup runs stop+delete on the machines, force-rm on the
containers, network rm on the networks. Each step is best-effort
so a failed rm doesn't block the rest.

`cli.py cleanup` walks every backend in `known_backend_names()`
and runs each backend's `cleanup` after a single y/N prompt that
shows a combined plan.

State dirs (`~/.claude-bottle/state/<slug>/`) are shared layout
with the docker backend, which still owns the orphan-state-dir
bucket. It now consults `enumerate_active_bottles()` for the
cross-backend live identity set so a running smolmachines
bottle's state dir isn't reaped during a cleanup.

Tests: smolmachines cleanup (prepare + cleanup ordering + failure
handling); cross-backend orphan protection on the docker
state-dir check; CLI cmd_cleanup walks both backends, short-
circuits on all-empty, aborts on N. 617 unit tests pass.

End-to-end verified on this host:
  $ smolvm machine ls --json | jq '.[].name'
  "claude-bottle-researcher-m3hxd"
  $ ./cli.py cleanup
  --- smolmachines backend ---
  smolvm machine:  claude-bottle-researcher-m3hxd
  remove all of the above? [y/N]

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 19:56:41 -04:00
didericis 20f411b22e feat(smolmachines): backend skeleton + Smolfile/gvproxy renderers (PRD 0023 chunk 1)
test / unit (pull_request) Successful in 22s
test / integration (pull_request) Successful in 43s
Ships the smolmachines backend's prepare side: subpackage layout,
`_BACKENDS` registration under "smolmachines", preflight check
for `smolvm` + `gvproxy` on PATH, and the two config-file
renderers (Smolfile TOML + gvproxy YAML). Launch raises
NotImplementedError until chunk 2.

New module layout (mirrors backend/docker/):
  claude_bottle/backend/smolmachines/
    __init__.py            re-exports SmolmachinesBottleBackend
    backend.py             SmolmachinesBottleBackend façade
    bottle.py              SmolmachinesBottle stub (NotImpl until ch2)
    bottle_plan.py         SmolmachinesBottlePlan + .print()
    bottle_cleanup_plan.py SmolmachinesBottleCleanupPlan stub
    prepare.py             resolve_plan: writes both config files
    smolfile.py            TOML renderer (stdlib, no tomli_w dep)
    gvproxy_config.py      YAML renderer (same shape as pipelock_yaml)
    util.py                preflight + per-slug subnet + loopback port

The renderers are pure functions. `resolve_plan` runs the
preflight, allocates one host-side loopback port per active
sidecar (pipelock always; git-gate / supervise conditional),
derives a per-slug gvproxy subnet (hash-mod-254, skipping the
docker-default 17), and writes:

  - <stage>/gvproxy.yaml: subnet + DNS rule resolving only
    `proxy.internal` + port_forwards (one per active sidecar).
  - <stage>/smolfile.toml: guest command/env + virtio-net device
    backed by gvproxy's unixgram socket. No TSI flags — see
    PRD 0023 "Why gvproxy, not TSI".

The agent's HTTPS_PROXY etc. point at `proxy.internal:<gateway-
port>` so the guest dials through gvproxy. gvproxy resolves only
`proxy.internal` → the gateway IP, and forwards exactly the
listed ports to the host-side sidecar bundle (PRD 0024); every
other destination — host LAN, host loopback, public internet
directly — is unreachable by construction.

29 new unit tests covering renderer correctness, subnet
derivation stability + collision-avoidance, loopback port
allocation, and preflight error paths. Full unit suite: 532
passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 02:22:08 -04:00