Commit Graph

6 Commits

Author SHA1 Message Date
didericis-claude 1dfc359141 feat(smolmachines): thread inner Plans + bundle daemons run (PRD 0023 chunk 4b)
test / unit (pull_request) Successful in 21s
test / integration (pull_request) Successful in 42s
Bundle daemons (pipelock, egress, optionally git-gate + supervise)
now actually start with their config files bind-mounted from the
inner Plans the docker backend already produces. Chunks 2d + 3
ran with daemons_csv="" so the bundle's init supervisor idled;
chunk 4b wires up the real path: agent → pipelock → egress →
internet (when routes declared) is now functional, modulo agent-
image gaps (claude-code / TLS-trust-store / git in the guest)
that chunk 4c addresses.

bottle_plan.py — added the four inner Plan fields:
  proxy_plan: PipelockProxyPlan
  git_gate_plan: GitGatePlan
  egress_plan: EgressPlan
  supervise_plan: SupervisePlan | None

Same shape the docker backend's plan uses. Docker-network-only
fields (internal_network, egress_network) stay at dataclass
defaults — the smolmachines bundle is on a per-bottle bridge
with a pinned IP, not docker's --internal + egress topology.

prepare.py — instantiates DockerPipelockProxy / DockerEgress /
DockerGitGate / DockerSupervise and calls their .prepare()
methods to write the per-bottle config files (pipelock.yaml,
routes.yaml, git-gate entrypoint/hooks, supervise queue dir)
under the per-bottle state dir. (The "Docker" prefix on the
class names is a misnomer here — .prepare() is platform-neutral,
inherited from each sidecar's ABC. A future cleanup could factor
the prepare logic out of the docker subpackage.)

launch.py — major rewrite:
  - pipelock_tls_init at launch (always); egress_tls_init only
    when the bottle declares routes (otherwise the CA files
    aren't bind-mounted and openssl runs would be wasted).
  - Inner Plans updated in place with launch-time CA paths +
    EGRESS_UPSTREAM_PROXY = http://127.0.0.1:8888 (egress's
    upstream is pipelock on the bundle's own loopback; same
    container's network namespace).
  - BundleLaunchSpec env + volumes built from the inner Plans:
    pipelock.yaml + CA + key (always); egress routes + CAs +
    upstream env + token-slot bare names (when routes); git-gate
    entrypoint + hooks + per-upstream identity files (when
    upstreams); supervise queue dir + env (when enabled).
  - daemons_csv = ["egress", "pipelock"] + ["git-gate"] (if
    upstreams) + ["supervise"] (if enabled).
  - Token env values resolved from host env via
    `egress_resolve_token_values` and threaded into the
    docker-run subprocess env (bare-name -e entries in spec
    inherit from there — values never land on argv).

Tests:
- 552 unit passing (no new unit cases; fixture updated to
  populate the new plan fields).
- 5 integration cases passing locally (Darwin + smolvm + docker
  + not GITEA_ACTIONS):
    * test_smoke_exec_echo — still works.
    * test_localhost_reach_probe — host loopback still refused.
    * test_egress_port_bypass_probe — <bundle-ip>:9099 still
      refused, NOW WITH EGRESS ACTUALLY RUNNING (chunk 3's
      127.0.0.1 bind-address is doing its job).
    * test_prompt_file_lands_in_guest — still works.
    * test_pipelock_answers_on_bundle_ip — NEW. From inside the
      guest, wget to <bundle-ip>:8888 gets an HTTP response
      (not "connection refused") — proves pipelock is actually
      listening and the bind-mount + CA generation path works.

What's left in chunk 4:
- 4c: agent-image-conversion (claude-code + git + curl +
  ca-certificates in the guest). Chunk 2d's alpine placeholder
  stays for now.
- 4d: provision_ca + provision_git + provision_supervise once
  the agent image has the required tools.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 05:29:02 -04:00
didericis-claude 9e3b7e441e feat(smolmachines): provision_prompt + provision_skills (PRD 0023 chunk 4a)
test / unit (pull_request) Successful in 21s
test / integration (pull_request) Successful in 43s
First slice of chunk 4: implement the two provisioning methods
that don't depend on agent-image tooling beyond `cp` and
`mkdir`. provision_ca / provision_git / provision_supervise
land once the agent-image gap is solved (chunk 4b+) — they need
update-ca-certificates, git, and the claude binary respectively,
none of which the chunk-2d alpine placeholder provides.

What this PR ships:

- `claude_bottle/backend/smolmachines/provision/` subpackage
  with `prompt.py` + `skills.py`. Each routes through
  `smolvm.machine_cp` / `machine_exec`. provision_prompt mirrors
  the docker contract (file always copied; return value drives
  --append-system-prompt-file iff the agent has a non-empty
  prompt). provision_skills mkdir + cp per skill, matching
  the docker backend's loop.
- prepare.py now writes the prompt file under
  agent_state_dir(slug) with the agent's `prompt` body, mode
  0o600. The in-guest path is `/root/.claude-bottle-prompt.txt`
  (alpine has no `node` user; will become `/home/node/...` once
  the real claude-bottle image lands).
- launch.py calls `provision(plan, machine_name)` after
  machine_start. The returned prompt path threads to
  SmolmachinesBottle so exec_claude can add
  --append-system-prompt-file when the agent has a prompt.
- backend.py: provision_prompt / provision_skills now real;
  provision_git is a deliberate stub (waiting on the git-gate
  inner Plan + git in the agent image). provision_supervise
  stays the chunk-2d stub.

Tests:
- 7 new unit cases (test_smolmachines_provision.py): argv
  shape (mocked smolvm.machine_cp / .machine_exec),
  prompt return-value contract, no-op-with-no-skills,
  CLAUDE_BOTTLE_GUEST_SKILLS_DIR override, fail-on-missing-skill.
- 1 new integration case in test_smolmachines_launch.py:
  end-to-end verification that the prompt file lands in the
  alpine guest at /root/.claude-bottle-prompt.txt with the
  expected content (via `bottle.exec("cat ...")`). The smoke +
  the two TSI probes stay green.

552 unit + 4 integration (Darwin+smolvm+docker gated) passing.

What's left in chunk 4:
- 4b: thread the inner Plans (PipelockProxyPlan / EgressPlan /
  GitGatePlan / SupervisePlan) through prepare + launch so the
  bundle daemons actually run (currently daemons_csv="").
- 4c: the agent-image-conversion gap — get claude-code + git +
  curl + ca-certificates into the guest image (build a
  .smolmachine via `pack create --from-vm` after manual setup,
  or push the docker image to a registry smolvm can pull).
- 4d: provision_ca + provision_git + provision_supervise once
  4b + 4c land.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 05:08:17 -04:00
didericis-claude 9f65b137b9 feat(smolmachines): end-to-end launch + Bottle.exec + smoke + probes (PRD 0023 chunk 2d)
test / unit (pull_request) Successful in 21s
test / integration (pull_request) Successful in 41s
test / unit (push) Successful in 22s
test / integration (push) Successful in 41s
End-to-end launch flow for the smolmachines backend. Brings up
the per-bottle docker bridge + sidecar bundle, creates and
starts the smolvm guest pointed at the bundle's pinned IP via
TSI's `--allow-cidr <bundle-ip>/32`, yields a SmolmachinesBottle
handle that routes exec/cp through `smolvm machine exec / cp`,
tears everything down on context exit.

launch.py:
- ExitStack-managed: create_bundle_network → start_bundle →
  machine_create → machine_start (each registered for reverse
  teardown).
- daemons_csv="" for chunk 2d — bundle init logs "no daemons
  selected" and idles. Real daemon bringup with inner-Plan-driven
  env + volumes lands in chunk 4.

bottle.py:
- SmolmachinesBottle.exec → smolvm.machine_exec (captured).
- SmolmachinesBottle.exec_claude → direct subprocess.run with
  inherited TTY for interactive sessions.
- SmolmachinesBottle.cp_in → smolvm.machine_cp.

Architecture pivots forced by smolvm 0.8.0's CLI shape:
1. `--from <smolmachine>` and `--smolfile <toml>` are MUTUALLY
   EXCLUSIVE in smolvm 0.8.0. We need --from to avoid the
   registry-pull race that bit us on machine_start (libkrun
   agent's network attempt got refused by macOS with
   "connect: permission denied" on IPv6). So Smolfile is dropped
   entirely; per-bottle env + allow_cidrs flow as CLI flags
   (`--allow-cidr CIDR`, `-e K=V`) directly to machine_create.
2. `smolvm pack create --image` doesn't pull from the local
   docker daemon — only OCI registries via crane. The real
   claude-bottle:latest image lives in the local docker daemon
   and isn't reachable that way. Chunk 2d ships with an alpine
   placeholder; the agent-image-conversion gap belongs to
   chunk 4 (push the image to a registry, or smolvm grows a
   docker-daemon transport).

Other changes:
- machine_create grew `image=` / `from_path=` / `allow_cidrs=`
  / `env=` kwargs; smolfile= dropped.
- bottle_plan: smolfile_path → agent_from_path + guest_env.
- prepare: pack_create against `alpine:latest`, cached under
  ~/.cache/claude-bottle/smolmachines/ keyed by image ref.
- Deleted smolfile.py + test_smolfile.py (dead code now).

Tests:
- Unit: 540 passing (smolvm wrapper grew 4 new flag forms; one
  test renamed to reflect --from + --allow-cidr + -e combo).
- Integration: 3 new cases in tests/integration/
  test_smolmachines_launch.py, gated on Darwin + smolvm on PATH
  + docker + not GITEA_ACTIONS:
    * smoke: bottle.exec("echo hello-from-vm") round-trips with
      the correct stdout + returncode.
    * localhost-reach probe: agent dials 127.0.0.1:9 → connect
      refused (TSI's <bundle-ip>/32 allowlist doesn't include
      loopback). The regression test for the gap the PRD design
      pivot was about.
    * egress-port-bypass probe: agent dials <bundle-ip>:9099
      (egress's port) → connect refused. Chunk 2d has no
      daemons running so nothing's listening anyway; chunk 3
      will preserve this property once egress is up but bound
      to 127.0.0.1 inside the bundle.

End-to-end smoke + both probes green locally on macOS with
smolvm 0.8.0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 04:39:52 -04:00
didericis-claude c73d717f71 feat(smolmachines): rewrite Smolfile to smolvm 0.8.0 schema + drop gvproxy (PRD 0023 chunk 2a)
test / unit (pull_request) Successful in 21s
test / integration (pull_request) Successful in 39s
First sub-PR of chunk 2: rewrite the renderer chunk 1 shipped to
match smolvm 0.8.0's actual Smolfile shape, delete the dead
gvproxy renderer + its tests, simplify the prepare flow now that
there's no gvproxy socket + no loopback-port allocation.

Smolfile renderer:
- Old shape (under the abandoned gvproxy design): name = ...,
  command = [...], [[net]] attachment = "unixgram",
  socket = "...".
- New shape (smolvm 0.8.0): env = [...] (sorted K=V pairs),
  [network] allow_cidrs = ["<bundle-ip>/32"]. Nothing else.
  image / entrypoint / cmd come from the .smolmachine artifact
  built in chunk 2b; cpus / memory left at smolvm defaults.
- Tests assert no leakage of TSI's --outbound-localhost-only or
  the old gvproxy/unixgram keys.

util.py:
- smolmachines_gvproxy_subnet → smolmachines_bundle_subnet,
  returning (subnet, gateway, bundle_ip). bundle_ip is always
  at .2 (gateway .1); subnet is /24, third octet derived from
  the slug hash, skipping the docker-default 17 to avoid the
  common 192.168.17.x collision.
- allocate_loopback_port: deleted. The bundle gets a pinned
  docker IP now; the agent dials that IP directly through TSI.
- smolmachines_preflight: dropped the gvproxy check; only
  smolvm is required.

prepare.py:
- Drops the gvproxy.yaml render + the loopback port allocation
  + the gvproxy_socket field on the plan.
- Derives subnet / gateway / bundle_ip from the slug and
  populates the new SmolmachinesBottlePlan fields.
- Agent env now uses IP-literal URLs (http://<bundle-ip>:8888
  etc) since the guest will have no DNS resolver inside TSI's
  allowlist.

bottle_plan.py:
- Old fields: gvproxy_config_path, gvproxy_socket,
  gvproxy_subnet, gvproxy_gateway, host_port_map.
- New fields: bundle_subnet, bundle_gateway, bundle_ip,
  smolfile_path. (smolmachine artifact path lands in chunk 2b.)

Net: -410 lines. Full unit suite: 516 passing.

The VM lifecycle + bundle bringup + launch wiring + smoke tests
land in chunk 2b.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 04:01:07 -04:00
didericis 2aca9e609a refactor(backend): extract shared print_multi for plan preflights
test / unit (pull_request) Successful in 21s
test / integration (pull_request) Successful in 42s
Addresses PR #62 review comments on
claude_bottle/backend/smolmachines/bottle_plan.py:

- Lift the multi-value label printer (was a nested helper inside
  DockerBottlePlan.print) into a new module
  claude_bottle/backend/print_util.py:print_multi. Both backends
  use it for env / skills / git / egress lines.

- Strip the three smolmachines-preflight lines the review flagged:
  the gvproxy subnet line, the smolfile path line, and the
  gvproxy-config path line. Internal detail — operators see the
  agent / env / skills / bottle / git / egress that already
  matter on the docker side, and nothing else.

- Add `git → upstream` to the smolmachines git output to match
  what's useful at preflight time (the docker version shows
  upstream_host:port; this is similar shape).

Leaves the slug=spec.identity-or-mint pattern alone pending a
reply on PR comment #432 — the docker backend uses the same
pattern to preserve identity across `resume`, so dropping it
would silently break the resume path once smolmachines launch
lands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 02:36:03 -04:00
didericis 20f411b22e feat(smolmachines): backend skeleton + Smolfile/gvproxy renderers (PRD 0023 chunk 1)
test / unit (pull_request) Successful in 22s
test / integration (pull_request) Successful in 43s
Ships the smolmachines backend's prepare side: subpackage layout,
`_BACKENDS` registration under "smolmachines", preflight check
for `smolvm` + `gvproxy` on PATH, and the two config-file
renderers (Smolfile TOML + gvproxy YAML). Launch raises
NotImplementedError until chunk 2.

New module layout (mirrors backend/docker/):
  claude_bottle/backend/smolmachines/
    __init__.py            re-exports SmolmachinesBottleBackend
    backend.py             SmolmachinesBottleBackend façade
    bottle.py              SmolmachinesBottle stub (NotImpl until ch2)
    bottle_plan.py         SmolmachinesBottlePlan + .print()
    bottle_cleanup_plan.py SmolmachinesBottleCleanupPlan stub
    prepare.py             resolve_plan: writes both config files
    smolfile.py            TOML renderer (stdlib, no tomli_w dep)
    gvproxy_config.py      YAML renderer (same shape as pipelock_yaml)
    util.py                preflight + per-slug subnet + loopback port

The renderers are pure functions. `resolve_plan` runs the
preflight, allocates one host-side loopback port per active
sidecar (pipelock always; git-gate / supervise conditional),
derives a per-slug gvproxy subnet (hash-mod-254, skipping the
docker-default 17), and writes:

  - <stage>/gvproxy.yaml: subnet + DNS rule resolving only
    `proxy.internal` + port_forwards (one per active sidecar).
  - <stage>/smolfile.toml: guest command/env + virtio-net device
    backed by gvproxy's unixgram socket. No TSI flags — see
    PRD 0023 "Why gvproxy, not TSI".

The agent's HTTPS_PROXY etc. point at `proxy.internal:<gateway-
port>` so the guest dials through gvproxy. gvproxy resolves only
`proxy.internal` → the gateway IP, and forwards exactly the
listed ports to the host-side sidecar bundle (PRD 0024); every
other destination — host LAN, host loopback, public internet
directly — is unreachable by construction.

29 new unit tests covering renderer correctness, subnet
derivation stability + collision-avoidance, loopback port
allocation, and preflight error paths. Full unit suite: 532
passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 02:22:08 -04:00