bot-bottle

Author	SHA1	Message	Date
didericis-claude	5486170be1	fix(smolmachines): route agent through egress when routes declared, wait for VM warm-up. Two related bugs: 1. Auth chain bypassed egress. After the Docker-Desktop port pivot, the agent always dialed pipelock directly — meaning egress (which holds the real OAuth token and rewrites the Authorization header) wasn't in the request path. Bearer placeholder reached anthropic verbatim → 401 "Invalid bearer token". Fix: when the bottle declares egress.routes, the agent's first hop is egress (publish egress port 9099 to host loopback, leave pipelock bundle-internal). Without routes, the agent dials pipelock directly. Same hop order as the docker backend. 2. provision_ca's update-ca-certificates SIGKILLed at ~100ms on Docker Desktop. Back-to-back `smolvm machine exec` calls immediately after machine_start hit a VM warm-up race in libkrun's exec channel; the second exec's child got SIGKILL'd before producing more than the first line of stdout. The agent's trust store never got the egress MITM CA's hash symlink, so curl/openssl couldn't validate the TLS chain. Fix: 1.5s sleep after machine_start (empirically enough), plus fold provision_ca's chown + chmod + update-ca-certificates into one `sh -c` so we only pay one exec round trip. Bail with a clear error if update-ca-certificates doesn't report "1 added" (failing silently was how the original SIGKILL went unnoticed). Net effect on Docker Desktop / macOS: claude's HTTPS_PROXY is `http://127.0.0.1:<egress port>`, egress rewrites auth, pipelock allowlists + DLPs, request reaches api.anthropic.com with a real token. End-to-end verified. Also drops the PRD-0023-chunk-3 EGRESS_LISTEN_HOST=127.0.0.1 mitigation.
The original concern (agent bypassing pipelock by dialing egress's port on the bundle IP) doesn't apply in this topology: the agent can only reach whatever port we publish on host loopback, and egress is the only HTTP/HTTPS chokepoint that gets published. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 15:57:18 -04:00
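A rough sketch of the single-exec CA install and the "1 added" check from the commit above. The function names, the 0644 mode, and the default owner are hypothetical; only the chown + chmod + update-ca-certificates fold into one `sh -c` and the "1 added" check come from the commit message.

```python
import re
import shlex


def ca_install_command(cert_path: str, owner: str = "root") -> list[str]:
    """Fold chown + chmod + update-ca-certificates into a single `sh -c`.

    One exec round trip instead of three. The 0644 mode and the default
    owner are illustrative assumptions, not values taken from the repo.
    """
    quoted = shlex.quote(cert_path)
    script = (
        f"chown {shlex.quote(owner)} {quoted} && "
        f"chmod 0644 {quoted} && "
        "update-ca-certificates"
    )
    return ["sh", "-c", script]


def ca_was_added(stdout: str) -> bool:
    """True only if the output reports exactly one added certificate."""
    # The lookbehind rejects counts such as "11 added".
    return re.search(r"(?<!\d)1 added", stdout) is not None


def require_ca_added(stdout: str) -> None:
    """Fail loudly instead of silently continuing with a broken trust store."""
    if not ca_was_added(stdout):
        raise RuntimeError(
            f"update-ca-certificates did not report '1 added': {stdout!r}"
        )
```

A caller would pass the command to whatever exec wrapper it uses and hand the captured stdout to `require_ca_added`.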
didericis-claude	4f136a9932	fix(smolmachines): agent dials bundle via host loopback ports, not docker bridge IP. Claude hung on outbound network calls under CLAUDE_BOTTLE_BACKEND=smolmachines: Unable to connect to API (FailedToOpenSocket) Root cause: the PRD-0023 design pinned the bundle at a docker bridge IP (192.168.X.2) and set the smolvm guest's TSI allowlist to `<bundle-ip>/32`. On native Linux this works — host shares the docker bridge's network namespace, TSI's syscall impersonation reaches the bridge IP directly. On Docker Desktop (macOS), the daemon runs in its own Linux VM and docker bridge IPs aren't reachable from macOS networking, so the smolvm guest's TSI requests die "Network is unreachable" before they hit pipelock. Fix: publish each agent-facing bundle daemon's port on host loopback (-p 127.0.0.1::PORT), discover the random host-side ports after start, and route the agent through `127.0.0.1:<host port>` instead of the bridge IP. macOS loopback is the surface Docker Desktop's gvproxy forwards into the daemon's VM, so the chain (guest TSI -> macOS loopback -> daemon VM port-forward -> bundle container) works on both Docker Desktop and native Linux. Concrete changes: - BundleLaunchSpec: add `ports_to_publish` so start_bundle adds `-p 127.0.0.1::PORT` for the agent-facing ports (pipelock always; git-gate when upstreams declared; supervise when enabled). Egress's port stays bundle-internal. - sidecar_bundle.bundle_host_port(): wrap `docker port <bundle> <container_port>/tcp` so launch can look up the random host-side mapping after start. - launch.py: discover the host ports, build URLs of the form `http://127.0.0.1:<host port>` / `git://127.0.0.1:<host port>`, stamp onto guest_env + new agent_*_url fields on the plan. - launch.py: TSI allow_cidrs flips to `["127.0.0.1/32"]`. The bundle IP is no longer the agent's target.
- prepare.py: stop synthesizing HTTPS_PROXY / GIT_GATE_URL / MCP_SUPERVISE_URL at prepare time — launch owns those now (the values depend on a port docker hasn't assigned yet). - provision_git: gate_host from plan.agent_git_gate_host. - provision_supervise: URL from plan.agent_supervise_url. End-to-end verified on Docker Desktop / macOS: guest dials pipelock through TSI, pipelock forwards to api.anthropic.com, the API responds with 401 (i.e. it received the request). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 15:31:44 -04:00
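The host-port lookup described in the commit above could be sketched as follows. This is not the repo's `bundle_host_port()`; the helper names are hypothetical, and the sketch only covers parsing the `HOST:PORT` lines that `docker port <container> <port>/tcp` prints and building the loopback URL.

```python
def parse_docker_port(output: str) -> int:
    """Extract the host port from `docker port <container> <port>/tcp` output.

    The command prints one HOST:PORT line per binding, for example
    `127.0.0.1:49153`. The first parseable line wins.
    """
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # rpartition keeps working for bracketed IPv6 hosts such as [::]:49153.
        _host, sep, port = line.rpartition(":")
        if sep and port.isdigit():
            return int(port)
    raise ValueError(f"no host port found in docker port output: {output!r}")


def agent_url(scheme: str, host_port: int) -> str:
    """Build the loopback URL the agent dials, e.g. http://127.0.0.1:49153."""
    return f"{scheme}://127.0.0.1:{host_port}"
```

Because the host port is only known after the bundle starts, values like these can only be computed at launch time, which matches the move of the proxy URLs out of prepare.py.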
didericis-claude	da1e5e1ba8	fix(smolmachines): pass --net explicitly when allow_cidrs is set. smolvm 0.8.0 docs say `--allow-cidr` implies `--net`, but empirically the implication only fires when no `--from` is set. `--from PATH --allow-cidr X/32` silently produces a machine with network: false and no routes in the guest — claude lands inside with HTTPS_PROXY pointing at the bundle's pinned IP but every connect fails with "Network is unreachable" / FailedToOpenSocket in claude's UI. Reproduce + verify: $ smolvm machine create --from <pack> --allow-cidr X/32 nettest $ smolvm machine ls --json | jq '.[].network' # false $ smolvm machine create --from <pack> --net --allow-cidr X/32 nettest2 $ smolvm machine ls --json | jq '.[].network' # true Add `--net` whenever `allow_cidrs` is non-empty. No change to the no-allow-cidr code path. Test added to lock down both branches. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 15:21:43 -04:00
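A minimal sketch of the argv rule from the commit above: `--net` is added whenever `allow_cidrs` is non-empty, and the no-allow-cidr path is unchanged. The function name and signature are hypothetical; the flags (`--from`, `--net`, `--allow-cidr`, `-e K=V`, trailing positional name) are the ones shown in the commit messages in this log.

```python
from typing import Mapping, Optional, Sequence


def machine_create_argv(
    name: str,
    from_path: Optional[str] = None,
    allow_cidrs: Sequence[str] = (),
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Build a `smolvm machine create` argv (illustrative, not the repo's wrapper)."""
    argv = ["smolvm", "machine", "create"]
    if from_path:
        argv += ["--from", from_path]
    if allow_cidrs:
        # With --from set, --allow-cidr alone leaves the machine without
        # networking, so --net is passed explicitly.
        argv.append("--net")
        for cidr in allow_cidrs:
            argv += ["--allow-cidr", cidr]
    for key, value in sorted((env or {}).items()):
        argv += ["-e", f"{key}={value}"]
    argv.append(name)  # NAME is positional on create
    return argv
```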
didericis-claude	47eb56bd10	fix(smolmachines): use containerized crane to push, bypassing docker daemon's HTTPS preference. The previous fix (`host.docker.internal:<port>` for daemon-side push) still failed: Get "https://host.docker.internal:53958/v2/": http: server gave HTTP response to HTTPS client `host.docker.internal` is reachable from Docker Desktop's daemon VM but isn't in the daemon's default insecure-registries CIDRs (only `::1/128` and `127.0.0.0/8` are), so docker push tries HTTPS, hits a plain-HTTP registry, and refuses. The daemon.json fix (`"insecure-registries": ["host.docker.internal"]`) works but is a one-time manual step in Docker Desktop's UI — not something we can do for the user. Sidestep the daemon push entirely: 1. docker build (as before) — local layer cache makes no-change rebuilds cheap. 2. docker save the image to a per-digest tarball alongside the cached `.smolmachine`. 3. Start an ephemeral registry container on a per-session docker network, with `-p :5000` so the host can also reach it for the pack step. 4. docker run a one-shot crane container on the SAME network, mount the tarball, `crane push --insecure /img.tar <registry-container>:5000/...`. Container DNS resolves the registry on the network; `--insecure` forces plain HTTP. 5. `smolvm pack create --image localhost:<host port>/...` from the host. smolvm's bundled crane auto-falls-back to HTTP for localhost addresses, so no insecure-registries config is needed on that side. 6. Tear down everything; reap the tarball (registries hold the same bytes, no need to keep both around). Net effect: the docker daemon never does an HTTP/HTTPS-policy decision on our behalf. `docker push` is gone from the prepare path; `docker save`, `docker network create`, `docker run` (for registry + crane) replace it.
Tested end-to-end on Docker Desktop / macOS: `_ensure_smolmachine("claude-bottle:latest")` produces a 204MB `.smolmachine.smolmachine` artifact. Adds: - backend/docker/util.py:save() — thin docker save wrapper. - local_registry.crane_push_tarball() — one-shot crane run on the registry's network. - CRANE_IMAGE constant pinned by digest (gcr.io/go-containerregistry/crane@sha256:0ae17ecb...). Removes: - backend/docker/util.py:tag() / push() — unused without daemon push. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 14:52:40 -04:00
didericis-claude	f4026ea3ae	fix(smolmachines): docker push fails on Docker Desktop — daemon-side route differs from host loopback. `./cli.py start <agent>` under CLAUDE_BOTTLE_BACKEND=smolmachines died at `docker push localhost:<port>/claude-bottle:<id>` with `Get "http://localhost:<port>/v2/": context deadline exceeded`. Cause: chunk 4c bound the ephemeral registry to `127.0.0.1::5000` and used `localhost:<port>` as the only image-ref hostname. On Docker Desktop the daemon runs inside its own Linux VM — its `localhost` is the VM's loopback, not the host's, so the daemon cannot reach a registry bound to the host's 127.0.0.1. Fix: bind the registry to all interfaces (`-p :5000`) so it's reachable from both sides, and yield two endpoints: - `daemon_endpoint` — `host.docker.internal:<port>` on Docker Desktop (daemon-side hostname for the host VM gateway), `localhost:<port>` on a native Linux daemon that shares the host's network namespace. Used for `docker tag` + `docker push`. - `host_endpoint` — always `localhost:<port>`. Used for `smolvm pack create`, which runs as a host process. The registry stores images by repo+tag, so a push to `host.docker.internal:<port>/cb:<id>` and a pull from `localhost:<port>/cb:<id>` resolve to the same blob — the hostname in a ref is just routing. Detection uses `docker info --format '{{.OperatingSystem}}'`, which returns "Docker Desktop" on macOS/Windows Desktop and the host's OS name on native daemons. Trade-off: all-interface binding briefly publishes the registry on every interface (~5-10s during prepare). The pushed image is built from the public repo Dockerfile (no secrets), the port is random, and the window is short — acceptable for v1 of a personal dev tool. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 14:41:26 -04:00
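The two-endpoint selection in the commit above can be illustrated with a small sketch. The function name is hypothetical; the input is assumed to be the string printed by `docker info --format '{{.OperatingSystem}}'`, as the commit describes.

```python
def registry_endpoints(operating_system: str, port: int) -> tuple[str, str]:
    """Return (daemon_endpoint, host_endpoint) for the ephemeral registry.

    `operating_system` is assumed to be the output of
    `docker info --format '{{.OperatingSystem}}'`.
    """
    host_endpoint = f"localhost:{port}"
    if operating_system.strip() == "Docker Desktop":
        # The daemon lives in its own VM; its localhost is not the host's.
        daemon_endpoint = f"host.docker.internal:{port}"
    else:
        # A native daemon shares the host's network namespace.
        daemon_endpoint = host_endpoint
    return daemon_endpoint, host_endpoint
```

Both endpoints name the same registry, so an image pushed through one can be pulled through the other under the same repo and tag.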
didericis-claude	ac8c7ba696	feat(smolmachines): provision_ca + provision_git + provision_supervise (PRD 0023 chunk 4d). End-to-end provisioning parity with the docker backend. After this chunk a smolmachines bottle has a working trust store, git-gate gitconfig, and supervise MCP registration — same shape as docker, dispatched via `smolvm machine cp` / `smolvm machine exec` instead of `docker cp` / `docker exec`. Adds three new provision modules: - ca.py: select egress vs pipelock CA (same logic as docker), machine cp + update-ca-certificates, log sha256 fingerprint. - git.py: copy host .git when --cwd was passed; render ~/.gitconfig with insteadOf URLs. URL prefix is `git://<bundle_ip>:9418/...` (no DNS in the TSI-allowlisted guest) vs docker's `git://git-gate/...`. - supervise.py: `claude mcp add` via machine_exec; URL is `http://<bundle_ip>:9100/`. Failure is logged but non-fatal (matches docker). Shared render: `render_git_gate_gitconfig` moves out of backend/docker/provision/git.py into the platform-neutral claude_bottle/git_gate.py (renamed to git_gate_render_gitconfig for consistency with the existing git_gate_render_* helpers), parameterized on a `gate_host` argument so both backends use the same logic with different addresses. Path/user fixups for the post-chunk-4c agent image (real claude-bottle image, USER node, $HOME=/home/node): - prompt.py default path moves from /root/... to /home/node/.claude-bottle-prompt.txt; chown + chmod after machine cp. - skills.py default skills dir moves from /root/.claude/skills to /home/node/.claude/skills; chown -R per skill. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 14:15:58 -04:00
didericis-claude	1fa17d1822	feat(smolmachines): build agent image from repo Dockerfile (PRD 0023 chunk 4c). Replaces the alpine:latest placeholder with a real claude-bottle agent image, converted into a .smolmachine artifact via an ephemeral local OCI registry. Why the registry hop: smolvm pack create only accepts OCI registry refs. Empirically it rejects docker-daemon://, oci-layout://, docker-archive: tarballs, and every other transport tested — the crane backend treats anything with a scheme prefix as a registry hostname. To convert a locally-built docker image into a .smolmachine we have to push it somewhere smolvm can pull from. Smallest path: bring up registry:2.8.3 bound to 127.0.0.1:<random>, docker tag + docker push into it, smolvm pack create --image localhost:<port>/claude-bottle:<id>, tear down the registry. The .smolmachine is cached under ~/.cache/claude-bottle/smolmachines/ keyed by the docker image ID (first 16 hex chars of the sha256), so a Dockerfile change picks up a new image ID and invalidates the cache. Unchanged rebuilds skip the whole build → registry → pack pipeline. This puts `docker build` in smolmachines prepare (the docker backend defers it to launch). Necessary because pack_create needs the image ID to derive the cache key, and prepare is the only hook ahead of launch that runs once per slug. Adds: - claude_bottle/backend/docker/util.py: image_id / tag / push helpers (thin docker CLI wrappers). - claude_bottle/backend/smolmachines/local_registry.py: ephemeral_registry() context manager; pins registry:2.8.3 by digest, binds 127.0.0.1::5000 (loopback-only), force-removes on exit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 13:51:02 -04:00
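The cache keying from the commit above (first 16 hex characters of the image ID's sha256) might look like the sketch below. The function name and the `<key>.smolmachine` file naming are assumptions; the commit only states the cache directory and the key derivation.

```python
from pathlib import Path


def smolmachine_cache_path(image_id: str, cache_root: Path) -> Path:
    """Map a docker image ID to a cached .smolmachine path.

    `image_id` may carry the usual `sha256:` prefix. The `<key>.smolmachine`
    file name is an illustrative assumption.
    """
    digest = image_id.removeprefix("sha256:")
    key = digest[:16]
    if len(key) != 16 or any(c not in "0123456789abcdef" for c in key):
        raise ValueError(f"not a sha256 image ID: {image_id!r}")
    return cache_root / f"{key}.smolmachine"


def needs_rebuild(image_id: str, cache_root: Path) -> bool:
    """A new image ID yields a new key, so a changed Dockerfile misses the cache."""
    return not smolmachine_cache_path(image_id, cache_root).exists()
```

In the project the cache root would be `~/.cache/claude-bottle/smolmachines/`; it is a parameter here so the sketch stays side-effect free.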
didericis-claude	519a71f2e7	refactor(docker): drop legacy names from capability_apply teardown. Last of the per-sidecar legacy names. `_per_bottle_container_names` used to list the four pre-bundle sidecars (cred-proxy, pipelock, git-gate, supervise) so capability-apply's teardown would force-rm them on remediation. None of those containers exist anymore — the four daemons run in the sidecar bundle (PRD 0024), so the list collapses to the agent + the bundle. Integration test follows: the fake supervise-sidecar setup, which existed to give teardown an extra container to clean up, switches to a fake sidecar bundle with the current name. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 13:07:15 -04:00
didericis-claude	727f30d422	refactor(docker): drop legacy per-sidecar container_name functions. Same line of cleanup as the supervise rename: the per-sidecar container names (`claude-bottle-pipelock-<slug>`, `claude-bottle-egress-<slug>`, `claude-bottle-git-gate-<slug>`) were docker-network aliases pointing at the bundle, kept so legacy URLs would keep resolving. Replaces them with short hostnames (`pipelock`, `egress`, `git-gate`) matching the existing `EGRESS_HOSTNAME` pattern, and inlines the bundle-loopback URL (`http://127.0.0.1:8888`) for the in-bundle egress→pipelock hop — matching what smolmachines already does. Drops the three `*_container_name` functions, `pipelock_proxy_url`, and `git_gate_host`. Their callers move to the new constants: - `PIPELOCK_HOSTNAME = "pipelock"` (claude_bottle/pipelock.py) - `GIT_GATE_HOSTNAME = "git-gate"` (claude_bottle/git_gate.py) - `BUNDLE_LOCAL_PIPELOCK_URL` (backend/docker/pipelock.py) The agent's HTTP_PROXY now reads `http://pipelock:8888` (vs the old `http://claude-bottle-pipelock-<slug>:8888`); the gitconfig insteadOf rewrites become `git://git-gate/<repo>.git`. The prepare-time orphan probe is collapsed onto the bundle container name (`claude-bottle-sidecars-<slug>`) instead of the four legacy per-sidecar names that no backend creates anymore. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 13:04:48 -04:00
didericis-claude	8ecba2b458	refactor(docker): drop legacy supervise_container_name alias. Supervise runs inside the sidecar bundle (PRD 0024), not in its own container. The `claude-bottle-supervise-<slug>` per-sidecar name only existed as a docker-network alias on the bundle so legacy code paths that referenced the old name would still resolve. Nothing inside the project relies on that resolution anymore — the short `supervise` alias is the one all consumers use — so the legacy long-form is dead. Drops the function entirely, plus its registration as a network alias and as an orphan probe in prepare. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 12:52:47 -04:00
didericis-claude	73dc0d4a40	refactor(sidecars): instantiate sidecar ABCs directly from any backend. The four sidecar prepare-time helpers (PipelockProxy, Egress, GitGate, Supervise) had docker-flavored subclasses that existed only as instantiation shims for ABCs that already had no abstract methods. PipelockProxy.prepare() reached for class-level CA path constants that were only defined on the docker subclass — so smolmachines had to import DockerPipelockProxy to render pipelock yaml, reaching across the backend boundary for what's actually a platform-neutral operation. This moves the universal in-container CA paths (PIPELOCK_CA_CERT_IN_CONTAINER / PIPELOCK_CA_KEY_IN_CONTAINER) to claude_bottle/pipelock.py, drops the class-attr indirection on the ABC, and deletes the four empty docker subclasses. Both backends now instantiate the ABCs directly; the docker-side modules keep the docker-flavored helpers (image pin, container naming, host CA mint) and re-export the moved pipelock constants for compat. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 05:42:20 -04:00
didericis-claude	1dfc359141	feat(smolmachines): thread inner Plans + bundle daemons run (PRD 0023 chunk 4b). Bundle daemons (pipelock, egress, optionally git-gate + supervise) now actually start with their config files bind-mounted from the inner Plans the docker backend already produces. Chunks 2d + 3 ran with daemons_csv="" so the bundle's init supervisor idled; chunk 4b wires up the real path: agent → pipelock → egress → internet (when routes declared) is now functional, modulo agent-image gaps (claude-code / TLS-trust-store / git in the guest) that chunk 4c addresses. bottle_plan.py — added the four inner Plan fields: proxy_plan: PipelockProxyPlan git_gate_plan: GitGatePlan egress_plan: EgressPlan supervise_plan: SupervisePlan | None Same shape the docker backend's plan uses. Docker-network-only fields (internal_network, egress_network) stay at dataclass defaults — the smolmachines bundle is on a per-bottle bridge with a pinned IP, not docker's --internal + egress topology. prepare.py — instantiates DockerPipelockProxy / DockerEgress / DockerGitGate / DockerSupervise and calls their .prepare() methods to write the per-bottle config files (pipelock.yaml, routes.yaml, git-gate entrypoint/hooks, supervise queue dir) under the per-bottle state dir. (The "Docker" prefix on the class names is a misnomer here — .prepare() is platform-neutral, inherited from each sidecar's ABC. A future cleanup could factor the prepare logic out of the docker subpackage.) launch.py — major rewrite: - pipelock_tls_init at launch (always); egress_tls_init only when the bottle declares routes (otherwise the CA files aren't bind-mounted and openssl runs would be wasted). - Inner Plans updated in place with launch-time CA paths + EGRESS_UPSTREAM_PROXY = http://127.0.0.1:8888 (egress's upstream is pipelock on the bundle's own loopback; same container's network namespace).
- BundleLaunchSpec env + volumes built from the inner Plans: pipelock.yaml + CA + key (always); egress routes + CAs + upstream env + token-slot bare names (when routes); git-gate entrypoint + hooks + per-upstream identity files (when upstreams); supervise queue dir + env (when enabled). - daemons_csv = ["egress", "pipelock"] + ["git-gate"] (if upstreams) + ["supervise"] (if enabled). - Token env values resolved from host env via `egress_resolve_token_values` and threaded into the docker-run subprocess env (bare-name -e entries in spec inherit from there — values never land on argv). Tests: - 552 unit passing (no new unit cases; fixture updated to populate the new plan fields). - 5 integration cases passing locally (Darwin + smolvm + docker + not GITEA_ACTIONS): * test_smoke_exec_echo — still works. * test_localhost_reach_probe — host loopback still refused. * test_egress_port_bypass_probe — <bundle-ip>:9099 still refused, NOW WITH EGRESS ACTUALLY RUNNING (chunk 3's 127.0.0.1 bind-address is doing its job). * test_prompt_file_lands_in_guest — still works. * test_pipelock_answers_on_bundle_ip — NEW. From inside the guest, wget to <bundle-ip>:8888 gets an HTTP response (not "connection refused") — proves pipelock is actually listening and the bind-mount + CA generation path works. What's left in chunk 4: - 4c: agent-image-conversion (claude-code + git + curl + ca-certificates in the guest). Chunk 2d's alpine placeholder stays for now. - 4d: provision_ca + provision_git + provision_supervise once the agent image has the required tools. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 05:29:02 -04:00
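The daemon selection listed in the commit above can be sketched in a few lines. The function name is hypothetical, and joining with commas is an assumption based on the `daemons_csv` name; the selection rule itself is the one the commit states.

```python
def select_daemons(has_upstreams: bool, supervise_enabled: bool) -> str:
    """Compose the daemons_csv value for the bundle.

    egress and pipelock are always listed; git-gate only when upstreams are
    declared; supervise only when enabled.
    """
    daemons = ["egress", "pipelock"]
    if has_upstreams:
        daemons.append("git-gate")
    if supervise_enabled:
        daemons.append("supervise")
    return ",".join(daemons)
```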
didericis-claude	9e3b7e441e	feat(smolmachines): provision_prompt + provision_skills (PRD 0023 chunk 4a). First slice of chunk 4: implement the two provisioning methods that don't depend on agent-image tooling beyond `cp` and `mkdir`. provision_ca / provision_git / provision_supervise land once the agent-image gap is solved (chunk 4b+) — they need update-ca-certificates, git, and the claude binary respectively, none of which the chunk-2d alpine placeholder provides. What this PR ships: - `claude_bottle/backend/smolmachines/provision/` subpackage with `prompt.py` + `skills.py`. Each routes through `smolvm.machine_cp` / `machine_exec`. provision_prompt mirrors the docker contract (file always copied; return value drives --append-system-prompt-file iff the agent has a non-empty prompt). provision_skills mkdir + cp per skill, matching the docker backend's loop. - prepare.py now writes the prompt file under agent_state_dir(slug) with the agent's `prompt` body, mode 0o600. The in-guest path is `/root/.claude-bottle-prompt.txt` (alpine has no `node` user; will become `/home/node/...` once the real claude-bottle image lands). - launch.py calls `provision(plan, machine_name)` after machine_start. The returned prompt path threads to SmolmachinesBottle so exec_claude can add --append-system-prompt-file when the agent has a prompt. - backend.py: provision_prompt / provision_skills now real; provision_git is a deliberate stub (waiting on the git-gate inner Plan + git in the agent image). provision_supervise stays the chunk-2d stub. Tests: - 7 new unit cases (test_smolmachines_provision.py): argv shape (mocked smolvm.machine_cp / .machine_exec), prompt return-value contract, no-op-with-no-skills, CLAUDE_BOTTLE_GUEST_SKILLS_DIR override, fail-on-missing-skill.
- 1 new integration case in test_smolmachines_launch.py: end-to-end verification that the prompt file lands in the alpine guest at /root/.claude-bottle-prompt.txt with the expected content (via `bottle.exec("cat ...")`). The smoke + the two TSI probes stay green. 552 unit + 4 integration (Darwin+smolvm+docker gated) passing. What's left in chunk 4: - 4b: thread the inner Plans (PipelockProxyPlan / EgressPlan / GitGatePlan / SupervisePlan) through prepare + launch so the bundle daemons actually run (currently daemons_csv=""). - 4c: the agent-image-conversion gap — get claude-code + git + curl + ca-certificates into the guest image (build a .smolmachine via `pack create --from-vm` after manual setup, or push the docker image to a registry smolvm can pull). - 4d: provision_ca + provision_git + provision_supervise once 4b + 4c land. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 05:08:17 -04:00
didericis-claude	909029085e	feat(sidecars): egress binds 127.0.0.1 when EGRESS_LISTEN_HOST is set (PRD 0023 chunk 3). Egress's bind address is now env-driven via EGRESS_LISTEN_HOST. Unset → mitmdump's default (all interfaces) — the docker backend's behavior, unchanged. Set to `127.0.0.1` → mitmdump binds localhost only. The smolmachines launch sets EGRESS_LISTEN_HOST=127.0.0.1 in the bundle's env unconditionally. TSI's allowlist is `<bundle-ip>/32` (IP-only, not port-granular), which would otherwise let the agent dial `<bundle-ip>:9099` and bypass pipelock's DLP by talking to egress directly. Binding egress to localhost inside the bundle closes that gap at the socket level — the agent still reaches the IP (TSI permits it) but egress refuses the connect because it's not listening on the docker bridge interface. The docker backend doesn't set the env var because its agent dials egress directly via the docker network alias — egress MUST be reachable from outside the bundle there. The asymmetry is documented in the entrypoint script's comment. Changes: - egress_entrypoint.sh: read EGRESS_LISTEN_HOST, conditionally pass `--listen-host <host>` to mitmdump. - smolmachines/launch.py: BundleLaunchSpec.environment now includes `EGRESS_LISTEN_HOST=127.0.0.1`. - New unit tests (5): the entrypoint script's argv shape under various env combinations, verified via a fake mitmdump shim that prints its argv. 545 unit + 3 integration tests passing. The egress-port-bypass probe from chunk 2d still passes (chunk 2d ran with daemons_csv="" so no egress was up; chunk 3 makes the probe preserve its property once egress IS up in chunk 4). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 04:49:22 -04:00
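The conditional `--listen-host` handling described in the commit above lives in a shell entrypoint; the same decision can be sketched in Python for illustration. The function name and the `base_args` parameter are hypothetical, and the real script's other mitmdump arguments are not shown in this log.

```python
from typing import Mapping, Sequence


def mitmdump_argv(env: Mapping[str, str], base_args: Sequence[str] = ()) -> list[str]:
    """Mirror the entrypoint's rule: pass --listen-host only when the env var is set.

    Unset or empty EGRESS_LISTEN_HOST leaves mitmdump at its default bind
    address, which is the docker backend's behavior.
    """
    argv = ["mitmdump", *base_args]
    listen_host = env.get("EGRESS_LISTEN_HOST", "")
    if listen_host:
        argv += ["--listen-host", listen_host]
    return argv
```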
didericis-claude	9f65b137b9	feat(smolmachines): end-to-end launch + Bottle.exec + smoke + probes (PRD 0023 chunk 2d). End-to-end launch flow for the smolmachines backend. Brings up the per-bottle docker bridge + sidecar bundle, creates and starts the smolvm guest pointed at the bundle's pinned IP via TSI's `--allow-cidr <bundle-ip>/32`, yields a SmolmachinesBottle handle that routes exec/cp through `smolvm machine exec / cp`, tears everything down on context exit. launch.py: - ExitStack-managed: create_bundle_network → start_bundle → machine_create → machine_start (each registered for reverse teardown). - daemons_csv="" for chunk 2d — bundle init logs "no daemons selected" and idles. Real daemon bringup with inner-Plan-driven env + volumes lands in chunk 4. bottle.py: - SmolmachinesBottle.exec → smolvm.machine_exec (captured). - SmolmachinesBottle.exec_claude → direct subprocess.run with inherited TTY for interactive sessions. - SmolmachinesBottle.cp_in → smolvm.machine_cp. Architecture pivots forced by smolvm 0.8.0's CLI shape: 1. `--from <smolmachine>` and `--smolfile <toml>` are MUTUALLY EXCLUSIVE in smolvm 0.8.0. We need --from to avoid the registry-pull race that bit us on machine_start (libkrun agent's network attempt got refused by macOS with "connect: permission denied" on IPv6). So Smolfile is dropped entirely; per-bottle env + allow_cidrs flow as CLI flags (`--allow-cidr CIDR`, `-e K=V`) directly to machine_create. 2. `smolvm pack create --image` doesn't pull from the local docker daemon — only OCI registries via crane. The real claude-bottle:latest image lives in the local docker daemon and isn't reachable that way.
Chunk 2d ships with an alpine placeholder; the agent-image-conversion gap belongs to chunk 4 (push the image to a registry, or smolvm grows a docker-daemon transport). Other changes: - machine_create grew `image=` / `from_path=` / `allow_cidrs=` / `env=` kwargs; smolfile= dropped. - bottle_plan: smolfile_path → agent_from_path + guest_env. - prepare: pack_create against `alpine:latest`, cached under ~/.cache/claude-bottle/smolmachines/ keyed by image ref. - Deleted smolfile.py + test_smolfile.py (dead code now). Tests: - Unit: 540 passing (smolvm wrapper grew 4 new flag forms; one test renamed to reflect --from + --allow-cidr + -e combo). - Integration: 3 new cases in tests/integration/test_smolmachines_launch.py, gated on Darwin + smolvm on PATH + docker + not GITEA_ACTIONS: * smoke: bottle.exec("echo hello-from-vm") round-trips with the correct stdout + returncode. * localhost-reach probe: agent dials 127.0.0.1:9 → connect refused (TSI's <bundle-ip>/32 allowlist doesn't include loopback). The regression test for the gap the PRD design pivot was about. * egress-port-bypass probe: agent dials <bundle-ip>:9099 (egress's port) → connect refused. Chunk 2d has no daemons running so nothing's listening anyway; chunk 3 will preserve this property once egress is up but bound to 127.0.0.1 inside the bundle. End-to-end smoke + both probes green locally on macOS with smolvm 0.8.0. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 04:39:52 -04:00
didericis-claude	495be7f9c0	feat(smolmachines): bundle bringup on per-bottle docker bridge (PRD 0023 chunk 2c). claude_bottle/backend/smolmachines/sidecar_bundle.py — primitives for the per-bottle bridge + bundle container with pinned IP: - bundle_network_name(slug) / bundle_container_name(slug) - create_bundle_network(name, subnet, gateway) - remove_bundle_network(name) - start_bundle(BundleLaunchSpec, env=) - stop_bundle(slug) `BundleLaunchSpec` carries the launch-time fields (network + subnet + gateway + bundle_ip + daemons_csv + environment + volumes). Wiring it up from the inner Plans (PipelockProxyPlan, EgressPlan, GitGatePlan, SupervisePlan) is chunk 2d's job; this module is the docker-argv surface only. Pinning the bundle IP via `docker run --ip <bundle-ip>` is what makes smolvm's TSI allowlist (`<bundle-ip>/32`) safe to compute at prepare time — without pinning, we'd have to inspect the assigned IP after start and feed it back into the Smolfile. Idempotent semantics where it matters: `create_bundle_network` treats "already exists" as success, `remove_bundle_network` + `stop_bundle` treat "no such ..." as success. Other failures die / warn depending on whether the launch flow can recover. Tests: - 15 unit cases (mocked subprocess.run): argv shape for create / remove / start / stop, idempotent paths, host-env inheritance to docker run subprocess. - 1 integration case (real docker daemon, gated on docker available + not GITEA_ACTIONS): end-to-end bringup of an empty-daemons bundle on a 192.168.211.0/24 bridge, confirms the container lands at the pinned IP. Skipped if the claude-bottle-sidecars:latest image isn't built (operator hasn't run a docker bottle yet). 546 unit tests passing. Real-docker bundle bringup green locally. Launch wiring + provisioning + PRD 0022 acceptance probes land in chunk 2d.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 04:19:31 -04:00
didericis-claude	9c333bc130	feat(smolmachines): smolvm subprocess wrapper (PRD 0023 chunk 2b) test / unit (pull_request) Successful in 21s Details test / integration (pull_request) Successful in 41s Details claude_bottle/backend/smolmachines/smolvm.py — one thin Python function per smolvm CLI subcommand the launch flow needs: - pack_create(image, output) → smolvm pack create - machine_create(name, from_path, smolfile) → smolvm machine create - machine_start(name) → smolvm machine start - machine_stop(name) → smolvm machine stop - machine_delete(name) → smolvm machine delete -f - machine_exec(name, argv, env, workdir, timeout) → smolvm machine exec - machine_cp(src, dst) → smolvm machine cp - is_available() → shutil.which check The wrapper hides the CLI's inconsistent name-flag style (positional NAME on create/delete, --name on start/stop/exec/status) behind a uniform `name=` kwarg. Two return shapes: - SmolvmRunResult (returncode + stdout + stderr) from machine_exec, because callers care about the in-VM command's exit code. - Raises SmolvmError on non-zero for all other commands; failure to create/start/stop a VM is fatal to the launch flow, not branched on. Tests: - 15 unit cases mocking subprocess.run, covering argv shape per subcommand (the --name vs positional inconsistency locked down), SmolvmError on non-zero for non-exec paths, SmolvmRunResult passthrough on exec, empty-path cp no-op. - 2 integration cases against the real smolvm binary (gated on Darwin + smolvm on PATH + not GITEA_ACTIONS): smolvm --help responds, machine ls --json parses as a list (the contract chunk 4's list_active will consume). 531 unit tests passing. Real-smolvm smoke green locally. Bundle bringup + launch wiring + the localhost-reach / egress-port-bypass probes land in chunks 2c + 2d. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 04:11:36 -04:00
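The name-flag normalization described in the commit above could be sketched as pure argv builders. This is an illustration only, not the module's code: the function names are hypothetical, and only the flag placement stated in the message (`--name` on start/stop, positional NAME plus `-f` on delete) comes from the source. The position of `-f` relative to NAME is an assumption.

```python
from typing import List


def start_argv(name: str) -> List[str]:
    # start takes the machine name via --name (per the commit message).
    return ["smolvm", "machine", "start", "--name", name]


def stop_argv(name: str) -> List[str]:
    # stop uses the same --name style as start.
    return ["smolvm", "machine", "stop", "--name", name]


def delete_argv(name: str) -> List[str]:
    # delete takes a positional NAME; placing -f before it is an assumption.
    return ["smolvm", "machine", "delete", "-f", name]
```

Callers pass the same `name` value to every builder, which is the point of the uniform kwarg: the positional-versus-flag difference stays in one place.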
didericis-claude	c73d717f71	feat(smolmachines): rewrite Smolfile to smolvm 0.8.0 schema + drop gvproxy (PRD 0023 chunk 2a) test / unit (pull_request) Successful in 21s Details test / integration (pull_request) Successful in 39s Details First sub-PR of chunk 2: rewrite the renderer chunk 1 shipped to match smolvm 0.8.0's actual Smolfile shape, delete the dead gvproxy renderer + its tests, simplify the prepare flow now that there's no gvproxy socket + no loopback-port allocation. Smolfile renderer: - Old shape (under the abandoned gvproxy design): name = ..., command = [...], [[net]] attachment = "unixgram", socket = "...". - New shape (smolvm 0.8.0): env = [...] (sorted K=V pairs), [network] allow_cidrs = ["<bundle-ip>/32"]. Nothing else. image / entrypoint / cmd come from the .smolmachine artifact built in chunk 2b; cpus / memory left at smolvm defaults. - Tests assert no leakage of TSI's --outbound-localhost-only or the old gvproxy/unixgram keys. util.py: - smolmachines_gvproxy_subnet → smolmachines_bundle_subnet, returning (subnet, gateway, bundle_ip). bundle_ip is always at .2 (gateway .1); subnet is /24, third octet derived from the slug hash, skipping the docker-default 17 to avoid the common 192.168.17.x collision. - allocate_loopback_port: deleted. The bundle gets a pinned docker IP now; the agent dials that IP directly through TSI. - smolmachines_preflight: dropped the gvproxy check; only smolvm is required. prepare.py: - Drops the gvproxy.yaml render + the loopback port allocation + the gvproxy_socket field on the plan. - Derives subnet / gateway / bundle_ip from the slug and populates the new SmolmachinesBottlePlan fields. - Agent env now uses IP-literal URLs (http://<bundle-ip>:8888 etc) since the guest will have no DNS resolver inside TSI's allowlist. bottle_plan.py: - Old fields: gvproxy_config_path, gvproxy_socket, gvproxy_subnet, gvproxy_gateway, host_port_map. - New fields: bundle_subnet, bundle_gateway, bundle_ip, smolfile_path. 
(smolmachine artifact path lands in chunk 2b.) Net: -410 lines. Full unit suite: 516 passing. The VM lifecycle + bundle bringup + launch wiring + smoke tests land in chunk 2b. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 04:01:07 -04:00
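The per-slug subnet derivation in the commit above could look roughly like the sketch below. The message only states the /24 shape, the .1 gateway, the .2 bundle IP, and the skip of octet 17. The hash function (SHA-256), the modulus, the collision step, and the name `bundle_subnet_sketch` are assumptions made for illustration.

```python
import hashlib
from typing import Tuple

DOCKER_DEFAULT_OCTET = 17  # skipped, per the commit message


def bundle_subnet_sketch(slug: str) -> Tuple[str, str, str]:
    """Return (subnet, gateway, bundle_ip) for a bottle slug."""
    # Assumption: SHA-256 of the slug, reduced into the range 1..254.
    digest = hashlib.sha256(slug.encode("utf-8")).digest()
    octet = int.from_bytes(digest[:4], "big") % 254 + 1
    if octet == DOCKER_DEFAULT_OCTET:
        # Assumption: step to the next octet when the hash lands on 17.
        octet += 1
    prefix = f"192.168.{octet}"
    # Gateway at .1 and bundle IP at .2, as stated in the message.
    return f"{prefix}.0/24", f"{prefix}.1", f"{prefix}.2"
```

Because the result depends only on the slug, the TSI allowlist entry (`<bundle-ip>/32`) can be computed at prepare time, before any container exists.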
didericis	20f411b22e	feat(smolmachines): backend skeleton + Smolfile/gvproxy renderers (PRD 0023 chunk 1) test / unit (pull_request) Successful in 22s Details test / integration (pull_request) Successful in 43s Details Ships the smolmachines backend's prepare side: subpackage layout, `_BACKENDS` registration under "smolmachines", preflight check for `smolvm` + `gvproxy` on PATH, and the two config-file renderers (Smolfile TOML + gvproxy YAML). Launch raises NotImplementedError until chunk 2. New module layout (mirrors backend/docker/): claude_bottle/backend/smolmachines/ __init__.py re-exports SmolmachinesBottleBackend backend.py SmolmachinesBottleBackend façade bottle.py SmolmachinesBottle stub (NotImpl until ch2) bottle_plan.py SmolmachinesBottlePlan + .print() bottle_cleanup_plan.py SmolmachinesBottleCleanupPlan stub prepare.py resolve_plan: writes both config files smolfile.py TOML renderer (stdlib, no tomli_w dep) gvproxy_config.py YAML renderer (same shape as pipelock_yaml) util.py preflight + per-slug subnet + loopback port The renderers are pure functions. `resolve_plan` runs the preflight, allocates one host-side loopback port per active sidecar (pipelock always; git-gate / supervise conditional), derives a per-slug gvproxy subnet (hash-mod-254, skipping the docker-default 17), and writes: - <stage>/gvproxy.yaml: subnet + DNS rule resolving only `proxy.internal` + port_forwards (one per active sidecar). - <stage>/smolfile.toml: guest command/env + virtio-net device backed by gvproxy's unixgram socket. No TSI flags — see PRD 0023 "Why gvproxy, not TSI". The agent's HTTPS_PROXY etc. point at `proxy.internal:<gateway-port>` so the guest dials through gvproxy. gvproxy resolves only `proxy.internal` → the gateway IP, and forwards exactly the listed ports to the host-side sidecar bundle (PRD 0024); every other destination — host LAN, host loopback, public internet directly — is unreachable by construction. 
29 new unit tests covering renderer correctness, subnet derivation stability + collision-avoidance, loopback port allocation, and preflight error paths. Full unit suite: 532 passing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 02:22:08 -04:00
didericis	5b9ceaaaee	fix(sidecars): per-daemon pipelock restart keeps supervise socket alive test / unit (pull_request) Successful in 21s Details test / integration (pull_request) Successful in 43s Details `apply_allowlist_change` used `docker restart <bundle>` to make pipelock reload, which bounced ALL four daemons — including supervise, whose MCP socket the agent's claude-code client had open. That dropped the connection. A second apply works because supervise has come back up by then. Fix: per-daemon restart via SIGUSR1. - New `_Supervisor.restart_daemon(name)` terminates one named child and spawns a replacement in place. Other daemons keep running. - main() wires SIGUSR1 → `restart_daemon("pipelock")`. Pipelock has no in-process reload, so this is its analog of egress's SIGHUP-reload-addon path. Pipelock is the only daemon that currently needs hot-config reload via restart; if others acquire the need, add a new signal. - `apply_allowlist_change` now `docker kill --signal USR1 <bundle>` instead of `docker restart`. Supervise / egress / git-gate keep running across the apply. Tests: - New `_Supervisor.restart_daemon` cases: replaces in place (different pid post-restart, sibling daemon unchanged), unknown name is a no-op, restart-during-shutdown is a no-op. - `test_pipelock_apply` rewritten to bring up the bundle image with `CLAUDE_BOTTLE_SIDECAR_DAEMONS=pipelock` so the supervisor is PID 1 and handles SIGUSR1. The previous standalone-pipelock setup wouldn't survive SIGUSR1 (pipelock default disposition is terminate). Test builds the bundle image in setUpClass (cached layers make repeat runs fast). 531 tests passing locally (unit + integration). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 02:12:37 -04:00
didericis	0848344438	fix(sidecars): apply_routes_change targets the bundle + SIGHUP forwarding test / unit (pull_request) Successful in 20s Details test / integration (pull_request) Successful in 42s Details Two bugs surfaced when applying an egress route change: 1. egress_apply.py still targeted claude-bottle-egress-<slug> — the legacy per-sidecar container that no longer exists (it's a docker-network alias on the bundle now). Switched it to sidecar_bundle_container_name(slug), matching the chunk-5 fix already made to pipelock_apply.py. 2. `docker kill --signal HUP <bundle>` lands SIGHUP on the supervisor (PID 1 in the bundle), which previously had no SIGHUP handler — the signal was ignored. Added `_Supervisor.forward_signal(sig, daemon_name)` and a SIGHUP handler in main() that forwards to the egress daemon so mitmdump's addon reload still works under the bundle. Tests: - New _Supervisor.forward_signal cases: forwards to the named child (Python subprocess as the SIGHUP target — bash trap + stdout=PIPE deferral interferes with the production-style test); unknown-daemon name is a no-op. Stale-reference cleanup (separate issue surfaced while looking at this): - claude_bottle/{egress,git_gate,egress_addon,egress_addon_core,supervise_server}.py: Dockerfile.egress / Dockerfile.git-gate / Dockerfile.supervise references updated to Dockerfile.sidecars (the old per-sidecar Dockerfiles were deleted in PRD 0024 chunk 5). - tests/README.md: dropped the entry for test_pipelock_sidecar_smoke (deleted in chunk 3) and added the new bundle integration tests. - git_gate.py: stale `DockerGitGate.start via docker cp` reference (the method was deleted in chunk 3) rewritten to the bind-mount path the renderer uses now. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 01:56:38 -04:00
didericis	62f6f8db34	refactor(sidecars): bundle is the only shape (PRD 0024 chunk 5) test / unit (pull_request) Successful in 21s Details test / integration (pull_request) Successful in 43s Details The CLAUDE_BOTTLE_SIDECAR_BUNDLE feature flag is gone. Every bottle ships with the agent + bundle pair — no opt-in, no legacy four-sidecar fallback. Changes: - Renderer (compose.py): bottle_plan_to_compose unconditionally emits {agent, sidecars}. Deleted _pipelock_service, _git_gate_service, _egress_service, _supervise_service helpers. _agent_service.depends_on collapses to ["sidecars"]. - sidecar_bundle.py: deleted sidecar_bundle_enabled (the flag parser). SIDECAR_BUNDLE_IMAGE + container-name helper stay. - pipelock_apply.py: docker cp + docker restart now target sidecar_bundle_container_name(slug). Bundle restart bounces all four daemons together (per-daemon reload is the eventual feature, not v1). - Per-sidecar modules trimmed: - egress.py: dropped EGRESS_IMAGE, EGRESS_DOCKERFILE, build_egress_image, egress_url. Kept EGRESS_PORT, CA paths, egress_container_name (still used by the renderer's network aliases). - git_gate.py: dropped GIT_GATE_IMAGE, GIT_GATE_DOCKERFILE, build_git_gate_image. Kept git_gate_host + GIT_GATE_PORT. - supervise.py: dropped SUPERVISE_IMAGE, SUPERVISE_DOCKERFILE, build_supervise_image, supervise_url. - Deleted Dockerfile.{egress,git-gate,supervise}. The bundle's Dockerfile.sidecars is the only sidecar image now. - test_compose.py: deleted TestPipelockAlwaysPresent, TestConditionalGitGate, TestConditionalEgress, TestConditionalSupervise, TestFullMatrix (legacy-shape only), TestSidecarBundleFlag (flag is gone). TestSidecarBundleShape drops its patch.dict wrapper. TestAgentAlwaysPresent's depends_on cases collapse to one. - test_pipelock_apply.py: bringup container name uses sidecar_bundle_container_name(slug) to match the production target. - README.md Architecture section rewritten to describe the agent + bundle pair. Net: -626 lines. 
Test status: 498 unit + 27 integration + 1 skipped (chunk-4 pending — superseded by this chunk's rewrite). Locally verified end-to-end bottle launch produces exactly 2 containers (claude-bottle-<slug> + claude-bottle-sidecars-<slug>). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 01:37:21 -04:00
didericis	2287b0dd08	test(sidecars): integration sweep for the bundle path (PRD 0024 chunk 4) test / unit (pull_request) Successful in 20s Details test / integration (pull_request) Successful in 40s Details Three deliverables: 1. Rewrite test_pipelock_apply bringup with a direct `docker run`. Replaces the .start-based bringup deleted in chunk 3. Stages the yaml + CAs to the real pipelock_state_dir so the bind-mount target matches what apply_allowlist_change writes to — the legacy .start path did this implicitly because it lived inside the production flow; the new bringup needs to be explicit about the path. All 4 cases pass. 2. New tests/integration/test_sidecar_bundle_compose.py: end-to-end smoke with CLAUDE_BOTTLE_SIDECAR_BUNDLE=1. Brings up a real bottle via the compose path and verifies the agent can reach pipelock + supervise through the bundle's legacy aliases (no agent-side config changes between flag positions). Skipped under act_runner — multi-stage build + bind mounts. 3. Two bundle-path bugs surfaced and fixed while running PRD 0022 with the flag on: - egress_entrypoint.sh: add `--set confdir=/home/mitmproxy/.mitmproxy` so mitmdump finds the bind-mounted CA. The legacy Dockerfile.egress runs as user mitmproxy (~mitmproxy resolves correctly); the bundle runs as root and otherwise would look in /root/.mitmproxy/ and mint a NEW CA the agent doesn't trust. Symptom: PRD 0022 attack-3 curl failed with "unable to get local issuer certificate". - sidecar_init.py: add `--listen 0.0.0.0:8888` to pipelock's argv. Without it pipelock defaults to 127.0.0.1, so the in-bundle egress's upstream connect to the `claude-bottle-pipelock-<slug>` alias arrives over the docker network and gets refused. The legacy renderer passed this flag verbatim; the bundle dropped it. Symptom: egress returned HTTP 502 with "Connect call failed ('172.x.x.x', 8888)". PRD 0022's 5-attack sandbox-escape suite now passes with the bundle flag on AND off. Test status: - Unit: 533 passing. 
- Integration: 9 passing locally with flag off, 5 passing with flag on. Bundle compose smoke + PRD 0022 sandbox-escape both green under CLAUDE_BOTTLE_SIDECAR_BUNDLE=1. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 01:15:14 -04:00
didericis	539234f29e	refactor(sidecars): drop vestigial start/stop methods (PRD 0024 chunk 3) test / unit (pull_request) Successful in 21s Details test / integration (pull_request) Successful in 41s Details Compose-up has owned per-container lifecycle since PRD 0018 ch3; the .start() / .stop() methods on DockerPipelockProxy / DockerEgress / DockerGitGate / DockerSupervise (and their abstractmethod declarations in the four base ABCs) were already documented as vestigial. With the bundle path in flight (PRD 0024 ch2), they are truly dead — collapse to nothing. Changes: - Removed start/stop methods from the four DockerSidecar classes. Plan dataclasses, image/path constants, container-name helpers, and the .prepare() methods all stay (the renderer + apply path still need them). - Removed the matching @abstractmethod declarations in the base ABCs so concrete subclasses don't have to stub them. - launch.launch() and prepare.resolve_plan() no longer take proxy/git_gate/egress/supervise instance parameters. backend.py loses the four instance attributes it threaded through. prepare.resolve_plan() instantiates the four classes itself to call their .prepare() methods. - Deleted four integration tests that only exercised the removed lifecycle: test_pipelock_sidecar_smoke, test_supervise_sidecar, test_git_gate_sidecar, test_git_gate_mirror. - Dropped the .stop-idempotency case in test_orphan_cleanup; the network-cleanup cases stay (those test real production code). - Marked test_pipelock_apply @skip pending chunk 4 — its bringup helper used .start; chunk 4 rewrites it with direct `docker run`. Dockerfile deletion deferred to chunk 5 (when the bundle flag default flips) — the legacy compose path still needs Dockerfile.{egress,git-gate,supervise} until then. Net: 708 lines removed, 80 added. 533 unit tests + 27 integration tests passing (5 skipped: the chunk-4-pending case + existing GITEA_ACTIONS guards). 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 01:01:10 -04:00
didericis	a1180adec1	feat(compose): emit bundle shape behind feature flag (PRD 0024 chunk 2) test / unit (pull_request) Successful in 21s Details test / integration (pull_request) Successful in 1m12s Details The docker backend's compose renderer now emits a single `sidecars` service in place of the four per-sidecar services when CLAUDE_BOTTLE_SIDECAR_BUNDLE is truthy. Default (unset/0/false) keeps the legacy five-service shape so existing operators don't have to migrate atomically; chunks 4-5 flip the default and delete the flag. New module claude_bottle/backend/docker/sidecar_bundle.py owns the bundle image constant (CLAUDE_BOTTLE_SIDECAR_IMAGE env var override + claude-bottle-sidecars:latest default), the Dockerfile reference, the container-name helper, and the flag-parser. The bundle service: - joins both internal + egress networks with aliases for every legacy shortname + per-slug long form so the agent's HTTPS_PROXY URL (which dials `egress` or `claude-bottle-pipelock-<slug>`) keeps resolving with no agent-side change - carries CLAUDE_BOTTLE_SIDECAR_DAEMONS=<csv> for the init supervisor to narrow which daemons to start - carries the union of the four prior services' daemon-private env vars (EGRESS_UPSTREAM_PROXY, SUPERVISE_*, token env names) - does NOT carry HTTPS_PROXY/HTTP_PROXY/NO_PROXY — those would route git-gate's git fetches through pipelock by mistake - union'd bind-mounts at the same in-container paths as before HTTPS_PROXY scoping moved into egress_entrypoint.sh so only mitmdump's subprocess sees it. In the legacy four-sidecar shape the env vars also lived in the egress service's compose env; the shell script's export is additionally defensive. Tests: - All 44 existing TestCompose cases pass unchanged (flag off → legacy shape). - 20 new TestSidecarBundleShape cases assert on the bundle's services / aliases / env / volumes / depends_on under the flag. 
- 8 new TestSidecarBundleFlag cases lock down the env-var parser (unset / 0 / false / no / off → disabled; everything else → enabled). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 00:43:08 -04:00
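The env-var parser rule locked down by those flag tests (unset / 0 / false / no / off disable, everything else enables) could be sketched as follows. The function name `bundle_flag_enabled` is hypothetical, and three behaviors are assumptions the message does not state: case-insensitive matching, whitespace stripping, and treating an empty string as disabled.

```python
import os
from typing import Mapping, Optional

# Values that keep the legacy shape. The empty string is an assumption.
_DISABLED = {"", "0", "false", "no", "off"}


def bundle_flag_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when CLAUDE_BOTTLE_SIDECAR_BUNDLE selects the bundle shape."""
    env = os.environ if env is None else env
    raw = env.get("CLAUDE_BOTTLE_SIDECAR_BUNDLE")
    if raw is None:
        # Unset means disabled, per the commit message.
        return False
    # Assumption: compare case-insensitively after trimming whitespace.
    return raw.strip().lower() not in _DISABLED
```

Taking the environment as a parameter keeps the parser a pure function, so tests can pass a plain dict instead of patching `os.environ`.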
didericis	62109a1caf	fix(sidecars): child death no longer tears down the bundle test / unit (pull_request) Successful in 20s Details test / integration (pull_request) Successful in 1m8s Details Reverses chunk 1's "any unexpected child death tears down the rest" policy. New behavior: a daemon dying is logged but does NOT initiate shutdown — the surviving daemons keep running and whatever the dead one served starts failing visibly on the agent side. The supervisor exits only when (a) it receives SIGTERM/SIGINT, or (b) every child has died on its own. Eventual design is restart-the-dead-daemon plus a notification to the supervise sidecar so the operator sees the event explicitly; this commit ships only the "log and leave alone" half. PRD 0024 open question 1 updated to reflect the new intent. Tests updated: replaced "crash propagates exit code via auto-teardown" with three cases that exercise the new policy (crash without shutdown leaves survivors up, crash-then-signal surfaces the nonzero code, all-children-die-unattended still converges the loop). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 00:19:50 -04:00
didericis	61f63684ac	feat(sidecars): bundle image + Python init supervisor (PRD 0024 chunk 1) test / unit (pull_request) Successful in 22s Details test / integration (pull_request) Successful in 1m12s Details New Dockerfile.sidecars multi-stage build: pulls the pinned pipelock and gitleaks binaries into a mitmproxy-base final image, installs git + openssh-client, and ships the project's egress addon + supervise server alongside a stdlib-Python init at /app/sidecar_init.py. The init supervisor (claude_bottle/sidecar_init.py) is PID 1 in the bundle. It spawns the daemons named in CLAUDE_BOTTLE_SIDECAR_DAEMONS (or all four by default), propagates SIGTERM/SIGINT to children with an 8s grace before SIGKILL, and exits with the first-unexpected-child exit code so a daemon crash tears down the bundle (per PRD 0024 open question 1's default). claude_bottle/egress_entrypoint.sh extracted verbatim from Dockerfile.egress's prior inline sh -c so the supervisor can call it as a normal child. Tests: - unit: _selected_daemons env-var subset behavior (7 cases), _Supervisor signal/exit-code semantics including SIGKILL escalation, and end-to-end main() via subprocess. - integration: builds the image and probes that pipelock, gitleaks, mitmdump, and the supervise Python module are present + executable, plus a no-daemons-selected smoke test of the entrypoint wiring. Skipped under act_runner (200+MB base pulls + multi-stage build). Renderer collapse and the deletion of Dockerfile.{egress,git-gate,supervise} land in chunk 2 + 3. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 00:05:06 -04:00
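The daemon-subset selection the init supervisor performs could be sketched as below. Only the env-var name and the "all four by default" rule come from the commit message. The daemon identifiers, the handling of unknown names (dropped), the empty-string case (no daemons), and the function name `selected_daemons_sketch` are assumptions for illustration.

```python
from typing import List, Mapping

# Assumption: these four identifiers; only "pipelock" is confirmed elsewhere in the log.
ALL_DAEMONS = ("pipelock", "egress", "git-gate", "supervise")


def selected_daemons_sketch(env: Mapping[str, str]) -> List[str]:
    """Pick which daemons to spawn from CLAUDE_BOTTLE_SIDECAR_DAEMONS."""
    raw = env.get("CLAUDE_BOTTLE_SIDECAR_DAEMONS")
    if raw is None:
        # Unset: start all four, per the commit message.
        return list(ALL_DAEMONS)
    wanted = {part.strip() for part in raw.split(",") if part.strip()}
    # Assumption: unknown names are ignored and canonical order is preserved.
    return [name for name in ALL_DAEMONS if name in wanted]
```

Under these assumptions an empty value selects nothing, which would match a no-daemons-selected smoke test of the entrypoint wiring.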
didericis	5c17fcdf90	test(integration): skip sandbox-escape suite under act_runner test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m10s Details The Gitea CI runner shares the host docker socket but not its filesystem, so pipelock_tls_init's host bind-mount path for CA files is invisible to the runner container — the same constraint that already gates the other bottle-bringup integration tests. PRD 0022's test suite was missing this guard; it failed on the post-merge main build with "pipelock tls init did not produce ca files". Mirror the existing skipIf pattern at the class level. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-26 23:03:57 -04:00
didericis	23f50f7720	fix(pipelock): scan all request headers + fix attack-3 destination test / unit (pull_request) Successful in 19s Details test / integration (pull_request) Failing after 49s Details Two related changes the PRD 0022 sandbox-escape test surfaced: 1. `pipelock_build_config` now emits `request_body_scanning.scan_headers: true` and `header_mode: all`. Pipelock's default `header_mode: sensitive` only checks Authorization / Cookie / X-Api-Key / X-Token / Proxy-Authorization / X-Goog-Api-Key — an agent attempting exfil could trivially pick a non-sensitive header (`X-Custom: $SECRET`) and slip through. `all` closes the gap; pipelock caps it by the same max_body_bytes the body scan uses. 2. Test 3 (HTTP exfil shapes) now targets raw.githubusercontent.com instead of api.anthropic.com. api.anthropic.com is in `DEFAULT_TLS_PASSTHROUGH` — pipelock can't MITM it because real LLM conversation bodies false-positive on DLP scanners (BIP-39 etc.). The trade-off is documented in `pipelock.DEFAULT_TLS_PASSTHROUGH`; the test now exercises a host where the sandbox is actually supposed to block. All 5 sandbox-escape attacks now produce HTTP 403 with the expected sandbox marker (`egress:`, `pipelock`, or `blocked:`): - Attack 1 (non-allowlisted host) ✓ egress - Attack 2 (non-allowlisted IP + spoof) ✓ egress - Attack 3a (URL path) ✓ pipelock DLP - Attack 3b (URL query) ✓ pipelock DLP - Attack 3c (request body) ✓ pipelock DLP - Attack 3d (request header) ✓ pipelock DLP (scan_headers) - Attack 4a (crafted subdomain) ✓ egress - Attack 4b (direct dig @8.8.8.8) ✓ network isolation - Attack 5 (README push, 3 secret shapes) ✓ gitleaks (pre-upstream) 489 unit tests pass (1 updated for the new request_body_scanning shape). Full integration suite passes in ~6s.	2026-05-26 22:38:38 -04:00
didericis	e2231f46a3	test(integration): PRD 0022 sandbox-escape suite (chunks 1-5) test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Failing after 2m13s Details End-to-end test that brings up a real bottle with allowlisted egress + git-gate + three planted secrets, then runs five attacks from inside the agent container. Chunks 1-5 implemented in one pass against the Docker backend: Attack 1 — non-allowlisted hostname (curl evil.example.com) ✓ blocked by egress Attack 2 — non-allowlisted IP literal (198.51.100.1) + host-header spoof via curl --resolve ✓ both blocked by egress Attack 3 — HTTP exfil to allowlisted destination via path / query / body / header ✗ ALL FOUR LEAK — request reaches api.anthropic.com with the secret embedded. Pipelock's DLP doesn't catch the anthropic-key shape in the body, and nothing scans path / query / headers. Attack 4 — DNS exfil via crafted subdomain + direct dig @8.8.8.8 query ✓ both blocked (egress rejects subdomain, internal network has no path to 8.8.8.8) Attack 5 — README push through git-gate with secret-bearing attacker URL (parameterized over anthropic / AWS / generic shapes); ordering check that gitleaks fires BEFORE any upstream attempt ✓ all three secret shapes blocked by gitleaks Per PRD 0022 Q1 the assertion in attack 3 is authoritative — HTTP 403 with an egress/pipelock marker in the body is the only acceptable outcome. Any 4xx from upstream means the secret reached the network. The four failing sub-tests are real sandbox gaps that need their own remediation PRDs before this test merges green. Also adds `dnsutils` (dig) to the base agent image so attack 4's direct-DNS check has a tool to run. CI: no changes needed — `.gitea/workflows/test.yml` already runs `tests/integration/` and the suite skip_unless_dockers cleanly when the runner has no Docker socket.	2026-05-26 22:23:45 -04:00
didericis	1a1ba6abd5	fix(dashboard): fall back to fresh claude when --continue has no session test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m7s Details `--continue` exits non-zero when an agent has been spun up but never typed at — there's no transcript to resume. Re-attaching to such an agent via Enter (tmux mode) was crashing the pane. Wrap the resume invocation in `sh -c '<cmd> --continue || <cmd>'` so a failed `--continue` cleanly falls through to a fresh claude. The shell adds microseconds and the fallback only kicks in when --continue would have failed anyway. New `_build_resume_argv_with_fallback(bottle)` builds the shell-wrapped docker exec argv with proper shlex quoting (so paths-with-spaces in `--append-system-prompt-file` survive). Only the tmux re-attach path uses it; first-attach + foreground handoff are unchanged. 489 unit tests pass (4 new for the fallback builder).	2026-05-26 15:34:21 -04:00
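The shell-wrapped fallback described in the commit above could be built along these lines. This is a sketch under assumptions, not the project's `_build_resume_argv_with_fallback`: it takes a plain argv list rather than a bottle object, and the name `resume_with_fallback_argv` is hypothetical.

```python
import shlex
from typing import List


def resume_with_fallback_argv(cmd: List[str]) -> List[str]:
    """Wrap cmd so a failed `--continue` falls through to a fresh run."""
    # shlex.join quotes each token, so paths with spaces survive the shell.
    fresh = shlex.join(cmd)
    resumed = shlex.join(cmd + ["--continue"])
    # `a || b` runs b only when a exits non-zero.
    return ["sh", "-c", f"{resumed} || {fresh}"]
```

For example, `resume_with_fallback_argv(["claude", "--append-system-prompt-file", "/tmp/my prompt.md"])` yields a three-element argv whose script quotes the path on both sides of the `||`.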
didericis	8d6e382af5	feat(dashboard): auto-focus next agent on stop, or close pane test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m5s Details After `x` stops a dashboard-owned bottle, slide focus to the next agent in the agents pane (the one filling the stopped row, or the new last row if the stopped was last) and respawn the right pane with that agent's claude session via `--continue`. If no agents remain, close the right pane via `tmux kill-pane`. Two new helpers: - `_tmux_close_right_pane(tmux_state)` — kills the tracked pane (if it exists) and clears pane_id / slug. - `_pick_next_after_stop(agents_before, selected_index, stopped_slug)` — pure chooser returning (new_index, agent) or None. Tested directly. Outside tmux, only the selected_agent index slides; no auto-attach (foreground handoff would take over the terminal, disruptive). 485 unit tests pass (6 new for the pick helper).	2026-05-26 15:21:20 -04:00
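The pure chooser described in the commit above could be sketched like this. It assumes agents are represented as a list of slug strings and that `selected_index` points at the stopped row; neither detail is confirmed by the message, and the real `_pick_next_after_stop` may differ.

```python
from typing import List, Optional, Tuple


def pick_next_after_stop(
    agents_before: List[str], selected_index: int, stopped_slug: str
) -> Optional[Tuple[int, str]]:
    """Return (new_index, agent) to focus after a stop, or None if none remain."""
    remaining = [slug for slug in agents_before if slug != stopped_slug]
    if not remaining:
        # No agents left: the caller closes the right pane instead.
        return None
    # Keep the same row (now filled by the next agent), or clamp to the new last row.
    new_index = max(0, min(selected_index, len(remaining) - 1))
    return new_index, remaining[new_index]
```

Stopping the middle row of `["a", "b", "c"]` keeps index 1 and focuses `"c"`; stopping the last row slides focus up to `"b"`.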
didericis	2ba84c5ba0	feat(dashboard): stop hook clears tmux state + right-pane row marker test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m6s Details PRD 0021 chunk 4 (final). Two adjustments to close the split-pane loop: 1. `_stop_bottle_flow` clears `tmux_state['slug']` when the stopped bottle was the right-pane occupant. The pane itself stays in place (claude exits with "container not found"); the operator presses Enter on a different agent to repurpose it via respawn-pane. 2. `_render` accepts `right_pane_slug` and marks the matching agents-pane row with a `*` prefix + A_BOLD (when it's not also the focused row — focused selection still wins for visibility). Gives the operator a clear visual link between which agent the dashboard says is "active right now" and which one is visible to their right. Wired through `_main_loop`: passes `tmux_state` to `_stop_bottle_flow` on `x`, and `tmux_state.get('slug')` to `_render` on every tick. 479 unit tests pass (1 new for the tmux_state-preservation on non-owned stop). PRD 0021 implementation complete pending merge.	2026-05-26 14:29:59 -04:00
didericis	9944878277	feat(dashboard): tmux split-pane helpers + Enter dispatch PRD 0021 chunk 2. New tmux integration: when `$TMUX` is set and the operator presses Enter on a focused agent row, the dashboard spawns / respawns the right pane with that bottle's claude session instead of taking over the terminal via curses.endwin. Mechanics: - `_in_tmux()` — true when `$TMUX` is set. - `_tmux_split_pane_create` — first attach: `tmux split-window -h -P -F '#{pane_id}'` opens a right pane and prints its id for tracking. - `_tmux_respawn_pane` — subsequent attaches: `tmux respawn-pane -k -t <id>` swaps the content without re-splitting. - `_tmux_pane_exists` — `tmux list-panes` check before respawn so a manually-closed pane gracefully falls back to a fresh split. - `_attach_in_tmux` — owns the create-or-respawn state machine, mutates `tmux_state` ({pane_id, slug}) so the main loop tracks the right-pane occupant. - `_attach_via_handoff` — the previous curses-endwin path, extracted as the fallback when tmux is missing or fails. - `_attach_to_bottle` dispatches: in tmux + state available → `_attach_in_tmux`; otherwise → handoff. Main loop gets `tmux_state: dict = {"pane_id": None, "slug": None}`. Chunks 3 + 4 wire it through the new-agent flow and the stop hook. `FileNotFoundError`-safe `subprocess.run` calls around every tmux invocation — a missing tmux binary cleanly falls back to the handoff for that keypress. 478 unit tests pass (10 new for the pure argv builders + `_claude_runtime_args`).	2026-05-26 14:26:40 -04:00
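The pure argv builders mentioned in the commit above could be sketched as follows. The tmux subcommands and flags are the ones quoted in the message; the function names, the `env` parameter, and passing the command as one shell-quoted string are assumptions for illustration.

```python
import os
import shlex
from typing import List, Mapping, Optional


def in_tmux(env: Optional[Mapping[str, str]] = None) -> bool:
    # tmux exports TMUX inside its sessions; a non-empty value means we are in one.
    env = os.environ if env is None else env
    return bool(env.get("TMUX"))


def split_pane_argv(command: List[str]) -> List[str]:
    # First attach: open a right pane and print its pane id for tracking.
    return ["tmux", "split-window", "-h", "-P", "-F", "#{pane_id}", shlex.join(command)]


def respawn_pane_argv(pane_id: str, command: List[str]) -> List[str]:
    # Later attaches: kill and replace the content of the tracked pane.
    return ["tmux", "respawn-pane", "-k", "-t", pane_id, shlex.join(command)]
```

Keeping the builders free of `subprocess` calls lets unit tests assert on the argv directly, while the caller decides how to handle a missing tmux binary.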
didericis	2303cbc0be	refactor(bottle): extract claude_docker_argv from exec_claude test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m10s Details PRD 0021 chunk 1. The tmux split-pane helpers (chunk 2+) need the same docker-exec argv that `exec_claude` builds — including the `--append-system-prompt-file <path>` flag the bottle's provisioner copies into place. Extract the argv construction into a pure `claude_docker_argv(argv, *, tty)` method so both foreground (`subprocess.run`) and tmux paths (`tmux respawn-pane …`) build from the same source. `exec_claude` becomes a one-liner that runs subprocess.run on the argv. No behavior change; 472 unit tests pass (7 new for the pure builder).	2026-05-26 14:21:04 -04:00
didericis	3ed3745982	feat(dashboard): `x` stops a dashboard-owned bottle (PRD 0020 chunk 4) test / unit (pull_request) Successful in 17s Details test / integration (pull_request) Successful in 1m7s Details Final PRD 0020 chunk. `x` on a focused agents-pane row tears down the selected bottle if the dashboard owns it (started via the chunk-2 `n` flow): pops `(cm, bottle, identity)` from the main loop's bottles map, snapshots the transcript best-effort, calls `cm.__exit__(None, None, None)` to drive the existing compose-down + network-remove sequence, then `settle_state` to honor any pre-existing preserve marker. On a non-owned slug (discovered via `list_active_slugs` but not in the dashboard's bottles dict — i.e., previous-dashboard or external `./cli.py start` bottle), `x` is a no-op with a status hint pointing at `./cli.py cleanup`. Matches the PRD's cross-dashboard re-attach model: the dashboard can re-attach either kind, but can only tear down its own. The PRD's chunk 5 ("quit-cleanup") is satisfied by the existing no-op behavior of `q` — per the user's resolved-question answer, quit leaves bottles running unchanged. No code change needed for that. Footer surfaces `[x] stop`. 465 unit tests pass (1 new for the non-owned no-op path; the owned path is integration territory because it drives a real compose-down).	2026-05-26 03:46:57 -04:00
didericis	572306ddb6	feat(dashboard): Enter on agents pane re-attaches to bottle test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m11s Details PRD 0020 chunk 3. Enter on a focused agents-pane row drops to a claude session inside the selected bottle. Works for both dashboard-owned bottles (looks up the stored Bottle handle in the main loop's `bottles` dict) and externally-discovered ones (synthesizes a DockerBottle from the slug → `claude-bottle-<slug>` container name). For the synthesized path, the `--append-system-prompt-file` target resolves via metadata.json + the manifest's agent prompt if both can be read; otherwise the re-attach runs without the flag (claude defaults to no system prompt, the bottle's other state is untouched). Shares the curses.endwin → attach → refresh handoff with the chunk-2 new-agent flow via a new `_attach_to_bottle` helper. Footer reshuffled to advertise `[Enter] view/attach`. 464 unit tests pass (3 new for `_bottle_for_slug`).	2026-05-26 03:39:58 -04:00
didericis	309ffaa4ab	feat(dashboard): agent picker modal + new-agent (`n`) flow test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m7s Details PRD 0020 chunk 2. Pressing `n` opens a modal that lists every agent from the manifest with `(N running)` suffixes for ones that already have bottles up. Type to filter (substring, case-insensitive); j/k or arrows to navigate; Enter to confirm; Esc clears the filter on first press, exits the picker on the second. On confirmation, the dashboard runs: - `prepare_with_preflight` from chunk 1 with curses-modal render + prompt callables (the preflight modal centers the plan summary + captures [y/N]). - `backend.launch(plan).__enter__()` — enters but doesn't bind the context to a `with`. The (cm, bottle, identity) tuple lands in the main loop's `bottles` dict keyed by slug. - `curses.endwin()` → `attach_claude(bottle)` → `stdscr.refresh()` handoff. The agent's claude session takes over the terminal; on exit the dashboard re-renders with the bottle now visible in the agents pane. Crucially the context manager is held alive in `bottles` — never `__exit__`'d at quit. Chunk 4 will wire `x` to that exit; for now bottles started from the dashboard stay running until explicit cleanup. Matches the PRD's "q does not tear down" decision. Footer surfaces `[n] new agent`. 461 unit tests pass (8 new for `_filter_agents` and `_running_counts`).	2026-05-26 03:22:44 -04:00
didericis	a56be6beb5	refactor(start): extract prepare_with_preflight + attach_claude test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m7s Details PRD 0020 chunk 1. `cli/start.py`'s `_launch_bottle` did three things in one function: prepare + preflight, attach claude, and settle state on teardown. Split them so the dashboard (PRD 0020 chunk 2+) can reuse the prepare + attach pieces individually without going through the CLI's one-shot orchestrator: - `prepare_with_preflight(spec, *, stage_dir, render_preflight, prompt_yes, dry_run)`: injects render + prompt callables so the CLI binds them to stderr/stdin while the dashboard binds them to a curses modal. Returns `(plan, identity)`; identity is set after `backend.prepare` returns so callers can reap the prepare-time state dir on abort via `settle_state` in their finally, preserving today's preflight-N cleanup. - `attach_claude(bottle, *, remote_control)`: runs claude inside the bottle and returns its exit code. The dashboard calls this from inside a `curses.endwin` → … → `stdscr.refresh()` handoff. - `capture_session_state` / `settle_state` lose their leading underscore; the dashboard will call them on session-end + explicit-stop respectively. `_launch_bottle` becomes a thin orchestrator over those helpers. No behavior change; all 453 unit tests pass and `./cli.py start implementer --dry-run` produces identical preflight output.	2026-05-26 03:12:29 -04:00
didericis	c9825cf701	refactor(egress): write routes.yaml as actual YAML, not JSON-in-yml test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m7s Details `egress_render_routes` now emits hand-rolled YAML in the same style as `pipelock_render_yaml`. The egress addon parses it via `yaml_subset.parse_yaml_subset`, the same parser the manifest loader + pipelock_apply use. Why bother: routes.yaml is bind-mounted into the egress sidecar AND surfaced to operators through `routes edit` (PRD 0019). JSON-in-yml renders ugly in $EDITOR and signals "this is data" rather than "this is config you can read at a glance". Real YAML reads cleanly. Mechanics: - `yaml_subset.py` drops its `claude_bottle.log` dependency. Errors now raise `YamlSubsetError` (a `ValueError`); the manifest loader + pipelock_apply catch it at the boundary and forward to `die` / `PipelockApplyError` so callers see the same behavior they did before. - `Dockerfile.egress` adds one COPY line for `yaml_subset.py` so it sits flat in `/app/` next to the addon. The addon uses an absolute-import-with-fallback shim so the same file works inside the container AND from the host's unit tests. - `egress_apply._merge_single_route` round-trips current routes.yaml through `parse_yaml_subset` + a new `_render_routes_payload` helper instead of `json.loads` + `json.dumps`. End-to-end: rebuilt the egress image, ran `./cli.py start` to a full bring-up, confirmed the addon's boot log shows `egress: loaded 9 route(s)`, i.e., the YAML parses inside the container. 453 unit + 3 integration tests pass.	2026-05-26 02:17:42 -04:00
didericis	7b29c81f27	feat(dashboard): agent-scoped e/p, drop discover-and-prompt path test / unit (pull_request) Successful in 17s Details test / integration (pull_request) Successful in 1m6s Details PRD 0019 chunk 4 (final). The `e` (routes edit) and `p` (pipelock edit) keys now require an agent selection in the agents pane. Pressing them with the proposals pane focused, with no active agents, or with an out-of-range selection is a no-op with a status hint ("no agent selected; Tab into the agents pane first"). The discover-and-prompt scaffolding inside `_operator_edit_routes_flow` / `_operator_edit_allowlist_flow` / `_operator_edit_flow` is gone. The flows now take an `ActiveAgent` + required-service name; they refuse with a clear message when the bottle lacks the requested sidecar (e.g., `routes edit` against a bottle with no `bottle.egress.routes` declared). The `discover_egress_slugs` + `discover_pipelock_slugs` + `_discover_active_with_service` helpers come out — they had no remaining callers. Footer now reads `[e/p] edit selected agent`.	2026-05-26 01:50:28 -04:00
didericis	0abffc4d90	feat(dashboard): Tab toggle + per-pane selection state test / unit (pull_request) Successful in 17s Details test / integration (pull_request) Successful in 1m4s Details PRD 0019 chunk 3. The TUI now has two focusable panes — proposals and agents — and `Tab` toggles which one the `j/k`/arrow keys move through. Each pane keeps its own selection index. Switching panes doesn't lose the position in the other; the cursor (`>` + reverse-video row) appears only in the focused pane. The label line on each pane shows "(focused)" when active. Footer reshuffled: `[Tab] switch pane [j/k] move [Enter] view [a/m/r] proposal [e/p] edit [q] quit`. When the agents pane is focused and there's no status message to display, the idle status line surfaces the currently-selected agent (or "[no active agents]" / "[no agent selected]" fallbacks) so the operator knows what an agent-scoped edit verb will target after chunk 4 wires them up. Proposal action keys (a/m/r/Enter) are gated on the proposals pane being focused — pressing them with the agents pane focused is a no-op. e/p still use the global discover-and-prompt flow for one more chunk; chunk 4 swaps them to read the agents-pane selection.	2026-05-26 01:37:23 -04:00
didericis	cfd8f269ba	feat(dashboard): render active agents pane below proposals test / unit (pull_request) Successful in 17s Details test / integration (pull_request) Successful in 1m4s Details PRD 0019 chunk 2. The TUI's main render now draws two panes: proposals on top (existing), active agents on the bottom (new). Header counts both totals. The agents pane refreshes on the same 1s tick — agents starting/stopping reflect without operator action. Each agent row shows slug, agent name, started-time (HH:MM:SS of the metadata.json timestamp), and the bracketed list of sidecars currently up. The `agent` service is filtered out of the displayed list — it's always present so it'd be noise; the sidecars are the differentiator. A bottle whose only running service is `agent` (sidecars still warming up) renders as `(starting)`. No selection model yet — that's chunk 3. The cursor stays in the proposals pane; `j/k`/arrow nav and the proposal action keys are unchanged.	2026-05-26 01:23:59 -04:00
didericis	6e4a9f606f	feat(dashboard): discover_active_agents helper + ActiveAgent dataclass test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m6s Details PRD 0019 chunk 1. New `discover_active_agents()` in dashboard.py returns one `ActiveAgent(slug, agent_name, started_at, services)` per currently-running compose project: - Slugs come from `list_active_slugs()` (chunk-5 shared helper). - The service set per project comes from ONE label-filtered `docker ps` call (PRD open question #1: avoids N per-bottle `compose ps` invocations on each 1s refresh tick). - agent_name + started_at come from each bottle's metadata.json; "?" / "" fallbacks when the file is missing so the row renders rather than vanishes. Not wired into the TUI yet — chunk 2 renders the agents pane. The parser (`_parse_services_by_project`) is split out as a pure function so the conditional-input shape can be unit-tested without docker.	2026-05-26 01:11:54 -04:00
didericis	1fa3745832	refactor(dashboard): discover via docker compose ls test / unit (pull_request) Successful in 17s Details test / integration (pull_request) Successful in 1m8s Details PRD 0018 chunk 5. The dashboard's operator-edit verbs (`routes edit`, `pipelock edit`) enumerated running sidecars via `docker ps --filter name=...` prefix scans. Switch to `docker compose ls`-based discovery so the dashboard, cleanup CLI, and launch step all agree on what's running. Mechanics: - `claude_bottle/backend/docker/compose.py` grows three shared helpers: `list_compose_projects` (the JSON parse moved out of cleanup), `slug_from_compose_project` (inverse of `compose_project_name`), and `list_active_slugs` (sugar over the first two for the common "what's running?" question). - cleanup.py drops its private `_list_compose_projects` + `_PROJECT_PREFIX` in favor of the shared ones; `list_active` simplifies (one compose-ls call, not two). - dashboard.py's `_discover_sidecar_slugs` becomes `_discover_active_with_service`: cross-references the active slug list with a label-filtered `docker ps` so only bottles whose given service container is actually up surface in the edit menu. Bottles without an egress sidecar (no bottle.egress.routes) no longer appear for `routes edit`. 3 new unit tests cover the slug ↔ compose-project naming contract; manual probe with a fake compose project confirms both `discover_egress_slugs` and `discover_pipelock_slugs` return the expected slug.	2026-05-26 00:14:16 -04:00
didericis	aee249f119	refactor(cleanup): compose-ls driven, plus orphan state-dir reaping test / unit (pull_request) Successful in 17s Details test / integration (pull_request) Successful in 1m9s Details PRD 0018 chunk 4. `claude-bottle cleanup` now derives its work from `docker compose ls --all --format json`, filtered to projects whose name starts with `claude-bottle-`. Per project: one `compose down --volumes` removes the containers + the compose-managed networks atomically. The plan also enumerates three fallback buckets: - Stray containers: `claude-bottle-` containers with no `com.docker.compose.project` label (left over from pre-compose code paths). Cleared via `docker rm -f`. - Stray networks: `claude-bottle-` networks with no compose project label. Cleared via `docker network rm`. - Orphan state dirs: per-bottle `~/.claude-bottle/state/<id>/` dirs with no live project AND no `.preserve` marker. The `.preserve` marker (capability-block or auto-preserve-on-crash) explicitly opts out of reaping; manual `rm -rf` is the only path for preserved state. cli/cleanup.py collapses to a single y/N prompt: backend.prepare_cleanup returns everything in one plan, backend.cleanup processes everything, no more double-prompt for state. The CLI-side state-dir enumeration + `_state_summary` flags from PR #25 are gone; the backend's orphan-detection rules subsume them.	2026-05-25 23:48:02 -04:00
didericis	f1c5816d1f	refactor(compose): drop pre-create networks + pipelock CIDR allowlist PRD 0018 chunk 4 spike: empirically verified that pipelock's SSRF guard checks proxied-request destinations (e.g. api.anthropic.com → public IP) and not source IPs of incoming connections. The bottle's own internal CIDR was being added to ssrf.ip_allowlist defensively, but that defense isn't load-bearing — direct pipelock probe (`curl --proxy http://pipelock https://api.anthropic.com/`) returns 404 from upstream rather than blocking on SSRF. So: - Networks become compose-managed (`internal: true` on the internal network; the egress one is a normal user-defined bridge). Compose creates + removes them via up/down. - launch.py drops the `docker network create` + `network_inspect_cidr` + pipelock yaml re-render dance. - The pre-create/external scaffolding from chunk 3 goes with it. End-to-end `./cli.py start` still works; cleanup leaves no orphans. If real-world use surfaces an SSRF block we hadn't predicted, the allowlist can come back via subnet-pinning rather than pre-create.	2026-05-25 23:48:02 -04:00
didericis	cefdc8c6e9	feat(launch): switch start to docker compose project per bottle test / unit (pull_request) Successful in 18s Details test / integration (pull_request) Successful in 1m5s Details PRD 0018 chunk 3. Each instance is now one `docker compose` project: - launch.py renders the compose spec via chunk-1's bottle_plan_to_compose, writes it to state/<slug>/docker-compose.yml, `docker compose up -d`s, and (on teardown) dumps `docker compose logs --no-color --timestamps` to state/<slug>/compose.log before `docker compose down`. - Networks are pre-created (`docker network create --internal` + user-defined bridge) so pipelock yaml can know the internal CIDR before compose-up. Compose references them with `external: true`; the launch step's ExitStack still owns network removal. - Agent still runs `sleep infinity`; claude reaches it via `docker exec -it` exactly like before (per the PRD's resolved TTY question). - metadata.json grows a `compose_project` field so dashboard / cleanup tooling can derive compose invocations without re-deriving the slug. Security follow-ups from chunk-2 review: (b) CA private keys: pipelock + egress ca-key.pem land at 0o600 explicitly. The mitmproxy cert+key concat stays 0o644 because the egress container's uid-1000 user reads it through the bind mount; parent dir at 0o700 still restricts host-side reach. (c) Apply atomicity: egress_apply + pipelock_apply switch from `docker cp` to host-side write-temp-then-rename on the bind-mount source. POSIX rename is atomic on the same filesystem, so a sidecar SIGHUP racing the apply can't see a half-written routes.yaml / pipelock.yaml. Per-sidecar Docker{Sidecar}.start/stop methods stay in place — the integration test suite drives them directly to validate each image in isolation, which is still useful. launch.py no longer calls them; a follow-up chunk can prune if the integration tests move to the compose lifecycle. 
git-gate entrypoint's chmod 600 on the keyfile + known_hosts now tolerates EROFS (`|| true`): the host SSH key is already 0600 (SSH refuses to load otherwise), so the inside-container chmod was already a no-op in the docker-cp path and now just needs to not error on the read-only bind mount. 422 unit tests pass; supervise integration test passes; end-to-end `./cli.py start implementer` brings up the project, attaches, captures full merged logs on teardown, and reaps all containers + networks.	2026-05-25 23:16:40 -04:00
didericis	4760a09263	feat(compose): pure renderer for bottle plan -> compose dict test / unit (pull_request) Successful in 17s Details test / integration (pull_request) Successful in 1m5s Details PRD 0018 chunk 1. New module `claude_bottle/backend/docker/compose.py` exposing `bottle_plan_to_compose(plan) -> dict` — a pure function that translates a fully-resolved DockerBottlePlan into a Compose v2 spec. Not wired in yet. Tests cover the conditional-service matrix (git on/off × egress on/off × supervise on/off) plus per-service shape (images vs builds, network aliases, bind mounts, env vars, depends_on).	2026-05-25 22:28:50 -04:00
didericis	1e5b0dcfca	refactor: rename egress-proxy → egress everywhere test / unit (pull_request) Successful in 17s Details test / integration (pull_request) Successful in 1m10s Details The manifest key is `egress:` now; finish the rename so the rest of the codebase matches. Files (Dockerfile.egress, claude_bottle/egress.py etc.), classes (Egress, EgressConfig, EgressRoute, EgressPlan, DockerEgress), constants (EGRESS_HOSTNAME, EGRESS_ROUTES, ...), container name prefix (claude-bottle-egress-*), docker network alias (egress), the introspection host (_egress.local), the MCP tool IDs (egress-block, list-egress-routes), and the preflight label all drop the `-proxy` suffix.	2026-05-25 21:59:47 -04:00
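
Commit cefdc8c6e9 above describes switching `egress_apply` + `pipelock_apply` to a host-side write-temp-then-rename so a sidecar reload cannot observe a half-written config file. The repository's actual code is not shown in this log, so the following is only a minimal sketch of that general pattern; the helper name `atomic_write_text` and its parameters are hypothetical and not taken from the project.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(target: Path, text: str, mode: int = 0o644) -> None:
    """Replace `target` so readers see the old or the new content, never a partial file.

    Hypothetical helper for illustration; not the project's implementation.
    """
    target = Path(target)
    # The temp file lives in the target's directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        os.chmod(tmp_name, mode)
        # os.replace maps to rename(2) on POSIX, which is atomic within one filesystem.
        os.replace(tmp_name, target)
    except BaseException:
        # If anything failed before the rename, do not leave the temp file behind.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A caller would use it as `atomic_write_text(Path("routes.yaml"), rendered_yaml, mode=0o600)`, where `rendered_yaml` stands in for whatever text the renderer produced. Creating the temp file next to the target, rather than in the system temp directory, is what keeps the rename on a single filesystem.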


173 Commits