Bundle daemons (pipelock, egress, optionally git-gate + supervise)
now actually start with their config files bind-mounted from the
inner Plans the docker backend already produces. Chunks 2d + 3
ran with daemons_csv="" so the bundle's init supervisor idled;
chunk 4b wires up the real path: agent → pipelock → egress →
internet (when routes declared) is now functional, modulo agent-
image gaps (claude-code / TLS-trust-store / git in the guest)
that chunk 4c addresses.
bottle_plan.py — added the four inner Plan fields:
proxy_plan: PipelockProxyPlan
git_gate_plan: GitGatePlan
egress_plan: EgressPlan
supervise_plan: SupervisePlan | None
Same shape the docker backend's plan uses. Docker-network-only
fields (internal_network, egress_network) stay at dataclass
defaults — the smolmachines bundle is on a per-bottle bridge
with a pinned IP, not docker's --internal + egress topology.
prepare.py — instantiates DockerPipelockProxy / DockerEgress /
DockerGitGate / DockerSupervise and calls their .prepare()
methods to write the per-bottle config files (pipelock.yaml,
routes.yaml, git-gate entrypoint/hooks, supervise queue dir)
under the per-bottle state dir. (The "Docker" prefix on the
class names is a misnomer here — .prepare() is platform-neutral,
inherited from each sidecar's ABC. A future cleanup could factor
the prepare logic out of the docker subpackage.)
launch.py — major rewrite:
- pipelock_tls_init at launch (always); egress_tls_init only
when the bottle declares routes (otherwise the CA files
aren't bind-mounted and openssl runs would be wasted).
- Inner Plans updated in place with launch-time CA paths +
EGRESS_UPSTREAM_PROXY = http://127.0.0.1:8888 (egress's
upstream is pipelock on the bundle's own loopback; same
container's network namespace).
- BundleLaunchSpec env + volumes built from the inner Plans:
pipelock.yaml + CA + key (always); egress routes + CAs +
upstream env + token-slot bare names (when routes); git-gate
entrypoint + hooks + per-upstream identity files (when
upstreams); supervise queue dir + env (when enabled).
- daemons_csv = ["egress", "pipelock"] + ["git-gate"] (if
upstreams) + ["supervise"] (if enabled).
- Token env values resolved from host env via
`egress_resolve_token_values` and threaded into the
docker-run subprocess env (bare-name -e entries in spec
inherit from there — values never land on argv).
Tests:
- 552 unit passing (no new unit cases; fixture updated to
populate the new plan fields).
- 5 integration cases passing locally (Darwin + smolvm + docker
+ not GITEA_ACTIONS):
* test_smoke_exec_echo — still works.
* test_localhost_reach_probe — host loopback still refused.
* test_egress_port_bypass_probe — <bundle-ip>:9099 still
refused, NOW WITH EGRESS ACTUALLY RUNNING (chunk 3's
127.0.0.1 bind-address is doing its job).
* test_prompt_file_lands_in_guest — still works.
* test_pipelock_answers_on_bundle_ip — NEW. From inside the
guest, wget to <bundle-ip>:8888 gets an HTTP response
(not "connection refused") — proves pipelock is actually
listening and the bind-mount + CA generation path works.
What's left in chunk 4:
- 4c: agent-image-conversion (claude-code + git + curl +
ca-certificates in the guest). Chunk 2d's alpine placeholder
stays for now.
- 4d: provision_ca + provision_git + provision_supervise once
the agent image has the required tools.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Addresses PR #69 review comment: `del plan, target` was just a
silence-the-unused-arg gesture but reads oddly for a stub. `pass`
is the standard "this is a stub" sentinel.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
First slice of chunk 4: implement the two provisioning methods
that don't depend on agent-image tooling beyond `cp` and
`mkdir`. provision_ca / provision_git / provision_supervise
land once the agent-image gap is solved (chunk 4b+) — they need
update-ca-certificates, git, and the claude binary respectively,
none of which the chunk-2d alpine placeholder provides.
What this PR ships:
- `claude_bottle/backend/smolmachines/provision/` subpackage
with `prompt.py` + `skills.py`. Each routes through
`smolvm.machine_cp` / `machine_exec`. provision_prompt mirrors
the docker contract (file always copied; return value drives
--append-system-prompt-file iff the agent has a non-empty
prompt). provision_skills mkdir + cp per skill, matching
the docker backend's loop.
- prepare.py now writes the prompt file under
agent_state_dir(slug) with the agent's `prompt` body, mode
0o600. The in-guest path is `/root/.claude-bottle-prompt.txt`
(alpine has no `node` user; will become `/home/node/...` once
the real claude-bottle image lands).
- launch.py calls `provision(plan, machine_name)` after
machine_start. The returned prompt path threads to
SmolmachinesBottle so exec_claude can add
--append-system-prompt-file when the agent has a prompt.
- backend.py: provision_prompt / provision_skills now real;
provision_git is a deliberate stub (waiting on the git-gate
inner Plan + git in the agent image). provision_supervise
stays the chunk-2d stub.
Tests:
- 7 new unit cases (test_smolmachines_provision.py): argv
shape (mocked smolvm.machine_cp / .machine_exec),
prompt return-value contract, no-op-with-no-skills,
CLAUDE_BOTTLE_GUEST_SKILLS_DIR override, fail-on-missing-skill.
- 1 new integration case in test_smolmachines_launch.py:
end-to-end verification that the prompt file lands in the
alpine guest at /root/.claude-bottle-prompt.txt with the
expected content (via `bottle.exec("cat ...")`). The smoke +
the two TSI probes stay green.
552 unit + 4 integration (Darwin+smolvm+docker gated) passing.
What's left in chunk 4:
- 4b: thread the inner Plans (PipelockProxyPlan / EgressPlan /
GitGatePlan / SupervisePlan) through prepare + launch so the
bundle daemons actually run (currently daemons_csv="").
- 4c: the agent-image-conversion gap — get claude-code + git +
curl + ca-certificates into the guest image (build a
.smolmachine via `pack create --from-vm` after manual setup,
or push the docker image to a registry smolvm can pull).
- 4d: provision_ca + provision_git + provision_supervise once
4b + 4c land.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Egress's bind address is now env-driven via EGRESS_LISTEN_HOST.
Unset → mitmdump's default (all interfaces) — the docker
backend's behavior, unchanged. Set to `127.0.0.1` → mitmdump
binds localhost only.
The smolmachines launch sets EGRESS_LISTEN_HOST=127.0.0.1 in
the bundle's env unconditionally. TSI's allowlist is
`<bundle-ip>/32` (IP-only, not port-granular), which would
otherwise let the agent dial `<bundle-ip>:9099` and bypass
pipelock's DLP by talking to egress directly. Binding egress
to localhost inside the bundle closes that gap at the socket
level — the agent still reaches the IP (TSI permits it) but
egress refuses the connect because it's not listening on the
docker bridge interface.
The docker backend doesn't set the env var because its agent
dials egress directly via the docker network alias — egress
MUST be reachable from outside the bundle there. The
asymmetry is documented in the entrypoint script's comment.
Changes:
- egress_entrypoint.sh: read EGRESS_LISTEN_HOST, conditionally
pass `--listen-host <host>` to mitmdump.
- smolmachines/launch.py: BundleLaunchSpec.environment now
includes `EGRESS_LISTEN_HOST=127.0.0.1`.
- New unit tests (5): the entrypoint script's argv shape under
various env combinations, verified via a fake mitmdump shim
that prints its argv.
545 unit + 3 integration tests passing. The egress-port-bypass
probe from chunk 2d still passes (chunk 2d ran with daemons_csv=""
so no egress was up; chunk 3 makes the probe preserve its
property once egress IS up in chunk 4).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
End-to-end launch flow for the smolmachines backend. Brings up
the per-bottle docker bridge + sidecar bundle, creates and
starts the smolvm guest pointed at the bundle's pinned IP via
TSI's `--allow-cidr <bundle-ip>/32`, yields a SmolmachinesBottle
handle that routes exec/cp through `smolvm machine exec / cp`,
tears everything down on context exit.
launch.py:
- ExitStack-managed: create_bundle_network → start_bundle →
machine_create → machine_start (each registered for reverse
teardown).
- daemons_csv="" for chunk 2d — bundle init logs "no daemons
selected" and idles. Real daemon bringup with inner-Plan-driven
env + volumes lands in chunk 4.
bottle.py:
- SmolmachinesBottle.exec → smolvm.machine_exec (captured).
- SmolmachinesBottle.exec_claude → direct subprocess.run with
inherited TTY for interactive sessions.
- SmolmachinesBottle.cp_in → smolvm.machine_cp.
Architecture pivots forced by smolvm 0.8.0's CLI shape:
1. `--from <smolmachine>` and `--smolfile <toml>` are MUTUALLY
EXCLUSIVE in smolvm 0.8.0. We need --from to avoid the
registry-pull race that bit us on machine_start (libkrun
agent's network attempt got refused by macOS with
"connect: permission denied" on IPv6). So Smolfile is dropped
entirely; per-bottle env + allow_cidrs flow as CLI flags
(`--allow-cidr CIDR`, `-e K=V`) directly to machine_create.
2. `smolvm pack create --image` doesn't pull from the local
docker daemon — only OCI registries via crane. The real
claude-bottle:latest image lives in the local docker daemon
and isn't reachable that way. Chunk 2d ships with an alpine
placeholder; the agent-image-conversion gap belongs to
chunk 4 (push the image to a registry, or smolvm grows a
docker-daemon transport).
Other changes:
- machine_create grew `image=` / `from_path=` / `allow_cidrs=`
/ `env=` kwargs; smolfile= dropped.
- bottle_plan: smolfile_path → agent_from_path + guest_env.
- prepare: pack_create against `alpine:latest`, cached under
~/.cache/claude-bottle/smolmachines/ keyed by image ref.
- Deleted smolfile.py + test_smolfile.py (dead code now).
Tests:
- Unit: 540 passing (smolvm wrapper grew 4 new flag forms; one
test renamed to reflect --from + --allow-cidr + -e combo).
- Integration: 3 new cases in tests/integration/
test_smolmachines_launch.py, gated on Darwin + smolvm on PATH
+ docker + not GITEA_ACTIONS:
* smoke: bottle.exec("echo hello-from-vm") round-trips with
the correct stdout + returncode.
* localhost-reach probe: agent dials 127.0.0.1:9 → connect
refused (TSI's <bundle-ip>/32 allowlist doesn't include
loopback). The regression test for the gap the PRD design
pivot was about.
* egress-port-bypass probe: agent dials <bundle-ip>:9099
(egress's port) → connect refused. Chunk 2d has no
daemons running so nothing's listening anyway; chunk 3
will preserve this property once egress is up but bound
to 127.0.0.1 inside the bundle.
End-to-end smoke + both probes green locally on macOS with
smolvm 0.8.0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
claude_bottle/backend/smolmachines/sidecar_bundle.py — primitives
for the per-bottle bridge + bundle container with pinned IP:
- bundle_network_name(slug) / bundle_container_name(slug)
- create_bundle_network(name, subnet, gateway)
- remove_bundle_network(name)
- start_bundle(BundleLaunchSpec, env=)
- stop_bundle(slug)
`BundleLaunchSpec` carries the launch-time fields (network +
subnet + gateway + bundle_ip + daemons_csv + environment +
volumes). Wiring it up from the inner Plans (PipelockProxyPlan,
EgressPlan, GitGatePlan, SupervisePlan) is chunk 2d's job; this
module is the docker-argv surface only.
Pinning the bundle IP via `docker run --ip <bundle-ip>` is what
makes smolvm's TSI allowlist (`<bundle-ip>/32`) safe to compute
at prepare time — without pinning, we'd have to inspect the
assigned IP after start and feed it back into the Smolfile.
Idempotent semantics where it matters: `create_bundle_network`
treats "already exists" as success, `remove_bundle_network` +
`stop_bundle` treat "no such ..." as success. Other failures
die / warn depending on whether the launch flow can recover.
Tests:
- 15 unit cases (mocked subprocess.run): argv shape for create
/ remove / start / stop, idempotent paths, host-env
inheritance to docker run subprocess.
- 1 integration case (real docker daemon, gated on docker
available + not GITEA_ACTIONS): end-to-end bringup of an
empty-daemons bundle on a 192.168.211.0/24 bridge, confirms
the container lands at the pinned IP. Skipped if the
claude-bottle-sidecars:latest image isn't built (operator
hasn't run a docker bottle yet).
546 unit tests passing. Real-docker bundle bringup green
locally.
Launch wiring + provisioning + PRD 0022 acceptance probes
land in chunk 2d.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
claude_bottle/backend/smolmachines/smolvm.py — one thin Python
function per smolvm CLI subcommand the launch flow needs:
- pack_create(image, output) → smolvm pack create
- machine_create(name, from_path,
smolfile) → smolvm machine create
- machine_start(name) → smolvm machine start
- machine_stop(name) → smolvm machine stop
- machine_delete(name) → smolvm machine delete -f
- machine_exec(name, argv, env,
workdir, timeout) → smolvm machine exec
- machine_cp(src, dst) → smolvm machine cp
- is_available() → shutil.which check
The wrapper hides the CLI's inconsistent name-flag style
(positional NAME on create/delete, --name on start/stop/exec/
status) behind a uniform `name=` kwarg.
Two return shapes:
- SmolvmRunResult (returncode + stdout + stderr) from
machine_exec, because callers care about the in-VM
command's exit code.
- Raises SmolvmError on non-zero for all other commands;
failure to create/start/stop a VM is fatal to the launch
flow, not branched on.
Tests:
- 15 unit cases mocking subprocess.run, covering argv shape
per subcommand (the --name vs positional inconsistency
locked down), SmolvmError on non-zero for non-exec paths,
SmolvmRunResult passthrough on exec, empty-path cp no-op.
- 2 integration cases against the real smolvm binary
(gated on Darwin + smolvm on PATH + not GITEA_ACTIONS):
smolvm --help responds, machine ls --json parses as a
list (the contract chunk 4's list_active will consume).
531 unit tests passing. Real-smolvm smoke green locally.
Bundle bringup + launch wiring + the localhost-reach /
egress-port-bypass probes land in chunks 2c + 2d.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
First sub-PR of chunk 2: rewrite the renderer chunk 1 shipped to
match smolvm 0.8.0's actual Smolfile shape, delete the dead
gvproxy renderer + its tests, simplify the prepare flow now that
there's no gvproxy socket + no loopback-port allocation.
Smolfile renderer:
- Old shape (under the abandoned gvproxy design): name = ...,
command = [...], [[net]] attachment = "unixgram",
socket = "...".
- New shape (smolvm 0.8.0): env = [...] (sorted K=V pairs),
[network] allow_cidrs = ["<bundle-ip>/32"]. Nothing else.
image / entrypoint / cmd come from the .smolmachine artifact
built in chunk 2b; cpus / memory left at smolvm defaults.
- Tests assert no leakage of TSI's --outbound-localhost-only or
the old gvproxy/unixgram keys.
util.py:
- smolmachines_gvproxy_subnet → smolmachines_bundle_subnet,
returning (subnet, gateway, bundle_ip). bundle_ip is always
at .2 (gateway .1); subnet is /24, third octet derived from
the slug hash, skipping the docker-default 17 to avoid the
common 192.168.17.x collision.
- allocate_loopback_port: deleted. The bundle gets a pinned
docker IP now; the agent dials that IP directly through TSI.
- smolmachines_preflight: dropped the gvproxy check; only
smolvm is required.
prepare.py:
- Drops the gvproxy.yaml render + the loopback port allocation
+ the gvproxy_socket field on the plan.
- Derives subnet / gateway / bundle_ip from the slug and
populates the new SmolmachinesBottlePlan fields.
- Agent env now uses IP-literal URLs (http://<bundle-ip>:8888
etc) since the guest will have no DNS resolver inside TSI's
allowlist.
bottle_plan.py:
- Old fields: gvproxy_config_path, gvproxy_socket,
gvproxy_subnet, gvproxy_gateway, host_port_map.
- New fields: bundle_subnet, bundle_gateway, bundle_ip,
smolfile_path. (smolmachine artifact path lands in chunk 2b.)
Net: -410 lines. Full unit suite: 516 passing.
The VM lifecycle + bundle bringup + launch wiring + smoke tests
land in chunk 2b.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Chunk-1's empirical spike against smolvm 0.8.0 contradicted the
research note that motivated the gvproxy network design: smolvm
exposes no virtio-net-over-unixgram attachment. The first draft's
"why gvproxy, not TSI" argument turns out to apply only to
`--outbound-localhost-only`, not to TSI generally.
New design:
- Bundle (PRD 0024) runs on a dedicated per-bottle docker bridge
with a pinned IP. Smolfile sets `[network] allow_cidrs =
["<bundle-ip>/32"]` and nothing else. Agent can reach the bundle
and nothing else — host loopback, LAN, public internet directly
are all refused at the VMM (TSI) layer.
- Bind-address mitigation: egress binds 127.0.0.1:9099 inside the
bundle (pipelock-internal); pipelock / git-gate / supervise
bind 0.0.0.0 so the agent (across the TSI allowlist) can reach
them. This is the port-granularity TSI's IP-only allowlist
doesn't provide.
- Smolfile renderer rewritten in chunk 2 to smolvm 0.8.0's actual
schema (image / entrypoint / cmd / env / [network] allow_cidrs).
The chunk-1 renderer (name= / [[net]]= under the gvproxy
design) emits the wrong shape and will be replaced.
- Drop gvproxy + VZFileHandleNetworkDeviceAttachment + the
PyObjC fallback. Backend layout loses gvproxy_config.py,
gvproxy.py, vfkit_attach.py.
- Acceptance plan adds an egress-port-bypass probe in addition
to the localhost-reach probe.
- Chunks reshape: chunk 1 stays (renderer rewrite is part of
chunk 2's cost); chunk 2 covers VM lifecycle + bundle + new
Smolfile renderer; chunk 3 is the bundle bind-address change;
chunks 4-5 unchanged in spirit.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Addresses PR #62 review comments on
claude_bottle/backend/smolmachines/bottle_plan.py:
- Lift the multi-value label printer (was a nested helper inside
DockerBottlePlan.print) into a new module
claude_bottle/backend/print_util.py:print_multi. Both backends
use it for env / skills / git / egress lines.
- Strip the three smolmachines-preflight lines the review flagged:
the gvproxy subnet line, the smolfile path line, and the
gvproxy-config path line. Internal detail — operators see the
agent / env / skills / bottle / git / egress that already
matter on the docker side, and nothing else.
- Add `git → upstream` to the smolmachines git output to match
what's useful at preflight time (the docker version shows
upstream_host:port; this is similar shape).
Leaves the slug=spec.identity-or-mint pattern alone pending a
reply on PR comment #432 — the docker backend uses the same
pattern to preserve identity across `resume`, so dropping it
would silently break the resume path once smolmachines launch
lands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Ships the smolmachines backend's prepare side: subpackage layout,
`_BACKENDS` registration under "smolmachines", preflight check
for `smolvm` + `gvproxy` on PATH, and the two config-file
renderers (Smolfile TOML + gvproxy YAML). Launch raises
NotImplementedError until chunk 2.
New module layout (mirrors backend/docker/):
claude_bottle/backend/smolmachines/
__init__.py re-exports SmolmachinesBottleBackend
backend.py SmolmachinesBottleBackend façade
bottle.py SmolmachinesBottle stub (NotImpl until ch2)
bottle_plan.py SmolmachinesBottlePlan + .print()
bottle_cleanup_plan.py SmolmachinesBottleCleanupPlan stub
prepare.py resolve_plan: writes both config files
smolfile.py TOML renderer (stdlib, no tomli_w dep)
gvproxy_config.py YAML renderer (same shape as pipelock_yaml)
util.py preflight + per-slug subnet + loopback port
The renderers are pure functions. `resolve_plan` runs the
preflight, allocates one host-side loopback port per active
sidecar (pipelock always; git-gate / supervise conditional),
derives a per-slug gvproxy subnet (hash-mod-254, skipping the
docker-default 17), and writes:
- <stage>/gvproxy.yaml: subnet + DNS rule resolving only
`proxy.internal` + port_forwards (one per active sidecar).
- <stage>/smolfile.toml: guest command/env + virtio-net device
backed by gvproxy's unixgram socket. No TSI flags — see
PRD 0023 "Why gvproxy, not TSI".
The agent's HTTPS_PROXY etc. point at `proxy.internal:<gateway-
port>` so the guest dials through gvproxy. gvproxy resolves only
`proxy.internal` → the gateway IP, and forwards exactly the
listed ports to the host-side sidecar bundle (PRD 0024); every
other destination — host LAN, host loopback, public internet
directly — is unreachable by construction.
29 new unit tests covering renderer correctness, subnet
derivation stability + collision-avoidance, loopback port
allocation, and preflight error paths. Full unit suite: 532
passing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`apply_allowlist_change` used `docker restart <bundle>` to make
pipelock reload, which bounced ALL four daemons — including
supervise, whose MCP socket the agent's claude-code client had
open. That dropped the connection. A second apply works because
supervise has come back up by then.
Fix: per-daemon restart via SIGUSR1.
- New `_Supervisor.restart_daemon(name)` terminates one named
child and spawns a replacement in place. Other daemons keep
running.
- main() wires SIGUSR1 → `restart_daemon("pipelock")`. Pipelock
has no in-process reload, so this is its analog of egress's
SIGHUP-reload-addon path. Pipelock is the only daemon that
currently needs hot-config reload via restart; if others
acquire the need, add a new signal.
- `apply_allowlist_change` now `docker kill --signal USR1
<bundle>` instead of `docker restart`. Supervise / egress /
git-gate keep running across the apply.
Tests:
- New `_Supervisor.restart_daemon` cases: replaces in place
(different pid post-restart, sibling daemon unchanged),
unknown name is a no-op, restart-during-shutdown is a no-op.
- `test_pipelock_apply` rewritten to bring up the bundle image
with `CLAUDE_BOTTLE_SIDECAR_DAEMONS=pipelock` so the
supervisor is PID 1 and handles SIGUSR1. The previous
standalone-pipelock setup wouldn't survive SIGUSR1 (pipelock
default disposition is terminate). Test builds the bundle
image in setUpClass (cached layers make repeat runs fast).
531 tests passing locally (unit + integration).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two bugs surfaced when applying an egress route change:
1. egress_apply.py still targeted claude-bottle-egress-<slug> —
the legacy per-sidecar container that no longer exists (it's
a docker-network alias on the bundle now). Switched it to
sidecar_bundle_container_name(slug), matching the chunk-5
fix already made to pipelock_apply.py.
2. `docker kill --signal HUP <bundle>` lands SIGHUP on the
supervisor (PID 1 in the bundle), which previously had no
SIGHUP handler — the signal was ignored. Added
`_Supervisor.forward_signal(sig, daemon_name)` and a SIGHUP
handler in main() that forwards to the egress daemon so
mitmdump's addon reload still works under the bundle.
Tests:
- New _Supervisor.forward_signal cases: forwards to the named
child (Python subprocess as the SIGHUP target — bash trap +
stdout=PIPE deferral interferes with the production-style
test); unknown-daemon name is a no-op.
Stale-reference cleanup (separate issue surfaced while looking
at this):
- claude_bottle/{egress,git_gate,egress_addon,
egress_addon_core,supervise_server}.py: Dockerfile.egress /
Dockerfile.git-gate / Dockerfile.supervise references updated
to Dockerfile.sidecars (the old per-sidecar Dockerfiles were
deleted in PRD 0024 chunk 5).
- tests/README.md: dropped the entry for
test_pipelock_sidecar_smoke (deleted in chunk 3) and added
the new bundle integration tests.
- git_gate.py: stale `DockerGitGate.start via docker cp`
reference (the method was deleted in chunk 3) rewritten to
the bind-mount path the renderer uses now.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three deliverables:
1. Rewrite test_pipelock_apply bringup with a direct `docker run`.
Replaces the .start-based bringup deleted in chunk 3. Stages
the yaml + CAs to the real pipelock_state_dir so the bind-
mount target matches what apply_allowlist_change writes to —
the legacy .start path did this implicitly because it lived
inside the production flow; the new bringup needs to be
explicit about the path. All 4 cases pass.
2. New tests/integration/test_sidecar_bundle_compose.py: end-
to-end smoke with CLAUDE_BOTTLE_SIDECAR_BUNDLE=1. Brings up
a real bottle via the compose path and verifies the agent
can reach pipelock + supervise through the bundle's legacy
aliases (no agent-side config changes between flag positions).
Skipped under act_runner — multi-stage build + bind mounts.
3. Two bundle-path bugs surfaced and fixed while running PRD
0022 with the flag on:
- egress_entrypoint.sh: add `--set confdir=/home/mitmproxy/
.mitmproxy` so mitmdump finds the bind-mounted CA. The
legacy Dockerfile.egress runs as user mitmproxy (~mitmproxy
resolves correctly); the bundle runs as root and otherwise
would look in /root/.mitmproxy/ and mint a NEW CA the agent
doesn't trust. Symptom: PRD 0022 attack-3 curl failed with
"unable to get local issuer certificate".
- sidecar_init.py: add `--listen 0.0.0.0:8888` to pipelock's
argv. Without it pipelock defaults to 127.0.0.1, so the
in-bundle egress's upstream connect to the
`claude-bottle-pipelock-<slug>` alias arrives over the
docker network and gets refused. The legacy renderer
passed this flag verbatim; the bundle dropped it. Symptom:
egress returned HTTP 502 with "Connect call failed
('172.x.x.x', 8888)".
PRD 0022's 5-attack sandbox-escape suite now passes with the
bundle flag on AND off.
Test status:
- Unit: 533 passing.
- Integration: 9 passing locally with flag off, 5 passing with
flag on. Bundle compose smoke + PRD 0022 sandbox-escape both
green under CLAUDE_BOTTLE_SIDECAR_BUNDLE=1.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Compose-up has owned per-container lifecycle since PRD 0018 ch3;
the .start() / .stop() methods on DockerPipelockProxy /
DockerEgress / DockerGitGate / DockerSupervise (and their
abstractmethod declarations in the four base ABCs) were already
documented as vestigial. With the bundle path in flight
(PRD 0024 ch2), they are truly dead — collapse to nothing.
Changes:
- Removed start/stop methods from the four DockerSidecar
classes. Plan dataclasses, image/path constants,
container-name helpers, and the .prepare() methods all stay
(the renderer + apply path still need them).
- Removed the matching @abstractmethod declarations in the
base ABCs so concrete subclasses don't have to stub them.
- launch.launch() and prepare.resolve_plan() no longer take
proxy/git_gate/egress/supervise instance parameters. backend.py
loses the four instance attributes it threaded through.
prepare.resolve_plan() instantiates the four classes itself
to call their .prepare() methods.
- Deleted four integration tests that only exercised the
removed lifecycle: test_pipelock_sidecar_smoke,
test_supervise_sidecar, test_git_gate_sidecar,
test_git_gate_mirror.
- Dropped the .stop-idempotency case in test_orphan_cleanup;
the network-cleanup cases stay (those test real production
code).
- Marked test_pipelock_apply @skip pending chunk 4 — its
bringup helper used .start; chunk 4 rewrites it with direct
`docker run`.
Dockerfile deletion deferred to chunk 5 (when the bundle flag
default flips) — the legacy compose path still needs
Dockerfile.{egress,git-gate,supervise} until then.
Net: 708 lines removed, 80 added.
533 unit tests + 27 integration tests passing (5 skipped: the
chunk-4-pending case + existing GITEA_ACTIONS guards).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The docker backend's compose renderer now emits a single
`sidecars` service in place of the four per-sidecar services
when CLAUDE_BOTTLE_SIDECAR_BUNDLE is truthy. Default (unset/0/
false) keeps the legacy five-service shape so existing operators
don't have to migrate atomically; chunks 4-5 flip the default
and delete the flag.
New module claude_bottle/backend/docker/sidecar_bundle.py owns
the bundle image constant (CLAUDE_BOTTLE_SIDECAR_IMAGE env var
override + claude-bottle-sidecars:latest default), the
Dockerfile reference, the container-name helper, and the
flag-parser.
The bundle service:
- joins both internal + egress networks with aliases for every
legacy shortname + per-slug long form so the agent's
HTTPS_PROXY URL (which dials `egress` or
`claude-bottle-pipelock-<slug>`) keeps resolving with no
agent-side change
- carries CLAUDE_BOTTLE_SIDECAR_DAEMONS=<csv> for the init
supervisor to narrow which daemons to start
- carries the union of the four prior services' daemon-private
env vars (EGRESS_UPSTREAM_PROXY, SUPERVISE_*, token env names)
- does NOT carry HTTPS_PROXY/HTTP_PROXY/NO_PROXY — those would
route git-gate's git fetches through pipelock by mistake
- union'd bind-mounts at the same in-container paths as before
HTTPS_PROXY scoping moved into egress_entrypoint.sh so only
mitmdump's subprocess sees it. In the legacy four-sidecar shape
the env vars also lived in the egress service's compose env;
the shell script's export is additionally defensive.
Tests:
- All 44 existing TestCompose cases pass unchanged (flag off →
legacy shape).
- 20 new TestSidecarBundleShape cases assert on the bundle's
services / aliases / env / volumes / depends_on under the
flag.
- 8 new TestSidecarBundleFlag cases lock down the env-var
parser (unset / 0 / false / no / off → disabled; everything
else → enabled).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reverts the earlier removal — EXPOSE is doc-only on the
renderer-driven publish path, but keeping it in the Dockerfile
(with the comment naming it as such) documents the bundle's
port surface for anyone reading the file.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reverses chunk 1's "any unexpected child death tears down the
rest" policy. New behavior: a daemon dying is logged but does
NOT initiate shutdown — the surviving daemons keep running and
whatever the dead one served starts failing visibly on the
agent side. The supervisor exits only when (a) it receives
SIGTERM/SIGINT, or (b) every child has died on its own.
Eventual design is restart-the-dead-daemon plus a notification
to the supervise sidecar so the operator sees the event
explicitly; this commit ships only the "log and leave alone"
half. PRD 0024 open question 1 updated to reflect the new
intent.
Tests updated: replaced "crash propagates exit code via
auto-teardown" with three cases that exercise the new policy
(crash without shutdown leaves survivors up, crash-then-signal
surfaces the nonzero code, all-children-die-unattended still
converges the loop).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
EXPOSE doesn't publish ports — the compose renderer does that.
Carrying it just to document the in-container port set adds
noise without doing work.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New Dockerfile.sidecars multi-stage build: pulls the pinned
pipelock and gitleaks binaries into a mitmproxy-base final image,
installs git + openssh-client, and ships the project's egress
addon + supervise server alongside a stdlib-Python init at
/app/sidecar_init.py.
The init supervisor (claude_bottle/sidecar_init.py) is PID 1 in
the bundle. It spawns the daemons named in
CLAUDE_BOTTLE_SIDECAR_DAEMONS (or all four by default),
propagates SIGTERM/SIGINT to children with an 8s grace before
SIGKILL, and exits with the first-unexpected-child exit code so
a daemon crash tears down the bundle (per PRD 0024 open
question 1's default).
claude_bottle/egress_entrypoint.sh extracted verbatim from
Dockerfile.egress's prior inline sh -c so the supervisor can
call it as a normal child.
Tests:
- unit: _selected_daemons env-var subset behavior (7 cases),
_Supervisor signal/exit-code semantics including SIGKILL
escalation, and end-to-end main() via subprocess.
- integration: builds the image and probes that pipelock,
gitleaks, mitmdump, and the supervise Python module are
present + executable, plus a no-daemons-selected smoke test
of the entrypoint wiring. Skipped under act_runner (200+MB
base pulls + multi-stage build).
Renderer collapse and the deletion of Dockerfile.{egress,git-gate,
supervise} land in chunk 2 + 3.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace pipelock + egress + git-gate + supervise as four
separate containers with one bundle image
(claude-bottle-sidecars) running all four daemons under a small
stdlib Python init supervisor. Compose file collapses from five
services to two; same daemons, same ports, same protocols, one
container.
Sized: bundle image + init → renderer collapse (feature-flagged)
→ backend Python trim → integration sweep → flag removal.
Prerequisite for PRD 0023 chunk 3 (smolmachines backend reuses
the same bundle as its sole host-side sidecar container).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace the four host-side sidecar processes (pipelock + egress +
git-gate + supervise) with a single bundled container per bottle,
defined in PRD 0024 and consumed here. egress is internal to the
bundle as pipelock's upstream; only pipelock, git-gate, and
supervise are externally addressable, and only when the bottle
uses them.
gvproxy port_forwards collapse from one-per-process to one-per-
external-port, all pointing into the one bundle container.
Sizing: chunk 3 becomes "sidecar bundle lifecycle" and depends
on PRD 0024 having landed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
TSI's --outbound-localhost-only is permissive on all of
127.0.0.0/8 with no destination-port filter, so any host
loopback service (local Postgres, IDE plugins, another bottle's
sidecar) is reachable from the guest. That's the wrong default
for the malicious-agent threat model.
Reworked the network design around gvproxy + VFKT unixgram
attachment: the guest gets a virtio-net device, gvproxy is the
userspace TCP/IP stack on the host side, and the only thing
reachable from the guest is the explicit port-forward list
(typically just pipelock). Host LAN, host loopback, and the
public internet directly are gone by construction.
VMM choice (smolmachines vs PyObjC + Virtualization.framework)
is an open question contingent on whether libkrun's virtio-net
mode lets us point at a custom unixgram socket. Backend name
stays "smolmachines" either way per the original spec.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Specs a second concrete BottleBackend selectable via
CLAUDE_BOTTLE_BACKEND=smolmachines: per-agent libkrun microVM on
macOS, sidecars relocated to host-side loopback ports plumbed via
Smolfile env, PRD 0022's sandbox-escape suite as the acceptance
gate (the env-var flip is the only change required). Docker
backend ships unchanged and remains default.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Gitea CI runner shares the host docker socket but not its
filesystem, so pipelock_tls_init's host bind-mount path for CA
files is invisible to the runner container — the same constraint
that already gates the other bottle-bringup integration tests.
PRD 0022's test suite was missing this guard; it failed on the
post-merge main build with "pipelock tls init did not produce ca
files". Mirror the existing skipIf pattern at the class level.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two related changes the PRD 0022 sandbox-escape test surfaced:
1. `pipelock_build_config` now emits
`request_body_scanning.scan_headers: true` and
`header_mode: all`. Pipelock's default `header_mode:
sensitive` only checks Authorization / Cookie / X-Api-Key
/ X-Token / Proxy-Authorization / X-Goog-Api-Key — an
agent attempting exfil could trivially pick a
non-sensitive header (`X-Custom: $SECRET`) and slip
through. `all` closes the gap; pipelock caps it by the
same max_body_bytes the body scan uses.
2. Test 3 (HTTP exfil shapes) now targets
raw.githubusercontent.com instead of api.anthropic.com.
api.anthropic.com is in `DEFAULT_TLS_PASSTHROUGH` —
pipelock can't MITM it because real LLM conversation
bodies false-positive on DLP scanners (BIP-39 etc.). The
trade-off is documented in `pipelock.DEFAULT_TLS_PASSTHROUGH`;
the test now exercises a host where the sandbox is
actually supposed to block.
All 5 sandbox-escape attacks now produce HTTP 403 with the
expected sandbox marker (`egress:`, `pipelock`, or `blocked:`):
- Attack 1 (non-allowlisted host) ✓ egress
- Attack 2 (non-allowlisted IP + spoof) ✓ egress
- Attack 3a (URL path) ✓ pipelock DLP
- Attack 3b (URL query) ✓ pipelock DLP
- Attack 3c (request body) ✓ pipelock DLP
- Attack 3d (request header) ✓ pipelock DLP (scan_headers)
- Attack 4a (crafted subdomain) ✓ egress
- Attack 4b (direct dig @8.8.8.8) ✓ network isolation
- Attack 5 (README push, 3 secret shapes) ✓ gitleaks (pre-upstream)
489 unit tests pass (1 updated for the new request_body_scanning
shape). Full integration suite passes in ~6s.
End-to-end test that brings up a real bottle with allowlisted
egress + git-gate + three planted secrets, then runs five
attacks from inside the agent container.
Chunks 1-5 implemented in one pass against the Docker backend:
Attack 1 — non-allowlisted hostname (curl evil.example.com)
✓ blocked by egress
Attack 2 — non-allowlisted IP literal (198.51.100.1) + host-
header spoof via curl --resolve
✓ both blocked by egress
Attack 3 — HTTP exfil to allowlisted destination via path /
query / body / header
✗ ALL FOUR LEAK — request reaches api.anthropic.com
with the secret embedded. Pipelock's DLP doesn't
catch the anthropic-key shape in the body, and
nothing scans path / query / headers.
Attack 4 — DNS exfil via crafted subdomain + direct
dig @8.8.8.8 query
✓ both blocked (egress rejects subdomain, internal
network has no path to 8.8.8.8)
Attack 5 — README push through git-gate with secret-bearing
attacker URL (parameterized over anthropic / AWS /
generic shapes); ordering check that gitleaks fires
BEFORE any upstream attempt
✓ all three secret shapes blocked by gitleaks
Per PRD 0022 Q1 the assertion in attack 3 is authoritative —
HTTP 403 with an egress/pipelock marker in the body is the only
acceptable outcome. Any 4xx from upstream means the secret
reached the network. The four failing sub-tests are real
sandbox gaps that need their own remediation PRDs before this
test merges green.
Also adds `dnsutils` (dig) to the base agent image so attack 4's
direct-DNS check has a tool to run.
CI: no changes needed — `.gitea/workflows/test.yml` already runs
`tests/integration/` and the suite skip_unless_dockers cleanly
when the runner has no Docker socket.
All seven open questions now have decisions baked in:
- Q1 (HTTP-exfil scope): authoritative. Every shape MUST
block; chunk 3 expands into remediation sub-PRDs if
any of path/query/header leak today.
- Q3 (fake secret): multiple shapes, parameterized.
Three env vars (TEST_SECRET_ANTHROPIC, _AWS, _GENERIC);
test 5 loops via subTest. Resilient to gitleaks rule
renames.
- Q6 (missing backend): die. `get_bottle_backend()`'s
current behavior surfaces clearly; surprise-skips are
worse than loud failures for new-backend branches.
- Q7 (tool deps): preflight check. setUpClass runs
`which curl && which git && which dig`; SkipTest with
the missing list catches future backends shipping
thinner base images.
Updated implementation chunks + test-5 sketch to match.
No remaining open questions.
User feedback:
- Q2 (direct DNS resolver test): yes — test 4 grows a
second sub-assertion verifying `dig @8.8.8.8` from the
agent has no path out, alongside the existing
crafted-subdomain check.
- Q4 (gitleaks ordering): test 5 grows an ordering check
— asserts the rejection mentions `gitleaks` AND does
NOT mention upstream-network-phase phrases (resolve /
refused / unreachable / upstream). Confirms gitleaks
rejects BEFORE git-gate tries any upstream push.
- Q5 (CI): try it, accept fallback. New chunk 6 adds a
Gitea Actions job marked `continue-on-error: true` —
runs the suite if the runner can host compose, doesn't
block the workflow if docker-in-docker prevents it.
Three open questions remain (1: pipelock's actual DLP
coverage for non-body shapes; 3: realistic fake secret
shape vs. gitleaks regex; 6+7: backend-agnostic invocation
+ required tools — for the smolmachines work).
Draft a PRD for a composite integration test that brings up
a real bottle with a known allowlist + planted secret and
runs five attacks from inside the agent container:
1. Request to non-allowlisted hostname
2. Request to non-allowlisted IP (incl. host-header spoof)
3. Secret exfil via HTTP — path / query / body / headers
4. Secret exfil via crafted DNS subdomain
5. Secret exfil via README link pushed through git-gate
Each attack passes only when blocked with a permissions
error. The suite is backend-agnostic — runs against
whatever CLAUDE_BOTTLE_BACKEND selects — so it becomes the
gate the upcoming smolmachines spike has to pass before that
backend can substitute for Docker.
Sized into 5 chunks (fixture → attacks 1+2 → attack 3 →
attack 4 → attack 5). Seven open questions called out,
biggest being: today's pipelock probably leaks via header /
path / query because DLP only scans bodies — the test will
expose this as a real gap (chunk 3 lands with
`expectedFailure` markers if so).
When a fresh proposal arrives, the dashboard now also:
- Runs `tmux select-pane -t \$TMUX_PANE` (the dashboard's own
pane id, captured at startup) so tmux focus jumps to the
dashboard from wherever the operator was (typically claude
in the right pane).
- Flips internal focus to PANE_PROPOSALS so j/k navigates the
queued items immediately.
- Lands the selected cursor on the first new proposal —
proposals are sorted by arrival ascending, so the earliest
new arrival in the batch gets the cursor.
Stacks with the bell + label highlight from the previous
commit. The operator gets:
1. Audible bell (or tmux activity marker)
2. Tmux focus on the dashboard pane
3. Dashboard's internal focus on the proposals list
4. Cursor on the actual new proposal
5. Pane label flashing `(new!)` in bold green
— all without leaving the keyboard.