fix(cli): install cleanup_all trap before pipelock_start to avoid orphan networks

Previously cleanup_all was defined AFTER network_create_internal /
network_create_egress / pipelock_start ran, so a failure during
pipelock_start (or in network_create_egress added by the prior commit)
would land in the cleanup_stage trap that knows nothing about networks.
The internal and egress networks would survive the failed launch and
accumulate as orphans on the host.

Move the cleanup_all definition + `trap … EXIT INT TERM` install ahead
of the resource creation, and gate the CONTAINER branch on
`-n "${CONTAINER:-}"` since CONTAINER is set earlier in the function
but the trap now runs in the early-failure window. pipelock_stop and
network_remove are already idempotent against missing resources.

Smoke test: with `CLAUDE_BOTTLE_PIPELOCK_IMAGE` pinned to a nonexistent
digest, `./cli.sh start implementer` now creates both networks, fails
at pipelock_start, and exits with both networks removed —
`docker network ls | grep claude-bottle` returns nothing.

Assisted-by: Claude Code
This commit is contained in:
2026-05-08 01:18:40 -04:00
parent a21115e032
commit 713424214e
+16 -7
View File
@@ -530,15 +530,20 @@ cmd_start() {
INTERNAL_NETWORK=""
EGRESS_NETWORK=""
PIPELOCK_CONTAINER=""
INTERNAL_NETWORK="$(network_create_internal "$SLUG")"
EGRESS_NETWORK="$(network_create_egress "$SLUG")"
PIPELOCK_CONTAINER="$(pipelock_start "$SLUG" "$INTERNAL_NETWORK" "$EGRESS_NETWORK" "$STAGE_DIR" "$PIPELOCK_YAML_FILENAME")"
# Cleanup container on exit too. Compose with stage cleanup.
# Order matters: sidecar first, then networks — docker refuses to
# remove a network with attached containers.
# Define cleanup_all and INSTALL THE TRAP before any of the docker
# resources below are created. Without this, a failure in
# network_create_egress or pipelock_start (e.g. the image can't be
# pulled) would leave behind orphan networks that the previous
# cleanup_stage trap had no way to remove. cleanup_all is a no-op
# for resources whose tracking variable is empty, and the helpers
# it calls (pipelock_stop, network_remove) are idempotent against
# missing resources, so installing the trap eagerly here is safe.
#
# Order matters at teardown: sidecar first, then networks — docker
# refuses to remove a network with attached containers.
cleanup_all() {
if container_exists "$CONTAINER"; then
if [ -n "${CONTAINER:-}" ] && container_exists "$CONTAINER"; then
docker rm -f "$CONTAINER" >/dev/null 2>&1 || true
fi
if [ -n "${PIPELOCK_CONTAINER:-}" ]; then
@@ -555,6 +560,10 @@ cmd_start() {
# Replaces the cleanup_stage EXIT trap above; cleanup_all calls cleanup_stage internally.
trap cleanup_all EXIT INT TERM
INTERNAL_NETWORK="$(network_create_internal "$SLUG")"
EGRESS_NETWORK="$(network_create_egress "$SLUG")"
PIPELOCK_CONTAINER="$(pipelock_start "$SLUG" "$INTERNAL_NETWORK" "$EGRESS_NETWORK" "$STAGE_DIR" "$PIPELOCK_YAML_FILENAME")"
# Assemble docker run argv:
# - --rm -d --name CONTAINER
# - --network INTERNAL_NETWORK so the agent's only egress route is