fix(sidecars): per-daemon pipelock restart keeps supervise socket alive #61
Summary
Fixes the MCP-socket-drops-on-route-apply behavior: applying a route change made the agent's MCP client lose its connection until a second apply.
Cause: `apply_allowlist_change` used `docker restart <bundle>`, which bounced all four daemons including supervise. That is the exact whole-bundle restart that PR #59's notes called out as a v1 trade-off; the user's bug report shows that trade-off is unworkable in practice.
Fix: per-daemon restart via SIGUSR1.
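The per-daemon restart can be sketched roughly as follows. This is a minimal illustration, not the project's `_Supervisor`: the class name `SupervisorSketch`, the `DaemonSpec` fields (`name`, `argv`), the `grace` parameter, and the `install_restart_handler` helper are all assumptions made for the example.

```python
import signal
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class DaemonSpec:
    # Hypothetical stand-in: the real DaemonSpec's fields are not shown in this PR.
    name: str
    argv: tuple


class SupervisorSketch:
    """Illustrative supervisor: one child process per spec, restartable by name."""

    def __init__(self, specs, grace=5.0):
        self.specs = list(specs)
        self.grace = grace  # seconds to wait after SIGTERM before SIGKILL
        self.shutting_down = False
        self.procs = [subprocess.Popen(spec.argv) for spec in self.specs]

    def _stop(self, proc):
        # SIGTERM first; escalate to SIGKILL if the child outlives the grace period.
        if proc.poll() is not None:
            return
        proc.terminate()
        try:
            proc.wait(timeout=self.grace)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()

    def restart_daemon(self, name):
        """Replace one named child in place. Returns True if a restart happened."""
        if self.shutting_down:
            return False  # restart-during-shutdown is a no-op
        for idx, spec in enumerate(self.specs):
            if spec.name == name:
                self._stop(self.procs[idx])
                # Same slot, same spec: later signal forwarding sees the new pid.
                self.procs[idx] = subprocess.Popen(spec.argv)
                return True
        return False  # unknown name is a no-op

    def request_shutdown(self):
        self.shutting_down = True
        for proc in self.procs:
            self._stop(proc)


def install_restart_handler(sup, daemon="pipelock"):
    # SIGUSR1 restarts only the named daemon; sibling daemons are left alone.
    signal.signal(signal.SIGUSR1, lambda *_: sup.restart_daemon(daemon))
```

In CPython the handler runs in the main thread, so in this sketch a restart blocks the main loop for up to `grace` seconds while the old child exits. Whether the real supervisor defers the restart out of the handler is not stated here.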
What changed
- `_Supervisor.restart_daemon(name)`: terminates one named child (SIGTERM → SIGKILL grace), then spawns a replacement under the same `DaemonSpec`. Updates `self.procs[idx]` in place so subsequent `forward_signal` / `request_shutdown` calls reach the new pid.
- `main()`: `signal.signal(signal.SIGUSR1, lambda *_: sup.restart_daemon("pipelock"))`. Pipelock has no in-process reload (per the existing comment); SIGUSR1 is its analog of the SIGHUP-reload-addon path egress uses. Pipelock is the only daemon that needs this today.
- `apply_allowlist_change`: `docker kill --signal USR1 <bundle>` instead of `docker restart <bundle>`. Supervise / egress / git-gate keep running across the apply. The MCP socket stays open.
Tests
- `_Supervisor.restart_daemon` cases: replaces in place (different pid post-restart, sibling daemon unchanged), unknown name is a no-op, restart-during-shutdown is a no-op.
- `test_pipelock_apply` rewritten: brings up the bundle image with `CLAUDE_BOTTLE_SIDECAR_DAEMONS=pipelock` so the supervisor is PID 1 and handles SIGUSR1. The previous standalone-pipelock setup wouldn't survive SIGUSR1 (pipelock's default disposition is terminate). The bundle image is built in `setUpClass`; cached layers make repeats fast.
Test status
531 tests passing locally (unit + integration), 1 skipped (the existing GITEA_ACTIONS guard).
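For reference, the apply-side change amounts to signalling the container instead of restarting it. The sketch below is illustrative only: the helper name `signal_pipelock_restart` and its injectable `run` parameter are assumptions, not the body of `apply_allowlist_change`. The `docker kill --signal USR1 <bundle>` command is the one named in this PR.

```python
import subprocess


def signal_pipelock_restart(bundle, run=subprocess.run):
    """Send SIGUSR1 to the bundle container's PID 1 (the supervisor).

    `bundle` is the container name. `run` is injectable so the command can be
    checked without a Docker daemon (hypothetical convenience for this sketch).
    """
    # `docker kill` delivers the signal to the container's main process only,
    # so the container itself, and every other daemon in it, keeps running.
    cmd = ["docker", "kill", "--signal", "USR1", bundle]
    return run(cmd, check=True, capture_output=True, text=True)
```

Because the signal goes to PID 1, this only works when the supervisor is the container's main process, which is why the rewritten `test_pipelock_apply` brings up the bundle image rather than a standalone pipelock.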
`apply_allowlist_change` used `docker restart <bundle>` to make pipelock reload, which bounced ALL four daemons, including supervise, whose MCP socket the agent's claude-code client had open. That dropped the connection. A second apply works because supervise has come back up by then.

Fix: per-daemon restart via SIGUSR1.

- New `_Supervisor.restart_daemon(name)` terminates one named child and spawns a replacement in place. Other daemons keep running.
- main() wires SIGUSR1 → `restart_daemon("pipelock")`. Pipelock has no in-process reload, so this is its analog of egress's SIGHUP-reload-addon path. Pipelock is the only daemon that currently needs hot-config reload via restart; if others acquire the need, add a new signal.
- `apply_allowlist_change` now `docker kill --signal USR1 <bundle>` instead of `docker restart`. Supervise / egress / git-gate keep running across the apply.

Tests:

- New `_Supervisor.restart_daemon` cases: replaces in place (different pid post-restart, sibling daemon unchanged), unknown name is a no-op, restart-during-shutdown is a no-op.
- `test_pipelock_apply` rewritten to bring up the bundle image with `CLAUDE_BOTTLE_SIDECAR_DAEMONS=pipelock` so the supervisor is PID 1 and handles SIGUSR1. The previous standalone-pipelock setup wouldn't survive SIGUSR1 (pipelock default disposition is terminate). Test builds the bundle image in setUpClass (cached layers make repeat runs fast).

531 tests passing locally (unit + integration).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>