PRD 0001: Per-agent egress proxy via pipelock #1

Merged
didericis merged 13 commits from prd-0001-per-agent-egress-proxy-via-pipelock into main 2026-05-08 01:56:44 -04:00

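Tracking PR for PRD 0001 (https://gitea.dideric.is/didericis/claude-bottle/src/branch/main/docs/prds/0001-per-agent-egress-proxy-via-pipelock.md). Implementation complete on this branch.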

Summary

Adds a per-agent pipelock sidecar on a Docker --internal network so each agent container's only egress route is through pipelock's HTTP forward proxy (hostname allowlist + 48-pattern DLP + subdomain-entropy DNS-exfil detection). New lib/network.sh and lib/pipelock.sh modules; cli.sh wires the lifecycle; the bottle schema gains an egress.allowlist array; pipelock image is pinned by digest.

Follow-ups (deferred to a later PR)

  • Make the default allowlist host-env-overridable: change lib/pipelock.sh:81 from a plain assignment to ${CLAUDE_BOTTLE_PIPELOCK_DEFAULT_ALLOWLIST:-...} so users can override without editing the file.
  • Validate hostnames in pipelock_bottle_allowlist before they reach the YAML emitter (regex against ^[A-Za-z0-9.*_-]+$) to prevent malformed manifest entries from breaking the YAML.
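
A minimal sketch of both follow-ups, under stated assumptions: the default host list shown here is a shortened placeholder (the real value lives at lib/pipelock.sh:81), and `is_valid_allowlist_entry` and `filter_allowlist` are illustrative names, not functions from the repo. The `case` pattern accepts exactly the strings matched by `^[A-Za-z0-9.*_-]+$`.

```shell
# Hypothetical default; ${VAR:-default} lets a host environment variable win.
PIPELOCK_DEFAULT_ALLOWLIST="${CLAUDE_BOTTLE_PIPELOCK_DEFAULT_ALLOWLIST:-api.anthropic.com claude.ai}"

# Return 0 only for a non-empty entry made solely of [A-Za-z0-9.*_-].
is_valid_allowlist_entry() {
  case "$1" in
    '' | *[!A-Za-z0-9.*_-]*) return 1 ;;
  esac
  return 0
}

# Read entries on stdin, echo the valid ones, and stop at the first
# malformed entry so it never reaches the YAML emitter.
filter_allowlist() {
  local entry
  while IFS= read -r entry; do
    if is_valid_allowlist_entry "$entry"; then
      printf '%s\n' "$entry"
    else
      printf 'invalid allowlist entry: %s\n' "$entry" >&2
      return 1
    fi
  done
}
```

An entry such as `evil: true` is rejected because the colon and the space fall outside the allowed character class.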

Notes

  • ELv2 licensing of pipelock's enterprise/ subtree remains an open question carried over from the PRD; the features used here are all in the Apache-2.0 docs.
didericis added 1 commit 2026-05-08 00:52:29 -04:00
didericis added 1 commit 2026-05-08 00:56:54 -04:00
Adds the network half of the PRD 0001 egress topology: per-agent
--internal Docker networks with a slug-derived name and a numeric
conflict suffix that mirrors the container-name scheme in cli.sh.
Helpers cover create / attach / remove and are pipelock-agnostic, so
a future PRD can layer a different sidecar on top without entangling
the two concerns.

Refs: docs/prds/0001-per-agent-egress-proxy-via-pipelock.md

Assisted-by: Claude Code
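
The slug-plus-suffix naming described in the commit above could look roughly like the sketch below. The function names and the `-2`, `-3` suffix format are assumptions for illustration; the commit only says the scheme mirrors the container-name scheme in cli.sh.

```shell
# Lowercase the input, collapse runs of other characters to '-', trim edges.
slugify() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# name_in_list NEEDLE ITEM...: succeed if NEEDLE equals one of the items.
name_in_list() {
  local needle="$1" item
  shift
  for item in "$@"; do
    [ "$item" = "$needle" ] && return 0
  done
  return 1
}

# pick_network_name BASE EXISTING...
# Prints BASE if it is free, otherwise BASE-2, BASE-3, and so on.
pick_network_name() {
  local base="$1" candidate n=2
  shift
  candidate="$base"
  while name_in_list "$candidate" "$@"; do
    candidate="$base-$n"
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}
```

In a real run the existing names would presumably come from `docker network ls --format '{{.Name}}'`; passing them as arguments keeps the sketch runnable without Docker.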
didericis added 1 commit 2026-05-08 00:58:39 -04:00
Adds the pipelock half of the PRD 0001 egress topology:

- Pins the pipelock image by digest (sha256:3b1a39...) for the
  multi-arch ghcr.io/luckypipewrench/pipelock:2.3.0 manifest list,
  resolved on 2026-05-08. The registry uses unprefixed tags, so the
  v2.3.0 GitHub release maps to the 2.3.0 Docker tag.
- Bakes in the default allowlist for Claude Code's required hosts
  (api.anthropic.com, statsig.anthropic.com, sentry.io, claude.ai,
  platform.claude.com, downloads.claude.ai, raw.githubusercontent.com)
  and unions it with the bottle's egress.allowlist for the effective
  list.
- Generates a minimum-viable YAML config at mode 600: strict mode +
  enforce + api_allowlist + forward_proxy.enabled + DLP defaults +
  scan_env. No env values, no secrets, hostnames only. Schema keys
  cite pipelock's docs/configuration.md inline.
- Sidecar lifecycle: docker create → docker cp the YAML in → connect
  to the default bridge for upstream egress → docker start. Avoids
  bind mounts (Docker Desktop ownership quirks). Stop is idempotent
  for use in cli.sh's exit trap.
- Helper for the y/N preflight: one-line summary "<N> hosts allowed
  (host1, host2, host3 +M more)".

Refs: docs/prds/0001-per-agent-egress-proxy-via-pipelock.md
Refs: docs/research/pipelock-assessment.md

Assisted-by: Claude Code
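
As an illustration of the union and the preflight summary described above, here is a small sketch. `effective_allowlist` and `allowlist_summary` are hypothetical names, and the exact separator before "+M more" is an assumption taken from the format string quoted in the commit message.

```shell
# effective_allowlist HOST...
# Union of defaults and per-bottle entries: sorted, one per line, deduplicated.
effective_allowlist() {
  [ "$#" -gt 0 ] || return 0
  printf '%s\n' "$@" | LC_ALL=C sort -u
}

# Read hosts on stdin and print "<N> hosts allowed (h1, h2, h3 +M more)".
# At most three hosts are listed; the rest are folded into the "+M more" tail.
allowlist_summary() {
  awk '
    { host[NR] = $0 }
    END {
      shown = (NR < 3) ? NR : 3
      list = ""
      for (i = 1; i <= shown; i++) list = list (i > 1 ? ", " : "") host[i]
      if (NR > shown) list = list " +" (NR - shown) " more"
      printf "%d hosts allowed (%s)\n", NR, list
    }'
}
```

Sorting before summarising keeps the preflight line stable between runs, so a changed allowlist is easy to spot at the y/N prompt.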
didericis added 1 commit 2026-05-08 00:59:16 -04:00
Extends the manifest schema doc-comment to include the new
bottles.<name>.egress.allowlist field added in PRD 0001, and
introduces manifest_bottle_egress_allowlist alongside
manifest_bottle_ssh — same shape as the existing per-bottle
helper, returns one hostname per line, empty for missing field.
The accessor performs only top-level array-type validation;
per-element string typing happens in lib/pipelock.sh next to the
YAML generator that consumes it.

Refs: docs/prds/0001-per-agent-egress-proxy-via-pipelock.md

Assisted-by: Claude Code
didericis added 1 commit 2026-05-08 01:01:24 -04:00
PRD 0001 cli.sh integration:

- Source the new lib/network.sh and lib/pipelock.sh.
- During plan resolution: generate the per-bottle pipelock YAML into
  the existing mktemp stage dir (mode 600, hostnames only) and
  resolve a one-line "<N> hosts allowed (...)" summary.
- Add the egress summary as a sub-bullet under the bottle in the y/N
  preflight, alongside the existing ssh hosts line.
- After the y/N gate (and after build_image): create the per-agent
  --internal Docker network with a slug-derived name, then start the
  pipelock sidecar attached to it.
- docker run argv: agent attaches to the internal network with
  HTTPS_PROXY / HTTP_PROXY pointing at the sidecar by service name on
  that network. NO_PROXY only covers loopback. The internal network
  has no default gateway, so any path that ignores the proxy env
  hits no-route-to-host rather than leaking.
- Exit trap: tear down the agent container, then the sidecar (so the
  network is empty), then remove the network, then run the existing
  stage cleanup. Order matters — docker refuses to remove a network
  with attached containers.
- --dry-run continues to exit before any docker network/run/cp/exec
  call; the YAML write into the mktemp dir is the only new
  side-effect inside the dry-run path.

Verified against a temp fixture: defaults-only bottle shows
"7 hosts allowed", a bottle with two extra entries shows
"9 hosts allowed (api.anthropic.com, api.openai.com, claude.ai,
+6 more)", and dry-run exits before any docker calls.

Refs: docs/prds/0001-per-agent-egress-proxy-via-pipelock.md

Assisted-by: Claude Code
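
The proxy wiring for the agent container might be assembled along these lines. This is a sketch only: `agent_network_args` is a made-up helper, the `http://` scheme and the exact NO_PROXY value are assumptions, and the real argv in cli.sh may differ. `--network` and `--env` are standard `docker run` flags.

```shell
# agent_network_args NETWORK SIDECAR PORT
# Prints docker run flags, one per line, that attach the agent to the
# internal network and point both proxy variables at the sidecar.
agent_network_args() {
  local network="$1" proxy_url="http://$2:$3"
  printf '%s\n' \
    --network "$network" \
    --env "HTTPS_PROXY=$proxy_url" \
    --env "HTTP_PROXY=$proxy_url" \
    --env "NO_PROXY=localhost,127.0.0.1,::1"
}
```

Because the sidecar is addressed by name on the internal network, the agent needs no IP bookkeeping; anything that ignores these variables has no route off that network.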
didericis added 1 commit 2026-05-08 01:01:41 -04:00
Adds a short Egress section to the README explaining that agent
containers route HTTP through a per-agent pipelock sidecar on a Docker
--internal network, what the baked-in default allowlist covers, and
how to extend it via bottles.<name>.egress.allowlist with a single
JSON example. Points readers at PRD 0001 and the pipelock assessment
note for the full design and rationale.

Refs: docs/prds/0001-per-agent-egress-proxy-via-pipelock.md

Assisted-by: Claude Code
didericis added 1 commit 2026-05-08 01:16:50 -04:00
Docker's legacy `bridge` network has no embedded DNS resolver — only
user-defined bridges do — so attaching the pipelock sidecar to `bridge`
made it unable to resolve `api.anthropic.com` and dead-ended Claude Code
traffic. Add `network_create_egress`, refactored around a shared
`_network_create_with_prefix` helper, and wire it through `pipelock_start`
and `cli.sh` so the sidecar straddles the agent's --internal network and
a per-agent user-defined egress bridge instead. The agent container
itself still attaches to the internal network only.

Assisted-by: Claude Code
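
The resulting two-network topology can be sketched as a sequence of Docker commands. The helper below only prints the commands so it runs without Docker; the function names, the argument order, and the choice of which network the sidecar is created on versus connected to afterwards are assumptions, not the repo's actual `pipelock_start`.

```shell
# plan: print a command instead of executing it.
plan() { printf '%s\n' "$*"; }

# plan_sidecar_networks INTERNAL_NET EGRESS_NET SIDECAR IMAGE
plan_sidecar_networks() {
  local internal_net="$1" egress_net="$2" sidecar="$3" image="$4"
  # Agent-facing network: --internal means no route to the outside.
  plan docker network create --internal "$internal_net"
  # User-defined bridge for upstream egress; unlike the legacy `bridge`
  # network it comes with Docker's embedded DNS resolver.
  plan docker network create "$egress_net"
  # The sidecar straddles both networks; the agent joins only the first.
  plan docker create --name "$sidecar" --network "$internal_net" "$image"
  plan docker network connect "$egress_net" "$sidecar"
  plan docker start "$sidecar"
}
```
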
didericis added 1 commit 2026-05-08 01:17:22 -04:00
PR #1 reviewer flagged the sidecar argv as unverified. Pulled the pinned
digest (ghcr.io/luckypipewrench/pipelock@sha256:3b1a39…6de9), inspected
ENTRYPOINT (`/pipelock`) and CMD (`run --listen 0.0.0.0:8888`), and read
`pipelock run --help` directly from the image. The forward-proxy listen
flag is `--listen` (no `--mcp-` prefix) — `--mcp-listen` is for the
separate MCP HTTP listener, not the forward proxy we use. Smoke-tested
the exact argv against the digest and confirmed the /health endpoint
responded on :8888.

The argv was already correct; this commit records the verification in a
load-bearing comment so future readers don't have to re-derive it.

Assisted-by: Claude Code
didericis added 1 commit 2026-05-08 01:18:45 -04:00
Previously cleanup_all was defined AFTER network_create_internal /
network_create_egress / pipelock_start ran, so a failure during
pipelock_start (or in network_create_egress added by the prior commit)
would land in the cleanup_stage trap that knows nothing about networks.
The internal and egress networks would survive the failed launch and
accumulate as orphans on the host.

Move the cleanup_all definition + `trap … EXIT INT TERM` install ahead
of the resource creation, and gate the CONTAINER branch on
`-n "${CONTAINER:-}"` since CONTAINER is set earlier in the function
but the trap now runs in the early-failure window. pipelock_stop and
network_remove are already idempotent against missing resources.

Smoke test: with `CLAUDE_BOTTLE_PIPELOCK_IMAGE` pinned to a nonexistent
digest, `./cli.sh start implementer` now creates both networks, fails
at pipelock_start, and exits with both networks removed —
`docker network ls | grep claude-bottle` returns nothing.

Assisted-by: Claude Code
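
The ordering fix can be illustrated with a self-contained simulation. The `network_remove` and `pipelock_stop` bodies here are logging stand-ins for the repo's helpers, and `simulate_launch` plus the demo resource names are invented for the example; only the shape (trap installed before resources are created, CONTAINER branch gated) follows the commit message.

```shell
# Stand-ins that log what the real helpers would tear down.
network_remove() { echo "network_remove $1"; }
pipelock_stop()  { echo "pipelock_stop $1"; }

cleanup_all() {
  # The agent container may not exist yet in the early-failure window.
  if [ -n "${CONTAINER:-}" ]; then
    echo "remove container $CONTAINER"
  fi
  pipelock_stop "$SIDECAR"
  network_remove "$INTERNAL_NET"
  network_remove "$EGRESS_NET"
}

# Subshell body, so the EXIT trap fires when the simulated launch ends.
simulate_launch() (
  unset CONTAINER
  INTERNAL_NET=demo-internal EGRESS_NET=demo-egress SIDECAR=demo-pipelock
  # Install the trap first; everything created below is now covered.
  trap cleanup_all EXIT INT TERM
  echo "create $INTERNAL_NET"
  echo "create $EGRESS_NET"
  echo "pipelock_start failed" >&2
  exit 1
)
```

Running `simulate_launch` shows both networks being removed even though the launch dies before any agent container exists.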
didericis added 1 commit 2026-05-08 01:23:44 -04:00
The pipelock image is distroless and does not contain /etc/pipelock/, so
docker cp to /etc/pipelock/pipelock.yaml fails with "Could not find the
file /etc/pipelock in container" — docker cp does not create missing
intermediate parent directories when targeting a stopped container, and
no shell is available in the image for a mkdir shim. Move the config
file to /etc/pipelock.yaml (directly under /etc, which always exists)
and update the --config argv to match. Also surface docker cp stderr in
the die message so future failures of this sort are debuggable.

Assisted-by: Claude Code
didericis added 1 commit 2026-05-08 01:39:12 -04:00
The agent container is on an --internal Docker network with no default
route; only the pipelock sidecar is reachable. HTTPS_PROXY / HTTP_PROXY
route HTTP(S) traffic through pipelock, but raw TCP (e.g. SSH on port
30009) had no egress path, so `git fetch` against any bottle.ssh entry
failed with "Network is unreachable".

Fix: tunnel SSH through pipelock's HTTP CONNECT proxy.
- lib/ssh.sh injects `ProxyCommand socat - PROXY:<pipelock>:%h:%p,proxyport=<n>`
  into each Host block in the in-container ~/.ssh/config. socat is
  already in the image (apt-installed for the ssh-agent forwarder).
- lib/pipelock.sh auto-adds each bottle.ssh[].Hostname to the effective
  allowlist so pipelock permits the CONNECT.
- cli.sh threads the pipelock host:port into ssh_setup.

Note: works for SSH hosts pipelock's SSRF layer doesn't block. CGNAT
(100.64.0.0/10) and other non-RFC1918 ranges should pass; if a future
host gets blocked, expose pipelock's trusted_domains as a follow-up.

Assisted-by: Claude Code
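
For illustration, a Host block with the injected ProxyCommand could be generated as below. `ssh_host_block` and its argument order are hypothetical, and the surrounding HostName / Port lines are assumed; the ProxyCommand line itself follows the form quoted in the commit, where `%h` and `%p` are expanded by ssh to the target host and port.

```shell
# ssh_host_block ALIAS HOSTNAME PORT PROXY_HOST PROXY_PORT
# Emits one ~/.ssh/config Host block that tunnels through an HTTP CONNECT proxy.
ssh_host_block() {
  local host_alias="$1" hostname="$2" port="$3" proxy_host="$4" proxy_port="$5"
  cat <<EOF
Host $host_alias
    HostName $hostname
    Port $port
    ProxyCommand socat - PROXY:$proxy_host:%h:%p,proxyport=$proxy_port
EOF
}
```

The target hostname still has to appear in pipelock's allowlist, otherwise the CONNECT request is refused before SSH ever starts its handshake.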
didericis added 1 commit 2026-05-08 01:42:35 -04:00
Pipelock's default SSRF blocklist includes 100.64.0.0/10 (RFC 6598
CGNAT, where Tailscale IPs live) plus all RFC 1918 / link-local
ranges, so a CONNECT to a bottle.ssh[] target on Tailscale was rejected
with `scanner: ssrf, reason: SSRF blocked: <ip> resolves to internal IP`
even after the host appeared in api_allowlist.

Fix: while emitting the YAML, classify each bottle.ssh[].Hostname:
  - IPv4 literal -> ssrf.ip_allowlist as <ip>/32 (canonical CIDR).
  - Hostname     -> trusted_domains (hostname-based SSRF exemption).

Both blocks are emitted only when entries exist, so bottles with no
ssh / no private-IP targets still produce a minimal config.

Assisted-by: Claude Code
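
The classification step might be sketched as follows. The repo's own classifier is `_pipelock_is_ipv4_literal`; the implementation and the `classify_ssh_target` wrapper below are assumptions written for this example, and the printed labels merely name the config block each entry would land in.

```shell
# Succeed only for a dotted quad with four octets in the range 0..255.
is_ipv4_literal() {
  local candidate="$1" octet
  case "$candidate" in
    '' | *[!0-9.]* | .* | *. | *..*) return 1 ;;
  esac
  local IFS=.
  # Unquoted on purpose: split the candidate on dots into $1..$N.
  set -- $candidate
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# IPv4 literals become /32 CIDRs; everything else is treated as a hostname.
classify_ssh_target() {
  if is_ipv4_literal "$1"; then
    printf 'ip_allowlist %s/32\n' "$1"
  else
    printf 'trusted_domains %s\n' "$1"
  fi
}
```
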
didericis added 1 commit 2026-05-08 01:54:30 -04:00
Adds tests/ with a tiny bash assert harness, manifest fixtures, and a
runner. No framework dependency — each test file is self-contained
and exits 0 on pass / 1 on fail; tests/run_tests.sh aggregates.

Unit tests (no docker):
  - pipelock_naming: container_name, proxy_url, proxy_host_port shape
  - pipelock_classify: _pipelock_is_ipv4_literal classifier coverage
  - pipelock_allowlist: bottle_allowlist + ssh hostnames/ip_cidrs/
    trusted_domains + effective_allowlist union/dedup/sort, plus
    rejection of non-string entries
  - pipelock_yaml: emitter shape (mode/enforce/api_allowlist/forward_proxy/
    dlp), conditional ssrf+trusted_domains blocks, secret hygiene
    (manifest env values must not appear in YAML), file mode 600

Integration tests (require docker, skip cleanly otherwise):
  - pipelock_image: the pinned digest's ENTRYPOINT is /pipelock, its CMD
    contains 'run', and the binary's --version succeeds; this would catch
    a future image bump that changes the launcher's argv contract
  - pipelock_sidecar_smoke: docker create + cp YAML to /etc/pipelock.yaml
    + start, then probe /health — the regression test for the bug
    where the YAML was written to /etc/pipelock/ (parent dir absent in
    the distroless image)
  - dry_run_plan: cli.sh start --dry-run shows the egress line,
    counts the bottle's entry into the effective allowlist, prints
    the dry-run banner, and creates zero docker resources
  - orphan_cleanup: the cleanup primitives the start-flow trap depends
    on (network_remove, pipelock_stop) are idempotent against
    missing/never-existed resources, so the trap is safe even if
    pipelock_start dies before everything is wired up

Assisted-by: Claude Code
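
A framework-free assert harness of the kind described could be as small as the sketch below. The names (`assert_eq`, `have_docker`, `finish`) and the output format are illustrative guesses, not the contents of tests/.

```shell
TESTS_RUN=0
TESTS_FAILED=0

# assert_eq EXPECTED ACTUAL MESSAGE
assert_eq() {
  TESTS_RUN=$((TESTS_RUN + 1))
  if [ "$1" = "$2" ]; then
    printf 'ok   - %s\n' "$3"
  else
    TESTS_FAILED=$((TESTS_FAILED + 1))
    printf 'FAIL - %s (expected "%s", got "%s")\n' "$3" "$1" "$2"
  fi
}

# Integration tests can call this first and skip cleanly when it fails.
have_docker() { command -v docker >/dev/null 2>&1; }

# Print a tally; the return status is 0 only if nothing failed.
finish() {
  printf '%d run, %d failed\n' "$TESTS_RUN" "$TESTS_FAILED"
  [ "$TESTS_FAILED" -eq 0 ]
}
```

A test file built on this would end with something like `finish || exit 1`, which gives the 0-on-pass / 1-on-fail contract an aggregating runner can rely on.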
didericis merged commit ba7616a4ae into main 2026-05-08 01:56:44 -04:00