From 39059f69b9ae83fbde06fcef753cca608c15e9c2 Mon Sep 17 00:00:00 2001
From: didericis
Date: Fri, 8 May 2026 00:52:21 -0400
Subject: [PATCH] =?UTF-8?q?docs(prd):=20scaffold=20PRD=200001=20=E2=80=94?=
 =?UTF-8?q?=20Per-agent=20egress=20proxy=20via=20pipelock?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Assisted-by: Claude Code
---
 ...001-per-agent-egress-proxy-via-pipelock.md | 222 ++++++++++++++++++
 1 file changed, 222 insertions(+)
 create mode 100644 docs/prds/0001-per-agent-egress-proxy-via-pipelock.md

diff --git a/docs/prds/0001-per-agent-egress-proxy-via-pipelock.md b/docs/prds/0001-per-agent-egress-proxy-via-pipelock.md
new file mode 100644
index 0000000..4ea1c3d
--- /dev/null
+++ b/docs/prds/0001-per-agent-egress-proxy-via-pipelock.md
@@ -0,0 +1,222 @@
+# PRD 0001: Per-agent egress proxy via pipelock
+
+- **Status:** Draft
+- **Author:** didericis
+- **Created:** 2026-05-08
+
+## Summary
+
+Run pipelock as a sidecar container on each claude-bottle agent's only
+egress route, scanning all outbound HTTP for hostname allowlist violations
+and DLP matches.
+
+## Problem
+
+Today the agent container has unrestricted internet egress, and even on
+allowed channels there is no content-level inspection. Specifically:
+
+- Containers have unrestricted internet egress; a misbehaving agent can
+  POST to any host.
+- Allowed channels (`api.anthropic.com`, git remotes) can still carry
+  content-level exfil with no detection.
+- DNS exfil via subdomain encoding is not detected anywhere in the stack.
+- MCP tool calls and responses pass through unscanned.
+
+These gaps are documented in `docs/research/network-egress-guard.md`,
+`docs/research/secret-exfil-tripwire-encodings.md`, and
+`docs/research/pipelock-assessment.md`. The pipelock assessment recommends
+adopting pipelock as the v2 sidecar (in place of smokescreen) layered on
+top of a v1 iptables+dnsmasq floor.
+
+## Goals / Success Criteria
+
+The feature works when all of the following are observable:
+
+- The agent container has no default route; `curl https://example.com`
+  from inside fails when `example.com` is not on the allowlist.
+- The agent container can reach `api.anthropic.com` and `claude` runs
+  end-to-end through the proxy.
+- Pipelock blocks a known credential pattern in a request body and
+  surfaces a structured log line for the block.
+- The subdomain-entropy check fires on a `.evil.com`
+  request.
+
+The feature is **done** when all of the following ship:
+
+- `cli.sh start` brings up a per-agent pipelock sidecar on a `--internal`
+  Docker network and points the agent's `HTTPS_PROXY` at it.
+- A per-agent pipelock YAML config is generated from a bottle-level
+  `egress.allowlist` field, plus baked-in defaults for Claude Code's
+  required hosts so basic bottles work out of the box.
+- The existing `cli.sh` y/N preflight shows the resolved allowlist before
+  launch.
+- When the agent container exits, the pipelock sidecar and the internal
+  network are torn down cleanly (no orphaned containers or networks).
+
+## Non-goals
+
+- Closing every exfil vector. SSH session content, raw TCP, UDP, ICMP,
+  and TLS-SNI domain fronting all remain known gaps after this PRD ships
+  and are explicitly out of scope.
+- Audit logging or persistent log storage of pipelock decisions. v1 logs
+  to stdout only; durable logging is a follow-up PRD.
+- Replacing the v1 iptables layer. Pipelock sits above iptables, not in
+  place of it (see `pipelock-assessment.md` §Recommendation).
+- Multi-tenant or remote-pipelock deployments. v1 is one pipelock
+  container per agent container, on the same Docker host.
+
+## Scope
+
+### In scope
+
+- New manifest schema: an `egress` object on each bottle, with
+  `allowlist: [hostname]` for v1.
+- Generation of a per-bottle pipelock YAML config at launch.
+- Per-agent Docker `--internal` network creation and teardown.
+- Pipelock sidecar container lifecycle (start, attach to network,
+  receive config, stop on agent exit).
+- `HTTPS_PROXY` / `HTTP_PROXY` injection into the agent container.
+- Preflight integration: the existing y/N plan in `cli.sh` lists the
+  resolved allowlist.
+
+### Out of scope
+
+- The v1 iptables + ipset + dnsmasq layer (separate PRD; see
+  `network-egress-guard.md`).
+- TLS interception / domain-fronting mitigation. Pipelock does not
+  terminate TLS and this PRD does not introduce CA-trust injection.
+- Per-bottle DLP rule customization beyond pipelock's 48 built-in
+  patterns. Custom signed rule bundles are deferred.
+- Mediator-signed action receipts and any other pipelock features
+  potentially gated under the ELv2 enterprise subtree (see open
+  question on licensing in `pipelock-assessment.md`).
+
+## Proposed Design
+
+### New services / components
+
+Two new files under `lib/`:
+
+- **`lib/pipelock.sh`** — pipelock-specific logic. Generates the
+  per-bottle YAML config from the manifest's `egress` block plus baked-in
+  defaults; copies the YAML into the sidecar via `docker cp`; starts and
+  stops the sidecar container; resolves the allowlist for display in the
+  preflight.
+- **`lib/network.sh`** — Docker network plumbing. Creates the per-agent
+  `--internal` network (named `claude-bottle-net-` with the same
+  slug-and-suffix scheme used for container names), attaches the agent
+  and sidecar to it, removes it on teardown. Kept separate from
+  `lib/docker.sh` so a future PRD can add non-pipelock network controls
+  without entangling them with pipelock specifics.
+
+This split mirrors the existing per-concern lib/ pattern
+(`manifest.sh`, `env_resolve.sh`, `skills.sh`, `ssh.sh`).
+
+### Existing code touched
+
+- **`cli.sh`** — wire the new lifecycle into `start`: create the
+  internal network, launch the pipelock sidecar, then launch the agent
+  container with `HTTPS_PROXY` / `HTTP_PROXY` set to the sidecar's
+  service name.
+  Add the resolved allowlist to the preflight y/N output.
+  Tear down sidecar + network in the existing exit trap.
+- **`README.md`** — public-facing description should mention that
+  agent containers route HTTP egress through pipelock by default, and
+  document the new `egress.allowlist` bottle field.
+
+`Dockerfile` is intentionally not touched for v1 — `HTTPS_PROXY` /
+`HTTP_PROXY` are injected per-launch via `docker run -e`, not baked into
+the image. This keeps the image agnostic to whether a sidecar is in use
+(useful if a future bottle definition opts out of the proxy for testing).
+
+`lib/docker.sh` may grow one or two helpers if there is a clean place
+for shared primitives, but the network-specific helpers live in
+`lib/network.sh`. Decide during implementation; not a contract.
+
+### Data model changes
+
+The bottle schema gains an `egress` object. The structure is designed
+to allow incremental additions without a breaking rename:
+
+```jsonc
+{
+  "bottles": {
+    "default": {
+      "env": { "...": "..." },
+      "ssh": [],
+      "egress": {
+        "allowlist": [
+          "api.anthropic.com",
+          "github.com"
+        ]
+      }
+    }
+  }
+}
+```
+
+Resolution rules:
+
+- The effective allowlist is ``.
+- Baked-in defaults cover hosts Claude Code itself needs:
+  `api.anthropic.com`, `statsig.anthropic.com`, `sentry.io`,
+  `claude.ai`, `platform.claude.com`, `downloads.claude.ai`,
+  `raw.githubusercontent.com` (per `pipelock-assessment.md` and
+  Claude Code's network-config docs).
+- Bottles with no `egress` block use defaults only.
+- Future keys (`dlp`, `mode`, `data_budget`, etc.) are reserved under
+  the same `egress` object; v1 ignores unknown keys.
+
+The `agent` schema is unchanged. Egress is a property of the
+container/sandbox, not the task — multiple agents pointing at the same
+bottle share the same allowlist.
+
+### External dependencies
+
+- **Pipelock binary** is pulled from
+  `ghcr.io/luckypipewrench/pipelock@sha256:`.
+  The digest is pinned in `lib/pipelock.sh` (or a sibling `.env`-shaped
+  constants file) and bumped deliberately, mirroring the claude-code
+  version pinning pattern in `Dockerfile`.
+- No new host-side runtimes. The pipelock image is the only new
+  external artifact.
+
+## Open questions
+
+- **ELv2 licensing.** Several capabilities discussed in
+  `pipelock-assessment.md` (mediator-signed action receipts, signed
+  rule bundles) may live under the `enterprise/` subtree and require
+  accepting Elastic License v2 terms. Before implementation, audit
+  which features used by this PRD are Apache-2.0-core. v1's plan
+  (proxy + 48 default DLP patterns + subdomain entropy + sidecar
+  topology) is expected to be core-only, but this should be confirmed.
+- **Where to put the digest pin.** A constant in `lib/pipelock.sh` is
+  the lowest-friction option; a separate `lib/versions.sh` (or similar)
+  may be cleaner once there are multiple pinned dependencies. Decide
+  during implementation.
+- **Per-agent overrides.** The PRD scopes egress to the bottle. If a
+  later use case calls for tightening (not loosening) the allowlist for
+  one agent within a bottle, revisit. Out of scope for v1.
+- **Default-allowlist drift.** Claude Code's required hostnames may
+  change with new versions. v1 hardcodes the current set; a follow-up
+  could derive them from the pinned claude-code version or a published
+  manifest from Anthropic.
+- **Sidecar log surface.** Pipelock decisions go to the sidecar's
+  stdout. v1 leaves these visible only via `docker logs ` —
+  fine for inspection but not aggregated. Persistent / structured
+  logging is a non-goal here, called out for the follow-up.
+- **DNS resolver routing.** Pipelock's subdomain-entropy check fires
+  on URLs it sees, not on raw UDP/53. Without the v1 dnsmasq layer the
+  agent could still query a non-allowlisted resolver directly.
+  Document the dependency on the v1 PRD (or note explicitly that v1 of
+  this PRD ships with that gap if the iptables PRD lands later).
+
+## References
+
+- `docs/research/pipelock-assessment.md` — recommendation and rationale.
+- `docs/research/network-egress-guard.md` — v1 iptables+dnsmasq baseline.
+- `docs/research/secret-exfil-tripwire-encodings.md` — content-tripwire
+  framing this PRD partially addresses via pipelock's DLP layer.
+- Pipelock README:
+
+- Claude Code network configuration:
+
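
A minimal shell sketch of the resolution rules under "Data model changes", for discussion only. It assumes the effective allowlist is the baked-in defaults followed by the bottle's `egress.allowlist`, with duplicates dropped; the exact rule is elided in the PRD text, so the union is an assumption drawn from the Goals section. The function names `default_egress_hosts` and `resolve_allowlist` are hypothetical and do not appear in the PRD; the hostnames are copied from the defaults list.

```shell
# Hypothetical helpers; names are illustrative, not part of the PRD.

# Print the baked-in default hosts, one per line (copied from the
# "Resolution rules" list in the PRD).
default_egress_hosts() {
  cat <<'EOF'
api.anthropic.com
statsig.anthropic.com
sentry.io
claude.ai
platform.claude.com
downloads.claude.ai
raw.githubusercontent.com
EOF
}

# resolve_allowlist [HOST...]
# Assumed rule: defaults first, then bottle-level hosts passed as
# arguments, skipping blank lines and dropping duplicates while
# keeping first-seen order.
resolve_allowlist() {
  {
    default_egress_hosts
    if [ "$#" -gt 0 ]; then
      printf '%s\n' "$@"
    fi
  } | awk 'NF && !seen[$0]++'
}
```

Called with the example bottle's entries (`resolve_allowlist api.anthropic.com github.com`), this prints the seven defaults followed by `github.com`; called with no arguments it prints the defaults only, matching the "no `egress` block" rule. The same output could feed both the preflight display and the generated pipelock config, though how the manifest is parsed is left open here.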