# PRD 0017: Egress-proxy - universal MITM with path filtering + auth injection

- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-25
- **Supersedes:** the cred-proxy sidecar (PRD 0010); hard cutover.

## Summary

Replace the per-bottle cred-proxy sidecar with a new `egress-proxy` sidecar built on mitmproxy. The egress-proxy is the agent's `HTTP_PROXY` / `HTTPS_PROXY`: every agent HTTP/HTTPS request flows through it before reaching pipelock. It owns three jobs that today are split between cred-proxy and pipelock:

1. **MITM the agent's HTTPS.** Uses the per-bottle CA today held by pipelock; that key moves to the egress-proxy.
2. **Path-level allow/deny.** Manifest-declared `path_allowlist` per route. Universal coverage: any HTTPS path the agent reaches for is inspected here, not just traffic that voluntarily dials the cred-proxy URL.
3. **Credential injection.** Continues cred-proxy's existing role: match by hostname (or hostname + path), strip inbound Authorization, inject one based on the route's optional `auth: { scheme, token_ref }` block.

Pipelock's role narrows to hostname allowlist + DLP body scanning on the egress-proxy → upstream leg. Pipelock no longer holds the CA private key and is no longer the agent's direct proxy.

## Problem

PR #25's pipelock-block flow exposed an honest gap: pipelock's `api_allowlist` is hostname-only (verified by probing the binary's strict preset and the `pipelock check --url` output). Approving a proposed `pipelock-block` opens the entire host, not the URL's path. For shared platforms (github.com, gitlab.com, public registries) operators routinely want narrower-than-host granularity: allow github.com/didericis but block github.com/somebody-else.

Cred-proxy already does path-prefix routing for credentialed APIs, but it only sees the requests the agent voluntarily routes to it (via `ANTHROPIC_BASE_URL`, `~/.gitconfig` insteadOf, npmrc `registry=`).
A raw `curl https://github.com/anyone` from the agent goes to `HTTPS_PROXY=pipelock` directly and bypasses cred-proxy entirely. So extending cred-proxy with `path_allowlist` (the earlier PRD 0017 draft) buys *opt-in* path filtering, not enforcement. For enforcement we need a layer that sits on the agent's `HTTPS_PROXY` path: universal coverage of agent egress.

## Goals / Success Criteria

A bottle manifest declares an egress-proxy route with a `path_allowlist`. From inside the bottle, `curl https://github.com/didericis/foo` succeeds; `curl https://github.com/somebody-else/secret` gets a 403 from egress-proxy and never reaches pipelock or the real github.

The same holds for any tool inside the bottle that respects `HTTPS_PROXY`: claude-code, git over HTTPS, npm, raw curl, arbitrary Python `requests` code. No tool-specific rewrite is required for path enforcement.

Existing cred-proxy responsibilities continue to work after the cutover. Anthropic OAuth injection for claude-code happens via proxy-side header injection rather than the dotfile rewrite. Git-insteadof routing into the proxy stays useful for hostname canonicalisation but is no longer load-bearing for credential delivery.

## Non-goals

- Replacing pipelock. Pipelock keeps doing hostname allowlist + DLP body scanning on the egress-proxy → upstream leg.
- Building our own MITM stack. mitmproxy already does it; we ship addons.
- Backward compatibility with `bottle.cred_proxy.routes[]`. Hard cutover (see Migration).
- Path-level rules in pipelock. Upstream feature request is a separate track (file independently); this PRD doesn't depend on it.

## Scope

### In scope

- A new `egress-proxy` sidecar replacing the cred-proxy sidecar. mitmproxy image, pinned by digest. Addons in Python.
- Per-bottle CA generation **moves from pipelock to egress-proxy**. The agent's trust store is rebuilt against the egress-proxy CA (was pipelock's CA).
- Manifest rename: `bottle.cred_proxy.routes[]` → `bottle.egress_proxy.routes[]`.
  The route shape gains optional `path_allowlist: [, ...]` and a nested optional `auth: { scheme, token_ref }` block (presence/absence of `auth` is the authenticated vs unauthenticated signal; this replaces the old `auth_scheme: "none"` pattern).
- Agent's `HTTP_PROXY` / `HTTPS_PROXY` env vars repointed at the egress-proxy (was pipelock).
- Pipelock retains its sidecar slot and its own DLP + hostname scanner. The agent never dials it directly anymore; egress-proxy uses `HTTPS_PROXY=pipelock` for its outbound leg, matching the current cred-proxy → pipelock pattern.
- Existing PRDs that depend on cred-proxy:
  - PRD 0014 (cred-proxy-block remediation) → renames + retargets apply path. SIGHUP reload semantics carry over to egress-proxy.
  - PRD 0013 (supervise plane) `cred-proxy-block` MCP tool stays; its proposed file format updates per the new route shape.
- Removal of the old cred-proxy code: `claude_bottle/cred_proxy.py`, `cred_proxy_server.py`, `backend/docker/cred_proxy.py`, `provision/cred_proxy.py`, the `Dockerfile.cred-proxy`. Tests updated.

### Out of scope

- Pipelock CA path: pipelock keeps generating its *own* CA for any internal TLS termination it still does (e.g., on the egress-proxy → upstream leg if pipelock is the MITM there). Whether pipelock needs that CA at all post-cutover is an open question (probably no: egress-proxy already terminated; pipelock is now downstream of a plain-HTTP forward from egress-proxy).
- Glob / regex matching in `path_allowlist`. v1 ships prefix matching; expressive forms are a follow-up.
- An MCP tool for the agent to propose `path_allowlist` additions. Today the operator manages this via the manifest + the existing `routes edit ` TUI verb (renamed to `egress-proxy edit `).
## Proposed design

### Topology

```
[Agent]
   --HTTP_PROXY=egress-proxy-->
[egress-proxy (mitmproxy)]
     MITM with per-bottle CA
     path_allowlist enforcement
     Authorization header injection
   --HTTPS_PROXY=pipelock-->
[pipelock]
     hostname allowlist
     DLP body scan
   --egress-->
Internet
```

Universal coverage: every HTTP/HTTPS request the agent makes hits egress-proxy first. cred-proxy's URL convention (`http://cred-proxy:9099/...`) goes away; there's no need for the agent to address the proxy by name because it's already on the default proxy path.

### Manifest

```yaml
egress_proxy:
  routes:
    # Authenticated route: `auth` block carries the injection
    # config. path_allowlist optional.
    - host: "api.github.com"
      auth:
        scheme: "Bearer"
        token_ref: "GH_PAT"
      path_allowlist:
        - "/repos/didericis/"
        - "/users/didericis"
    # Unauthenticated path-filtered route: `auth` omitted
    # entirely (presence/absence of the key is the auth signal).
    - host: "github.com"
      path_allowlist:
        - "/didericis/"
    # Bare-pass route: no auth, no path constraint. Useful when
    # you want a host to skip path filtering but still be
    # DLP-scanned by pipelock on the outbound leg.
    - host: "api.anthropic.com"
```

Route matching is on `host` (was `path` prefix). The hostname gates whether a route applies; `path_allowlist` (if present) constrains the URL path under that host.

The optional `auth` block carries credential-injection config:

- Omit `auth` → no Authorization header injected (replaces the earlier draft's `auth_scheme: "none"`).
- `auth.scheme` → one of `Bearer`, `token` (the values cred-proxy supports today; sidesteps the gitea-token quirk).
- `auth.token_ref` → host env var holding the secret. Same semantics as cred-proxy's `TokenRef` field today.

Validation: `auth` (if present) must contain both `scheme` and `token_ref`. An empty `auth: {}` is an error rather than a synonym for "no auth"; that's what omission is for.
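The validation rules above can be sketched in a few lines of Python. This is a minimal sketch, not the provisioner's actual code: `Route`, `Auth`, and `parse_route` are hypothetical names, routes are assumed to arrive as plain dicts after YAML parsing, and the requirement that every prefix starts with `/` is an assumption this PRD does not state.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# The two schemes the PRD lists as supported today.
ALLOWED_SCHEMES = {"Bearer", "token"}


@dataclass(frozen=True)
class Auth:
    scheme: str
    token_ref: str  # name of the host env var holding the secret


@dataclass(frozen=True)
class Route:
    host: str
    auth: Optional[Auth] = None  # None means "no injection"
    path_allowlist: Optional[Tuple[str, ...]] = None  # None means "no path constraint"


def parse_route(raw: dict) -> Route:
    """Validate one entry of egress_proxy.routes[] (hypothetical helper)."""
    host = raw.get("host")
    if not isinstance(host, str) or not host:
        raise ValueError("route needs a non-empty 'host'")

    auth = None
    if "auth" in raw:
        block = raw["auth"]
        # An empty or partial auth block is an error, not a synonym for "no auth".
        if not isinstance(block, dict) or not block.get("scheme") or not block.get("token_ref"):
            raise ValueError(f"{host}: 'auth' must contain both 'scheme' and 'token_ref'")
        if block["scheme"] not in ALLOWED_SCHEMES:
            raise ValueError(f"{host}: unsupported auth scheme {block['scheme']!r}")
        auth = Auth(block["scheme"], block["token_ref"])

    prefixes = raw.get("path_allowlist")
    if prefixes is not None:
        # Assumption: prefixes are absolute URL paths.
        if not isinstance(prefixes, list) or not all(
            isinstance(p, str) and p.startswith("/") for p in prefixes
        ):
            raise ValueError(f"{host}: 'path_allowlist' must be a list of '/'-prefixed strings")
        prefixes = tuple(prefixes)

    return Route(host=host, auth=auth, path_allowlist=prefixes)
```

With this shape, `parse_route({"host": "api.anthropic.com"})` yields a bare-pass route, while `parse_route({"host": "github.com", "auth": {}})` raises, which mirrors the "omission is the signal" rule.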
### mitmproxy addon shape

The egress-proxy ships a small Python addon that:

- Loads the per-bottle routes from `/etc/egress-proxy/routes.yaml` (rendered by the prepare step, docker-cp'd in like cred-proxy's current routes.json).
- On `request` hook: match `flow.request.host` → route. If no route matches → forward unchanged (pipelock will hostname-gate it). If route matches and has `path_allowlist`, check `flow.request.path` against the prefix list; 403 with a clear reason if no match.
- On approved requests: strip inbound Authorization. If the route carries an `auth` block, inject `Authorization: `. If the route omits `auth`, leave Authorization unset.
- SIGHUP / file-mtime watch on `routes.yaml` for hot-reload (same cadence as today's cred-proxy SIGHUP path).

mitmproxy's standard CA generation handles per-host leaf certs at SNI time. The per-bottle CA is generated at bottle launch (was pipelock's tls-init step; now egress-proxy's). Agent's trust store gets the egress-proxy CA installed in place of pipelock's.

### Trust-domain concentration

The egress-proxy now holds:

- Every credential the bottle declared in `egress_proxy.routes[]` (OAuth tokens, PATs, npm tokens).
- The per-bottle MITM CA private key.

This is a deliberate concentration. With the previous split:

- cred-proxy held tokens.
- pipelock held the CA.

A memory disclosure in cred-proxy exposed tokens; in pipelock, the CA. Both were bad; neither exposed everything. The new egress-proxy in one disclosure exposes both.

Mitigations:

- mitmproxy runs as an unprivileged user inside the container.
- Tokens live in the container's environ (same as cred-proxy today). The CA private key is mounted from the host's stage_dir (mode 600).
- Pipelock stays as a separate sidecar, so a compromise of egress-proxy doesn't disable pipelock's hostname check + DLP on the outbound leg: the attacker can forge certs to the agent but can't easily exfil from inside the agent without pipelock noticing.

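Because the addon is where both secrets now meet, it helps if its request-hook rules (host match, prefix check, strip then inject) live in one small pure function that can be audited and tested without mitmproxy running. The following is a minimal sketch under stated assumptions: `decide` and the `Routes` mapping are hypothetical names, routes are assumed to be keyed by exact hostname, and the injected header value is assumed to be the scheme, a space, then the token.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical in-memory form of routes.yaml: exact hostname -> route settings.
Routes = Mapping[str, dict]


def decide(
    routes: Routes,
    host: str,
    path: str,
    headers: Mapping[str, str],
    env: Mapping[str, str],
) -> Tuple[Optional[int], dict]:
    """Return (block_status, outgoing_headers) for one request.

    block_status is 403 when the path is rejected, otherwise None (forward).
    """
    out = dict(headers)
    route = routes.get(host)
    if route is None:
        # No route: forward unchanged; pipelock hostname-gates it downstream.
        return None, out

    prefixes = route.get("path_allowlist")
    if prefixes is not None and not any(path.startswith(p) for p in prefixes):
        return 403, out

    # Approved request on a matched route: drop whatever the agent sent.
    for name in list(out):
        if name.lower() == "authorization":
            del out[name]

    # Inject only when the route carries an auth block.
    auth = route.get("auth")
    if auth is not None:
        # Assumed header format: "<scheme> <token>", token read from the env.
        out["Authorization"] = f"{auth['scheme']} {env[auth['token_ref']]}"
    return None, out
```

In a real addon this would presumably be called from mitmproxy's `request(flow)` hook with `flow.request.host`, `flow.request.path`, and `flow.request.headers`, setting `flow.response` to a 403 when blocked; that wiring is left out so the sketch runs without mitmproxy installed. Note that plain `startswith` matching means the manifest's `/users/didericis` prefix also admits `/users/didericis-other`; a trailing slash on the prefix avoids that.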

The user (per PR #25 discussion) accepted this concentration in exchange for the one-sidecar consolidation. The PRD records it explicitly.

### Migration: hard cutover

No backward-compat alias for `bottle.cred_proxy.routes[]`. At manifest load:

- `cred_proxy:` block → `die()` with a clear pointer at this PRD and a migration recipe (rename to `egress_proxy:`, rename `path` → `host`, drop the agent-side URL prefix).
- `cred_proxy_routes` field on existing dataclasses removed.
- `Dockerfile.cred-proxy` deleted.
- `claude_bottle/cred_proxy*.py` deleted.
- `claude_bottle/backend/docker/cred_proxy*.py` consolidated into `egress_proxy*.py`.
- Provisioner files renamed.
- PRDs 0010 (cred-proxy), 0014 (cred-proxy-block remediation) retroactively annotated as "superseded by 0017"; old text preserved, header updated.

### Implementation chunks

Plausibly three implementation PRs after this PRD lands:

1. **egress-proxy sidecar core.** Dockerfile + mitmproxy addon + `routes.yaml` schema + lifecycle (prepare / start / stop / SIGHUP).
2. **Manifest + provisioner migration.** Rename cred-proxy throughout the codebase, hard-fail on legacy manifests, update agent CA trust to point at egress-proxy.
3. **PRD 0014 retargeting.** cred-proxy-block remediation's apply path repointed at egress-proxy (SIGHUP, audit log, etc.). Supervise tool description updated.

## Open questions

- **mitmproxy addon distribution.** Mount the addon Python file from stage_dir, or bake it into the image. Mount is more hot-reloadable; bake-in is more reproducible. Recommend bake-in, with routes.yaml as the only mounted state.
- **Path match semantics.** Prefix-only for v1 (matches PRD 0017 v1 spirit). Globs / regex are a follow-up if operators ask.
- **Mode for the `Authorization` strip on inbound.** Pipelock has a similar strip in `sensitive_headers`. Confirm there's no double-strip causing a real header the agent set to disappear unexpectedly.
  Probably want egress-proxy to be the only stripper for routes that match.
- **Pipelock's TLS interception post-cutover.** Today pipelock MITMs the cred-proxy → upstream leg using its own CA. After the cutover, that leg starts as a CONNECT tunnel from egress-proxy (egress-proxy treats pipelock as a plain HTTPS forward proxy). Does pipelock still need to MITM? Probably no: egress-proxy already terminated, body content is already inspected upstream by egress-proxy's addons (or could be). But that means moving DLP from pipelock to egress-proxy, which expands egress-proxy's trust-domain *further*. Punted to the implementation PR to decide.
- **Performance.** Two MITM hops in the worst case (agent ↔ egress-proxy and pipelock ↔ upstream if pipelock keeps its interception). Measure under realistic load; if it's a problem, the answer is probably to disable pipelock's TLS interception and let it operate at hostname-only.
- **Agent's existing dotfile rewrites.** Today cred-proxy provisions ~/.npmrc with `registry=http://cred-proxy:9099/npm/`, ~/.gitconfig with `insteadOf` rules, etc. After the cutover none of those rewrites are strictly necessary for routing (HTTPS_PROXY catches everything), but they may still be useful for canonicalisation (so the agent's `npm install` doesn't surprise itself by talking to a different registry). Decide per dotfile in the migration PR.

## References

- PRD 0010: cred-proxy (superseded by this PRD).
- PRD 0014: cred-proxy-block remediation (retargeted).
- PRD 0013: supervise plane (tool descriptions updated).
- PR #25: the supervise loop, whose `_apply_pipelock_url` docstring flagged the original "path filtering belongs somewhere" follow-up.
- mitmproxy (https://mitmproxy.org/): chosen as the egress-proxy engine because it's the canonical scriptable MITM forward proxy.