docs(prd-0017): pivot to mitmproxy-based egress-proxy

Significant rewrite of PRD 0017 based on PR #25 design discussion.

Original draft proposed adding `path_allowlist` to the existing cred-proxy. That bought opt-in path filtering for tools that voluntarily routed through cred-proxy (Claude Code, git, npm), but raw `curl https://github.com/foo` from the agent goes to HTTPS_PROXY=pipelock and bypasses cred-proxy entirely, so any universal enforcement claim was a lie.

New design: replace cred-proxy with a mitmproxy-based egress-proxy that becomes the agent's HTTP_PROXY/HTTPS_PROXY. Every agent HTTP/HTTPS request flows through it before reaching pipelock. Path-level allow/deny enforcement is universal because the proxy is on every leg. The proxy also absorbs cred-proxy's credential injection role (mitmproxy addon hooks request → strip + inject Authorization).

Net sidecar count: unchanged. cred-proxy is replaced 1:1 by egress-proxy. Pipelock stays as hostname allow + DLP downstream of egress-proxy.

Decisions baked in per PR-#25 discussion:
- Tool: mitmproxy (designed for this; Python addons; well-maintained).
- CA custody: egress-proxy holds the per-bottle MITM CA key (concentration accepted; documented in trust-domain section).
- Migration: hard cutover. Existing `bottle.cred_proxy.routes[]` manifests fail-fast at load time with a pointer at this PRD.

Open questions retained for the implementation PRs: addon distribution (bake vs mount), prefix-vs-glob match, double-strip of Authorization between egress-proxy and pipelock, whether pipelock keeps TLS interception or stays hostname-only post-cutover, performance under two-MITM-hops.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 13:28:53 -04:00
parent 5b925a6699
commit b0d9802469
2 changed files with 309 additions and 195 deletions
@@ -0,0 +1,309 @@
 # PRD 0017: Egress-proxy — universal MITM with path filtering + auth injection
 - **Status:** Draft
 - **Author:** didericis
 - **Created:** 2026-05-25
 - **Supersedes:** the cred-proxy sidecar (PRD 0010) — hard cutover.
 ## Summary
 Replace the per-bottle cred-proxy sidecar with a new `egress-proxy`
 sidecar built on mitmproxy. The egress-proxy is the agent's
 `HTTP_PROXY` / `HTTPS_PROXY` — every agent HTTP/HTTPS request flows
 through it before reaching pipelock. It owns three jobs that today
 are split between cred-proxy and pipelock:
 1. **MITM the agent's HTTPS.** Uses the per-bottle CA today held by
   pipelock; that key moves to the egress-proxy.
 2. **Path-level allow/deny.** Manifest-declared `path_allowlist`
   per route. Universal coverage — any HTTPS path the agent reaches
   for is inspected here, not just traffic that voluntarily dials
   the cred-proxy URL.
 3. **Credential injection.** Continues cred-proxy's existing role:
   match by hostname (or hostname + path), strip inbound
   Authorization, inject one based on `auth_scheme` + `token_ref`.
 Pipelock's role narrows to hostname allowlist + DLP body scanning
 on the egress-proxy → upstream leg. Pipelock no longer holds the
 CA private key; no longer the agent's direct proxy.
 ## Problem
 PR #25's pipelock-block flow exposed an honest gap: pipelock's
 `api_allowlist` is hostname-only (verified by probing the binary's
 strict preset and the `pipelock check --url` output). Approving a
 proposed `pipelock-block` opens the entire host, not the URL's
 path. For shared platforms (github.com, gitlab.com, public
 registries) operators routinely want narrower-than-host granularity
 — allow github.com/didericis but block github.com/somebody-else.
 Cred-proxy already does path-prefix routing for credentialed APIs,
 but it only sees the requests the agent voluntarily routes to it
 (via `ANTHROPIC_BASE_URL`, `~/.gitconfig` insteadOf, npmrc
 `registry=`). A raw `curl https://github.com/anyone` from the agent
 goes to `HTTPS_PROXY=pipelock` directly and bypasses cred-proxy
 entirely. So extending cred-proxy with `path_allowlist` (the earlier
 PRD 0017 draft) buys *opt-in* path filtering, not enforcement.
 For enforcement we need a layer that sits on the agent's
 `HTTPS_PROXY` path — universal coverage of agent egress.
 ## Goals / Success Criteria
 A bottle manifest declares an egress-proxy route with a
 `path_allowlist`. From inside the bottle, `curl
 https://github.com/didericis/foo` succeeds; `curl
 https://github.com/somebody-else/secret` gets a 403 from
 egress-proxy and never reaches pipelock or the real GitHub. The same
 holds for any tool inside the bottle that respects
 `HTTPS_PROXY` — claude-code, git over HTTPS, npm, raw curl, random
 Python `requests`. No tool-specific rewrite is required for path
 enforcement.
 Existing cred-proxy responsibilities continue to work after the
 cutover. Anthropic OAuth injection for claude-code happens via
 proxy-side header injection rather than the dotfile rewrite.
 git-insteadof routing into the proxy stays useful for hostname
 canonicalisation but is no longer load-bearing for credential
 delivery.
 ## Non-goals
 - Replacing pipelock. Pipelock keeps doing hostname allowlist +
  DLP body scanning on the egress-proxy → upstream leg.
 - Building our own MITM stack. mitmproxy already does it; we ship
  addons.
 - Backward compatibility with `bottle.cred_proxy.routes[]`. Hard
  cutover (see Migration).
 - Path-level rules in pipelock. Upstream feature request is a
  separate track (file independently); this PRD doesn't depend on
  it.
 ## Scope
 ### In scope
 - A new `egress-proxy` sidecar replacing the cred-proxy sidecar.
  mitmproxy image, pinned by digest. Addons in Python.
 - Per-bottle CA generation **moves from pipelock to egress-proxy**.
  The agent's trust store is rebuilt against the egress-proxy CA
  (was pipelock's CA).
 - Manifest rename: `bottle.cred_proxy.routes[]` →
  `bottle.egress_proxy.routes[]`. The route shape gains optional
  `path_allowlist: [<prefix>, ...]` and supports `auth_scheme:
  "none"`.
 - Agent's `HTTP_PROXY` / `HTTPS_PROXY` env vars repointed at the
  egress-proxy (was pipelock).
 - Pipelock retains its sidecar slot and its own DLP + hostname
  scanner. The agent never dials it directly anymore; egress-proxy
  uses `HTTPS_PROXY=pipelock` for its outbound leg, matching the
  current cred-proxy → pipelock pattern.
 - Existing PRDs that depend on cred-proxy:
  - PRD 0014 (cred-proxy-block remediation) → renames + retargets
    apply path. SIGHUP reload semantics carry over to egress-proxy.
  - PRD 0013 (supervise plane) `cred-proxy-block` MCP tool stays;
    its proposed file format updates per the new route shape.
 - Removal of the old cred-proxy code: `claude_bottle/cred_proxy.py`,
  `cred_proxy_server.py`, `backend/docker/cred_proxy.py`,
  `provision/cred_proxy.py`, the `Dockerfile.cred-proxy`. Tests
  updated.
 ### Out of scope
 - Pipelock CA path: pipelock keeps generating its *own* CA for
  any internal TLS termination it still does (e.g., on the
  egress-proxy → upstream leg if pipelock is the MITM there).
  Whether pipelock needs that CA at all post-cutover is an open
  question (probably no — egress-proxy already terminated; pipelock
  is now downstream of a plain-HTTP forward from egress-proxy).
 - Glob / regex matching in `path_allowlist`. v1 ships prefix
  matching; expressive forms are a follow-up.
 - An MCP tool for the agent to propose `path_allowlist`
  additions. Today the operator manages this via the manifest +
  the existing `routes edit <bottle>` TUI verb (renamed to
  `egress-proxy edit <bottle>`).
 ## Proposed design
 ### Topology
 ```
 [Agent] --HTTP_PROXY=egress-proxy-->
           [egress-proxy (mitmproxy)]
              MITM with per-bottle CA
              path_allowlist enforcement
              Authorization header injection
            --HTTPS_PROXY=pipelock-->
                  [pipelock]
                    hostname allowlist
                    DLP body scan
                  --egress-->  Internet
 ```
 Universal coverage: every HTTP/HTTPS request the agent makes hits
 egress-proxy first. cred-proxy's URL convention
 (`http://cred-proxy:9099/...`) goes away — there's no need for the
 agent to address the proxy by name because it's already on the
 default proxy path.
 ### Manifest
 ```yaml
 egress_proxy:
  routes:
    # Authenticated route (today's cred-proxy shape, slightly
    # renamed). path_allowlist optional.
    - host: "api.github.com"
      auth_scheme: "Bearer"
      token_ref: "GH_PAT"
      path_allowlist:
        - "/repos/didericis/"
        - "/users/didericis"
    # Unauthenticated path-filtered route.
    - host: "github.com"
      auth_scheme: "none"
      path_allowlist:
        - "/didericis/"
    # Bare-pass route: no auth injection, no path enforcement.
    # Useful when you want a host to skip path filtering but
    # still be DLP-scanned by pipelock.
    - host: "api.anthropic.com"
      auth_scheme: "none"
      # no path_allowlist → all paths pass
 ```
 Route matching is on `host` (was `path` prefix). The hostname
 gates whether a route applies; `path_allowlist` (if present)
 constrains the URL path under that host.
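The matching rule above can be sketched in a few lines of Python. This is a minimal illustration, not the addon itself: the `Route` dataclass, `find_route`, and `path_allowed` are hypothetical names, and the case-insensitive exact host comparison and the dropping of the query string before the prefix check are assumptions the PRD does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    host: str
    auth_scheme: str = "none"
    token_ref: Optional[str] = None
    # Empty tuple stands for "no path_allowlist": every path passes.
    path_allowlist: tuple = ()


def find_route(routes, host):
    """Return the first route whose host equals the request host, else None.

    Assumption: hosts are compared case-insensitively and exactly.
    """
    wanted = host.lower()
    for route in routes:
        if route.host.lower() == wanted:
            return route
    return None


def path_allowed(route, path):
    """Prefix check of the URL path against the route's path_allowlist."""
    if not route.path_allowlist:
        return True
    bare_path = path.split("?", 1)[0]  # assumption: query string is ignored
    return any(bare_path.startswith(p) for p in route.path_allowlist)


# The three routes from the manifest example above.
ROUTES = [
    Route("api.github.com", "Bearer", "GH_PAT",
          ("/repos/didericis/", "/users/didericis")),
    Route("github.com", "none", None, ("/didericis/",)),
    Route("api.anthropic.com"),
]
```

Note that with plain prefix matching, an entry without a trailing slash such as `/users/didericis` also admits sibling paths like `/users/didericis-other`; operators who want an exact segment may prefer a trailing `/` where the API shape allows it.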
 ### mitmproxy addon shape
 The egress-proxy ships a small Python addon that:
 - Loads the per-bottle routes from `/etc/egress-proxy/routes.yaml`
  (rendered by the prepare step, docker-cp'd in like cred-proxy's
  current routes.json).
 - On `request` hook: match `flow.request.host` → route. If no route
  matches → forward unchanged (pipelock will hostname-gate it). If
  route matches and has `path_allowlist`, check `flow.request.path`
  against the prefix list; 403 with a clear reason if no match.
 - On approved requests: strip inbound Authorization, inject
  `Authorization: <auth_scheme> <token-from-env>` if `auth_scheme
  != "none"`.
 - SIGHUP / file-mtime watch on `routes.yaml` for hot-reload (same
  cadence as today's cred-proxy SIGHUP path).
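The request-hook steps in the list above can be sketched as a pure function, kept free of mitmproxy imports so it can be unit-tested on its own. Everything here is illustrative: `Decision` and `decide` are hypothetical names, routes are assumed to arrive as plain dicts shaped like `egress_proxy.routes[]` entries, and `env` stands in for the container environment that holds the tokens. In a real addon, the `request` hook would presumably call something like this and turn a 403 decision into a response on the flow.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Decision:
    status: Optional[int]  # None = forward upstream, 403 = blocked
    reason: str = ""
    headers: dict = field(default_factory=dict)


def decide(host, path, headers, routes, env):
    """Apply the request-hook steps to a single request.

    routes: list of dicts shaped like entries in egress_proxy.routes[].
    env: mapping of token_ref -> token value (hypothetical stand-in
         for the container environment).
    """
    route = next((r for r in routes if r["host"] == host), None)
    if route is None:
        # No route: forward unchanged; pipelock hostname-gates it downstream.
        return Decision(None, "no route", dict(headers))

    allowlist = route.get("path_allowlist") or []
    if allowlist and not any(path.startswith(p) for p in allowlist):
        return Decision(403, f"path {path!r} not in path_allowlist for {host}")

    # Approved: drop whatever Authorization the agent sent, then inject ours.
    out = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    scheme = route.get("auth_scheme", "none")
    if scheme != "none":
        out["Authorization"] = f"{scheme} {env[route['token_ref']]}"
    return Decision(None, "allowed", out)
```

Keeping the decision separate from the proxy glue means the allow/deny and strip/inject behaviour can be tested without starting mitmproxy.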
 mitmproxy's standard CA generation handles per-host leaf certs at
 SNI time. The per-bottle CA is generated at bottle launch (was
 pipelock's tls-init step; now egress-proxy's). Agent's trust store
 gets the egress-proxy CA installed in place of pipelock's.
 ### Trust-domain concentration
 The egress-proxy now holds:
 - Every credential the bottle declared in `egress_proxy.routes[]`
  (OAuth tokens, PATs, npm tokens).
 - The per-bottle MITM CA private key.
 This is a deliberate concentration. With the previous split:
 - cred-proxy held tokens.
 - pipelock held the CA.
 A memory disclosure in cred-proxy exposed tokens; one in pipelock
 exposed the CA. Both were bad; neither exposed everything.
 A single disclosure in the new egress-proxy exposes both.
 Mitigations:
 - mitmproxy runs as an unprivileged user inside the container.
 - Tokens live in the container's environ (same as cred-proxy today).
  The CA private key is mounted from the host's stage_dir (mode 600).
 - Pipelock stays as a separate sidecar, so a compromise of
  egress-proxy doesn't disable pipelock's hostname check + DLP on
  the outbound leg. The attacker can forge certs to the agent but
  can't easily exfiltrate data from inside the agent without
  pipelock noticing.
 The user (per PR #25 discussion) accepted this concentration in
 exchange for the one-sidecar consolidation. The PRD records it
 explicitly.
 ### Migration — hard cutover
 No backward-compat alias for `bottle.cred_proxy.routes[]`. At
 manifest load:
 - `cred_proxy:` block → `die()` with a clear pointer at this PRD
  and a migration recipe (rename to `egress_proxy:`, rename
  `path` → `host`, drop the agent-side URL prefix).
 - `cred_proxy_routes` field on existing dataclasses removed.
 - `Dockerfile.cred-proxy` deleted.
 - `claude_bottle/cred_proxy*.py` deleted.
 - `claude_bottle/backend/docker/cred_proxy*.py` consolidated into
  `egress_proxy*.py`.
 - Provisioner files renamed.
 - PRDs 0010 (cred-proxy) and 0014 (cred-proxy-block remediation)
  retroactively annotated as "superseded by 0017": old text
  preserved, header updated.
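The load-time failure for legacy manifests might look like the sketch below. It is an assumption-laden illustration: `ManifestError` stands in for the loader's `die()`, `reject_legacy_manifest` is a hypothetical helper name, and the manifest is assumed to be an already-parsed dict for the `bottle` block.

```python
class ManifestError(Exception):
    """Hypothetical stand-in for the loader's die()."""


LEGACY_HINT = (
    "bottle.cred_proxy is no longer supported (see PRD 0017). "
    "Rename cred_proxy: to egress_proxy:, rename each route's path to host, "
    "and drop the agent-side URL prefix."
)


def reject_legacy_manifest(bottle):
    """Fail fast if the parsed bottle block still carries cred_proxy."""
    if "cred_proxy" in bottle:
        raise ManifestError(LEGACY_HINT)
    return bottle
```

Checking for the key's presence rather than its contents means even an empty `cred_proxy:` block fails, which matches the "no backward-compat alias" stance.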
 ### Implementation chunks
 Plausibly three implementation PRs after this PRD lands:
 1. **egress-proxy sidecar core.** Dockerfile + mitmproxy addon +
   `routes.yaml` schema + lifecycle (prepare / start / stop / SIGHUP).
 2. **Manifest + provisioner migration.** Rename cred-proxy
   throughout the codebase, hard-fail on legacy manifests, update
   agent CA trust to point at egress-proxy.
 3. **PRD 0014 retargeting.** cred-proxy-block remediation's apply
   path repointed at egress-proxy (SIGHUP, audit log, etc.).
   Supervise tool description updated.
 ## Open questions
 - **mitmproxy addon distribution.** Mount the addon Python file
  from stage_dir, or bake it into the image. Mount is more
  hot-reloadable; bake-in is more reproducible. Recommend bake-in,
  with routes.yaml as the only mounted state.
 - **Path match semantics.** Prefix-only for v1 (in the spirit of
  the earlier PRD 0017 draft). Globs / regex are a follow-up if
  operators ask.
 - **Mode for the `Authorization` strip on inbound.** Pipelock has a
  similar strip in `sensitive_headers`. Confirm there's no
  double-strip causing a real header the agent set to disappear
  unexpectedly. Probably want egress-proxy to be the only stripper
  for routes that match.
 - **Pipelock's TLS interception post-cutover.** Today pipelock
  MITMs the cred-proxy → upstream leg using its own CA. After the
  cutover, that leg starts as a CONNECT tunnel from egress-proxy
  (egress-proxy treats pipelock as a plain HTTPS forward proxy).
  Does pipelock still need to MITM? Probably no — egress-proxy
  already terminated, body content is already inspected upstream
  by egress-proxy's addons (or could be). But that means moving
  DLP from pipelock to egress-proxy, which expands egress-proxy's
  trust-domain *further*. Punted to the implementation PR to
  decide.
 - **Performance.** Two MITM hops in the worst case (agent ↔
  egress-proxy and pipelock ↔ upstream if pipelock keeps its
  interception). Measure under realistic load; if it's a problem,
  the answer is probably to disable pipelock's TLS interception
  and let it operate at hostname-only.
 - **Agent's existing dotfile rewrites.** Today cred-proxy
  provisions ~/.npmrc with `registry=http://cred-proxy:9099/npm/`,
  ~/.gitconfig with `insteadOf` rules, etc. After the cutover
  none of those rewrites are strictly necessary for routing
  (HTTPS_PROXY catches everything), but they may still be useful
  for canonicalisation (so the agent's `npm install` doesn't
  surprise itself by talking to a different registry). Decide per
  dotfile in the migration PR.
 ## References
 - PRD 0010 — cred-proxy (superseded by this PRD).
 - PRD 0014 — cred-proxy-block remediation (retargeted).
 - PRD 0013 — supervise plane (tool descriptions updated).
 - PR #25 — the supervise loop, whose `_apply_pipelock_url`
  docstring flagged the original "path filtering belongs
  somewhere" follow-up.
 - mitmproxy — https://mitmproxy.org/ — chosen as the egress-proxy
  engine because it's the canonical scriptable MITM forward proxy.
@@ -1,195 +0,0 @@
 # PRD 0017: Path-aware egress filtering via cred-proxy
 - **Status:** Draft
 - **Author:** didericis
 - **Created:** 2026-05-25
 ## Summary
 Pipelock's `api_allowlist` is hostname-only — once a host is on the
 list, every URL path at that host is reachable. For agents working
 on shared platforms (github.com, gitlab.com, public registries),
 this means approving access to one user's content also opens
 access to every other user's content. Cred-proxy already
 path-prefix-routes authenticated traffic; this PRD extends it to
 filter (not just route) paths, including for unauthenticated hosts.
 Per-bottle egress then has two complementary layers: pipelock for
 hostname allow + DLP + body scanning, cred-proxy for path-level
 allow on declared hosts.
 ## Problem
 PR #25's pipelock-block tool delivers an honest but coarse experience:
 the agent reports "I tried hitting `https://github.com/didericis`,
 pipelock 403'd it"; the operator approves and the agent now has
 access to all of github.com. The path in the proposal is captured
 as context but not enforced (PR #25 documents this in
 `_apply_pipelock_url`'s docstring).
 The intended posture for many shared platforms is narrower than
 hostname-level. "Allow the agent to read github.com/didericis but
 not github.com/somebody-else" is a normal ask. Today the egress
 stack can't express that, even though cred-proxy already has 80%
 of the machinery: it path-routes authenticated traffic with
 longest-prefix matching, and the manifest's `cred_proxy.routes[]`
 shape is already a list of `(path, upstream, ...)` rules.
 ## Goals / Success Criteria
 A bottle manifest can declare a cred-proxy route with a
 `path_allowlist` and `auth_scheme: none`. Agents dialing
 `http://cred-proxy:<port>/<route>/<suffix>` hit a 403 from
 cred-proxy when `<suffix>` doesn't match any allowlist entry, and
 a normal forward (no auth header injected) when it does. For
 existing authenticated routes the addition is opt-in: a route
 without `path_allowlist` keeps its current permissive behaviour.
 Demonstrable behavior: a bottle manifest declares
 `{path: "/github/", upstream: "https://github.com", auth_scheme: "none",
 path_allowlist: ["/didericis/"]}`; the agent reaches
 `http://cred-proxy:9099/github/didericis/some-repo` successfully,
 gets a 403 on `http://cred-proxy:9099/github/someone-else/whatever`.
 ## Non-goals
 - Replacing pipelock. Pipelock still does the hostname allowlist,
  DLP body scanning, MCP / WebSocket inspection. Path filtering is
  additive, sitting in front of pipelock for routes that opt in.
 - Auto-routing arbitrary outbound HTTP through cred-proxy. The
  agent's `HTTP_PROXY` stays pointed at pipelock; cred-proxy is
  reached by explicit URL (with a `git-insteadof`-style rewrite
  for the few protocol-level helpers that need it).
 - Reworking pipelock-block. The PR #25 tool stays hostname-only;
  whether a new path-aware proposal tool (or a richer
  pipelock-block) is wanted is an open question for a follow-on
  PRD.
 - Live mutation of the running container or cred-proxy beyond
  what cred-proxy SIGHUP already supports (PRD 0014).
 ## Scope
 ### In scope
 - A new optional `auth_scheme: "none"` mode on cred-proxy routes
  that suppresses Authorization injection while keeping path
  routing + (new) path filtering.
 - A new optional `path_allowlist: [<prefix>, ...]` field per
  cred-proxy route. When present, cred-proxy 403s requests whose
  in-route suffix doesn't match at least one prefix.
 - Manifest schema + validation for the two new fields.
 - Cred-proxy server logic: enforcement on each request after the
  longest-prefix route match.
 - SIGHUP reload picks up `path_allowlist` changes (no new sidecar
  primitives — the existing reload path already re-reads
  `routes.json`).
 ### Out of scope
 - A new MCP tool for the agent to propose `path_allowlist`
  additions. Today the operator manages this via the manifest +
  the existing `routes edit <bottle>` TUI verb.
 - Glob / regex matching. v1 ships prefix matching only; the open
  question lays out the trade-offs.
 - Auto-migrating PR #25's pipelock-block proposals into cred-proxy
  routes. Manual operator decision per host.
 - Provisioner-side dotfile changes for HTTPS-to-cred-proxy rewrites
  on bottles that opt unauth'd hosts onto cred-proxy. Out of scope
  for the engine work; the manifest can already encode it.
 ## Proposed Design
 ### Manifest schema additions
 `bottle.cred_proxy.routes[]` gains two optional fields:
 ```yaml
 cred_proxy:
  routes:
    - path: "/github/"
      upstream: "https://github.com"
      auth_scheme: "none"         # new — no Authorization header
      token_ref: ""               # ignored when auth_scheme is "none"
      path_allowlist:             # new — prefix list; empty / absent = permissive
        - "/didericis/"
        - "/didericis-org/"
 ```
 - `auth_scheme: "none"` joins the existing `Bearer` / `token` values.
  When `none`, `token_ref` must be empty or absent and no
  Authorization header is injected. The route still routes by path
  prefix and forwards to upstream.
 - `path_allowlist` is a list of suffix prefixes (matched after the
  route's `path` is stripped). Empty / absent means permissive
  (current behaviour). When non-empty, the suffix must start with
  at least one of the allowlist entries.
 ### cred-proxy server changes
 Per request:
 1. Strip query string, longest-prefix-match against `routes`.
 2. Compute the suffix = request_path[len(route.path):].
 3. If `route.path_allowlist` is non-empty: require that
   `"/" + suffix` (or just `suffix` — pick a consistent
   normalization) starts with at least one allowlist entry. 403 if
   not.
 4. If `auth_scheme == "none"`: skip the `Authorization` header
   step entirely; otherwise inject as today.
 5. Forward upstream, stream response (unchanged).
 The 403 body should name the route + the disallowed suffix so the
 operator can diagnose. cred-proxy's existing log line at request
 time picks up the new outcome too.
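Steps 1 to 3 of the per-request flow above can be sketched as follows. The function names are hypothetical, routes are assumed to be plain dicts, and the sketch picks the `"/" + suffix` normalization, which is only one of the two options step 3 leaves open. The 404 for an unmatched path mirrors the existing "no route" error mentioned elsewhere in this PRD.

```python
def longest_prefix_route(routes, path):
    """Return the route whose `path` is the longest prefix of `path`."""
    candidates = [r for r in routes if path.startswith(r["path"])]
    return max(candidates, key=lambda r: len(r["path"]), default=None)


def check_request(routes, request_path):
    """Return None to forward, or an HTTP status (404 / 403) to reject."""
    path = request_path.split("?", 1)[0]           # step 1: strip query string
    route = longest_prefix_route(routes, path)
    if route is None:
        return 404
    suffix = "/" + path[len(route["path"]):]       # step 2, "/"-normalized
    allowlist = route.get("path_allowlist") or []  # empty / absent = permissive
    if allowlist and not any(suffix.startswith(p) for p in allowlist):
        return 403                                 # step 3
    return None
```

With the route from the Goals section, `/github/didericis/some-repo` yields the suffix `/didericis/some-repo` and is forwarded, while `/github/someone-else/whatever` is rejected with a 403.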
 ### Validation
 At manifest load:
 - `auth_scheme` must be one of `Bearer`, `token`, or `none`.
 - When `auth_scheme == "none"`, `token_ref` is forbidden (clearer
  error than silently ignoring).
 - `path_allowlist` entries must start with `/` and end with `/`
  (matching the existing convention for `route.path`).
 - Duplicate prefixes are deduplicated with a warning, not an
  error.
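A minimal sketch of these validation rules, assuming each route is a plain dict. `validate_route` is a hypothetical helper, and returning lists of errors and warnings (rather than raising) is an illustrative choice, not something the PRD prescribes.

```python
VALID_SCHEMES = {"Bearer", "token", "none"}


def validate_route(route):
    """Return (deduplicated path_allowlist, errors, warnings) for one route."""
    errors, warnings = [], []
    scheme = route.get("auth_scheme")
    if scheme not in VALID_SCHEMES:
        errors.append(
            f"auth_scheme must be one of Bearer, token, none; got {scheme!r}"
        )
    if scheme == "none" and route.get("token_ref"):
        errors.append("token_ref is forbidden when auth_scheme is 'none'")

    cleaned = []
    for prefix in route.get("path_allowlist") or []:
        if not (prefix.startswith("/") and prefix.endswith("/")):
            errors.append(
                f"path_allowlist entry {prefix!r} must start and end with '/'"
            )
        elif prefix in cleaned:
            # Duplicates are dropped with a warning, not an error.
            warnings.append(f"duplicate path_allowlist entry {prefix!r} dropped")
        else:
            cleaned.append(prefix)
    return cleaned, errors, warnings
```

An empty-string `token_ref` is treated as absent here, which is consistent with the schema note that it "must be empty or absent" under `auth_scheme: "none"`.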
 ### Migration / backward compatibility
 - Routes without `path_allowlist` behave exactly as today.
 - Routes with `auth_scheme: Bearer | token` behave exactly as today.
 - No existing manifests need editing; the new fields are opt-in.
 ## Open questions
 - **Match semantics: prefix vs glob vs regex.** Prefix is simple
  and matches the existing `route.path` convention. Glob (`/users/*/repos/`)
  adds power but is easy to get wrong (does `*` match a `/`?).
  Regex is the most powerful and the most footguny. Recommend
  prefix-only for v1, glob in a follow-up if operators ask for it.
 - **403 body shape.** Plain text vs JSON. Cred-proxy's existing
  errors use plain text (`send_error(404, "no route for ...")`).
  Match that.
 - **Auth-less routes and TLS interception.** A `none`-auth route
  still routes outbound HTTPS through pipelock (cred-proxy's
  `HTTPS_PROXY` env), so pipelock's CA + body scanner still apply.
  Confirm that pipelock's allowlist needs the upstream host in
  this case — there's no token to make the cred-proxy → upstream
  leg special. Likely yes, same as today.
 - **MCP tool / pipelock-block evolution.** Once path filtering
  exists, the operator may want a way for the agent to propose
  path additions (e.g. "I need /didericis-org/ added to the
  github route"). Today that requires manifest edit + cli.py
  rebuild, or `routes edit` via the dashboard. Whether a new MCP
  tool (or a richer pipelock-block) is wanted is a follow-on PRD
  open question.
 - **Allowlist semantics for the entire route prefix.** Should an
  empty `path_allowlist: []` be allowed? Equivalent to "block
  everything at this upstream" — possibly useful as a tombstone,
  more likely a typo. Recommend treating empty list the same as
  absent (permissive) and flagging in the validation note.
 ## References
 - PRD 0010 — cred-proxy (the engine being extended).
 - PRD 0015 — pipelock block remediation (whose hostname-only
  ceiling motivates this PRD).
 - PR #25 — `_apply_pipelock_url`'s docstring documents the
  follow-up that this PRD formalises.