# PRD 0005: mitmproxy TLS interception for pipelock content scanning

- **Status:** Draft (updated 2026-05-12 after open-question walkthrough)
- **Author:** didericis
- **Created:** 2026-05-12

## Summary

Add a per-bottle **mitmproxy** sidecar in front of pipelock on the egress path. mitmproxy bumps the agent's TLS CONNECT, decrypts the inner HTTP, and hands each request to a vendored Python addon. The addon forwards the decrypted request to pipelock as a plain HTTP forward-proxy call so pipelock's DLP, URL-scan, and header-scan layers fire on real bodies. On the verdict, the addon either short-circuits the flow with a 403 (block) or lets mitmproxy proceed to the real upstream (allow). mitmproxy itself generates the ephemeral per-bottle CA on startup; the public cert is copied into the agent's trust store and the private key dies with the sidecar on teardown.

This is Topology A' from `docs/research/tls-mitm-for-pipelock.md`: a variant of the research note's Topology A after a spike showed mitmproxy's `upstream` mode re-wraps decrypted flows in a new CONNECT to the upstream proxy (which would defeat the entire point). The addon recovers the design by emitting plain HTTP to pipelock explicitly instead of relying on mitmproxy's `upstream` chaining.

## Problem

PRD 0001 wired pipelock onto every bottle's egress, but the current topology only sees `CONNECT` hostnames and opaque TLS bytes:

```
agent --HTTPS_PROXY--> pipelock --CONNECT host:443--> internet
                                \____________________________ opaque TLS bytes
```

What pipelock cannot scan in this mode is documented in `docs/research/tls-mitm-for-pipelock.md` §What pipelock cannot see today: request URLs and methods, request and response headers, request and response bodies, MCP JSON-RPC payloads, inner-vs-outer hostname (the domain-fronting check), and WebSocket frames inside a TLS-wrapped upgrade.
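
As a rough illustration of that gap, the sketch below parses the only cleartext a CONNECT-only forward proxy receives before the TLS handshake begins. The function name `connect_target` and the sample bytes are hypothetical, chosen for illustration; nothing here is taken from pipelock's code.

```python
def connect_target(request_head: bytes) -> tuple[str, int]:
    """Extract (host, port) from the head of an HTTP CONNECT request.

    This is everything a non-bumping proxy learns about the flow: no path,
    no method of the inner request, no headers, no body.
    """
    request_line = request_head.split(b"\r\n", 1)[0].decode("ascii")
    method, authority, _version = request_line.split(" ")
    if method != "CONNECT":
        raise ValueError(f"not a CONNECT request: {method}")
    # rpartition keeps bracketed IPv6 literals such as "[::1]" intact.
    host, _, port = authority.rpartition(":")
    return host, int(port)


# Illustrative bytes, shaped like what a client sends through HTTPS_PROXY.
head = (
    b"CONNECT api.anthropic.com:443 HTTP/1.1\r\n"
    b"Host: api.anthropic.com:443\r\n"
    b"\r\n"
)
print(connect_target(head))
```

Everything after this exchange is ciphertext to the proxy, which is why a body-scanning layer has nothing to match against.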

The 48-pattern DLP layer this project relies on in PRD 0001 is therefore inert against every host in the current `DEFAULT_ALLOWLIST`, all of which are HTTPS-only. The integration test added in `tests/integration/test_pipelock_blocks_secret_post.py` demonstrates the gap concretely: pipelock's body-scan layer only fires when the agent is forced to send plain HTTP. Real Claude Code traffic to `api.anthropic.com` goes over CONNECT-tunneled TLS and slips past the scanner.

`pipelock-assessment.md` §Scope gaps names this as a known limitation of the proxy-without-TLS-inspection shape. Closing it is the explicit motivation for `tls-mitm-for-pipelock.md`, whose recommendation this PRD implements (with the addon adjustment forced by the upstream-mode spike).

## Goals / Success Criteria

The feature works when all of the following are observable:

- A Node request from inside a launched bottle to a CONNECT-bumped HTTPS host (e.g. `https://api.anthropic.com/dlp-probe`) carrying a pipelock-recognized credential pattern in the body returns 403 from the bottle's egress chain, not a response from the upstream. The existing `test_pipelock_blocks_secret_post` test path becomes the HTTPS variant of this assertion.
- A plain HTTPS GET from inside the bottle to an allowlisted host with no credential pattern (e.g. `GET https://raw.githubusercontent.com/...`) returns the real upstream response; the addon doesn't break clean traffic.
- Claude Code itself reaches `api.anthropic.com` end-to-end through the bottle and completes a chat round-trip. No TLS-trust errors in the agent process.
- mitmproxy's flow log and pipelock's `body_dlp` / `header_dlp` / `core_dlp` event lines both appear for the same outbound request, confirming the two-stage path is active.

The feature is **done** when all of the following ship:

- A new `MitmproxyProxy` class with the same `prepare` / `start` / `stop` lifecycle shape as `PipelockProxy`, wired into the Docker backend's launch step.
- A vendored Python addon at `claude_bottle/mitmproxy/addon.py` that mitmproxy loads on startup via `mitmdump -s ...`. The sidecar runs in `regular` mode (default), not `upstream` mode.
- The bottle launch step starts the mitmproxy sidecar, waits for the sidecar-internal CA to be generated, copies the CA public cert into the agent at `/usr/local/share/ca-certificates/claude-bottle-mitm.crt`, runs `update-ca-certificates` inside the agent, and threads the `NODE_EXTRA_CA_CERTS` / `SSL_CERT_FILE` / `REQUESTS_CA_BUNDLE` env trio onto the agent container's runtime env.
- The agent's `HTTPS_PROXY` / `HTTP_PROXY` point at the mitmproxy sidecar (where they pointed at pipelock under PRD 0001).
- pipelock is otherwise unchanged. It continues to load the YAML PRD 0001 generates and runs its existing scanning pipeline; the addon talks to it via the same forward-proxy interface today's `test_pipelock_blocks_secret_post` uses.
- On bottle teardown the mitmproxy sidecar is removed and the ephemeral CA private key is gone with it.
- An HTTPS variant of `test_pipelock_blocks_secret_post` proves pipelock now blocks a credential POST over HTTPS rather than plain HTTP.
- An integration test proves a non-credential HTTPS GET through the chain returns the upstream's real response.
- The dry-run preflight (`start --dry-run`) shows the mitmproxy sidecar in both text and `--format=json` output. The JSON contract gains a reserved `egress.mitm: { "enabled": true, "ca_fingerprint": null }` block; fingerprint is always null at dry-run because the CA doesn't exist yet. Real launches emit a one-line stderr log: `claude-bottle: mitm ca fingerprint: ...`.

## Non-goals

- **Topology C**: extending pipelock itself to terminate TLS. The research note's recommended long-term shape, but substantial Go work plus the Apache-2.0-vs-ELv2 question. Deferred.
- **Topology D as canonical**: mitmproxy with a pipelock `/scan` HTTP endpoint.
  The addon in this PRD talks to pipelock via its existing forward-proxy interface; no upstream pipelock change needed.
- **Persistent or shared CA across bottles.** Each bottle gets a fresh CA generated by its own mitmproxy at startup.
- **Selective bumping ("ignore_hosts") as a v1 manifest field.** v1 bumps every CONNECT. If a future allowlisted host turns out to pin (Mobile / Chromium-style cert pinning), a follow-up PRD adds the per-host opt-out via `bottle.egress.tls_bump_ignore`. Strictly additive.
- **HTTP/3 / QUIC.** mitmproxy's HTTP/3 support is experimental. v1 relies on the v1-egress iptables layer blocking UDP/443 to force clients onto HTTP/2 over TCP, which mitmproxy 12 inspects natively (verified by spike).
- **Raw TCP / non-HTTP TLS interception.** mitmproxy supports it via `--mode reverse:`, not in CONNECT-bump mode. SSH and any future raw-TCP egress route around mitmproxy entirely.
- **Trust-store rewiring for non-Debian agent images.** The current `Dockerfile` is `node:22-slim` (Debian). If a future base switches to Red-Hat-family, the `update-ca-certificates` step becomes `update-ca-trust`. Out of scope until the base changes.
- **Response-body scanning.** Pipelock supports it; we don't wire it in v1 because the addon would need to ferry the upstream response back through pipelock's scanner, which the forward-proxy interface doesn't support cleanly. v2 candidate.
- **MCP scanning on the bumped path.** Only fires on MCP-formatted JSON-RPC payloads inside tool calls. Not relevant to plain HTTPS agent traffic and out of v1 scope.
- **Domain-fronting verification.** Once the addon sees the inner `Host` / `:authority`, comparing it to the outer CONNECT target catches domain fronting. Worth ~10 lines in the addon, but defer until the rest of v1 is settled.
- **Host-side openssl / `cryptography` for CA generation.** The research note's open question on this is resolved by letting mitmproxy itself generate the CA (it does so on first launch).
  No new host-side crypto.

## Scope

### In scope

- New `claude_bottle/mitmproxy/` package:
  - `__init__.py`: backend-agnostic. Constants (sidecar port, image-pin digest, the in-container addon path), the abstract `MitmproxyProxy` class with `prepare` / `start` / `stop` shape mirroring `PipelockProxy`, and the small helper that reads the CA fingerprint from a PEM file via `openssl x509 -fingerprint` shelled out.
  - `addon.py`: the Python addon mitmproxy loads. ~80–150 lines. For each `request` event: forward the decrypted request to pipelock at `http://claude-bottle-pipelock-:8888` as a plain HTTP forward-proxy call (absolute-URI form). Inspect pipelock's response. If status is 403 *and* the body matches pipelock's known block-event shape, set the flow's response to a 403 with pipelock's body and short-circuit. Otherwise, discard pipelock's response (and any wasted upstream-leg response from pipelock's forwarder) and let mitmproxy proceed to the real upstream.
- New `claude_bottle/backend/docker/mitmproxy.py`: `DockerMitmproxyProxy(MitmproxyProxy)` with the Docker-specific start/stop lifecycle. `start(plan)` does `docker create` / `docker cp addon.py …` / `docker network connect` / `docker start`, analogous to the existing `DockerPipelockProxy.start`. Injects `CLAUDE_BOTTLE_PIPELOCK_URL` into the sidecar env so the addon knows where pipelock lives.
- New provisioner `claude_bottle/backend/docker/provision/ca.py`. Polls mitmproxy for the cert file, copies it through a host stage dir into the agent, runs `update-ca-certificates` inside the agent, computes the SHA-256 fingerprint, and prints the one-line stderr log.
- `BottleBackend.provision_ca(plan, target)` joins the four existing provisioner methods on the abstract base. Default impl is no-op so other backends don't break when they don't yet implement TLS interception.
- `DockerBottlePlan` grows a `mitmproxy_plan` field mirroring the existing `proxy_plan`.
- Agent container `docker run` invocation:
  - `HTTPS_PROXY` / `HTTP_PROXY` change from the pipelock service name to the mitmproxy service name.
  - Three `-e` flags set the CA env trio so they're inherited by the eventual `docker exec claude` (Docker propagates run-time env into exec by default; fallback in Q1 below).
- Dry-run preflight rendering of the mitmproxy entry (text + JSON). JSON gains `egress.mitm: { "enabled": true, "ca_fingerprint": null }`.
- One stderr log line at launch with the CA fingerprint.
- Two new integration tests under `tests/integration/`:
  - `test_mitmproxy_blocks_secret_https_post.py`: HTTPS variant of the existing block-secret test. Asserts pipelock's body DLP fires on a credential POST tunneled through CONNECT.
  - `test_mitmproxy_allows_normal_https.py`: confirms a plain HTTPS GET on an allowlisted host returns the upstream response, isolating the addon's pass-through path from the block path.
- Unit tests for the addon's verdict logic (block vs allow on status + body shape, edge cases) using mitmproxy's `mitmproxy.test` flow fixtures. Unit tests for the proxy config builder (mirroring `tests/unit/test_pipelock_yaml.py`).

### Out of scope

- The v1 iptables + dnsmasq layer (separate PRD; see `network-egress-guard.md`). mitmproxy covers HTTP/HTTPS only; raw TCP, UDP, ICMP, and direct DNS still need the IP-level layer.
- Pipelock config changes. Pipelock continues to load the YAML PRD 0001 generates; the addon talks to it via the existing forward-proxy interface.
- A bottle-level toggle to skip mitmproxy entirely. v1 always wires it in.
- Pinning-host detection automation. The cost of finding out (per research) is a single 5-minute test before adding a host; it stays a manual step.
- Pipelock upstream contributions for an `X-Pipelock-Verdict` header. Possible follow-up. Until then the addon distinguishes blocks from passes via status + body fingerprint.
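
The verdict rule described under In scope (block only when the status is 403 and the body matches pipelock's block-event shape) can be kept as a pure function, which is what makes it unit-testable without a running proxy. A minimal sketch follows, under loud assumptions: the function name `is_pipelock_block` is hypothetical, and the body fingerprint (a JSON object with `event` and `scanner` keys) is only the shape the Open questions section guesses at; the real bytes still need to be pinned at implementation time.

```python
import json


def is_pipelock_block(status: int, body: bytes) -> bool:
    """Return True only for a response that looks like a pipelock block.

    ASSUMPTION: a block is a 403 whose body is a JSON object carrying both
    "event" and "scanner" keys. This fingerprint is a placeholder until the
    actual pipelock 403 body has been inspected.
    """
    if status != 403:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers JSONDecodeError and UnicodeDecodeError (both ValueError
        # subclasses): a non-JSON 403 is treated as an upstream response
        # relayed by pipelock's forwarder, not as a block.
        return False
    return isinstance(payload, dict) and "event" in payload and "scanner" in payload


# Hypothetical sample bodies, for illustration only.
pipelock_style = b'{"event": "blocked", "scanner": "body_dlp"}'
upstream_style = b"<html><body>403 Forbidden</body></html>"

print(is_pipelock_block(403, pipelock_style))
print(is_pipelock_block(403, upstream_style))
print(is_pipelock_block(200, pipelock_style))
```

Anything that is not a confirmed block falls through to the allow path, so a relayed upstream 403 with an HTML or empty body never short-circuits the flow. The addon's `request` hook would then only need to call this function and set the flow's response when it returns true.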
## Proposed Design

### Topology

```
agent --HTTPS_PROXY--> mitmproxy --addon--> pipelock (scan)
                       (bump TLS)  |   ^
                                   |   (verdict via status code)
                                   |
                                   v
                                   +-- on allow ----- real upstream
                                                      (mitmproxy as client)
```

All three containers live on the same per-bottle internal Docker network. mitmproxy and pipelock are both attached to the per-bottle egress bridge for real-internet reach; the agent has no default route.

Concretely:

- Agent sets `HTTPS_PROXY=http://claude-bottle-mitm-:`. PRD 0001 had this pointing at pipelock; the hostname swap is the only agent-side env change.
- mitmproxy runs in **`regular`** mode (default; no `--mode` flag). It bumps every CONNECT, generates fake leaf certs signed by its own CA, and presents them to the agent.
- The addon, loaded via `mitmdump -s /addon/addon.py`, intercepts each decrypted `request` event. It forwards the request to pipelock at `http://claude-bottle-pipelock-:8888` as a plain HTTP forward-proxy call (absolute-URI form), so pipelock sees the full URL, headers, and body.
- The addon inspects pipelock's response. If status is 403 *and* the response body matches pipelock's known block-event shape, the addon sets the mitmproxy flow's response to a 403 with pipelock's body and short-circuits. Otherwise (including the case where pipelock's forwarder attempted the upstream and got a 4xx) the addon discards pipelock's response and lets mitmproxy proceed to the real upstream.
- mitmproxy completes the outbound TLS to the real destination using its built-in trust store, just like any other forward proxy. Pipelock is only involved as a scanner.

The trade-off: pipelock makes a wasted upstream forward attempt for every allowed request (it tries to forward over plain HTTP to a real HTTPS-only host, which fails with the upstream's 4xx). This is benign: the scan completes before forwarding, the verdict reaches the addon, the upstream-side request happens to die in pipelock's forwarder rather than reach the agent.
Acceptable cost for the visibility win. A pipelock-side improvement (skip the forward when the addon only needs the scan verdict) is a future optimization.

### New components

- `claude_bottle/mitmproxy/__init__.py`: backend-agnostic abstract base, constants, the `openssl x509 -fingerprint` helper.
- `claude_bottle/mitmproxy/addon.py`: the scanning addon. Reads pipelock's URL from `CLAUDE_BOTTLE_PIPELOCK_URL` (injected into the sidecar env by the proxy's `start`). For each `request` flow: synchronously forward the request to pipelock; inspect status + body; either short-circuit with 403 or fall through.
- `claude_bottle/backend/docker/mitmproxy.py`: `DockerMitmproxyProxy(MitmproxyProxy)` with start/stop, the `docker cp` of the addon into the sidecar before `docker start`, and the `CLAUDE_BOTTLE_PIPELOCK_URL` wiring.

### CA lifecycle

Simplified by letting mitmproxy own the generation:

- **Generation.** mitmproxy generates a fresh CA on startup inside its container at `/home/mitmproxy/.mitmproxy/mitmproxy-ca-cert.pem` (public) + `mitmproxy-ca.pem` (private). No host-side openssl for *generation*; no host-side Python `cryptography` dep.
- **Volume strategy.** Container-internal only. No host bind mount means the CA dies with the container.
- **Extraction.** `provision_ca` polls (~1s) for the cert file via `docker exec`, then `docker cp` to host stage dir, then `docker cp` into the agent. Host stage dir gets cleaned up by the existing `start.py` `finally` block.
- **Bottle install.**
  1. `docker cp /mitm-ca.crt agent-:/usr/local/share/ca-certificates/claude-bottle-mitm.crt`
  2. `docker exec -u 0 agent- chmod 644 …`
  3. `docker exec -u 0 agent- update-ca-certificates`
  4. Three `-e` flags on `docker run` set the env trio (`NODE_EXTRA_CA_CERTS=…/claude-bottle-mitm.crt`, `SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt`, `REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt`) so `docker exec claude` inherits them.
- **Teardown.** Sidecar container removed; CA private key gone.
- **Fingerprint.** Computed post-extraction via shelled-out `openssl x509 -fingerprint -sha256 -noout`. Logged once to stderr at launch; never the private key.

### Data model changes

None to the manifest schema. The dry-run JSON contract gains a reserved `egress.mitm: { "enabled": true, "ca_fingerprint": null }` block. Fingerprint is always null at dry-run (CA doesn't exist yet) but the field is reserved so future schema additions stay non-breaking.

A future selective-bump knob would add `bottle.egress.tls_bump_ignore: [host, ...]` per the research note. Strictly additive when it lands.

### Existing code touched

- **`claude_bottle/backend/docker/launch.py`**: bring up the mitmproxy sidecar between pipelock and the agent. Repoint the agent's `HTTPS_PROXY` / `HTTP_PROXY` env flags to mitmproxy. Register an `ExitStack` callback for mitmproxy teardown. Print the CA fingerprint once the sidecar reports ready.
- **`claude_bottle/backend/docker/prepare.py`**: call into `MitmproxyProxy.prepare(...)` alongside `PipelockProxy.prepare(...)`, populate `DockerBottlePlan.mitmproxy_plan`.
- **`claude_bottle/backend/docker/backend.py`**: add the `DockerMitmproxyProxy` instance attribute (`self._mitm`) and thread it through `launch` + cleanup, mirroring `self._proxy`.
- **`claude_bottle/backend/docker/bottle_plan.py`**: new `mitmproxy_plan` field. `print()` and `to_dict()` learn to render the mitmproxy entry and the `egress.mitm` JSON block.
- **`claude_bottle/backend/__init__.py`**: abstract `BottleBackend.provision_ca` joins the four existing provisioners; default no-op.
- **`tests/integration/`**: two new tests as described above.
- **`tests/unit/`**: addon-verdict tests, mitmproxy-config builder tests, dry-run-plan test updated for the new `egress.mitm` block.

### External dependencies

- **mitmproxy Docker image** pinned by digest on the `12.x` line. Bumped deliberately, mirroring the pipelock pin. Verified by spike to speak h2 on both halves.
- No new host-side runtimes. mitmproxy generates the CA; fingerprint via the `openssl` already present on Debian / macOS / ubuntu-latest runners.

## Open questions

(rewritten: most of the original v1 questions are now closed by the walkthrough spikes; what remains is addon-implementation specifics worth pinning during the first impl turn.)

- **Pipelock's 403-body fingerprint.** The addon needs to distinguish a pipelock block (DLP / host) from a real-upstream 4xx that pipelock's forwarder relayed back. Most likely shape: pipelock's 403 response carries a JSON body with `event` / `scanner` fields, whereas a real-upstream 4xx carries whatever the upstream sent. Pin the exact fingerprint by inspecting pipelock's actual 403 body bytes at impl time. Long-term cleanup: file an upstream feature request for an `X-Pipelock-Verdict: block` response header so the addon can read a structured signal instead of pattern-matching the body.
- **Docker run env-var inheritance through docker exec.** Plan assumes `docker run -e VAR=value` propagates to subsequent `docker exec` invocations. The Docker docs say so; not yet empirically pinned on this project's runner setup. Verify in the first impl turn. Trivial fallback: thread the three `-e` flags onto every `DockerBottle.exec*` call.
- **Addon synchronous-call latency.** The addon makes a sync HTTP call to pipelock per outbound flow. Pipelock is on the same internal Docker network; expected per-call latency is well under 10ms. Confirm under the parallel-request load Claude Code generates (most likely a non-issue; Claude is single-stream request-wise).
- **Addon test fixtures.** mitmproxy ships `mitmproxy.test` with flow fixtures; addons can be unit-tested without a running proxy. Confirm the import path and recommended fixture shape at impl time; structure the addon so the verdict-decision is a pure function that's trivially testable in isolation from any HTTP I/O.
- **Pipelock allowing the addon's forwarded request through.** pipelock will see the addon's request as coming from the mitmproxy sidecar's IP on the internal network. Confirm pipelock has no client-IP allowlist that would reject these. Likely fine: pipelock's `client_ip` is informational in the scan event, not a gate.

## References

- `docs/research/tls-mitm-for-pipelock.md`: primary source. This PRD implements a variant of §Recommendation (Topology A) after the spike documented under "Open questions" §1 falsified the `upstream` mode assumption.
- `docs/research/pipelock-assessment.md` §Scope gaps: names the TLS-inspection gap closed here.
- `docs/prds/0001-per-agent-egress-proxy-via-pipelock.md`: egress-proxy baseline this PRD extends.
- `docs/prds/0003-bottle-backend-abstraction.md`: backend ABC contract this PRD adds a `provision_ca` method to.
- `docs/prds/0004-split-out-provisioners.md`: per-provisioner module pattern reused for the new CA provisioner.
- mitmproxy: ,
- mitmproxy modes:
- mitmproxy CA cert installation:
- mitmproxy addon API:
- Node `NODE_EXTRA_CA_CERTS`: