From c2eacac49f926eee6d65ef42a63845865ad096ab Mon Sep 17 00:00:00 2001
From: didericis
Date: Tue, 12 May 2026 12:54:27 -0400
Subject: [PATCH] docs(prd): update 0005 after open-question walkthrough
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Re-grounds the design after walking the eight original open questions
interactively. Two structural changes:

- Topology A → A'. A spike confirmed mitmproxy's `upstream` mode
  re-wraps decrypted flows in a new CONNECT to the upstream proxy,
  which would have left pipelock seeing only ciphertext (the very gap
  this PRD set out to close). The fix is to run mitmproxy in `regular`
  mode and ship a vendored Python addon that forwards each decrypted
  request to pipelock as a plain HTTP forward-proxy call. Pipelock is
  unchanged.

- mitmproxy owns CA generation. The research note's preference for a
  host-side openssl / cryptography CA turned out to be unnecessary:
  mitmproxy generates a fresh CA on startup; the public cert is
  `docker cp`'d into the agent. No new host-side crypto deps. Dry-run
  can't render a fingerprint (CA doesn't exist yet); launches print it
  once to stderr.

Other Q3–Q8 resolutions folded in: Debian-base `update-ca-certificates`
confirmed, mitmproxy 12 verified to speak h2 on both halves,
selective-bump deferred to v2, response-body and MCP scanning deferred
to v2, domain-fronting deferred to v2.

Open questions rewritten: what remains is addon-implementation
specifics (pipelock 403-body fingerprint, env-var inheritance through
docker exec, addon test fixtures).

Co-Authored-By: Claude Opus 4.7 --- docs/prds/0005-mitmproxy-tls-interception.md | 532 +++++++++++-------- 1 file changed, 299 insertions(+), 233 deletions(-) diff --git a/docs/prds/0005-mitmproxy-tls-interception.md b/docs/prds/0005-mitmproxy-tls-interception.md index e0f3d95..89f2a56 100644 --- a/docs/prds/0005-mitmproxy-tls-interception.md +++ b/docs/prds/0005-mitmproxy-tls-interception.md @@ -1,20 +1,30 @@ # PRD 0005: mitmproxy TLS interception for pipelock content scanning -- **Status:** Draft +- **Status:** Draft (updated 2026-05-12 after open-question walkthrough) - **Author:** didericis - **Created:** 2026-05-12 ## Summary Add a per-bottle **mitmproxy** sidecar in front of pipelock on the -egress path so pipelock's DLP, subdomain-entropy, and MCP scanners -fire on the plaintext bodies of HTTPS requests instead of only the -opaque ciphertext that follows a `CONNECT`. mitmproxy terminates the -agent's TLS, hands plaintext HTTP to pipelock as an upstream -forward proxy, and re-establishes TLS to the real destination. A -fresh ephemeral CA is minted per bottle; the CA private key never -leaves the sidecar, and the public cert is wired into the agent -container's trust store at launch. +egress path. mitmproxy bumps the agent's TLS CONNECT, decrypts the +inner HTTP, and hands each request to a vendored Python addon. The +addon forwards the decrypted request to pipelock as a plain HTTP +forward-proxy call so pipelock's DLP, URL-scan, and header-scan +layers fire on real bodies. On the verdict, the addon either +short-circuits the flow with a 403 (block) or lets mitmproxy +proceed to the real upstream (allow). mitmproxy itself generates +the ephemeral per-bottle CA on startup; the public cert is copied +into the agent's trust store and the private key dies with the +sidecar on teardown. 
+ +This is Topology A' from `docs/research/tls-mitm-for-pipelock.md` — +a variant of the research note's Topology A after a spike showed +mitmproxy's `upstream` mode re-wraps decrypted flows in a new +CONNECT to the upstream proxy (which would defeat the entire +point). The addon recovers the design by emitting plain HTTP to +pipelock explicitly instead of relying on mitmproxy's `upstream` +chaining. ## Problem @@ -45,7 +55,8 @@ slips past the scanner. `pipelock-assessment.md` §Scope gaps names this as a known limitation of the proxy-without-TLS-inspection shape. Closing it is the explicit motivation for `tls-mitm-for-pipelock.md`, whose -recommendation this PRD implements. +recommendation this PRD implements (with the addon adjustment +forced by the upstream-mode spike). ## Goals / Success Criteria @@ -53,306 +64,361 @@ The feature works when all of the following are observable: - A Node request from inside a launched bottle to a CONNECT-bumped HTTPS host (e.g. `https://api.anthropic.com/dlp-probe`) carrying a - pipelock-recognized credential pattern in the body returns 403 from - the proxy, not a response from the upstream. The existing - `test_pipelock_blocks_secret_post` test path becomes the HTTPS - variant of this assertion. + pipelock-recognized credential pattern in the body returns 403 + from the bottle's egress chain — not a response from the upstream. + The existing `test_pipelock_blocks_secret_post` test path becomes + the HTTPS variant of this assertion. +- A plain HTTPS GET from inside the bottle to an allowlisted host + with no credential pattern (e.g. `GET https://raw.githubusercontent.com/...`) + returns the real upstream response — the addon doesn't break + clean traffic. - Claude Code itself reaches `api.anthropic.com` end-to-end through the bottle and completes a chat round-trip. No TLS-trust errors in the agent process. 
-- mitmproxy's TLS-handshake log lines and pipelock's `body_dlp` - event lines both appear for the same outbound request, confirming - the two-stage path is active. +- mitmproxy's flow log and pipelock's `body_dlp` / `header_dlp` / + `core_dlp` event lines both appear for the same outbound request, + confirming the two-stage path is active. The feature is **done** when all of the following ship: - A new `MitmproxyProxy` class with the same `prepare` / `start` / `stop` lifecycle shape as `PipelockProxy`, wired into the Docker backend's launch step. -- The bottle launch step generates a per-bottle ephemeral CA in - `stage_dir`, starts the mitmproxy sidecar with that CA on the - per-bottle internal network, copies the CA public cert into the - agent container's trust store, and points the agent's - `HTTPS_PROXY` / `HTTP_PROXY` at mitmproxy. -- mitmproxy's upstream is the existing pipelock sidecar; pipelock - sees plaintext HTTP from mitmproxy for every previously-HTTPS - request. +- A vendored Python addon at `claude_bottle/mitmproxy/addon.py` + that mitmproxy loads on startup via `mitmdump -s ...`. The sidecar + runs in `regular` mode (default), not `upstream` mode. +- The bottle launch step starts the mitmproxy sidecar, waits for + the sidecar-internal CA to be generated, copies the CA public + cert into the agent at `/usr/local/share/ca-certificates/claude-bottle-mitm.crt`, + runs `update-ca-certificates` inside the agent, and threads the + `NODE_EXTRA_CA_CERTS` / `SSL_CERT_FILE` / `REQUESTS_CA_BUNDLE` + env trio onto the agent container's runtime env. +- The agent's `HTTPS_PROXY` / `HTTP_PROXY` point at the mitmproxy + sidecar (where they pointed at pipelock under PRD 0001). +- pipelock is otherwise unchanged. It continues to load the YAML + PRD 0001 generates and runs its existing scanning pipeline; the + addon talks to it via the same forward-proxy interface today's + `test_pipelock_blocks_secret_post` uses. 
- On bottle teardown the mitmproxy sidecar is removed and the ephemeral CA private key is gone with it. -- An integration test (variant of `test_pipelock_blocks_secret_post`) - proves pipelock now blocks a credential POST that goes out over - HTTPS rather than plain HTTP. -- An integration test proves a non-credential HTTPS request to an - allowlisted host (e.g. CONNECT-then-GET on `raw.githubusercontent.com`) - succeeds end-to-end with mitmproxy in the path (no TLS-trust - errors, response body received). +- An HTTPS variant of `test_pipelock_blocks_secret_post` proves + pipelock now blocks a credential POST over HTTPS rather than + plain HTTP. +- An integration test proves a non-credential HTTPS GET through + the chain returns the upstream's real response. - The dry-run preflight (`start --dry-run`) shows the mitmproxy - sidecar in both the text and `--format=json` output alongside the - existing pipelock entry. + sidecar in both text and `--format=json` output. The JSON + contract gains a reserved `egress.mitm: { "enabled": true, "ca_fingerprint": null }` + block; fingerprint is always null at dry-run because the CA + doesn't exist yet. Real launches emit a one-line stderr log: + `claude-bottle: mitm ca fingerprint: ...`. ## Non-goals -- **Topology C** — extending pipelock itself to terminate TLS. That - is the cleanest long-term shape per the research note's - recommendation but is substantial Go work and hits the - Apache-2.0-vs-ELv2 question. Deferred. -- **Topology D** — driving mitmproxy with a pipelock `/scan` HTTP - endpoint. Requires a pipelock surface that doesn't exist today. - Deferred. +- **Topology C** — extending pipelock itself to terminate TLS. The + research note's recommended long-term shape, but substantial Go + work plus the Apache-2.0-vs-ELv2 question. Deferred. +- **Topology D as canonical** — mitmproxy with a pipelock `/scan` + HTTP endpoint. 
The addon in this PRD talks to pipelock via its + existing forward-proxy interface; no upstream pipelock change + needed. - **Persistent or shared CA across bottles.** Each bottle gets a - fresh CA generated at start and destroyed at teardown. No CA - storage on the host, no cross-bottle reuse. + fresh CA generated by its own mitmproxy at startup. - **Selective bumping ("ignore_hosts") as a v1 manifest field.** - v1 bumps every CONNECT. If a future allowlisted host turns out to - pin (Mobile / Chromium-style cert pinning), a follow-up PRD adds - the per-host opt-out — likely a `bottle.egress.tls_bump_ignore` - field. See Open questions. + v1 bumps every CONNECT. If a future allowlisted host turns out + to pin (Mobile / Chromium-style cert pinning), a follow-up PRD + adds the per-host opt-out via `bottle.egress.tls_bump_ignore`. + Strictly additive. - **HTTP/3 / QUIC.** mitmproxy's HTTP/3 support is experimental. - v1 relies on the v1-egress iptables layer (separate PRD) blocking - UDP/443 to force clients onto HTTP/2 over TCP, which mitmproxy - inspects normally. + v1 relies on the v1-egress iptables layer blocking UDP/443 to + force clients onto HTTP/2 over TCP, which mitmproxy 12 inspects + natively (verified by spike). - **Raw TCP / non-HTTP TLS interception.** mitmproxy supports it via `--mode reverse:`, not in CONNECT-bump mode. SSH and any future raw-TCP egress route around mitmproxy entirely. -- **Trust-store rewiring for non-Debian agent base images.** The +- **Trust-store rewiring for non-Debian agent images.** The current `Dockerfile` is `node:22-slim` (Debian). If a future base switches to Red-Hat-family, the `update-ca-certificates` step becomes `update-ca-trust`. Out of scope until the base changes. +- **Response-body scanning.** Pipelock supports it; we don't wire + it in v1 because the addon would need to ferry the upstream + response back through pipelock's scanner, which the forward- + proxy interface doesn't support cleanly. v2 candidate. 
+- **MCP scanning on the bumped path.** Only fires on MCP-formatted + JSON-RPC payloads inside tool calls. Not relevant to plain HTTPS + agent traffic and out of v1 scope. +- **Domain-fronting verification.** Once the addon sees the inner + `Host` / `:authority`, comparing it to the outer CONNECT target + catches domain fronting. Worth ~10 lines in the addon, but + defer until the rest of v1 is settled. +- **Host-side openssl / `cryptography` for CA generation.** The + research note's open question on this is resolved by letting + mitmproxy itself generate the CA (it does so on first launch). + No new host-side crypto. ## Scope ### In scope -- New `claude_bottle/mitmproxy.py` mirroring `claude_bottle/pipelock.py`: - config helpers (no backend-specific Docker calls), the - `MitmproxyProxy` abstract class, and the per-bottle CA generation - helpers. -- New `claude_bottle/backend/docker/mitmproxy.py` mirroring - `claude_bottle/backend/docker/pipelock.py`: `DockerMitmproxyProxy` - with the Docker-specific `start` / `stop` lifecycle, the sidecar - container name scheme, and the image pin. -- New provisioner: `claude_bottle/backend/docker/provision/ca.py`, - installing the CA public cert into the agent container at - `/usr/local/share/ca-certificates/claude-bottle-mitm.crt`, running - `update-ca-certificates`, and exporting `NODE_EXTRA_CA_CERTS` / - `SSL_CERT_FILE` / `REQUESTS_CA_BUNDLE` env vars to the agent - process. The provisioner runs from `BottleBackend.provision` in - the same orchestration as `prompt`, `skills`, `ssh`, `git`. 
-- Per-agent network reshuffle in `DockerBottleBackend.launch`: - - internal network is unchanged (mitmproxy + pipelock + agent) - - agent's `HTTPS_PROXY` / `HTTP_PROXY` change from pointing at the - pipelock service name to the mitmproxy service name - - mitmproxy's `upstream_proxy` config points at the pipelock - service name on the internal network -- `DockerBottlePlan` grows a `mitmproxy_plan` field analogous to the - existing `proxy_plan` (the pipelock one) so prepare-time state - rides on the plan. -- Dry-run preflight (`start --dry-run` text + JSON) renders the - mitmproxy line and surfaces the CA fingerprint shown in the - bottle's trust store, so the operator can verify what's been - installed. +- New `claude_bottle/mitmproxy/` package: + - `__init__.py` — backend-agnostic. Constants (sidecar port, + image-pin digest, the in-container addon path), the abstract + `MitmproxyProxy` class with `prepare` / `start` / `stop` shape + mirroring `PipelockProxy`, and the small helper that reads the + CA fingerprint from a PEM file via `openssl x509 -fingerprint` + shelled out. + - `addon.py` — the Python addon mitmproxy loads. ~80–150 lines. + For each `request` event: forward the decrypted request to + pipelock at `http://claude-bottle-pipelock-:8888` as a + plain HTTP forward-proxy call (absolute-URI form). Inspect + pipelock's response. If status is 403 *and* the body matches + pipelock's known block-event shape, set the flow's response to + a 403 with pipelock's body and short-circuit. Otherwise, + discard pipelock's response (and any wasted upstream-leg + response from pipelock's forwarder) and let mitmproxy proceed + to the real upstream. +- New `claude_bottle/backend/docker/mitmproxy.py` — + `DockerMitmproxyProxy(MitmproxyProxy)` with the Docker-specific + start/stop lifecycle. `start(plan)` does `docker create` / + `docker cp addon.py …` / `docker network connect` / `docker start`, + analogous to the existing `DockerPipelockProxy.start`. 
Injects + `CLAUDE_BOTTLE_PIPELOCK_URL` into the sidecar env so the addon + knows where pipelock lives. +- New provisioner `claude_bottle/backend/docker/provision/ca.py`. + Polls mitmproxy for the cert file, copies it through a host + stage dir into the agent, runs `update-ca-certificates` inside + the agent, computes the SHA-256 fingerprint, and prints the + one-line stderr log. +- `BottleBackend.provision_ca(plan, target)` joins the four + existing provisioner methods on the abstract base. Default impl + is no-op so other backends don't break when they don't yet + implement TLS interception. +- `DockerBottlePlan` grows a `mitmproxy_plan` field mirroring the + existing `proxy_plan`. +- Agent container `docker run` invocation: + - `HTTPS_PROXY` / `HTTP_PROXY` change from the pipelock service + name to the mitmproxy service name. + - Three `-e` flags set the CA env trio so they're inherited by + the eventual `docker exec claude` (Docker propagates run-time + env into exec by default; fallback in Q1 below). +- Dry-run preflight rendering of the mitmproxy entry (text + JSON). + JSON gains `egress.mitm: { "enabled": true, "ca_fingerprint": null }`. +- One stderr log line at launch with the CA fingerprint. - Two new integration tests under `tests/integration/`: - - `test_mitmproxy_blocks_secret_https_post.py` — the HTTPS - variant of the existing `_blocks_secret_post` test. + - `test_mitmproxy_blocks_secret_https_post.py` — HTTPS variant + of the existing block-secret test. Asserts pipelock's body + DLP fires on a credential POST tunneled through CONNECT. - `test_mitmproxy_allows_normal_https.py` — confirms a plain - HTTPS GET to a non-credential-bearing path through mitmproxy + - pipelock returns the upstream response, asserting no trust / - handshake breakage. -- Unit tests for the new config builder (mirroring the pipelock - YAML unit tests) and for the CA generation helper. 
+ HTTPS GET on an allowlisted host returns the upstream response, + isolating the addon's pass-through path from the block path. +- Unit tests for the addon's verdict logic (block vs allow on + status + body shape, edge cases) using mitmproxy's `mitmproxy.test` + flow fixtures. Unit tests for the proxy config builder + (mirroring `tests/unit/test_pipelock_yaml.py`). ### Out of scope - The v1 iptables + dnsmasq layer (separate PRD; see - `network-egress-guard.md`). mitmproxy covers HTTP/HTTPS only. - Raw TCP, UDP, ICMP, and direct DNS still need the IP-level layer. -- Pipelock config changes. Pipelock continues to load the YAML PRD - 0001 already generates. mitmproxy is opaque to it; pipelock just - sees plain HTTP from a forward-proxy client. -- A bottle-level toggle to skip mitmproxy entirely. v1 always wires - it in. If a use case appears for an unintercepted bottle - (e.g. testing pipelock's CONNECT-mode behavior in isolation), - that's a follow-up. + `network-egress-guard.md`). mitmproxy covers HTTP/HTTPS only; + raw TCP, UDP, ICMP, and direct DNS still need the IP-level layer. +- Pipelock config changes. Pipelock continues to load the YAML + PRD 0001 generates; the addon talks to it via the existing + forward-proxy interface. +- A bottle-level toggle to skip mitmproxy entirely. v1 always + wires it in. - Pinning-host detection automation. The cost of finding out (per - the research note) is a single 5-minute test before adding a - host; it stays a manual step. + research) is a single 5-minute test before adding a host; it + stays a manual step. +- Pipelock upstream contributions for an `X-Pipelock-Verdict` header. + Possible follow-up. Until then the addon distinguishes blocks + from passes via status + body fingerprint. 
## Proposed Design ### Topology ``` -agent --HTTPS_PROXY--> mitmproxy --HTTP_PROXY--> pipelock --> internet - (bump TLS) (scan plain) (real TLS) +agent --HTTPS_PROXY--> mitmproxy --addon--> pipelock (scan) + (bump TLS) | + ^ | (verdict via status code) + | v + +-- on allow ----- real upstream + (mitmproxy as client) ``` All three containers live on the same per-bottle internal Docker network. mitmproxy and pipelock are both attached to the per-bottle -egress bridge so they can reach the host network; the agent has no -default route, exactly as today. +egress bridge for real-internet reach; the agent has no default +route. Concretely: -- `agent` sets `HTTPS_PROXY=http://claude-bottle-mitm-:`. - Currently this points at `claude-bottle-pipelock-`. The - hostname swap is the only agent-side env change. -- `mitmproxy` runs with `--mode upstream:http://claude-bottle-pipelock-:` - so its decrypted plaintext is forwarded to pipelock as a regular - upstream forward-proxy request. (Research open question #1 calls - this out: mitmproxy 10+ documentation says `upstream` mode forwards - the original request shape; verify against the pinned version at - implementation time. If forwarding wraps a new CONNECT, fall back - to `regular` mode with a chained proxy declared in mitmproxy's - config and route plain HTTP to pipelock by hand.) -- `pipelock` continues to listen on its existing port and receives - plain HTTP from mitmproxy. No pipelock config change. +- Agent sets `HTTPS_PROXY=http://claude-bottle-mitm-:`. + PRD 0001 had this pointing at pipelock; the hostname swap is the + only agent-side env change. +- mitmproxy runs in **`regular`** mode (default; no `--mode` flag). + It bumps every CONNECT, generates fake leaf certs signed by its + own CA, and presents them to the agent. +- The addon, loaded via `mitmdump -s /addon/addon.py`, intercepts + each decrypted `request` event. 
It forwards the request to + pipelock at `http://claude-bottle-pipelock-:8888` as a + plain HTTP forward-proxy call (absolute-URI form), so pipelock + sees the full URL, headers, and body. +- The addon inspects pipelock's response. If status is 403 *and* + the response body matches pipelock's known block-event shape, + the addon sets the mitmproxy flow's response to a 403 with + pipelock's body and short-circuits. Otherwise — including the + case where pipelock's forwarder attempted the upstream and got + a 4xx — the addon discards pipelock's response and lets + mitmproxy proceed to the real upstream. +- mitmproxy completes the outbound TLS to the real destination + using its built-in trust store, just like any other forward + proxy. Pipelock is only involved as a scanner. + +The trade-off: pipelock makes a wasted upstream forward attempt +for every allowed request (it tries to forward over plain HTTP to +a real HTTPS-only host, which fails with the upstream's 4xx). This +is benign — the scan completes before forwarding, the verdict +reaches the addon, the upstream-side request happens to die in +pipelock's forwarder rather than reach the agent. Acceptable cost +for the visibility win. A pipelock-side improvement (skip the +forward when the addon only needs the scan verdict) is a future +optimization. ### New components -Two new modules, matching PRD 0001's split between -backend-agnostic config and backend-specific lifecycle: - -- **`claude_bottle/mitmproxy.py`** — backend-agnostic. The config - builder (mitmproxy YAML / TOML — confirm format), the abstract - `MitmproxyProxy` class with `prepare(...)` writing the config and - the ephemeral CA into `stage_dir`, the CA generation helper - (RSA-2048 or ECDSA-P256 — pick at impl time, research suggests - ECDSA for cert-gen speed), and constants for the sidecar's - internal-network port and image pin. -- **`claude_bottle/backend/docker/mitmproxy.py`** — Docker - implementation. 
`DockerMitmproxyProxy(MitmproxyProxy)` with - `start(plan)` doing `docker create` / `docker cp` / `docker - network connect` / `docker start` analogous to - `DockerPipelockProxy.start`. `stop(target)` removes the sidecar - idempotently. - -The provisioner that installs the CA cert into the agent's trust -store lives at `claude_bottle/backend/docker/provision/ca.py` and -plugs into the existing `BottleBackend.provision` orchestration. The -abstract `BottleBackend.provision_ca` method joins -`provision_prompt` / `provision_skills` / `provision_ssh` / -`provision_git` on the base class (PRD 0004's pattern), with a -default no-op implementation so other backends don't break when -they don't yet implement it. +- `claude_bottle/mitmproxy/__init__.py` — backend-agnostic + abstract base, constants, the `openssl x509 -fingerprint` helper. +- `claude_bottle/mitmproxy/addon.py` — the scanning addon. + Reads pipelock's URL from `CLAUDE_BOTTLE_PIPELOCK_URL` (injected + into the sidecar env by the proxy's `start`). For each + `request` flow: synchronously POST to pipelock; inspect status + + body; either short-circuit with 403 or fall through. +- `claude_bottle/backend/docker/mitmproxy.py` — + `DockerMitmproxyProxy(MitmproxyProxy)` with start/stop, the + `docker cp` of the addon into the sidecar before `docker start`, + and the `CLAUDE_BOTTLE_PIPELOCK_URL` wiring. ### CA lifecycle -Per `tls-mitm-for-pipelock.md` §CA lifecycle: +Simplified by letting mitmproxy own the generation: -- **Generation.** Host-side in `MitmproxyProxy.prepare`, written to - `stage_dir/mitm-ca.key` (mode 600) and `stage_dir/mitm-ca.crt` - (mode 644). The `.key` is copied into the mitmproxy container at - start; nothing else touches it. 
-- **Bottle injection.** `provision_ca` copies only the public - `.crt` into the agent container at - `/usr/local/share/ca-certificates/claude-bottle-mitm.crt`, runs - `update-ca-certificates` as root inside the container, and sets - `NODE_EXTRA_CA_CERTS=/usr/local/share/ca-certificates/claude-bottle-mitm.crt`, - `SSL_CERT_FILE`, and `REQUESTS_CA_BUNDLE` for the agent process. - Belt-and-suspenders because some libraries honor only env vars. -- **Teardown.** The mitmproxy sidecar container is destroyed; the - CA key vanishes with it. Nothing persists on the host outside - `stage_dir`, which the start command already deletes in its - finally block. -- **Cost.** ECDSA-P256 CA + per-host leaf generation runs in - milliseconds; the per-bottle Docker pull and network plumbing - dominate startup time. +- **Generation.** mitmproxy generates a fresh CA on startup + inside its container at `/home/mitmproxy/.mitmproxy/mitmproxy-ca-cert.pem` + (public) + `mitmproxy-ca.pem` (private). No host-side openssl + for *generation*; no host-side Python `cryptography` dep. +- **Volume strategy.** Container-internal only. No host bind + mount means the CA dies with the container. +- **Extraction.** `provision_ca` polls (~1s) for the cert file + via `docker exec`, then `docker cp` to host stage dir, then + `docker cp` into the agent. Host stage dir gets cleaned up by + the existing `start.py` `finally` block. +- **Bottle install.** + 1. `docker cp /mitm-ca.crt agent-:/usr/local/share/ca-certificates/claude-bottle-mitm.crt` + 2. `docker exec -u 0 agent- chmod 644 …` + 3. `docker exec -u 0 agent- update-ca-certificates` + 4. Three `-e` flags on `docker run` set the env trio + (`NODE_EXTRA_CA_CERTS=…/claude-bottle-mitm.crt`, + `SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt`, + `REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt`) so + `docker exec claude` inherits them. +- **Teardown.** Sidecar container removed; CA private key gone. 
+- **Fingerprint.** Computed post-extraction via shelled-out + `openssl x509 -fingerprint -sha256 -noout`. Logged once to + stderr at launch; never the private key. ### Data model changes -None in v1. The manifest schema is unchanged. mitmproxy is always -on for every bottle once this PRD ships. +None to the manifest schema. The dry-run JSON contract gains a +reserved `egress.mitm: { "enabled": true, "ca_fingerprint": null }` +block. Fingerprint is always null at dry-run (CA doesn't exist +yet) but the field is reserved so future schema additions stay +non-breaking. -A future selective-bump knob (per `tls-mitm-for-pipelock.md` open -question #5) would land on `bottle.egress.tls_bump_ignore` as a -list of hostnames. The shape mirrors `egress.allowlist`. Adding it -later is a strictly additive change. +A future selective-bump knob would add +`bottle.egress.tls_bump_ignore: [host, ...]` per the research +note. Strictly additive when it lands. ### Existing code touched - **`claude_bottle/backend/docker/launch.py`** — bring up the - mitmproxy sidecar after the pipelock sidecar but before the agent - container, repoint the agent's `HTTPS_PROXY` / `HTTP_PROXY` env - flags, register an `ExitStack` callback to stop mitmproxy on - teardown. + mitmproxy sidecar between pipelock and the agent. Repoint the + agent's `HTTPS_PROXY` / `HTTP_PROXY` env flags to mitmproxy. + Register an `ExitStack` callback for mitmproxy teardown. Print + the CA fingerprint once the sidecar reports ready. - **`claude_bottle/backend/docker/prepare.py`** — call into - `MitmproxyProxy.prepare(...)` alongside the existing - `PipelockProxy.prepare(...)`, populate - `DockerBottlePlan.mitmproxy_plan`. + `MitmproxyProxy.prepare(...)` alongside `PipelockProxy.prepare(...)`, + populate `DockerBottlePlan.mitmproxy_plan`. 
- **`claude_bottle/backend/docker/backend.py`** — add the `DockerMitmproxyProxy` instance attribute (`self._mitm`) and - thread it through `launch` + cleanup, mirroring the existing - `self._proxy` pattern. + thread it through `launch` + cleanup, mirroring `self._proxy`. - **`claude_bottle/backend/docker/bottle_plan.py`** — new - `mitmproxy_plan: MitmproxyProxyPlan` field on - `DockerBottlePlan`. `print()` and `to_dict()` learn to render it. + `mitmproxy_plan` field. `print()` and `to_dict()` learn to + render the mitmproxy entry and the `egress.mitm` JSON block. - **`claude_bottle/backend/__init__.py`** — abstract - `BottleBackend.provision_ca(plan, target)` joins the other four - provisioners. Default impl is a no-op (so a future fly backend - isn't forced to implement TLS interception in v1). + `BottleBackend.provision_ca` joins the four existing + provisioners; default no-op. - **`tests/integration/`** — two new tests as described above. -- **`tests/unit/`** — config-builder unit tests; CA-helper unit - tests; updated dry-run-plan test pinning the mitmproxy entry. +- **`tests/unit/`** — addon-verdict tests, mitmproxy-config + builder tests, dry-run-plan test updated for the new + `egress.mitm` block. ### External dependencies -- **mitmproxy Docker image** pulled from - `mitmproxy/mitmproxy@sha256:`. The digest is pinned in - `claude_bottle/mitmproxy.py` and bumped deliberately, mirroring - the pipelock pin. Tag line `mitmproxy/mitmproxy:11.x` per - research §Image pin for mitmproxy. -- No new host-side runtimes. CA generation uses Python's `cryptography` - if it's already a transitive dep; otherwise use `openssl` shelled - out from the host-side prepare step. Decide at impl time after - confirming what's available on the runner without adding deps. +- **mitmproxy Docker image** pinned by digest on the `12.x` line. + Bumped deliberately, mirroring the pipelock pin. Verified by + spike to speak h2 on both halves. +- No new host-side runtimes. 
mitmproxy generates the CA; + fingerprint via the `openssl` already present on Debian / macOS + / ubuntu-latest runners. ## Open questions -- **mitmproxy upstream-proxy mode mechanics.** Whether `upstream` - mode forwards decrypted plaintext to pipelock or re-wraps it in a - CONNECT. Documented behavior changed between mitmproxy 8 and 10. - Needs verification against the pinned version at impl time. If - `upstream` re-wraps, fall back to `regular` mode plus a chained - proxy directive routing plain HTTP to pipelock. -- **Pipelock plain-HTTP scanning coverage.** Pipelock's - `forward_proxy.enabled: true` accepts both `GET http://…` and - `CONNECT host:443`. Confirm by reading - `github.com/luckyPipewrench/pipelock/blob/main/docs/configuration.md` - that the full DLP / MCP / subdomain-entropy pipeline runs on the - HTTP path; some pipelock layers may be gated on CONNECT only. -- **CA installation in the Anthropic-provided Claude Code image.** - The base image determines whether `update-ca-certificates` - (Debian) or `update-ca-trust` (Red Hat) applies. Confirm against - the `Dockerfile` before writing the provisioner; v1 assumes - Debian (`node:22-slim`). -- **HTTP/2 ALPN end-to-end.** Node's HTTP client negotiates `h2` - via ALPN. Confirm the pinned mitmproxy version speaks `h2` to - both halves without silently downgrading to `http/1.1`, which - would be a noticeable performance regression on bulk transfers. -- **Selective-bump policy surface.** Where does the - "tunnel this hostname blindly" decision live when (not if) a - pinning host appears? Recommended shape per research: - `bottle.egress.tls_bump_ignore: ["example.com"]`, a list of - hostnames mitmproxy passes through via `ignore_hosts`. Defer - until needed; record the shape so the follow-up is mechanical. -- **CA generation: Python `cryptography` vs. shelled-out - `openssl`.** Adding `cryptography` brings a substantial transitive - graph; shelling to `openssl` keeps the host-side prepare step - dep-light. 
Decide at impl time based on what's already on the - runner. Either way, the CA is per-bottle and ephemeral. -- **Domain-fronting verification.** Once pipelock sees the inner - `Host` / `:authority`, comparing it to the outer `CONNECT` target - catches domain fronting. Whether pipelock has a rule for this or - we need to add one is a follow-up; out of scope here. -- **Dry-run preflight rendering of the CA.** Show the fingerprint - but never the private key. Confirm the exact dry-run JSON shape - during implementation; the field set is part of the CLI's user- - facing contract (per PRD 0003 §to_dict notes). +(rewritten — most of the original v1 questions are now closed by +the walkthrough spikes; what remains is addon-implementation +specifics worth pinning during the first impl turn.) + +- **Pipelock's 403-body fingerprint.** The addon needs to + distinguish a pipelock block (DLP / host) from a real-upstream + 4xx that pipelock's forwarder relayed back. Most likely shape: + pipelock's 403 response carries a JSON body with `event` / + `scanner` fields, whereas a real-upstream 4xx carries whatever + the upstream sent. Pin the exact fingerprint by inspecting + pipelock's actual 403 body bytes at impl time. Long-term + cleanup: file an upstream feature request for an + `X-Pipelock-Verdict: block` response header so the addon can + read a structured signal instead of pattern-matching the body. +- **Docker run env-var inheritance through docker exec.** Plan + assumes `docker run -e VAR=value` propagates to subsequent + `docker exec` invocations. The Docker docs say so; not yet + empirically pinned on this project's runner setup. Verify in + the first impl turn. Trivial fallback: thread the three `-e` + flags onto every `DockerBottle.exec*` call. +- **Addon synchronous-call latency.** The addon makes a sync HTTP + call to pipelock per outbound flow. Pipelock is on the same + internal Docker network; expected per-call latency is well + under 10ms. 
Confirm under the parallel-request load Claude Code + generates (most likely a non-issue — Claude is single-stream + request-wise). +- **Addon test fixtures.** mitmproxy ships `mitmproxy.test` with + flow fixtures; addons can be unit-tested without a running + proxy. Confirm the import path and recommended fixture shape at + impl time; structure the addon so the verdict-decision is a + pure function that's trivially testable in isolation from any + HTTP I/O. +- **Pipelock allowing the addon's forwarded request through.** + pipelock will see the addon's request as coming from the + mitmproxy sidecar's IP on the internal network. Confirm + pipelock has no client-IP allowlist that would reject these. + Likely fine — pipelock's `client_ip` is informational in the + scan event, not a gate. ## References -- `docs/research/tls-mitm-for-pipelock.md` — primary source; this - PRD implements the recommendation in §Recommendation (Topology A). +- `docs/research/tls-mitm-for-pipelock.md` — primary source. This + PRD implements a variant of §Recommendation (Topology A) after + the spike documented under "Open questions" §1 falsified the + `upstream` mode assumption. - `docs/research/pipelock-assessment.md` §Scope gaps — names the TLS-inspection gap closed here. - `docs/prds/0001-per-agent-egress-proxy-via-pipelock.md` — @@ -363,9 +429,9 @@ later is a strictly additive change. module pattern reused for the new CA provisioner. - mitmproxy: , -- mitmproxy `upstream_proxy` mode: - +- mitmproxy modes: - mitmproxy CA cert installation: +- mitmproxy addon API: - Node `NODE_EXTRA_CA_CERTS`:
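
Reviewer sketch (not part of the patch): the open questions ask for the addon's verdict decision to be a pure function that can be tested apart from any HTTP I/O. Below is a minimal illustration of that shape. The names `is_pipelock_block` and `ASSUMED_BLOCK_KEYS` are hypothetical. The check for `event` / `scanner` keys is only the PRD's "most likely shape" guess and still has to be pinned against pipelock's real 403 body.

```python
import json

# Assumption: a pipelock block body is a JSON object carrying at least one of
# these keys. The PRD leaves the exact fingerprint to be pinned at impl time.
ASSUMED_BLOCK_KEYS = frozenset({"event", "scanner"})


def is_pipelock_block(status: int, body: bytes) -> bool:
    """Return True only for a 403 whose body looks like a pipelock block event.

    Anything else (non-403, non-JSON, JSON without the assumed keys) is treated
    as a relayed upstream response, so the caller would let the flow proceed.
    """
    if status != 403:
        return False
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:
        # Covers both UnicodeDecodeError and json.JSONDecodeError.
        return False
    if not isinstance(payload, dict):
        return False
    return any(key in payload for key in ASSUMED_BLOCK_KEYS)
```

Keeping the decision free of mitmproxy types means the unit tests can call it with plain integers and bytes. The addon's `request` hook would then only need to translate a `True` result into a short-circuit 403 response.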