PRD 0005: mitmproxy TLS interception for pipelock content scanning
- Status: Draft
- Author: didericis
- Created: 2026-05-12
Summary
Add a per-bottle mitmproxy sidecar in front of pipelock on the
egress path so pipelock's DLP, subdomain-entropy, and MCP scanners
fire on the plaintext bodies of HTTPS requests instead of only the
opaque ciphertext that follows a CONNECT. mitmproxy terminates the
agent's TLS, hands plaintext HTTP to pipelock as an upstream
forward proxy, and re-establishes TLS to the real destination. A
fresh ephemeral CA is minted per bottle; the CA private key is
generated in the host-side stage_dir, copied only into the sidecar,
and never reaches the agent container. The public cert is wired into
the agent container's trust store at launch.
Problem
PRD 0001 wired pipelock onto every bottle's egress, but the current
topology only sees CONNECT hostnames and opaque TLS bytes:
agent --HTTPS_PROXY--> pipelock --CONNECT host:443--> internet
                                \____________________________
                                        opaque TLS bytes
What pipelock cannot scan in this mode is documented in
docs/research/tls-mitm-for-pipelock.md §What pipelock cannot see
today: request URLs and methods, request and response headers,
request and response bodies, MCP JSON-RPC payloads, inner-vs-outer
hostname (the domain-fronting check), and WebSocket frames inside a
TLS-wrapped upgrade. The 48-pattern DLP layer this project relies on
in PRD 0001 is therefore inert against every host in the current
DEFAULT_ALLOWLIST — all of which are HTTPS-only.
The integration test added in tests/integration/test_pipelock_blocks_secret_post.py
demonstrates the gap concretely: pipelock's body-scan layer only
fires when the agent is forced to send plain HTTP. Real Claude Code
traffic to api.anthropic.com goes over CONNECT-tunneled TLS and
slips past the scanner.
pipelock-assessment.md §Scope gaps names this as a known
limitation of the proxy-without-TLS-inspection shape. Closing it is
the explicit motivation for tls-mitm-for-pipelock.md, whose
recommendation this PRD implements.
Goals / Success Criteria
The feature works when all of the following are observable:
- A Node request from inside a launched bottle to a CONNECT-bumped
  HTTPS host (e.g. https://api.anthropic.com/dlp-probe) carrying a
  pipelock-recognized credential pattern in the body returns 403 from
  the proxy, not a response from the upstream. The existing
  test_pipelock_blocks_secret_post test path becomes the HTTPS variant
  of this assertion.
- Claude Code itself reaches api.anthropic.com end-to-end through the
  bottle and completes a chat round-trip. No TLS-trust errors in the
  agent process.
- mitmproxy's TLS-handshake log lines and pipelock's body_dlp event
  lines both appear for the same outbound request, confirming the
  two-stage path is active.
The feature is done when all of the following ship:
- A new MitmproxyProxy class with the same prepare / start / stop
  lifecycle shape as PipelockProxy, wired into the Docker backend's
  launch step.
- The bottle launch step generates a per-bottle ephemeral CA in
  stage_dir, starts the mitmproxy sidecar with that CA on the
  per-bottle internal network, copies the CA public cert into the
  agent container's trust store, and points the agent's
  HTTPS_PROXY / HTTP_PROXY at mitmproxy.
- mitmproxy's upstream is the existing pipelock sidecar; pipelock sees
  plaintext HTTP from mitmproxy for every previously-HTTPS request.
- On bottle teardown the mitmproxy sidecar is removed and the
  ephemeral CA private key is gone with it.
- An integration test (variant of test_pipelock_blocks_secret_post)
  proves pipelock now blocks a credential POST that goes out over
  HTTPS rather than plain HTTP.
- An integration test proves a non-credential HTTPS request to an
  allowlisted host (e.g. CONNECT-then-GET on
  raw.githubusercontent.com) succeeds end-to-end with mitmproxy in the
  path (no TLS-trust errors, response body received).
- The dry-run preflight (start --dry-run) shows the mitmproxy sidecar
  in both the text and --format=json output alongside the existing
  pipelock entry.
Non-goals
- Topology C: extending pipelock itself to terminate TLS. That is the
  cleanest long-term shape per the research note's recommendation but
  is substantial Go work and hits the Apache-2.0-vs-ELv2 question.
  Deferred.
- Topology D: driving mitmproxy with a pipelock /scan HTTP endpoint.
  Requires a pipelock surface that doesn't exist today. Deferred.
- Persistent or shared CA across bottles. Each bottle gets a fresh CA
  generated at start and destroyed at teardown. No CA storage on the
  host, no cross-bottle reuse.
- Selective bumping ("ignore_hosts") as a v1 manifest field.
  v1 bumps every CONNECT. If a future allowlisted host turns out to
  pin (Mobile / Chromium-style cert pinning), a follow-up PRD adds
  the per-host opt-out, likely a bottle.egress.tls_bump_ignore field.
  See Open questions.
- HTTP/3 / QUIC. mitmproxy's HTTP/3 support is experimental. v1 relies
  on the v1-egress iptables layer (separate PRD) blocking UDP/443 to
  force clients onto HTTP/2 over TCP, which mitmproxy inspects
  normally.
- Raw TCP / non-HTTP TLS interception. mitmproxy supports it
  via --mode reverse:, not in CONNECT-bump mode. SSH and any future
  raw-TCP egress route around mitmproxy entirely.
- Trust-store rewiring for non-Debian agent base images. The
  current Dockerfile is node:22-slim (Debian). If a future base
  switches to Red-Hat-family, the update-ca-certificates step becomes
  update-ca-trust. Out of scope until the base changes.
Scope
In scope
- New claude_bottle/mitmproxy.py mirroring claude_bottle/pipelock.py:
  config helpers (no backend-specific Docker calls), the
  MitmproxyProxy abstract class, and the per-bottle CA generation
  helpers.
- New claude_bottle/backend/docker/mitmproxy.py mirroring
  claude_bottle/backend/docker/pipelock.py: DockerMitmproxyProxy with
  the Docker-specific start / stop lifecycle, the sidecar container
  name scheme, and the image pin.
- New provisioner: claude_bottle/backend/docker/provision/ca.py,
  installing the CA public cert into the agent container at
  /usr/local/share/ca-certificates/claude-bottle-mitm.crt, running
  update-ca-certificates, and exporting NODE_EXTRA_CA_CERTS /
  SSL_CERT_FILE / REQUESTS_CA_BUNDLE env vars to the agent process.
  The provisioner runs from BottleBackend.provision in the same
  orchestration as prompt, skills, ssh, git.
- Per-agent network reshuffle in DockerBottleBackend.launch:
  - internal network is unchanged (mitmproxy + pipelock + agent)
  - agent's HTTPS_PROXY / HTTP_PROXY change from pointing at the
    pipelock service name to the mitmproxy service name
  - mitmproxy's upstream_proxy config points at the pipelock service
    name on the internal network
- DockerBottlePlan grows a mitmproxy_plan field analogous to the
  existing proxy_plan (the pipelock one) so prepare-time state rides
  on the plan.
- Dry-run preflight (start --dry-run text + JSON) renders the
  mitmproxy line and surfaces the CA fingerprint shown in the bottle's
  trust store, so the operator can verify what's been installed.
- Two new integration tests under tests/integration/:
  - test_mitmproxy_blocks_secret_https_post.py: the HTTPS variant of
    the existing _blocks_secret_post test.
  - test_mitmproxy_allows_normal_https.py: confirms a plain HTTPS GET
    to a non-credential-bearing path through mitmproxy + pipelock
    returns the upstream response, asserting no trust / handshake
    breakage.
- Unit tests for the new config builder (mirroring the pipelock YAML
  unit tests) and for the CA generation helper.
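The CA fingerprint surfaced by the dry-run preflight could be computed with the standard library alone. The following is a minimal sketch, assuming the fingerprint is the SHA-256 digest of the DER-encoded certificate rendered as colon-separated hex; the function name ca_fingerprint_sha256 is hypothetical and the exact rendering is not fixed by this PRD.

```python
import hashlib
import ssl


def ca_fingerprint_sha256(cert_pem: str) -> str:
    """Return the SHA-256 fingerprint of a PEM certificate as AA:BB:... hex.

    Hypothetical helper: the PRD only says the dry-run output shows
    "the CA fingerprint"; the digest and format here are assumptions.
    """
    # PEM_cert_to_DER_cert strips the PEM armor and base64-decodes the body.
    der = ssl.PEM_cert_to_DER_cert(cert_pem)
    digest = hashlib.sha256(der).hexdigest().upper()
    # Group the hex digest into colon-separated byte pairs.
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

Only the public certificate is an input here, so the private key never needs to be read when rendering the plan.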
Out of scope
- The v1 iptables + dnsmasq layer (separate PRD; see
  network-egress-guard.md). mitmproxy covers HTTP/HTTPS only. Raw TCP,
  UDP, ICMP, and direct DNS still need the IP-level layer.
- Pipelock config changes. Pipelock continues to load the YAML PRD
  0001 already generates. mitmproxy is opaque to it; pipelock just
  sees plain HTTP from a forward-proxy client.
- A bottle-level toggle to skip mitmproxy entirely. v1 always wires it
  in. If a use case appears for an unintercepted bottle (e.g. testing
  pipelock's CONNECT-mode behavior in isolation), that's a follow-up.
- Pinning-host detection automation. The cost of finding out (per the
  research note) is a single 5-minute test before adding a host; it
  stays a manual step.
Proposed Design
Topology
agent --HTTPS_PROXY--> mitmproxy --HTTP_PROXY--> pipelock --> internet
                       (bump TLS)                (scan plain) (real TLS)
All three containers live on the same per-bottle internal Docker network. mitmproxy and pipelock are both attached to the per-bottle egress bridge so they can reach the host network; the agent has no default route, exactly as today.
Concretely:
- agent sets HTTPS_PROXY=http://claude-bottle-mitm-<slug>:<port>.
  Currently this points at claude-bottle-pipelock-<slug>. The hostname
  swap is the only agent-side env change.
- mitmproxy runs with
  --mode upstream:http://claude-bottle-pipelock-<slug>:<pipelock-port>
  so its decrypted plaintext is forwarded to pipelock as a regular
  upstream forward-proxy request. (Research open question #1 calls
  this out: mitmproxy 10+ documentation says upstream mode forwards
  the original request shape; verify against the pinned version at
  implementation time. If forwarding wraps a new CONNECT, fall back to
  regular mode with a chained proxy declared in mitmproxy's config and
  route plain HTTP to pipelock by hand.)
- pipelock continues to listen on its existing port and receives plain
  HTTP from mitmproxy. No pipelock config change.
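The wiring above can be sketched as two small pure functions. This is an illustration under stated assumptions, not the implementation: the function names and the example ports are hypothetical, and whether upstream mode actually delivers plaintext to pipelock is the open question noted above.

```python
def agent_proxy_env(slug: str, mitm_port: int) -> dict[str, str]:
    """Proxy env vars for the agent container, pointing at the mitmproxy sidecar."""
    url = f"http://claude-bottle-mitm-{slug}:{mitm_port}"
    return {"HTTPS_PROXY": url, "HTTP_PROXY": url}


def mitmdump_argv(slug: str, mitm_port: int, pipelock_port: int) -> list[str]:
    """Command line for the sidecar, chaining to pipelock as the upstream proxy."""
    upstream = f"http://claude-bottle-pipelock-{slug}:{pipelock_port}"
    return [
        "mitmdump",
        "--mode", f"upstream:{upstream}",   # forward through pipelock
        "--listen-port", str(mitm_port),    # port the agent connects to
    ]


if __name__ == "__main__":
    # Ports are placeholders; the real values are constants chosen at impl time.
    print(agent_proxy_env("demo", 8080))
    print(" ".join(mitmdump_argv("demo", 8080, 8888)))
```

Keeping both builders free of Docker calls matches the split between backend-agnostic config and backend-specific lifecycle, and makes them easy to cover with unit tests.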
New components
Two new modules, matching PRD 0001's split between backend-agnostic config and backend-specific lifecycle:
- claude_bottle/mitmproxy.py: backend-agnostic. The config builder
  (mitmproxy YAML / TOML; confirm format), the abstract MitmproxyProxy
  class with prepare(...) writing the config and the ephemeral CA into
  stage_dir, the CA generation helper (RSA-2048 or ECDSA-P256; pick at
  impl time, research suggests ECDSA for cert-gen speed), and
  constants for the sidecar's internal-network port and image pin.
- claude_bottle/backend/docker/mitmproxy.py: Docker implementation.
  DockerMitmproxyProxy(MitmproxyProxy) with start(plan) doing
  docker create / docker cp / docker network connect / docker start
  analogous to DockerPipelockProxy.start. stop(target) removes the
  sidecar idempotently.
The provisioner that installs the CA cert into the agent's trust
store lives at claude_bottle/backend/docker/provision/ca.py and
plugs into the existing BottleBackend.provision orchestration. The
abstract BottleBackend.provision_ca method joins
provision_prompt / provision_skills / provision_ssh /
provision_git on the base class (PRD 0004's pattern), with a
default no-op implementation so other backends don't break when
they don't yet implement it.
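A minimal sketch of the no-op default follows. The base class here is a simplified stand-in showing only one existing provisioner; the (plan, target) signature follows this PRD, while the subclass names, the steps list, and the call order inside provision are illustrative assumptions.

```python
from abc import ABC, abstractmethod


class BottleBackend(ABC):
    """Simplified stand-in for the real base class; only two provisioners shown."""

    @abstractmethod
    def provision_prompt(self, plan, target) -> None:
        """Existing provisioner that every backend must implement."""

    def provision_ca(self, plan, target) -> None:
        """Default no-op: backends without TLS interception inherit this."""
        return None

    def provision(self, plan, target) -> None:
        # The orchestration calls every provisioner; the CA step is
        # harmless on backends that have not overridden it.
        self.provision_prompt(plan, target)
        self.provision_ca(plan, target)


class DockerBackend(BottleBackend):
    def __init__(self) -> None:
        self.steps: list[str] = []

    def provision_prompt(self, plan, target) -> None:
        self.steps.append("prompt")

    def provision_ca(self, plan, target) -> None:
        self.steps.append("ca")  # the real one installs the cert


class FutureFlyBackend(BottleBackend):
    """Hypothetical backend that has not implemented provision_ca yet."""

    def __init__(self) -> None:
        self.steps: list[str] = []

    def provision_prompt(self, plan, target) -> None:
        self.steps.append("prompt")
```

Because provision_ca is concrete on the base class, FutureFlyBackend can still be instantiated and provisioned without any CA-related code.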
CA lifecycle
Per tls-mitm-for-pipelock.md §CA lifecycle:
- Generation. Host-side in MitmproxyProxy.prepare, written to
  stage_dir/mitm-ca.key (mode 600) and stage_dir/mitm-ca.crt
  (mode 644). The .key is copied into the mitmproxy container at
  start; nothing else touches it.
- Bottle injection. provision_ca copies only the public .crt into the
  agent container at
  /usr/local/share/ca-certificates/claude-bottle-mitm.crt, runs
  update-ca-certificates as root inside the container, and sets
  NODE_EXTRA_CA_CERTS=/usr/local/share/ca-certificates/claude-bottle-mitm.crt,
  SSL_CERT_FILE, and REQUESTS_CA_BUNDLE for the agent process.
  Belt-and-suspenders because some libraries honor only env vars.
- Teardown. The mitmproxy sidecar container is destroyed; the
  CA key vanishes with it. Nothing persists on the host outside
  stage_dir, which the start command already deletes in its finally
  block.
- Cost. ECDSA-P256 CA + per-host leaf generation runs in milliseconds;
  the per-bottle Docker pull and network plumbing dominate startup
  time.
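The bottle-injection step can be sketched as a command list plus an env mapping. This is a sketch under assumptions: the helper names are hypothetical, and the PRD only fixes the value of NODE_EXTRA_CA_CERTS. Pointing SSL_CERT_FILE and REQUESTS_CA_BUNDLE at the Debian bundle that update-ca-certificates regenerates (/etc/ssl/certs/ca-certificates.crt) is one plausible choice, not a decision recorded here.

```python
CA_DEST = "/usr/local/share/ca-certificates/claude-bottle-mitm.crt"
# Assumption: Debian's regenerated bundle, which includes the new CA
# once update-ca-certificates has run.
DEBIAN_BUNDLE = "/etc/ssl/certs/ca-certificates.crt"


def ca_install_commands(container: str, host_crt: str) -> list[list[str]]:
    """Docker commands that place the public cert and refresh the trust store."""
    return [
        # Only the public certificate is copied; the key stays out of the agent.
        ["docker", "cp", host_crt, f"{container}:{CA_DEST}"],
        # Rebuild the system bundle as root inside the agent container.
        ["docker", "exec", "--user", "root", container, "update-ca-certificates"],
    ]


def ca_trust_env() -> dict[str, str]:
    """Env vars for libraries that ignore the system trust store."""
    return {
        "NODE_EXTRA_CA_CERTS": CA_DEST,        # value given in the PRD
        "SSL_CERT_FILE": DEBIAN_BUNDLE,        # assumed value
        "REQUESTS_CA_BUNDLE": DEBIAN_BUNDLE,   # assumed value
    }
```

Returning the commands as data rather than running them keeps the sketch testable without a Docker daemon; the real provisioner would execute them against the agent container.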
Data model changes
None in v1. The manifest schema is unchanged. mitmproxy is always on for every bottle once this PRD ships.
A future selective-bump knob (per tls-mitm-for-pipelock.md open
question #5) would land on bottle.egress.tls_bump_ignore as a
list of hostnames. The shape mirrors egress.allowlist. Adding it
later is a strictly additive change.
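If that knob lands, translating the hostname list into mitmproxy's ignore_hosts option might look like the sketch below. It assumes ignore_hosts patterns are regular expressions matched against host:port, as the mitmproxy documentation describes; the function name and the exact-match policy (no implicit subdomain wildcard) are illustrative choices, not part of this PRD.

```python
import re


def ignore_hosts_patterns(hostnames: list[str], port: int = 443) -> list[str]:
    """Turn exact hostnames into anchored regexes for mitmproxy's ignore_hosts."""
    # re.escape keeps the dots literal so "example.com" cannot match
    # "exampleXcom"; the anchors prevent suffix or prefix matches.
    return [rf"^{re.escape(host.lower())}:{port}$" for host in hostnames]


if __name__ == "__main__":
    for pattern in ignore_hosts_patterns(["example.com"]):
        print(pattern)
```

Anchoring each pattern keeps the opt-out as narrow as the manifest entry, so a pinned host does not silently exempt its lookalikes from scanning.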
Existing code touched
- claude_bottle/backend/docker/launch.py: bring up the mitmproxy
  sidecar after the pipelock sidecar but before the agent container,
  repoint the agent's HTTPS_PROXY / HTTP_PROXY env flags, register an
  ExitStack callback to stop mitmproxy on teardown.
- claude_bottle/backend/docker/prepare.py: call into
  MitmproxyProxy.prepare(...) alongside the existing
  PipelockProxy.prepare(...), populate
  DockerBottlePlan.mitmproxy_plan.
- claude_bottle/backend/docker/backend.py: add the
  DockerMitmproxyProxy instance attribute (self._mitm) and thread it
  through launch + cleanup, mirroring the existing self._proxy
  pattern.
- claude_bottle/backend/docker/bottle_plan.py: new
  mitmproxy_plan: MitmproxyProxyPlan field on DockerBottlePlan.
  print() and to_dict() learn to render it.
- claude_bottle/backend/__init__.py: abstract
  BottleBackend.provision_ca(plan, target) joins the other four
  provisioners. Default impl is a no-op (so a future fly backend isn't
  forced to implement TLS interception in v1).
- tests/integration/: two new tests as described above.
- tests/unit/: config-builder unit tests; CA-helper unit tests;
  updated dry-run-plan test pinning the mitmproxy entry.
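The launch ordering and teardown registration described for launch.py can be illustrated with contextlib.ExitStack, which runs callbacks in reverse registration order. This is a sketch with fake sidecars; FakeSidecar and launch are hypothetical names, and the real start / stop methods take plan and target arguments omitted here.

```python
from contextlib import ExitStack


class FakeSidecar:
    """Stand-in for a proxy sidecar that records lifecycle events."""

    def __init__(self, name: str, log: list[str]) -> None:
        self.name = name
        self.log = log

    def start(self) -> None:
        self.log.append(f"start {self.name}")

    def stop(self) -> None:
        self.log.append(f"stop {self.name}")


def launch(stack: ExitStack, pipelock: FakeSidecar, mitm: FakeSidecar, run_agent) -> None:
    # pipelock first, so mitmproxy's upstream exists when it starts.
    pipelock.start()
    stack.callback(pipelock.stop)
    # mitmproxy second, before the agent container.
    mitm.start()
    stack.callback(mitm.stop)
    run_agent()


log: list[str] = []
with ExitStack() as stack:
    launch(
        stack,
        FakeSidecar("pipelock", log),
        FakeSidecar("mitmproxy", log),
        lambda: log.append("run agent"),
    )
# Teardown is last-in, first-out: mitmproxy stops before pipelock.
```

Registering each stop callback immediately after its start means a failure while starting mitmproxy still tears down pipelock.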
External dependencies
- mitmproxy Docker image pulled from
  mitmproxy/mitmproxy@sha256:<digest>. The digest is pinned in
  claude_bottle/mitmproxy.py and bumped deliberately, mirroring the
  pipelock pin. Tag line mitmproxy/mitmproxy:11.x per research §Image
  pin for mitmproxy.
- No new host-side runtimes. CA generation uses Python's cryptography
  if it's already a transitive dep; otherwise use openssl shelled out
  from the host-side prepare step. Decide at impl time after
  confirming what's available on the runner without adding deps.
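For the shelled-out option, the CA generation could look roughly like the sketch below. It is an assumption-laden illustration: the helper names, subject name, and validity period are hypothetical; it relies on the host's default openssl configuration marking a req -x509 certificate as a CA, which should be verified on the runner; and how the key and cert are packaged for mitmproxy depends on the pinned version's certificate documentation.

```python
import os
import subprocess
from pathlib import Path


def openssl_ca_argv(key_path: Path, crt_path: Path, days: int = 1) -> list[str]:
    """Build an openssl command for a self-signed ECDSA P-256 certificate."""
    return [
        "openssl", "req", "-x509",
        "-newkey", "ec",
        "-pkeyopt", "ec_paramgen_curve:prime256v1",  # P-256
        "-nodes",                                    # no passphrase on the key
        "-keyout", str(key_path),
        "-out", str(crt_path),
        "-days", str(days),                          # short-lived, per bottle
        "-subj", "/CN=claude-bottle ephemeral CA",   # hypothetical subject
    ]


def generate_ca(stage_dir: Path) -> tuple[Path, Path]:
    """Write mitm-ca.key (0600) and mitm-ca.crt (0644) into stage_dir."""
    key = stage_dir / "mitm-ca.key"
    crt = stage_dir / "mitm-ca.crt"
    subprocess.run(openssl_ca_argv(key, crt), check=True, capture_output=True)
    os.chmod(key, 0o600)
    os.chmod(crt, 0o644)
    return key, crt
```

Splitting the argv builder from the subprocess call lets the unit tests pin the command line without requiring openssl on the test runner.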
Open questions
- mitmproxy upstream-proxy mode mechanics. Whether upstream mode
  forwards decrypted plaintext to pipelock or re-wraps it in a
  CONNECT. Documented behavior changed between mitmproxy 8 and 10.
  Needs verification against the pinned version at impl time. If
  upstream re-wraps, fall back to regular mode plus a chained proxy
  directive routing plain HTTP to pipelock.
- Pipelock plain-HTTP scanning coverage. Pipelock's
  forward_proxy.enabled: true accepts both GET http://… and
  CONNECT host:443. Confirm by reading
  github.com/luckyPipewrench/pipelock/blob/main/docs/configuration.md
  that the full DLP / MCP / subdomain-entropy pipeline runs on the
  HTTP path; some pipelock layers may be gated on CONNECT only.
- CA installation in the Anthropic-provided Claude Code image.
  The base image determines whether update-ca-certificates (Debian) or
  update-ca-trust (Red Hat) applies. Confirm against the Dockerfile
  before writing the provisioner; v1 assumes Debian (node:22-slim).
- HTTP/2 ALPN end-to-end. Node's HTTP client negotiates h2 via ALPN.
  Confirm the pinned mitmproxy version speaks h2 to both halves
  without silently downgrading to http/1.1, which would be a
  noticeable performance regression on bulk transfers.
- Selective-bump policy surface. Where does the
  "tunnel this hostname blindly" decision live when (not if) a
  pinning host appears? Recommended shape per research:
  bottle.egress.tls_bump_ignore: ["example.com"], a list of hostnames
  mitmproxy passes through via ignore_hosts. Defer until needed;
  record the shape so the follow-up is mechanical.
- CA generation: Python cryptography vs. shelled-out openssl. Adding
  cryptography brings a substantial transitive graph; shelling to
  openssl keeps the host-side prepare step dep-light. Decide at impl
  time based on what's already on the runner. Either way, the CA is
  per-bottle and ephemeral.
- Domain-fronting verification. Once pipelock sees the inner
  Host / :authority, comparing it to the outer CONNECT target catches
  domain fronting. Whether pipelock has a rule for this or we need to
  add one is a follow-up; out of scope here.
- Dry-run preflight rendering of the CA. Show the fingerprint but
  never the private key. Confirm the exact dry-run JSON shape during
  implementation; the field set is part of the CLI's user-facing
  contract (per PRD 0003 §to_dict notes).
References
- docs/research/tls-mitm-for-pipelock.md: primary source; this PRD
  implements the recommendation in §Recommendation (Topology A).
- docs/research/pipelock-assessment.md §Scope gaps: names the
  TLS-inspection gap closed here.
- docs/prds/0001-per-agent-egress-proxy-via-pipelock.md: egress-proxy
  baseline this PRD extends.
- docs/prds/0003-bottle-backend-abstraction.md: backend ABC contract
  this PRD adds a provision_ca method to.
- docs/prds/0004-split-out-provisioners.md: per-provisioner module
  pattern reused for the new CA provisioner.
- mitmproxy: https://mitmproxy.org, https://github.com/mitmproxy/mitmproxy
- mitmproxy upstream_proxy mode: https://docs.mitmproxy.org/stable/concepts/modes/#upstream-proxy
- mitmproxy CA cert installation: https://docs.mitmproxy.org/stable/concepts/certificates/
- Node NODE_EXTRA_CA_CERTS: https://nodejs.org/api/cli.html#node_extra_ca_certsfile