docs(prd): add 0005 mitmproxy TLS interception
Captures the design for putting a mitmproxy sidecar in front of pipelock on the egress path so pipelock's body / header / MCP scanners see plaintext for the HTTPS hosts in the default allowlist. Implements Topology A from docs/research/tls-mitm-for-pipelock.md with a per-bottle ephemeral CA, no manifest schema change in v1, and selective-bumping deferred until a pinning host appears.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -0,0 +1,371 @@
# PRD 0005: mitmproxy TLS interception for pipelock content scanning

- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-12

## Summary

Add a per-bottle **mitmproxy** sidecar in front of pipelock on the
egress path so pipelock's DLP, subdomain-entropy, and MCP scanners
fire on the plaintext bodies of HTTPS requests instead of only the
opaque ciphertext that follows a `CONNECT`. mitmproxy terminates the
agent's TLS, hands plaintext HTTP to pipelock as an upstream
forward proxy, and re-establishes TLS to the real destination. A
fresh ephemeral CA is minted per bottle; the CA private key is
staged host-side and copied only into the sidecar, never into the
agent container, and the public cert is wired into the agent
container's trust store at launch.

## Problem

PRD 0001 wired pipelock onto every bottle's egress, but the current
topology only sees `CONNECT` hostnames and opaque TLS bytes:

```
agent --HTTPS_PROXY--> pipelock --CONNECT host:443--> internet
                          \____________________________
                                 opaque TLS bytes
```

What pipelock cannot scan in this mode is documented in
`docs/research/tls-mitm-for-pipelock.md` §What pipelock cannot see
today: request URLs and methods, request and response headers,
request and response bodies, MCP JSON-RPC payloads, inner-vs-outer
hostname (the domain-fronting check), and WebSocket frames inside a
TLS-wrapped upgrade. The 48-pattern DLP layer this project relies on
in PRD 0001 is therefore inert against every host in the current
`DEFAULT_ALLOWLIST`, all of which are HTTPS-only.

The integration test added in `tests/integration/test_pipelock_blocks_secret_post.py`
demonstrates the gap concretely: pipelock's body-scan layer only
fires when the agent is forced to send plain HTTP. Real Claude Code
traffic to `api.anthropic.com` goes over CONNECT-tunneled TLS and
slips past the scanner.

`pipelock-assessment.md` §Scope gaps names this as a known
limitation of the proxy-without-TLS-inspection shape. Closing it is
the explicit motivation for `tls-mitm-for-pipelock.md`, whose
recommendation this PRD implements.

## Goals / Success Criteria

The feature works when all of the following are observable:

- A Node request from inside a launched bottle to a CONNECT-bumped
  HTTPS host (e.g. `https://api.anthropic.com/dlp-probe`) carrying a
  pipelock-recognized credential pattern in the body returns 403 from
  the proxy, not a response from the upstream. The existing
  `test_pipelock_blocks_secret_post` test path becomes the HTTPS
  variant of this assertion.
- Claude Code itself reaches `api.anthropic.com` end-to-end through
  the bottle and completes a chat round-trip. No TLS-trust errors
  in the agent process.
- mitmproxy's TLS-handshake log lines and pipelock's `body_dlp`
  event lines both appear for the same outbound request, confirming
  the two-stage path is active.

The feature is **done** when all of the following ship:

- A new `MitmproxyProxy` class with the same `prepare` / `start` /
  `stop` lifecycle shape as `PipelockProxy`, wired into the Docker
  backend's launch step.
- The bottle launch step generates a per-bottle ephemeral CA in
  `stage_dir`, starts the mitmproxy sidecar with that CA on the
  per-bottle internal network, copies the CA public cert into the
  agent container's trust store, and points the agent's
  `HTTPS_PROXY` / `HTTP_PROXY` at mitmproxy.
- mitmproxy's upstream is the existing pipelock sidecar; pipelock
  sees plaintext HTTP from mitmproxy for every previously-HTTPS
  request.
- On bottle teardown the mitmproxy sidecar is removed and the
  ephemeral CA private key is gone with it.
- An integration test (variant of `test_pipelock_blocks_secret_post`)
  proves pipelock now blocks a credential POST that goes out over
  HTTPS rather than plain HTTP.
- An integration test proves a non-credential HTTPS request to an
  allowlisted host (e.g. CONNECT-then-GET on `raw.githubusercontent.com`)
  succeeds end-to-end with mitmproxy in the path (no TLS-trust
  errors, response body received).
- The dry-run preflight (`start --dry-run`) shows the mitmproxy
  sidecar in both the text and `--format=json` output alongside the
  existing pipelock entry.

## Non-goals

- **Topology C**: extending pipelock itself to terminate TLS. That
  is the cleanest long-term shape per the research note's
  recommendation but is substantial Go work and hits the
  Apache-2.0-vs-ELv2 question. Deferred.
- **Topology D**: driving mitmproxy with a pipelock `/scan` HTTP
  endpoint. Requires a pipelock surface that doesn't exist today.
  Deferred.
- **Persistent or shared CA across bottles.** Each bottle gets a
  fresh CA generated at start and destroyed at teardown. No CA
  storage on the host, no cross-bottle reuse.
- **Selective bumping ("ignore_hosts") as a v1 manifest field.**
  v1 bumps every CONNECT. If a future allowlisted host turns out to
  pin its certificate (mobile- or Chromium-style cert pinning), a
  follow-up PRD adds the per-host opt-out, likely a
  `bottle.egress.tls_bump_ignore` field. See Open questions.
- **HTTP/3 / QUIC.** mitmproxy's HTTP/3 support is experimental.
  v1 relies on the v1-egress iptables layer (separate PRD) blocking
  UDP/443 to force clients onto HTTP/2 over TCP, which mitmproxy
  inspects normally.
- **Raw TCP / non-HTTP TLS interception.** mitmproxy supports it
  via `--mode reverse:`, not in CONNECT-bump mode. SSH and any
  future raw-TCP egress route around mitmproxy entirely.
- **Trust-store rewiring for non-Debian agent base images.** The
  current `Dockerfile` is `node:22-slim` (Debian). If a future base
  switches to Red-Hat-family, the `update-ca-certificates` step
  becomes `update-ca-trust`. Out of scope until the base changes.

## Scope

### In scope

- New `claude_bottle/mitmproxy.py` mirroring `claude_bottle/pipelock.py`:
  config helpers (no backend-specific Docker calls), the
  `MitmproxyProxy` abstract class, and the per-bottle CA generation
  helpers.
- New `claude_bottle/backend/docker/mitmproxy.py` mirroring
  `claude_bottle/backend/docker/pipelock.py`: `DockerMitmproxyProxy`
  with the Docker-specific `start` / `stop` lifecycle, the sidecar
  container name scheme, and the image pin.
- New provisioner: `claude_bottle/backend/docker/provision/ca.py`,
  installing the CA public cert into the agent container at
  `/usr/local/share/ca-certificates/claude-bottle-mitm.crt`, running
  `update-ca-certificates`, and exporting `NODE_EXTRA_CA_CERTS` /
  `SSL_CERT_FILE` / `REQUESTS_CA_BUNDLE` env vars to the agent
  process. The provisioner runs from `BottleBackend.provision` in
  the same orchestration as `prompt`, `skills`, `ssh`, `git`.
- Per-agent network reshuffle in `DockerBottleBackend.launch`:
  - internal network is unchanged (mitmproxy + pipelock + agent)
  - agent's `HTTPS_PROXY` / `HTTP_PROXY` change from pointing at the
    pipelock service name to the mitmproxy service name
  - mitmproxy's `--mode upstream:` target points at the pipelock
    service name on the internal network
- `DockerBottlePlan` grows a `mitmproxy_plan` field analogous to the
  existing `proxy_plan` (the pipelock one) so prepare-time state
  rides on the plan.
- Dry-run preflight (`start --dry-run` text + JSON) renders the
  mitmproxy line and surfaces the fingerprint of the CA installed in
  the bottle's trust store, so the operator can verify what's been
  installed.
- Two new integration tests under `tests/integration/`:
  - `test_mitmproxy_blocks_secret_https_post.py`: the HTTPS
    variant of the existing `_blocks_secret_post` test.
  - `test_mitmproxy_allows_normal_https.py`: confirms a plain
    HTTPS GET to a non-credential-bearing path through mitmproxy +
    pipelock returns the upstream response, asserting no trust /
    handshake breakage.
- Unit tests for the new config builder (mirroring the pipelock
  YAML unit tests) and for the CA generation helper.

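The three env vars exported by the CA provisioner do not behave identically, which affects what each should point at. The sketch below is one possible shape, not the project's implementation: `ca_trust_env` is a hypothetical helper name, and pointing `SSL_CERT_FILE` / `REQUESTS_CA_BUNDLE` at the regenerated Debian bundle is an assumption, since the PRD does not give values for those two.

```python
from pathlib import PurePosixPath

# Path named in this PRD for the per-bottle CA public cert.
MITM_CA_CERT = PurePosixPath("/usr/local/share/ca-certificates/claude-bottle-mitm.crt")
# Combined bundle that Debian's update-ca-certificates regenerates.
SYSTEM_BUNDLE = PurePosixPath("/etc/ssl/certs/ca-certificates.crt")


def ca_trust_env(ca_cert=MITM_CA_CERT, bundle=SYSTEM_BUNDLE) -> dict[str, str]:
    """Hypothetical helper: env vars exported to the agent process."""
    return {
        # Additive: Node appends this cert to its built-in roots.
        "NODE_EXTRA_CA_CERTS": str(ca_cert),
        # These two override the default CA file rather than extend it,
        # so this sketch points them at the full bundle (assumption).
        "SSL_CERT_FILE": str(bundle),
        "REQUESTS_CA_BUNDLE": str(bundle),
    }
```

Because the bundle is only regenerated by `update-ca-certificates`, the env vars would need to be applied after that step has run inside the container.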
### Out of scope

- The v1 iptables + dnsmasq layer (separate PRD; see
  `network-egress-guard.md`). mitmproxy covers HTTP/HTTPS only.
  Raw TCP, UDP, ICMP, and direct DNS still need the IP-level layer.
- Pipelock config changes. Pipelock continues to load the YAML PRD
  0001 already generates. mitmproxy is opaque to it; pipelock just
  sees plain HTTP from a forward-proxy client.
- A bottle-level toggle to skip mitmproxy entirely. v1 always wires
  it in. If a use case appears for an unintercepted bottle
  (e.g. testing pipelock's CONNECT-mode behavior in isolation),
  that's a follow-up.
- Pinning-host detection automation. The cost of finding out (per
  the research note) is a single 5-minute test before adding a
  host; it stays a manual step.

## Proposed Design

### Topology

```
agent --HTTPS_PROXY--> mitmproxy --HTTP_PROXY--> pipelock --> internet
                       (bump TLS)                (scan plain) (real TLS)
```

All three containers live on the same per-bottle internal Docker
network. mitmproxy and pipelock are both attached to the per-bottle
egress bridge so they can reach the host network; the agent has no
default route, exactly as today.

Concretely:

- `agent` sets `HTTPS_PROXY=http://claude-bottle-mitm-<slug>:<port>`.
  Currently this points at `claude-bottle-pipelock-<slug>`. The
  hostname swap is the only agent-side env change.
- `mitmproxy` runs with `--mode upstream:http://claude-bottle-pipelock-<slug>:<pipelock-port>`
  so its decrypted plaintext is forwarded to pipelock as a regular
  upstream forward-proxy request. (Research open question #1 calls
  this out: mitmproxy 10+ documentation says `upstream` mode forwards
  the original request shape; verify against the pinned version at
  implementation time. If forwarding wraps a new CONNECT, fall back
  to `regular` mode with a chained proxy declared in mitmproxy's
  config and route plain HTTP to pipelock by hand.)
- `pipelock` continues to listen on its existing port and receives
  plain HTTP from mitmproxy. No pipelock config change.

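The wiring above reduces to two derived values per bottle: the agent's proxy env and mitmproxy's upstream target. A minimal sketch follows, assuming hypothetical helper names and placeholder ports (`8080` is mitmproxy's default listen port; `8888` merely stands in for whatever port pipelock already uses). Only the container-name scheme and the `--mode upstream:` form come from this PRD.

```python
MITM_PORT = 8080       # assumption: mitmproxy's default listen port
PIPELOCK_PORT = 8888   # assumption: placeholder for pipelock's existing port


def mitm_name(slug: str) -> str:
    """Sidecar name scheme from the topology section."""
    return f"claude-bottle-mitm-{slug}"


def pipelock_name(slug: str) -> str:
    return f"claude-bottle-pipelock-{slug}"


def agent_proxy_env(slug: str, port: int = MITM_PORT) -> dict[str, str]:
    """Env for the agent container: both proxy vars point at mitmproxy."""
    url = f"http://{mitm_name(slug)}:{port}"
    return {"HTTPS_PROXY": url, "HTTP_PROXY": url}


def mitmdump_args(slug: str, listen_port: int = MITM_PORT,
                  pipelock_port: int = PIPELOCK_PORT) -> list[str]:
    """Sidecar command line: chain to pipelock as the upstream proxy."""
    upstream = f"http://{pipelock_name(slug)}:{pipelock_port}"
    return ["mitmdump", "--mode", f"upstream:{upstream}",
            "--listen-port", str(listen_port)]
```

For a bottle with slug `demo`, `agent_proxy_env("demo")` yields `http://claude-bottle-mitm-demo:8080` for both variables. Whether `upstream` mode delivers plaintext or a re-wrapped `CONNECT` to pipelock still needs the verification called out above.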
### New components

Two new modules, matching PRD 0001's split between
backend-agnostic config and backend-specific lifecycle:

- **`claude_bottle/mitmproxy.py`**: backend-agnostic. The config
  builder (mitmproxy YAML / TOML; confirm format), the abstract
  `MitmproxyProxy` class with `prepare(...)` writing the config and
  the ephemeral CA into `stage_dir`, the CA generation helper
  (RSA-2048 or ECDSA-P256; pick at impl time, research suggests
  ECDSA for cert-gen speed), and constants for the sidecar's
  internal-network port and image pin.
- **`claude_bottle/backend/docker/mitmproxy.py`**: Docker
  implementation. `DockerMitmproxyProxy(MitmproxyProxy)` with
  `start(plan)` doing `docker create` / `docker cp` /
  `docker network connect` / `docker start` analogous to
  `DockerPipelockProxy.start`. `stop(target)` removes the sidecar
  idempotently.

The provisioner that installs the CA cert into the agent's trust
store lives at `claude_bottle/backend/docker/provision/ca.py` and
plugs into the existing `BottleBackend.provision` orchestration. The
abstract `BottleBackend.provision_ca` method joins
`provision_prompt` / `provision_skills` / `provision_ssh` /
`provision_git` on the base class (PRD 0004's pattern), with a
default no-op implementation so other backends don't break when
they don't yet implement it.

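The split between a backend-agnostic base class and a Docker subclass could look roughly like the sketch below. It is illustrative only: the `MitmproxyProxyPlan` fields, the method signatures, and the config filename are assumptions, and `prepare` here only computes paths rather than writing the config and CA as the real one would.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class MitmproxyProxyPlan:
    """Hypothetical prepare-time state carried on the bottle plan."""
    slug: str
    config_path: Path
    ca_cert_path: Path
    ca_key_path: Path


class MitmproxyProxy(ABC):
    """Backend-agnostic lifecycle; argument names are illustrative."""

    def prepare(self, slug: str, stage_dir: Path) -> MitmproxyProxyPlan:
        # Only computes staged paths; a real prepare would also write
        # the mitmproxy config and generate the ephemeral CA here.
        return MitmproxyProxyPlan(
            slug=slug,
            config_path=stage_dir / "mitm-config.yaml",  # hypothetical name
            ca_cert_path=stage_dir / "mitm-ca.crt",
            ca_key_path=stage_dir / "mitm-ca.key",
        )

    @abstractmethod
    def start(self, plan: MitmproxyProxyPlan) -> str:
        """Bring up the sidecar and return a handle for stop()."""

    @abstractmethod
    def stop(self, target: str) -> None:
        """Remove the sidecar; must tolerate repeated calls."""
```

Keeping `prepare` concrete on the base class and leaving only `start` / `stop` abstract keeps host-side staging identical across backends, so a second backend would only need to supply the container lifecycle.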
### CA lifecycle

Per `tls-mitm-for-pipelock.md` §CA lifecycle:

- **Generation.** Host-side in `MitmproxyProxy.prepare`, written to
  `stage_dir/mitm-ca.key` (mode 600) and `stage_dir/mitm-ca.crt`
  (mode 644). The `.key` is copied into the mitmproxy container at
  start; nothing else touches it.
- **Bottle injection.** `provision_ca` copies only the public
  `.crt` into the agent container at
  `/usr/local/share/ca-certificates/claude-bottle-mitm.crt`, runs
  `update-ca-certificates` as root inside the container, and sets
  `NODE_EXTRA_CA_CERTS=/usr/local/share/ca-certificates/claude-bottle-mitm.crt`,
  `SSL_CERT_FILE`, and `REQUESTS_CA_BUNDLE` for the agent process.
  The env vars are belt-and-suspenders: some libraries ignore the
  system trust store and honor only env vars.
- **Teardown.** The mitmproxy sidecar container is destroyed; the
  CA key vanishes with it. Nothing persists on the host outside
  `stage_dir`, which the start command already deletes in its
  `finally` block.
- **Cost.** ECDSA-P256 CA + per-host leaf generation runs in
  milliseconds; the per-bottle Docker pull and network plumbing
  dominate startup time.

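If CA generation ends up shelling out to `openssl`, the generation step could be sketched as building one `openssl req -x509` invocation. The helper name, subject, and one-day validity below are assumptions; only the ECDSA-P256 choice and the `stage_dir` filenames come from this PRD. The function builds the command without running it, so the caller decides how to execute it and set file modes.

```python
from pathlib import Path


def openssl_ca_command(stage_dir: Path, days: int = 1) -> list[str]:
    """Build (not run) an openssl call for a self-signed ECDSA-P256 CA."""
    return [
        "openssl", "req", "-x509",
        "-newkey", "ec", "-pkeyopt", "ec_paramgen_curve:prime256v1",
        "-nodes",                           # no passphrase on the key
        "-days", str(days),                 # assumption: short-lived CA
        "-subj", "/CN=claude-bottle-mitm",  # hypothetical subject name
        "-keyout", str(stage_dir / "mitm-ca.key"),
        "-out", str(stage_dir / "mitm-ca.crt"),
    ]
    # Note: this relies on the runner's openssl config to mark the cert
    # as CA:TRUE; that should be verified rather than assumed.
```

A caller would run it with `subprocess.run(cmd, check=True)` and then apply modes 600 / 644 to the two files. mitmproxy's certificate documentation describes loading a custom CA from a single `mitmproxy-ca.pem` holding both key and cert in its confdir, so the staged pair would likely need concatenating before the `docker cp`; that detail should be checked against the pinned image.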
### Data model changes

None in v1. The manifest schema is unchanged. mitmproxy is always
on for every bottle once this PRD ships.

A future selective-bump knob (per `tls-mitm-for-pipelock.md` open
question #5) would land on `bottle.egress.tls_bump_ignore` as a
list of hostnames. The shape mirrors `egress.allowlist`. Adding it
later is a strictly additive change.

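To illustrate why the follow-up would be mechanical: mitmproxy's `ignore_hosts` option takes regular expressions matched against `host:port`, so a hostname list has to be escaped and anchored before it is handed over. The sketch below assumes a hypothetical helper name and exact-match semantics (no wildcard subdomains), neither of which this PRD specifies.

```python
import re


def ignore_hosts_patterns(hostnames: list[str], port: int = 443) -> list[str]:
    """Turn exact hostnames into anchored regexes for ignore_hosts."""
    # re.escape keeps the dots literal; ^ and $ stop partial matches
    # such as "example.com.evil.test:443".
    return [rf"^{re.escape(host)}:{port}$" for host in hostnames]
```

For `["example.com"]` this produces `^example\.com:443$`. Without the escaping and anchors, a manifest entry could unintentionally exempt look-alike hosts from interception.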
### Existing code touched

- **`claude_bottle/backend/docker/launch.py`**: bring up the
  mitmproxy sidecar after the pipelock sidecar but before the agent
  container, repoint the agent's `HTTPS_PROXY` / `HTTP_PROXY` env
  flags, and register an `ExitStack` callback to stop mitmproxy on
  teardown.
- **`claude_bottle/backend/docker/prepare.py`**: call into
  `MitmproxyProxy.prepare(...)` alongside the existing
  `PipelockProxy.prepare(...)`, populate
  `DockerBottlePlan.mitmproxy_plan`.
- **`claude_bottle/backend/docker/backend.py`**: add the
  `DockerMitmproxyProxy` instance attribute (`self._mitm`) and
  thread it through `launch` + cleanup, mirroring the existing
  `self._proxy` pattern.
- **`claude_bottle/backend/docker/bottle_plan.py`**: new
  `mitmproxy_plan: MitmproxyProxyPlan` field on
  `DockerBottlePlan`. `print()` and `to_dict()` learn to render it.
- **`claude_bottle/backend/__init__.py`**: abstract
  `BottleBackend.provision_ca(plan, target)` joins the other four
  provisioners. Default impl is a no-op (so a future fly backend
  isn't forced to implement TLS interception in v1).
- **`tests/integration/`**: two new tests as described above.
- **`tests/unit/`**: config-builder unit tests; CA-helper unit
  tests; updated dry-run-plan test pinning the mitmproxy entry.

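The start order and `ExitStack` teardown described for `launch.py` can be sketched in isolation. The `launch` signature and the `RecordingSidecar` stand-in below are hypothetical and not the project's actual classes; the point is only that registering each `stop` right after its `start` makes `ExitStack` unwind the sidecars in reverse order.

```python
from contextlib import ExitStack


def launch(pipelock, mitm, run_agent):
    """Start sidecars in dependency order; ExitStack unwinds in reverse."""
    with ExitStack() as stack:
        pipelock_target = pipelock.start()
        stack.callback(pipelock.stop, pipelock_target)
        mitm_target = mitm.start()
        stack.callback(mitm.stop, mitm_target)
        # The agent runs last, once both proxies are up.
        return run_agent()


class RecordingSidecar:
    """Stand-in proxy object that logs its lifecycle calls."""

    def __init__(self, name, log):
        self.name, self.log = name, log

    def start(self):
        self.log.append(f"start {self.name}")
        return self.name

    def stop(self, target):
        self.log.append(f"stop {target}")
```

Running `launch` with two `RecordingSidecar` instances shows mitmproxy stopping before pipelock, so mitmproxy never outlives its upstream. If `mitm.start()` raises, the pipelock callback registered before it still runs.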
### External dependencies

- **mitmproxy Docker image** pulled from
  `mitmproxy/mitmproxy@sha256:<digest>`. The digest is pinned in
  `claude_bottle/mitmproxy.py` and bumped deliberately, mirroring
  the pipelock pin. The pin tracks the `mitmproxy/mitmproxy:11.x`
  tag line per research §Image pin for mitmproxy.
- No new host-side runtimes. CA generation uses Python's `cryptography`
  if it's already a transitive dep; otherwise use `openssl` shelled
  out from the host-side prepare step. Decide at impl time after
  confirming what's available on the runner without adding deps.

## Open questions

- **mitmproxy upstream-proxy mode mechanics.** Whether `upstream`
  mode forwards decrypted plaintext to pipelock or re-wraps it in a
  CONNECT. Documented behavior changed between mitmproxy 8 and 10.
  Needs verification against the pinned version at impl time. If
  `upstream` re-wraps, fall back to `regular` mode plus a chained
  proxy directive routing plain HTTP to pipelock.
- **Pipelock plain-HTTP scanning coverage.** Pipelock's
  `forward_proxy.enabled: true` accepts both `GET http://…` and
  `CONNECT host:443`. Confirm by reading
  `github.com/luckyPipewrench/pipelock/blob/main/docs/configuration.md`
  that the full DLP / MCP / subdomain-entropy pipeline runs on the
  HTTP path; some pipelock layers may be gated on CONNECT only.
- **CA installation in the Anthropic-provided Claude Code image.**
  The base image determines whether `update-ca-certificates`
  (Debian) or `update-ca-trust` (Red Hat) applies. Confirm against
  the `Dockerfile` before writing the provisioner; v1 assumes
  Debian (`node:22-slim`).
- **HTTP/2 ALPN end-to-end.** Node's HTTP client negotiates `h2`
  via ALPN. Confirm the pinned mitmproxy version speaks `h2` to
  both halves without silently downgrading to `http/1.1`, which
  would be a noticeable performance regression on bulk transfers.
- **Selective-bump policy surface.** Where does the
  "tunnel this hostname blindly" decision live when (not if) a
  pinning host appears? Recommended shape per research:
  `bottle.egress.tls_bump_ignore: ["example.com"]`, a list of
  hostnames mitmproxy passes through via `ignore_hosts`. Defer
  until needed; record the shape so the follow-up is mechanical.
- **CA generation: Python `cryptography` vs. shelled-out
  `openssl`.** Adding `cryptography` brings a substantial transitive
  graph; shelling to `openssl` keeps the host-side prepare step
  dep-light. Decide at impl time based on what's already on the
  runner. Either way, the CA is per-bottle and ephemeral.
- **Domain-fronting verification.** Once pipelock sees the inner
  `Host` / `:authority`, comparing it to the outer `CONNECT` target
  catches domain fronting. Whether pipelock has a rule for this or
  we need to add one is a follow-up; out of scope here.
- **Dry-run preflight rendering of the CA.** Show the fingerprint
  but never the private key. Confirm the exact dry-run JSON shape
  during implementation; the field set is part of the CLI's
  user-facing contract (per PRD 0003 §to_dict notes).

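For the last item, the fingerprint-only rendering could be computed from the public cert alone, using just the standard library. This is a sketch under assumptions: the helper names and the two-field dict are hypothetical, and the real dry-run JSON shape is still to be confirmed as noted above.

```python
import hashlib
import ssl


def ca_fingerprint(cert_pem: str) -> str:
    """SHA-256 over the cert's DER bytes, as colon-separated hex."""
    der = ssl.PEM_cert_to_DER_cert(cert_pem)
    digest = hashlib.sha256(der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def mitmproxy_dry_run_entry(name: str, cert_pem: str) -> dict[str, str]:
    """Hypothetical JSON entry: public fingerprint only, no key material."""
    return {"sidecar": name, "ca_sha256": ca_fingerprint(cert_pem)}
```

Since the function only ever receives the `.crt` contents, the private key cannot leak into the preflight output by construction. An operator could compare the value against `openssl x509 -noout -fingerprint -sha256` run on the cert inside the bottle.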
## References

- `docs/research/tls-mitm-for-pipelock.md`: primary source; this
  PRD implements the recommendation in §Recommendation (Topology A).
- `docs/research/pipelock-assessment.md` §Scope gaps: names the
  TLS-inspection gap closed here.
- `docs/prds/0001-per-agent-egress-proxy-via-pipelock.md`:
  egress-proxy baseline this PRD extends.
- `docs/prds/0003-bottle-backend-abstraction.md`: backend ABC
  contract this PRD adds a `provision_ca` method to.
- `docs/prds/0004-split-out-provisioners.md`: per-provisioner
  module pattern reused for the new CA provisioner.
- mitmproxy: <https://mitmproxy.org>,
  <https://github.com/mitmproxy/mitmproxy>
- mitmproxy upstream proxy mode:
  <https://docs.mitmproxy.org/stable/concepts/modes/#upstream-proxy>
- mitmproxy CA cert installation:
  <https://docs.mitmproxy.org/stable/concepts/certificates/>
- Node `NODE_EXTRA_CA_CERTS`:
  <https://nodejs.org/api/cli.html#node_extra_ca_certsfile>