docs(prd): update 0005 after open-question walkthrough

Re-grounds the design after walking the eight original open questions
interactively. Two structural changes:

- Topology A → A'. A spike confirmed mitmproxy's `upstream` mode
  re-wraps decrypted flows in a new CONNECT to the upstream proxy,
  which would have left pipelock seeing only ciphertext (the very gap
  this PRD set out to close). The fix is to run mitmproxy in `regular`
  mode and ship a vendored Python addon that forwards each decrypted
  request to pipelock as a plain HTTP forward-proxy call. Pipelock is
  unchanged.
- mitmproxy owns CA generation. The research note's preference for a
  host-side openssl / cryptography CA turned out to be unnecessary —
  mitmproxy generates a fresh CA on startup; the public cert is
  `docker cp`'d into the agent. No new host-side crypto deps. Dry-run
  can't render a fingerprint (CA doesn't exist yet); launches print it
  once to stderr.

Other Q3–Q8 resolutions folded in: Debian-base `update-ca-certificates`
confirmed, mitmproxy 12 verified to speak h2 on both halves,
selective-bump deferred to v2, response-body and MCP scanning deferred
to v2, domain-fronting deferred to v2.

Open questions rewritten — what remains is addon-implementation
specifics (pipelock 403-body fingerprint, env-var inheritance through
docker exec, addon test fixtures).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -1,20 +1,30 @@
# PRD 0005: mitmproxy TLS interception for pipelock content scanning

- **Status:** Draft
- **Status:** Draft (updated 2026-05-12 after open-question walkthrough)
- **Author:** didericis
- **Created:** 2026-05-12

## Summary

Add a per-bottle **mitmproxy** sidecar in front of pipelock on the
egress path so pipelock's DLP, subdomain-entropy, and MCP scanners
fire on the plaintext bodies of HTTPS requests instead of only the
opaque ciphertext that follows a `CONNECT`. mitmproxy terminates the
agent's TLS, hands plaintext HTTP to pipelock as an upstream
forward proxy, and re-establishes TLS to the real destination. A
fresh ephemeral CA is minted per bottle; the CA private key never
leaves the sidecar, and the public cert is wired into the agent
container's trust store at launch.
egress path. mitmproxy bumps the agent's TLS CONNECT, decrypts the
inner HTTP, and hands each request to a vendored Python addon. The
addon forwards the decrypted request to pipelock as a plain HTTP
forward-proxy call so pipelock's DLP, URL-scan, and header-scan
layers fire on real bodies. On the verdict, the addon either
short-circuits the flow with a 403 (block) or lets mitmproxy
proceed to the real upstream (allow). mitmproxy itself generates
the ephemeral per-bottle CA on startup; the public cert is copied
into the agent's trust store and the private key dies with the
sidecar on teardown.

This is Topology A' from `docs/research/tls-mitm-for-pipelock.md` —
a variant of the research note's Topology A after a spike showed
mitmproxy's `upstream` mode re-wraps decrypted flows in a new
CONNECT to the upstream proxy (which would defeat the entire
point). The addon recovers the design by emitting plain HTTP to
pipelock explicitly instead of relying on mitmproxy's `upstream`
chaining.

## Problem

@@ -45,7 +55,8 @@ slips past the scanner.
`pipelock-assessment.md` §Scope gaps names this as a known
limitation of the proxy-without-TLS-inspection shape. Closing it is
the explicit motivation for `tls-mitm-for-pipelock.md`, whose
recommendation this PRD implements.
recommendation this PRD implements (with the addon adjustment
forced by the upstream-mode spike).

## Goals / Success Criteria

@@ -53,306 +64,361 @@ The feature works when all of the following are observable:

- A Node request from inside a launched bottle to a CONNECT-bumped
  HTTPS host (e.g. `https://api.anthropic.com/dlp-probe`) carrying a
  pipelock-recognized credential pattern in the body returns 403 from
  the proxy, not a response from the upstream. The existing
  `test_pipelock_blocks_secret_post` test path becomes the HTTPS
  variant of this assertion.
  pipelock-recognized credential pattern in the body returns 403
  from the bottle's egress chain — not a response from the upstream.
  The existing `test_pipelock_blocks_secret_post` test path becomes
  the HTTPS variant of this assertion.
- A plain HTTPS GET from inside the bottle to an allowlisted host
  with no credential pattern (e.g. `GET https://raw.githubusercontent.com/...`)
  returns the real upstream response — the addon doesn't break
  clean traffic.
- Claude Code itself reaches `api.anthropic.com` end-to-end through
  the bottle and completes a chat round-trip. No TLS-trust errors
  in the agent process.
- mitmproxy's TLS-handshake log lines and pipelock's `body_dlp`
  event lines both appear for the same outbound request, confirming
  the two-stage path is active.
- mitmproxy's flow log and pipelock's `body_dlp` / `header_dlp` /
  `core_dlp` event lines both appear for the same outbound request,
  confirming the two-stage path is active.

The feature is **done** when all of the following ship:

- A new `MitmproxyProxy` class with the same `prepare` / `start` /
  `stop` lifecycle shape as `PipelockProxy`, wired into the Docker
  backend's launch step.
- The bottle launch step generates a per-bottle ephemeral CA in
  `stage_dir`, starts the mitmproxy sidecar with that CA on the
  per-bottle internal network, copies the CA public cert into the
  agent container's trust store, and points the agent's
  `HTTPS_PROXY` / `HTTP_PROXY` at mitmproxy.
- mitmproxy's upstream is the existing pipelock sidecar; pipelock
  sees plaintext HTTP from mitmproxy for every previously-HTTPS
  request.
- A vendored Python addon at `claude_bottle/mitmproxy/addon.py`
  that mitmproxy loads on startup via `mitmdump -s ...`. The sidecar
  runs in `regular` mode (default), not `upstream` mode.
- The bottle launch step starts the mitmproxy sidecar, waits for
  the sidecar-internal CA to be generated, copies the CA public
  cert into the agent at `/usr/local/share/ca-certificates/claude-bottle-mitm.crt`,
  runs `update-ca-certificates` inside the agent, and threads the
  `NODE_EXTRA_CA_CERTS` / `SSL_CERT_FILE` / `REQUESTS_CA_BUNDLE`
  env trio onto the agent container's runtime env.
- The agent's `HTTPS_PROXY` / `HTTP_PROXY` point at the mitmproxy
  sidecar (where they pointed at pipelock under PRD 0001).
- pipelock is otherwise unchanged. It continues to load the YAML
  PRD 0001 generates and runs its existing scanning pipeline; the
  addon talks to it via the same forward-proxy interface today's
  `test_pipelock_blocks_secret_post` uses.
- On bottle teardown the mitmproxy sidecar is removed and the
  ephemeral CA private key is gone with it.
- An integration test (variant of `test_pipelock_blocks_secret_post`)
  proves pipelock now blocks a credential POST that goes out over
  HTTPS rather than plain HTTP.
- An integration test proves a non-credential HTTPS request to an
  allowlisted host (e.g. CONNECT-then-GET on `raw.githubusercontent.com`)
  succeeds end-to-end with mitmproxy in the path (no TLS-trust
  errors, response body received).
- An HTTPS variant of `test_pipelock_blocks_secret_post` proves
  pipelock now blocks a credential POST over HTTPS rather than
  plain HTTP.
- An integration test proves a non-credential HTTPS GET through
  the chain returns the upstream's real response.
- The dry-run preflight (`start --dry-run`) shows the mitmproxy
  sidecar in both the text and `--format=json` output alongside the
  existing pipelock entry.
  sidecar in both text and `--format=json` output. The JSON
  contract gains a reserved `egress.mitm: { "enabled": true, "ca_fingerprint": null }`
  block; fingerprint is always null at dry-run because the CA
  doesn't exist yet. Real launches emit a one-line stderr log:
  `claude-bottle: mitm ca fingerprint: <sha256-first-16>...`.

## Non-goals

- **Topology C** — extending pipelock itself to terminate TLS. That
  is the cleanest long-term shape per the research note's
  recommendation but is substantial Go work and hits the
  Apache-2.0-vs-ELv2 question. Deferred.
- **Topology D** — driving mitmproxy with a pipelock `/scan` HTTP
  endpoint. Requires a pipelock surface that doesn't exist today.
  Deferred.
- **Topology C** — extending pipelock itself to terminate TLS. The
  research note's recommended long-term shape, but substantial Go
  work plus the Apache-2.0-vs-ELv2 question. Deferred.
- **Topology D as canonical** — mitmproxy with a pipelock `/scan`
  HTTP endpoint. The addon in this PRD talks to pipelock via its
  existing forward-proxy interface; no upstream pipelock change
  needed.
- **Persistent or shared CA across bottles.** Each bottle gets a
  fresh CA generated at start and destroyed at teardown. No CA
  storage on the host, no cross-bottle reuse.
  fresh CA generated by its own mitmproxy at startup.
- **Selective bumping ("ignore_hosts") as a v1 manifest field.**
  v1 bumps every CONNECT. If a future allowlisted host turns out to
  pin (Mobile / Chromium-style cert pinning), a follow-up PRD adds
  the per-host opt-out — likely a `bottle.egress.tls_bump_ignore`
  field. See Open questions.
  v1 bumps every CONNECT. If a future allowlisted host turns out
  to pin (Mobile / Chromium-style cert pinning), a follow-up PRD
  adds the per-host opt-out via `bottle.egress.tls_bump_ignore`.
  Strictly additive.
- **HTTP/3 / QUIC.** mitmproxy's HTTP/3 support is experimental.
  v1 relies on the v1-egress iptables layer (separate PRD) blocking
  UDP/443 to force clients onto HTTP/2 over TCP, which mitmproxy
  inspects normally.
  v1 relies on the v1-egress iptables layer blocking UDP/443 to
  force clients onto HTTP/2 over TCP, which mitmproxy 12 inspects
  natively (verified by spike).
- **Raw TCP / non-HTTP TLS interception.** mitmproxy supports it
  via `--mode reverse:`, not in CONNECT-bump mode. SSH and any
  future raw-TCP egress route around mitmproxy entirely.
- **Trust-store rewiring for non-Debian agent base images.** The
- **Trust-store rewiring for non-Debian agent images.** The
  current `Dockerfile` is `node:22-slim` (Debian). If a future base
  switches to Red-Hat-family, the `update-ca-certificates` step
  becomes `update-ca-trust`. Out of scope until the base changes.
- **Response-body scanning.** Pipelock supports it; we don't wire
  it in v1 because the addon would need to ferry the upstream
  response back through pipelock's scanner, which the
  forward-proxy interface doesn't support cleanly. v2 candidate.
- **MCP scanning on the bumped path.** Only fires on MCP-formatted
  JSON-RPC payloads inside tool calls. Not relevant to plain HTTPS
  agent traffic and out of v1 scope.
- **Domain-fronting verification.** Once the addon sees the inner
  `Host` / `:authority`, comparing it to the outer CONNECT target
  catches domain fronting. Worth ~10 lines in the addon, but
  defer until the rest of v1 is settled.
- **Host-side openssl / `cryptography` for CA generation.** The
  research note's open question on this is resolved by letting
  mitmproxy itself generate the CA (it does so on first launch).
  No new host-side crypto.

## Scope

### In scope

- New `claude_bottle/mitmproxy.py` mirroring `claude_bottle/pipelock.py`:
  config helpers (no backend-specific Docker calls), the
  `MitmproxyProxy` abstract class, and the per-bottle CA generation
  helpers.
- New `claude_bottle/backend/docker/mitmproxy.py` mirroring
  `claude_bottle/backend/docker/pipelock.py`: `DockerMitmproxyProxy`
  with the Docker-specific `start` / `stop` lifecycle, the sidecar
  container name scheme, and the image pin.
- New provisioner: `claude_bottle/backend/docker/provision/ca.py`,
  installing the CA public cert into the agent container at
  `/usr/local/share/ca-certificates/claude-bottle-mitm.crt`, running
  `update-ca-certificates`, and exporting `NODE_EXTRA_CA_CERTS` /
  `SSL_CERT_FILE` / `REQUESTS_CA_BUNDLE` env vars to the agent
  process. The provisioner runs from `BottleBackend.provision` in
  the same orchestration as `prompt`, `skills`, `ssh`, `git`.
- Per-agent network reshuffle in `DockerBottleBackend.launch`:
  - internal network is unchanged (mitmproxy + pipelock + agent)
  - agent's `HTTPS_PROXY` / `HTTP_PROXY` change from pointing at the
    pipelock service name to the mitmproxy service name
  - mitmproxy's `upstream_proxy` config points at the pipelock
    service name on the internal network
- `DockerBottlePlan` grows a `mitmproxy_plan` field analogous to the
  existing `proxy_plan` (the pipelock one) so prepare-time state
  rides on the plan.
- Dry-run preflight (`start --dry-run` text + JSON) renders the
  mitmproxy line and surfaces the CA fingerprint shown in the
  bottle's trust store, so the operator can verify what's been
  installed.
- New `claude_bottle/mitmproxy/` package:
  - `__init__.py` — backend-agnostic. Constants (sidecar port,
    image-pin digest, the in-container addon path), the abstract
    `MitmproxyProxy` class with `prepare` / `start` / `stop` shape
    mirroring `PipelockProxy`, and the small helper that reads the
    CA fingerprint from a PEM file via `openssl x509 -fingerprint`
    shelled out.
  - `addon.py` — the Python addon mitmproxy loads. ~80–150 lines.
    For each `request` event: forward the decrypted request to
    pipelock at `http://claude-bottle-pipelock-<slug>:8888` as a
    plain HTTP forward-proxy call (absolute-URI form). Inspect
    pipelock's response. If status is 403 *and* the body matches
    pipelock's known block-event shape, set the flow's response to
    a 403 with pipelock's body and short-circuit. Otherwise,
    discard pipelock's response (and any wasted upstream-leg
    response from pipelock's forwarder) and let mitmproxy proceed
    to the real upstream.
- New `claude_bottle/backend/docker/mitmproxy.py` —
  `DockerMitmproxyProxy(MitmproxyProxy)` with the Docker-specific
  start/stop lifecycle. `start(plan)` does `docker create` /
  `docker cp addon.py …` / `docker network connect` / `docker start`,
  analogous to the existing `DockerPipelockProxy.start`. Injects
  `CLAUDE_BOTTLE_PIPELOCK_URL` into the sidecar env so the addon
  knows where pipelock lives.
- New provisioner `claude_bottle/backend/docker/provision/ca.py`.
  Polls mitmproxy for the cert file, copies it through a host
  stage dir into the agent, runs `update-ca-certificates` inside
  the agent, computes the SHA-256 fingerprint, and prints the
  one-line stderr log.
- `BottleBackend.provision_ca(plan, target)` joins the four
  existing provisioner methods on the abstract base. Default impl
  is no-op so other backends don't break when they don't yet
  implement TLS interception.
- `DockerBottlePlan` grows a `mitmproxy_plan` field mirroring the
  existing `proxy_plan`.
- Agent container `docker run` invocation:
  - `HTTPS_PROXY` / `HTTP_PROXY` change from the pipelock service
    name to the mitmproxy service name.
  - Three `-e` flags set the CA env trio so they're inherited by
    the eventual `docker exec claude` (Docker propagates run-time
    env into exec by default; fallback in Q1 below).
- Dry-run preflight rendering of the mitmproxy entry (text + JSON).
  JSON gains `egress.mitm: { "enabled": true, "ca_fingerprint": null }`.
- One stderr log line at launch with the CA fingerprint.
- Two new integration tests under `tests/integration/`:
  - `test_mitmproxy_blocks_secret_https_post.py` — the HTTPS
    variant of the existing `_blocks_secret_post` test.
  - `test_mitmproxy_blocks_secret_https_post.py` — HTTPS variant
    of the existing block-secret test. Asserts pipelock's body
    DLP fires on a credential POST tunneled through CONNECT.
  - `test_mitmproxy_allows_normal_https.py` — confirms a plain
    HTTPS GET to a non-credential-bearing path through mitmproxy +
    pipelock returns the upstream response, asserting no trust /
    handshake breakage.
- Unit tests for the new config builder (mirroring the pipelock
  YAML unit tests) and for the CA generation helper.
    HTTPS GET on an allowlisted host returns the upstream response,
    isolating the addon's pass-through path from the block path.
- Unit tests for the addon's verdict logic (block vs allow on
  status + body shape, edge cases) using mitmproxy's `mitmproxy.test`
  flow fixtures. Unit tests for the proxy config builder
  (mirroring `tests/unit/test_pipelock_yaml.py`).
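
The agent `docker run` env wiring listed above can be sketched as a small flag builder. This is only an illustration under stated assumptions: `agent_env_flags` is a hypothetical helper name, the port is a caller-supplied parameter (the PRD leaves it as `<port>`), and `NODE_EXTRA_CA_CERTS` is assumed to point at the installed cert path named elsewhere in this PRD.

```python
from typing import List

# Path where the PRD installs the CA public cert inside the agent.
CA_CERT_PATH = "/usr/local/share/ca-certificates/claude-bottle-mitm.crt"
# Debian's merged bundle, as named in the PRD's CA lifecycle section.
CA_BUNDLE_PATH = "/etc/ssl/certs/ca-certificates.crt"


def agent_env_flags(slug: str, mitm_port: int) -> List[str]:
    """Build the `-e` flags for the agent's `docker run` (hypothetical helper)."""
    # Both proxy vars point at the mitmproxy sidecar, not at pipelock.
    proxy_url = f"http://claude-bottle-mitm-{slug}:{mitm_port}"
    env = {
        "HTTPS_PROXY": proxy_url,
        "HTTP_PROXY": proxy_url,
        # The CA env trio, for libraries that ignore the system trust store.
        "NODE_EXTRA_CA_CERTS": CA_CERT_PATH,
        "SSL_CERT_FILE": CA_BUNDLE_PATH,
        "REQUESTS_CA_BUNDLE": CA_BUNDLE_PATH,
    }
    flags: List[str] = []
    for name, value in env.items():
        flags += ["-e", f"{name}={value}"]
    return flags
```

Returning an argv fragment rather than a shell string keeps the values free of quoting concerns when they are spliced into a `subprocess` call.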

### Out of scope

- The v1 iptables + dnsmasq layer (separate PRD; see
  `network-egress-guard.md`). mitmproxy covers HTTP/HTTPS only.
  Raw TCP, UDP, ICMP, and direct DNS still need the IP-level layer.
- Pipelock config changes. Pipelock continues to load the YAML PRD
  0001 already generates. mitmproxy is opaque to it; pipelock just
  sees plain HTTP from a forward-proxy client.
- A bottle-level toggle to skip mitmproxy entirely. v1 always wires
  it in. If a use case appears for an unintercepted bottle
  (e.g. testing pipelock's CONNECT-mode behavior in isolation),
  that's a follow-up.
  `network-egress-guard.md`). mitmproxy covers HTTP/HTTPS only;
  raw TCP, UDP, ICMP, and direct DNS still need the IP-level layer.
- Pipelock config changes. Pipelock continues to load the YAML
  PRD 0001 generates; the addon talks to it via the existing
  forward-proxy interface.
- A bottle-level toggle to skip mitmproxy entirely. v1 always
  wires it in.
- Pinning-host detection automation. The cost of finding out (per
  the research note) is a single 5-minute test before adding a
  host; it stays a manual step.
  research) is a single 5-minute test before adding a host; it
  stays a manual step.
- Pipelock upstream contributions for an `X-Pipelock-Verdict` header.
  Possible follow-up. Until then the addon distinguishes blocks
  from passes via status + body fingerprint.

## Proposed Design

### Topology

```
agent --HTTPS_PROXY--> mitmproxy --HTTP_PROXY--> pipelock --> internet
                       (bump TLS)              (scan plain)  (real TLS)
agent --HTTPS_PROXY--> mitmproxy --addon--> pipelock (scan)
                       (bump TLS)              |
                           ^                   | (verdict via status code)
                           |                   v
                           +-- on allow ----- real upstream
                                              (mitmproxy as client)
```

All three containers live on the same per-bottle internal Docker
network. mitmproxy and pipelock are both attached to the per-bottle
egress bridge so they can reach the host network; the agent has no
default route, exactly as today.
egress bridge for real-internet reach; the agent has no default
route.

Concretely:

- `agent` sets `HTTPS_PROXY=http://claude-bottle-mitm-<slug>:<port>`.
  Currently this points at `claude-bottle-pipelock-<slug>`. The
  hostname swap is the only agent-side env change.
- `mitmproxy` runs with `--mode upstream:http://claude-bottle-pipelock-<slug>:<pipelock-port>`
  so its decrypted plaintext is forwarded to pipelock as a regular
  upstream forward-proxy request. (Research open question #1 calls
  this out: mitmproxy 10+ documentation says `upstream` mode forwards
  the original request shape; verify against the pinned version at
  implementation time. If forwarding wraps a new CONNECT, fall back
  to `regular` mode with a chained proxy declared in mitmproxy's
  config and route plain HTTP to pipelock by hand.)
- `pipelock` continues to listen on its existing port and receives
  plain HTTP from mitmproxy. No pipelock config change.
- Agent sets `HTTPS_PROXY=http://claude-bottle-mitm-<slug>:<port>`.
  PRD 0001 had this pointing at pipelock; the hostname swap is the
  only agent-side env change.
- mitmproxy runs in **`regular`** mode (default; no `--mode` flag).
  It bumps every CONNECT, generates fake leaf certs signed by its
  own CA, and presents them to the agent.
- The addon, loaded via `mitmdump -s /addon/addon.py`, intercepts
  each decrypted `request` event. It forwards the request to
  pipelock at `http://claude-bottle-pipelock-<slug>:8888` as a
  plain HTTP forward-proxy call (absolute-URI form), so pipelock
  sees the full URL, headers, and body.
- The addon inspects pipelock's response. If status is 403 *and*
  the response body matches pipelock's known block-event shape,
  the addon sets the mitmproxy flow's response to a 403 with
  pipelock's body and short-circuits. Otherwise — including the
  case where pipelock's forwarder attempted the upstream and got
  a 4xx — the addon discards pipelock's response and lets
  mitmproxy proceed to the real upstream.
- mitmproxy completes the outbound TLS to the real destination
  using its built-in trust store, just like any other forward
  proxy. Pipelock is only involved as a scanner.
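
To make "absolute-URI form" concrete, here is a minimal sketch of how the addon might shape the scan call to pipelock. `build_scan_request` is a hypothetical helper, not part of the PRD; the `http` default for the scheme is an assumption based on the PRD's note that pipelock attempts its own forward over plain HTTP.

```python
from typing import Dict, Tuple
from urllib.parse import urlsplit


def build_scan_request(
    pipelock_url: str,
    host: str,
    path: str,
    headers: Dict[str, str],
    scheme: str = "http",  # assumption: pipelock is addressed as a plain-HTTP proxy
) -> Tuple[str, int, str, Dict[str, str]]:
    """Return (proxy_host, proxy_port, request_target, headers) for the scan call."""
    proxy = urlsplit(pipelock_url)
    if proxy.hostname is None or proxy.port is None:
        raise ValueError(f"pipelock URL needs a host and port: {pipelock_url!r}")
    # Absolute-URI form: the request line carries the full URL, not just the
    # path, which is how a forward proxy learns the real destination.
    target = f"{scheme}://{host}{path}"
    out_headers = dict(headers)  # copy so the caller's flow headers stay untouched
    out_headers["Host"] = host
    return proxy.hostname, proxy.port, target, out_headers
```

With the standard library, the result could be sent via `http.client.HTTPConnection(proxy_host, proxy_port).request(method, target, body=body, headers=out_headers)`, since `http.client` places the target string on the request line as given.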

The trade-off: pipelock makes a wasted upstream forward attempt
for every allowed request (it tries to forward over plain HTTP to
a real HTTPS-only host, which fails with the upstream's 4xx). This
is benign — the scan completes before forwarding, the verdict
reaches the addon, the upstream-side request happens to die in
pipelock's forwarder rather than reach the agent. Acceptable cost
for the visibility win. A pipelock-side improvement (skip the
forward when the addon only needs the scan verdict) is a future
optimization.
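
The block-versus-allow decision described above (403 *and* a recognizable block body, everything else falls through) can be sketched as a pure function. Pipelock's actual 403 body shape is still an open question in this PRD, so the JSON key below is a placeholder assumption, and `is_pipelock_block` / `verdict` are hypothetical names.

```python
import json

# HYPOTHETICAL marker: stands in for pipelock's real block-event shape,
# which this PRD lists as an open question.
BLOCK_MARKER_KEY = "blocked"


def is_pipelock_block(status: int, body: bytes) -> bool:
    """True only for a 403 whose body looks like a pipelock block event."""
    if status != 403:
        return False
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and invalid JSON
        return False
    return isinstance(payload, dict) and payload.get(BLOCK_MARKER_KEY) is True


def verdict(status: int, body: bytes) -> str:
    """Map pipelock's response to the addon's decision."""
    # Anything that is not a recognized block, including a 4xx relayed from
    # pipelock's wasted upstream attempt, is treated as allow.
    return "block" if is_pipelock_block(status, body) else "allow"
```

Keeping the decision free of mitmproxy types makes it unit-testable on its own; in the addon, a `"block"` result would presumably be turned into a short-circuit response (for example via `mitmproxy.http.Response.make(403, body)`), while `"allow"` leaves the flow untouched.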

### New components

Two new modules, matching PRD 0001's split between
backend-agnostic config and backend-specific lifecycle:

- **`claude_bottle/mitmproxy.py`** — backend-agnostic. The config
  builder (mitmproxy YAML / TOML — confirm format), the abstract
  `MitmproxyProxy` class with `prepare(...)` writing the config and
  the ephemeral CA into `stage_dir`, the CA generation helper
  (RSA-2048 or ECDSA-P256 — pick at impl time, research suggests
  ECDSA for cert-gen speed), and constants for the sidecar's
  internal-network port and image pin.
- **`claude_bottle/backend/docker/mitmproxy.py`** — Docker
  implementation. `DockerMitmproxyProxy(MitmproxyProxy)` with
  `start(plan)` doing `docker create` / `docker cp` /
  `docker network connect` / `docker start` analogous to
  `DockerPipelockProxy.start`. `stop(target)` removes the sidecar
  idempotently.

The provisioner that installs the CA cert into the agent's trust
store lives at `claude_bottle/backend/docker/provision/ca.py` and
plugs into the existing `BottleBackend.provision` orchestration. The
abstract `BottleBackend.provision_ca` method joins
`provision_prompt` / `provision_skills` / `provision_ssh` /
`provision_git` on the base class (PRD 0004's pattern), with a
default no-op implementation so other backends don't break when
they don't yet implement it.
- `claude_bottle/mitmproxy/__init__.py` — backend-agnostic
  abstract base, constants, the `openssl x509 -fingerprint` helper.
- `claude_bottle/mitmproxy/addon.py` — the scanning addon.
  Reads pipelock's URL from `CLAUDE_BOTTLE_PIPELOCK_URL` (injected
  into the sidecar env by the proxy's `start`). For each
  `request` flow: synchronously POST to pipelock; inspect status +
  body; either short-circuit with 403 or fall through.
- `claude_bottle/backend/docker/mitmproxy.py` —
  `DockerMitmproxyProxy(MitmproxyProxy)` with start/stop, the
  `docker cp` of the addon into the sidecar before `docker start`,
  and the `CLAUDE_BOTTLE_PIPELOCK_URL` wiring.

### CA lifecycle

Per `tls-mitm-for-pipelock.md` §CA lifecycle:
Simplified by letting mitmproxy own the generation:

- **Generation.** Host-side in `MitmproxyProxy.prepare`, written to
  `stage_dir/mitm-ca.key` (mode 600) and `stage_dir/mitm-ca.crt`
  (mode 644). The `.key` is copied into the mitmproxy container at
  start; nothing else touches it.
- **Bottle injection.** `provision_ca` copies only the public
  `.crt` into the agent container at
  `/usr/local/share/ca-certificates/claude-bottle-mitm.crt`, runs
  `update-ca-certificates` as root inside the container, and sets
  `NODE_EXTRA_CA_CERTS=/usr/local/share/ca-certificates/claude-bottle-mitm.crt`,
  `SSL_CERT_FILE`, and `REQUESTS_CA_BUNDLE` for the agent process.
  Belt-and-suspenders because some libraries honor only env vars.
- **Teardown.** The mitmproxy sidecar container is destroyed; the
  CA key vanishes with it. Nothing persists on the host outside
  `stage_dir`, which the start command already deletes in its
  finally block.
- **Cost.** ECDSA-P256 CA + per-host leaf generation runs in
  milliseconds; the per-bottle Docker pull and network plumbing
  dominate startup time.
- **Generation.** mitmproxy generates a fresh CA on startup
  inside its container at `/home/mitmproxy/.mitmproxy/mitmproxy-ca-cert.pem`
  (public) + `mitmproxy-ca.pem` (private). No host-side openssl
  for *generation*; no host-side Python `cryptography` dep.
- **Volume strategy.** Container-internal only. No host bind
  mount means the CA dies with the container.
- **Extraction.** `provision_ca` polls (~1s) for the cert file
  via `docker exec`, then `docker cp` to host stage dir, then
  `docker cp` into the agent. Host stage dir gets cleaned up by
  the existing `start.py` `finally` block.
- **Bottle install.**
  1. `docker cp <host stage>/mitm-ca.crt agent-<slug>:/usr/local/share/ca-certificates/claude-bottle-mitm.crt`
  2. `docker exec -u 0 agent-<slug> chmod 644 …`
  3. `docker exec -u 0 agent-<slug> update-ca-certificates`
  4. Three `-e` flags on `docker run` set the env trio
     (`NODE_EXTRA_CA_CERTS=…/claude-bottle-mitm.crt`,
     `SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt`,
     `REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt`) so
     `docker exec claude` inherits them.
- **Teardown.** Sidecar container removed; CA private key gone.
- **Fingerprint.** Computed post-extraction via shelled-out
  `openssl x509 -fingerprint -sha256 -noout`. Logged once to
  stderr at launch; never the private key.
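
The first three bottle-install steps above can be sketched as argv lists for `subprocess.run`. This is an illustrative sketch only: `ca_install_commands` is a hypothetical helper, and the `chmod` target (elided as `…` in step 2) is assumed here to be the cert path from step 1.

```python
from typing import List

# Destination path inside the agent, as given in step 1.
AGENT_CERT_PATH = "/usr/local/share/ca-certificates/claude-bottle-mitm.crt"


def ca_install_commands(slug: str, staged_cert: str) -> List[List[str]]:
    """Return the docker commands that install the staged CA cert (hypothetical)."""
    agent = f"agent-{slug}"
    return [
        # 1. Copy the public cert from the host stage dir into the agent.
        ["docker", "cp", staged_cert, f"{agent}:{AGENT_CERT_PATH}"],
        # 2. Make it world-readable. ASSUMPTION: the elided chmod target
        #    is the cert path copied in step 1.
        ["docker", "exec", "-u", "0", agent, "chmod", "644", AGENT_CERT_PATH],
        # 3. Rebuild the Debian trust bundle as root.
        ["docker", "exec", "-u", "0", agent, "update-ca-certificates"],
    ]
```

A caller would run each entry in order with `subprocess.run(cmd, check=True)` so a failed step aborts the launch before the agent starts with a broken trust store.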

### Data model changes

None in v1. The manifest schema is unchanged. mitmproxy is always
on for every bottle once this PRD ships.
None to the manifest schema. The dry-run JSON contract gains a
reserved `egress.mitm: { "enabled": true, "ca_fingerprint": null }`
block. Fingerprint is always null at dry-run (CA doesn't exist
yet) but the field is reserved so future schema additions stay
non-breaking.
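
As a rough illustration of the reserved block, the sketch below builds the `egress.mitm` fragment and serializes it. The nesting under a top-level `egress` object is inferred from the dotted name, and `mitm_dry_run_block` is a hypothetical helper rather than anything `to_dict()` is known to contain.

```python
import json
from typing import Any, Dict, Optional


def mitm_dry_run_block(ca_fingerprint: Optional[str] = None) -> Dict[str, Any]:
    """Build the reserved `egress.mitm` fragment of the dry-run JSON (hypothetical)."""
    # At dry-run the CA does not exist yet, so the fingerprint stays None,
    # which json.dumps renders as null.
    return {"egress": {"mitm": {"enabled": True, "ca_fingerprint": ca_fingerprint}}}


print(json.dumps(mitm_dry_run_block(), indent=2))
```

Emitting the key with an explicit `null` rather than omitting it lets consumers rely on the field being present in every dry-run output.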

A future selective-bump knob (per `tls-mitm-for-pipelock.md` open
question #5) would land on `bottle.egress.tls_bump_ignore` as a
list of hostnames. The shape mirrors `egress.allowlist`. Adding it
later is a strictly additive change.
A future selective-bump knob would add
`bottle.egress.tls_bump_ignore: [host, ...]` per the research
note. Strictly additive when it lands.

### Existing code touched

- **`claude_bottle/backend/docker/launch.py`** — bring up the
  mitmproxy sidecar after the pipelock sidecar but before the agent
  container, repoint the agent's `HTTPS_PROXY` / `HTTP_PROXY` env
  flags, register an `ExitStack` callback to stop mitmproxy on
  teardown.
  mitmproxy sidecar between pipelock and the agent. Repoint the
  agent's `HTTPS_PROXY` / `HTTP_PROXY` env flags to mitmproxy.
  Register an `ExitStack` callback for mitmproxy teardown. Print
  the CA fingerprint once the sidecar reports ready.
- **`claude_bottle/backend/docker/prepare.py`** — call into
  `MitmproxyProxy.prepare(...)` alongside the existing
  `PipelockProxy.prepare(...)`, populate
  `DockerBottlePlan.mitmproxy_plan`.
  `MitmproxyProxy.prepare(...)` alongside `PipelockProxy.prepare(...)`,
  populate `DockerBottlePlan.mitmproxy_plan`.
- **`claude_bottle/backend/docker/backend.py`** — add the
  `DockerMitmproxyProxy` instance attribute (`self._mitm`) and
  thread it through `launch` + cleanup, mirroring the existing
  `self._proxy` pattern.
  thread it through `launch` + cleanup, mirroring `self._proxy`.
- **`claude_bottle/backend/docker/bottle_plan.py`** — new
  `mitmproxy_plan: MitmproxyProxyPlan` field on
  `DockerBottlePlan`. `print()` and `to_dict()` learn to render it.
  `mitmproxy_plan` field. `print()` and `to_dict()` learn to
  render the mitmproxy entry and the `egress.mitm` JSON block.
- **`claude_bottle/backend/__init__.py`** — abstract
  `BottleBackend.provision_ca(plan, target)` joins the other four
  provisioners. Default impl is a no-op (so a future fly backend
  isn't forced to implement TLS interception in v1).
  `BottleBackend.provision_ca` joins the four existing
  provisioners; default no-op.
- **`tests/integration/`** — two new tests as described above.
- **`tests/unit/`** — config-builder unit tests; CA-helper unit
  tests; updated dry-run-plan test pinning the mitmproxy entry.
- **`tests/unit/`** — addon-verdict tests, mitmproxy-config
  builder tests, dry-run-plan test updated for the new
  `egress.mitm` block.
|
||||
### External dependencies
|
||||
|
||||
- **mitmproxy Docker image** pulled from
|
||||
`mitmproxy/mitmproxy@sha256:<digest>`. The digest is pinned in
|
||||
`claude_bottle/mitmproxy.py` and bumped deliberately, mirroring
|
||||
the pipelock pin. Tag line `mitmproxy/mitmproxy:11.x` per
|
||||
research §Image pin for mitmproxy.
|
||||
- No new host-side runtimes. CA generation uses Python's `cryptography`
|
||||
if it's already a transitive dep; otherwise use `openssl` shelled
|
||||
out from the host-side prepare step. Decide at impl time after
|
||||
confirming what's available on the runner without adding deps.
- **mitmproxy Docker image** pinned by digest on the `12.x` line.
Bumped deliberately, mirroring the pipelock pin. Verified by
spike to speak h2 on both halves.
- No new host-side runtimes. mitmproxy generates the CA;
fingerprint via the `openssl` already present on Debian / macOS
/ ubuntu-latest runners.
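
The fingerprint step could look roughly like the sketch below, assuming the public cert has already been copied out of the mitmproxy container to a local path. `openssl x509 -noout -fingerprint -sha256` is a standard invocation; the helper names and the cert path are hypothetical.

```python
import subprocess


def fingerprint_argv(cert_path: str) -> list[str]:
    """Build the openssl command that prints a cert's SHA-256 fingerprint."""
    return ["openssl", "x509", "-in", cert_path, "-noout", "-fingerprint", "-sha256"]


def parse_fingerprint(output: str) -> str:
    """Extract the colon-separated hex digest from openssl's output.

    openssl prints a single line of the form "<label> Fingerprint=AA:BB:...";
    the label's casing differs between openssl versions, so only the part
    after "=" is used.
    """
    _, sep, value = output.strip().partition("=")
    if not sep or not value:
        raise ValueError(f"unexpected openssl output: {output!r}")
    return value


def ca_fingerprint(cert_path: str) -> str:
    """Run openssl on the host and return the fingerprint string."""
    result = subprocess.run(
        fingerprint_argv(cert_path),
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_fingerprint(result.stdout)
```

Keeping argv construction and output parsing separate from the `subprocess` call means both can be unit-tested on runners without touching a real certificate.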

## Open questions

- **mitmproxy upstream-proxy mode mechanics.** Whether `upstream`
mode forwards decrypted plaintext to pipelock or re-wraps it in a
CONNECT. Documented behavior changed between mitmproxy 8 and 10.
Needs verification against the pinned version at impl time. If
`upstream` re-wraps, fall back to `regular` mode plus a chained
proxy directive routing plain HTTP to pipelock.
- **Pipelock plain-HTTP scanning coverage.** Pipelock's
`forward_proxy.enabled: true` accepts both `GET http://…` and
`CONNECT host:443`. Confirm by reading
`github.com/luckyPipewrench/pipelock/blob/main/docs/configuration.md`
that the full DLP / MCP / subdomain-entropy pipeline runs on the
HTTP path; some pipelock layers may be gated on CONNECT only.
- **CA installation in the Anthropic-provided Claude Code image.**
The base image determines whether `update-ca-certificates`
(Debian) or `update-ca-trust` (Red Hat) applies. Confirm against
the `Dockerfile` before writing the provisioner; v1 assumes
Debian (`node:22-slim`).
- **HTTP/2 ALPN end-to-end.** Node's HTTP client negotiates `h2`
via ALPN. Confirm the pinned mitmproxy version speaks `h2` to
both halves without silently downgrading to `http/1.1`, which
would be a noticeable performance regression on bulk transfers.
- **Selective-bump policy surface.** Where does the
"tunnel this hostname blindly" decision live when (not if) a
pinning host appears? Recommended shape per research:
`bottle.egress.tls_bump_ignore: ["example.com"]`, a list of
hostnames mitmproxy passes through via `ignore_hosts`. Defer
until needed; record the shape so the follow-up is mechanical.
- **CA generation: Python `cryptography` vs. shelled-out
`openssl`.** Adding `cryptography` brings a substantial transitive
graph; shelling to `openssl` keeps the host-side prepare step
dep-light. Decide at impl time based on what's already on the
runner. Either way, the CA is per-bottle and ephemeral.
- **Domain-fronting verification.** Once pipelock sees the inner
`Host` / `:authority`, comparing it to the outer `CONNECT` target
catches domain fronting. Whether pipelock has a rule for this or
we need to add one is a follow-up; out of scope here.
- **Dry-run preflight rendering of the CA.** Show the fingerprint
but never the private key. Confirm the exact dry-run JSON shape
during implementation; the field set is part of the CLI's
user-facing contract (per PRD 0003 §to_dict notes).
(rewritten — most of the original v1 questions are now closed by
the walkthrough spikes; what remains is addon-implementation
specifics worth pinning during the first impl turn.)

- **Pipelock's 403-body fingerprint.** The addon needs to
distinguish a pipelock block (DLP / host) from a real-upstream
4xx that pipelock's forwarder relayed back. Most likely shape:
pipelock's 403 response carries a JSON body with `event` /
`scanner` fields, whereas a real-upstream 4xx carries whatever
the upstream sent. Pin the exact fingerprint by inspecting
pipelock's actual 403 body bytes at impl time. Long-term
cleanup: file an upstream feature request for an
`X-Pipelock-Verdict: block` response header so the addon can
read a structured signal instead of pattern-matching the body.
- **Docker run env-var inheritance through docker exec.** Plan
assumes `docker run -e VAR=value` propagates to subsequent
`docker exec` invocations. The Docker docs say so; not yet
empirically pinned on this project's runner setup. Verify in
the first impl turn. Trivial fallback: thread the three `-e`
flags onto every `DockerBottle.exec*` call.
- **Addon synchronous-call latency.** The addon makes a sync HTTP
call to pipelock per outbound flow. Pipelock is on the same
internal Docker network; expected per-call latency is well
under 10ms. Confirm under the parallel-request load Claude Code
generates (most likely a non-issue — Claude is single-stream
request-wise).
- **Addon test fixtures.** mitmproxy ships `mitmproxy.test` with
flow fixtures; addons can be unit-tested without a running
proxy. Confirm the import path and recommended fixture shape at
impl time; structure the addon so the verdict-decision is a
pure function that's trivially testable in isolation from any
HTTP I/O.
- **Pipelock allowing the addon's forwarded request through.**
pipelock will see the addon's request as coming from the
mitmproxy sidecar's IP on the internal network. Confirm
pipelock has no client-IP allowlist that would reject these.
Likely fine — pipelock's `client_ip` is informational in the
scan event, not a gate.
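
The pure verdict function suggested under "Addon test fixtures" might be sketched as follows. It assumes the 403-body shape guessed at under "Pipelock's 403-body fingerprint" (a JSON object carrying both `event` and `scanner` keys); that shape is unverified, and the function name and key tuple are hypothetical until the real body bytes are inspected.

```python
import json

# Assumed fingerprint keys; to be pinned against pipelock's actual 403 body.
ASSUMED_PIPELOCK_KEYS = ("event", "scanner")


def is_pipelock_block(status_code: int, body: bytes) -> bool:
    """Return True if a response looks like a pipelock block verdict.

    Pure function: no mitmproxy imports and no HTTP I/O, so it can be
    unit-tested without a running proxy.
    """
    if status_code != 403:
        # Any other status is treated as relayed from the real upstream.
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes (both are ValueError
        # subclasses); a non-JSON 403 is assumed to be from the upstream.
        return False
    if not isinstance(payload, dict):
        return False
    return all(key in payload for key in ASSUMED_PIPELOCK_KEYS)
```

If an `X-Pipelock-Verdict` header ever lands upstream, only this function would need to change; the addon code that calls it could stay as it is.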

## References

- `docs/research/tls-mitm-for-pipelock.md` — primary source; this
PRD implements the recommendation in §Recommendation (Topology A).
- `docs/research/tls-mitm-for-pipelock.md` — primary source. This
PRD implements a variant of §Recommendation (Topology A) after
the spike documented under "Open questions" §1 falsified the
`upstream` mode assumption.
- `docs/research/pipelock-assessment.md` §Scope gaps — names the
TLS-inspection gap closed here.
- `docs/prds/0001-per-agent-egress-proxy-via-pipelock.md` —
@@ -363,9 +429,9 @@ later is a strictly additive change.
module pattern reused for the new CA provisioner.
- mitmproxy: <https://mitmproxy.org>,
<https://github.com/mitmproxy/mitmproxy>
- mitmproxy `upstream_proxy` mode:
<https://docs.mitmproxy.org/stable/concepts/modes/#upstream-proxy>
- mitmproxy modes: <https://docs.mitmproxy.org/stable/concepts/modes/>
- mitmproxy CA cert installation:
<https://docs.mitmproxy.org/stable/concepts/certificates/>
- mitmproxy addon API: <https://docs.mitmproxy.org/stable/addons-overview/>
- Node `NODE_EXTRA_CA_CERTS`:
<https://nodejs.org/api/cli.html#node_extra_ca_certsfile>