# PRD 0006: pipelock native TLS interception

- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-12

## Summary

Turn on pipelock's built-in `tls_interception` so its DLP / URL / header / MCP scanners fire on the plaintext of HTTPS requests instead of only the outer `CONNECT` hostname. Pipelock generates a per-bottle ephemeral CA at launch (`pipelock tls init`); the public cert is installed into the agent container's trust store and the private key dies with the sidecar on teardown. The existing per-agent sidecar topology from PRD 0001 is otherwise unchanged: one container, no addon, no second proxy.

This supersedes the closed PR #8 / branch `mitmproxy-tls-interception`, which built a mitmproxy + addon chain on the (falsified) premise that pipelock could not MITM.

Empirical proof from the impl-time spike: with `tls_interception: { enabled: true, ca_cert, ca_key }` in the pipelock config, pipelock answered a credential POST over HTTPS with `STATUS=403 / body: blocked: request body contains secret: GitHub Token` and emitted both `scanner:"tls_intercept"` and `scanner:"body_dlp"` events.

## Problem

PRD 0001 wired pipelock onto every bottle's egress, but pipelock ran with its default `tls_interception.enabled: false`. The agent container's only egress route is pipelock, but pipelock only saw `CONNECT` hostnames and the encrypted bytes inside the tunnel. Pipelock's headline scanners all need plaintext to fire: request body DLP (48 credential patterns), header DLP, URL DLP, subdomain entropy, MCP scanning, and response-body scanning. Against the HTTPS-only hosts in `DEFAULT_ALLOWLIST` (`api.anthropic.com`, `raw.githubusercontent.com`, etc.) they are effectively disabled.

The existing `tests/integration/test_pipelock_blocks_secret_post` test only fires because it forces the agent to send plain HTTP through pipelock's forward-proxy mode. Real Claude Code traffic uses HTTPS via CONNECT and slips past the scanner.
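
For concreteness, the config change that closes this gap is small. The following is a minimal sketch, not the actual `pipelock_build_config` implementation: the helper names `tls_interception_block` and `render_yaml` are hypothetical, the in-container paths are the ones named later in this PRD, and placing the block at the top level of the pipelock config is an assumption based on the spike description above.

```python
from typing import Any

# In-container CA paths as named in the "In scope" section of this PRD.
SIDECAR_CA_CERT = "/etc/pipelock/ca.pem"
SIDECAR_CA_KEY = "/etc/pipelock/ca-key.pem"


def tls_interception_block(ca_cert: str = SIDECAR_CA_CERT,
                           ca_key: str = SIDECAR_CA_KEY) -> dict[str, Any]:
    """Hypothetical helper: the block a config builder could emit.

    Only `enabled` and the two paths are set; other settings are left
    to pipelock's defaults, as the PRD specifies.
    """
    return {
        "tls_interception": {
            "enabled": True,
            "ca_cert": ca_cert,
            "ca_key": ca_key,
        }
    }


def render_yaml(block: dict[str, Any], indent: int = 0) -> str:
    """Tiny renderer for nested dicts of scalars (not a general YAML emitter)."""
    pad = "  " * indent
    lines = []
    for key, value in block.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(render_yaml(value, indent + 1))
        elif isinstance(value, bool):
            # YAML booleans are lowercase, unlike Python's True/False.
            lines.append(f"{pad}{key}: {'true' if value else 'false'}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_yaml(tls_interception_block()))
```

Run as a script, this prints a four-line `tls_interception` block with `enabled: true` and the two sidecar paths.
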
## Goals / Success Criteria

The feature works when all of the following are observable:

- A Node / curl request from inside a launched bottle to a CONNECT-bumped HTTPS host (e.g. `https://api.anthropic.com/dlp-probe`) carrying a pipelock-recognized credential pattern in the body returns 403 from pipelock with the documented `blocked: request body contains secret: …` body. Pipelock's `body_dlp` event fires on the decrypted request.
- A clean HTTPS GET from inside the bottle to an allowlisted host (e.g. `https://raw.githubusercontent.com/...`) returns the real upstream response; TLS interception doesn't break legitimate traffic.
- The agent's TLS library trusts pipelock's bumped leaf certs (per the bottle's installed CA); no TLS-trust errors.
- Claude Code reaches `api.anthropic.com` end-to-end through the bottle and completes a chat round-trip.

The feature is **done** when all of the following ship:

- `pipelock_build_config` / `pipelock_render_yaml` emit a `tls_interception` block with `enabled: true` and the per-bottle CA cert/key paths. The defaults (`cert_ttl: 24h`, `cert_cache_size: 10000`, `passthrough_domains: []`) are kept; only `enabled` and the cert paths are populated.
- The prepare step generates a per-bottle CA via `pipelock tls init` in a one-shot container, writes `ca.pem` and `ca-key.pem` to `stage_dir`. Paths land on the `DockerBottlePlan`.
- `DockerPipelockProxy.start` mounts the stage dir into the sidecar (read-only) so the running pipelock can read its CA.
- `BottleBackend.provision_ca` (new) copies the CA public cert into the agent at `/usr/local/share/ca-certificates/bot-bottle-mitm.crt`, runs `update-ca-certificates`, and sets the `NODE_EXTRA_CA_CERTS` / `SSL_CERT_FILE` / `REQUESTS_CA_BUNDLE` env trio on the agent container's runtime env. Default no-op on the abstract base so other backends aren't forced to implement.
- The launch step prints a one-line stderr log with the SHA-256 fingerprint of the public CA cert (computed via stdlib `ssl.PEM_cert_to_DER_cert` + `hashlib.sha256`).
- On bottle teardown the sidecar is removed and the CA private key is gone with it.
- Two new integration tests under `tests/integration/`:
  - HTTPS variant of the credential-post block test (proves the `tls_intercept` + `body_dlp` chain fires end-to-end).
  - Clean HTTPS GET test (proves the allow path doesn't break TLS trust and returns real upstream content).
- The dry-run preflight (`start --dry-run`) renders the new TLS layer. Text: one line under the egress summary. JSON: a reserved `egress.tls_interception: { enabled: true, ca_fingerprint: null }` block; the fingerprint is null at dry-run because the CA only exists after launch.

## Non-goals

- A second proxy in the chain. Pipelock does the bumping natively; the mitmproxy approach was based on a wrong premise (closed PR #8).
- Per-bottle override to disable interception. v1 always enables `tls_interception`. The pipelock-side `passthrough_domains` list is the right knob if a future allowlisted host turns out to pin certs; exposing it through the manifest is a follow-up.
- A long-lived / shared CA across bottles. Each bottle gets a fresh CA generated by `pipelock tls init` and destroyed with the sidecar.
- Tuning `cert_ttl`, `cert_cache_size`, `max_response_bytes`, `cross_request_detection`, or other pipelock advanced features. Defaults from `pipelock generate config --preset strict` are fine for v1.
- Trust-store paths for non-Debian agent images. `node:22-slim` is Debian; `update-ca-certificates` is the right command. A Red-Hat-family base would need `update-ca-trust`.
- HTTP/3 / QUIC. Pipelock's interception is HTTP/HTTPS-over-TLS; UDP/443 still needs an iptables layer (separate PRD).

## Scope

### In scope

- **`bot_bottle/pipelock.py`** changes:
  - Extend `pipelock_build_config` to include `tls_interception: { enabled: true, ca_cert: , ca_key: }`.
    Paths are populated from the plan; the function's signature grows a `cert_path` / `key_path` pair or reads them off `Bottle` once they're stored.
  - Extend `pipelock_render_yaml` to emit the new block.
- **`bot_bottle/backend/docker/pipelock.py`** changes:
  - New helper `pipelock_tls_init(stage_dir)` runs the upstream image as a one-shot: `docker run --rm -v :/h -e PIPELOCK_HOME=/h pipelock tls init`, leaving `ca.pem` and `ca-key.pem` under `stage_dir`. The host file owner is whatever the upstream image's user is; the sidecar mount is read-only so this is fine.
  - `DockerPipelockProxy.start` `docker cp`s the CA cert + key into the sidecar at `/etc/pipelock/ca.pem` and `/etc/pipelock/ca-key.pem` between `docker create` and `docker start`, mirroring the existing pattern for the YAML config. If pipelock's image runs as non-root, a `docker exec -u 0 chown pipelock:pipelock /etc/pipelock/ca*.pem` lands between the `cp` and the `start`.
- **`bot_bottle/backend/__init__.py`**: new abstract method `provision_ca(plan, target)` on `BottleBackend`, default no-op. `BottleBackend.provision` orchestrates `ca → prompt → skills → ssh → git`.
- **`bot_bottle/backend/docker/provision/ca.py`** (new):
  - Reads the cert from `stage_dir` (already written by prepare).
  - `docker cp` into the agent.
  - `docker exec -u 0 ... chmod 644 ...` + `update-ca-certificates`.
  - Computes the SHA-256 fingerprint with stdlib (`ssl` + `hashlib`), emits one stderr log line.
- **`bot_bottle/backend/docker/launch.py`**:
  - Three new `-e` flags on the agent's `docker run`: `NODE_EXTRA_CA_CERTS=/usr/local/share/ca-certificates/bot-bottle-mitm.crt`, `SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt`, `REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt`.
  - `HTTPS_PROXY` / `HTTP_PROXY` continue to point at pipelock (unchanged from PRD 0001; the mitmproxy detour in PR #8 is abandoned).
- **`bot_bottle/backend/docker/bottle_plan.py`**:
  - One new `info(...)` line in `print()` noting TLS interception is on.
  - `to_dict()` gains an `egress.tls_interception: { enabled: true, ca_fingerprint: null }` block. Reserved for future population.
- **`bot_bottle/backend/docker/prepare.py`**: call `pipelock_tls_init(stage_dir)` and write the resolved cert/key paths onto the plan (either on the existing `proxy_plan` field or on the parent `DockerBottlePlan`).
- **Tests:**
  - `tests/integration/test_pipelock_blocks_secret_https_post.py` (new): HTTPS variant of the existing block test.
  - `tests/integration/test_pipelock_allows_normal_https.py` (new): clean HTTPS GET succeeds.
  - `tests/unit/test_pipelock_yaml.py` updated to assert the new `tls_interception` block in the rendered config.
  - `tests/integration/test_dry_run_plan.py` updated to assert the new `egress.tls_interception` JSON block.

### Out of scope

- Modifying pipelock itself. We're using existing config knobs.
- A manifest field to disable / customize interception per bottle. Doable but premature.
- Wiring `passthrough_domains`. The default `[]` is correct for v1; add the manifest field when a pinning host shows up. The shape is pre-recorded so the follow-up is mechanical: `bottle.egress.tls_passthrough_domains: [host, ...]`, mirroring the existing `egress.allowlist`.
- `cross_request_detection`, `entropy_budget`, `fragment_reassembly`, `reverse_proxy`, `scan_api`: features pipelock exposes but we don't need for the body-DLP gap.

## Proposed Design

### Topology

```
agent --HTTPS_PROXY--> pipelock --[bumps TLS]--> internet
                       (sees plaintext: URL, headers, body)
```

Same single-sidecar shape as PRD 0001. The only addition is `tls_interception` in pipelock's config plus the per-bottle CA generated at prepare time.

### CA lifecycle

- **Generation.** Host-side, at prepare time, via a one-shot `docker run --rm -v :/h -e PIPELOCK_HOME=/h pipelock tls init`. Output: `/ca.pem` + `/ca-key.pem`, mode 600.
- **Sidecar install.** `DockerPipelockProxy.start` `docker cp`s the CA cert + key into the sidecar at `/etc/pipelock/ca.pem` and `/etc/pipelock/ca-key.pem` between `docker create` and `docker start`. Same pattern the proxy already uses for the YAML config: no bind-mount, no UID/permission concern from the one-shot generation step. The rendered YAML references the in-container paths.
- **Bottle install.** `provision_ca` (Docker impl) does `docker cp /ca.pem agent:/usr/local/share/ca-certificates/bot-bottle-mitm.crt`, then `update-ca-certificates`. The CA env trio is set at `docker run -e` time (Docker propagates run-time env into `docker exec`).
- **Per-bottle ephemerality.** Enforced by *regenerating per launch*, not by validity windows. Pipelock's defaults (`cert_ttl: 24h` for leaves, `--validity 87600h` for the CA) are fine: the CA lives only as long as the sidecar, which is the bottle's lifetime.
- **Teardown.** Sidecar removed via `ExitStack` callback, then the launch context manager's outer `finally` `shutil.rmtree`s `stage_dir`. CA dies with both, in that order, so the sidecar is never reading a deleted mount on shutdown.
- **Fingerprint.** Computed via stdlib in `provision_ca` and logged once to stderr (`bot-bottle: mitm ca fingerprint: sha256:…`). The private key never appears in any log.

### Data model changes

None to the manifest schema. The dry-run JSON contract grows a reserved `egress.tls_interception` block; the fingerprint is always null at dry-run because the CA doesn't exist yet.

### Existing code touched

Surgical, all on the existing pipelock path:

- `bot_bottle/pipelock.py`: config builder + YAML renderer.
- `bot_bottle/backend/__init__.py`: abstract `provision_ca`.
- `bot_bottle/backend/docker/pipelock.py`: `tls init` helper, sidecar volume mount.
- `bot_bottle/backend/docker/prepare.py`: CA paths on plan.
- `bot_bottle/backend/docker/launch.py`: CA env trio on agent.
- `bot_bottle/backend/docker/backend.py`: `provision_ca` dispatch + thread `self._proxy` through prepare/launch unchanged shape.
- `bot_bottle/backend/docker/bottle_plan.py`: preflight rendering.
- `bot_bottle/backend/docker/provision/ca.py` (new).

Net diff is meaningfully smaller than PR #8 because pipelock already does the work: no addon, no second sidecar, no second backend module.

### External dependencies

- **Pipelock image.** Unchanged pin from PRD 0001 (`ghcr.io/luckypipewrench/pipelock@sha256:3b1a3941…`, matching pipelock v2.3.0). No new image dependency.
- **No host-side crypto deps.** CA generation uses the pipelock image's own `tls init` command in a one-shot container. Fingerprint uses Python stdlib `ssl` + `hashlib`.

## References

- `docs/research/pipelock-assessment.md` (now corrected): pipelock capability assessment including the `tls_interception` block.
- `docs/prds/0001-per-agent-egress-proxy-via-pipelock.md`: egress-proxy baseline this PRD extends.
- `docs/prds/0003-bottle-backend-abstraction.md`: backend ABC contract this PRD adds a `provision_ca` method to.
- `docs/prds/0004-split-out-provisioners.md`: per-provisioner module pattern reused for the new CA provisioner.
- Pipelock `tls` CLI (in-image help): `pipelock tls init / install-ca / show-ca`.
- Closed PR #8: earlier mitmproxy-based design built on the falsified "pipelock can't MITM" premise; archived for context.
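
## Appendix: fingerprint sketch

The fingerprint step described under CA lifecycle can be sketched with the two stdlib calls the PRD names. This is an illustration under stated assumptions, not the `provision_ca` implementation: the function names `ca_fingerprint` and `log_fingerprint` are hypothetical, the input is assumed to be a single PEM certificate (the public `ca.pem`, never the key), and the plain lowercase hex rendering after `sha256:` is an assumption since the PRD elides the digest format.

```python
import hashlib
import ssl
import sys


def ca_fingerprint(pem_text: str) -> str:
    """Return 'sha256:<hex>' over the DER encoding of one PEM certificate."""
    # PEM_cert_to_DER_cert expects the text to start with the BEGIN
    # CERTIFICATE header, so strip stray whitespace first.
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    return "sha256:" + hashlib.sha256(der).hexdigest()


def log_fingerprint(pem_text: str) -> str:
    """Emit the one-line stderr log in the format quoted in the PRD."""
    line = f"bot-bottle: mitm ca fingerprint: {ca_fingerprint(pem_text)}"
    print(line, file=sys.stderr)
    return line
```

A caller would read the staged public cert as text and pass it to `log_fingerprint` once per launch; hashing the DER bytes rather than the PEM text keeps the value independent of line wrapping or trailing newlines in the file.
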