Survey of TLS-MITM tools (mitmproxy, Squid+ssl_bump, Go libraries) and five candidate topologies for adding TLS termination to the egress path so pipelock's DLP, subdomain-entropy, and MCP scanners can fire on plaintext bodies. Recommends mitmproxy in front of pipelock for v1 with a per-bottle ephemeral CA. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
TLS interception for pipelock content scanning
Research into adding TLS termination ("MITM") to the egress path so that
pipelock's scanning pipeline can see plaintext HTTP request and response
bodies, instead of only the CONNECT host and opaque ciphertext.
Summary
- Pipelock today sees CONNECT hostnames and the encrypted bytes that follow. Its DLP, subdomain-entropy, and MCP scanners cannot fire on TLS-encrypted bodies, which is the gap explicitly named under "Scope gaps" in pipelock-assessment.md ("Pipelock does not perform TLS inspection (no CA trust injection)").
- Closing that gap requires a TLS-terminating proxy that bumps CONNECT, presents a leaf certificate for the target hostname signed by a CA the bottle's trust store accepts, decrypts the inner HTTP, and re-establishes TLS to the real upstream.
- The mature open-source option is mitmproxy. Squid + ssl_bump is the heavier production-grade alternative. The Go ecosystem (goproxy, gomitmproxy, martian) is suitable only if we want a custom binary tightly coupled to pipelock.
- Recommended v1 topology: mitmproxy in front of pipelock on the same egress route. mitmproxy terminates client TLS, forwards plaintext to pipelock as its upstream HTTP proxy, and re-encrypts to the real upstream. Pipelock stays unchanged.
- Per-bottle ephemeral CA, generated at bottle start and destroyed on teardown. The CA private key lives only on the sidecar; the bottle's trust store only ever sees the public cert.
- Cert pinning is a known caveat but a small one given the narrow allowlist in this project. Selective bumping is the mitigation if a future allowlisted host turns out to pin.
What pipelock cannot see today
The current egress topology (per pipelock-assessment.md):
agent --HTTPS_PROXY--> pipelock --CONNECT host:443--> internet
\____________________________
opaque TLS bytes
The agent's client (Claude Code, curl, an MCP server, a Python SDK)
sends CONNECT api.anthropic.com:443. Pipelock checks the hostname
against its api_allowlist, replies 200 Connection Established, and
then blindly relays bytes between the two TCP halves. The TLS handshake
and everything inside it happens end-to-end between the agent and the
real upstream.
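To make the tunnel-mode behaviour concrete, here is a minimal Python sketch of the only decision a hostname-only proxy can make: parse the CONNECT request line and check the host against an allowlist. The function names and the ALLOWLIST subset are illustrative assumptions, not pipelock's actual code or configuration.

```python
ALLOWLIST = {"api.anthropic.com", "statsig.anthropic.com"}  # illustrative subset only


def parse_connect(request_line: str) -> tuple[str, int]:
    """Split 'CONNECT host:port HTTP/1.1' into (host, port)."""
    method, target, _version = request_line.strip().split(" ", 2)
    if method.upper() != "CONNECT":
        raise ValueError(f"not a CONNECT request: {method}")
    host, _, port = target.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"malformed CONNECT target: {target}")
    return host.lower(), int(port)


def tunnel_decision(request_line: str, allowlist: set[str] = ALLOWLIST) -> str:
    """Return the status line a hostname-only proxy would answer with."""
    host, _port = parse_connect(request_line)
    if host in allowlist:
        return "HTTP/1.1 200 Connection Established"
    return "HTTP/1.1 403 Forbidden"
```

Everything after the 200 reply is ciphertext, so no further input is available to a function like this.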
What pipelock can scan in this mode:
- CONNECT target hostname (SNI is not even needed).
- TLS record framing and lengths (useful for budgets, useless for DLP).
- Plain HTTP/1.1 to non-HTTPS destinations (irrelevant: there are none in DEFAULT_ALLOWLIST).
What pipelock cannot scan in this mode:
- Request URL, method, headers, body.
- Response status, headers, body.
- MCP JSON-RPC payloads inside the TLS session.
- WebSocket frames inside a TLS-wrapped upgrade.
- Whether the inner SNI or HTTP Host / :authority matches the outer CONNECT target (domain-fronting check).
The 48-pattern DLP layer, the subdomain-entropy check (insofar as it inspects URLs rather than DNS-resolver queries), the request-redaction feature added in v2.3.0, and bidirectional MCP scanning all require plaintext to operate on. Without TLS termination, those layers are inert against any HTTPS destination — which is every destination in the current allowlist.
How TLS interception works
The mechanics of CONNECT bumping, end to end:
1. Agent issues CONNECT. The HTTP client sees HTTPS_PROXY set, so it opens a TCP connection to the proxy and sends CONNECT api.anthropic.com:443 HTTP/1.1.
2. Proxy answers 200. Standard tunnel-established response.
3. Proxy starts TLS as the server. Instead of relaying bytes, the proxy itself performs a TLS handshake with the agent. It needs a server certificate for api.anthropic.com, so on first contact for that hostname the proxy generates a leaf certificate with CN=api.anthropic.com and a SAN for the same, signs it with its own CA private key, and presents that cert. Subsequent connections to the same hostname reuse the cached leaf.
4. Agent verifies the cert. The agent's TLS library walks the chain to a trusted root. Because the bottle's trust store contains the proxy's CA cert, validation succeeds. The agent has no way to tell it isn't talking to the real api.anthropic.com.
5. Proxy opens its own TLS to the real upstream. As a client this time, using the system root store, talking to the real api.anthropic.com. Real SNI, real cert chain validated normally.
6. Proxy bridges the two TLS sessions. Decrypts on the server side, re-encrypts on the client side, and scans the plaintext in between.
This is what every TLS-terminating egress proxy does. The trade-offs live in three places:
- CA trust injection. Step 4 only works if the bottle's trust store contains the proxy's CA. Mechanics covered under "CA lifecycle" below.
- Cert generation cost. Generating an RSA-2048 leaf cert takes ~50 ms; ECDSA P-256 is ~5 ms. Cache leaves per (hostname, SAN list) to keep this off the steady-state hot path.
- Protocol coverage. The proxy needs to speak HTTP/1.1, HTTP/2 (ALPN
h2), and ideally WebSocket. HTTP/3 / QUIC is UDP and requires a separate code path; for v1, blocking UDP/443 at the iptables layer forces clients to fall back to HTTP/2, which we can inspect.
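The leaf-caching point can be sketched as a small memoizing wrapper keyed by (hostname, SAN list). This is a sketch under assumptions: LeafCache and the generate callback are hypothetical names, and the callback stands in for whatever actually signs the leaf with the CA key.

```python
import threading
from typing import Callable, Dict, Iterable, Tuple

CacheKey = Tuple[str, Tuple[str, ...]]


class LeafCache:
    """Memoize generated leaf certs per (hostname, SAN list)."""

    def __init__(self, generate: Callable[[str, Tuple[str, ...]], bytes]):
        self._generate = generate  # hypothetical signer: returns PEM bytes
        self._leaves: Dict[CacheKey, bytes] = {}
        self._lock = threading.Lock()

    @staticmethod
    def key(hostname: str, sans: Iterable[str]) -> CacheKey:
        # Normalize case and order so equivalent SAN lists share one entry.
        return hostname.lower(), tuple(sorted({s.lower() for s in sans}))

    def get(self, hostname: str, sans: Iterable[str]) -> bytes:
        k = self.key(hostname, sans)
        with self._lock:
            leaf = self._leaves.get(k)
            if leaf is None:
                # First contact for this key: pay the generation cost once.
                leaf = self._leaves[k] = self._generate(*k)
            return leaf
```

Holding the lock during generation serializes first-contact handshakes. With a short allowlist that is probably acceptable; a per-key lock would be the refinement if it is not.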
Tools
mitmproxy
- What it is. Python (with Rust crypto bits) interactive HTTPS proxy. Reference open-source implementation of the bump pattern. Ships as mitmproxy (TUI), mitmweb (browser UI), and mitmdump (headless).
- Cert handling. Generates a CA on first run under ~/.mitmproxy/. Per-host leaves are generated on demand and cached in memory. Cert cache keyed by (hostname, SAN extensions inferred from upstream cert).
- Protocols. HTTP/1.1, HTTP/2, WebSocket fully supported. HTTP/3 exists as experimental. Raw TCP / non-HTTP TLS supported via --mode reverse: but not in CONNECT-bump mode.
- Extensibility. Python addon API. An addon module can inspect or modify any request / response / tcp_message flow. The pipelock integration in Topology D below uses this.
- Selective bumping. ignore_hosts regex; matching CONNECTs are tunneled blindly instead of bumped. Critical for the cert-pinning mitigation.
- Docker image. mitmproxy/mitmproxy on Docker Hub. Single binary for the CLI, ~80 MB image. Configurable via flags or ~/.mitmproxy/config.yaml.
- Project URL. https://mitmproxy.org, https://github.com/mitmproxy/mitmproxy.
Most mature, best-documented, lowest-effort integration. Default choice for v1.
Squid + ssl_bump
- What it is. Squid is a long-running C++ caching proxy. ssl_bump is its TLS-interception feature, controlled by per-CONNECT actions: splice (tunnel blindly), bump (decrypt and re-encrypt), peek (look at TLS hello then decide), stare (look at server cert then decide), terminate (abort the connection).
- Cert handling. Configured via sslcrtd_program, a helper that generates and caches per-host certs. CA cert and key referenced by PEM paths in squid.conf.
- Protocols. HTTP/1.1 fully; HTTP/2 to clients via recent versions; no scripted addons.
- Extensibility. ICAP (Internet Content Adaptation Protocol) for external scanners — Squid POSTs each request/response to an ICAP service that can modify or reject. This is the formal version of Topology D below.
- Production track record. Used at corporate-proxy scale (large enterprises, ISPs). Heavyweight for a single-bottle sidecar.
- Project URL. https://wiki.squid-cache.org/Features/SslPeekAndSplice.
Right tool if pipelock grows an ICAP server endpoint. Otherwise, more config surface than this project needs.
Go libraries: goproxy, gomitmproxy, martian
- goproxy (elazarl): long-lived Go library, basic CONNECT-bumping proxy with a handler API. Sparse on HTTP/2. https://github.com/elazarl/goproxy
- gomitmproxy (AdGuard): newer, cleaner API; built for AdGuard Home / DNS-filtering products. HTTP/2 support is partial. https://github.com/AdguardTeam/gomitmproxy
- martian (Google): request/response modifier framework with a JSON-configurable rule engine. Used internally at Google; public ecosystem thin. https://github.com/google/martian
These are relevant only if we decide to write a custom TLS-terminating binary that links pipelock's scanning packages directly — Topology C below. They are not faster than mitmproxy for the v1 sidecar shape; they are smaller and more direct, at the cost of writing more Go.
Disqualified
- Caddy, Envoy, HAProxy. All can terminate TLS at a reverse-proxy vhost. None ship a "bump on CONNECT and forward plaintext to a downstream proxy" mode out of the box. Adapting any of them to this shape is more work than starting from mitmproxy.
- Cloudflare Gateway, Zscaler, NetSkope, Forcepoint. Managed cloud egress with TLS inspection. Wrong topology — they live outside the host, not as a per-bottle sidecar, and they require trusting a vendor with full plaintext.
- Charles Proxy, Burp Suite. Closed-source GUI tools for developer capture and security testing. Not appropriate as headless sidecars.
- mitmdump standalone vs. embedding mitmproxy as a library. Both are mitmproxy. Calling out only to note: the project ships both a CLI and a Python API; addons can be loaded either way.
Topologies
Five candidate topologies, ordered roughly from least to most coupled between the two components.
A — mitmproxy in front of pipelock (recommended)
agent --HTTPS_PROXY--> mitmproxy --HTTP_PROXY--> pipelock --> internet
(bump TLS) (scan plain) (real TLS)
mitmproxy terminates the agent's TLS connection, decrypts, and then forwards the inner HTTP request to pipelock by treating pipelock as its own upstream HTTP forward proxy. Pipelock receives plaintext HTTP exactly as if the agent had used HTTP, applies its full scanning pipeline, and forwards to mitmproxy's upstream client half — which re-establishes TLS to the real destination.
Concretely the agent's HTTPS_PROXY points at mitmproxy; mitmproxy's
upstream_proxy config points at pipelock; pipelock's network reach
includes the real internet.
- Wins. Pipelock unchanged. mitmproxy unchanged from default configuration. Each component has one job. Failure modes are clear per layer.
- Costs. Two sidecars per bottle instead of one. One extra decrypt / re-encrypt hop, ~5–15 ms per request in steady state.
- Open question. How exactly mitmproxy forwards to pipelock matters for whether pipelock sees TLS again or only HTTP. mitmproxy's upstream mode wraps the decrypted request in another CONNECT if the destination is HTTPS, which would re-encrypt before pipelock sees it, defeating the point. The correct mode is upstream with TLS re-origination disabled, or regular mode with a chained proxy. The v2 release of mitmproxy reworked this; needs verification against the current docs at integration time.
B — pipelock in front of mitmproxy (ruled out)
agent --HTTPS_PROXY--> pipelock --CONNECT?--> mitmproxy --> internet
(sees CONNECT only) (bump TLS)
Pipelock would receive a CONNECT and decide to allow or deny based
on hostname, then tunnel to mitmproxy. mitmproxy would terminate TLS
and see plaintext — but pipelock would never see the plaintext, which
is the whole point of the exercise. The scanning still happens (in
mitmproxy), but it isn't pipelock doing it, so we'd need an entirely
different rule engine. Ruled out.
C — Extend pipelock itself to terminate TLS
Two sub-variants:
C.1 — Upstream a tls_terminate mode. Submit a feature to
pipelock that adds CONNECT bumping and per-host cert generation in Go,
using crypto/tls and the existing scanning packages. Pipelock becomes
a self-contained MITM proxy. License question matters here: the Apache
2.0 core can grow new features in-tree, but if upstream insists this
belongs in enterprise/ (ELv2), we either accept ELv2 or fork.
C.2 — Wrap pipelock in a thin Go binary in the same container. A
small Go program does the TLS half (CONNECT parsing, cert generation,
TLS handshake) and pipes plaintext to pipelock over UDS or loopback.
The wrapper is ours; pipelock is unmodified. No license question.
- Wins. Single component on the egress path. Pipelock owns the scanning end-to-end, including domain-fronting checks (SNI vs. Host vs. CONNECT).
- Costs. Real Go engineering effort. CA generation, cert caching, TLS handshake, HTTP/2 ALPN negotiation, WebSocket upgrade: all things mitmproxy already solves.
- When. Right shape for v2 or v3 once the v1 mitmproxy-in-front topology has proven the integration works and the scanning rules are stable.
D — mitmproxy as the proxy, pipelock as a content-scan subroutine
agent --HTTPS_PROXY--> mitmproxy --> internet
(bump TLS)
|
v
POST /scan to pipelock
<- allow / block / redact
A Python addon in mitmproxy sends each decrypted request (and response)
to a pipelock HTTP /scan endpoint and gates the flow on the verdict.
mitmproxy handles all networking; pipelock is the rule engine only.
- Wins. Clean separation of concerns. Pipelock doesn't have to speak TLS at all. The addon is small, ~100 lines of Python.
- Costs. Requires pipelock to expose a scan API. The current Apache 2.0 core does not document one. If /scan lives in enterprise/, ELv2 applies. If it doesn't exist, we'd be asking pipelock for a new surface.
- Variant. Squid's ICAP path is the formalized version of the same pattern.
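A rough sketch of what the Topology D addon might look like. Everything about the scan endpoint is an assumption: the URL, the JSON payload shape, and the "verdict" field are invented for illustration, since pipelock documents no such API today. The mitmproxy side uses only the addon request hook and the flow attributes it exposes; the redact verdict is left out to keep the sketch short.

```python
import json
import urllib.request

SCAN_URL = "http://pipelock:8888/scan"  # hypothetical endpoint, host, and port


def scan_over_http(payload: dict, url: str = SCAN_URL, timeout: float = 2.0) -> str:
    """POST the payload as JSON and return the verdict string from the reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp).get("verdict", "block")


class PipelockScan:
    """mitmproxy addon: gate each decrypted request on an external verdict."""

    def __init__(self, scan=scan_over_http):
        self.scan = scan  # injectable so the gate can be tested without a server

    def request(self, flow) -> None:
        # mitmproxy calls this hook once the full client request has been read.
        payload = {
            "method": flow.request.method,
            "url": flow.request.pretty_url,
            "headers": dict(flow.request.headers.items()),
            "body": flow.request.get_text(strict=False) or "",
        }
        try:
            verdict = self.scan(payload)
        except Exception:
            verdict = "block"  # fail closed if the scanner is unreachable
        if verdict != "allow":
            flow.kill()  # drop the flow; nothing is sent upstream


addons = [PipelockScan()]
```

Loaded with mitmdump -s, this would block on any verdict other than "allow". Failing closed on scanner errors is a design choice of the sketch, not something the source prescribes.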
E — Single container, two processes
mitmproxy and pipelock share a container, started by supervisord or
s6-overlay. Networking simplifies to localhost. Lifecycle complicates:
container restart now means restarting both; failure of one process is
not visible at the Docker layer; logs interleave.
- Wins. Slightly less Docker plumbing in cli.py.
- Costs. Operational complexity not worth the savings. The two components are independent processes with independent failure modes; Docker is the right tool for that.
Net: not recommended.
CA lifecycle
The CA private key is the asset to defend. With it, anyone can issue certs that the bottle's trust store will accept for any hostname. So:
Per-bottle ephemeral CA. At bottle start, generate a fresh RSA-2048 or ECDSA-P256 CA inside the mitmproxy sidecar. Export only the public cert (PEM) into the bottle's trust store at one of:
- /usr/local/share/ca-certificates/claude-bottle-mitm.crt followed by update-ca-certificates (Debian/Ubuntu base images).
- /etc/pki/ca-trust/source/anchors/ with update-ca-trust (Red-Hat-family).
- $NODE_EXTRA_CA_CERTS for Node-based agents (Claude Code).
- $SSL_CERT_FILE / $REQUESTS_CA_BUNDLE for Python SDKs.
The private key never leaves the sidecar's filesystem. The CA cert public half is the only artifact that crosses into the bottle.
On bottle teardown, the sidecar container is destroyed; the CA dies with it. The next bottle gets a fresh CA. No long-lived MITM CA on disk.
Why not a shared per-host CA. A persistent CA across bottles is faster (no generation at start) but is a real liability: its private key becomes a long-lived secret that every bottle's trust store honors. The public cert is readable by every bottle by design and is harmless on its own, but if the private key ever leaks from a sidecar or from host storage, an attacker on the host network could in principle impersonate any host to any bottle. With a per-bottle CA, a leaked key is useful only against that one bottle, and only until teardown.
Generation cost. RSA-2048 CA generation is ~200 ms; ECDSA-P256 is ~5 ms. Either is irrelevant against the per-bottle Docker pull and network setup cost.
Where the CA lives in the bottle's trust store. Both: a
distribution-standard path with update-ca-certificates, and the
env-var path. Belt and suspenders, because some Node and Python
libraries honor the env vars only, and some load only /etc/ssl/certs/
directly.
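The belt-and-suspenders install can be written down as a small plan: where the cert goes, which refresh command runs, and which env vars get set. A minimal sketch, assuming a hypothetical trust_injection_plan helper; the paths and variable names are the ones listed above.

```python
from pathlib import PurePosixPath

MITM_CA_NAME = "claude-bottle-mitm.crt"  # file name used elsewhere in this document


def trust_injection_plan(distro_family: str = "debian") -> dict:
    """Return where to drop the CA cert, which command to run, and env vars to set."""
    if distro_family == "debian":
        anchor = PurePosixPath("/usr/local/share/ca-certificates") / MITM_CA_NAME
        refresh = ["update-ca-certificates"]
    elif distro_family == "redhat":
        anchor = PurePosixPath("/etc/pki/ca-trust/source/anchors") / MITM_CA_NAME
        refresh = ["update-ca-trust"]
    else:
        raise ValueError(f"unknown distro family: {distro_family}")
    ca = str(anchor)
    return {
        "cert_path": ca,
        "refresh_cmd": refresh,
        "env": {
            "NODE_EXTRA_CA_CERTS": ca,  # added on top of Node's built-in roots
            "SSL_CERT_FILE": ca,        # overrides the default CA file for OpenSSL-based clients
            "REQUESTS_CA_BUNDLE": ca,   # same idea, for the requests library
        },
    }
```

Pointing SSL_CERT_FILE and REQUESTS_CA_BUNDLE at the MITM cert alone is only safe while every TLS connection from the bottle is bumped. If selective bumping tunnels some hosts through untouched, those variables should probably point at the merged system bundle instead.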
Cert pinning (brief)
A client that pins ignores the trust store and refuses any cert whose public key isn't on a hardcoded list. Three observations for this project:
- The current DEFAULT_ALLOWLIST (api.anthropic.com, statsig.anthropic.com, sentry.io, claude.ai, platform.claude.com, downloads.claude.ai, raw.githubusercontent.com) does not appear to include any host whose server-side SDK clients pin. Server-side SDKs (Node, Python) almost universally honor system trust and NODE_EXTRA_CA_CERTS / SSL_CERT_FILE. Mobile SDKs and Chromium pin; we don't run those.
- If a future allowlisted host turns out to pin, the mitigation is selective bumping via mitmproxy ignore_hosts: that specific hostname tunnels blindly and pipelock loses DLP coverage for it. Coverage on every other host is unaffected.
- The cost of finding out: a single 5-minute test before adding a host. Point mitmproxy at the host and observe whether the client succeeds.
Not a v1 blocker. Document the failure mode and the mitigation.
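For the selective-bumping mitigation, the ignore_hosts entries are regular expressions matched against host:port. A small sketch for building an anchored pattern per pinned host and checking the policy locally before deploying it; both helper names are hypothetical, and is_tunneled only approximates mitmproxy's own matching.

```python
import re


def ignore_hosts_pattern(hostname: str, port: int = 443) -> str:
    """Anchored regex for one exact host:port."""
    return rf"^{re.escape(hostname)}:{port}$"


def is_tunneled(host: str, port: int, patterns: list[str]) -> bool:
    """Local approximation of the match, for testing a policy before deploying it."""
    address = f"{host}:{port}"
    return any(re.search(p, address, re.IGNORECASE) for p in patterns)
```

Anchoring and escaping matter here: an unescaped dot or a missing ^ would let a lookalike hostname bypass inspection.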
Comparison table
| | A: mitmproxy → pipelock | B: pipelock → mitmproxy | C: TLS in pipelock | D: mitmproxy + scan API | E: one container |
|---|---|---|---|---|---|
| Pipelock sees plaintext | yes | no | yes | yes (via /scan) | yes |
| Code change to pipelock | none | none | substantial | adds /scan endpoint | none |
| Sidecar count | 2 | 2 | 1 | 2 | 1 |
| Cert generation owner | mitmproxy | mitmproxy | pipelock | mitmproxy | mitmproxy |
| Selective bumping | mitmproxy ignore_hosts | mitmproxy ignore_hosts | pipelock config | mitmproxy ignore_hosts | mitmproxy ignore_hosts |
| Failure isolation per process | yes | yes | n/a (one process) | yes | no (shared container) |
| License question | none | none | ELv2 risk | ELv2 risk | none |
| v1 effort | low | low (but pointless) | high | medium | low |
| Long-term shape | interim | n/a | best | possible | not recommended |
Recommendation
Adopt Topology A for v1. Add a mitmproxy sidecar to the egress
topology, in front of pipelock on the same per-bottle internal network.
The agent's HTTPS_PROXY points at mitmproxy; mitmproxy's upstream is
pipelock; pipelock's upstream is the real internet.
Concretely:
- Add a MitmproxyProxy class alongside PipelockProxy, with the same prepare / start / stop lifecycle. The class generates a per-bottle CA in stage_dir, exports the public cert into a second file, and writes a mitmproxy config that:
  - bumps every CONNECT by default
  - runs in upstream mode pointing at pipelock (mitmproxy's mode option set to upstream:http://pipelock-<slug>:<port>)
  - listens on a known port inside the per-bottle internal network
- Extend the bottle launch step to copy the CA public cert into the agent container under /usr/local/share/ca-certificates/claude-bottle-mitm.crt, run update-ca-certificates, and set NODE_EXTRA_CA_CERTS / SSL_CERT_FILE / REQUESTS_CA_BUNDLE accordingly.
- Repoint the agent's HTTPS_PROXY and HTTP_PROXY from the pipelock container to the mitmproxy container.
- Verify mitmproxy's upstream-proxy mode forwards plaintext (not a re-wrapped CONNECT) to pipelock; if not, use regular mode with a chained proxy directive.
- Test that pipelock's DLP, subdomain-entropy, and MCP scanners now fire on real request bodies for api.anthropic.com traffic.
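As a starting point for the launch side of this list, here is a sketch that assembles a mitmdump command line for one bottle. MitmproxyLaunch, its field names, the default listen port, and the confdir path are all assumptions for illustration; only the mitmdump flags themselves (--mode, --listen-port, --set, --ignore-hosts) are real. Whether upstream mode actually delivers plaintext to pipelock remains unverified.

```python
from dataclasses import dataclass, field


@dataclass
class MitmproxyLaunch:
    """Hypothetical helper: assemble the mitmdump command line for one bottle."""

    pipelock_host: str  # e.g. "pipelock-<slug>" on the per-bottle network
    pipelock_port: int
    listen_port: int = 8080  # assumed; any known port on the internal network
    confdir: str = "/home/mitmproxy/.mitmproxy"  # assumed path inside the sidecar
    ignore_hosts: list = field(default_factory=list)

    def argv(self) -> list:
        args = [
            "mitmdump",
            "--mode", f"upstream:http://{self.pipelock_host}:{self.pipelock_port}",
            "--listen-port", str(self.listen_port),
            "--set", f"confdir={self.confdir}",
        ]
        # One --ignore-hosts flag per host that must be tunneled without bumping.
        for pattern in self.ignore_hosts:
            args += ["--ignore-hosts", pattern]
        return args
```

Returning an argv list rather than a shell string keeps hostnames and regex patterns out of shell quoting when the command is handed to Docker.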
Defer Topologies C and D. Topology C (extending pipelock to terminate TLS) is the cleanest long-term shape but is a substantial build and runs into the Apache 2.0 vs. ELv2 question. Topology D (mitmproxy with pipelock as a scan API) is attractive but requires a pipelock surface that doesn't exist today. Both are valid v2 targets; neither is the right starting point.
The network-egress-guard.md v1 iptables + dnsmasq layer remains
necessary alongside this — TLS interception covers HTTP/HTTPS only;
raw TCP, UDP/443 (QUIC), UDP/53 (DNS), and ICMP still need the
IP-level default-deny.
Open questions
- mitmproxy upstream-proxy mode mechanics. Does mitmproxy in upstream_proxy mode forward decrypted HTTP plaintext to the upstream, or does it wrap it in a new CONNECT? The documented behavior changed between mitmproxy 8 and 10. Needs verification against the version we pin.
- Pipelock's behavior when receiving plain HTTP. Pipelock's forward_proxy.enabled: true accepts both GET http://... (plain HTTP) and CONNECT host:443 (HTTPS). After Topology A is wired up, pipelock will see only plain HTTP. Does its DLP / MCP scanning pipeline run the full set of layers, or are some gated on the CONNECT path? Confirm by reading github.com/luckyPipewrench/pipelock/blob/main/docs/configuration.md.
- CA installation in the Anthropic-provided Claude Code Docker image. The base image's distribution determines whether update-ca-certificates (Debian/Ubuntu) or update-ca-trust (Red Hat) is the right command. The current Dockerfile should be inspected before assuming Debian.
- HTTP/2 over the agent → mitmproxy hop. Node's HTTP client negotiates h2 via ALPN. mitmproxy speaks h2 to clients in recent versions. Confirm the version we pin supports h2 end-to-end and doesn't downgrade to http/1.1 (which would be a silent performance regression).
- Selective-bump policy surface. Where does the "tunnel this hostname blindly" decision live? Options: a field on bottle.egress in the manifest, a fixed list of known-pinning hosts baked into the mitmproxy config, or pipelock-side opt-out. Manifest field is most consistent with the existing bottle.egress.allowlist shape.
- Image pin for mitmproxy. The pipelock-assessment.md recommendation is to pin by digest. The mitmproxy Docker Hub image should be pinned the same way. Which release line? mitmproxy/mitmproxy ships rolling and tagged versions; the tagged :11.x line is the right baseline.
- CA generation in Python (mitmproxy) vs. as a separate step. mitmproxy generates a CA on first launch if none is provided. For per-bottle ephemerality, we want the CA to be ours, not whatever mitmproxy chooses, so generate the CA in the host-side prepare step and inject it via --certs *=.... Mechanics need confirming: per the mitmproxy certificates documentation, --certs supplies a custom server (leaf) certificate, while a custom CA is picked up as mitmproxy-ca.pem from the configuration directory, so the latter is probably the right hook.
- Domain fronting verification. Once pipelock sees plaintext, it has access to the inner Host / :authority. A new rule that compares it against the outer CONNECT target catches domain fronting. Worth a follow-up note on whether pipelock has such a rule or whether we add it.
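The domain-fronting rule amounts to comparing two normalized authorities. A minimal sketch, assuming hypothetical helper names and a default port of 443 when the inner header omits one; it says nothing about whether pipelock already has an equivalent rule.

```python
def normalize_authority(value: str, default_port: int = 443) -> tuple[str, int]:
    """Lower-case the host, strip a trailing dot, and fill in the default port."""
    value = value.strip().lower()
    host, sep, port = value.rpartition(":")
    if not sep or not port.isdigit():  # no explicit port present
        host, port = value, str(default_port)
    return host.rstrip("."), int(port)


def is_domain_fronted(connect_target: str, inner_authority: str) -> bool:
    """True when the inner Host / :authority names a different origin than CONNECT."""
    return normalize_authority(connect_target) != normalize_authority(inner_authority)
```

For example, is_domain_fronted("api.anthropic.com:443", "evil.example") flags the mismatch, while a Host header that differs only in case or omits the port does not.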
References
- mitmproxy: https://mitmproxy.org, https://github.com/mitmproxy/mitmproxy
- mitmproxy upstream_proxy mode: https://docs.mitmproxy.org/stable/concepts/modes/#upstream-proxy
- mitmproxy CA cert installation: https://docs.mitmproxy.org/stable/concepts/certificates/
- Squid ssl_bump: https://wiki.squid-cache.org/Features/SslPeekAndSplice
- Squid ICAP: https://wiki.squid-cache.org/Features/ICAP
- goproxy: https://github.com/elazarl/goproxy
- gomitmproxy: https://github.com/AdguardTeam/gomitmproxy
- martian: https://github.com/google/martian
- Node TLS / NODE_EXTRA_CA_CERTS: https://nodejs.org/api/cli.html#node_extra_ca_certsfile
- Python SSL_CERT_FILE and REQUESTS_CA_BUNDLE: https://docs.python.org/3/library/ssl.html#ssl.SSLContext.load_verify_locations
- Prior research (pipelock assessment): docs/research/pipelock-assessment.md
- Prior research (network egress guard): docs/research/network-egress-guard.md
- Prior research (secret exfil tripwire encodings): docs/research/secret-exfil-tripwire-encodings.md
Research conducted 2026-05-12.