didericis/bot-bottle

didericis 6716f091c1

docs(prd): add 0006, enable pipelock's native TLS interception

Supersedes the abandoned PR #8 (`mitmproxy-tls-interception`),
which built a mitmproxy + addon chain on the (falsified) premise
that pipelock could not MITM. Empirical proof from the impl-time
spike: with `tls_interception: { enabled: true, ca_cert, ca_key }`
in pipelock's config, pipelock answered a credential POST over
HTTPS with `STATUS=403 / body: blocked: request body contains
secret: GitHub Token` and emitted both `scanner:"tls_intercept"`
and `scanner:"body_dlp"` events. Standalone, no second proxy.

Net change vs PR #8: one sidecar instead of two, no vendored
addon, no addon-verdict pattern matching, no HTTPS-trust /
DNS / lookup workarounds. Same end-state behavior — pipelock's
DLP fires on plaintext for HTTPS hosts in the allowlist.

Also cleaning up the now-stale TLS-research notes:

- `docs/research/tls-mitm-for-pipelock.md` is removed. Its
  entire premise (mitmproxy in front of pipelock) is moot now
  that pipelock does the work natively. The mechanics of CONNECT
  bumping and the CA-lifecycle considerations it documented are
  the same as what pipelock implements; the PRD restates the
  parts that matter for the integration.
- `docs/research/pipelock-assessment.md` had two stale claims
  corrected: the "Pipelock does not perform TLS inspection (no
  CA trust injection)" line in §Scope gaps and the
  "no TLS termination" cell in the comparison table. Both now
  point at the `tls_interception` config and `pipelock tls`
  CLI instead.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-12 14:15:44 -04:00

PRD 0006: pipelock native TLS interception

Status: Draft
Author: didericis
Created: 2026-05-12

Summary

Turn on pipelock's built-in tls_interception so its DLP / URL / header / MCP scanners fire on the plaintext of HTTPS requests instead of only the outer CONNECT hostname. Pipelock generates a per-bottle ephemeral CA at launch (pipelock tls init); the public cert is installed into the agent container's trust store and the private key dies with the sidecar on teardown. The existing per-agent sidecar topology from PRD 0001 is otherwise unchanged — one container, no addon, no second proxy.

This supersedes the closed PR #8 / branch mitmproxy-tls-interception, which built a mitmproxy + addon chain on the (falsified) premise that pipelock could not MITM. Empirical proof from the impl-time spike: with tls_interception: { enabled: true, ca_cert, ca_key } in the pipelock config, pipelock answered a credential POST over HTTPS with STATUS=403 / body: blocked: request body contains secret: GitHub Token and emitted both scanner:"tls_intercept" and scanner:"body_dlp" events.

Problem

PRD 0001 wired pipelock onto every bottle's egress, but pipelock ran with its default tls_interception.enabled: false. Pipelock is the agent container's only egress route, yet for HTTPS traffic it saw only the CONNECT hostname and the encrypted bytes inside the tunnel. Pipelock's headline scanners (request body DLP with 48 credential patterns, header DLP, URL DLP, subdomain entropy, MCP scanning, response-body scanning) all need plaintext to fire. Against the HTTPS-only hosts in DEFAULT_ALLOWLIST (api.anthropic.com, raw.githubusercontent.com, etc.) they are effectively disabled.

The existing tests/integration/test_pipelock_blocks_secret_post test only passes because it forces the agent to send plain HTTP through pipelock's forward-proxy mode. Real Claude Code traffic uses HTTPS via CONNECT and slips past the scanner.

Goals / Success Criteria

The feature works when all of the following are observable:

A Node / curl request from inside a launched bottle to a CONNECT-bumped HTTPS host (e.g. https://api.anthropic.com/dlp-probe) carrying a pipelock-recognized credential pattern in the body returns 403 from pipelock with the documented blocked: request body contains secret: … body. Pipelock's body_dlp event fires on the decrypted request.
A clean HTTPS GET from inside the bottle to an allowlisted host (e.g. https://raw.githubusercontent.com/...) returns the real upstream response — TLS interception doesn't break legitimate traffic.
The agent's TLS library trusts pipelock's bumped leaf certs (per the bottle's installed CA); no TLS-trust errors.
Claude Code reaches api.anthropic.com end-to-end through the bottle and completes a chat round-trip.

The feature is done when all of the following ship:

pipelock_build_config / pipelock_render_yaml emit a tls_interception block with enabled: true and the per-bottle CA cert/key paths. The defaults (cert_ttl: 24h, cert_cache_size: 10000, passthrough_domains: []) are kept; only enabled and the cert paths are populated.
The prepare step generates a per-bottle CA via pipelock tls init in a one-shot container, writes ca.pem and ca-key.pem to stage_dir. Paths land on the DockerBottlePlan.
DockerPipelockProxy.start mounts the stage dir into the sidecar (read-only) so the running pipelock can read its CA.
BottleBackend.provision_ca (new) copies the CA public cert into the agent at /usr/local/share/ca-certificates/claude-bottle-mitm.crt, runs update-ca-certificates, and sets the NODE_EXTRA_CA_CERTS / SSL_CERT_FILE / REQUESTS_CA_BUNDLE env trio on the agent container's runtime env. It is a default no-op on the abstract base so other backends aren't forced to implement it.
The launch step prints a one-line stderr log with the SHA-256 fingerprint of the public CA cert (computed via stdlib ssl.PEM_cert_to_DER_cert + hashlib.sha256).
On bottle teardown the sidecar is removed and the CA private key is gone with it.
Two new integration tests under tests/integration/:
- HTTPS variant of the credential-post block test (proves the tls_intercept + body_dlp chain fires end-to-end).
- Clean HTTPS GET test (proves the allow path doesn't break TLS trust and returns real upstream content).
The dry-run preflight (start --dry-run) renders the new TLS layer. Text: one line under the egress summary. JSON: a reserved egress.tls_interception: { enabled: true, ca_fingerprint: null } block — fingerprint is null at dry-run because the CA only exists after launch.
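
The config change in the first "done" item can be sketched as follows. This is a minimal illustration, not the project's code: build_tls_interception_block and render_tls_interception_yaml are hypothetical stand-ins for the relevant parts of pipelock_build_config / pipelock_render_yaml, whose real signatures are not shown in this PRD. Only the key names enabled, ca_cert, and ca_key come from the text above.

```python
def build_tls_interception_block(ca_cert_path: str, ca_key_path: str) -> dict:
    """Return the tls_interception mapping the PRD asks the builder to emit.

    Only `enabled` and the two CA paths are populated; cert_ttl,
    cert_cache_size and passthrough_domains are left to pipelock's defaults.
    (Hypothetical helper; the real builder's signature may differ.)
    """
    return {
        "tls_interception": {
            "enabled": True,
            "ca_cert": ca_cert_path,
            "ca_key": ca_key_path,
        }
    }


def render_tls_interception_yaml(block: dict) -> str:
    """Render the block as YAML by hand, without a third-party YAML library."""
    tls = block["tls_interception"]
    lines = [
        "tls_interception:",
        f"  enabled: {'true' if tls['enabled'] else 'false'}",
        f"  ca_cert: {tls['ca_cert']}",
        f"  ca_key: {tls['ca_key']}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # /h/... are the in-sidecar paths named later in this PRD.
    print(render_tls_interception_yaml(
        build_tls_interception_block("/h/ca.pem", "/h/ca-key.pem")
    ))
```

Leaving the tuning keys out of the rendered block, rather than writing the defaults explicitly, keeps the config in step with whatever defaults the pinned pipelock version ships.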

Non-goals

A second proxy in the chain. Pipelock does the bumping natively; the mitmproxy approach was based on a wrong premise (closed PR #8).
Per-bottle override to disable interception. v1 always enables tls_interception. The pipelock-side passthrough_domains list is the right knob if a future allowlisted host turns out to pin certs — exposing it through the manifest is a follow-up.
A long-lived / shared CA across bottles. Each bottle gets a fresh CA generated by pipelock tls init and destroyed with the sidecar.
Tuning cert_ttl, cert_cache_size, max_response_bytes, cross_request_detection, or other pipelock advanced features. Defaults from pipelock generate config --preset strict are fine for v1.
Trust-store paths for non-Debian agent images. node:22-slim is Debian; update-ca-certificates is the right command. A Red-Hat-family base would need update-ca-trust.
HTTP/3 / QUIC. Pipelock's interception covers HTTP and HTTPS over TCP; QUIC on UDP/443 still needs an iptables layer (separate PRD).

Scope

In scope

claude_bottle/pipelock.py changes:
- Extend pipelock_build_config to include tls_interception: { enabled: true, ca_cert: <path>, ca_key: <path> }. Paths are populated from the plan; the function's signature grows a cert_path / key_path pair or reads them off Bottle once they're stored.
- Extend pipelock_render_yaml to emit the new block.
claude_bottle/backend/docker/pipelock.py changes:
- New helper pipelock_tls_init(stage_dir) runs the upstream image as a one-shot: docker run --rm -v <stage>:/h -e PIPELOCK_HOME=/h pipelock tls init, leaving ca.pem and ca-key.pem under stage_dir. The host file owner is whatever the upstream image's user is; the sidecar mount is read-only so this is fine.
- DockerPipelockProxy.start mounts the stage dir into the sidecar at /h:ro and references the CA paths in the rendered YAML.
claude_bottle/backend/__init__.py: new method provision_ca(plan, target) on the BottleBackend abstract base, with a default no-op body so backends are not required to override it. BottleBackend.provision orchestrates ca → prompt → skills → ssh → git.
claude_bottle/backend/docker/provision/ca.py (new):
- Reads the cert from stage_dir (already written by prepare).
- docker cp into the agent.
- docker exec -u 0 ... chmod 644 ... + update-ca-certificates.
- Computes the SHA-256 fingerprint with stdlib (ssl + hashlib), emits one stderr log line.
claude_bottle/backend/docker/launch.py:
- Three new -e flags on the agent's docker run: NODE_EXTRA_CA_CERTS=/usr/local/share/ca-certificates/claude-bottle-mitm.crt, SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt, REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt.
- HTTPS_PROXY / HTTP_PROXY continue to point at pipelock (unchanged from PRD 0001 — the mitmproxy detour in PR #8 is abandoned).
claude_bottle/backend/docker/bottle_plan.py:
- One new info(...) line in print() noting TLS interception is on.
- to_dict() gains an egress.tls_interception: { enabled: true, ca_fingerprint: null } block. Reserved for future population.
claude_bottle/backend/docker/prepare.py: call pipelock_tls_init(stage_dir) and write the resolved cert/key paths onto the plan (either on the existing proxy_plan field or on the parent DockerBottlePlan).
Tests:
- tests/integration/test_pipelock_blocks_secret_https_post.py (new) — HTTPS variant of the existing block test.
- tests/integration/test_pipelock_allows_normal_https.py (new) — clean HTTPS GET succeeds.
- tests/unit/test_pipelock_yaml.py updated to assert the new tls_interception block in the rendered config.
- tests/integration/test_dry_run_plan.py updated to assert the new egress.tls_interception JSON block.
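
As a rough sketch of the launch.py item above, the three -e flags could be assembled like this. The variable names and the paths are the ones listed in this PRD; the helper name ca_env_flags and the module-level constants are hypothetical.

```python
# Paths taken from the launch.py item in this PRD.
CA_CERT_IN_AGENT = "/usr/local/share/ca-certificates/claude-bottle-mitm.crt"
SYSTEM_CA_BUNDLE = "/etc/ssl/certs/ca-certificates.crt"


def ca_env_flags() -> list[str]:
    """Build the `-e NAME=value` pairs for the agent's docker run.

    NODE_EXTRA_CA_CERTS points at the single per-bottle CA cert, while
    SSL_CERT_FILE and REQUESTS_CA_BUNDLE point at the system bundle that
    update-ca-certificates regenerates. (Hypothetical helper name.)
    """
    env = {
        "NODE_EXTRA_CA_CERTS": CA_CERT_IN_AGENT,
        "SSL_CERT_FILE": SYSTEM_CA_BUNDLE,
        "REQUESTS_CA_BUNDLE": SYSTEM_CA_BUNDLE,
    }
    flags: list[str] = []
    for name, value in env.items():
        flags += ["-e", f"{name}={value}"]
    return flags


if __name__ == "__main__":
    print(" ".join(ca_env_flags()))
```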

Out of scope

Modifying pipelock itself. We're using existing config knobs.
A manifest field to disable / customize interception per bottle. Doable but premature.
Wiring passthrough_domains. The default [] is correct for v1; add the manifest field when a pinning host shows up.
cross_request_detection, entropy_budget, fragment_reassembly, reverse_proxy, scan_api — features pipelock exposes but we don't need for the body-DLP gap.

Proposed Design

Topology

agent --HTTPS_PROXY--> pipelock --[bumps TLS]--> internet
                       (sees plaintext: URL, headers, body)

Same single-sidecar shape as PRD 0001. The only addition is tls_interception in pipelock's config plus the per-bottle CA generated at prepare time.

CA lifecycle

Generation. Host-side, at prepare time, via a one-shot docker run --rm -v <stage>:/h -e PIPELOCK_HOME=/h pipelock tls init. Output is <stage>/ca.pem + <stage>/ca-key.pem, both mode 600.
Sidecar mount. DockerPipelockProxy.start adds -v <stage>:/h:ro to the sidecar's docker run. The rendered YAML references /h/ca.pem and /h/ca-key.pem. The private key is read-only from pipelock's perspective; the host stage dir is owned by the launching user.
Bottle install. provision_ca (Docker impl) does docker cp <stage>/ca.pem agent:/usr/local/share/ca-certificates/claude-bottle-mitm.crt, then update-ca-certificates. The CA env trio is set at docker run -e time (Docker propagates run-time env into docker exec, verified in PR #8's spike).
Teardown. The sidecar container is destroyed, the stage dir is removed by start.py's existing finally block, and the CA dies with both.
Fingerprint. Computed via stdlib in provision_ca and logged once to stderr (claude-bottle: mitm ca fingerprint: sha256:<hex>…). The private key never appears in any log.
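
The fingerprint step can be sketched with the two stdlib calls this PRD names. The function name ca_fingerprint_sha256 is hypothetical. The sketch also assumes the PEM text begins directly with the BEGIN CERTIFICATE header, because ssl.PEM_cert_to_DER_cert raises ValueError otherwise.

```python
import hashlib
import ssl


def ca_fingerprint_sha256(pem_text: str) -> str:
    """Return the SHA-256 fingerprint of a PEM certificate as lowercase hex.

    The hash is computed over the DER bytes of the certificate, not over
    the PEM text, so line wrapping in the PEM file does not affect it.
    """
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text)
    return hashlib.sha256(der_bytes).hexdigest()


if __name__ == "__main__":
    # Round-trip demo with placeholder bytes instead of a real certificate:
    # DER_cert_to_PEM_cert only base64-wraps its input, so this exercises
    # the decode-and-hash path without needing a CA on disk.
    demo_pem = ssl.DER_cert_to_PEM_cert(b"placeholder DER bytes")
    print("sha256:" + ca_fingerprint_sha256(demo_pem))
```

Only the public certificate is read here, which is consistent with the requirement that the private key never appears in any log.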

Data model changes

None to the manifest schema. The dry-run JSON contract grows a reserved egress.tls_interception block; the fingerprint is always null at dry-run because the CA doesn't exist yet.
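
For illustration, the reserved dry-run block might be produced along these lines. The key names come from this PRD; the function names are hypothetical, and the real to_dict() on the plan carries other keys that are omitted here.

```python
import json
from typing import Optional


def tls_interception_preflight(ca_fingerprint: Optional[str] = None) -> dict:
    """Return the reserved egress.tls_interception block.

    At dry-run the CA has not been generated yet, so the fingerprint is
    None, which json.dumps serializes as JSON null.
    """
    return {
        "egress": {
            "tls_interception": {
                "enabled": True,
                "ca_fingerprint": ca_fingerprint,
            }
        }
    }


def render_preflight_json() -> str:
    """Serialize the dry-run block the way a --dry-run JSON output might."""
    return json.dumps(tls_interception_preflight(), indent=2)


if __name__ == "__main__":
    print(render_preflight_json())
```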

Existing code touched

Surgical, all on the existing pipelock path:

claude_bottle/pipelock.py: config builder + YAML renderer.
claude_bottle/backend/__init__.py: provision_ca on the abstract base (default no-op).
claude_bottle/backend/docker/pipelock.py: tls init helper, sidecar volume mount.
claude_bottle/backend/docker/prepare.py: CA paths on plan.
claude_bottle/backend/docker/launch.py: CA env trio on agent.
claude_bottle/backend/docker/backend.py: provision_ca dispatch; self._proxy is threaded through prepare/launch with its shape unchanged.
claude_bottle/backend/docker/bottle_plan.py: preflight rendering.
claude_bottle/backend/docker/provision/ca.py (new).

Net diff is meaningfully smaller than PR #8 because pipelock already does the work — no addon, no second sidecar, no second backend module.

External dependencies

Pipelock image — unchanged pin from PRD 0001 (ghcr.io/luckypipewrench/pipelock@sha256:3b1a3941…, matching pipelock v2.3.0). No new image dependency.
No host-side crypto deps. CA generation uses the pipelock image's own tls init command in a one-shot container. Fingerprint uses Python stdlib ssl + hashlib.
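
A minimal sketch of the one-shot CA generation command, built as an argv list. The docker flags are the ones quoted in this PRD. Two things are assumptions: the helper name tls_init_argv, and that the image's entrypoint is the pipelock binary, so the arguments after the image are tls init. If the entrypoint differs, pass the full command instead. The image reference is left to the caller because the pinned digest is truncated in this document.

```python
from typing import Sequence


def tls_init_argv(
    stage_dir: str,
    image: str,
    command: Sequence[str] = ("tls", "init"),  # assumption: entrypoint is pipelock
) -> list[str]:
    """Build the argv for the one-shot `pipelock tls init` container.

    The stage dir is bind-mounted at /h and PIPELOCK_HOME points there,
    so the generated ca.pem / ca-key.pem land in stage_dir on the host.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{stage_dir}:/h",
        "-e", "PIPELOCK_HOME=/h",
        image,
        *command,
    ]


if __name__ == "__main__":
    # "<pinned-pipelock-image>" is a placeholder, not a real reference.
    print(tls_init_argv("/tmp/stage", "<pinned-pipelock-image>"))
```

Returning a list instead of a shell string lets the caller hand it straight to subprocess.run without quoting concerns for the stage path.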

Open questions

Mount semantics for the stage dir. The sidecar runs with a -v <host-stage>:/h:ro bind mount. The CA files were written by the one-shot pipelock tls init container with whatever UID pipelock's image uses; the sidecar reads them as that same UID. This should work, but confirm on first impl by inspecting the file modes/owners and checking that the sidecar actually loads them. Fallback: docker cp the cert/key into the sidecar container after docker create (mirror PR #8's mitmproxy lifecycle).
Cert validity / TTL. Defaults are cert_ttl: 24h for per-host leaves; the CA validity from pipelock tls init is 10 years by default (--validity 87600h). The CA outlives the bottle either way; per-bottle ephemerality is enforced by generating a fresh one each launch, not by setting a short CA validity. Document; no tuning in v1.
passthrough_domains shape. Once we expose this through the manifest in a follow-up, the natural place is bottle.egress.tls_passthrough_domains: [host, ...], mirroring the existing egress.allowlist shape.
Stage-dir cleanup ordering. The stage dir holds the CA private key briefly. start.py's existing finally block removes it with shutil.rmtree. Confirm the rmtree fires after the sidecar is stopped, so the sidecar isn't reading a deleted mount when it shuts down. The current order should be correct (teardown unwinds via ExitStack before the outer finally runs); verify.
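
The ordering claim in the last question can be checked in isolation with a small stand-in. This is a sketch of the general try / with ExitStack / finally pattern, not start.py itself: run_with_stage_dir and the event names are hypothetical, and the "sidecar stop" is simulated by a callback that records whether the stage dir still exists.

```python
import contextlib
import shutil
import tempfile
from pathlib import Path


def run_with_stage_dir(events: list) -> None:
    """Record the order of teardown steps relative to stage-dir removal."""
    stage_dir = Path(tempfile.mkdtemp(prefix="bottle-stage-"))
    try:
        with contextlib.ExitStack() as stack:
            # Stand-in for stopping the sidecar; runs when the with-block exits.
            stack.callback(
                lambda: events.append(("sidecar_stopped", stage_dir.exists()))
            )
            events.append(("bottle_running", stage_dir.exists()))
    finally:
        # Runs only after the ExitStack has fully unwound.
        shutil.rmtree(stage_dir, ignore_errors=True)
        events.append(("stage_removed", stage_dir.exists()))


if __name__ == "__main__":
    log: list = []
    run_with_stage_dir(log)
    for name, stage_exists in log:
        print(f"{name}: stage dir exists = {stage_exists}")
```

In this pattern the callback sees the stage dir still present, and the directory disappears only in the outer finally, which is the order the open question asks to verify against the real code.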

References

docs/research/pipelock-assessment.md (now corrected) — pipelock capability assessment including the tls_interception block.
docs/prds/0001-per-agent-egress-proxy-via-pipelock.md — egress-proxy baseline this PRD extends.
docs/prds/0003-bottle-backend-abstraction.md — backend ABC contract this PRD adds a provision_ca method to.
docs/prds/0004-split-out-provisioners.md — per-provisioner module pattern reused for the new CA provisioner.
Pipelock tls CLI (in-image help): pipelock tls init / install-ca / show-ca.
Closed PR #8 — earlier mitmproxy-based design built on the falsified "pipelock can't MITM" premise; archived for context.
