feat(egress-proxy): cutover from cred-proxy (PRD 0017 chunk 2)

Hard cutover. cred-proxy is deleted; egress-proxy is now the agent's
HTTP_PROXY (when routes are declared) with pipelock on its outbound
leg. Two per-bottle CAs are minted: egress-proxy's (agent trust
store) and pipelock's (egress-proxy's outbound trust store).

Manifest:
  - `bottle.cred_proxy` → hard error with a migration recipe.
  - `bottle.egress_proxy` is the new shape (PRD 0017 chunk 1).
  - CredProxy* types + role validators removed.

Wiring:
  - launch.py: `egress_proxy_tls_init` mints the egress-proxy CA
    (cert+key concat for mitmproxy + cert-only for agent trust);
    `DockerEgressProxy.start` docker-cps both CAs in, sets
    `HTTPS_PROXY=pipelock` + `EGRESS_PROXY_UPSTREAM_CA` so mitmdump
    trusts pipelock's MITM. Agent's HTTP_PROXY points at
    egress-proxy when routes exist, else falls back to pipelock
    (no-routes bottles unchanged).
  - prepare.py / backend.py: `cred_proxy` arg → `egress_proxy`;
    sidecar-orphan probe + plan field + dashboard view all
    renamed.
  - provision_ca: selects the egress-proxy CA when present, else
    pipelock's (filename renamed to claude-bottle-mitm-ca.crt).
  - bottle.provision: cred-proxy dotfile rewrites (~/.npmrc,
    ~/.gitconfig insteadOf, tea config) are gone; HTTP_PROXY now
    covers every client that honors it.
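
The proxy-selection rule described above can be sketched roughly as
follows. This is an illustration, not the code in launch.py: the
function name, the URL constants, and the pipelock port are
assumptions made for the example.

```python
# Hypothetical sketch of the agent-side proxy choice.
# EGRESS_PROXY_URL, PIPELOCK_URL and agent_proxy_url are illustrative
# names; the pipelock port in particular is an assumption.
EGRESS_PROXY_URL = "http://egress-proxy:9099"
PIPELOCK_URL = "http://pipelock:8888"


def agent_proxy_url(route_hosts: list[str]) -> str:
    """Return the proxy URL the agent's HTTP_PROXY should point at.

    Bottles that declare egress-proxy routes go through egress-proxy;
    bottles without routes keep talking to pipelock directly.
    """
    return EGRESS_PROXY_URL if route_hosts else PIPELOCK_URL
```

Keeping the no-routes case on pipelock means bottles that never
declared credentials see no topology change from this cutover.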

Pipelock helpers:
  - `pipelock_token_hosts` → `pipelock_route_hosts` (now reading
    egress_proxy.routes).
  - cred-proxy hostname auto-allow → egress-proxy hostname
    auto-allow.
  - Anthropic seed-phrase workaround now triggers when an
    egress_proxy route targets api.anthropic.com (was based on the
    cred-proxy `anthropic-base-url` role).
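
A self-contained sketch of the two helper behaviours described above,
assuming a simplified stand-in for the manifest's route type. Only the
`Host` field and the `api.anthropic.com` check come from this commit;
the `Route` class and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Stand-in for an egress_proxy route; only `Host` is modelled."""
    Host: str


def route_hosts(routes: list[Route]) -> list[str]:
    # Sorted + deduped hostnames, skipping empty Host values.
    return sorted({r.Host for r in routes if r.Host})


def seed_phrase_detection_enabled(routes: list[Route]) -> bool:
    # The detector is switched off when any route targets Anthropic.
    return not any(r.Host == "api.anthropic.com" for r in routes)


routes = [
    Route("api.github.com"),
    Route("api.anthropic.com"),
    Route("api.github.com"),  # duplicate, collapsed by the set
    Route(""),                # empty host, skipped
]
print(route_hosts(routes))                    # ['api.anthropic.com', 'api.github.com']
print(seed_phrase_detection_enabled(routes))  # False
```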

Dockerfile.egress-proxy:
  - Entrypoint conditionally passes
    `--set ssl_verify_upstream_trusted_ca=$EGRESS_PROXY_UPSTREAM_CA`
    (via the `${VAR:+...}` shell expansion) so standalone runs without
    a mounted pipelock CA still boot.
  - mkdirs `/home/mitmproxy/.mitmproxy` ahead of `docker cp`.
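
For readers less familiar with `${VAR:+...}`: it expands to the given
words only when the variable is set and non-empty. A rough Python
equivalent of that conditional flag, with a hypothetical function name
and an example CA filename that is not taken from the repository:

```python
import os
from typing import Mapping, Optional


def upstream_ca_args(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Mimic `${EGRESS_PROXY_UPSTREAM_CA:+--set ...}` from the entrypoint.

    Returns the extra mitmdump arguments when the variable is set and
    non-empty, and an empty list otherwise.
    """
    env = os.environ if env is None else env
    ca_path = env.get("EGRESS_PROXY_UPSTREAM_CA", "")
    if not ca_path:
        return []
    return ["--set", f"ssl_verify_upstream_trusted_ca={ca_path}"]


# Example CA path below is an assumption for illustration.
print(upstream_ca_args({"EGRESS_PROXY_UPSTREAM_CA": "/certs/pipelock-ca.crt"}))
print(upstream_ca_args({}))  # []
```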

Deleted: claude_bottle/{cred_proxy,cred_proxy_server}.py,
backend/docker/{cred_proxy,provision/cred_proxy}.py,
Dockerfile.cred-proxy, plus the corresponding unit + integration
tests. backend/docker/cred_proxy_apply.py stays as a stub for
chunk 3 to rewrite (its container-name + routes-path constants
are inlined so it survives without the deleted module).

Test changes:
  - test_pipelock_allowlist rewritten against egress-proxy routes
    + the new `pipelock_route_hosts`.
  - test_manifest_md_load + test_pipelock_yaml + test_yaml_subset
    fixtures migrated to the `egress_proxy: { routes: [...] }`
    shape.
  - test_supervise_sidecar's round-trip test switched from
    `dashboard.approve` to `dashboard.reject`: the approval-apply
    path on cred-proxy-block proposals hits a deleted sidecar in
    chunk 2's transitional state. Chunk 3 restores the approval
    test once the remediation flow is retargeted at egress-proxy.

376 tests pass (was 427; net delta is removed cred-proxy tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 14:30:39 -04:00
parent 9e41845a2b
commit 70f773ac61
30 changed files with 573 additions and 3451 deletions
+58 -60
@@ -3,9 +3,14 @@
Pipelock (https://github.com/luckyPipewrench/pipelock) is an HTTP
forward proxy with hostname allowlisting + DLP scanning + URL-entropy
checks. One sidecar per agent, attached to the agent's --internal
-network and a per-agent user-defined egress bridge. Combined with
-HTTPS_PROXY/HTTP_PROXY pointing at the sidecar's service name, pipelock
-is the only egress route the agent has.
+network and a per-agent user-defined egress bridge.
+Post-PRD-0017 topology: the agent's HTTP_PROXY points at egress-proxy
+(not pipelock); egress-proxy sets `HTTPS_PROXY=pipelock` on its
+outbound leg. So pipelock no longer sees the agent's connections
+directly — it sees the egress-proxy → upstream leg, applies the
+hostname allowlist + DLP body scan there, and forwards to the real
+upstream.
Image pin: ghcr.io/luckypipewrench/pipelock@sha256:<digest> for tag 2.3.0.
"""
@@ -17,7 +22,7 @@ from dataclasses import dataclass
from pathlib import Path
from typing import cast
-from .cred_proxy import CRED_PROXY_HOSTNAME
+from .egress_proxy import EGRESS_PROXY_HOSTNAME
from .supervise import SUPERVISE_HOSTNAME
from .manifest import Bottle
@@ -57,48 +62,45 @@ def pipelock_bottle_allowlist(bottle: Bottle) -> list[str]:
return list(bottle.egress.allowlist)
-def pipelock_token_hosts(bottle: Bottle) -> list[str]:
-"""Hostnames the cred-proxy sidecar (PRD 0010) talks to upstream
-on the agent's behalf. Derived from each route's
-`upstream.UpstreamHost` in `bottle.cred_proxy.routes`. Returned
-sorted+deduped.
+def pipelock_route_hosts(bottle: Bottle) -> list[str]:
+"""Hostnames declared in `bottle.egress_proxy.routes`. Returned
+sorted + deduped.
-These hosts must be on pipelock's allowlist so cred-proxy's
-outbound HTTPS traffic can leave the egress network. They are
-NOT auto-added to passthrough_domains: cred-proxy's HTTPS client
-trusts pipelock's per-bottle CA at runtime (installed via
-docker cp + update-ca-certificates in the cred-proxy image),
-so pipelock MITMs and body-scans the cred-proxy → upstream leg
-the same way it does direct agent traffic."""
-hosts = {r.UpstreamHost for r in bottle.cred_proxy.routes if r.UpstreamHost}
+Post-cutover topology (PRD 0017): the agent's HTTPS_PROXY points
+at egress-proxy, not pipelock; egress-proxy's outbound leg sets
+`HTTPS_PROXY=pipelock`. So pipelock no longer terminates the
+agent's connections — it sees the egress-proxy → upstream leg
+only. Each declared route's host still needs to be on pipelock's
+allowlist so that leg can leave the egress network."""
+hosts = {r.Host for r in bottle.egress_proxy.routes if r.Host}
return sorted(hosts)
def pipelock_effective_allowlist(bottle: Bottle) -> list[str]:
"""Deduplicated union of: baked-in defaults, bottle.egress.allowlist,
-the cred-proxy upstream hosts derived from bottle.cred_proxy.routes,
-the cred-proxy sidecar's own hostname when any cred_proxy route is
-declared, and the supervise sidecar's hostname when bottle.supervise
-is enabled. Sorted for stability. Git upstreams declared in
-`bottle.git` do NOT contribute here — git traffic flows through the
-per-agent git-gate sidecar (PRD 0008), not pipelock.
+the egress-proxy route hosts (from bottle.egress_proxy.routes), the
+egress-proxy sidecar's own hostname when any route is declared, and
+the supervise sidecar's hostname when bottle.supervise is enabled.
+Sorted for stability. Git upstreams declared in `bottle.git` do NOT
+contribute here — git traffic flows through the per-agent git-gate
+sidecar (PRD 0008), not pipelock.
-The cred-proxy + supervise hostnames are auto-added because the
-agent's HTTP_PROXY points at pipelock, so a manifest-driven URL
-like `http://cred-proxy:9099/anthropic/...` or
-`http://supervise:9100/` arrives at pipelock as a request for the
-sidecar hostname. Without this auto-allow, pipelock would 403 the
-request before it reached the sidecar."""
+The egress-proxy + supervise hostnames are auto-added because the
+sidecars sit on the bottle's internal network alongside the agent;
+requests that pass through pipelock for `egress-proxy:9099` or
+`supervise:9100` (e.g. when egress-proxy uses HTTPS_PROXY=pipelock
+on its upstream leg) would otherwise be 403'd by pipelock's
+hostname gate."""
seen: dict[str, None] = {}
for h in DEFAULT_ALLOWLIST:
seen.setdefault(h, None)
for h in pipelock_bottle_allowlist(bottle):
if h:
seen.setdefault(h, None)
-for h in pipelock_token_hosts(bottle):
+for h in pipelock_route_hosts(bottle):
seen.setdefault(h, None)
-if bottle.cred_proxy.routes:
-seen.setdefault(CRED_PROXY_HOSTNAME, None)
+if bottle.egress_proxy.routes:
+seen.setdefault(EGRESS_PROXY_HOSTNAME, None)
if bottle.supervise:
seen.setdefault(SUPERVISE_HOSTNAME, None)
return sorted(seen.keys())
@@ -122,16 +124,16 @@ def pipelock_seed_phrase_detection_enabled(bottle: Bottle) -> bool:
Empirically only `seed_phrase_detection.enabled: false`
actually stops the block (verified by sending a 12-word BIP-39
-body through three pipelock instances). It is a global toggle
-— there is no per-path / per-host knob in pipelock 2.3.0 — so
-we turn the detector off for the entire bottle when an
-`anthropic-base-url` route is declared. The trade-off is
+body through three pipelock instances). It is a global toggle
+(there is no per-path / per-host knob in pipelock 2.3.0), so we turn the
+detector off for the entire bottle when the bottle declares an
+egress-proxy route to `api.anthropic.com`. The trade-off is
accepted: BIP-39 detection has little value in claude-bottle's
-threat model (the agent has no access to a user's crypto
-wallet seeds; the patterns that matter — gh*_, sk-ant-, AKIA,
-etc. — keep firing)."""
+threat model (the agent has no access to a user's crypto wallet
+seeds; the patterns that matter — gh*_, sk-ant-, AKIA, etc. —
+keep firing)."""
return not any(
-"anthropic-base-url" in r.Role for r in bottle.cred_proxy.routes
+r.Host == "api.anthropic.com" for r in bottle.egress_proxy.routes
)
@@ -143,16 +145,12 @@ def pipelock_effective_tls_passthrough(bottle: Bottle) -> list[str]:
other allowlisted host is MITM'd by pipelock's per-bottle CA so
its body scanner sees the cleartext.
-cred-proxy upstream hosts (github, gitea, npm) are deliberately
-NOT auto-added here. cred-proxy's HTTPS client trusts pipelock's
-CA at runtime (folded into its trust store via docker cp +
-update-ca-certificates), so pipelock can MITM the cred-proxy →
-upstream leg and body-scan it the same way it body-scans the
-agent's direct HTTPS traffic. Without this, an agent that pushed
-a secret via cred-proxy's /gh-git/ path would have no body
-scanner in front of it. The PRD's earlier reasoning that
-cred-proxy hosts needed passthrough was a workaround for the
-cert-trust gap that no longer exists.
+egress-proxy route hosts (github, gitea, npm) are deliberately
+NOT auto-added here. egress-proxy's HTTPS client trusts pipelock's
+CA at runtime (folded into its trust store via docker cp), so
+pipelock MITMs and body-scans the egress-proxy → upstream leg the
+same way it body-scanned the agent's direct HTTPS traffic before
+the PRD 0017 cutover.
`bottle` is kept on the signature for forward-compat (a future
knob might let a manifest opt a host into passthrough); today
@@ -207,13 +205,13 @@ def pipelock_build_config(
`ssrf_ip_allowlist` is the list of IPs / CIDRs that bypass
pipelock's SSRF guard. Pipelock blocks RFC1918-resolved
-destinations by default, which would catch the agent's
-cred-proxy traffic (cred-proxy sits on the bottle's internal
-Docker network in 172.x space). Pass the bottle's internal
-network CIDR here so `cred-proxy:9099` requests get through
-pipelock while api_allowlist + body-scanning still apply. Empty
-by default; omitted from the rendered yaml when empty so
-pipelock keeps its built-in SSRF defaults."""
+destinations by default, which would catch sibling-sidecar
+traffic on the bottle's internal Docker network in 172.x space
+(e.g. egress-proxy → pipelock on the upstream leg). Pass the
+bottle's internal network CIDR here so internal-network requests
+pass through pipelock while api_allowlist + body-scanning still
+apply. Empty by default; omitted from the rendered yaml when
+empty so pipelock keeps its built-in SSRF defaults."""
cfg: dict[str, object] = {
"version": 1,
"mode": "strict",
@@ -322,9 +320,9 @@ class PipelockProxyPlan:
that they are populated.
`internal_network_cidr` ends up on pipelock's `ssrf.ip_allowlist`
-so the agent's requests at `cred-proxy:9099` (or any other
-bottle-internal sidecar) bypass pipelock's RFC1918 SSRF guard
-while api_allowlist and body-scanning still apply."""
+so traffic from sibling sidecars (egress-proxy → pipelock on the
+upstream leg, etc.) bypasses pipelock's RFC1918 SSRF guard while
+api_allowlist and body-scanning still apply."""
yaml_path: Path
slug: str