feat(egress-proxy): cutover from cred-proxy (PRD 0017 chunk 2)
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m3s

Hard cutover. cred-proxy is deleted; egress-proxy is now the agent's
HTTP_PROXY (when routes are declared) with pipelock on its outbound
leg. Two per-bottle CAs are minted: egress-proxy's (agent trust
store) and pipelock's (egress-proxy's outbound trust store).

Manifest:
  - `bottle.cred_proxy` → hard error with a migration recipe.
  - `bottle.egress_proxy` is the new shape (PRD 0017 chunk 1).
  - CredProxy* types + role validators removed.

Wiring:
  - launch.py: `egress_proxy_tls_init` mints the egress-proxy CA
    (cert+key concat for mitmproxy + cert-only for agent trust);
    `DockerEgressProxy.start` docker-cps both CAs in, sets
    `HTTPS_PROXY=pipelock` + `EGRESS_PROXY_UPSTREAM_CA` so mitmdump
    trusts pipelock's MITM. Agent's HTTP_PROXY points at
    egress-proxy when routes exist, else falls back to pipelock
    (no-routes bottles unchanged).
  - prepare.py / backend.py: `cred_proxy` arg → `egress_proxy`;
    sidecar-orphan probe + plan field + dashboard view all
    renamed.
  - provision_ca: selects the egress-proxy CA when present, else
    pipelock's (filename renamed to claude-bottle-mitm-ca.crt).
  - bottle.provision: cred-proxy dotfile rewrites (~/.npmrc,
    ~/.gitconfig insteadOf, tea config) are gone — HTTP_PROXY
    catches everything respecting it.

Pipelock helpers:
  - `pipelock_token_hosts` → `pipelock_route_hosts` (now reading
    egress_proxy.routes).
  - cred-proxy hostname auto-allow → egress-proxy hostname
    auto-allow.
  - Anthropic seed-phrase workaround now triggers when an
    egress_proxy route targets api.anthropic.com (was based on the
    cred-proxy `anthropic-base-url` role).

Dockerfile.egress-proxy:
  - Entrypoint conditionally passes
    `--set ssl_verify_upstream_trusted_ca=$EGRESS_PROXY_UPSTREAM_CA`
    (via the `${VAR:+...}` shell expansion) so standalone runs without
    a mounted pipelock CA still boot.
  - mkdirs `/home/mitmproxy/.mitmproxy` ahead of `docker cp`.

Deleted: claude_bottle/{cred_proxy,cred_proxy_server}.py,
backend/docker/{cred_proxy,provision/cred_proxy}.py,
Dockerfile.cred-proxy, plus the corresponding unit + integration
tests. backend/docker/cred_proxy_apply.py stays as a stub for
chunk 3 to rewrite (its container-name + routes-path constants
are inlined so it survives without the deleted module).

Test changes:
  - test_pipelock_allowlist rewritten against egress-proxy routes
    + the new `pipelock_route_hosts`.
  - test_manifest_md_load + test_pipelock_yaml + test_yaml_subset
    fixtures migrated to the `egress_proxy: { routes: [...] }`
    shape.
  - test_supervise_sidecar's round-trip test switched from
    `dashboard.approve` to `dashboard.reject`: the approval-apply
    path on cred-proxy-block proposals hits a deleted sidecar in
    chunk 2's transitional state. Chunk 3 restores the approval
    test once the remediation flow is retargeted at egress-proxy.

376 tests pass (was 427; net delta is removed cred-proxy tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
2026-05-25 14:30:39 -04:00
parent 9e41845a2b
commit 70f773ac61
30 changed files with 573 additions and 3451 deletions
+15 -17
View File
@@ -219,26 +219,30 @@ class BottleBackend(ABC, Generic[PlanT, CleanupT]):
argv.
Default orchestration: ca → prompt → skills → git →
cred_proxy. CA install runs first so the agent's trust store
supervise. CA install runs first so the agent's trust store
is rebuilt before anything inside the agent makes a TLS call.
cred_proxy runs last because it appends to ~/.gitconfig (which
provision_git writes). Subclasses typically don't override
this; they implement the sub-methods below."""
Subclasses typically don't override this; they implement the
sub-methods below.
PRD 0017: cred-proxy's agent-side dotfile rewrites (~/.npmrc,
~/.gitconfig insteadOf, tea config) are gone. Egress-proxy is
on the agent's HTTP_PROXY path so every tool that respects
HTTPS_PROXY (claude-code, git over HTTPS, npm, curl) is
intercepted without per-tool reconfiguration."""
self.provision_ca(plan, target)
prompt_path = self.provision_prompt(plan, target)
self.provision_skills(plan, target)
self.provision_git(plan, target)
self.provision_cred_proxy(plan, target)
self.provision_supervise(plan, target)
return prompt_path
def provision_ca(self, plan: PlanT, target: str) -> None:
"""Install pipelock's per-bottle CA into the agent's trust
store so the agent trusts the bumped CONNECT cert pipelock
presents. Default impl is a no-op so backends that don't
yet support TLS interception (every backend except Docker
today) aren't forced to implement it. The Docker backend
overrides to docker-cp the cert in and run
"""Install the per-bottle CA into the agent's trust store so
the agent trusts the bumped CONNECT cert egress-proxy (was
pipelock, pre-PRD-0017) presents. Default impl is a no-op so
backends that don't yet support TLS interception (every backend
except Docker today) aren't forced to implement it. The Docker
backend overrides to docker-cp the cert in and run
`update-ca-certificates`."""
@abstractmethod
@@ -258,12 +262,6 @@ class BottleBackend(ABC, Generic[PlanT, CleanupT]):
"""Copy the host's cwd `.git` directory into the running
bottle if the user requested --cwd. No-op otherwise."""
def provision_cred_proxy(self, plan: PlanT, target: str) -> None:
"""Drop the cred-proxy agent-side dotfiles (.npmrc,
.gitconfig insteadOf, ~/.config/tea/config.yml) per PRD 0010.
Default impl is a no-op for backends that don't yet support
the cred-proxy sidecar; the Docker backend overrides."""
def provision_supervise(self, plan: PlanT, target: str) -> None:
"""Write the in-bottle Claude Code MCP config so the agent
discovers the per-bottle supervise sidecar (PRD 0013).
+4 -8
View File
@@ -23,11 +23,10 @@ from . import prepare as _prepare
from .bottle import DockerBottle
from .bottle_cleanup_plan import DockerBottleCleanupPlan
from .bottle_plan import DockerBottlePlan
from .cred_proxy import DockerCredProxy
from .egress_proxy import DockerEgressProxy
from .git_gate import DockerGitGate
from .pipelock import DockerPipelockProxy
from .provision import ca as _ca
from .provision import cred_proxy as _cred_proxy
from .provision import git as _git
from .provision import prompt as _prompt
from .provision import skills as _skills
@@ -44,7 +43,7 @@ class DockerBottleBackend(BottleBackend["DockerBottlePlan", "DockerBottleCleanup
def __init__(self) -> None:
self._proxy = DockerPipelockProxy()
self._git_gate = DockerGitGate()
self._cred_proxy = DockerCredProxy()
self._egress_proxy = DockerEgressProxy()
self._supervise = DockerSupervise()
def _resolve_plan(self, spec: BottleSpec, *, stage_dir: Path) -> DockerBottlePlan:
@@ -53,7 +52,7 @@ class DockerBottleBackend(BottleBackend["DockerBottlePlan", "DockerBottleCleanup
stage_dir=stage_dir,
proxy=self._proxy,
git_gate=self._git_gate,
cred_proxy=self._cred_proxy,
egress_proxy=self._egress_proxy,
supervise=self._supervise,
)
@@ -63,7 +62,7 @@ class DockerBottleBackend(BottleBackend["DockerBottlePlan", "DockerBottleCleanup
plan,
proxy=self._proxy,
git_gate=self._git_gate,
cred_proxy=self._cred_proxy,
egress_proxy=self._egress_proxy,
supervise=self._supervise,
provision=self.provision,
) as bottle:
@@ -81,9 +80,6 @@ class DockerBottleBackend(BottleBackend["DockerBottlePlan", "DockerBottleCleanup
def provision_git(self, plan: DockerBottlePlan, target: str) -> None:
_git.provision_git(plan, target)
def provision_cred_proxy(self, plan: DockerBottlePlan, target: str) -> None:
_cred_proxy.provision_cred_proxy(plan, target)
def provision_supervise(self, plan: DockerBottlePlan, target: str) -> None:
_supervise_prov.provision_supervise(plan, target)
+30 -21
View File
@@ -11,7 +11,7 @@ import sys
from dataclasses import dataclass, field
from pathlib import Path
from ...cred_proxy import CredProxyPlan
from ...egress_proxy import EgressProxyPlan
from ...git_gate import GitGatePlan
from ...log import info
from ...manifest import Agent, Bottle
@@ -58,7 +58,7 @@ class DockerBottlePlan(BottlePlan):
prompt_file: Path
proxy_plan: PipelockProxyPlan
git_gate_plan: GitGatePlan
cred_proxy_plan: CredProxyPlan
egress_proxy_plan: EgressProxyPlan
# None when bottle.supervise is False. PRD 0013 supervise sidecar
# is opt-in via the manifest's bottle.supervise field.
supervise_plan: SupervisePlan | None
@@ -72,10 +72,11 @@ class DockerBottlePlan(BottlePlan):
bottle = manifest.bottle_for(spec.agent_name)
# The agent sees the union of literal env names (rendered into
# --env-file) and forwarded env names (`-e NAME` with the value
# arriving via subprocess env). The forwarded set already
# reflects PRD 0010's switch — when cred-proxy holds the
# anthropic token, CLAUDE_CODE_OAUTH_TOKEN is absent and
# ANTHROPIC_BASE_URL is present.
# arriving via subprocess env). The forwarded set holds the
# OAuth token (CLAUDE_CODE_OAUTH_TOKEN) and any host-env
# interpolations from the manifest; egress-proxy holds upstream
# tokens in its own environ, so no token forwarding from the
# agent to the proxy is needed.
env_names = sorted(set(bottle.env.keys()) | set(self.forwarded_env.keys()))
return _PlanView(
agent=agent,
@@ -120,16 +121,25 @@ class DockerBottlePlan(BottlePlan):
info(f" git gate : {'; '.join(git_lines)}")
else:
info(" git remotes : (none)")
if self.cred_proxy_plan.routes:
lines = [f"{r.path}{r.upstream}" for r in self.cred_proxy_plan.routes]
refs = sorted({r.token_ref for r in self.cred_proxy_plan.routes})
info(f" cred-proxy : {len(lines)} route(s); tokens: {', '.join(refs)}")
if self.egress_proxy_plan.routes:
lines = []
for r in self.egress_proxy_plan.routes:
paths = (
" " + ",".join(r.path_allowlist) if r.path_allowlist else ""
)
auth = f" [auth:{r.auth_scheme}]" if r.auth_scheme else ""
lines.append(f"{r.host}{auth}{paths}")
refs = sorted({r.token_ref for r in self.egress_proxy_plan.routes if r.token_ref})
tokens_part = (
f"; tokens: {', '.join(refs)}" if refs else ""
)
info(f" egress-proxy : {len(lines)} route(s){tokens_part}")
for line in lines:
info(f" {line}")
else:
info(" cred-proxy : (none)")
info(" egress-proxy : (none)")
info(f" egress : {self.allowlist_summary}")
info(" tls intercept : pipelock (per-bottle ephemeral CA, generated at launch)")
info(" tls intercept : egress-proxy (per-bottle ephemeral CA, generated at launch)")
if self.supervise_plan is not None:
info(
f" supervise : enabled; queue at {self.supervise_plan.queue_dir}"
@@ -166,23 +176,22 @@ class DockerBottlePlan(BottlePlan):
}
for u in self.git_gate_plan.upstreams
],
"cred_proxy": [
"egress_proxy": [
{
"path": r.path,
"upstream": r.upstream,
"host": r.host,
"path_allowlist": list(r.path_allowlist),
"auth_scheme": r.auth_scheme,
"token_ref": r.token_ref,
"roles": list(r.roles),
}
for r in self.cred_proxy_plan.routes
for r in self.egress_proxy_plan.routes
],
"egress": {
"host_count": len(hosts),
"hosts": hosts,
# PRD 0006: pipelock's `tls_interception` block is on
# for every launched bottle. ca_fingerprint is always
# null at dry-run because the CA doesn't exist yet —
# real launches print the fingerprint to stderr from
# PRD 0017: TLS interception moved from pipelock to
# egress-proxy. ca_fingerprint is always null at
# dry-run because the CA doesn't exist yet — real
# launches print the fingerprint to stderr from
# provision_ca. Reserved field for forward-compat.
"tls_interception": {
"enabled": True,
-250
View File
@@ -1,250 +0,0 @@
"""DockerCredProxy — the Docker-specific lifecycle for the per-bottle
cred-proxy sidecar (PRD 0010). Inherits the platform-agnostic prepare
step (route lift + routes.json render + token-env-map derivation)
from `CredProxy`."""
from __future__ import annotations
import os
import subprocess
from pathlib import Path
from ...cred_proxy import (
CRED_PROXY_HOSTNAME,
CredProxy,
CredProxyPlan,
cred_proxy_resolve_token_values,
)
from ...log import die, info, warn
from . import util as docker_mod
CRED_PROXY_IMAGE = os.environ.get(
"CLAUDE_BOTTLE_CRED_PROXY_IMAGE",
"claude-bottle-cred-proxy:latest",
)
CRED_PROXY_DOCKERFILE = "Dockerfile.cred-proxy"
# Listening port inside the sidecar. The agent dials cred-proxy on
# this port; surfaced as a constant so the provisioner and tests can
# both reference it.
CRED_PROXY_PORT = int(os.environ.get("CLAUDE_BOTTLE_CRED_PROXY_PORT", "9099"))
# In-container path the proxy server reads its route table from.
# Pre-created in Dockerfile.cred-proxy so `docker cp` can drop the
# file directly.
CRED_PROXY_ROUTES_IN_CONTAINER = "/run/cred-proxy/routes.json"
# In-container path for the per-bottle pipelock CA. Alpine's
# update-ca-certificates picks anything ending in `.crt` under
# /usr/local/share/ca-certificates/ and folds it into the system
# trust store at boot — so cred-proxy's HTTPS client trusts
# pipelock's bumped certs when pipelock MITMs the outbound leg.
CRED_PROXY_PIPELOCK_CA_IN_CONTAINER = "/usr/local/share/ca-certificates/pipelock.crt"
# Repo root, for `docker build` context. Resolved from this file's
# location: claude_bottle/backend/docker/cred_proxy.py → repo root.
_REPO_DIR = str(Path(__file__).resolve().parent.parent.parent.parent)
def cred_proxy_container_name(slug: str) -> str:
return f"claude-bottle-cred-proxy-{slug}"
def cred_proxy_url() -> str:
"""Base URL the agent dials. Stable across bottles because the
sidecar attaches `--network-alias cred-proxy` on the internal
network; the container name (which carries the slug) is not
referenced by agent-side config."""
return f"http://{CRED_PROXY_HOSTNAME}:{CRED_PROXY_PORT}"
def build_cred_proxy_image() -> None:
"""Build the cred-proxy image from `Dockerfile.cred-proxy`.
Called by `DockerCredProxy.start`; exposed at module level so
integration tests can build it without running the full launch
pipeline."""
docker_mod.build_image(CRED_PROXY_IMAGE, _REPO_DIR, dockerfile=CRED_PROXY_DOCKERFILE)
class DockerCredProxy(CredProxy):
"""Brings the cred-proxy sidecar up and down via Docker."""
def start(self, plan: CredProxyPlan) -> str:
"""Boot the cred-proxy sidecar:
1. Resolve every host TokenRef env var into a concrete
value. Fails early if any are unset.
2. Build the cred-proxy image (no-op when cache is hot).
3. `docker create` on the internal network with
`--network-alias cred-proxy` and one `-e CRED_PROXY_TOKEN_N`
flag per route. The values arrive via subprocess env, so
they never land on argv.
4. `docker cp` the routes.json into the container.
5. Attach to the per-agent egress network so the proxy can
reach the real upstream over HTTPS.
6. `docker start`.
Returns the container name (the target passed to `.stop`)."""
if not plan.routes:
die("DockerCredProxy.start called with no routes; caller should skip")
if not plan.internal_network or not plan.egress_network:
die(
"DockerCredProxy.start: internal_network / egress_network must be "
"populated on the plan before start"
)
if not plan.routes_path.is_file():
die(
f"cred-proxy routes file missing at {plan.routes_path}; "
f"CredProxy.prepare must run first"
)
# pipelock fields are populated by launch.py in production; both
# must be present (URL + CA) or both absent. Mixing is a wiring
# bug. Both-absent is supported only as a test escape hatch:
# the integration tests in tests/integration/ exercise header
# injection in isolation and do not bring pipelock up.
route_via_pipelock = bool(plan.pipelock_proxy_url) or plan.pipelock_ca_host_path != Path()
if route_via_pipelock:
if not plan.pipelock_proxy_url:
die(
"DockerCredProxy.start: pipelock_ca_host_path is set but "
"pipelock_proxy_url is empty; populate both or neither."
)
if not plan.pipelock_ca_host_path.is_file():
die(
f"DockerCredProxy.start: pipelock CA missing at "
f"{plan.pipelock_ca_host_path}; pipelock_tls_init must run first"
)
# Resolve host env vars into concrete values. This must
# happen at start time (not prepare) — the values flow into
# the sidecar's environ via subprocess env. The plan never
# holds them.
token_values = cred_proxy_resolve_token_values(plan.token_env_map, dict(os.environ))
build_cred_proxy_image()
name = cred_proxy_container_name(plan.slug)
info(f"starting cred-proxy sidecar {name} on network {plan.internal_network}")
create_args = [
"docker", "create",
"--name", name,
"--network", plan.internal_network,
"--network-alias", CRED_PROXY_HOSTNAME,
]
if route_via_pipelock:
# Route cred-proxy's outbound HTTPS through pipelock so
# the egress allowlist + DLP body scanner apply to its
# traffic. Pipelock MITMs each handshake with the
# per-bottle CA we docker cp in below.
create_args.extend([
"-e", f"HTTPS_PROXY={plan.pipelock_proxy_url}",
"-e", f"HTTP_PROXY={plan.pipelock_proxy_url}",
"-e", "NO_PROXY=localhost,127.0.0.1",
])
# One -e flag per token slot; values arrive via subprocess env.
# docker create with `-e NAME` (no =VALUE) reads NAME from the
# current process env at create time. We pass `env=child_env`
# to subprocess.run so the value comes from token_values, not
# the host's os.environ directly — keeps the resolver in one
# place and lets cred_proxy_resolve_token_values surface
# missing-env errors with a clear hint.
for token_env in sorted(plan.token_env_map.keys()):
create_args.extend(["-e", token_env])
create_args.append(CRED_PROXY_IMAGE)
child_env: dict[str, str] = {**os.environ, **token_values}
create_result = subprocess.run(
create_args, capture_output=True, text=True, env=child_env, check=False,
)
if create_result.returncode != 0:
die(
f"failed to create cred-proxy sidecar {name}: "
f"{create_result.stderr.strip()}"
)
cps: list[tuple[str, str, str]] = [
(str(plan.routes_path), CRED_PROXY_ROUTES_IN_CONTAINER, "routes.json"),
]
if route_via_pipelock:
# CA must land BEFORE `docker start` so the entrypoint's
# update-ca-certificates picks it up. Docker cp's the
# file in even on the stopped container — that's the
# whole reason this works without a custom build step.
cps.append((
str(plan.pipelock_ca_host_path),
CRED_PROXY_PIPELOCK_CA_IN_CONTAINER,
"pipelock CA",
))
for src, dst, label in cps:
cp_result = subprocess.run(
["docker", "cp", src, f"{name}:{dst}"],
capture_output=True,
text=True,
check=False,
)
if cp_result.returncode != 0:
subprocess.run(
["docker", "rm", "-f", name],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
check=False,
)
die(
f"failed to copy {label} into {name}: "
f"{cp_result.stderr.strip()}"
)
connect_result = subprocess.run(
["docker", "network", "connect", plan.egress_network, name],
capture_output=True, text=True, check=False,
)
if connect_result.returncode != 0:
subprocess.run(
["docker", "rm", "-f", name],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
check=False,
)
die(
f"failed to attach cred-proxy sidecar {name} to egress network "
f"{plan.egress_network}: {connect_result.stderr.strip()}"
)
start_result = subprocess.run(
["docker", "start", name], capture_output=True, text=True, check=False,
)
if start_result.returncode != 0:
subprocess.run(
["docker", "rm", "-f", name],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
check=False,
)
die(
f"failed to start cred-proxy sidecar {name}: "
f"{start_result.stderr.strip()}"
)
return name
def stop(self, target: str) -> None:
"""Idempotent: missing container is success. `target` is the
container name returned by `.start`."""
if subprocess.run(
["docker", "inspect", target],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
check=False,
).returncode == 0:
if subprocess.run(
["docker", "rm", "-f", target],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
check=False,
).returncode != 0:
warn(
f"failed to remove cred-proxy sidecar {target}; "
f"clean up with 'docker rm -f {target}'"
)
@@ -22,15 +22,25 @@ import subprocess
import tempfile
from pathlib import Path
from .cred_proxy import (
CRED_PROXY_ROUTES_IN_CONTAINER,
cred_proxy_container_name,
)
# Constants inlined from the deleted `claude_bottle.backend.docker.
# cred_proxy` module (PRD 0017 chunk 2 cutover). Chunk 3 retargets
# this file at egress-proxy and gets rid of these.
CRED_PROXY_ROUTES_IN_CONTAINER = "/run/cred-proxy/routes.json"
def _cred_proxy_container_name(slug: str) -> str:
return f"claude-bottle-cred-proxy-{slug}"
class CredProxyApplyError(RuntimeError):
"""Raised when fetch / apply fails. Caller renders to the
operator; does not crash the dashboard."""
operator; does not crash the dashboard.
PRD 0017 chunk 2 deletes the cred-proxy sidecar; this module's
docker-exec calls now hit a non-existent container and raise
CredProxyApplyError with a "container not running" message,
which the dashboard surfaces to the operator. Chunk 3 retargets
everything at egress-proxy."""
def fetch_current_routes(slug: str) -> str:
@@ -38,7 +48,7 @@ def fetch_current_routes(slug: str) -> str:
for `slug`. Returns the file content as a string. Raises
CredProxyApplyError if the sidecar isn't reachable or the read
fails."""
container = cred_proxy_container_name(slug)
container = _cred_proxy_container_name(slug)
r = subprocess.run(
["docker", "exec", container, "cat", CRED_PROXY_ROUTES_IN_CONTAINER],
capture_output=True, text=True, check=False,
@@ -80,7 +90,7 @@ def apply_routes_change(slug: str, new_content: str) -> tuple[str, str]:
sidecar are unchanged if the failure is before docker cp, and
are reverted in spirit if SIGHUP fails (cp landed but reload
didn't fire — caller's next attempt will SIGHUP again)."""
container = cred_proxy_container_name(slug)
container = _cred_proxy_container_name(slug)
before = fetch_current_routes(slug)
validate_routes_json(new_content)
+128 -29
View File
@@ -3,9 +3,9 @@ per-bottle egress-proxy sidecar (PRD 0017). Inherits the platform-
agnostic prepare step (route lift + routes.yaml render + token-env
map derivation) from `EgressProxy`.
Chunk 1 of the PRD: the lifecycle is implemented but not yet called
from `launch.py`. Tests build the image and exercise start/stop
directly. Chunk 2 wires this in alongside the cred-proxy removal."""
Chunks 1+2 of the PRD: the lifecycle is implemented and wired into
launch.py — cred-proxy is gone. Chunk 3 retargets the cred-proxy-
block remediation flow (PRD 0014)."""
from __future__ import annotations
@@ -24,6 +24,8 @@ from ...log import die, info, warn
from . import util as docker_mod
EGRESS_PROXY_IMAGE = os.environ.get(
"CLAUDE_BOTTLE_EGRESS_PROXY_IMAGE",
"claude-bottle-egress-proxy:latest",
@@ -32,9 +34,19 @@ EGRESS_PROXY_IMAGE = os.environ.get(
EGRESS_PROXY_DOCKERFILE = "Dockerfile.egress-proxy"
# Listening port inside the sidecar. The agent's HTTP_PROXY env var
# (chunk 2) will resolve to `http://egress-proxy:<port>`.
# resolves to `http://egress-proxy:<port>`.
EGRESS_PROXY_PORT = int(os.environ.get("CLAUDE_BOTTLE_EGRESS_PROXY_PORT", "9099"))
# In-container path for mitmproxy's CA. The format is a single PEM
# file holding BOTH the cert and the private key, concatenated. The
# upstream-trust CA (pipelock's, so egress-proxy trusts the upstream
# leg) is a separate file because pipelock keeps a different CA on
# its end.
EGRESS_PROXY_CA_IN_CONTAINER = "/home/mitmproxy/.mitmproxy/mitmproxy-ca.pem"
EGRESS_PROXY_PIPELOCK_CA_IN_CONTAINER = (
"/home/mitmproxy/.mitmproxy/pipelock-ca.pem"
)
# Repo root, for `docker build` context. Resolved from this file's
# location: claude_bottle/backend/docker/egress_proxy.py → repo root.
_REPO_DIR = str(Path(__file__).resolve().parent.parent.parent.parent)
@@ -62,6 +74,55 @@ def build_egress_proxy_image() -> None:
)
def egress_proxy_tls_init(stage_dir: Path) -> tuple[Path, Path]:
"""Mint the per-bottle egress-proxy MITM CA. Reuses the pipelock
binary's `tls init` subcommand — a known-good RSA CA minter we
already pin and run on this host.
Returns `(mitmproxy_pem, cert_only_pem)`:
- `mitmproxy_pem` is the single-PEM concat (cert + key)
mitmproxy reads from `~/.mitmproxy/mitmproxy-ca.pem`.
- `cert_only_pem` is the cert alone — installed into the agent's
trust store by `provision_ca` so the agent trusts the bumped
CONNECT cert egress-proxy presents.
Both files live under `<stage_dir>/egress-proxy-ca/` (mode 600).
Private keys never leave the host stage dir until
`DockerEgressProxy.start` docker-cps the concat file into the
sidecar; start.py's outer finally `shutil.rmtree`s the stage dir
after teardown.
Imported lazily inside the function so test patchers in
pipelock-land don't need to know about us."""
# Local import keeps the module-import graph free of a hard
# pipelock-image dependency at top of file (we don't actually
# need pipelock's *runtime* here, just its tls-init subcommand).
from .pipelock import PIPELOCK_IMAGE
work = stage_dir / "egress-proxy-ca"
work.mkdir(exist_ok=True)
result = subprocess.run(
["docker", "run", "--rm",
"-v", f"{work}:/h",
"-e", "PIPELOCK_HOME=/h",
PIPELOCK_IMAGE, "tls", "init"],
capture_output=True,
text=True,
check=False,
)
if result.returncode != 0:
die(f"egress-proxy tls init failed: {result.stderr.strip()}")
cert = work / "ca.pem"
key = work / "ca-key.pem"
if not cert.is_file() or not key.is_file():
die(f"egress-proxy tls init did not produce ca files in {work}")
cert.chmod(0o600)
# mitmproxy reads cert + key from a single concatenated PEM file.
mitm = work / "mitmproxy-ca.pem"
mitm.write_bytes(cert.read_bytes() + key.read_bytes())
mitm.chmod(0o600)
return (mitm, cert)
class DockerEgressProxy(EgressProxy):
"""Brings the egress-proxy sidecar up and down via Docker."""
@@ -71,13 +132,16 @@ class DockerEgressProxy(EgressProxy):
value. Fails early if any are unset.
2. Build the egress-proxy image (no-op when cache is hot).
3. `docker create` on the internal network with
`--network-alias egress-proxy` and one `-e EGRESS_PROXY_TOKEN_N`
flag per token slot. The values arrive via subprocess env, so
they never land on argv.
4. `docker cp` the routes.yaml into the container.
`--network-alias egress-proxy`, the `HTTPS_PROXY=pipelock`
env (so the upstream leg traverses pipelock), the
`EGRESS_PROXY_UPSTREAM_CA` env pointing at the in-container
pipelock-CA path (so mitmproxy trusts pipelock's MITM),
and one `-e EGRESS_PROXY_TOKEN_N` flag per token slot.
Secret values arrive via subprocess env, never argv.
4. `docker cp` the routes.yaml, mitmproxy CA (cert+key
concat), and pipelock CA (cert only) into the container.
5. Attach to the per-agent egress network so the proxy can
reach pipelock (chunk 2 turns this into the pipelock leg
via HTTPS_PROXY).
reach pipelock.
6. `docker start`.
Returns the container name (the target passed to `.stop`)."""
if not plan.routes:
@@ -92,6 +156,27 @@ class DockerEgressProxy(EgressProxy):
f"egress-proxy routes file missing at {plan.routes_path}; "
f"EgressProxy.prepare must run first"
)
if plan.mitmproxy_ca_host_path == Path() or not plan.mitmproxy_ca_host_path.is_file():
die(
f"DockerEgressProxy.start: mitmproxy CA missing at "
f"{plan.mitmproxy_ca_host_path}; egress_proxy_tls_init must run first"
)
# pipelock CA + upstream proxy URL: both must be present (we
# use HTTPS_PROXY=pipelock with pipelock's own MITM CA on the
# upstream leg) or both absent (egress-proxy goes direct, for
# standalone integration tests that don't bring pipelock up).
route_via_pipelock = bool(plan.pipelock_proxy_url) or plan.pipelock_ca_host_path != Path()
if route_via_pipelock:
if not plan.pipelock_proxy_url:
die(
"DockerEgressProxy.start: pipelock_ca_host_path is set but "
"pipelock_proxy_url is empty; populate both or neither."
)
if not plan.pipelock_ca_host_path.is_file():
die(
f"DockerEgressProxy.start: pipelock CA missing at "
f"{plan.pipelock_ca_host_path}; pipelock_tls_init must run first"
)
# Resolve host env vars into concrete values. Must happen at
# start time (not prepare) — the values flow into the sidecar's
@@ -111,15 +196,19 @@ class DockerEgressProxy(EgressProxy):
"--network", plan.internal_network,
"--network-alias", EGRESS_PROXY_HOSTNAME,
]
if plan.pipelock_proxy_url:
if route_via_pipelock:
# Route egress-proxy's outbound HTTPS through pipelock so
# the egress allowlist + DLP body scanner apply to its
# traffic on the egress-proxy → upstream leg. Wiring lands
# in chunk 2.
# traffic on the egress-proxy → upstream leg. Pipelock
# MITMs each handshake with its per-bottle CA, which is
# docker-cp'd in below and pointed to via the
# EGRESS_PROXY_UPSTREAM_CA env (entrypoint conditionally
# adds the matching --set flag).
create_args.extend([
"-e", f"HTTPS_PROXY={plan.pipelock_proxy_url}",
"-e", f"HTTP_PROXY={plan.pipelock_proxy_url}",
"-e", "NO_PROXY=localhost,127.0.0.1",
"-e", f"EGRESS_PROXY_UPSTREAM_CA={EGRESS_PROXY_PIPELOCK_CA_IN_CONTAINER}",
])
# One -e flag per token slot; values arrive via subprocess env.
# docker create with `-e NAME` (no =VALUE) reads NAME from the
@@ -143,24 +232,34 @@ class DockerEgressProxy(EgressProxy):
f"{create_result.stderr.strip()}"
)
cp_result = subprocess.run(
["docker", "cp", str(plan.routes_path),
f"{name}:{EGRESS_PROXY_ROUTES_IN_CONTAINER}"],
capture_output=True,
text=True,
check=False,
)
if cp_result.returncode != 0:
subprocess.run(
["docker", "rm", "-f", name],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
cps: list[tuple[Path, str, str]] = [
(plan.routes_path, EGRESS_PROXY_ROUTES_IN_CONTAINER, "routes.yaml"),
(plan.mitmproxy_ca_host_path, EGRESS_PROXY_CA_IN_CONTAINER, "mitmproxy CA"),
]
if route_via_pipelock:
cps.append((
plan.pipelock_ca_host_path,
EGRESS_PROXY_PIPELOCK_CA_IN_CONTAINER,
"pipelock CA",
))
for src, dst, label in cps:
cp_result = subprocess.run(
["docker", "cp", str(src), f"{name}:{dst}"],
capture_output=True,
text=True,
check=False,
)
die(
f"failed to copy routes.yaml into {name}: "
f"{cp_result.stderr.strip()}"
)
if cp_result.returncode != 0:
subprocess.run(
["docker", "rm", "-f", name],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
check=False,
)
die(
f"failed to copy {label} into {name}: "
f"{cp_result.stderr.strip()}"
)
connect_result = subprocess.run(
["docker", "network", "connect", plan.egress_network, name],
+55 -28
View File
@@ -24,7 +24,11 @@ from . import network as network_mod
from . import util as docker_mod
from .bottle import DockerBottle
from .bottle_plan import DockerBottlePlan
from .cred_proxy import DockerCredProxy
from .egress_proxy import (
DockerEgressProxy,
egress_proxy_tls_init,
egress_proxy_url,
)
from .git_gate import DockerGitGate
from .pipelock import (
PIPELOCK_CA_CERT_IN_CONTAINER,
@@ -47,7 +51,7 @@ def launch(
*,
proxy: DockerPipelockProxy,
git_gate: DockerGitGate,
cred_proxy: DockerCredProxy,
egress_proxy: DockerEgressProxy,
supervise: DockerSupervise,
provision: Callable[[DockerBottlePlan, str], str | None],
) -> Generator[DockerBottle, None, None]:
@@ -83,19 +87,28 @@ def launch(
# Docker assigns a CIDR to the new internal network. Pipelock's
# SSRF guard otherwise rejects any destination resolving into
# RFC1918 space — which includes the cred-proxy / git-gate /
# pipelock sidecars themselves. Allowlist the bottle's own
# internal subnet so the agent can reach its sidecars via
# pipelock; api_allowlist + body-scanning still apply.
# RFC1918 space — which includes the sibling sidecars
# (egress-proxy → pipelock on the upstream leg, etc.).
# Allowlist the bottle's own internal subnet so internal
# traffic passes through pipelock; api_allowlist + body-scanning
# still apply.
internal_cidr = network_mod.network_inspect_cidr(internal_network)
# Per-bottle ephemeral CA for pipelock's TLS interception
# (PRD 0006). One-shot pipelock container writes ca.pem +
# ca-key.pem under plan.stage_dir; .start docker-cp's them
# into the sidecar. The private key never leaves the host
# stage dir, which start.py's outer finally `shutil.rmtree`s
# after the sidecar is torn down.
# Per-bottle ephemeral CAs (PRD 0006 + PRD 0017). Two
# separate CAs:
# - pipelock CA: signs MITM certs pipelock presents on the
# egress-proxy → upstream leg.
# - egress-proxy CA: signs MITM certs egress-proxy presents
# to the agent on the agent → egress-proxy leg.
# Both are minted by one-shot pipelock containers (pipelock's
# `tls init` is a known-good RSA CA minter) under stage_dir;
# the .start steps docker-cp the files in. Private keys never
# leave the host stage dir, which start.py's outer finally
# `shutil.rmtree`s after the sidecars are torn down.
ca_cert_host, ca_key_host = pipelock_tls_init(plan.stage_dir)
egress_proxy_ca_host, egress_proxy_ca_cert_only = egress_proxy_tls_init(
plan.stage_dir,
)
# Re-render the pipelock yaml with the SSRF allowlist now that
# we know the internal CIDR. Prepare wrote the yaml without
@@ -141,26 +154,28 @@ def launch(
git_gate_name = git_gate.start(plan.git_gate_plan)
stack.callback(git_gate.stop, git_gate_name)
# Cred-proxy (PRD 0010). One sidecar per bottle when
# bottle.cred_proxy.routes is non-empty. Must come up AFTER pipelock
# — cred-proxy routes its outbound HTTPS through pipelock
# (HTTPS_PROXY in environ + the per-bottle CA in its trust
# store) so the egress allowlist + body scanner sit in the
# cred-proxy path too. Must come up BEFORE the agent so DNS
# resolution for `cred-proxy` succeeds on the agent's first
# call; tokens flow from the host env into the sidecar's
# environ, not the agent's.
if plan.cred_proxy_plan.routes:
cred_proxy_plan = dataclasses.replace(
plan.cred_proxy_plan,
# Egress-proxy (PRD 0017). One sidecar per bottle when
# bottle.egress_proxy.routes is non-empty. Must come up AFTER
# pipelock — egress-proxy routes its outbound HTTPS through
# pipelock (HTTPS_PROXY in environ + the pipelock CA in its
# trust store) so the egress allowlist + body scanner sit on
# the egress-proxy → upstream leg. Must come up BEFORE the
# agent so DNS resolution for `egress-proxy` succeeds on the
# agent's first call; tokens flow from the host env into the
# sidecar's environ, not the agent's.
if plan.egress_proxy_plan.routes:
egress_proxy_plan = dataclasses.replace(
plan.egress_proxy_plan,
internal_network=internal_network,
egress_network=egress_network,
mitmproxy_ca_host_path=egress_proxy_ca_host,
mitmproxy_ca_cert_only_host_path=egress_proxy_ca_cert_only,
pipelock_ca_host_path=ca_cert_host,
pipelock_proxy_url=pipelock_proxy_url(plan.slug),
)
plan = dataclasses.replace(plan, cred_proxy_plan=cred_proxy_plan)
cred_proxy_name = cred_proxy.start(plan.cred_proxy_plan)
stack.callback(cred_proxy.stop, cred_proxy_name)
plan = dataclasses.replace(plan, egress_proxy_plan=egress_proxy_plan)
egress_proxy_name = egress_proxy.start(plan.egress_proxy_plan)
stack.callback(egress_proxy.stop, egress_proxy_name)
# Supervise sidecar (PRD 0013). Opt-in via bottle.supervise.
# Internal-network only — the sidecar makes no outbound calls.
@@ -208,11 +223,23 @@ def _agent_no_proxy(plan: DockerBottlePlan) -> str:
return ",".join(hosts)
def _agent_proxy_url(plan: DockerBottlePlan) -> str:
"""Pick the proxy URL the agent's HTTP_PROXY env points at. PRD
0017: when an egress-proxy is declared, the agent goes through
egress-proxy (which in turn uses HTTPS_PROXY=pipelock on its
outbound leg). Otherwise the agent talks straight to pipelock —
keeps the network surface minimal for bottles that don't need
path filtering or credential injection."""
if plan.egress_proxy_plan.routes:
return egress_proxy_url()
return pipelock_proxy_url(plan.slug)
def _run_agent_container(plan: DockerBottlePlan, internal_network: str) -> str:
"""Build the `docker run` argv and execute it, handling name-
conflict races by incrementing the suffix (unless the name was
user-pinned). Returns the resolved container name."""
proxy_url = pipelock_proxy_url(plan.slug)
proxy_url = _agent_proxy_url(plan)
docker_args: list[str] = [
"--rm", "-d",
"--name", plan.container_name,
+27 -43
View File
@@ -15,17 +15,13 @@ from datetime import datetime, timezone
from pathlib import Path
from ... import pipelock
from ...cred_proxy import cred_proxy_render_routes
from ...egress_proxy import egress_proxy_render_routes
from ...env import ResolvedEnv, resolve_env
from ...log import die
from .. import BottleSpec
from . import util as docker_mod
from .bottle_plan import DockerBottlePlan
from .cred_proxy import (
DockerCredProxy,
cred_proxy_container_name,
cred_proxy_url,
)
from .egress_proxy import DockerEgressProxy, egress_proxy_container_name
from .git_gate import DockerGitGate, git_gate_container_name
from .bottle_state import (
BottleMetadata,
@@ -46,7 +42,7 @@ def resolve_plan(
stage_dir: Path,
proxy: DockerPipelockProxy,
git_gate: DockerGitGate,
cred_proxy: DockerCredProxy,
egress_proxy: DockerEgressProxy,
supervise: DockerSupervise,
) -> DockerBottlePlan:
"""Resolve Docker-specific names and write scratch files. Trusts
@@ -127,15 +123,15 @@ def resolve_plan(
# surface as a docker-create conflict deep inside launch() with no
# actionable hint. Fail fast here with a cleanup pointer instead.
# Only probe sidecars this launch will actually try to create:
# pipelock always; git-gate when bottle.git is non-empty; cred-proxy
# when bottle.cred_proxy.routes is non-empty.
# pipelock always; git-gate when bottle.git is non-empty;
# egress-proxy when bottle.egress_proxy.routes is non-empty.
sidecar_probes: list[tuple[str, str]] = [
("pipelock", pipelock_container_name(slug)),
]
if bottle.git:
sidecar_probes.append(("git-gate", git_gate_container_name(slug)))
if bottle.cred_proxy.routes:
sidecar_probes.append(("cred-proxy", cred_proxy_container_name(slug)))
if bottle.egress_proxy.routes:
sidecar_probes.append(("egress-proxy", egress_proxy_container_name(slug)))
if bottle.supervise:
sidecar_probes.append(("supervise", supervise_container_name(slug)))
for label, sidecar_name in sidecar_probes:
@@ -154,10 +150,13 @@ def resolve_plan(
proxy_plan = proxy.prepare(bottle, slug, stage_dir)
git_gate_plan = git_gate.prepare(bottle, slug, stage_dir)
cred_proxy_plan = cred_proxy.prepare(bottle, slug, stage_dir)
egress_proxy_plan = egress_proxy.prepare(bottle, slug, stage_dir)
supervise_plan = None
if bottle.supervise:
routes_content = cred_proxy_render_routes(cred_proxy_plan.routes) if cred_proxy_plan.routes else ""
routes_content = (
egress_proxy_render_routes(egress_proxy_plan.routes)
if egress_proxy_plan.routes else ""
)
allowlist_content = "\n".join(pipelock.pipelock_effective_allowlist(bottle)) + "\n"
# Current Dockerfile for the agent image. Read from the repo
# root; for `--cwd` derived images the base Dockerfile is what
@@ -176,36 +175,21 @@ def resolve_plan(
# never lands on argv or in env_file) goes into one dict. Nothing
# mutates the host os.environ.
forwarded_env: dict[str, str] = dict(resolved.forwarded)
# Find the (at most one) cred-proxy route claiming the
# anthropic-base-url role. Manifest validation enforces the
# singleton constraint. cred-proxy is the only path the Anthropic
# OAuth token reaches the bottle — there is no fallback that
# forwards it into the agent's environ directly. Bottles that
# need claude-code to authenticate must declare an
# anthropic-base-url route.
anthropic_route = next(
(r for r in cred_proxy_plan.routes if "anthropic-base-url" in r.roles),
None,
# When the bottle declares an egress-proxy route for the Anthropic
# OAuth flow, claude-code's outbound Authorization gets stripped +
# re-injected by egress-proxy. The agent's environ still needs
# *something* claude-code recognises as a credential or it refuses
# to start; ship a non-secret placeholder. The placeholder is not
# any real `auth.token_ref` value, so leaking it would tell an
# attacker only that egress-proxy is in front.
has_anthropic_auth = any(
r.token_ref == "CLAUDE_CODE_OAUTH_TOKEN"
for r in egress_proxy_plan.routes
)
if anthropic_route is not None:
# Point claude-code at the cred-proxy. The sidecar holds the
# OAuth token; the agent's environ does not. Strip the
# trailing slash so claude-code's path-join produces e.g.
# http://cred-proxy:9099/anthropic/v1/messages.
forwarded_env["ANTHROPIC_BASE_URL"] = (
f"{cred_proxy_url()}{anthropic_route.path}".rstrip("/")
)
# claude-code refuses to start without *some* credential in
# its env. The proxy strips inbound Authorization on every
# request and injects the real one — so a non-secret
# placeholder is sufficient and the SC1 test still holds
# (the placeholder is not a `cred_proxy.routes[].TokenRef`
# value). The agent cannot exfiltrate this string because
# it carries no meaning to api.anthropic.com.
forwarded_env["CLAUDE_CODE_OAUTH_TOKEN"] = "cred-proxy-placeholder"
# Belt-and-braces: turn off telemetry endpoints that don't
# route through ANTHROPIC_BASE_URL (statsig, error reporting).
# PRD 0010 open question default.
if has_anthropic_auth:
forwarded_env["CLAUDE_CODE_OAUTH_TOKEN"] = "egress-proxy-placeholder"
# Belt-and-braces: turn off telemetry endpoints (statsig,
# error reporting) that egress-proxy can't gate by auth.
forwarded_env.setdefault("CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC", "1")
forwarded_env.setdefault("DISABLE_ERROR_REPORTING", "1")
_write_env_file(resolved, env_file)
@@ -229,7 +213,7 @@ def resolve_plan(
prompt_file=prompt_file,
proxy_plan=proxy_plan,
git_gate_plan=git_gate_plan,
cred_proxy_plan=cred_proxy_plan,
egress_proxy_plan=egress_proxy_plan,
supervise_plan=supervise_plan,
allowlist_summary=allowlist_summary,
use_runsc=use_runsc,
+45 -21
View File
@@ -1,12 +1,19 @@
"""Install pipelock's per-bottle CA into the agent container's trust
store (PRD 0006).
"""Install the per-bottle MITM CA into the agent container's trust
store.
By the time this provisioner runs, `pipelock_tls_init` has generated
a fresh CA into `plan.stage_dir/pipelock-ca/` and the pipelock sidecar
is up with `tls_interception: { enabled: true }` referencing the
in-container CA paths. This step makes the agent trust certs signed
by that CA so the agent's TLS handshake with the bumped CONNECT
succeeds.
Post-PRD-0017 the CA depends on the agent's HTTP_PROXY target:
- Bottle declares `egress_proxy.routes[]` → agent's HTTP_PROXY
points at egress-proxy; the cert the agent must trust is the
one egress-proxy mints leaf certs with (the egress-proxy CA).
- No egress_proxy routes → agent's HTTP_PROXY points straight at
pipelock; the cert the agent must trust is pipelock's CA (the
pre-cutover behavior).
By the time this provisioner runs, the corresponding `tls_init`
helper has generated the chosen CA under `plan.stage_dir`, and the
sidecar (pipelock or egress-proxy) is up referencing the
in-container CA paths.
Cert lands on Debian's standard source path
(`/usr/local/share/ca-certificates/`); `update-ca-certificates`
@@ -27,6 +34,7 @@ from __future__ import annotations
import hashlib
import ssl
import subprocess
from pathlib import Path
from ....log import info
from ..bottle_plan import DockerBottlePlan
@@ -35,26 +43,42 @@ from ..bottle_plan import DockerBottlePlan
# Debian-family path for sources that `update-ca-certificates` reads.
# Bundle path is what the command rebuilds and what every standard
# TLS consumer in the image reads.
AGENT_CA_PATH = "/usr/local/share/ca-certificates/claude-bottle-pipelock-ca.crt"
AGENT_CA_PATH = "/usr/local/share/ca-certificates/claude-bottle-mitm-ca.crt"
AGENT_CA_BUNDLE = "/etc/ssl/certs/ca-certificates.crt"
def provision_ca(plan: DockerBottlePlan, target: str) -> None:
"""Copy pipelock's CA cert into the agent, rebuild the trust
bundle, emit a one-line fingerprint log. Called from
`BottleBackend.provision` after the agent container is up."""
container = target
cert_host_path = plan.proxy_plan.ca_cert_host_path
if not cert_host_path or not cert_host_path.is_file():
# Defensive: provision runs after launch wires CA paths
# onto the plan via dataclasses.replace; an empty path here
# would mean that wiring was skipped.
def _select_ca_cert(plan: DockerBottlePlan) -> tuple[Path, str]:
"""Pick the CA cert (and a short label for the log line) that
matches the proxy the agent's HTTP_PROXY points at. Egress-proxy
wins when the bottle declares any routes (it sits in front of
pipelock); else pipelock."""
if plan.egress_proxy_plan.routes:
cert = plan.egress_proxy_plan.mitmproxy_ca_cert_only_host_path
if cert == Path() or not cert.is_file():
from ....log import die
die(
f"egress-proxy CA cert missing at {cert or '(empty)'}; "
f"launch must have called egress_proxy_tls_init and "
f"re-bound the plan before provision"
)
return cert, "egress-proxy"
cert = plan.proxy_plan.ca_cert_host_path
if not cert or not cert.is_file():
from ....log import die
die(
f"pipelock CA cert missing at {cert_host_path or '(empty)'}; "
f"pipelock CA cert missing at {cert or '(empty)'}; "
f"launch must have called pipelock_tls_init and re-bound "
f"the plan before provision"
)
return cert, "pipelock"
def provision_ca(plan: DockerBottlePlan, target: str) -> None:
"""Copy the agent-facing CA cert into the agent, rebuild the
trust bundle, emit a one-line fingerprint log. Called from
`BottleBackend.provision` after the agent container is up."""
container = target
cert_host_path, label = _select_ca_cert(plan)
subprocess.run(
["docker", "cp", str(cert_host_path), f"{container}:{AGENT_CA_PATH}"],
@@ -76,4 +100,4 @@ def provision_ca(plan: DockerBottlePlan, target: str) -> None:
# fingerprint form. Never the private key.
der = ssl.PEM_cert_to_DER_cert(cert_host_path.read_text())
fingerprint = hashlib.sha256(der).hexdigest()
info(f"pipelock ca fingerprint: sha256:{fingerprint[:32]}...")
info(f"{label} ca fingerprint: sha256:{fingerprint[:32]}...")
@@ -1,238 +0,0 @@
"""Cred-proxy provisioning inside a running Docker bottle (PRD 0010).
Writes the agent-side configuration that points each tool at the
per-bottle cred-proxy sidecar:
- ~/.npmrc — `registry=` pointing at /npm/
- ~/.gitconfig (appended) — `insteadOf` rules for the
github / gitea hosts the bottle
declared a token for
- ~/.config/tea/config.yml — per-gitea login pointing at
/gitea/<host>/
The ANTHROPIC_BASE_URL env var is set at `docker run -e` time by the
backend's launch step, not here — it has to be in the agent's environ
before claude starts, and there is no point in writing it to a dotfile
the agent would have to source. See `prepare.py` for that.
"""
from __future__ import annotations
import os
import subprocess
from pathlib import Path
from ....cred_proxy import CredProxyRoute
from ....log import info
from .. import util as docker_mod
from ..bottle_plan import DockerBottlePlan
from ..cred_proxy import cred_proxy_url
def provision_cred_proxy(plan: DockerBottlePlan, target: str) -> None:
"""Drop the agent-side dotfiles for each declared cred-proxy
route. No-op when the bottle has no routes."""
routes = plan.cred_proxy_plan.routes
if not routes:
return
bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name)
git_gate_hosts = {g.UpstreamHost for g in bottle.git}
_provision_npmrc(plan, target, routes)
_provision_gitconfig(plan, target, routes, git_gate_hosts)
_provision_tea_config(plan, target, routes)
# --- npm --------------------------------------------------------------------
def render_npmrc(routes: tuple[CredProxyRoute, ...]) -> str:
"""Render `~/.npmrc` content. Driven by the `npm-registry` role:
finds the (single) route that claims it and writes a registry=
line at the proxy. Empty string when no such route exists, so
callers can branch on emptiness.
The proxy strips inbound Authorization and injects its own — the
npmrc deliberately carries no `_authToken`. The registry alone
is enough. Manifest validation enforces that the role is a
singleton, so the first match is the only match."""
for r in routes:
if "npm-registry" in r.roles:
return f"registry={cred_proxy_url()}{r.path}\n"
return ""
def _provision_npmrc(
plan: DockerBottlePlan,
target: str,
routes: tuple[CredProxyRoute, ...],
) -> None:
content = render_npmrc(routes)
if not content:
return
container_home = os.environ.get("CLAUDE_BOTTLE_CONTAINER_HOME", "/home/node")
container_npmrc = f"{container_home}/.npmrc"
npmrc = plan.stage_dir / "agent_npmrc"
npmrc.write_text(content)
npmrc.chmod(0o600)
info(f"writing {container_npmrc} (cred-proxy npm registry)")
subprocess.run(
["docker", "cp", str(npmrc), f"{target}:{container_npmrc}"],
stdout=subprocess.DEVNULL,
check=True,
)
docker_mod.docker_exec_root(target, ["chown", "node:node", container_npmrc])
docker_mod.docker_exec_root(target, ["chmod", "644", container_npmrc])
# --- git config -------------------------------------------------------------
def render_cred_proxy_gitconfig(
routes: tuple[CredProxyRoute, ...],
git_gate_hosts: set[str] = frozenset(), # type: ignore[assignment]
) -> str:
"""Render the `~/.gitconfig` fragment for cred-proxy insteadOf
rewrites. Driven by the `git-insteadof` role: each route that
claims it produces a `[url "<proxy><path>"] insteadOf =
<upstream>/` block. Empty string when no such route exists.
The rewrite is suppressed for any route whose upstream host is
also declared in `bottle.git`. git-gate is the canonical git
path on those hosts — its pre-receive runs gitleaks before
forwarding the push. A cred-proxy `https://<host>/` rewrite
would route HTTPS git ops around the gate. cred-proxy still
refuses smart-HTTP push at runtime (defense in depth), but
suppressing the rewrite means `git clone https://<host>/...`
doesn't have a tempting shortcut that just confuses on push.
The insteadOf left-hand side comes from `route.upstream` (with a
trailing `/` so insteadOf matches at the directory boundary),
so the same renderer handles github.com, gitea.dideric.is, and
any future host the user wires up."""
rules: list[str] = []
for r in routes:
if "git-insteadof" not in r.roles:
continue
# Strip scheme to derive the host for the git-gate overlap
# check. urllib.parse-free parse: same shape we accept in
# manifest validation.
host = r.upstream.removeprefix("https://").partition("/")[0].partition(":")[0]
if host in git_gate_hosts:
continue
rules.append(
f'[url "{cred_proxy_url()}{r.path}"]\n'
f"\tinsteadOf = {r.upstream}/\n"
)
if not rules:
return ""
return (
"# claude-bottle cred-proxy (PRD 0010): rewrite https://<host>/ to\n"
"# the per-bottle cred-proxy sidecar, which holds the upstream\n"
"# credential and injects the Authorization header.\n"
+ "".join(rules)
)
def _provision_gitconfig(
plan: DockerBottlePlan,
target: str,
routes: tuple[CredProxyRoute, ...],
git_gate_hosts: set[str],
) -> None:
"""Append the cred-proxy insteadOf rules to ~/.gitconfig. Runs
after `provision_git`, so any git-gate rules already live in the
file; we append rather than overwrite. Hosts already brokered by
git-gate are skipped — git-gate is the canonical git path there."""
content = render_cred_proxy_gitconfig(routes, git_gate_hosts)
if not content:
return
container_home = os.environ.get("CLAUDE_BOTTLE_CONTAINER_HOME", "/home/node")
container_gitconfig = f"{container_home}/.gitconfig"
info(f"appending cred-proxy insteadOf rules to {container_gitconfig}")
# Use `tee -a` over stdin so the content never lands on argv and the
# append is atomic from the agent's perspective. `tee` runs as the
# node user (the default in the container) so ownership is preserved.
result = subprocess.run(
["docker", "exec", "-i", target, "tee", "-a", container_gitconfig],
input=content,
text=True,
capture_output=True,
check=False,
)
if result.returncode != 0:
# Fall back to root-tee in case ~/.gitconfig didn't exist as the
# node user yet (no git-gate rules were written). The chown
# below makes ownership consistent.
result_root = subprocess.run(
["docker", "exec", "-i", "-u", "0", target,
"tee", "-a", container_gitconfig],
input=content,
text=True,
capture_output=True,
check=True,
)
_ = result_root # silence unused
docker_mod.docker_exec_root(target, ["chown", "node:node", container_gitconfig])
docker_mod.docker_exec_root(target, ["chmod", "644", container_gitconfig])
# --- tea --------------------------------------------------------------------
def render_tea_config(routes: tuple[CredProxyRoute, ...]) -> str:
"""Render `~/.config/tea/config.yml`. Driven by the `tea-login`
role: each route that claims it produces one `logins:` entry
pointing at the cred-proxy. The proxy substitutes the real
token at request time; the value in `token:` here is a
placeholder. `tea` refuses to make calls without a non-empty
token field, so the placeholder is necessary."""
tea_routes = [r for r in routes if "tea-login" in r.roles]
if not tea_routes:
return ""
lines = ["logins:"]
for r in tea_routes:
# Derive a stable login name from the upstream host. The
# path may not encode the host (e.g. `/gitea/dideric/` vs
# upstream gitea.dideric.is), so we read it off `upstream`.
host = r.upstream.removeprefix("https://").partition("/")[0].partition(":")[0]
lines.extend([
f"- name: {host}",
f" url: {cred_proxy_url()}{r.path}",
" token: cred-proxy-placeholder",
" default: false",
" ssh_host: \"\"",
" ssh_key: \"\"",
" insecure: false",
])
return "\n".join(lines) + "\n"
def _provision_tea_config(
plan: DockerBottlePlan,
target: str,
routes: tuple[CredProxyRoute, ...],
) -> None:
content = render_tea_config(routes)
if not content:
return
container_home = os.environ.get("CLAUDE_BOTTLE_CONTAINER_HOME", "/home/node")
container_tea = f"{container_home}/.config/tea/config.yml"
cfg = plan.stage_dir / "agent_tea_config.yml"
cfg.write_text(content)
cfg.chmod(0o600)
info(
f"writing {container_tea} "
f"({len([r for r in routes if 'tea-login' in r.roles])} tea login(s))"
)
docker_mod.docker_exec_root(
target, ["mkdir", "-p", str(Path(container_tea).parent)]
)
subprocess.run(
["docker", "cp", str(cfg), f"{target}:{container_tea}"],
stdout=subprocess.DEVNULL,
check=True,
)
docker_mod.docker_exec_root(target, [
"chown", "-R", "node:node", str(Path(container_tea).parent),
])
docker_mod.docker_exec_root(target, ["chmod", "600", container_tea])
-268
View File
@@ -1,268 +0,0 @@
"""Per-bottle credential proxy (PRD 0010).
A fourth per-bottle sidecar that holds API tokens (Anthropic OAuth,
GitHub PAT, Gitea PAT, npm token) and injects them as `Authorization`
headers on the agent's behalf. The agent's environ carries only URLs
pointing at `cred-proxy:<PORT>/<route>`; the upstream credentials live
exclusively in the cred-proxy container's environ.
The boundary is the container line — different PID, mount, and network
namespaces separate the agent's container from the cred-proxy's, so
the agent cannot ptrace into the proxy, cannot read its environ via
/proc, and cannot share memory. Reaching the proxy's environ requires
escaping the agent container, the same threshold pipelock and
git-gate already rely on.
This module defines the abstract proxy (`CredProxy`), its plan
dataclass (`CredProxyPlan`), and the resolved per-route shape
(`CredProxyRoute`). The sidecar's start/stop lifecycle is backend-
specific and lives on concrete subclasses (see
`claude_bottle/backend/docker/cred_proxy.py`).
"""
from __future__ import annotations
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from .log import die
from .manifest import Bottle
# DNS name agents use to reach the per-bottle cred-proxy sidecar.
# Backend-agnostic by contract: every concrete backend (Docker today,
# others later) attaches this name to its sidecar on the bottle's
# internal network so the agent's manifest-driven URLs (`http://
# cred-proxy:9099/...`) work without a backend-specific hostname.
# pipelock's allowlist also references this when adding the
# auto-allow entry for cred-proxy traffic from the agent.
CRED_PROXY_HOSTNAME = "cred-proxy"
@dataclass(frozen=True)
class CredProxyRoute:
"""One resolved route on the cred-proxy sidecar. Maps a path
under the proxy to a real upstream, an auth scheme, an
in-container env-var slot, and optional provisioner roles.
Distinct from `manifest.CredProxyRoute` (the declaration shape
with Capitalize fields): this is the runtime view after the
abstract `CredProxy.prepare` step assigns token slots and
normalizes URLs. Modules that need both alias one on import.
`path` is the agent-facing prefix (e.g. `/anthropic/`).
`upstream` is the upstream base URL with scheme. `auth_scheme`
is the literal word that precedes the token in the injected
header (`Bearer` for most upstreams; `token` for Gitea —
sidesteps go-gitea/gitea#16734).
`token_env` is the env-var name inside the cred-proxy container
(e.g. `CRED_PROXY_TOKEN_0`); `token_ref` is the host env var the
CLI reads at launch and forwards into the container's environ
under `token_env`. Routes that share a `token_ref` coalesce to
one `token_env` slot.
`roles` are the provisioner tags from the manifest route (see
`manifest.CRED_PROXY_ROLES`). Each tag drives one agent-side
rewrite when this route's dotfile family is written."""
path: str
upstream: str
auth_scheme: str
token_env: str
token_ref: str
roles: tuple[str, ...] = ()
@dataclass(frozen=True)
class CredProxyPlan:
"""Output of CredProxy.prepare; consumed by .start.
The slug + routes_path + routes + token_env_map fields are
filled at prepare time (host-side, side-effect-free on docker).
The network + pipelock fields are populated by the backend's
launch step via `dataclasses.replace` once those resources
exist. Empty defaults are sentinels meaning "not yet set";
`.start` validates that they are populated.
`token_env_map` is `{<token_env in container>: <token_ref on host>}`.
The backend's start step reads `os.environ[token_ref]` and
forwards the value into the cred-proxy container's environ under
`token_env`. The plan itself never holds token values — secrets
never land in a dataclass that might be logged.
`pipelock_ca_host_path` is the host path of the per-bottle CA
pipelock will present on bumped TLS handshakes; the cred-proxy
image's entrypoint runs `update-ca-certificates` over it so the
proxy's HTTPS client trusts pipelock's CA. `pipelock_proxy_url`
is the URL cred-proxy sets as `HTTPS_PROXY` in its environ so
outbound HTTPS traverses pipelock — making pipelock's body
scanner part of the cred-proxy egress path."""
slug: str
routes_path: Path
routes: tuple[CredProxyRoute, ...]
token_env_map: dict[str, str]
internal_network: str = ""
egress_network: str = ""
pipelock_ca_host_path: Path = Path()
pipelock_proxy_url: str = ""
def cred_proxy_routes_for_bottle(
bottle: Bottle,
) -> tuple[CredProxyRoute, ...]:
"""Lift each `bottle.cred_proxy.routes[]` manifest entry into a
resolved CredProxyRoute. Order is preserved so route lookup at
the proxy is stable.
Token-env slots are assigned per distinct `token_ref`: the first
route with `token_ref` "GH_PAT" gets `CRED_PROXY_TOKEN_0`; a
second route with the same `token_ref` shares slot 0. The launch
step forwards each `token_ref`'s value from the host environ into
the sidecar's environ under the matching slot name once.
Manifest validation already enforced uniqueness rules (no
duplicate paths, singleton-role enforcement)."""
out: list[CredProxyRoute] = []
slot_for_token: dict[str, str] = {}
for r in bottle.cred_proxy.routes:
token_env = slot_for_token.get(r.TokenRef)
if token_env is None:
token_env = f"CRED_PROXY_TOKEN_{len(slot_for_token)}"
slot_for_token[r.TokenRef] = token_env
out.append(CredProxyRoute(
path=r.Path,
upstream=r.Upstream.rstrip("/"),
auth_scheme=r.AuthScheme,
token_env=token_env,
token_ref=r.TokenRef,
roles=r.Role,
))
return tuple(out)
def cred_proxy_token_env_map(
routes: tuple[CredProxyRoute, ...],
) -> dict[str, str]:
"""Collapse the route list into `{token_env: token_ref}`. Two
routes that share a token (gh-api + gh-git) coalesce; the result
is the set of env vars the backend's start step must forward into
the sidecar's environ."""
out: dict[str, str] = {}
for r in routes:
existing = out.get(r.token_env)
if existing is not None and existing != r.token_ref:
die(
f"cred-proxy plan conflict: {r.token_env} maps to both "
f"{existing!r} and {r.token_ref!r}. Two routes sharing a "
f"token slot must reference the same host env var."
)
out[r.token_env] = r.token_ref
return out
def cred_proxy_render_routes(
routes: tuple[CredProxyRoute, ...],
) -> str:
"""Serialize the route table for the cred-proxy server to read.
JSON, no token values, no host env-var names — the only thing
the proxy needs at runtime is the path → upstream + auth-scheme +
in-container env-var mapping. The actual token values arrive via
the container's environ."""
payload = {
"routes": [
{
"path": r.path,
"upstream": r.upstream,
"auth_scheme": r.auth_scheme,
"token_env": r.token_env,
}
for r in routes
],
}
return json.dumps(payload, indent=2, sort_keys=False) + "\n"
def cred_proxy_resolve_token_values(
token_env_map: dict[str, str],
host_env: dict[str, str],
) -> dict[str, str]:
"""Read `host_env[TokenRef]` for each entry in `token_env_map` and
return `{token_env: <value>}`. Dies (with a clear pointer at the
missing var name) if any TokenRef is unset.
Pure function: takes the host env as an argument so tests can pass
a sealed mapping without touching `os.environ`."""
out: dict[str, str] = {}
for token_env, token_ref in token_env_map.items():
value = host_env.get(token_ref)
if value is None:
die(
f"cred-proxy: host env var '{token_ref}' is unset. Set it "
f"before launching, or remove the corresponding route from "
f"bottle.cred_proxy.routes."
)
if not value:
die(
f"cred-proxy: host env var '{token_ref}' is empty. The "
f"cred-proxy will not inject an empty token; set it to the "
f"real value or remove the route."
)
out[token_env] = value
return out
class CredProxy(ABC):
"""The per-bottle credential proxy. Encapsulates the host-side
prepare (route lift + routes.json render + token-env-map
derivation); the sidecar's start/stop lifecycle is backend-
specific and lives on concrete subclasses."""
def prepare(self, bottle: Bottle, slug: str, stage_dir: Path) -> CredProxyPlan:
"""Lift `bottle.cred_proxy.routes` into resolved routes,
render the routes.json (mode 600) under `stage_dir`, and
return the plan. Pure host-side, no docker subprocess. The
token-env map records the mapping the launch step uses to
forward values from the host's environ into the sidecar's
environ.
Returned plan is incomplete: the launch step must fill
`internal_network` / `egress_network` via `dataclasses.replace`
before passing it to `.start`."""
routes = cred_proxy_routes_for_bottle(bottle)
routes_path = stage_dir / "cred_proxy_routes.json"
routes_path.write_text(cred_proxy_render_routes(routes))
routes_path.chmod(0o600)
return CredProxyPlan(
slug=slug,
routes_path=routes_path,
routes=routes,
token_env_map=cred_proxy_token_env_map(routes),
)
@abstractmethod
def start(self, plan: CredProxyPlan) -> str:
"""Bring up the cred-proxy sidecar according to `plan`. Returns
the target string identifying the running instance — the same
value to pass to `.stop`. Backend-specific."""
@abstractmethod
def stop(self, target: str) -> None:
"""Tear down the cred-proxy sidecar identified by `target` (the
value `.start` returned). Idempotent: a missing target is
success. Backend-specific."""
__all__ = [
"CRED_PROXY_HOSTNAME",
"CredProxy",
"CredProxyPlan",
"CredProxyRoute",
"cred_proxy_render_routes",
"cred_proxy_resolve_token_values",
"cred_proxy_routes_for_bottle",
"cred_proxy_token_env_map",
]
-499
View File
@@ -1,499 +0,0 @@
"""Cred-proxy HTTP server (PRD 0010).
Runs inside the per-bottle cred-proxy sidecar. Reads
`/run/cred-proxy/routes.json` (laid down by the backend's start step
via `docker cp`) and listens on `0.0.0.0:<port>`. For each request:
1. Match the request path against the longest route prefix.
2. Strip any inbound `Authorization` header (the agent cannot
smuggle a stolen token through this path).
3. Inject the configured header using the value of the env var
named by the route's `token_env`.
4. Forward to the upstream over HTTPS, preserving method, path
suffix, query string, request body, and the remaining headers.
5. Stream the response back without buffering — SSE-safe.
The agent talks plain HTTP to this server (loopback-equivalent across
the per-bottle internal docker network). The cred-proxy talks HTTPS
outbound through pipelock to the real upstream. Tokens live in this
container's environ; they never land in routes.json on disk and never
reach the agent's container.
Stdlib-only: this file ships into a minimal Python image with no pip
install layer. The constants are duplicated from `cred_proxy.py` so
the server doesn't need to import the rest of the package.
"""
from __future__ import annotations
import http.client
import http.server
import json
import os
import signal
import socketserver
import sys
import typing
import urllib.parse
from dataclasses import dataclass
# --- Config / route table ---------------------------------------------------
@dataclass(frozen=True)
class Route:
"""One row of the proxy's route table.
`path` is the agent-facing prefix (e.g. `/anthropic/`); the
incoming request's path starts with this. `upstream_scheme` /
`upstream_host` / `upstream_base_path` are the parsed pieces of
the upstream URL — the request's path after the prefix is
appended to `upstream_base_path`. `auth_scheme` is the literal
word in the injected header (`Bearer` or `token`). `token_env`
is the env-var name this container reads to get the token."""
path: str
upstream_scheme: str
upstream_host: str
upstream_port: int
upstream_base_path: str
auth_scheme: str
token_env: str
def parse_routes(payload: dict[str, object]) -> tuple[Route, ...]:
"""Parse the routes.json payload into a tuple of `Route`s. Sorted
by descending path length so longest-prefix match is the first
hit in iteration order."""
raw = payload.get("routes")
if not isinstance(raw, list):
raise ValueError("routes.json: 'routes' must be a list")
out: list[Route] = []
for r in raw:
if not isinstance(r, dict):
raise ValueError(f"routes.json: route must be an object (got {type(r).__name__})")
path = r["path"]
upstream = r["upstream"]
auth_scheme = r["auth_scheme"]
token_env = r["token_env"]
if not isinstance(path, str) or not path.startswith("/") or not path.endswith("/"):
raise ValueError(f"routes.json: path {path!r} must start and end with /")
if not isinstance(upstream, str):
raise ValueError("routes.json: upstream must be a string")
if not isinstance(auth_scheme, str):
raise ValueError("routes.json: auth_scheme must be a string")
if not isinstance(token_env, str) or not token_env:
raise ValueError("routes.json: token_env must be a non-empty string")
parsed = urllib.parse.urlsplit(upstream)
if parsed.scheme not in ("http", "https"):
raise ValueError(f"routes.json: upstream scheme must be http or https (got {parsed.scheme!r})")
if not parsed.hostname:
raise ValueError(f"routes.json: upstream {upstream!r} missing host")
port = parsed.port or (443 if parsed.scheme == "https" else 80)
base_path = parsed.path or ""
out.append(Route(
path=path,
upstream_scheme=parsed.scheme,
upstream_host=parsed.hostname,
upstream_port=port,
upstream_base_path=base_path,
auth_scheme=auth_scheme,
token_env=token_env,
))
out.sort(key=lambda r: len(r.path), reverse=True)
return tuple(out)
def select_route(routes: typing.Sequence[Route], request_path: str) -> Route | None:
"""Return the longest-prefix matching route, or None. Caller is
responsible for stripping any query string before passing
`request_path`."""
for r in routes:
if request_path.startswith(r.path):
return r
return None
def is_git_push_request(path: str, query: str) -> bool:
"""Return True if the request is a git smart-HTTP push.
git push over HTTPS hits two endpoints:
GET <repo>/info/refs?service=git-receive-pack (capabilities)
POST <repo>/git-receive-pack (the push)
Fetches use `service=git-upload-pack` / `/git-upload-pack` and are
not blocked. cred-proxy refuses push because git-gate's pre-receive
gitleaks scan is the gate for outbound git data; routing push
through cred-proxy would bypass that. Use the bottle.git SSH path
if you need to push.
"""
if path.endswith("/git-receive-pack"):
return True
if path.endswith("/info/refs"):
# Query string is parsed leniently — `service=git-receive-pack`
# may appear with other params in any order.
for pair in query.split("&"):
k, _, v = pair.partition("=")
if k == "service" and v == "git-receive-pack":
return True
return False
# --- Header handling --------------------------------------------------------
# Hop-by-hop headers (RFC 7230 §6.1). Stripped before forwarding.
# Plus `host` (we set it for the upstream) and any `authorization` /
# `proxy-authorization` (the proxy injects its own, never forwards
# the agent's).
_HOP_BY_HOP = frozenset({
"connection",
"keep-alive",
"proxy-authenticate",
"proxy-authorization",
"te",
"trailers",
"transfer-encoding",
"upgrade",
})
# Strip the agent's Accept-Encoding on the upstream leg and force
# `identity` instead. The response then flows back uncompressed,
# which lets pipelock's response scanner read the body — pipelock
# 2.3.0 has no decompression path and otherwise blocks with
# "compressed sse_stream response cannot be scanned". The cost is
# bandwidth from upstream; for LLM SSE streams this is negligible
# and the DLP coverage on the agent leg is the win.
_STRIPPED = _HOP_BY_HOP | frozenset({
"host", "authorization", "content-length", "accept-encoding",
})
def build_forward_headers(
incoming: typing.Iterable[tuple[str, str]],
*,
auth_scheme: str,
token: str,
upstream_host: str,
) -> list[tuple[str, str]]:
"""Build the header list to send upstream.
- Strip hop-by-hop headers, the inbound Authorization (the agent
cannot smuggle a stolen token), and Host (we set it ourselves).
- Strip Content-Length too: http.client recomputes it when we
pass `body` to `request()`.
- Honor the `Connection: close, x, y, z` form by also stripping
every listed header name.
- Inject `Authorization: <auth_scheme> <token>` and a Host header
pointing at the upstream.
- Force `Accept-Encoding: identity` so the upstream returns
uncompressed bytes — pipelock's response scanner can't read
gzip/br/deflate and would otherwise 403 the response.
"""
incoming_list = list(incoming)
# Headers listed in `Connection:` are also hop-by-hop for this hop.
extra_hop: set[str] = set()
for name, value in incoming_list:
if name.lower() == "connection":
for token_name in value.split(","):
extra_hop.add(token_name.strip().lower())
forwarded: list[tuple[str, str]] = []
for name, value in incoming_list:
lname = name.lower()
if lname in _STRIPPED or lname in extra_hop:
continue
forwarded.append((name, value))
forwarded.append(("Host", upstream_host))
forwarded.append(("Authorization", f"{auth_scheme} {token}"))
forwarded.append(("Accept-Encoding", "identity"))
return forwarded
def filter_response_headers(
incoming: typing.Iterable[tuple[str, str]],
) -> list[tuple[str, str]]:
"""Build the response header list to send back to the agent.
Strip hop-by-hop + `transfer-encoding` (we let the client's
HTTP/1.1 default chunking handle streamed bodies)."""
incoming_list = list(incoming)
extra_hop: set[str] = set()
for name, value in incoming_list:
if name.lower() == "connection":
for token_name in value.split(","):
extra_hop.add(token_name.strip().lower())
out: list[tuple[str, str]] = []
for name, value in incoming_list:
lname = name.lower()
if lname in _HOP_BY_HOP or lname in extra_hop:
continue
out.append((name, value))
return out
# --- HTTP handler -----------------------------------------------------------
# How many bytes to read off the upstream response per chunk. Small
# enough that SSE keep-alive `:` lines (~1 byte) and per-event payloads
# (~hundreds of bytes) round-trip without waiting for a larger buffer
# to fill. Large enough to not dominate syscall overhead under load.
STREAM_CHUNK = 4096
class CredProxyHandler(http.server.BaseHTTPRequestHandler):
"""Per-request handler. The routes + tokens are read off the
server instance (set by `serve()`)."""
# Quieter logs: the default writes one line per request to stderr.
# Useful in debug but noisy in normal operation.
def log_message(self, format: str, *args: typing.Any) -> None:
if os.environ.get("CRED_PROXY_DEBUG"):
super().log_message(format, *args)
def do_GET(self) -> None: self._proxy()
def do_POST(self) -> None: self._proxy()
def do_PUT(self) -> None: self._proxy()
def do_DELETE(self) -> None: self._proxy()
def do_PATCH(self) -> None: self._proxy()
def do_HEAD(self) -> None: self._proxy()
def do_OPTIONS(self) -> None: self._proxy()
def _proxy(self) -> None:
server = typing.cast("CredProxyServer", self.server)
path, _, query = self.path.partition("?")
if is_git_push_request(path, query):
self.send_error(
403,
"cred-proxy: git push over HTTPS is not supported; "
"use the bottle.git SSH path (gitleaks-scanned by "
"git-gate's pre-receive hook)",
)
return
route = select_route(server.routes, path)
if route is None:
self.send_error(404, f"no route for {path!r}")
return
token = server.tokens.get(route.token_env)
if not token:
self.send_error(500, f"cred-proxy: env var {route.token_env} unset in sidecar")
return
suffix = path[len(route.path):]
upstream_path = route.upstream_base_path.rstrip("/") + "/" + suffix
if query:
upstream_path = f"{upstream_path}?{query}"
# Read the request body, if any. We do not stream the body up
# because http.client doesn't accept a streamable body for
# arbitrary methods cleanly. v1 buffers — claude's tool-use
# requests are small JSON payloads; SSE flows are in the
# response direction only.
body: bytes | None = None
length_header = self.headers.get("Content-Length")
if length_header is not None:
try:
length = int(length_header)
except ValueError:
self.send_error(400, "invalid Content-Length")
return
if length > 0:
body = self.rfile.read(length)
elif self.headers.get("Transfer-Encoding", "").lower() == "chunked":
self.send_error(411, "cred-proxy: chunked request bodies not supported in v1")
return
forward_headers = build_forward_headers(
self.headers.items(),
auth_scheme=route.auth_scheme,
token=token,
upstream_host=route.upstream_host,
)
if route.upstream_scheme == "https":
conn: http.client.HTTPConnection = http.client.HTTPSConnection(
route.upstream_host, route.upstream_port, timeout=300,
)
else:
conn = http.client.HTTPConnection(
route.upstream_host, route.upstream_port, timeout=300,
)
try:
conn.request(self.command, upstream_path, body=body,
headers=dict(forward_headers))
resp = conn.getresponse()
except (OSError, http.client.HTTPException) as e:
try:
conn.close()
except Exception:
pass
self.send_error(502, f"upstream connection failed: {e}")
return
try:
self._stream_response(resp)
finally:
try:
conn.close()
except Exception:
pass
def _stream_response(self, resp: http.client.HTTPResponse) -> None:
out_headers = filter_response_headers(resp.getheaders())
# We send Connection: close so the agent's client closes after
# each request; simplifies streaming bookkeeping and keeps
# the handler stateless per request.
self.send_response(resp.status, resp.reason)
for name, value in out_headers:
self.send_header(name, value)
self.send_header("Connection", "close")
self.end_headers()
try:
while True:
chunk = resp.read(STREAM_CHUNK)
if not chunk:
break
self.wfile.write(chunk)
self.wfile.flush()
except (BrokenPipeError, ConnectionResetError):
# Agent disconnected mid-stream; that's fine.
return
class CredProxyServer(socketserver.ThreadingMixIn, http.server.HTTPServer):
"""Threaded HTTP server. `routes` + `tokens` are populated by
`serve()` before `serve_forever()`."""
allow_reuse_address = True
daemon_threads = True
routes: tuple[Route, ...] = ()
tokens: dict[str, str] = {}
# --- Entry point ------------------------------------------------------------
DEFAULT_ROUTES_PATH = "/run/cred-proxy/routes.json"
DEFAULT_PORT = 9099
def load_routes(path: str) -> tuple[Route, ...]:
with open(path, "r", encoding="utf-8") as f:
payload = json.load(f)
if not isinstance(payload, dict):
raise ValueError(f"{path}: top-level must be an object")
return parse_routes(payload)
def load_tokens(routes: tuple[Route, ...], environ: typing.Mapping[str, str]) -> dict[str, str]:
"""Read each route's `token_env` from the supplied environ. Missing
entries default to empty string; the handler returns 500 for
unset tokens at request time so the operator can spot the
misconfig in the cred-proxy's logs without the proxy refusing to
boot."""
out: dict[str, str] = {}
for r in routes:
out[r.token_env] = environ.get(r.token_env, "")
return out
def reload_routes(
server: "CredProxyServer",
routes_path: str,
*,
environ: typing.Mapping[str, str] | None = None,
) -> tuple[bool, str]:
"""Re-read routes.json + tokens and swap them onto `server`. Used
by the SIGHUP handler (PRD 0014) so the operator can update the
routes file in-place and have cred-proxy pick up the change
without dropping in-flight connections.
Returns (ok, message). On failure the server's existing routes
stay in place — better to keep serving the old config than to
leave the proxy with no routes after a typo.
Atomic swap: Python attribute reassignment is atomic, and the
request handler reads `server.routes`/`server.tokens` once at
the top of `_proxy()` so an in-flight request keeps the version
it captured. New requests see the new routes."""
env = environ if environ is not None else os.environ
try:
new_routes = load_routes(routes_path)
new_tokens = load_tokens(new_routes, env)
except (OSError, ValueError, json.JSONDecodeError) as e:
return False, f"reload failed: {e}"
server.routes = new_routes
server.tokens = new_tokens
return True, (
f"reloaded {len(new_routes)} route(s): "
f"{', '.join(r.path for r in new_routes)}"
)
def install_sighup_handler(server: "CredProxyServer", routes_path: str) -> None:
"""Wire SIGHUP to reload_routes. No-op on platforms without
SIGHUP (Windows). The handler swallows exceptions so a bad
reload doesn't crash the long-lived sidecar."""
if not hasattr(signal, "SIGHUP"):
return
def handler(signum: int, frame: object) -> None:
del signum, frame
ok, message = reload_routes(server, routes_path)
prefix = "cred-proxy: SIGHUP " + ("ok: " if ok else "failed: ")
sys.stderr.write(prefix + message + "\n")
sys.stderr.flush()
signal.signal(signal.SIGHUP, handler)
def serve(
*,
routes_path: str = DEFAULT_ROUTES_PATH,
port: int = DEFAULT_PORT,
bind: str = "0.0.0.0",
environ: typing.Mapping[str, str] | None = None,
) -> typing.NoReturn:
"""Bring up the server and run until killed. Exits non-zero on
config error so the container's restart policy can surface the
failure rather than silently retrying."""
env = environ if environ is not None else os.environ
routes = load_routes(routes_path)
tokens = load_tokens(routes, env)
server = CredProxyServer((bind, port), CredProxyHandler)
server.routes = routes
server.tokens = tokens
install_sighup_handler(server, routes_path)
sys.stderr.write(
f"cred-proxy listening on {bind}:{port}; "
f"{len(routes)} route(s): "
f"{', '.join(r.path for r in routes)}\n"
)
sys.stderr.flush()
try:
server.serve_forever()
except KeyboardInterrupt:
pass
finally:
server.server_close()
sys.exit(0)
def main(argv: list[str]) -> int:
"""Tiny argv shim: no flags in v1, all config via env vars.
`CRED_PROXY_ROUTES` overrides the routes path (default
`/run/cred-proxy/routes.json`). `CRED_PROXY_PORT` overrides the
listen port. Both have defaults so the container needs no extra
config to come up."""
routes_path = os.environ.get("CRED_PROXY_ROUTES", DEFAULT_ROUTES_PATH)
port = int(os.environ.get("CRED_PROXY_PORT", str(DEFAULT_PORT)))
bind = os.environ.get("CRED_PROXY_BIND", "0.0.0.0")
serve(routes_path=routes_path, port=port, bind=bind)
return 0 # serve() does not return.
if __name__ == "__main__":
raise SystemExit(main(sys.argv))
+24 -4
View File
@@ -16,9 +16,10 @@ dataclass (`EgressProxyPlan`), and the resolved per-route shape
specific and lives on concrete subclasses (see
`claude_bottle/backend/docker/egress_proxy.py`).
Chunk 1 of the PRD: this module + the mitmproxy addon + the Docker
lifecycle land alongside the existing cred-proxy code. Chunk 2 wires
the agent's `HTTP_PROXY` over to egress-proxy and removes cred-proxy.
Chunks 1+2 of the PRD: this module + the mitmproxy addon + the Docker
lifecycle are wired into the agent's `HTTP_PROXY` path; cred-proxy
has been removed. Chunk 3 retargets the cred-proxy-block remediation
flow (PRD 0014) at egress-proxy and renames the MCP tool.
"""
from __future__ import annotations
@@ -76,7 +77,7 @@ class EgressProxyPlan:
The slug + routes_path + routes + token_env_map fields are
filled at prepare time (host-side, side-effect-free on docker).
The network + pipelock fields are populated by the backend's
The network + CA + pipelock fields are populated by the backend's
launch step via `dataclasses.replace` once those resources
exist. Empty defaults are sentinels meaning "not yet set";
`.start` validates that they are populated.
@@ -87,6 +88,22 @@ class EgressProxyPlan:
under `token_env`. The plan itself never holds token values
secrets never land in a dataclass that might be logged.
`mitmproxy_ca_host_path` is the host path of the per-bottle
egress-proxy CA (single PEM with cert+key concatenated) minted
by `egress_proxy_tls_init`. `.start` docker-cps it into the
sidecar at `~/.mitmproxy/mitmproxy-ca.pem` mitmproxy reads
that file at boot to mint per-host leaf certs.
`mitmproxy_ca_cert_only_host_path` is the cert-only PEM (no
key) for installing into the agent's trust store via
`provision_ca`. Separate file rather than re-parsing the
concat so secrets and trust artefacts stay on distinct paths.
`pipelock_ca_host_path` is the host path of the pipelock CA
(cert only). `.start` docker-cps it into the sidecar so the
proxy's outbound HTTPS client trusts pipelock's MITM on the
egress-proxy upstream leg.
`pipelock_proxy_url` is the URL egress-proxy sets as `HTTPS_PROXY`
in its environ so outbound HTTPS traverses pipelock keeping
pipelock's hostname allowlist + DLP body scanner on the
@@ -99,6 +116,9 @@ class EgressProxyPlan:
token_env_map: dict[str, str]
internal_network: str = ""
egress_network: str = ""
mitmproxy_ca_host_path: Path = Path()
mitmproxy_ca_cert_only_host_path: Path = Path()
pipelock_ca_host_path: Path = Path()
pipelock_proxy_url: str = ""
+24 -218
View File
@@ -14,7 +14,6 @@ the system prompt, for bottles the body is human documentation
Bottle schema (frontmatter):
env: { <NAME>: <env-entry>, ... }
git: [ <git-entry>, ... ]
cred_proxy: { routes: [ <route>, ... ] } # superseded by egress_proxy (PRD 0017)
egress_proxy: { routes: [ <egress-route>, ... ] }
egress: { allowlist: [ <hostname>, ... ] }
@@ -125,154 +124,6 @@ class GitEntry:
)
CRED_PROXY_AUTH_SCHEMES = ("Bearer", "token")
# Provisioner role tags a route may carry. Each tag drives one
# agent-side rewrite when the cred-proxy sidecar comes up.
# anthropic-base-url: set ANTHROPIC_BASE_URL=<proxy><path>
# npm-registry: write ~/.npmrc registry= <proxy><path>
# git-insteadof: write ~/.gitconfig [url "<proxy><path>"]
# insteadOf = <route.upstream>/
# tea-login: add an entry to ~/.config/tea/config.yml
# (login url = <proxy><path>)
# Routes without a `role` are pure proxy entries with no agent-side
# rewrite — useful for upstreams whose tools the user wires up by
# hand.
CRED_PROXY_ROLES = frozenset({
"anthropic-base-url",
"npm-registry",
"git-insteadof",
"tea-login",
})
# Roles whose semantics imply a single route can carry them. A second
# route claiming the same role would make the provisioner's choice
# ambiguous (which path goes into ANTHROPIC_BASE_URL?).
CRED_PROXY_SINGLETON_ROLES = frozenset({
"anthropic-base-url",
"npm-registry",
})
@dataclass(frozen=True)
class CredProxyRoute:
"""One route on the per-bottle cred-proxy sidecar (PRD 0010).
The agent dials `http://cred-proxy:<port><Path>...`; the sidecar
strips any inbound `Authorization` header, injects
`<AuthScheme> <token>` using the value of the host env var named
by `TokenRef`, and forwards the rest of the request to `Upstream`.
`Path` is the agent-facing prefix (must start and end with `/`).
`Upstream` is the upstream base URL (https only) the request
path after `Path` is appended to it. `AuthScheme` is the literal
word that precedes the token in the injected header (`Bearer` for
most upstreams, `token` for Gitea sidesteps go-gitea/gitea#16734).
`TokenRef` names the host env var holding the credential value;
the CLI reads it at launch and forwards into the sidecar's environ.
`Role` carries optional provisioner tags (see CRED_PROXY_ROLES).
`UpstreamHost` is parsed from `Upstream` for the pipelock allowlist
+ the git-insteadof suppression check."""
Path: str
Upstream: str
AuthScheme: str
TokenRef: str
Role: tuple[str, ...] = ()
UpstreamHost: str = ""
@classmethod
def from_dict(cls, bottle_name: str, idx: int, raw: object) -> "CredProxyRoute":
label = f"bottle '{bottle_name}' cred_proxy.routes[{idx}]"
d = _as_json_object(raw, label)
path = d.get("path")
if not isinstance(path, str) or not path:
die(f"{label} missing required string field 'path'")
if not (path.startswith("/") and path.endswith("/")):
die(f"{label} path {path!r} must start and end with '/'")
upstream = d.get("upstream")
if not isinstance(upstream, str) or not upstream:
die(f"{label} missing required string field 'upstream'")
host = _parse_https_host(upstream, f"{label} upstream")
auth_scheme = d.get("auth_scheme")
if not isinstance(auth_scheme, str) or not auth_scheme:
die(f"{label} missing required string field 'auth_scheme'")
if auth_scheme not in CRED_PROXY_AUTH_SCHEMES:
die(
f"{label} auth_scheme {auth_scheme!r} is not one of "
f"{', '.join(CRED_PROXY_AUTH_SCHEMES)}"
)
token_ref = d.get("token_ref")
if not isinstance(token_ref, str) or not token_ref:
die(
f"{label} missing required string field 'token_ref' "
f"(name of the host env var holding the token value)"
)
role_raw = d.get("role")
roles: tuple[str, ...] = ()
if role_raw is None:
roles = ()
elif isinstance(role_raw, str):
roles = (role_raw,)
elif isinstance(role_raw, list):
role_list = cast(list[object], role_raw)
collected: list[str] = []
for r in role_list:
if not isinstance(r, str):
die(f"{label} role items must be strings (got {type(r).__name__})")
collected.append(r)
roles = tuple(collected)
else:
die(
f"{label} role must be a string or a list of strings "
f"(was {type(role_raw).__name__})"
)
for r in roles:
if r not in CRED_PROXY_ROLES:
die(
f"{label} role {r!r} is not one of "
f"{', '.join(sorted(CRED_PROXY_ROLES))}"
)
return cls(
Path=path,
Upstream=upstream,
AuthScheme=auth_scheme,
TokenRef=token_ref,
Role=roles,
UpstreamHost=host,
)
@dataclass(frozen=True)
class CredProxyConfig:
"""Per-bottle cred-proxy configuration. Today this is just the
route table; the nesting under `cred_proxy:` leaves room for
per-bottle proxy settings (port override, log level, etc.) in
follow-ups."""
routes: tuple[CredProxyRoute, ...] = ()
@classmethod
def from_dict(cls, bottle_name: str, raw: object) -> "CredProxyConfig":
d = _as_json_object(raw, f"bottle '{bottle_name}' cred_proxy")
routes_raw = d.get("routes")
routes: tuple[CredProxyRoute, ...] = ()
if routes_raw is not None:
if not isinstance(routes_raw, list):
die(
f"bottle '{bottle_name}' cred_proxy.routes must be an array "
f"(was {type(routes_raw).__name__})"
)
routes_list = cast(list[object], routes_raw)
routes = tuple(
CredProxyRoute.from_dict(bottle_name, i, entry)
for i, entry in enumerate(routes_list)
)
_validate_cred_proxy_routes(bottle_name, routes)
return cls(routes=routes)
# Auth schemes for the egress-proxy route's optional `auth` block.
# Same values cred-proxy accepts today; `token` sidesteps the Gitea
# token-not-Bearer quirk (go-gitea/gitea#16734).
@@ -480,15 +331,15 @@ class BottleEgress:
class Bottle:
env: Mapping[str, str] = field(default_factory=_empty_str_dict)
git: tuple[GitEntry, ...] = ()
cred_proxy: CredProxyConfig = field(default_factory=CredProxyConfig)
egress_proxy: EgressProxyConfig = field(default_factory=EgressProxyConfig)
egress: BottleEgress = field(default_factory=BottleEgress)
# Opt-in per-bottle stuck-recovery sidecar (PRD 0013). When true,
# the launch step brings up a supervise sidecar that exposes three
# MCP tools to the agent (cred-proxy-block, pipelock-block,
# capability-block) plus mounts the current-config dir read-only
# into the agent at /etc/claude-bottle/current-config. False (the
# default) skips the sidecar and the mount.
# capability-block; the cred-proxy-block tool is renamed and
# retargeted at egress-proxy in PRD 0017 chunk 3) plus mounts the
# current-config dir read-only into the agent at /etc/claude-bottle/
# current-config. False (the default) skips the sidecar and mount.
supervise: bool = False
@classmethod
@@ -539,16 +390,25 @@ class Bottle:
if "tokens" in d:
die(
f"bottle '{name}' has a 'tokens' field. The shape was reworked: "
f"each route now lives under 'cred_proxy.routes' with explicit "
f"path / upstream / auth_scheme / token_ref / role[]. See "
f"docs/prds/0010-cred-proxy.md."
f"each route now lives under 'egress_proxy.routes' with explicit "
f"host / path_allowlist / auth. See docs/prds/0017-egress-proxy-via-mitmproxy.md."
)
cred_proxy = (
CredProxyConfig.from_dict(name, d["cred_proxy"])
if "cred_proxy" in d
else CredProxyConfig()
)
if "cred_proxy" in d:
die(
f"bottle '{name}' has a 'cred_proxy' field, which has been removed "
f"(PRD 0017). Rename to 'egress_proxy' and migrate each route:\n"
f" - 'path' + 'upstream' (cred-proxy URL prefix + upstream URL)\n"
f"'host' (just the upstream hostname)\n"
f" - 'auth_scheme' + 'token_ref' (flat)\n"
f"'auth: {{ scheme, token_ref }}' (nested, optional)\n"
f" - 'role' (provisioner dotfile rewrites): drop — egress-proxy "
f"is on the agent's HTTP_PROXY path, so dotfile rewrites are no "
f"longer needed.\n"
f" - 'path_allowlist' (new): optional URL prefix gate for the "
f"host.\n"
f"See docs/prds/0017-egress-proxy-via-mitmproxy.md."
)
egress_proxy = (
EgressProxyConfig.from_dict(name, d["egress_proxy"])
@@ -571,8 +431,8 @@ class Bottle:
)
return cls(
env=env, git=git, cred_proxy=cred_proxy, egress_proxy=egress_proxy,
egress=egress, supervise=supervise_raw,
env=env, git=git, egress_proxy=egress_proxy, egress=egress,
supervise=supervise_raw,
)
@@ -846,60 +706,6 @@ def _parse_git_upstream(url: str, label: str) -> tuple[str, str, str, str]:
return (user, host, port, path)
def _parse_https_host(url: str, label: str) -> str:
"""Extract the host from an `https://host[:port][/path]` URL.
Dies if `url` is not an https:// URL or the host segment is empty.
Used to derive `CredProxyRoute.UpstreamHost` from a route's
`upstream` so pipelock's allowlist (and the provisioner's git-gate
overlap check) can match on host alone."""
if not url.startswith("https://"):
die(f"{label} must be an https:// URL (was {url!r})")
rest = url[len("https://"):]
hostport, _, _ = rest.partition("/")
host, _, _port = hostport.partition(":")
if not host:
die(f"{label} host is empty in {url!r}")
return host
def _validate_cred_proxy_routes(
bottle_name: str,
routes: tuple[CredProxyRoute, ...],
) -> None:
"""Cross-validation for `bottle.cred_proxy.routes`:
- Paths must be unique within the bottle (the proxy routes by
longest-prefix match; duplicate paths leave the choice
undefined).
- Singleton roles (`anthropic-base-url`, `npm-registry`) may
appear on at most one route the provisioner uses them to
write a single dotfile entry, so two routes claiming the role
would make the choice ambiguous.
No cross-validation against `bottle.git` is performed. git-gate
(SSH push/fetch) and cred-proxy (HTTPS REST + git smart-HTTP
fetch) broker different protocols; declaring both on the same
host is a legitimate dev setup.
"""
seen_paths: dict[str, None] = {}
for r in routes:
if r.Path in seen_paths:
die(
f"bottle '{bottle_name}' cred_proxy.routes has duplicate path "
f"{r.Path!r}; each path must be unique on the proxy."
)
seen_paths[r.Path] = None
for role in CRED_PROXY_SINGLETON_ROLES:
with_role = [r for r in routes if role in r.Role]
if len(with_role) > 1:
paths = ", ".join(r.Path for r in with_role)
die(
f"bottle '{bottle_name}' cred_proxy.routes has {len(with_role)} "
f"routes with role {role!r} (paths: {paths}); this role drives a "
f"single agent-side rewrite — pick one."
)
def _validate_egress_proxy_routes(
bottle_name: str,
routes: tuple[EgressProxyRoute, ...],
@@ -950,7 +756,7 @@ _FILENAME_RX = re.compile(r"^[a-z][a-z0-9-]*$")
# sets dies with a "did you mean" pointer — typos shouldn't silently
# ghost into an empty config.
_BOTTLE_KEYS = frozenset(
{"env", "git", "cred_proxy", "egress_proxy", "egress", "supervise"}
{"env", "git", "egress_proxy", "egress", "supervise"}
)
_AGENT_KEYS_REQUIRED = frozenset({"bottle"})
_AGENT_KEYS_OPTIONAL = frozenset({"skills"})
+58 -60
View File
@@ -3,9 +3,14 @@
Pipelock (https://github.com/luckyPipewrench/pipelock) is an HTTP
forward proxy with hostname allowlisting + DLP scanning + URL-entropy
checks. One sidecar per agent, attached to the agent's --internal
network and a per-agent user-defined egress bridge. Combined with
HTTPS_PROXY/HTTP_PROXY pointing at the sidecar's service name, pipelock
is the only egress route the agent has.
network and a per-agent user-defined egress bridge.
Post-PRD-0017 topology: the agent's HTTP_PROXY points at egress-proxy
(not pipelock); egress-proxy sets `HTTPS_PROXY=pipelock` on its
outbound leg. So pipelock no longer sees the agent's connections
directly it sees the egress-proxy upstream leg, applies the
hostname allowlist + DLP body scan there, and forwards to the real
upstream.
Image pin: ghcr.io/luckypipewrench/pipelock@sha256:<digest> for tag 2.3.0.
"""
@@ -17,7 +22,7 @@ from dataclasses import dataclass
from pathlib import Path
from typing import cast
from .cred_proxy import CRED_PROXY_HOSTNAME
from .egress_proxy import EGRESS_PROXY_HOSTNAME
from .supervise import SUPERVISE_HOSTNAME
from .manifest import Bottle
@@ -57,48 +62,45 @@ def pipelock_bottle_allowlist(bottle: Bottle) -> list[str]:
return list(bottle.egress.allowlist)
def pipelock_token_hosts(bottle: Bottle) -> list[str]:
"""Hostnames the cred-proxy sidecar (PRD 0010) talks to upstream
on the agent's behalf. Derived from each route's
`upstream.UpstreamHost` in `bottle.cred_proxy.routes`. Returned
sorted+deduped.
def pipelock_route_hosts(bottle: Bottle) -> list[str]:
"""Hostnames declared in `bottle.egress_proxy.routes`. Returned
sorted + deduped.
These hosts must be on pipelock's allowlist so cred-proxy's
outbound HTTPS traffic can leave the egress network. They are
NOT auto-added to passthrough_domains: cred-proxy's HTTPS client
trusts pipelock's per-bottle CA at runtime (installed via
docker cp + update-ca-certificates in the cred-proxy image),
so pipelock MITMs and body-scans the cred-proxy upstream leg
the same way it does direct agent traffic."""
hosts = {r.UpstreamHost for r in bottle.cred_proxy.routes if r.UpstreamHost}
Post-cutover topology (PRD 0017): the agent's HTTPS_PROXY points
at egress-proxy, not pipelock; egress-proxy's outbound leg sets
`HTTPS_PROXY=pipelock`. So pipelock no longer terminates the
agent's connections — it sees the egress-proxy → upstream leg
only. Each declared route's host still needs to be on pipelock's
allowlist so that leg can leave the egress network."""
hosts = {r.Host for r in bottle.egress_proxy.routes if r.Host}
return sorted(hosts)
def pipelock_effective_allowlist(bottle: Bottle) -> list[str]:
"""Deduplicated union of: baked-in defaults, bottle.egress.allowlist,
the cred-proxy upstream hosts derived from bottle.cred_proxy.routes,
the cred-proxy sidecar's own hostname when any cred_proxy route is
declared, and the supervise sidecar's hostname when bottle.supervise
is enabled. Sorted for stability. Git upstreams declared in
`bottle.git` do NOT contribute here git traffic flows through the
per-agent git-gate sidecar (PRD 0008), not pipelock.
the egress-proxy route hosts (from bottle.egress_proxy.routes), the
egress-proxy sidecar's own hostname when any route is declared, and
the supervise sidecar's hostname when bottle.supervise is enabled.
Sorted for stability. Git upstreams declared in `bottle.git` do NOT
contribute here git traffic flows through the per-agent git-gate
sidecar (PRD 0008), not pipelock.
The cred-proxy + supervise hostnames are auto-added because the
agent's HTTP_PROXY points at pipelock, so a manifest-driven URL
like `http://cred-proxy:9099/anthropic/...` or
`http://supervise:9100/` arrives at pipelock as a request for the
sidecar hostname. Without this auto-allow, pipelock would 403 the
request before it reached the sidecar."""
The egress-proxy + supervise hostnames are auto-added because the
sidecars sit on the bottle's internal network alongside the agent;
requests that pass through pipelock for `egress-proxy:9099` or
`supervise:9100` (e.g. when egress-proxy uses HTTPS_PROXY=pipelock
on its upstream leg) would otherwise be 403'd by pipelock's
hostname gate."""
seen: dict[str, None] = {}
for h in DEFAULT_ALLOWLIST:
seen.setdefault(h, None)
for h in pipelock_bottle_allowlist(bottle):
if h:
seen.setdefault(h, None)
for h in pipelock_token_hosts(bottle):
for h in pipelock_route_hosts(bottle):
seen.setdefault(h, None)
if bottle.cred_proxy.routes:
seen.setdefault(CRED_PROXY_HOSTNAME, None)
if bottle.egress_proxy.routes:
seen.setdefault(EGRESS_PROXY_HOSTNAME, None)
if bottle.supervise:
seen.setdefault(SUPERVISE_HOSTNAME, None)
return sorted(seen.keys())
@@ -122,16 +124,16 @@ def pipelock_seed_phrase_detection_enabled(bottle: Bottle) -> bool:
Empirically only `seed_phrase_detection.enabled: false`
actually stops the block (verified by sending a 12-word BIP-39
body through three pipelock instances). It is a global toggle
there is no per-path / per-host knob in pipelock 2.3.0 so
we turn the detector off for the entire bottle when an
`anthropic-base-url` route is declared. The trade-off is
body through three pipelock instances). It is a global toggle
no per-path / per-host knob in pipelock 2.3.0 so we turn the
detector off for the entire bottle when the bottle declares an
egress-proxy route to `api.anthropic.com`. The trade-off is
accepted: BIP-39 detection has little value in claude-bottle's
threat model (the agent has no access to a user's crypto
wallet seeds; the patterns that matter gh*_, sk-ant-, AKIA,
etc. keep firing)."""
threat model (the agent has no access to a user's crypto wallet
seeds; the patterns that matter gh*_, sk-ant-, AKIA, etc.
keep firing)."""
return not any(
"anthropic-base-url" in r.Role for r in bottle.cred_proxy.routes
r.Host == "api.anthropic.com" for r in bottle.egress_proxy.routes
)
@@ -143,16 +145,12 @@ def pipelock_effective_tls_passthrough(bottle: Bottle) -> list[str]:
other allowlisted host is MITM'd by pipelock's per-bottle CA so
its body scanner sees the cleartext.
cred-proxy upstream hosts (github, gitea, npm) are deliberately
NOT auto-added here. cred-proxy's HTTPS client trusts pipelock's
CA at runtime (folded into its trust store via docker cp +
update-ca-certificates), so pipelock can MITM the cred-proxy
upstream leg and body-scan it the same way it body-scans the
agent's direct HTTPS traffic. Without this, an agent that pushed
a secret via cred-proxy's /gh-git/ path would have no body
scanner in front of it. The PRD's earlier reasoning that
cred-proxy hosts needed passthrough was a workaround for the
cert-trust gap that no longer exists.
egress-proxy route hosts (github, gitea, npm) are deliberately
NOT auto-added here. egress-proxy's HTTPS client trusts pipelock's
CA at runtime (folded into its trust store via docker cp), so
pipelock MITMs and body-scans the egress-proxy upstream leg the
same way it body-scanned the agent's direct HTTPS traffic before
the PRD 0017 cutover.
`bottle` is kept on the signature for forward-compat (a future
knob might let a manifest opt a host into passthrough); today
@@ -207,13 +205,13 @@ def pipelock_build_config(
`ssrf_ip_allowlist` is the list of IPs / CIDRs that bypass
pipelock's SSRF guard. Pipelock blocks RFC1918-resolved
destinations by default, which would catch the agent's
cred-proxy traffic (cred-proxy sits on the bottle's internal
Docker network in 172.x space). Pass the bottle's internal
network CIDR here so `cred-proxy:9099` requests get through
pipelock while api_allowlist + body-scanning still apply. Empty
by default; omitted from the rendered yaml when empty so
pipelock keeps its built-in SSRF defaults."""
destinations by default, which would catch sibling-sidecar
traffic on the bottle's internal Docker network in 172.x space
(e.g. egress-proxy pipelock on the upstream leg). Pass the
bottle's internal network CIDR here so internal-network requests
pass through pipelock while api_allowlist + body-scanning still
apply. Empty by default; omitted from the rendered yaml when
empty so pipelock keeps its built-in SSRF defaults."""
cfg: dict[str, object] = {
"version": 1,
"mode": "strict",
@@ -322,9 +320,9 @@ class PipelockProxyPlan:
that they are populated.
`internal_network_cidr` ends up on pipelock's `ssrf.ip_allowlist`
so the agent's requests at `cred-proxy:9099` (or any other
bottle-internal sidecar) bypass pipelock's RFC1918 SSRF guard
while api_allowlist and body-scanning still apply."""
so traffic from sibling sidecars (egress-proxy pipelock on the
upstream leg, etc.) bypasses pipelock's RFC1918 SSRF guard while
api_allowlist and body-scanning still apply."""
yaml_path: Path
slug: str