From 141171997305239bc8efc1dbb12c88629e4e3b6e Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 00:18:55 -0400 Subject: [PATCH 01/24] docs: add PRD 0010 for credential proxy Per-bottle reverse proxy that holds API tokens (Anthropic OAuth, GitHub PAT, Gitea PAT, npm) in a root-owned process; agent gets only URLs in its environ. AWS / SigV4 explicitly out of scope. --- docs/prds/0010-cred-proxy.md | 420 +++++++++++++++++++++++++++++++++++ 1 file changed, 420 insertions(+) create mode 100644 docs/prds/0010-cred-proxy.md diff --git a/docs/prds/0010-cred-proxy.md b/docs/prds/0010-cred-proxy.md new file mode 100644 index 0000000..65f153f --- /dev/null +++ b/docs/prds/0010-cred-proxy.md @@ -0,0 +1,420 @@ +# PRD 0010: Credential proxy for agent-bound API tokens + +- **Status:** Draft +- **Author:** didericis +- **Created:** 2026-05-13 + +## Summary + +Per-bottle reverse proxy that holds API tokens (Anthropic OAuth, +GitHub PAT, Gitea PAT, npm token) in a root-owned process inside +the agent container. The agent (`node`, UID 1000) keeps only URLs +in its environ; the proxy injects the right `Authorization` header +and forwards over TLS. The boundary that makes this meaningful is +the kernel's `ptrace_may_access` check: `node` cannot read root's +`/proc//environ` and cannot `ptrace` attach without +`CAP_SYS_PTRACE` / `CAP_PERFMON`, which claude-bottle does not +grant. + +AWS / SigV4 is explicitly out of scope — it is per-request signing, +not header injection, and does not fit this proxy's shape. If a +bottle needs AWS credentials later, that lives in a separate PRD. + +## Problem + +Today `CLAUDE_CODE_OAUTH_TOKEN` (and any `bottle.env` secrets such +as a Gitea PAT, GitHub PAT, or npm token) gets `docker run -e`'d +straight into the agent's environ. 
Inside the bottle the agent +runs as `node` with `--dangerously-skip-permissions`; its Bash +tool can do `printenv`, `cat /proc/self/environ`, or +`node -e 'console.log(process.env)'` and capture every value into +the conversation. From there a prompt-injected or hijacked agent +can exfil over any allowed egress (api.anthropic.com itself if +nothing else). + +Linux has no per-env-var ACL — once a variable is in a process's +environ, the process and its descendants own it. The credible +boundary is process-level: hold the credential in a different +process the agent cannot read. Default Docker already enforces +that boundary at the kernel line via `ptrace_may_access`, the +same property the (removed) ssh-gate and the current git-gate +rely on. + +The research note +[`agent-credential-proxy-landscape.md`](../research/agent-credential-proxy-landscape.md) +surveys the existing tools and concludes that a small +claude-bottle-specific reverse proxy is less work and less risk +than either adopting nono (alpha, unaudited) or Infisical Agent +Vault (TLS-MITM topology that doubles up on pipelock's CA stack). +This PRD is the build. + +## Goals / Success Criteria + +Each test runs inside a bottle whose manifest declares the four +supported kinds (anthropic, github, gitea, npm): + +1. **No plaintext tokens in the agent's environ.** `printenv` and + `cat /proc/self/environ` from the agent's shell return only + URLs pointing at `127.0.0.1:/...`. None of the + `bottle.tokens[].TokenRef` values appear. +2. **Kernel boundary holds.** From the agent's shell, + `cat /proc//environ` returns `EACCES` and + `gdb -p ` / `strace -p ` fails + with `EPERM`. +3. **Anthropic API works.** `claude` makes a successful streaming + tool-use round-trip via `ANTHROPIC_BASE_URL` → + `127.0.0.1:/anthropic`. SSE chunks arrive without + buffering; `anthropic-version`, `anthropic-beta`, and + `X-Claude-Code-Session-Id` headers round-trip untouched. +4. 
**Git push to declared remotes works.** `git push` against a + `bottle.tokens[].Kind: github` or `gitea` upstream succeeds; + the upstream sees the gate's token, not the agent's. +5. **npm install works.** `npm install ` + succeeds against the registry pointed at the proxy. A scoped + install that requires the token (e.g. against a private + registry) also succeeds. +6. **Wrong token rejected at the source, not silently swapped.** + If the agent tries to send its own `Authorization: …` header, + the proxy strips and replaces with the configured one. A + manifest token revoked at the upstream produces a 401 to the + agent, not a 5xx. + +## Non-goals + +- **AWS / SigV4.** Per-request signing is a different shape; a + bearer-injecting proxy doesn't help. Hold for a future PRD + (likely an IMDS emulator sidecar handing out short-lived STS + credentials). +- **DB-backed credential store.** Flat env / mode-600 file only. + The LiteLLM CVE-2026-42208 incident is the cautionary tale: + any DB-backed credential gateway is itself a high-value attack + target. +- **Generic LLM-gateway features.** No cost tracking, no + fallbacks, no virtual keys, no multi-tenant routing, no usage + metering. The proxy is a credential-injection trust endpoint, + not a gateway. +- **Subsuming pipelock.** pipelock keeps its egress-allowlist + role. It drops the `api.anthropic.com` TLS-MITM job because + cred-proxy is now the trust endpoint for that host; everything + else pipelock does stays. +- **TLS interception inside the bottle.** The agent talks plain + HTTP to loopback; cred-proxy speaks real HTTPS outbound. No + container-local CA, no `golang/go#28866` loopback workaround. +- **Cross-bottle credential sharing.** One proxy per bottle, same + one-sidecar-per-agent posture as pipelock and git-gate. +- **`claude --bare` mode.** Reads only `ANTHROPIC_API_KEY`, not + the OAuth token. Not in claude-bottle's flow today. 
+- **MCP-server tokens, package-installer tokens for languages + beyond npm.** PyPI / Bun / cargo can land in a follow-up if + needed; the routing pattern generalizes. + +## Scope + +### In scope + +- **Manifest field.** `bottle.tokens: [TokenEntry, ...]`. Each + entry carries `Kind` (`anthropic` | `github` | `gitea` | + `npm`), an optional `Url` (required for `gitea`, defaulted for + the others), and `TokenRef` (the name of a host env var the + CLI resolves at launch time). +- **cred-proxy process.** Runs as root inside the agent + container, listens on `127.0.0.1:`. Holds the tokens in + its own environ — never on argv, never written to disk. + Per-`Kind` route handler: inject the right header, forward + over TLS, stream the response back to the client without + buffering. +- **Agent-side rewrites.** Provisioner writes: + - `ANTHROPIC_BASE_URL=http://127.0.0.1:/anthropic` to + the agent's environ + - `~/.npmrc` `registry = http://127.0.0.1:/npm/` + - `~/.gitconfig` `[url …] insteadOf = …` for each declared + `github` / `gitea` upstream + - `~/.config/tea/config.yml` with the proxy URL for each + declared `gitea` entry +- **Process lifecycle.** Container entrypoint launches the proxy + first as root, waits for it to bind, then `exec setpriv … + --reuid=node --regid=node …` for the claude child. Proxy + death is fatal (the container exits); this is also the + PID-1-zombie story. +- **pipelock interop.** Drop `api.anthropic.com` from pipelock's + TLS-MITM list; keep it on the allowlist as a plain HTTPS host + (cred-proxy is the trust endpoint now). Verify pipelock still + lets cred-proxy's HTTPS connections out for the four upstream + hosts. +- **Plan rendering.** `bottle_plan.py` and the y/N preflight + show: which tokens are configured (kind + ref name, not the + value), the proxy port, the routes the proxy will publish. 
+- **Drop the existing `CLAUDE_CODE_OAUTH_TOKEN` forward in + `prepare.py`.** Today it lands in the agent's environ; once + this PRD ships, it lands in the proxy's environ instead. +- **Tests.** Integration tests for each of the six success + criteria; unit tests for manifest parsing, route table + generation, header injection. + +### Out of scope + +- AWS / SigV4 (see Non-goals). +- Per-method / per-path allowlist *inside* a kind. Defer to a + follow-up once observed traffic stabilizes. +- Replacing `bottle.env` for non-token secrets. The proxy + handles the four kinds listed above; other env vars keep their + current path. +- Migrating an in-flight bottle from "token in agent env" to + "token via proxy" mid-session. Restart required. +- Audit logging. The proxy doesn't write request logs in v1. + Add only if a concrete debugging need surfaces. + +## Proposed Design + +### Architecture + +``` +┌── Host (macOS) ──────────────────────────────────────────────────┐ +│ Secrets at rest (keychain / .env): │ +│ CLAUDE_BOTTLE_OAUTH_TOKEN, GITHUB_TOKEN, │ +│ GITEA_SERVER_TOKEN, NPM_TOKEN │ +│ │ docker run -e KEY (no =VALUE on argv) │ +│ ▼ │ +│ ┌── Bottle container ────────────────────────────────────────┐ │ +│ │ │ │ +│ │ ┌── UID 1000 (node) ─────────────────────────────────┐ │ │ +│ │ │ claude --dangerously-skip-permissions │ │ │ +│ │ │ environ: URLs only, no plaintext tokens │ │ │ +│ │ │ ANTHROPIC_BASE_URL=http://127.0.0.1:PORT/anth.. │ │ │ +│ │ │ npm registry → http://127.0.0.1:PORT/npm/ │ │ │ +│ │ │ git remote.url → http://127.0.0.1:PORT/... 
│ │ │ +│ │ │ tea --url → http://127.0.0.1:PORT/gitea │ │ │ +│ │ └────────────┬───────────────────────────────────────┘ │ │ +│ │ │ plain HTTP, loopback │ │ +│ │ ▼ │ │ +│ │ ┌── UID 0 (root) ────────────────────────────────────┐ │ │ +│ │ │ cred-proxy listens 127.0.0.1:PORT │ │ │ +│ │ │ tokens live ONLY in this process's environ │ │ │ +│ │ │ per-route: inject auth header, forward over TLS │ │ │ +│ │ │ /anthropic → api.anthropic.com Bearer │ │ │ +│ │ │ /gh-api → api.github.com Bearer │ │ │ +│ │ │ /gh-git → github.com Bearer │ │ │ +│ │ │ /gitea → gitea.dideric.is token │ │ │ +│ │ │ /npm → registry.npmjs.org Bearer │ │ │ +│ │ │ SSE pass-through, no buffering │ │ │ +│ │ └────────────┬───────────────────────────────────────┘ │ │ +│ │ │ HTTPS │ │ +│ │ ▼ │ │ +│ │ ┌── pipelock (egress allowlist) ─────────────────────┐ │ │ +│ │ │ allow: api.anthropic.com, api.github.com, │ │ │ +│ │ │ github.com, gitea.dideric.is, │ │ │ +│ │ │ registry.npmjs.org │ │ │ +│ │ │ block: statsig, sentry, autoupdater, * │ │ │ +│ │ └────────────┬───────────────────────────────────────┘ │ │ +│ └────────────────┼──────────────────────────────────────────┘ │ +│ ▼ │ +└────────────────────┼─────────────────────────────────────────────┘ + ▼ + Upstream APIs + + +Why node@1000 can't just steal the tokens: + ┌─────────────────────────────────────────────────────────┐ + │ node tries: │ + │ cat /proc//environ → EACCES │ + │ ptrace(PTRACE_ATTACH, , ...) → EPERM│ + │ Kernel's ptrace_may_access rejects: UID mismatch │ + │ and no CAP_SYS_PTRACE / CAP_PERFMON in the container. │ + └─────────────────────────────────────────────────────────┘ +``` + +### New components + +- **`claude_bottle/cred_proxy.py`** (new): abstract `CredProxy` + + `CredProxyPlan` dataclass. `prepare` is host-side and + side-effect-free on Docker; renders the route table and + resolves `TokenRef`s against host env. Mirrors the existing + `GitGate` / `Pipelock` shape. 
+- **`claude_bottle/backend/docker/cred_proxy.py`** (new): + `DockerCredProxy` concrete subclass. Bakes the proxy binary + into the agent image; `start` writes the route table to a + mode-600 file under `stage_dir` and arranges the entrypoint + so the proxy boots first. +- **`claude_bottle/backend/docker/provision/cred_proxy.py`** + (new): renders `ANTHROPIC_BASE_URL`, `~/.npmrc`, + `~/.gitconfig` `insteadOf` blocks, and `~/.config/tea/config.yml` + into the agent's home for each declared kind. +- **The proxy binary itself.** Bundled into the agent image at + `/usr/local/libexec/cred-proxy`. See "External dependencies" + for the language choice. + +### Existing code touched + +- **`claude_bottle/manifest.py`** — add `TokenEntry`, + `Bottle.tokens: tuple[TokenEntry, ...] = ()`, parse + validate + (at most one entry per `Kind` except `gitea`, which may + carry multiple Urls). +- **`claude_bottle/backend/docker/prepare.py`** — delete the + `CLAUDE_BOTTLE_OAUTH_TOKEN` → `CLAUDE_CODE_OAUTH_TOKEN` branch + in the agent's forwarded env. The OAuth token now flows to + the proxy's environ via the cred-proxy lifecycle. +- **`claude_bottle/backend/docker/backend.py`** — instantiate + `DockerCredProxy`; thread its `prepare` / `start` / `stop` + through `resolve_plan` / `launch`. +- **`claude_bottle/backend/docker/launch.py`** — add cred-proxy + start before the cred-proxy provisioner runs (provisioner + writes URLs that reference the proxy port, so it must be up). +- **`claude_bottle/backend/docker/bottle_plan.py`** — new + `CredProxyPlan` field; preflight shows kind + ref name + + port + route table. +- **`claude_bottle/pipelock.py`** — drop the `api.anthropic.com` + TLS-MITM branch; the host stays on the allowlist as a plain + HTTPS destination. Confirm the four upstream hosts are + allowlisted by default when `bottle.tokens` declares them. +- **`README.md`** — replace the architecture diagram with the + one above; document the `bottle.tokens` field. 
+- **`claude-bottle.example.json`** — add a `tokens` array to + one bottle showing each Kind. +- **Tests** — new unit tests for manifest parsing, route table + generation, header injection; new integration tests for the + six success criteria. Delete the bits of `prepare.py` tests + that asserted on `CLAUDE_CODE_OAUTH_TOKEN` landing in the + agent's env. + +### Data model changes + +```python +@dataclass(frozen=True) +class TokenEntry: + Kind: Literal["anthropic", "github", "gitea", "npm"] + TokenRef: str # name of host env var + Url: str | None = None # required for gitea; defaulted otherwise + +@dataclass(frozen=True) +class Bottle: + ... + tokens: tuple[TokenEntry, ...] = () +``` + +Validation: + +- `Kind` must be one of the four supported values. +- `TokenRef` must resolve against `os.environ` at launch (fail + fast with a clear "host env var X is unset" if missing). +- `gitea` entries require `Url`; others fall back to the + documented upstream. +- At most one entry per `Kind` except `gitea`, which may have + multiple distinct `Url`s. +- No silent overlap with `bottle.git` upstreams that already + flow through git-gate; if a `tokens[].Kind: github|gitea` + entry's `Url` collides with a `git[].Upstream`'s host, parse + fails with a "git-gate already brokers this remote, drop one" + hint. (Both paths broker credentials; doubling up is a + configuration smell, not a feature.) 
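The validation rules above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PRD's implementation: `validate_tokens`, `ManifestError`, and the `git_upstream_hosts` parameter are hypothetical names, and comparing hosts via `urllib.parse.urlsplit` is only one guess at how a collision with a `git[].Upstream` might be detected.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Literal, Mapping
from urllib.parse import urlsplit

KINDS = ("anthropic", "github", "gitea", "npm")


class ManifestError(ValueError):
    """Hypothetical error type for manifest validation failures."""


@dataclass(frozen=True)
class TokenEntry:
    Kind: Literal["anthropic", "github", "gitea", "npm"]
    TokenRef: str            # name of host env var
    Url: str | None = None   # required for gitea; defaulted otherwise


def validate_tokens(
    tokens: Iterable[TokenEntry],
    env: Mapping[str, str],
    git_upstream_hosts: frozenset[str] = frozenset(),
) -> None:
    """Raise ManifestError on the first rule violation; return None if all pass."""
    seen: set[tuple[str, str | None]] = set()
    for entry in tokens:
        if entry.Kind not in KINDS:
            raise ManifestError(f"unsupported Kind {entry.Kind!r}")
        # Fail fast when the referenced host env var is missing.
        if entry.TokenRef not in env:
            raise ManifestError(f"host env var {entry.TokenRef} is unset")
        if entry.Kind == "gitea" and not entry.Url:
            raise ManifestError("gitea entries require Url")
        # One entry per Kind, except gitea, which is keyed by its Url.
        key = (entry.Kind, entry.Url if entry.Kind == "gitea" else None)
        if key in seen:
            raise ManifestError(f"duplicate tokens entry for {key}")
        seen.add(key)
        # Assumed collision check: compare the entry's host against the
        # hosts that bottle.git upstreams already route through git-gate.
        if entry.Kind in ("github", "gitea") and entry.Url:
            host = urlsplit(entry.Url).hostname
            if host in git_upstream_hosts:
                raise ManifestError(
                    f"git-gate already brokers this remote ({host}), drop one"
                )
```

A caller would pass `os.environ` as `env` at launch time. Keeping the environment as a parameter is a design choice made here so the rules can be unit-tested without touching the real host environment.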
+ +### Routing table + +| Kind | Proxy path | Upstream | Header | +|-----------|----------------|-------------------------|----------------------------| +| anthropic | `/anthropic/` | `api.anthropic.com` | `Authorization: Bearer …` | +| github | `/gh-api/` | `api.github.com` | `Authorization: Bearer …` | +| github | `/gh-git/` | `github.com` | `Authorization: Bearer …` | +| gitea | `/gitea/` | configured `Url` | `Authorization: token …` | +| npm | `/npm/` | `registry.npmjs.org` | `Authorization: Bearer …` | + +Gitea uses `Authorization: token` rather than `Bearer` to +sidestep `go-gitea/gitea#16734`. The proxy strips any incoming +`Authorization` header before injecting its own — the agent +cannot smuggle a stolen token through this path. + +### External dependencies + +The proxy binary. Two real options: + +- **Python (stdlib)** — `http.server` + `urllib`/`http.client`, + no new pip packages. Matches CLAUDE.md's "bash-first, low-deps" + posture. SSE pass-through is fiddly but doable. +- **Go single binary** — cleaner SSE story, smaller runtime, + one static binary baked into the image. New build dependency. + +Default: Python, baked into the agent image. Reconsider in the +implementation PR if SSE behavior is troublesome under load. + +No new Python packages. No DB. No admin API. The proxy's +configuration is a single mode-600 JSON file passed in via +`/run/cred-proxy/routes.json`. + +## Future work + +- **AWS / SigV4.** Likely an IMDS emulator sidecar handing out + short-lived STS tokens. Different threat model (the agent + ends up holding the STS creds — the proxy just shortens + their lifetime). Separate PRD. +- **Per-method / per-path allowlist** inside a kind. Once the + set of API operations claude actually performs is observed, + reject everything else. Narrows the within-allowlist surface. 
+- **Short-lived token minting.** For services that support it + (GitHub Apps, GitLab project-access tokens, fine-grained + PATs with TTL), have the proxy mint a fresh per-session + child credential from a long-lived parent. +- **Smolmachines colocation.** Same packing question as + pipelock / git-gate; the cred-proxy can sit inside the agent + VM (current shape) or in a separate VM (stricter isolation, + per-bottle TCP hop). Backend decision, not a manifest decision. +- **More kinds.** PyPI, Bun, cargo, Docker Hub. The routing + pattern generalizes; add as needed. + +## Open questions + +- **Field name.** `bottle.tokens` is the working name. The + research note used `bottle.forge` for the gitea/github + generalization, but "forge" doesn't fit `anthropic` or + `npm`. Alternatives: `bottle.brokered`, `bottle.upstreams`, + `bottle.cred_proxy`. Default: `bottle.tokens`. +- **Python vs Go for the proxy.** Default: Python, revisit + during implementation if SSE pass-through is unreliable. +- **Process inside the agent container vs sidecar container.** + v1: inside (simpler lifecycle, no extra container; ptrace + boundary is enough). The sidecar option becomes attractive + only if we want a network-layer split between proxy and agent + on top of the UID split. +- **Belt-and-braces on outbound telemetry.** Set + `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1` and + `DISABLE_ERROR_REPORTING=1` in the agent's environ by + default? Default: yes — they don't route through + `ANTHROPIC_BASE_URL`, so the proxy doesn't catch them; the + flags are the only off switch. +- **`git push` over a rewritten URL vs. credential-helper + shim.** `[url "http://…"] insteadOf = "https://github.com/"` + captures push/fetch/clone/pull/ls-remote in one config knob; + a credential helper would need separate wiring. Default: + `insteadOf`. +- **Token-refresh story for the Anthropic OAuth token.** The + token is ~1-year and there's no client-side refresh, so the + proxy holds a static value. 
The 1-year blast radius is the + cost, documented in + [`claude-code-token-revocation.md`](../research/claude-code-token-revocation.md). + No design change here; flagged for awareness. +- **`anthropics/claude-code#36998`.** Older claude-code + versions bypassed `ANTHROPIC_BASE_URL` for some startup + calls (auth validation, org lookup). Marked closed upstream; + the implementation PR verifies with `strace -e connect` + against the pinned claude-code build before trusting the + isolation. + +## References + +- [`docs/research/agent-credential-proxy-landscape.md`](../research/agent-credential-proxy-landscape.md) + — landscape research; this PRD is the build path that note + recommends. +- [`docs/research/secret-minimization-over-dlp.md`](../research/secret-minimization-over-dlp.md) + — architectural framing: why moving the credential matters + more than scanning egress. +- PRD 0006: pipelock TLS interception — the + `api.anthropic.com` TLS-MITM responsibility cred-proxy takes + over. +- PRD 0008: Git gate — the credential-broker pattern this PRD + reuses (gate holds creds, agent gets a rewritten URL, gate + makes the upstream connection). +- [`anthropics/claude-code#36998`](https://github.com/anthropics/claude-code/issues/36998) + — historic `ANTHROPIC_BASE_URL` bypass. +- [`go-gitea/gitea#16734`](https://github.com/go-gitea/gitea/issues/16734) + — why Gitea uses `Authorization: token`, not `Bearer`. +- [`golang/go#28866`](https://github.com/golang/go/issues/28866) + — the `HTTPS_PROXY` loopback bug; not hit here because we're + a reverse proxy, not a forward proxy. -- 2.52.0 From 3747927b9efddf58741d445314baec9732e1aae7 Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 00:23:09 -0400 Subject: [PATCH 02/24] docs: align cred-proxy architecture diagram Trim one trailing space from the four arrow/HTTPS rows and add one dash to the bottle-container bottom edge so all box-bound lines are 68 columns. 
--- docs/prds/0010-cred-proxy.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/prds/0010-cred-proxy.md b/docs/prds/0010-cred-proxy.md index 65f153f..8f1756b 100644 --- a/docs/prds/0010-cred-proxy.md +++ b/docs/prds/0010-cred-proxy.md @@ -185,8 +185,8 @@ supported kinds (anthropic, github, gitea, npm): │ │ │ git remote.url → http://127.0.0.1:PORT/... │ │ │ │ │ │ tea --url → http://127.0.0.1:PORT/gitea │ │ │ │ │ └────────────┬───────────────────────────────────────┘ │ │ -│ │ │ plain HTTP, loopback │ │ -│ │ ▼ │ │ +│ │ │ plain HTTP, loopback │ │ +│ │ ▼ │ │ │ │ ┌── UID 0 (root) ────────────────────────────────────┐ │ │ │ │ │ cred-proxy listens 127.0.0.1:PORT │ │ │ │ │ │ tokens live ONLY in this process's environ │ │ │ @@ -198,15 +198,15 @@ supported kinds (anthropic, github, gitea, npm): │ │ │ /npm → registry.npmjs.org Bearer │ │ │ │ │ │ SSE pass-through, no buffering │ │ │ │ │ └────────────┬───────────────────────────────────────┘ │ │ -│ │ │ HTTPS │ │ -│ │ ▼ │ │ +│ │ │ HTTPS │ │ +│ │ ▼ │ │ │ │ ┌── pipelock (egress allowlist) ─────────────────────┐ │ │ │ │ │ allow: api.anthropic.com, api.github.com, │ │ │ │ │ │ github.com, gitea.dideric.is, │ │ │ │ │ │ registry.npmjs.org │ │ │ │ │ │ block: statsig, sentry, autoupdater, * │ │ │ │ │ └────────────┬───────────────────────────────────────┘ │ │ -│ └────────────────┼──────────────────────────────────────────┘ │ +│ └────────────────┼───────────────────────────────────────────┘ │ │ ▼ │ └────────────────────┼─────────────────────────────────────────────┘ ▼ -- 2.52.0 From 9fa971713565ec16de795ea882b8952a7bae8ee5 Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 00:40:16 -0400 Subject: [PATCH 03/24] docs: switch cred-proxy to sidecar shape Make the cred-proxy a per-bottle sidecar container on the bottle's internal docker network instead of a root-owned process inside the agent container. The boundary becomes container namespace separation, matching pipelock and git-gate. 
Update summary, problem, goals, in-scope, architecture diagram, components, existing code touched, external deps, and open questions; add a "Considered alternatives" section recording the rejected in-container shape. --- docs/prds/0010-cred-proxy.md | 233 ++++++++++++++++++++++------------- 1 file changed, 145 insertions(+), 88 deletions(-) diff --git a/docs/prds/0010-cred-proxy.md b/docs/prds/0010-cred-proxy.md index 8f1756b..760c2a8 100644 --- a/docs/prds/0010-cred-proxy.md +++ b/docs/prds/0010-cred-proxy.md @@ -6,15 +6,16 @@ ## Summary -Per-bottle reverse proxy that holds API tokens (Anthropic OAuth, -GitHub PAT, Gitea PAT, npm token) in a root-owned process inside -the agent container. The agent (`node`, UID 1000) keeps only URLs -in its environ; the proxy injects the right `Authorization` header -and forwards over TLS. The boundary that makes this meaningful is -the kernel's `ptrace_may_access` check: `node` cannot read root's -`/proc//environ` and cannot `ptrace` attach without -`CAP_SYS_PTRACE` / `CAP_PERFMON`, which claude-bottle does not -grant. +Per-bottle sidecar container that holds API tokens (Anthropic +OAuth, GitHub PAT, Gitea PAT, npm token). The agent container +keeps only URLs in its environ; the sidecar injects the right +`Authorization` header and forwards over TLS to the upstream. The +boundary is the container line — PID, mount, and network +namespaces separate the agent's container from the sidecar's, so +from inside the agent the sidecar's processes are not visible in +`/proc`, cannot be `ptrace`'d, and share no memory. Reaching the +sidecar's environ requires escaping the agent container — the same +threshold pipelock and git-gate already rely on. AWS / SigV4 is explicitly out of scope — it is per-request signing, not header injection, and does not fit this proxy's shape. If a @@ -34,11 +35,10 @@ nothing else). Linux has no per-env-var ACL — once a variable is in a process's environ, the process and its descendants own it. 
The credible -boundary is process-level: hold the credential in a different -process the agent cannot read. Default Docker already enforces -that boundary at the kernel line via `ptrace_may_access`, the -same property the (removed) ssh-gate and the current git-gate -rely on. +boundary is container-level: hold the credential in a separate +container the agent cannot reach. Default Docker's namespace +isolation enforces that — the same property pipelock and git-gate +already rely on. The research note [`agent-credential-proxy-landscape.md`](../research/agent-credential-proxy-landscape.md) @@ -55,15 +55,16 @@ supported kinds (anthropic, github, gitea, npm): 1. **No plaintext tokens in the agent's environ.** `printenv` and `cat /proc/self/environ` from the agent's shell return only - URLs pointing at `127.0.0.1:/...`. None of the + URLs pointing at `cred-proxy:/...`. None of the `bottle.tokens[].TokenRef` values appear. -2. **Kernel boundary holds.** From the agent's shell, - `cat /proc//environ` returns `EACCES` and - `gdb -p ` / `strace -p ` fails - with `EPERM`. +2. **Container boundary holds.** From the agent's shell, `ps aux` + does not list the cred-proxy process; there is no `/proc/` + entry for it to read. The sidecar's hostname (`cred-proxy`) + resolves only on the bottle's internal network — from a + different bottle or from the host, the name does not resolve. 3. **Anthropic API works.** `claude` makes a successful streaming tool-use round-trip via `ANTHROPIC_BASE_URL` → - `127.0.0.1:/anthropic`. SSE chunks arrive without + `cred-proxy:/anthropic`. SSE chunks arrive without buffering; `anthropic-version`, `anthropic-beta`, and `X-Claude-Code-Session-Id` headers round-trip untouched. 4. 
**Git push to declared remotes works.** `git push` against a @@ -117,36 +118,41 @@ supported kinds (anthropic, github, gitea, npm): `npm`), an optional `Url` (required for `gitea`, defaulted for the others), and `TokenRef` (the name of a host env var the CLI resolves at launch time). -- **cred-proxy process.** Runs as root inside the agent - container, listens on `127.0.0.1:`. Holds the tokens in - its own environ — never on argv, never written to disk. +- **cred-proxy sidecar.** Runs as its own container on the + bottle's internal docker network with hostname `cred-proxy`, + listening on `0.0.0.0:` bound to the internal interface. + No host port published. Holds the tokens in the sidecar + container's environ — never on argv, never written to disk. Per-`Kind` route handler: inject the right header, forward - over TLS, stream the response back to the client without - buffering. + over TLS, stream the response back without buffering. - **Agent-side rewrites.** Provisioner writes: - - `ANTHROPIC_BASE_URL=http://127.0.0.1:/anthropic` to + - `ANTHROPIC_BASE_URL=http://cred-proxy:/anthropic` to the agent's environ - - `~/.npmrc` `registry = http://127.0.0.1:/npm/` + - `~/.npmrc` `registry = http://cred-proxy:/npm/` - `~/.gitconfig` `[url …] insteadOf = …` for each declared `github` / `gitea` upstream - `~/.config/tea/config.yml` with the proxy URL for each declared `gitea` entry -- **Process lifecycle.** Container entrypoint launches the proxy - first as root, waits for it to bind, then `exec setpriv … - --reuid=node --regid=node …` for the claude child. Proxy - death is fatal (the container exits); this is also the - PID-1-zombie story. -- **pipelock interop.** Drop `api.anthropic.com` from pipelock's - TLS-MITM list; keep it on the allowlist as a plain HTTPS host - (cred-proxy is the trust endpoint now). Verify pipelock still - lets cred-proxy's HTTPS connections out for the four upstream - hosts. 
+- **Sidecar lifecycle.** Mirrors `DockerGitGate` / + `DockerPipelockProxy` in shape: `prepare` is host-side and + side-effect-free; `start` does `docker create` + `docker start` + on the bottle's internal network with hostname `cred-proxy`; + `stop` is idempotent `docker rm -f`. Container name: + `claude-bottle-cred-proxy-`. The agent container starts + after the sidecar is up so DNS resolution succeeds on the + agent's first call. +- **pipelock interop.** cred-proxy's outbound HTTPS still + traverses pipelock — pipelock keeps its egress-allowlist role + for the four upstream hosts. Drop `api.anthropic.com` from + pipelock's TLS-MITM list (cred-proxy is now the trust endpoint + for that host); the host stays on the plain HTTPS allowlist. - **Plan rendering.** `bottle_plan.py` and the y/N preflight show: which tokens are configured (kind + ref name, not the value), the proxy port, the routes the proxy will publish. - **Drop the existing `CLAUDE_CODE_OAUTH_TOKEN` forward in `prepare.py`.** Today it lands in the agent's environ; once - this PRD ships, it lands in the proxy's environ instead. + this PRD ships, it lands in the cred-proxy sidecar's environ + instead. - **Tests.** Integration tests for each of the six success criteria; unit tests for manifest parsing, route table generation, header injection. @@ -175,22 +181,23 @@ supported kinds (anthropic, github, gitea, npm): │ GITEA_SERVER_TOKEN, NPM_TOKEN │ │ │ docker run -e KEY (no =VALUE on argv) │ │ ▼ │ -│ ┌── Bottle container ────────────────────────────────────────┐ │ +│ ┌── per-bottle internal docker network ──────────────────────┐ │ │ │ │ │ -│ │ ┌── UID 1000 (node) ─────────────────────────────────┐ │ │ -│ │ │ claude --dangerously-skip-permissions │ │ │ +│ │ ┌── agent container ─────────────────────────────────┐ │ │ +│ │ │ claude as node (UID 1000) │ │ │ +│ │ │ --dangerously-skip-permissions │ │ │ │ │ │ environ: URLs only, no plaintext tokens │ │ │ -│ │ │ ANTHROPIC_BASE_URL=http://127.0.0.1:PORT/anth.. 
│ │ │
-│ │ │ npm registry → http://127.0.0.1:PORT/npm/          │ │ │
-│ │ │ git remote.url → http://127.0.0.1:PORT/...         │ │ │
-│ │ │ tea --url → http://127.0.0.1:PORT/gitea            │ │ │
+│ │ │ ANTHROPIC_BASE_URL=http://cred-proxy:PORT/an..     │ │ │
+│ │ │ npm registry → http://cred-proxy:PORT/npm/         │ │ │
+│ │ │ git insteadOf → http://cred-proxy:PORT/...         │ │ │
+│ │ │ tea --url → http://cred-proxy:PORT/gite            │ │ │
 │ │ └────────────┬───────────────────────────────────────┘ │ │
-│ │              │ plain HTTP, loopback                    │ │
+│ │              │ HTTP, DNS → cred-proxy                  │ │
 │ │              ▼                                         │ │
-│ │ ┌── UID 0 (root) ────────────────────────────────────┐ │ │
-│ │ │ cred-proxy listens 127.0.0.1:PORT                  │ │ │
-│ │ │ tokens live ONLY in this process's environ         │ │ │
-│ │ │ per-route: inject auth header, forward over TLS    │ │ │
+│ │ ┌── cred-proxy sidecar ──────────────────────────────┐ │ │
+│ │ │ distroless image, no shell, runs as root           │ │ │
+│ │ │ hostname: cred-proxy listens 0.0.0.0:PORT          │ │ │
+│ │ │ tokens live ONLY in this container's environ       │ │ │
 │ │ │ /anthropic → api.anthropic.com Bearer              │ │ │
 │ │ │ /gh-api → api.github.com Bearer                    │ │ │
 │ │ │ /gh-git → github.com Bearer                        │ │ │
@@ -200,7 +207,7 @@ supported kinds (anthropic, github, gitea, npm):
 │ │ └────────────┬───────────────────────────────────────┘ │ │
 │ │              │ HTTPS                                   │ │
 │ │              ▼                                         │ │
-│ │ ┌── pipelock (egress allowlist) ─────────────────────┐ │ │
+│ │ ┌── pipelock sidecar (egress allowlist) ─────────────┐ │ │
 │ │ │ allow: api.anthropic.com, api.github.com,          │ │ │
 │ │ │        github.com, gitea.dideric.is,               │ │ │
 │ │ │        registry.npmjs.org                          │ │ │
@@ -213,35 +220,40 @@ supported kinds (anthropic, github, gitea, npm):

            Upstream APIs

-Why node@1000 can't just steal the tokens:
-  ┌─────────────────────────────────────────────────────────┐
-  │  node tries:                                            │
-  │    cat /proc//environ → EACCES                          │
-  │    ptrace(PTRACE_ATTACH, , ...) → EPERM                 │
-  │  Kernel's ptrace_may_access rejects: UID mismatch       │
-  │  and no CAP_SYS_PTRACE / CAP_PERFMON in the container.  │
-  └─────────────────────────────────────────────────────────┘
+Why the agent can't reach the sidecar's environ:
+  ┌───────────────────────────────────────────────────────────────┐
+  │  Different container = different PID, mount, and network ns.  │
+  │  The agent's /proc shows only the agent's own processes;      │
+  │  the cred-proxy PID is not visible — no /proc//environ        │
+  │  to read, no PID to ptrace, no shared memory.                 │
+  │                                                               │
+  │  Reaching the sidecar's environ requires escaping the agent   │
+  │  container — the same threshold pipelock and git-gate rely    │
+  │  on. Default Docker isolation is the boundary.                │
+  └───────────────────────────────────────────────────────────────┘
 ```

 ### New components

 - **`claude_bottle/cred_proxy.py`** (new): abstract `CredProxy`
   + `CredProxyPlan` dataclass. `prepare` is host-side and
-  side-effect-free on Docker; renders the route table and
-  resolves `TokenRef`s against host env. Mirrors the existing
-  `GitGate` / `Pipelock` shape.
+  side-effect-free; renders the route table and resolves
+  `TokenRef`s against host env. Mirrors the existing `GitGate` /
+  `Pipelock` shape.
 - **`claude_bottle/backend/docker/cred_proxy.py`** (new):
-  `DockerCredProxy` concrete subclass. Bakes the proxy binary
-  into the agent image; `start` writes the route table to a
-  mode-600 file under `stage_dir` and arranges the entrypoint
-  so the proxy boots first.
+  `DockerCredProxy` concrete subclass. `start` does
+  `docker create` on the bottle's internal network with hostname
+  `cred-proxy`, copies the route-table file into the container,
+  then `docker start`. `stop` is idempotent `docker rm -f`.
+  Container name: `claude-bottle-cred-proxy-`.
 - **`claude_bottle/backend/docker/provision/cred_proxy.py`**
   (new): renders `ANTHROPIC_BASE_URL`, `~/.npmrc`,
   `~/.gitconfig` `insteadOf` blocks, and `~/.config/tea/config.yml`
-  into the agent's home for each declared kind.
-- **The proxy binary itself.** Bundled into the agent image at
-  `/usr/local/libexec/cred-proxy`. See "External dependencies"
-  for the language choice.
+  into the agent's home for each declared kind — all pointing at
+  `http://cred-proxy:/...`.
+- **cred-proxy image.** Minimal base + the proxy binary, no
+  shell. Pinned by digest, baked at build time. Footprint sized
+  to match git-gate's image rather than the full agent image.

 ### Existing code touched

@@ -251,14 +263,17 @@ Why node@1000 can't just steal the tokens:
   carry multiple Urls).
 - **`claude_bottle/backend/docker/prepare.py`** — delete the
   `CLAUDE_BOTTLE_OAUTH_TOKEN` → `CLAUDE_CODE_OAUTH_TOKEN` branch
-  in the agent's forwarded env. The OAuth token now flows to
-  the proxy's environ via the cred-proxy lifecycle.
+  in the agent's forwarded env. The OAuth token is forwarded
+  into the cred-proxy sidecar's environ at sidecar `docker create`
+  time instead.
 - **`claude_bottle/backend/docker/backend.py`** — instantiate
-  `DockerCredProxy`; thread its `prepare` / `start` / `stop`
+  `DockerCredProxy` alongside `DockerPipelockProxy` and
+  `DockerGitGate`; thread its `prepare` / `start` / `stop`
   through `resolve_plan` / `launch`.
 - **`claude_bottle/backend/docker/launch.py`** — add cred-proxy
-  start before the cred-proxy provisioner runs (provisioner
-  writes URLs that reference the proxy port, so it must be up).
+  start/stop to the `ExitStack` alongside pipelock and git-gate;
+  the sidecar must be up before the agent container starts so
+  DNS resolution for `cred-proxy` succeeds on first contact.
 - **`claude_bottle/backend/docker/bottle_plan.py`** — new
   `CredProxyPlan` field; preflight shows kind + ref name + port
   + route table.
@@ -330,14 +345,17 @@ The proxy binary. Two real options:
   no new pip packages. Matches CLAUDE.md's "bash-first, low-deps"
   posture. SSE pass-through is fiddly but doable.
 - **Go single binary** — cleaner SSE story, smaller runtime,
-  one static binary baked into the image. New build dependency.
+  one static binary in a scratch/distroless image. New build
+  dependency.

-Default: Python, baked into the agent image. Reconsider in the
-implementation PR if SSE behavior is troublesome under load.
+Default: Python in a minimal `python:3.X-slim` image (or alpine
+if we want smaller). Reconsider in the implementation PR if SSE
+behavior is troublesome under load.

 No new Python packages. No DB. No admin API. The proxy's
-configuration is a single mode-600 JSON file passed in via
-`/run/cred-proxy/routes.json`.
+configuration is a single mode-600 JSON file copied into the
+sidecar at `docker create` time and read by the proxy at startup
+from `/run/cred-proxy/routes.json`.

 ## Future work

@@ -353,12 +371,51 @@ configuration is a single mode-600 JSON file passed in via
   PATs with TTL), have the proxy mint a fresh per-session child
   credential from a long-lived parent.
 - **Smolmachines colocation.** Same packing question as
-  pipelock / git-gate; the cred-proxy can sit inside the agent
-  VM (current shape) or in a separate VM (stricter isolation,
-  per-bottle TCP hop). Backend decision, not a manifest decision.
+  pipelock / git-gate; under a future microVM backend the
+  cred-proxy could share a VM with the agent (today's per-bottle
+  network gives it its own container, not its own VM) or sit in
+  its own VM (stricter isolation, an extra TCP hop). Backend
+  decision, not a manifest decision.
 - **More kinds.** PyPI, Bun, cargo, Docker Hub. The routing
   pattern generalizes; add as needed.

+## Considered alternatives
+
+### In-container proxy (root inside the agent container)
+
+Run cred-proxy as PID 1 of the agent container, listening on
+`127.0.0.1:`, with claude exec'd as `node` (UID 1000) only
+after the proxy is bound. The boundary in that shape is the
+kernel's cross-UID `ptrace_may_access` check — `node` cannot read
+root's `/proc//environ` and cannot `ptrace` attach.
+
+Pros: one less container per bottle; slightly faster bottle
+startup; no extra docker create/start/stop dance.
+
+Rejected because:
+
+- **Weaker isolation.** The boundary collapses to UID separation
+  alone. Any container-root compromise inside the agent (setuid
+  bug in the image, accidentally mounted docker socket, a kernel
+  CVE, accidental `--privileged`) reads the proxy's environ via
+  `/proc//environ`. The sidecar's namespace separation
+  cannot be bypassed from inside the agent container without a
+  container escape.
+- **Inconsistent with the existing topology.** pipelock and
+  git-gate are already sidecars on the bottle's internal network.
+  cred-proxy slots into the same shape and reuses the same
+  lifecycle abstractions (`BottleBackend.prepare/start/stop`,
+  `ExitStack` ordering, plan rendering).
+- **Coupled to the agent image.** The proxy binary, its
+  entrypoint, and its priv-drop logic would all live in the
+  agent's Dockerfile. A sidecar image evolves independently —
+  agents can change base, language, or tooling without touching
+  the proxy.
+- **PID-1 babysitting.** The "proxy supervises, then `exec
+  setpriv → node`" entrypoint introduces a class of issues
+  (zombie reaping, signal forwarding, exit-code propagation) that
+  the sidecar shape avoids.
+
 ## Open questions

 - **Field name.** `bottle.tokens` is the working name. The
@@ -368,11 +425,11 @@ configuration is a single mode-600 JSON file passed in via
   `bottle.cred_proxy`. Default: `bottle.tokens`.
 - **Python vs Go for the proxy.** Default: Python, revisit during
   implementation if SSE pass-through is unreliable.
-- **Process inside the agent container vs sidecar container.**
-  v1: inside (simpler lifecycle, no extra container; ptrace
-  boundary is enough). The sidecar option becomes attractive
-  only if we want a network-layer split between proxy and agent
-  on top of the UID split.
+- **Sidecar image base.** Distroless (smallest, no shell — hardest
+  to debug), Python slim (debuggable, larger), or scratch + a
+  statically-linked Go binary (smallest if Go). Default: whatever
+  fits the chosen language with the smallest non-shell base;
+  revisit if debuggability bites during implementation.
 - **Belt-and-braces on outbound telemetry.** Set
   `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1` and
   `DISABLE_ERROR_REPORTING=1` in the agent's environ by
-- 
2.52.0

From 930997d0a705383566eb4d93ccc87898540ef722 Mon Sep 17 00:00:00 2001
From: didericis
Date: Wed, 13 May 2026 15:59:00 -0400
Subject: [PATCH 04/24] feat(manifest): add bottle.tokens with TokenEntry (PRD 0010)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

TokenEntry carries Kind (anthropic / github / gitea / npm), TokenRef
(name of host env var the CLI resolves at launch), and an optional Url
(required for gitea, fixed for the other kinds). Validation rejects
unknown kinds, duplicate non-gitea entries, duplicate gitea Urls, and
overlap with bottle.git hosts (where git-gate is already brokering). No
wiring yet — the field exists on Bottle but cred-proxy is the next
step. Adds tests/unit/test_manifest_tokens.py.
---
 claude_bottle/manifest.py          | 166 ++++++++++++++++++++++++-
 tests/unit/test_manifest_tokens.py | 191 +++++++++++++++++++++++++++++
 2 files changed, 356 insertions(+), 1 deletion(-)
 create mode 100644 tests/unit/test_manifest_tokens.py

diff --git a/claude_bottle/manifest.py b/claude_bottle/manifest.py
index face952..9e35da9 100644
--- a/claude_bottle/manifest.py
+++ b/claude_bottle/manifest.py
@@ -7,6 +7,7 @@ Schema (see CLAUDE.md "Intended design"):
       "": {
         "env": { "": , ... },
         "git": [ , ... ],
+        "tokens": [ , ... ],
         "egress": { "allowlist": [ "", ... ] }
       }
     },
@@ -113,6 +114,94 @@ class GitEntry:
             )


+TOKEN_KINDS = ("anthropic", "github", "gitea", "npm")
+
+
+@dataclass(frozen=True)
+class TokenEntry:
+    """One credential the per-bottle cred-proxy sidecar (PRD 0010)
+    holds and injects on the agent's behalf.
+
+    `Kind` selects the route handler: `anthropic` / `github` / `npm`
+    have fixed upstream URLs; `gitea` requires an explicit `Url`
+    because the upstream is per-instance.
+
+    `TokenRef` is the name of the host env var the CLI resolves at
+    launch time. The value is forwarded into the cred-proxy
+    container's environ via `docker run -e NAME` — never onto argv,
+    never into a file. The value does NOT land in the agent's
+    environ.
+
+    `UpstreamHost` is parsed from `Url` for `gitea` entries (or the
+    documented default for the other kinds). It exists so the
+    cross-validator can spot collisions with `bottle.git` upstreams
+    without re-parsing URLs at every call site."""
+
+    Kind: str
+    TokenRef: str
+    Url: str = ""
+    UpstreamHost: str = ""
+
+    @classmethod
+    def from_dict(cls, bottle_name: str, idx: int, raw: object) -> "TokenEntry":
+        d = _as_json_object(raw, f"bottle '{bottle_name}' tokens[{idx}]")
+        kind = d.get("Kind")
+        if not isinstance(kind, str) or not kind:
+            die(
+                f"bottle '{bottle_name}' tokens[{idx}] missing required string field "
+                f"'Kind'"
+            )
+        if kind not in TOKEN_KINDS:
+            die(
+                f"bottle '{bottle_name}' tokens[{idx}] Kind {kind!r} is not one of "
+                f"{', '.join(TOKEN_KINDS)}"
+            )
+        token_ref = d.get("TokenRef")
+        if not isinstance(token_ref, str) or not token_ref:
+            die(
+                f"bottle '{bottle_name}' tokens[{idx}] ({kind}) missing required "
+                f"string field 'TokenRef' (name of the host env var to forward)"
+            )
+        url_raw = d.get("Url")
+        if url_raw is None:
+            url = ""
+        elif isinstance(url_raw, str):
+            url = url_raw
+        else:
+            die(
+                f"bottle '{bottle_name}' tokens[{idx}] ({kind}) Url must be a string "
+                f"(was {type(url_raw).__name__})"
+            )
+        if kind == "gitea":
+            if not url:
+                die(
+                    f"bottle '{bottle_name}' tokens[{idx}] (gitea) requires a Url "
+                    f"(the Gitea instance, e.g. https://gitea.dideric.is)"
+                )
+            host = _parse_https_host(
+                url, f"bottle '{bottle_name}' tokens[{idx}] (gitea) Url"
+            )
+        else:
+            if url:
+                die(
+                    f"bottle '{bottle_name}' tokens[{idx}] ({kind}) cannot set Url; "
+                    f"the upstream for this Kind is fixed by cred-proxy. Drop the "
+                    f"'Url' field."
+                )
+            host = _TOKEN_DEFAULT_HOST[kind]
+        return cls(Kind=kind, TokenRef=token_ref, Url=url, UpstreamHost=host)
+
+
+# Hostnames the cred-proxy talks to upstream for the non-gitea kinds.
+# Used both for the proxy's route table and for the manifest cross-
+# validator that rejects overlap with `bottle.git`.
+_TOKEN_DEFAULT_HOST: dict[str, str] = {
+    "anthropic": "api.anthropic.com",
+    "github": "github.com",
+    "npm": "registry.npmjs.org",
+}
+
+
 DLP_ACTIONS = ("block", "warn")


@@ -168,6 +257,7 @@ class BottleEgress:
 class Bottle:
     env: Mapping[str, str] = field(default_factory=_empty_str_dict)
     git: tuple[GitEntry, ...] = ()
+    tokens: tuple[TokenEntry, ...] = ()
     egress: BottleEgress = field(default_factory=BottleEgress)

     @classmethod
@@ -215,6 +305,21 @@ class Bottle:
             )
             _validate_unique_git_names(name, git)

+        tokens: tuple[TokenEntry, ...] = ()
+        tokens_raw = d.get("tokens")
+        if tokens_raw is not None:
+            if not isinstance(tokens_raw, list):
+                die(
+                    f"bottle '{name}' tokens must be an array "
+                    f"(was {type(tokens_raw).__name__})"
+                )
+            tokens_list = cast(list[object], tokens_raw)
+            tokens = tuple(
+                TokenEntry.from_dict(name, i, entry)
+                for i, entry in enumerate(tokens_list)
+            )
+            _validate_tokens(name, tokens, git)
+
         egress_raw = d.get("egress")
         egress = (
             BottleEgress.from_dict(name, egress_raw)
@@ -222,7 +327,7 @@ class Bottle:
             else BottleEgress()
         )

-        return cls(env=env, git=git, egress=egress)
+        return cls(env=env, git=git, tokens=tokens, egress=egress)


 @dataclass(frozen=True)
@@ -441,6 +546,65 @@ def _parse_git_upstream(url: str, label: str) -> tuple[str, str, str, str]:
     return (user, host, port, path)


+def _parse_https_host(url: str, label: str) -> str:
+    """Extract the host from an `https://host[:port][/path]` URL.
+    Dies if `url` is not an https:// URL or the host segment is empty.
+    Used to derive `TokenEntry.UpstreamHost` from a gitea Url so the
+    cross-validator can spot collisions with `bottle.git` hosts."""
+    if not url.startswith("https://"):
+        die(f"{label} must be an https:// URL (was {url!r})")
+    rest = url[len("https://"):]
+    hostport, _, _ = rest.partition("/")
+    host, _, _port = hostport.partition(":")
+    if not host:
+        die(f"{label} host is empty in {url!r}")
+    return host
+
+
+def _validate_tokens(
+    bottle_name: str,
+    tokens: tuple[TokenEntry, ...],
+    git: tuple[GitEntry, ...],
+) -> None:
+    """Cross-validation for `bottle.tokens`:
+
+    - At most one entry per Kind, except `gitea` which may have
+      multiple entries (one per Gitea instance) with distinct Urls.
+    - No overlap with `bottle.git` hosts: a `github` or `gitea` token
+      whose host matches a `bottle.git` upstream host would put two
+      credential brokers on the same remote (git-gate's gitleaks-
+      scanning gate AND cred-proxy's bearer injection). Pick one.
+    """
+    by_kind: dict[str, list[TokenEntry]] = {}
+    for t in tokens:
+        by_kind.setdefault(t.Kind, []).append(t)
+    for kind, entries in by_kind.items():
+        if kind == "gitea":
+            seen: dict[str, None] = {}
+            for e in entries:
+                if e.Url in seen:
+                    die(
+                        f"bottle '{bottle_name}' tokens has duplicate gitea Url "
+                        f"{e.Url!r}; one entry per Gitea instance."
+                    )
+                seen[e.Url] = None
+        elif len(entries) > 1:
+            die(
+                f"bottle '{bottle_name}' tokens has {len(entries)} entries with "
+                f"Kind {kind!r}; at most one is allowed (gitea is the only Kind "
+                f"that may have multiple entries)."
+            )
+
+    git_hosts = {g.UpstreamHost for g in git}
+    for t in tokens:
+        if t.Kind in ("github", "gitea") and t.UpstreamHost in git_hosts:
+            die(
+                f"bottle '{bottle_name}' token ({t.Kind}, host {t.UpstreamHost!r}) "
+                f"overlaps a bottle.git upstream on the same host. git-gate already "
+                f"brokers this remote; drop the token entry or remove the git entry."
+            )
+
+
 def _validate_unique_git_names(bottle_name: str, git: tuple[GitEntry, ...]) -> None:
     seen: dict[str, None] = {}
     for g in git:
diff --git a/tests/unit/test_manifest_tokens.py b/tests/unit/test_manifest_tokens.py
new file mode 100644
index 0000000..388c591
--- /dev/null
+++ b/tests/unit/test_manifest_tokens.py
@@ -0,0 +1,191 @@
+"""Unit: Bottle.tokens manifest parsing + validation (PRD 0010)."""
+
+import unittest
+
+from claude_bottle.log import Die
+from claude_bottle.manifest import Manifest
+
+
+def _manifest(tokens, git=None):
+    bottle: dict[str, object] = {"tokens": tokens}
+    if git is not None:
+        bottle["git"] = git
+    return {
+        "bottles": {"dev": bottle},
+        "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}},
+    }
+
+
+class TestTokenEntryParsing(unittest.TestCase):
+    def test_parses_anthropic_entry(self):
+        m = Manifest.from_json_obj(_manifest([
+            {"Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN"},
+        ]))
+        entries = m.bottles["dev"].tokens
+        self.assertEqual(1, len(entries))
+        e = entries[0]
+        self.assertEqual("anthropic", e.Kind)
+        self.assertEqual("CLAUDE_BOTTLE_OAUTH_TOKEN", e.TokenRef)
+        self.assertEqual("", e.Url)
+        self.assertEqual("api.anthropic.com", e.UpstreamHost)
+
+    def test_parses_github_entry(self):
+        m = Manifest.from_json_obj(_manifest([
+            {"Kind": "github", "TokenRef": "GITHUB_TOKEN"},
+        ]))
+        e = m.bottles["dev"].tokens[0]
+        self.assertEqual("github", e.Kind)
+        self.assertEqual("github.com", e.UpstreamHost)
+
+    def test_parses_npm_entry(self):
+        m = Manifest.from_json_obj(_manifest([
+            {"Kind": "npm", "TokenRef": "NPM_TOKEN"},
+        ]))
+        e = m.bottles["dev"].tokens[0]
+        self.assertEqual("registry.npmjs.org", e.UpstreamHost)
+
+    def test_parses_gitea_entry_with_url(self):
+        m = Manifest.from_json_obj(_manifest([
+            {"Kind": "gitea", "TokenRef": "GITEA_TOKEN",
+             "Url": "https://gitea.dideric.is"},
+        ]))
+        e = m.bottles["dev"].tokens[0]
+        self.assertEqual("gitea", e.Kind)
+        self.assertEqual("https://gitea.dideric.is", e.Url)
+        self.assertEqual("gitea.dideric.is", e.UpstreamHost)
+
+    def test_gitea_url_with_port_strips_port_from_host(self):
+        m = Manifest.from_json_obj(_manifest([
+            {"Kind": "gitea", "TokenRef": "GITEA_TOKEN",
+             "Url": "https://gitea.dideric.is:30009"},
+        ]))
+        self.assertEqual("gitea.dideric.is", m.bottles["dev"].tokens[0].UpstreamHost)
+
+
+class TestTokenEntryValidation(unittest.TestCase):
+    def test_unknown_kind_dies(self):
+        with self.assertRaises(Die):
+            Manifest.from_json_obj(_manifest([
+                {"Kind": "aws", "TokenRef": "AWS_TOKEN"},
+            ]))
+
+    def test_missing_kind_dies(self):
+        with self.assertRaises(Die):
+            Manifest.from_json_obj(_manifest([
+                {"TokenRef": "GITHUB_TOKEN"},
+            ]))
+
+    def test_missing_token_ref_dies(self):
+        with self.assertRaises(Die):
+            Manifest.from_json_obj(_manifest([
+                {"Kind": "github"},
+            ]))
+
+    def test_gitea_without_url_dies(self):
+        with self.assertRaises(Die):
+            Manifest.from_json_obj(_manifest([
+                {"Kind": "gitea", "TokenRef": "GITEA_TOKEN"},
+            ]))
+
+    def test_gitea_with_non_https_url_dies(self):
+        with self.assertRaises(Die):
+            Manifest.from_json_obj(_manifest([
+                {"Kind": "gitea", "TokenRef": "GITEA_TOKEN",
+                 "Url": "http://gitea.dideric.is"},
+            ]))
+
+    def test_non_gitea_kind_with_url_dies(self):
+        # Url is fixed for anthropic / github / npm — passing one is a
+        # configuration smell, not an override knob.
+        with self.assertRaises(Die):
+            Manifest.from_json_obj(_manifest([
+                {"Kind": "github", "TokenRef": "GITHUB_TOKEN",
+                 "Url": "https://api.example.com"},
+            ]))
+
+    def test_duplicate_non_gitea_kind_dies(self):
+        with self.assertRaises(Die):
+            Manifest.from_json_obj(_manifest([
+                {"Kind": "github", "TokenRef": "A"},
+                {"Kind": "github", "TokenRef": "B"},
+            ]))
+
+    def test_two_gitea_with_distinct_urls_ok(self):
+        m = Manifest.from_json_obj(_manifest([
+            {"Kind": "gitea", "TokenRef": "T1",
+             "Url": "https://gitea.dideric.is"},
+            {"Kind": "gitea", "TokenRef": "T2",
+             "Url": "https://gitea.example.com"},
+        ]))
+        self.assertEqual(2, len(m.bottles["dev"].tokens))
+
+    def test_two_gitea_with_same_url_dies(self):
+        with self.assertRaises(Die):
+            Manifest.from_json_obj(_manifest([
+                {"Kind": "gitea", "TokenRef": "T1",
+                 "Url": "https://gitea.dideric.is"},
+                {"Kind": "gitea", "TokenRef": "T2",
+                 "Url": "https://gitea.dideric.is"},
+            ]))
+
+
+class TestTokenGitOverlap(unittest.TestCase):
+    def test_github_token_collides_with_github_git_entry(self):
+        # bottle.git already brokers github.com via the gate; declaring
+        # a github token on top would put two credential brokers on
+        # the same remote.
+        with self.assertRaises(Die):
+            Manifest.from_json_obj(_manifest(
+                tokens=[{"Kind": "github", "TokenRef": "GITHUB_TOKEN"}],
+                git=[{
+                    "Name": "myrepo",
+                    "Upstream": "ssh://git@github.com/me/myrepo.git",
+                    "IdentityFile": "/dev/null",
+                }],
+            ))
+
+    def test_gitea_token_collides_with_same_host_git_entry(self):
+        with self.assertRaises(Die):
+            Manifest.from_json_obj(_manifest(
+                tokens=[{
+                    "Kind": "gitea", "TokenRef": "GITEA_TOKEN",
+                    "Url": "https://gitea.dideric.is",
+                }],
+                git=[{
+                    "Name": "myrepo",
+                    "Upstream": "ssh://git@gitea.dideric.is:30009/me/myrepo.git",
+                    "IdentityFile": "/dev/null",
+                }],
+            ))
+
+    def test_anthropic_token_does_not_collide_with_git(self):
+        # api.anthropic.com isn't a git host; no overlap possible.
+        m = Manifest.from_json_obj(_manifest(
+            tokens=[{"Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN"}],
+            git=[{
+                "Name": "myrepo",
+                "Upstream": "ssh://git@gitea.dideric.is:30009/me/myrepo.git",
+                "IdentityFile": "/dev/null",
+            }],
+        ))
+        self.assertEqual(1, len(m.bottles["dev"].tokens))
+
+
+class TestEmptyTokensField(unittest.TestCase):
+    def test_no_tokens_field_yields_empty_tuple(self):
+        m = Manifest.from_json_obj({
+            "bottles": {"dev": {}},
+            "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}},
+        })
+        self.assertEqual((), m.bottles["dev"].tokens)
+
+    def test_tokens_array_type_required(self):
+        with self.assertRaises(Die):
+            Manifest.from_json_obj({
+                "bottles": {"dev": {"tokens": "not-a-list"}},
+                "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}},
+            })
+
+
+if __name__ == "__main__":
+    unittest.main()
-- 
2.52.0

From 3165fbeafe53dd4d5c8657ee721a73a6f30f647b Mon Sep 17 00:00:00 2001
From: didericis
Date: Wed, 13 May 2026 16:01:18 -0400
Subject: [PATCH 05/24] feat(cred_proxy): add abstract CredProxy + plan (PRD 0010)

Lifts bottle.tokens into a per-route CredProxyUpstream table, renders a
mode-600 routes.json that carries no token values or host env-var
names, and derives the {token_env: TokenRef} map the launch step will
use to forward host env values into the sidecar's environ. Shape
mirrors GitGate/PipelockProxy: abstract base does the host-side
prepare; start/stop is backend-specific. No backend wiring yet.
---
 claude_bottle/cred_proxy.py   | 268 ++++++++++++++++++++++++++++++++++
 tests/unit/test_cred_proxy.py | 191 ++++++++++++++++++++++++
 2 files changed, 459 insertions(+)
 create mode 100644 claude_bottle/cred_proxy.py
 create mode 100644 tests/unit/test_cred_proxy.py

diff --git a/claude_bottle/cred_proxy.py b/claude_bottle/cred_proxy.py
new file mode 100644
index 0000000..ab53db4
--- /dev/null
+++ b/claude_bottle/cred_proxy.py
@@ -0,0 +1,268 @@
+"""Per-bottle credential proxy (PRD 0010).
+
+A fourth per-bottle sidecar that holds API tokens (Anthropic OAuth,
+GitHub PAT, Gitea PAT, npm token) and injects them as `Authorization`
+headers on the agent's behalf. The agent's environ carries only URLs
+pointing at `cred-proxy:/`; the upstream credentials live
+exclusively in the cred-proxy container's environ.
+
+The boundary is the container line — different PID, mount, and network
+namespaces separate the agent's container from the cred-proxy's, so
+the agent cannot ptrace into the proxy, cannot read its environ via
+/proc, and cannot share memory. Reaching the proxy's environ requires
+escaping the agent container, the same threshold pipelock and
+git-gate already rely on.
+
+This module defines the abstract proxy (`CredProxy`), its plan
+dataclass (`CredProxyPlan`), and the per-route shape
+(`CredProxyUpstream`). The sidecar's start/stop lifecycle is backend-
+specific and lives on concrete subclasses (see
+`claude_bottle/backend/docker/cred_proxy.py`).
+"""
+
+from __future__ import annotations
+
+import json
+from abc import ABC, abstractmethod
+from dataclasses import dataclass
+from pathlib import Path
+
+from .log import die
+from .manifest import Bottle, TokenEntry
+
+
+@dataclass(frozen=True)
+class CredProxyUpstream:
+    """One route on the cred-proxy sidecar. Maps a path under the
+    proxy to a real upstream, an auth scheme, and the env-var slot
+    that holds the token inside the proxy container.
+
+    `kind` is the originating `TokenEntry.Kind`; `path` is the agent-
+    facing prefix (e.g. `/anthropic/`); `upstream` is the upstream
+    base URL with scheme; `auth_scheme` is the literal word that
+    precedes the token in the injected header (`Bearer` for all kinds
+    except `gitea`, which uses `token` to sidestep go-gitea/gitea#16734).
+
+    `token_env` is the env-var name inside the cred-proxy container
+    (e.g. `CRED_PROXY_TOKEN_0`); `token_ref` is the host env var the
+    CLI reads at launch and forwards into the container's environ
+    under `token_env`. Two routes that share a TokenRef (the github
+    Kind expands into two routes — gh-api and gh-git) carry the same
+    `token_env`."""
+
+    kind: str
+    path: str
+    upstream: str
+    auth_scheme: str
+    token_env: str
+    token_ref: str
+
+
+@dataclass(frozen=True)
+class CredProxyPlan:
+    """Output of CredProxy.prepare; consumed by .start.
+
+    The slug + routes_path + upstreams + token_env_map fields are
+    filled at prepare time (host-side, side-effect-free on docker).
+    The network fields are populated by the backend's launch step
+    via `dataclasses.replace` once those networks exist. Empty
+    defaults are sentinels meaning "not yet set"; `.start` validates
+    that they are populated.
+
+    `token_env_map` is `{: }`.
+    The backend's start step reads `os.environ[TokenRef]` and forwards
+    the value into the cred-proxy container's environ under
+    `token_env`. The plan itself never holds token values — secrets
+    never land in a dataclass that might be logged."""
+
+    slug: str
+    routes_path: Path
+    upstreams: tuple[CredProxyUpstream, ...]
+    token_env_map: dict[str, str]
+    internal_network: str = ""
+    egress_network: str = ""
+
+
+# Hardcoded upstream URLs for the non-gitea Kinds. Gitea's URL is per-
+# entry (`TokenEntry.Url`).
+_KIND_ROUTES: dict[str, tuple[tuple[str, str], ...]] = {
+    # kind -> ((path, upstream), ...) — a Kind can produce multiple
+    # routes; today only `github` does (api + git endpoints).
+    "anthropic": (("/anthropic/", "https://api.anthropic.com"),),
+    "github": (
+        ("/gh-api/", "https://api.github.com"),
+        ("/gh-git/", "https://github.com"),
+    ),
+    "npm": (("/npm/", "https://registry.npmjs.org"),),
+}
+
+# Per-Kind auth header value prefix. Gitea uses `token` (not Bearer);
+# everyone else uses Bearer.
+_KIND_AUTH_SCHEME: dict[str, str] = {
+    "anthropic": "Bearer",
+    "github": "Bearer",
+    "gitea": "token",
+    "npm": "Bearer",
+}
+
+
+def cred_proxy_route_path_for_gitea(host: str) -> str:
+    """Agent-facing path for a single Gitea instance. The host segment
+    disambiguates routes when multiple gitea entries are declared."""
+    return f"/gitea/{host}/"
+
+
+def cred_proxy_upstreams_for_bottle(
+    bottle: Bottle,
+) -> tuple[CredProxyUpstream, ...]:
+    """Lift every `bottle.tokens[]` entry into one or more
+    CredProxyUpstreams. Order is preserved so route lookup is stable.
+    Manifest validation already enforced uniqueness rules."""
+    out: list[CredProxyUpstream] = []
+    for i, t in enumerate(bottle.tokens):
+        token_env = f"CRED_PROXY_TOKEN_{i}"
+        scheme = _KIND_AUTH_SCHEME[t.Kind]
+        if t.Kind == "gitea":
+            out.append(CredProxyUpstream(
+                kind="gitea",
+                path=cred_proxy_route_path_for_gitea(t.UpstreamHost),
+                upstream=t.Url.rstrip("/"),
+                auth_scheme=scheme,
+                token_env=token_env,
+                token_ref=t.TokenRef,
+            ))
+        else:
+            for path, upstream in _KIND_ROUTES[t.Kind]:
+                out.append(CredProxyUpstream(
+                    kind=t.Kind,
+                    path=path,
+                    upstream=upstream,
+                    auth_scheme=scheme,
+                    token_env=token_env,
+                    token_ref=t.TokenRef,
+                ))
+    return tuple(out)
+
+
+def cred_proxy_token_env_map(
+    upstreams: tuple[CredProxyUpstream, ...],
+) -> dict[str, str]:
+    """Collapse the upstream list into `{token_env: TokenRef}`. Two
+    routes that share a token (gh-api + gh-git) coalesce; the result
+    is the set of env vars the backend's start step must forward into
+    the sidecar's environ."""
+    out: dict[str, str] = {}
+    for u in upstreams:
+        existing = out.get(u.token_env)
+        if existing is not None and existing != u.token_ref:
+            die(
+                f"cred-proxy plan conflict: {u.token_env} maps to both "
+                f"{existing!r} and {u.token_ref!r}. Two routes sharing a "
+                f"token slot must reference the same host env var."
+            )
+        out[u.token_env] = u.token_ref
+    return out
+
+
+def cred_proxy_render_routes(
+    upstreams: tuple[CredProxyUpstream, ...],
+) -> str:
+    """Serialize the route table for the cred-proxy server to read.
+    JSON, no token values, no host env-var names — the only thing
+    the proxy needs at runtime is the path → upstream + auth-scheme +
+    in-container env-var mapping. The actual token values arrive via
+    the container's environ."""
+    payload = {
+        "routes": [
+            {
+                "path": u.path,
+                "upstream": u.upstream,
+                "auth_scheme": u.auth_scheme,
+                "token_env": u.token_env,
+            }
+            for u in upstreams
+        ],
+    }
+    return json.dumps(payload, indent=2, sort_keys=False) + "\n"
+
+
+def cred_proxy_resolve_token_values(
+    token_env_map: dict[str, str],
+    host_env: dict[str, str],
+) -> dict[str, str]:
+    """Read `host_env[TokenRef]` for each entry in `token_env_map` and
+    return `{token_env: }`. Dies (with a clear pointer at the
+    missing var name) if any TokenRef is unset.
+
+    Pure function: takes the host env as an argument so tests can pass
+    a sealed mapping without touching `os.environ`."""
+    out: dict[str, str] = {}
+    for token_env, token_ref in token_env_map.items():
+        value = host_env.get(token_ref)
+        if value is None:
+            die(
+                f"cred-proxy: host env var '{token_ref}' is unset. Set it "
+                f"before launching, or remove the corresponding token entry "
+                f"from bottle.tokens."
+            )
+        if not value:
+            die(
+                f"cred-proxy: host env var '{token_ref}' is empty. The "
+                f"cred-proxy will not inject an empty token; set it to the "
+                f"real value or remove the token entry."
+            )
+        out[token_env] = value
+    return out
+
+
+class CredProxy(ABC):
+    """The per-bottle credential proxy. Encapsulates the host-side
+    prepare (upstream lift + routes.json render + token-env-map
+    derivation); the sidecar's start/stop lifecycle is backend-
+    specific and lives on concrete subclasses."""
+
+    def prepare(self, bottle: Bottle, slug: str, stage_dir: Path) -> CredProxyPlan:
+        """Lift `bottle.tokens` into the upstream table, render the
+        routes.json (mode 600) under `stage_dir`, and return the plan.
+        Pure host-side, no docker subprocess. The token-env map records
+        the mapping the launch step uses to forward values from the
+        host's environ into the sidecar's environ.
+
+        Returned plan is incomplete: the launch step must fill
+        `internal_network` / `egress_network` via `dataclasses.replace`
+        before passing it to `.start`."""
+        upstreams = cred_proxy_upstreams_for_bottle(bottle)
+        routes_path = stage_dir / "cred_proxy_routes.json"
+        routes_path.write_text(cred_proxy_render_routes(upstreams))
+        routes_path.chmod(0o600)
+        return CredProxyPlan(
+            slug=slug,
+            routes_path=routes_path,
+            upstreams=upstreams,
+            token_env_map=cred_proxy_token_env_map(upstreams),
+        )
+
+    @abstractmethod
+    def start(self, plan: CredProxyPlan) -> str:
+        """Bring up the cred-proxy sidecar according to `plan`. Returns
+        the target string identifying the running instance — the same
+        value to pass to `.stop`. Backend-specific."""
+
+    @abstractmethod
+    def stop(self, target: str) -> None:
+        """Tear down the cred-proxy sidecar identified by `target` (the
+        value `.start` returned). Idempotent: a missing target is
+        success. Backend-specific."""
+
+
+__all__ = [
+    "CredProxy",
+    "CredProxyPlan",
+    "CredProxyUpstream",
+    "TokenEntry",
+    "cred_proxy_render_routes",
+    "cred_proxy_resolve_token_values",
+    "cred_proxy_route_path_for_gitea",
+    "cred_proxy_token_env_map",
+    "cred_proxy_upstreams_for_bottle",
+]
diff --git a/tests/unit/test_cred_proxy.py b/tests/unit/test_cred_proxy.py
new file mode 100644
index 0000000..68794b9
--- /dev/null
+++ b/tests/unit/test_cred_proxy.py
@@ -0,0 +1,191 @@
+"""Unit: CredProxy upstream lift + routes.json render + token resolution
+(PRD 0010)."""
+
+import json
+import unittest
+
+from claude_bottle.cred_proxy import (
+    cred_proxy_render_routes,
+    cred_proxy_resolve_token_values,
+    cred_proxy_token_env_map,
+    cred_proxy_upstreams_for_bottle,
+)
+from claude_bottle.log import Die
+from claude_bottle.manifest import Manifest
+
+
+def _bottle(tokens):
+    return Manifest.from_json_obj({
+        "bottles": {"dev": {"tokens": tokens}},
+        "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}},
+    }).bottles["dev"]
+
+
+class TestUpstreamLift(unittest.TestCase):
+    def test_anthropic_yields_one_route(self):
+        b = _bottle([{"Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN"}])
+        upstreams = cred_proxy_upstreams_for_bottle(b)
+        self.assertEqual(1, len(upstreams))
+        u = upstreams[0]
+        self.assertEqual("anthropic", u.kind)
+        self.assertEqual("/anthropic/", u.path)
+        self.assertEqual("https://api.anthropic.com", u.upstream)
+        self.assertEqual("Bearer", u.auth_scheme)
+        self.assertEqual("CRED_PROXY_TOKEN_0", u.token_env)
+        self.assertEqual("CLAUDE_BOTTLE_OAUTH_TOKEN", u.token_ref)
+
+    def test_github_yields_two_routes_sharing_token_env(self):
+        b = _bottle([{"Kind": "github", "TokenRef": "GITHUB_TOKEN"}])
+        upstreams = cred_proxy_upstreams_for_bottle(b)
+        self.assertEqual(2, len(upstreams))
+        paths = [u.path for u in upstreams]
+        self.assertIn("/gh-api/", paths)
+        self.assertIn("/gh-git/", paths)
+        self.assertEqual({"CRED_PROXY_TOKEN_0"}, {u.token_env for u in upstreams})
+        for u in upstreams:
+            self.assertEqual("Bearer", u.auth_scheme)
+            self.assertEqual("GITHUB_TOKEN", u.token_ref)
+
+    def test_gitea_uses_token_scheme_and_host_path(self):
+        b = _bottle([
+            {"Kind": "gitea", "TokenRef": "GITEA_TOKEN",
+             "Url": "https://gitea.dideric.is"},
+        ])
+        u = cred_proxy_upstreams_for_bottle(b)[0]
+        self.assertEqual("/gitea/gitea.dideric.is/", u.path)
+        self.assertEqual("https://gitea.dideric.is", u.upstream)
+        self.assertEqual("token", u.auth_scheme)
+
+    def test_gitea_url_trailing_slash_stripped(self):
+        b = _bottle([
+            {"Kind": "gitea", "TokenRef": "GITEA_TOKEN",
+             "Url": "https://gitea.dideric.is/"},
+        ])
+        u = cred_proxy_upstreams_for_bottle(b)[0]
+        self.assertEqual("https://gitea.dideric.is", u.upstream)
+
+    def test_npm_yields_one_route(self):
+        b = _bottle([{"Kind": "npm", "TokenRef": "NPM_TOKEN"}])
+        u = cred_proxy_upstreams_for_bottle(b)[0]
+        self.assertEqual("/npm/", u.path)
+        self.assertEqual("https://registry.npmjs.org", u.upstream)
+
+    def test_four_kinds_get_distinct_token_envs(self):
+        b = _bottle([
+            {"Kind": "anthropic", "TokenRef": "A"},
+            {"Kind": "github", "TokenRef": "G"},
+            {"Kind": "gitea", "TokenRef": "T",
+             "Url": "https://gitea.dideric.is"},
+            {"Kind": "npm", "TokenRef": "N"},
+        ])
+        upstreams = cred_proxy_upstreams_for_bottle(b)
+        # 1 anthropic + 2 github + 1 gitea + 1 npm = 5 routes
+        self.assertEqual(5, len(upstreams))
+        # github shares one token_env across its two routes -> 4 distinct
+        envs = {u.token_env for u in upstreams}
+        self.assertEqual({"CRED_PROXY_TOKEN_0", "CRED_PROXY_TOKEN_1",
+                          "CRED_PROXY_TOKEN_2", "CRED_PROXY_TOKEN_3"}, envs)
+
+    def test_empty_tokens_yields_empty_upstreams(self):
+        b = _bottle([])
+        self.assertEqual((), cred_proxy_upstreams_for_bottle(b))
+
+
+class TestTokenEnvMap(unittest.TestCase):
+    def test_distinct_envs_yield_full_map(self):
+        b = _bottle([
+            {"Kind": "anthropic", "TokenRef": "A"},
+            {"Kind": "github", "TokenRef": "G"},
+        ])
+        m = cred_proxy_token_env_map(cred_proxy_upstreams_for_bottle(b))
+        self.assertEqual({"CRED_PROXY_TOKEN_0": "A",
+                          "CRED_PROXY_TOKEN_1": "G"}, m)
+
+    def test_github_two_routes_coalesce_to_one_env(self):
+        b = _bottle([{"Kind": "github", "TokenRef": "G"}])
+        m = cred_proxy_token_env_map(cred_proxy_upstreams_for_bottle(b))
+        self.assertEqual({"CRED_PROXY_TOKEN_0": "G"}, m)
+
+
+class TestRoutesRender(unittest.TestCase):
+    def test_renders_json_with_expected_shape(self):
+        b = _bottle([
+            {"Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN"},
+            {"Kind": "gitea", "TokenRef": "GITEA_TOKEN",
+             "Url": "https://gitea.dideric.is"},
+        ])
+        rendered = cred_proxy_render_routes(cred_proxy_upstreams_for_bottle(b))
+        payload = json.loads(rendered)
+        self.assertEqual(["routes"], list(payload.keys()))
+        self.assertEqual(2, len(payload["routes"]))
+        anthropic = payload["routes"][0]
+        self.assertEqual({"path", "upstream", "auth_scheme", "token_env"},
+                         set(anthropic.keys()))
+        self.assertEqual("/anthropic/", anthropic["path"])
+        self.assertEqual("https://api.anthropic.com", anthropic["upstream"])
+        self.assertEqual("Bearer", anthropic["auth_scheme"])
+        self.assertEqual("CRED_PROXY_TOKEN_0", anthropic["token_env"])
+
+    def test_routes_carry_no_token_values_or_host_env_names(self):
+        # routes.json lives mode-600 in the staging dir and gets
+        # docker cp'd into the sidecar — it must not leak secret values
+        # or even the host-side TokenRef name.
+        b = _bottle([{"Kind": "github", "TokenRef": "GITHUB_TOKEN"}])
+        rendered = cred_proxy_render_routes(cred_proxy_upstreams_for_bottle(b))
+        self.assertNotIn("GITHUB_TOKEN", rendered)
+
+    def test_empty_upstreams_renders_empty_routes_array(self):
+        rendered = cred_proxy_render_routes(())
+        self.assertEqual({"routes": []}, json.loads(rendered))
+
+
+class TestResolveTokenValues(unittest.TestCase):
+    def test_resolves_present_env(self):
+        out = cred_proxy_resolve_token_values(
+            {"CRED_PROXY_TOKEN_0": "FOO"},
+            {"FOO": "the-value"},
+        )
+        self.assertEqual({"CRED_PROXY_TOKEN_0": "the-value"}, out)
+
+    def test_unset_host_env_dies(self):
+        with self.assertRaises(Die):
+            cred_proxy_resolve_token_values(
+                {"CRED_PROXY_TOKEN_0": "MISSING"},
+                {},
+            )
+
+    def test_empty_host_env_dies(self):
+        with self.assertRaises(Die):
+            cred_proxy_resolve_token_values(
+                {"CRED_PROXY_TOKEN_0": "FOO"},
+                {"FOO": ""},
+            )
+
+
+class TestCredProxyPrepare(unittest.TestCase):
+    def test_prepare_writes_routes_file_and_returns_plan(self):
+        import tempfile
+        from pathlib import Path
+
+        from claude_bottle.cred_proxy import CredProxy, CredProxyPlan
+
+        class StubCredProxy(CredProxy):
+            def start(self, plan): return ""
+            def stop(self, target): return None
+
+        b = _bottle([{"Kind": "github", "TokenRef": "GITHUB_TOKEN"}])
+        with tempfile.TemporaryDirectory() as td:
+            stage = Path(td)
+            plan = StubCredProxy().prepare(b, "test-slug", stage)
+            self.assertIsInstance(plan, CredProxyPlan)
+            self.assertEqual("test-slug", plan.slug)
+            self.assertTrue(plan.routes_path.is_file())
+
self.assertEqual(0o600, plan.routes_path.stat().st_mode & 0o777) + payload = json.loads(plan.routes_path.read_text()) + self.assertEqual(2, len(payload["routes"])) + self.assertEqual({"CRED_PROXY_TOKEN_0": "GITHUB_TOKEN"}, + plan.token_env_map) + + +if __name__ == "__main__": + unittest.main() -- 2.52.0 From 3436d8a68a3474595b06f85476e7c09d5917dfcd Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 16:05:56 -0400 Subject: [PATCH 06/24] feat(cred_proxy): add HTTP server + sidecar image (PRD 0010) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Stdlib-only Python proxy: reads /run/cred-proxy/routes.json on boot, listens on 0.0.0.0:9099, strips inbound Authorization, injects the configured header (Bearer or token) using the route's token_env env var, forwards over HTTPS to the upstream, and streams the response back chunk-by-chunk (SSE-safe). Hop-by-hop headers are stripped per RFC 7230, including anything listed in `Connection:`. Content-Length is dropped so http.client recomputes it on the upstream leg. Tokens never reach routes.json — they arrive via the container's environ. Dockerfile.cred-proxy builds on python:3.13-alpine pinned by digest; mkdir /run/cred-proxy is baked in so docker cp can drop the route table at start time. No pip install layer. Smoke-tested: container boots, logs listen line, returns 404 for unmatched paths. Full request/response cycle covered by the integration tests in a follow-up commit. 
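The header rules listed in this commit message can be condensed into a small standalone sketch. It is an illustration under those stated rules, not the `build_forward_headers` shipped in the diff below. The name `rewrite_request_headers` and all sample values are assumptions made for the example.

```python
# Illustrative sketch only: mirrors the header rules described in the
# commit message, not the code shipped in cred_proxy_server.py.
# `rewrite_request_headers` and the sample values are hypothetical.

HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate",
    "proxy-authorization", "te", "trailer",
    "transfer-encoding", "upgrade",
})


def rewrite_request_headers(incoming, auth_scheme, token, upstream_host):
    """Return the header list for the upstream leg of one request."""
    incoming = list(incoming)
    # Anything named in `Connection:` is hop-by-hop for this hop too.
    listed = {
        part.strip().lower()
        for name, value in incoming if name.lower() == "connection"
        for part in value.split(",")
        if part.strip()
    }
    # Host and Authorization are replaced; Content-Length is left for
    # the HTTP client to recompute from the body it is given.
    drop = HOP_BY_HOP | listed | {"host", "authorization", "content-length"}
    kept = [(n, v) for n, v in incoming if n.lower() not in drop]
    kept.append(("Host", upstream_host))
    kept.append(("Authorization", f"{auth_scheme} {token}"))
    return kept


example = rewrite_request_headers(
    [
        ("Host", "cred-proxy:9099"),
        ("Authorization", "Bearer value-sent-by-agent"),
        ("Connection", "keep-alive, X-Debug"),
        ("X-Debug", "1"),
        ("Content-Length", "17"),
        ("anthropic-version", "2023-06-01"),
    ],
    auth_scheme="Bearer",
    token="token-from-sidecar-env",
    upstream_host="api.anthropic.com",
)
```

In this sketch only `anthropic-version` survives from the inbound request. `Host` and `Authorization` are re-added with the upstream's values, so whatever the agent put in `Authorization` never leaves the sidecar.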
--- Dockerfile.cred-proxy | 35 +++ claude_bottle/cred_proxy_server.py | 401 +++++++++++++++++++++++++++ tests/unit/test_cred_proxy_server.py | 205 ++++++++++++++ 3 files changed, 641 insertions(+) create mode 100644 Dockerfile.cred-proxy create mode 100644 claude_bottle/cred_proxy_server.py create mode 100644 tests/unit/test_cred_proxy_server.py diff --git a/Dockerfile.cred-proxy b/Dockerfile.cred-proxy new file mode 100644 index 0000000..451d7cc --- /dev/null +++ b/Dockerfile.cred-proxy @@ -0,0 +1,35 @@ +# Per-bottle cred-proxy sidecar image (PRD 0010). +# +# Holds API tokens (Anthropic OAuth, GitHub PAT, Gitea PAT, npm) in +# this container's environ, strips inbound Authorization headers, and +# injects the configured one before forwarding to the real upstream +# over HTTPS. The agent's environ carries only URLs pointing at this +# sidecar — the upstream credentials never reach the agent container. +# +# Stdlib-only Python; no pip install layer. The route table lands at +# /run/cred-proxy/routes.json via `docker cp` from the backend's +# start step. + +# python:3.13-alpine. Pinned by digest for reproducibility — the +# proxy script is stdlib-only so a Python minor-version drift would +# only affect the runtime, not API surface, but pinning makes the +# image bytes deterministic. +FROM python@sha256:420cd0bf0f3998275875e02ecd5808168cf0843cbb4d3c536432f729247b2acc + +# The proxy script ships as a single file. Tests in tests/unit/ import +# it as `claude_bottle.cred_proxy_server`; the container runs it +# directly as a script. No package install, no other modules pulled. +COPY claude_bottle/cred_proxy_server.py /app/cred_proxy_server.py + +# Pre-create the runtime directory the backend's start step will +# `docker cp` routes.json into. docker cp does not create +# intermediate dirs, so the mkdir must be baked into the image. +RUN mkdir -p /run/cred-proxy + +# Listening port. 
The agent's environ resolves the cred-proxy host +# via Docker's embedded DNS on the per-bottle internal network and +# dials this port. Surfaced as EXPOSE for documentation; not required +# for the internal network to route to it. +EXPOSE 9099 + +ENTRYPOINT ["python3", "/app/cred_proxy_server.py"] diff --git a/claude_bottle/cred_proxy_server.py b/claude_bottle/cred_proxy_server.py new file mode 100644 index 0000000..6d756bb --- /dev/null +++ b/claude_bottle/cred_proxy_server.py @@ -0,0 +1,401 @@ +"""Cred-proxy HTTP server (PRD 0010). + +Runs inside the per-bottle cred-proxy sidecar. Reads +`/run/cred-proxy/routes.json` (laid down by the backend's start step +via `docker cp`) and listens on `0.0.0.0:<port>`. For each request: + + 1. Match the request path against the longest route prefix. + 2. Strip any inbound `Authorization` header (the agent cannot + smuggle a stolen token through this path). + 3. Inject the configured header using the value of the env var + named by the route's `token_env`. + 4. Forward to the upstream over HTTPS, preserving method, path + suffix, query string, request body, and the remaining headers. + 5. Stream the response back without buffering — SSE-safe. + +The agent talks plain HTTP to this server (loopback-equivalent across +the per-bottle internal docker network). The cred-proxy talks HTTPS +outbound through pipelock to the real upstream. Tokens live in this +container's environ; they never land in routes.json on disk and never +reach the agent's container. + +Stdlib-only: this file ships into a minimal Python image with no pip +install layer. The constants are duplicated from `cred_proxy.py` so +the server doesn't need to import the rest of the package.
+""" + +from __future__ import annotations + +import http.client +import http.server +import json +import os +import socketserver +import sys +import typing +import urllib.parse +from dataclasses import dataclass + + +# --- Config / route table --------------------------------------------------- + + +@dataclass(frozen=True) +class Route: + """One row of the proxy's route table. + + `path` is the agent-facing prefix (e.g. `/anthropic/`); the + incoming request's path starts with this. `upstream_scheme` / + `upstream_host` / `upstream_base_path` are the parsed pieces of + the upstream URL — the request's path after the prefix is + appended to `upstream_base_path`. `auth_scheme` is the literal + word in the injected header (`Bearer` or `token`). `token_env` + is the env-var name this container reads to get the token.""" + + path: str + upstream_scheme: str + upstream_host: str + upstream_port: int + upstream_base_path: str + auth_scheme: str + token_env: str + + +def parse_routes(payload: dict[str, object]) -> tuple[Route, ...]: + """Parse the routes.json payload into a tuple of `Route`s. 
Sorted + by descending path length so longest-prefix match is the first + hit in iteration order.""" + raw = payload.get("routes") + if not isinstance(raw, list): + raise ValueError("routes.json: 'routes' must be a list") + out: list[Route] = [] + for r in raw: + if not isinstance(r, dict): + raise ValueError(f"routes.json: route must be an object (got {type(r).__name__})") + path = r["path"] + upstream = r["upstream"] + auth_scheme = r["auth_scheme"] + token_env = r["token_env"] + if not isinstance(path, str) or not path.startswith("/") or not path.endswith("/"): + raise ValueError(f"routes.json: path {path!r} must start and end with /") + if not isinstance(upstream, str): + raise ValueError("routes.json: upstream must be a string") + if not isinstance(auth_scheme, str): + raise ValueError("routes.json: auth_scheme must be a string") + if not isinstance(token_env, str) or not token_env: + raise ValueError("routes.json: token_env must be a non-empty string") + parsed = urllib.parse.urlsplit(upstream) + if parsed.scheme not in ("http", "https"): + raise ValueError(f"routes.json: upstream scheme must be http or https (got {parsed.scheme!r})") + if not parsed.hostname: + raise ValueError(f"routes.json: upstream {upstream!r} missing host") + port = parsed.port or (443 if parsed.scheme == "https" else 80) + base_path = parsed.path or "" + out.append(Route( + path=path, + upstream_scheme=parsed.scheme, + upstream_host=parsed.hostname, + upstream_port=port, + upstream_base_path=base_path, + auth_scheme=auth_scheme, + token_env=token_env, + )) + out.sort(key=lambda r: len(r.path), reverse=True) + return tuple(out) + + +def select_route(routes: typing.Sequence[Route], request_path: str) -> Route | None: + """Return the longest-prefix matching route, or None. 
Caller is + responsible for stripping any query string before passing + `request_path`.""" + for r in routes: + if request_path.startswith(r.path): + return r + return None + + +# --- Header handling -------------------------------------------------------- + + +# Hop-by-hop headers (RFC 7230 §6.1). Stripped before forwarding. +# Plus `host` (we set it for the upstream) and any `authorization` / +# `proxy-authorization` (the proxy injects its own, never forwards +# the agent's). +_HOP_BY_HOP = frozenset({ + "connection", + "keep-alive", + "proxy-authenticate", + "proxy-authorization", + "te", + "trailer", + "transfer-encoding", + "upgrade", +}) + +_STRIPPED = _HOP_BY_HOP | frozenset({"host", "authorization", "content-length"}) + + +def build_forward_headers( + incoming: typing.Iterable[tuple[str, str]], + *, + auth_scheme: str, + token: str, + upstream_host: str, +) -> list[tuple[str, str]]: + """Build the header list to send upstream. + + - Strip hop-by-hop headers, the inbound Authorization (the agent + cannot smuggle a stolen token), and Host (we set it ourselves). + - Strip Content-Length too: http.client recomputes it when we + pass `body` to `request()`. + - Honor the `Connection: close, x, y, z` form by also stripping + every listed header name. + - Inject `Authorization: <auth_scheme> <token>` and a Host header + pointing at the upstream. + """ + incoming_list = list(incoming) + # Headers listed in `Connection:` are also hop-by-hop for this hop.
+ extra_hop: set[str] = set() + for name, value in incoming_list: + if name.lower() == "connection": + for token_name in value.split(","): + extra_hop.add(token_name.strip().lower()) + forwarded: list[tuple[str, str]] = [] + for name, value in incoming_list: + lname = name.lower() + if lname in _STRIPPED or lname in extra_hop: + continue + forwarded.append((name, value)) + forwarded.append(("Host", upstream_host)) + forwarded.append(("Authorization", f"{auth_scheme} {token}")) + return forwarded + + +def filter_response_headers( + incoming: typing.Iterable[tuple[str, str]], +) -> list[tuple[str, str]]: + """Build the response header list to send back to the agent. + Strip hop-by-hop + `transfer-encoding` (we let the client's + HTTP/1.1 default chunking handle streamed bodies).""" + incoming_list = list(incoming) + extra_hop: set[str] = set() + for name, value in incoming_list: + if name.lower() == "connection": + for token_name in value.split(","): + extra_hop.add(token_name.strip().lower()) + out: list[tuple[str, str]] = [] + for name, value in incoming_list: + lname = name.lower() + if lname in _HOP_BY_HOP or lname in extra_hop: + continue + out.append((name, value)) + return out + + +# --- HTTP handler ----------------------------------------------------------- + + +# How many bytes to read off the upstream response per chunk. Small +# enough that SSE keep-alive `:` lines (~1 byte) and per-event payloads +# (~hundreds of bytes) round-trip without waiting for a larger buffer +# to fill. Large enough to not dominate syscall overhead under load. +STREAM_CHUNK = 4096 + + +class CredProxyHandler(http.server.BaseHTTPRequestHandler): + """Per-request handler. The routes + tokens are read off the + server instance (set by `serve()`).""" + + # Quieter logs: the default writes one line per request to stderr. + # Useful in debug but noisy in normal operation. 
+ def log_message(self, format: str, *args: typing.Any) -> None: + if os.environ.get("CRED_PROXY_DEBUG"): + super().log_message(format, *args) + + def do_GET(self) -> None: self._proxy() + def do_POST(self) -> None: self._proxy() + def do_PUT(self) -> None: self._proxy() + def do_DELETE(self) -> None: self._proxy() + def do_PATCH(self) -> None: self._proxy() + def do_HEAD(self) -> None: self._proxy() + def do_OPTIONS(self) -> None: self._proxy() + + def _proxy(self) -> None: + server = typing.cast("CredProxyServer", self.server) + path, _, query = self.path.partition("?") + route = select_route(server.routes, path) + if route is None: + self.send_error(404, f"no route for {path!r}") + return + token = server.tokens.get(route.token_env) + if not token: + self.send_error(500, f"cred-proxy: env var {route.token_env} unset in sidecar") + return + + suffix = path[len(route.path):] + upstream_path = route.upstream_base_path.rstrip("/") + "/" + suffix + if query: + upstream_path = f"{upstream_path}?{query}" + + # Read the request body, if any. We do not stream the body up + # because http.client doesn't accept a streamable body for + # arbitrary methods cleanly. v1 buffers — claude's tool-use + # requests are small JSON payloads; SSE flows are in the + # response direction only. 
+ body: bytes | None = None + length_header = self.headers.get("Content-Length") + if length_header is not None: + try: + length = int(length_header) + except ValueError: + self.send_error(400, "invalid Content-Length") + return + if length > 0: + body = self.rfile.read(length) + elif self.headers.get("Transfer-Encoding", "").lower() == "chunked": + self.send_error(411, "cred-proxy: chunked request bodies not supported in v1") + return + + forward_headers = build_forward_headers( + self.headers.items(), + auth_scheme=route.auth_scheme, + token=token, + upstream_host=route.upstream_host, + ) + + if route.upstream_scheme == "https": + conn: http.client.HTTPConnection = http.client.HTTPSConnection( + route.upstream_host, route.upstream_port, timeout=300, + ) + else: + conn = http.client.HTTPConnection( + route.upstream_host, route.upstream_port, timeout=300, + ) + + try: + conn.request(self.command, upstream_path, body=body, + headers=dict(forward_headers)) + resp = conn.getresponse() + except (OSError, http.client.HTTPException) as e: + try: + conn.close() + except Exception: + pass + self.send_error(502, f"upstream connection failed: {e}") + return + + try: + self._stream_response(resp) + finally: + try: + conn.close() + except Exception: + pass + + def _stream_response(self, resp: http.client.HTTPResponse) -> None: + out_headers = filter_response_headers(resp.getheaders()) + # We send Connection: close so the agent's client closes after + # each request; simplifies streaming bookkeeping and keeps + # the handler stateless per request. + self.send_response(resp.status, resp.reason) + for name, value in out_headers: + self.send_header(name, value) + self.send_header("Connection", "close") + self.end_headers() + try: + while True: + chunk = resp.read1(STREAM_CHUNK) + if not chunk: + break + self.wfile.write(chunk) + self.wfile.flush() + except (BrokenPipeError, ConnectionResetError): + # Agent disconnected mid-stream; that's fine.
+ return + + +class CredProxyServer(socketserver.ThreadingMixIn, http.server.HTTPServer): + """Threaded HTTP server. `routes` + `tokens` are populated by + `serve()` before `serve_forever()`.""" + + allow_reuse_address = True + daemon_threads = True + + routes: tuple[Route, ...] = () + tokens: dict[str, str] = {} + + +# --- Entry point ------------------------------------------------------------ + + +DEFAULT_ROUTES_PATH = "/run/cred-proxy/routes.json" +DEFAULT_PORT = 9099 + + +def load_routes(path: str) -> tuple[Route, ...]: + with open(path, "r", encoding="utf-8") as f: + payload = json.load(f) + if not isinstance(payload, dict): + raise ValueError(f"{path}: top-level must be an object") + return parse_routes(payload) + + +def load_tokens(routes: tuple[Route, ...], environ: typing.Mapping[str, str]) -> dict[str, str]: + """Read each route's `token_env` from the supplied environ. Missing + entries default to empty string; the handler returns 500 for + unset tokens at request time so the operator can spot the + misconfig in the cred-proxy's logs without the proxy refusing to + boot.""" + out: dict[str, str] = {} + for r in routes: + out[r.token_env] = environ.get(r.token_env, "") + return out + + +def serve( + *, + routes_path: str = DEFAULT_ROUTES_PATH, + port: int = DEFAULT_PORT, + bind: str = "0.0.0.0", + environ: typing.Mapping[str, str] | None = None, +) -> typing.NoReturn: + """Bring up the server and run until killed. 
Exits non-zero on + config error so the container's restart policy can surface the + failure rather than silently retrying.""" + env = environ if environ is not None else os.environ + routes = load_routes(routes_path) + tokens = load_tokens(routes, env) + server = CredProxyServer((bind, port), CredProxyHandler) + server.routes = routes + server.tokens = tokens + sys.stderr.write( + f"cred-proxy listening on {bind}:{port}; " + f"{len(routes)} route(s): " + f"{', '.join(r.path for r in routes)}\n" + ) + sys.stderr.flush() + try: + server.serve_forever() + except KeyboardInterrupt: + pass + finally: + server.server_close() + sys.exit(0) + + +def main(argv: list[str]) -> int: + """Tiny argv shim: no flags in v1, all config via env vars. + + `CRED_PROXY_ROUTES` overrides the routes path (default + `/run/cred-proxy/routes.json`). `CRED_PROXY_PORT` overrides the + listen port. Both have defaults so the container needs no extra + config to come up.""" + routes_path = os.environ.get("CRED_PROXY_ROUTES", DEFAULT_ROUTES_PATH) + port = int(os.environ.get("CRED_PROXY_PORT", str(DEFAULT_PORT))) + bind = os.environ.get("CRED_PROXY_BIND", "0.0.0.0") + serve(routes_path=routes_path, port=port, bind=bind) + return 0 # serve() does not return. 
+ + +if __name__ == "__main__": + raise SystemExit(main(sys.argv)) diff --git a/tests/unit/test_cred_proxy_server.py b/tests/unit/test_cred_proxy_server.py new file mode 100644 index 0000000..f3f22fd --- /dev/null +++ b/tests/unit/test_cred_proxy_server.py @@ -0,0 +1,205 @@ +"""Unit: cred-proxy server pure functions — route parsing, route +selection, header injection (PRD 0010).""" + +import unittest + +from claude_bottle.cred_proxy_server import ( + Route, + build_forward_headers, + filter_response_headers, + load_tokens, + parse_routes, + select_route, +) + + +class TestParseRoutes(unittest.TestCase): + def test_parses_minimal_payload(self): + routes = parse_routes({"routes": [ + {"path": "/anthropic/", "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_env": "CRED_PROXY_TOKEN_0"}, + ]}) + self.assertEqual(1, len(routes)) + r = routes[0] + self.assertEqual("/anthropic/", r.path) + self.assertEqual("https", r.upstream_scheme) + self.assertEqual("api.anthropic.com", r.upstream_host) + self.assertEqual(443, r.upstream_port) + self.assertEqual("", r.upstream_base_path) + self.assertEqual("Bearer", r.auth_scheme) + self.assertEqual("CRED_PROXY_TOKEN_0", r.token_env) + + def test_extracts_port_from_upstream(self): + routes = parse_routes({"routes": [ + {"path": "/gitea/gitea.dideric.is/", + "upstream": "https://gitea.dideric.is:30443", + "auth_scheme": "token", "token_env": "CRED_PROXY_TOKEN_0"}, + ]}) + self.assertEqual(30443, routes[0].upstream_port) + + def test_sorted_by_descending_path_length(self): + # /a/b/ should come before /a/ so longest-prefix is first. 
+ routes = parse_routes({"routes": [ + {"path": "/a/", "upstream": "https://x.example", + "auth_scheme": "Bearer", "token_env": "T1"}, + {"path": "/a/b/", "upstream": "https://y.example", + "auth_scheme": "Bearer", "token_env": "T2"}, + ]}) + self.assertEqual("/a/b/", routes[0].path) + self.assertEqual("/a/", routes[1].path) + + def test_bad_path_rejected(self): + with self.assertRaises(ValueError): + parse_routes({"routes": [ + {"path": "no-leading-slash", "upstream": "https://x", + "auth_scheme": "Bearer", "token_env": "T"}, + ]}) + + def test_non_http_scheme_rejected(self): + with self.assertRaises(ValueError): + parse_routes({"routes": [ + {"path": "/x/", "upstream": "ftp://x.example/", + "auth_scheme": "Bearer", "token_env": "T"}, + ]}) + + +class TestSelectRoute(unittest.TestCase): + def setUp(self): + self.routes = parse_routes({"routes": [ + {"path": "/anthropic/", "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_env": "T_A"}, + {"path": "/gh-api/", "upstream": "https://api.github.com", + "auth_scheme": "Bearer", "token_env": "T_G"}, + {"path": "/gitea/gitea.dideric.is/", + "upstream": "https://gitea.dideric.is", + "auth_scheme": "token", "token_env": "T_T"}, + ]}) + + def test_matches_prefix(self): + r = select_route(self.routes, "/anthropic/v1/messages") + assert r is not None + self.assertEqual("/anthropic/", r.path) + + def test_no_match_returns_none(self): + self.assertIsNone(select_route(self.routes, "/other/path")) + + def test_picks_longest_prefix(self): + routes = parse_routes({"routes": [ + {"path": "/a/", "upstream": "https://x.example", + "auth_scheme": "Bearer", "token_env": "T1"}, + {"path": "/a/long/", "upstream": "https://y.example", + "auth_scheme": "Bearer", "token_env": "T2"}, + ]}) + r = select_route(routes, "/a/long/sub") + assert r is not None + self.assertEqual("/a/long/", r.path) + + +class TestBuildForwardHeaders(unittest.TestCase): + def test_strips_authorization_and_injects(self): + headers = 
build_forward_headers( + [("Authorization", "Bearer stolen-token"), + ("Content-Type", "application/json")], + auth_scheme="Bearer", + token="real-token", + upstream_host="api.anthropic.com", + ) + names = [n.lower() for n, _ in headers] + # Only one Authorization remains, with the injected value. + auth_values = [v for n, v in headers if n.lower() == "authorization"] + self.assertEqual(["Bearer real-token"], auth_values) + self.assertEqual(1, names.count("authorization")) + # Content-Type passes through. + self.assertIn(("Content-Type", "application/json"), headers) + + def test_strips_authorization_case_insensitive(self): + headers = build_forward_headers( + [("authorization", "Bearer stolen")], + auth_scheme="Bearer", + token="real", + upstream_host="x.example", + ) + auth_values = [v for n, v in headers if n.lower() == "authorization"] + self.assertEqual(["Bearer real"], auth_values) + + def test_strips_hop_by_hop(self): + headers = build_forward_headers( + [("Connection", "keep-alive, x-custom"), + ("X-Custom", "should-be-dropped"), + ("Keep-Alive", "300"), + ("Transfer-Encoding", "chunked"), + ("X-Real", "kept")], + auth_scheme="Bearer", + token="t", + upstream_host="x.example", + ) + names = [n.lower() for n, _ in headers] + self.assertNotIn("connection", names) + self.assertNotIn("keep-alive", names) + self.assertNotIn("transfer-encoding", names) + self.assertNotIn("x-custom", names) # listed in Connection: -> hop-by-hop + self.assertIn("x-real", names) + + def test_strips_content_length(self): + # http.client recomputes Content-Length; passing it through + # double-counts and breaks the upstream. 
+ headers = build_forward_headers( + [("Content-Length", "999")], + auth_scheme="Bearer", token="t", upstream_host="x.example", + ) + names = [n.lower() for n, _ in headers] + self.assertNotIn("content-length", names) + + def test_sets_host_to_upstream(self): + headers = build_forward_headers( + [("Host", "cred-proxy:9099")], + auth_scheme="Bearer", token="t", upstream_host="api.anthropic.com", + ) + host_values = [v for n, v in headers if n.lower() == "host"] + self.assertEqual(["api.anthropic.com"], host_values) + + def test_uses_token_scheme(self): + # gitea uses Authorization: token <token>, not Bearer. + headers = build_forward_headers( + [], + auth_scheme="token", token="abc123", upstream_host="gitea.dideric.is", + ) + auth_values = [v for n, v in headers if n.lower() == "authorization"] + self.assertEqual(["token abc123"], auth_values) + + +class TestFilterResponseHeaders(unittest.TestCase): + def test_strips_hop_by_hop_only(self): + out = filter_response_headers([ + ("Content-Type", "text/event-stream"), + ("Connection", "close"), + ("Transfer-Encoding", "chunked"), + ("Cache-Control", "no-cache"), + ]) + names = [n.lower() for n, _ in out] + self.assertIn("content-type", names) + self.assertIn("cache-control", names) + self.assertNotIn("connection", names) + self.assertNotIn("transfer-encoding", names) + + +class TestLoadTokens(unittest.TestCase): + def test_reads_per_route_env(self): + routes = ( + Route("/a/", "https", "x", 443, "", "Bearer", "T_0"), + Route("/b/", "https", "y", 443, "", "Bearer", "T_1"), + ) + out = load_tokens(routes, {"T_0": "val0", "T_1": "val1"}) + self.assertEqual({"T_0": "val0", "T_1": "val1"}, out) + + def test_missing_env_yields_empty_string(self): + # The handler returns 500 at request time rather than the + # server refusing to start. This keeps the operator's failure + # signal in the cred-proxy's logs.
+ routes = (Route("/a/", "https", "x", 443, "", "Bearer", "T_0"),) + out = load_tokens(routes, {}) + self.assertEqual({"T_0": ""}, out) + + +if __name__ == "__main__": + unittest.main() -- 2.52.0 From 61e334c1b80c9ad9c65677108010aec1bafae30b Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 16:07:52 -0400 Subject: [PATCH 07/24] feat(cred_proxy): add DockerCredProxy concrete lifecycle (PRD 0010) Mirrors DockerGitGate: build the image, docker create on the internal network with --network-alias cred-proxy, docker cp the routes.json into /run/cred-proxy/, attach the egress network, docker start. stop() is idempotent. Token values flow host env -> subprocess env -> sidecar env via docker create -e NAME (no =VALUE on argv). The resolver fails early with a clear pointer at the missing host env var name if any TokenRef is unset. Helpers (cred_proxy_container_name, cred_proxy_url) are agent-side stable: the URL uses the network alias, not the slugged container name, so the provisioner can write a fixed http://cred-proxy:9099/ URL regardless of which bottle is running. --- claude_bottle/backend/docker/cred_proxy.py | 209 +++++++++++++++++++++ tests/unit/test_docker_cred_proxy.py | 82 ++++++++ 2 files changed, 291 insertions(+) create mode 100644 claude_bottle/backend/docker/cred_proxy.py create mode 100644 tests/unit/test_docker_cred_proxy.py diff --git a/claude_bottle/backend/docker/cred_proxy.py b/claude_bottle/backend/docker/cred_proxy.py new file mode 100644 index 0000000..4e28bab --- /dev/null +++ b/claude_bottle/backend/docker/cred_proxy.py @@ -0,0 +1,209 @@ +"""DockerCredProxy — the Docker-specific lifecycle for the per-bottle +cred-proxy sidecar (PRD 0010). 
Inherits the platform-agnostic prepare +step (upstream lift + routes.json render + token-env-map derivation) +from `CredProxy`.""" + +from __future__ import annotations + +import os +import subprocess +from pathlib import Path + +from ...cred_proxy import ( + CredProxy, + CredProxyPlan, + cred_proxy_resolve_token_values, +) +from ...log import die, info, warn +from . import util as docker_mod + + +CRED_PROXY_IMAGE = os.environ.get( + "CLAUDE_BOTTLE_CRED_PROXY_IMAGE", + "claude-bottle-cred-proxy:latest", +) + +CRED_PROXY_DOCKERFILE = "Dockerfile.cred-proxy" + +# Listening port inside the sidecar. The agent dials cred-proxy on +# this port; surfaced as a constant so the provisioner and tests can +# both reference it. +CRED_PROXY_PORT = int(os.environ.get("CLAUDE_BOTTLE_CRED_PROXY_PORT", "9099")) + +# DNS name agents use to reach the sidecar. Attached as a +# --network-alias on the internal docker network so the URL the +# provisioner writes into the agent's environ is stable across +# bottles (the container name carries the per-bottle slug; the alias +# does not). +CRED_PROXY_HOSTNAME = "cred-proxy" + +# In-container path the proxy server reads its route table from. +# Pre-created in Dockerfile.cred-proxy so `docker cp` can drop the +# file directly. +CRED_PROXY_ROUTES_IN_CONTAINER = "/run/cred-proxy/routes.json" + +# Repo root, for `docker build` context. Resolved from this file's +# location: claude_bottle/backend/docker/cred_proxy.py → repo root. +_REPO_DIR = str(Path(__file__).resolve().parent.parent.parent.parent) + + +def cred_proxy_container_name(slug: str) -> str: + return f"claude-bottle-cred-proxy-{slug}" + + +def cred_proxy_url() -> str: + """Base URL the agent dials. 
Stable across bottles because the + sidecar attaches `--network-alias cred-proxy` on the internal + network; the container name (which carries the slug) is not + referenced by agent-side config.""" + return f"http://{CRED_PROXY_HOSTNAME}:{CRED_PROXY_PORT}" + + +def build_cred_proxy_image() -> None: + """Build the cred-proxy image from `Dockerfile.cred-proxy`. + Called by `DockerCredProxy.start`; exposed at module level so + integration tests can build it without running the full launch + pipeline.""" + docker_mod.build_image(CRED_PROXY_IMAGE, _REPO_DIR, dockerfile=CRED_PROXY_DOCKERFILE) + + +class DockerCredProxy(CredProxy): + """Brings the cred-proxy sidecar up and down via Docker.""" + + def start(self, plan: CredProxyPlan) -> str: + """Boot the cred-proxy sidecar: + 1. Resolve every host TokenRef env var into a concrete + value. Fails early if any are unset or empty. + 2. Build the cred-proxy image (no-op when cache is hot). + 3. `docker create` on the internal network with + `--network-alias cred-proxy` and one `-e CRED_PROXY_TOKEN_N` + flag per token slot. The values arrive via subprocess env, so + they never land on argv. + 4. `docker cp` the routes.json into the container. + 5. Attach to the per-agent egress network so the proxy can + reach the real upstream over HTTPS. + 6. `docker start`. + Returns the container name (the target passed to `.stop`).""" + if not plan.upstreams: + die("DockerCredProxy.start called with no upstreams; caller should skip") + if not plan.internal_network or not plan.egress_network: + die( + "DockerCredProxy.start: internal_network / egress_network must be " + "populated on the plan before start" + ) + if not plan.routes_path.is_file(): + die( + f"cred-proxy routes file missing at {plan.routes_path}; " + f"CredProxy.prepare must run first" + ) + + # Resolve host env vars into concrete values. This must + # happen at start time (not prepare) — the values flow into + # the sidecar's environ via subprocess env. The plan never + # holds them.
+ token_values = cred_proxy_resolve_token_values(plan.token_env_map, dict(os.environ)) + + build_cred_proxy_image() + + name = cred_proxy_container_name(plan.slug) + info(f"starting cred-proxy sidecar {name} on network {plan.internal_network}") + + create_args = [ + "docker", "create", + "--name", name, + "--network", plan.internal_network, + "--network-alias", CRED_PROXY_HOSTNAME, + ] + # One -e flag per token slot; values arrive via subprocess env. + # docker create with `-e NAME` (no =VALUE) reads NAME from the + # current process env at create time. We pass `env=child_env` + # to subprocess.run so the value comes from token_values, not + # the host's os.environ directly — keeps the resolver in one + # place and lets cred_proxy_resolve_token_values surface + # missing-env errors with a clear hint. + for token_env in sorted(plan.token_env_map.keys()): + create_args.extend(["-e", token_env]) + create_args.append(CRED_PROXY_IMAGE) + + child_env: dict[str, str] = {**os.environ, **token_values} + + if subprocess.run( + create_args, + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + env=child_env, + check=False, + ).returncode != 0: + die(f"failed to create cred-proxy sidecar {name}") + + cp_result = subprocess.run( + ["docker", "cp", str(plan.routes_path), + f"{name}:{CRED_PROXY_ROUTES_IN_CONTAINER}"], + capture_output=True, + text=True, + check=False, + ) + if cp_result.returncode != 0: + subprocess.run( + ["docker", "rm", "-f", name], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ) + die( + f"failed to copy routes.json into {name}: " + f"{cp_result.stderr.strip()}" + ) + + if subprocess.run( + ["docker", "network", "connect", plan.egress_network, name], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ).returncode != 0: + subprocess.run( + ["docker", "rm", "-f", name], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ) + die( + f"failed to attach cred-proxy sidecar {name} to 
egress network " + f"{plan.egress_network}" + ) + + if subprocess.run( + ["docker", "start", name], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ).returncode != 0: + subprocess.run( + ["docker", "rm", "-f", name], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ) + die(f"failed to start cred-proxy sidecar {name}") + + return name + + def stop(self, target: str) -> None: + """Idempotent: missing container is success. `target` is the + container name returned by `.start`.""" + if subprocess.run( + ["docker", "inspect", target], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ).returncode == 0: + if subprocess.run( + ["docker", "rm", "-f", target], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ).returncode != 0: + warn( + f"failed to remove cred-proxy sidecar {target}; " + f"clean up with 'docker rm -f {target}'" + ) diff --git a/tests/unit/test_docker_cred_proxy.py b/tests/unit/test_docker_cred_proxy.py new file mode 100644 index 0000000..f292996 --- /dev/null +++ b/tests/unit/test_docker_cred_proxy.py @@ -0,0 +1,82 @@ +"""Unit: DockerCredProxy helpers + early-exit guards (PRD 0010). 
+ +The full docker lifecycle is exercised by integration tests; here we +cover the pure helpers and the validation checks `.start` runs +before touching docker.""" + +import unittest +from pathlib import Path + +from claude_bottle.backend.docker.cred_proxy import ( + CRED_PROXY_HOSTNAME, + CRED_PROXY_PORT, + DockerCredProxy, + cred_proxy_container_name, + cred_proxy_url, +) +from claude_bottle.cred_proxy import CredProxyPlan, CredProxyUpstream +from claude_bottle.log import Die + + +def _empty_plan(**overrides): + base = { + "slug": "demo", + "routes_path": Path("/nonexistent"), + "upstreams": (), + "token_env_map": {}, + "internal_network": "", + "egress_network": "", + } + base.update(overrides) + return CredProxyPlan(**base) + + +class TestNameAndUrl(unittest.TestCase): + def test_container_name_carries_slug(self): + self.assertEqual("claude-bottle-cred-proxy-demo", + cred_proxy_container_name("demo")) + + def test_url_uses_alias_not_container_name(self): + # The URL agents dial is stable across bottles — the slug + # never appears in it. That's the whole point of attaching + # --network-alias cred-proxy on the internal network. 
+ self.assertEqual(f"http://{CRED_PROXY_HOSTNAME}:{CRED_PROXY_PORT}", + cred_proxy_url()) + + +class TestStartGuards(unittest.TestCase): + def setUp(self): + self.proxy = DockerCredProxy() + + def test_empty_upstreams_dies(self): + with self.assertRaises(Die): + self.proxy.start(_empty_plan()) + + def test_missing_internal_network_dies(self): + upstream = CredProxyUpstream( + kind="anthropic", path="/anthropic/", + upstream="https://api.anthropic.com", + auth_scheme="Bearer", token_env="CRED_PROXY_TOKEN_0", + token_ref="T", + ) + with self.assertRaises(Die): + self.proxy.start(_empty_plan(upstreams=(upstream,))) + + def test_missing_routes_file_dies(self): + upstream = CredProxyUpstream( + kind="anthropic", path="/anthropic/", + upstream="https://api.anthropic.com", + auth_scheme="Bearer", token_env="CRED_PROXY_TOKEN_0", + token_ref="T", + ) + with self.assertRaises(Die): + self.proxy.start(_empty_plan( + upstreams=(upstream,), + internal_network="net-x", + egress_network="egress-x", + routes_path=Path("/tmp/cred-proxy-test-does-not-exist.json"), + )) + + +if __name__ == "__main__": + unittest.main() -- 2.52.0 From b3529b27a556ea6da04048ae2299f47674aa9349 Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 16:11:04 -0400 Subject: [PATCH 08/24] feat(cred_proxy): add agent-side provisioner (PRD 0010) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit provision_cred_proxy(plan, target) drops: - ~/.npmrc with registry= pointing at /npm/ on the proxy - ~/.gitconfig insteadOf rules for github (https://github.com/) and per-gitea hosts, appended after provision_git's git-gate rules - ~/.config/tea/config.yml with a logins: entry per declared gitea URL, pointing at /gitea// on the proxy Renderers are pure and unit-tested. The dispatcher reads plan.cred_proxy_plan.upstreams, which the backend wiring (next commit) populates on DockerBottlePlan. 
ANTHROPIC_BASE_URL is deliberately *not* a dotfile — it goes into the agent's docker run -e env so claude sees it from process start. --- .../backend/docker/provision/cred_proxy.py | 217 ++++++++++++++++++ tests/unit/test_provision_cred_proxy.py | 109 +++++++++ 2 files changed, 326 insertions(+) create mode 100644 claude_bottle/backend/docker/provision/cred_proxy.py create mode 100644 tests/unit/test_provision_cred_proxy.py diff --git a/claude_bottle/backend/docker/provision/cred_proxy.py b/claude_bottle/backend/docker/provision/cred_proxy.py new file mode 100644 index 0000000..5fecde9 --- /dev/null +++ b/claude_bottle/backend/docker/provision/cred_proxy.py @@ -0,0 +1,217 @@ +"""Cred-proxy provisioning inside a running Docker bottle (PRD 0010). + +Writes the agent-side configuration that points each tool at the +per-bottle cred-proxy sidecar: + + - ~/.npmrc — `registry=` pointing at /npm/ + - ~/.gitconfig (appended) — `insteadOf` rules for the + github / gitea hosts the bottle + declared a token for + - ~/.config/tea/config.yml — per-gitea login pointing at + /gitea// + +The ANTHROPIC_BASE_URL env var is set at `docker run -e` time by the +backend's launch step, not here — it has to be in the agent's environ +before claude starts, and there is no point in writing it to a dotfile +the agent would have to source. See `prepare.py` for that. +""" + +from __future__ import annotations + +import os +import subprocess +from pathlib import Path + +from ....cred_proxy import CredProxyUpstream +from ....log import info +from .. import util as docker_mod +from ..bottle_plan import DockerBottlePlan +from ..cred_proxy import cred_proxy_url + + +def provision_cred_proxy(plan: DockerBottlePlan, target: str) -> None: + """Drop the agent-side dotfiles for each declared cred-proxy + route. 
No-op when the bottle has no tokens.""" + upstreams = plan.cred_proxy_plan.upstreams + if not upstreams: + return + _provision_npmrc(plan, target, upstreams) + _provision_gitconfig(plan, target, upstreams) + _provision_tea_config(plan, target, upstreams) + + +# --- npm -------------------------------------------------------------------- + + +def render_npmrc(upstreams: tuple[CredProxyUpstream, ...]) -> str: + """Render `~/.npmrc` content. No-op (empty string) when no npm + route is declared, so callers can branch on emptiness. + + The proxy strips inbound Authorization and injects its own — the + npmrc deliberately carries no `_authToken`. The registry alone + is enough.""" + for u in upstreams: + if u.kind == "npm": + return f"registry={cred_proxy_url()}{u.path}\n" + return "" + + +def _provision_npmrc( + plan: DockerBottlePlan, + target: str, + upstreams: tuple[CredProxyUpstream, ...], +) -> None: + content = render_npmrc(upstreams) + if not content: + return + container_home = os.environ.get("CLAUDE_BOTTLE_CONTAINER_HOME", "/home/node") + container_npmrc = f"{container_home}/.npmrc" + npmrc = plan.stage_dir / "agent_npmrc" + npmrc.write_text(content) + npmrc.chmod(0o600) + info(f"writing {container_npmrc} (cred-proxy npm registry)") + subprocess.run( + ["docker", "cp", str(npmrc), f"{target}:{container_npmrc}"], + stdout=subprocess.DEVNULL, + check=True, + ) + docker_mod.docker_exec_root(target, ["chown", "node:node", container_npmrc]) + docker_mod.docker_exec_root(target, ["chmod", "644", container_npmrc]) + + +# --- git config ------------------------------------------------------------- + + +def render_cred_proxy_gitconfig(upstreams: tuple[CredProxyUpstream, ...]) -> str: + """Render the `~/.gitconfig` fragment for cred-proxy insteadOf + rewrites. Empty string when no github / gitea routes are declared. + + github expands to two rewrites: https://github.com/... → /gh-git/... 
+ (the git transport endpoint), and the agent's git client reaches + api.github.com over the same proxy via the /gh-api/ route, but + that's used by tools that call the GitHub API directly (gh, tea, + octokit) rather than `git` itself. + + Gitea entries get one rewrite per declared host, pointing at + /gitea//. The path component scopes the credential + so multiple gitea instances coexist on one proxy.""" + rules: list[str] = [] + for u in upstreams: + if u.kind == "github" and u.path == "/gh-git/": + rules.append( + f'[url "{cred_proxy_url()}/gh-git/"]\n' + f"\tinsteadOf = https://github.com/\n" + ) + elif u.kind == "gitea": + # u.upstream is the configured gitea URL (e.g. + # https://gitea.dideric.is) and u.path is /gitea//. + rules.append( + f'[url "{cred_proxy_url()}{u.path}"]\n' + f"\tinsteadOf = {u.upstream}/\n" + ) + if not rules: + return "" + return ( + "# claude-bottle cred-proxy (PRD 0010): rewrite https:/// to\n" + "# the per-bottle cred-proxy sidecar, which holds the upstream\n" + "# credential and injects the Authorization header.\n" + + "".join(rules) + ) + + +def _provision_gitconfig( + plan: DockerBottlePlan, + target: str, + upstreams: tuple[CredProxyUpstream, ...], +) -> None: + """Append the cred-proxy insteadOf rules to ~/.gitconfig. Runs + after `provision_git`, so any git-gate rules already live in the + file; we append rather than overwrite.""" + content = render_cred_proxy_gitconfig(upstreams) + if not content: + return + container_home = os.environ.get("CLAUDE_BOTTLE_CONTAINER_HOME", "/home/node") + container_gitconfig = f"{container_home}/.gitconfig" + info(f"appending cred-proxy insteadOf rules to {container_gitconfig}") + # Use `tee -a` over stdin so the content never lands on argv and the + # append is atomic from the agent's perspective. `tee` runs as the + # node user (the default in the container) so ownership is preserved. 
+ result = subprocess.run( + ["docker", "exec", "-i", target, "tee", "-a", container_gitconfig], + input=content, + text=True, + capture_output=True, + check=False, + ) + if result.returncode != 0: + # Fall back to root-tee in case ~/.gitconfig didn't exist as the + # node user yet (no git-gate rules were written). The chown + # below makes ownership consistent. + result_root = subprocess.run( + ["docker", "exec", "-i", "-u", "0", target, + "tee", "-a", container_gitconfig], + input=content, + text=True, + capture_output=True, + check=True, + ) + _ = result_root # silence unused + docker_mod.docker_exec_root(target, ["chown", "node:node", container_gitconfig]) + docker_mod.docker_exec_root(target, ["chmod", "644", container_gitconfig]) + + +# --- tea -------------------------------------------------------------------- + + +def render_tea_config(upstreams: tuple[CredProxyUpstream, ...]) -> str: + """Render `~/.config/tea/config.yml`. One `logins:` entry per + gitea route, pointing at the cred-proxy. The proxy substitutes + the real token; the value in `token:` here is a placeholder and + is replaced by the proxy on every request, but `tea` won't make + calls without a non-empty token field.""" + giteas = [u for u in upstreams if u.kind == "gitea"] + if not giteas: + return "" + lines = ["logins:"] + for u in giteas: + # Derive a stable login name from the host (the part of the + # path between /gitea/ and the trailing /). 
+ host = u.path[len("/gitea/"):].rstrip("/") + lines.extend([ + f"- name: {host}", + f" url: {cred_proxy_url()}{u.path}", + " token: cred-proxy-placeholder", + " default: false", + " ssh_host: \"\"", + " ssh_key: \"\"", + " insecure: false", + ]) + return "\n".join(lines) + "\n" + + +def _provision_tea_config( + plan: DockerBottlePlan, + target: str, + upstreams: tuple[CredProxyUpstream, ...], +) -> None: + content = render_tea_config(upstreams) + if not content: + return + container_home = os.environ.get("CLAUDE_BOTTLE_CONTAINER_HOME", "/home/node") + container_tea = f"{container_home}/.config/tea/config.yml" + cfg = plan.stage_dir / "agent_tea_config.yml" + cfg.write_text(content) + cfg.chmod(0o600) + info(f"writing {container_tea} ({len([u for u in upstreams if u.kind == 'gitea'])} gitea login(s))") + docker_mod.docker_exec_root( + target, ["mkdir", "-p", str(Path(container_tea).parent)] + ) + subprocess.run( + ["docker", "cp", str(cfg), f"{target}:{container_tea}"], + stdout=subprocess.DEVNULL, + check=True, + ) + docker_mod.docker_exec_root(target, [ + "chown", "-R", "node:node", str(Path(container_tea).parent), + ]) + docker_mod.docker_exec_root(target, ["chmod", "600", container_tea]) diff --git a/tests/unit/test_provision_cred_proxy.py b/tests/unit/test_provision_cred_proxy.py new file mode 100644 index 0000000..dbf7730 --- /dev/null +++ b/tests/unit/test_provision_cred_proxy.py @@ -0,0 +1,109 @@ +"""Unit: cred-proxy agent-side provisioner renderers (PRD 0010). 
+ +The docker cp / docker exec side effects are exercised by integration +tests; these unit tests cover the pure render functions.""" + +import unittest + +from claude_bottle.backend.docker.provision.cred_proxy import ( + render_cred_proxy_gitconfig, + render_npmrc, + render_tea_config, +) +from claude_bottle.cred_proxy import cred_proxy_upstreams_for_bottle +from claude_bottle.manifest import Manifest + + +def _bottle(tokens): + return Manifest.from_json_obj({ + "bottles": {"dev": {"tokens": tokens}}, + "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, + }).bottles["dev"] + + +def _upstreams(tokens): + return cred_proxy_upstreams_for_bottle(_bottle(tokens)) + + +class TestRenderNpmrc(unittest.TestCase): + def test_empty_when_no_npm_route(self): + self.assertEqual("", render_npmrc(_upstreams([]))) + self.assertEqual("", render_npmrc(_upstreams([ + {"Kind": "anthropic", "TokenRef": "A"}, + ]))) + + def test_writes_registry_line(self): + out = render_npmrc(_upstreams([ + {"Kind": "npm", "TokenRef": "NPM_TOKEN"}, + ])) + self.assertEqual("registry=http://cred-proxy:9099/npm/\n", out) + + def test_omits_authtoken(self): + # The proxy injects Authorization at request time. The npmrc + # deliberately carries no _authToken — a stale token there + # would just get stripped, but it also creates the false + # impression that the agent holds a credential. 
+ out = render_npmrc(_upstreams([ + {"Kind": "npm", "TokenRef": "NPM_TOKEN"}, + ])) + self.assertNotIn("_authToken", out) + self.assertNotIn("NPM_TOKEN", out) + + +class TestRenderGitconfig(unittest.TestCase): + def test_empty_when_no_github_or_gitea(self): + self.assertEqual("", render_cred_proxy_gitconfig(_upstreams([ + {"Kind": "anthropic", "TokenRef": "A"}, + {"Kind": "npm", "TokenRef": "N"}, + ]))) + + def test_github_writes_https_insteadof(self): + out = render_cred_proxy_gitconfig(_upstreams([ + {"Kind": "github", "TokenRef": "GITHUB_TOKEN"}, + ])) + self.assertIn('[url "http://cred-proxy:9099/gh-git/"]', out) + self.assertIn("insteadOf = https://github.com/", out) + + def test_gitea_writes_per_host_insteadof(self): + out = render_cred_proxy_gitconfig(_upstreams([ + {"Kind": "gitea", "TokenRef": "GITEA_TOKEN", + "Url": "https://gitea.dideric.is"}, + ])) + self.assertIn('[url "http://cred-proxy:9099/gitea/gitea.dideric.is/"]', out) + self.assertIn("insteadOf = https://gitea.dideric.is/", out) + + def test_two_giteas_yield_two_rules(self): + out = render_cred_proxy_gitconfig(_upstreams([ + {"Kind": "gitea", "TokenRef": "G1", + "Url": "https://gitea.dideric.is"}, + {"Kind": "gitea", "TokenRef": "G2", + "Url": "https://gitea.example.com"}, + ])) + self.assertEqual(2, out.count("insteadOf")) + self.assertIn("gitea.dideric.is/", out) + self.assertIn("gitea.example.com/", out) + + +class TestRenderTeaConfig(unittest.TestCase): + def test_empty_when_no_gitea(self): + self.assertEqual("", render_tea_config(_upstreams([ + {"Kind": "github", "TokenRef": "G"}, + ]))) + + def test_single_gitea_login_block(self): + out = render_tea_config(_upstreams([ + {"Kind": "gitea", "TokenRef": "GITEA_TOKEN", + "Url": "https://gitea.dideric.is"}, + ])) + self.assertIn("logins:", out) + self.assertIn("- name: gitea.dideric.is", out) + self.assertIn("url: http://cred-proxy:9099/gitea/gitea.dideric.is/", out) + # Placeholder token, not the host env var name (which is not a + # secret 
but also not useful) or the real value (which the + # provisioner does not have). + self.assertIn("token: cred-proxy-placeholder", out) + self.assertNotIn("GITEA_TOKEN", out) + + +if __name__ == "__main__": + unittest.main() -- 2.52.0 From 8334f5126869bb065875e096e2aff0d56967376d Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 16:20:42 -0400 Subject: [PATCH 09/24] feat(cred_proxy): wire DockerCredProxy through backend (PRD 0010) - DockerBottleBackend instantiates DockerCredProxy alongside pipelock and git-gate; threads it through prepare and launch. - DockerBottlePlan gains cred_proxy_plan; preflight rendering shows the declared kinds + TokenRefs and to_dict emits a cred_proxy array matching the routing table. - prepare.py: when bottle.tokens has an anthropic entry, route the agent at the proxy via ANTHROPIC_BASE_URL, drop the agent-side CLAUDE_CODE_OAUTH_TOKEN forward (the token goes to the sidecar's environ instead, set a non-secret placeholder so claude-code's startup check passes), and default the telemetry-off env vars. - launch.py: bring up the cred-proxy sidecar in ExitStack before the agent container so DNS resolution for `cred-proxy` succeeds on the agent's first call. - backend/__init__.py: add provision_cred_proxy to the provision template (runs after provision_git so it can append to ~/.gitconfig). - bottle_plan _view: env_names is derived from the forwarded_env dict, so the preflight reflects the PRD 0010 switch without ad-hoc branching on spec.forward_oauth_token. 
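
For reference, the environ switch that prepare.py performs can be sketched as follows. This is a simplified, standalone illustration rather than the code in this patch: `agent_forwarded_env` and its parameters are hypothetical names, and the proxy URL assumes the default `cred-proxy` alias on port 9099.

```python
from __future__ import annotations

# Assumed default: the sidecar's network alias and listening port.
PROXY_BASE = "http://cred-proxy:9099"


def agent_forwarded_env(
    has_anthropic_token: bool,
    forward_oauth_token: bool,
    host_oauth_token: str = "",
) -> dict[str, str]:
    """Hypothetical helper: decide which names reach the agent's environ."""
    env: dict[str, str] = {}
    if forward_oauth_token and not has_anthropic_token:
        # Pre-PRD 0010 path: the real token lands in the agent's environ.
        env["CLAUDE_CODE_OAUTH_TOKEN"] = host_oauth_token
    if has_anthropic_token:
        # Proxy path: the agent gets a URL plus a non-secret placeholder;
        # the real token is held by the sidecar instead.
        env["ANTHROPIC_BASE_URL"] = f"{PROXY_BASE}/anthropic"
        env["CLAUDE_CODE_OAUTH_TOKEN"] = "cred-proxy-placeholder"
        env.setdefault("CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC", "1")
        env.setdefault("DISABLE_ERROR_REPORTING", "1")
    return env
```

With an anthropic entry declared, the host token never appears in the returned dict, which is the property the preflight's env_names listing is meant to reflect.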
--- claude_bottle/backend/__init__.py | 18 ++++++++--- claude_bottle/backend/docker/backend.py | 8 +++++ claude_bottle/backend/docker/bottle_plan.py | 35 +++++++++++++++++++-- claude_bottle/backend/docker/launch.py | 17 ++++++++++ claude_bottle/backend/docker/prepare.py | 32 ++++++++++++++++--- 5 files changed, 98 insertions(+), 12 deletions(-) diff --git a/claude_bottle/backend/__init__.py b/claude_bottle/backend/__init__.py index ba677b6..cb04496 100644 --- a/claude_bottle/backend/__init__.py +++ b/claude_bottle/backend/__init__.py @@ -214,15 +214,17 @@ class BottleBackend(ABC, Generic[PlanT, CleanupT]): decide whether to add --append-system-prompt-file to claude's argv. - Default orchestration: ca → prompt → skills → git. CA install - runs first so the agent's trust store is rebuilt before - anything inside the agent makes a TLS call. Subclasses - typically don't override this; they implement the sub-methods - below.""" + Default orchestration: ca → prompt → skills → git → + cred_proxy. CA install runs first so the agent's trust store + is rebuilt before anything inside the agent makes a TLS call. + cred_proxy runs last because it appends to ~/.gitconfig (which + provision_git writes). Subclasses typically don't override + this; they implement the sub-methods below.""" self.provision_ca(plan, target) prompt_path = self.provision_prompt(plan, target) self.provision_skills(plan, target) self.provision_git(plan, target) + self.provision_cred_proxy(plan, target) return prompt_path def provision_ca(self, plan: PlanT, target: str) -> None: @@ -251,6 +253,12 @@ class BottleBackend(ABC, Generic[PlanT, CleanupT]): """Copy the host's cwd `.git` directory into the running bottle if the user requested --cwd. No-op otherwise.""" + def provision_cred_proxy(self, plan: PlanT, target: str) -> None: + """Drop the cred-proxy agent-side dotfiles (.npmrc, + .gitconfig insteadOf, ~/.config/tea/config.yml) per PRD 0010. 
+ Default impl is a no-op for backends that don't yet support + the cred-proxy sidecar; the Docker backend overrides.""" + @abstractmethod def prepare_cleanup(self) -> CleanupT: """Enumerate orphaned resources from previous bottles. No side diff --git a/claude_bottle/backend/docker/backend.py b/claude_bottle/backend/docker/backend.py index 55baa8b..3ea0c0e 100644 --- a/claude_bottle/backend/docker/backend.py +++ b/claude_bottle/backend/docker/backend.py @@ -23,9 +23,11 @@ from . import prepare as _prepare from .bottle import DockerBottle from .bottle_cleanup_plan import DockerBottleCleanupPlan from .bottle_plan import DockerBottlePlan +from .cred_proxy import DockerCredProxy from .git_gate import DockerGitGate from .pipelock import DockerPipelockProxy from .provision import ca as _ca +from .provision import cred_proxy as _cred_proxy from .provision import git as _git from .provision import prompt as _prompt from .provision import skills as _skills @@ -40,6 +42,7 @@ class DockerBottleBackend(BottleBackend["DockerBottlePlan", "DockerBottleCleanup def __init__(self) -> None: self._proxy = DockerPipelockProxy() self._git_gate = DockerGitGate() + self._cred_proxy = DockerCredProxy() def _resolve_plan(self, spec: BottleSpec, *, stage_dir: Path) -> DockerBottlePlan: return _prepare.resolve_plan( @@ -47,6 +50,7 @@ class DockerBottleBackend(BottleBackend["DockerBottlePlan", "DockerBottleCleanup stage_dir=stage_dir, proxy=self._proxy, git_gate=self._git_gate, + cred_proxy=self._cred_proxy, ) @contextmanager @@ -55,6 +59,7 @@ class DockerBottleBackend(BottleBackend["DockerBottlePlan", "DockerBottleCleanup plan, proxy=self._proxy, git_gate=self._git_gate, + cred_proxy=self._cred_proxy, provision=self.provision, ) as bottle: yield bottle @@ -71,6 +76,9 @@ class DockerBottleBackend(BottleBackend["DockerBottlePlan", "DockerBottleCleanup def provision_git(self, plan: DockerBottlePlan, target: str) -> None: _git.provision_git(plan, target) + def provision_cred_proxy(self, plan: 
DockerBottlePlan, target: str) -> None: + _cred_proxy.provision_cred_proxy(plan, target) + def prepare_cleanup(self) -> DockerBottleCleanupPlan: return _cleanup.prepare_cleanup() diff --git a/claude_bottle/backend/docker/bottle_plan.py b/claude_bottle/backend/docker/bottle_plan.py index af635de..c3f2af5 100644 --- a/claude_bottle/backend/docker/bottle_plan.py +++ b/claude_bottle/backend/docker/bottle_plan.py @@ -11,6 +11,7 @@ import sys from dataclasses import dataclass, field from pathlib import Path +from ...cred_proxy import CredProxyPlan from ...git_gate import GitGatePlan from ...log import info from ...manifest import Agent, Bottle @@ -51,6 +52,7 @@ class DockerBottlePlan(BottlePlan): prompt_file: Path proxy_plan: PipelockProxyPlan git_gate_plan: GitGatePlan + cred_proxy_plan: CredProxyPlan allowlist_summary: str use_runsc: bool @@ -59,9 +61,13 @@ class DockerBottlePlan(BottlePlan): manifest = spec.manifest agent = manifest.agents[spec.agent_name] bottle = manifest.bottle_for(spec.agent_name) - env_names = list(bottle.env.keys()) - if spec.forward_oauth_token: - env_names.append("CLAUDE_CODE_OAUTH_TOKEN") + # The agent sees the union of literal env names (rendered into + # --env-file) and forwarded env names (`-e NAME` with the value + # arriving via subprocess env). The forwarded set already + # reflects PRD 0010's switch — when cred-proxy holds the + # anthropic token, CLAUDE_CODE_OAUTH_TOKEN is absent and + # ANTHROPIC_BASE_URL is present. 
+ env_names = sorted(set(bottle.env.keys()) | set(self.forwarded_env.keys())) return _PlanView( agent=agent, bottle=bottle, @@ -100,6 +106,19 @@ class DockerBottlePlan(BottlePlan): info(f" git gate : {'; '.join(git_lines)}") else: info(" git remotes : (none)") + if self.cred_proxy_plan.upstreams: + kinds: list[str] = [] + seen: set[str] = set() + for u in self.cred_proxy_plan.upstreams: + key = u.kind if u.kind != "gitea" else f"gitea ({u.upstream})" + if key in seen: + continue + seen.add(key) + kinds.append(key) + refs = sorted({u.token_ref for u in self.cred_proxy_plan.upstreams}) + info(f" cred-proxy : {', '.join(kinds)}; tokens: {', '.join(refs)}") + else: + info(" cred-proxy : (none)") info(f" egress : {self.allowlist_summary}") info(" tls intercept : pipelock (per-bottle ephemeral CA, generated at launch)") info( @@ -132,6 +151,16 @@ class DockerBottlePlan(BottlePlan): } for u in self.git_gate_plan.upstreams ], + "cred_proxy": [ + { + "kind": u.kind, + "path": u.path, + "upstream": u.upstream, + "auth_scheme": u.auth_scheme, + "token_ref": u.token_ref, + } + for u in self.cred_proxy_plan.upstreams + ], "egress": { "host_count": len(hosts), "hosts": hosts, diff --git a/claude_bottle/backend/docker/launch.py b/claude_bottle/backend/docker/launch.py index c1575bc..a274333 100644 --- a/claude_bottle/backend/docker/launch.py +++ b/claude_bottle/backend/docker/launch.py @@ -22,6 +22,7 @@ from . import network as network_mod from . 
import util as docker_mod from .bottle import DockerBottle from .bottle_plan import DockerBottlePlan +from .cred_proxy import DockerCredProxy from .git_gate import DockerGitGate from .pipelock import DockerPipelockProxy, pipelock_proxy_url, pipelock_tls_init from .provision.ca import AGENT_CA_BUNDLE, AGENT_CA_PATH @@ -37,6 +38,7 @@ def launch( *, proxy: DockerPipelockProxy, git_gate: DockerGitGate, + cred_proxy: DockerCredProxy, provision: Callable[[DockerBottlePlan, str], str | None], ) -> Generator[DockerBottle, None, None]: """Build, launch, and provision a Docker bottle. Teardown on exit. @@ -102,6 +104,21 @@ def launch( git_gate_name = git_gate.start(plan.git_gate_plan) stack.callback(git_gate.stop, git_gate_name) + # Cred-proxy (PRD 0010). One sidecar per bottle when + # bottle.tokens declares any kind. Must come up before the + # agent so DNS resolution for `cred-proxy` succeeds on the + # agent's first call; tokens flow from the host env into the + # sidecar's environ, not the agent's. + if plan.cred_proxy_plan.upstreams: + cred_proxy_plan = dataclasses.replace( + plan.cred_proxy_plan, + internal_network=internal_network, + egress_network=egress_network, + ) + plan = dataclasses.replace(plan, cred_proxy_plan=cred_proxy_plan) + cred_proxy_name = cred_proxy.start(plan.cred_proxy_plan) + stack.callback(cred_proxy.stop, cred_proxy_name) + container = _run_agent_container(plan, internal_network) stack.callback(docker_mod.force_remove_container, container) diff --git a/claude_bottle/backend/docker/prepare.py b/claude_bottle/backend/docker/prepare.py index 074d8d7..66d6d76 100644 --- a/claude_bottle/backend/docker/prepare.py +++ b/claude_bottle/backend/docker/prepare.py @@ -19,6 +19,7 @@ from ...log import die from .. import BottleSpec from . 
import util as docker_mod from .bottle_plan import DockerBottlePlan +from .cred_proxy import DockerCredProxy, cred_proxy_url from .git_gate import DockerGitGate from .pipelock import DockerPipelockProxy @@ -29,6 +30,7 @@ def resolve_plan( stage_dir: Path, proxy: DockerPipelockProxy, git_gate: DockerGitGate, + cred_proxy: DockerCredProxy, ) -> DockerBottlePlan: """Resolve Docker-specific names and write scratch files. Trusts that the agent and its skills/git-gate keys are present — @@ -81,14 +83,35 @@ def resolve_plan( proxy_plan = proxy.prepare(bottle, slug, stage_dir) git_gate_plan = git_gate.prepare(bottle, slug, stage_dir) + cred_proxy_plan = cred_proxy.prepare(bottle, slug, stage_dir) resolved = resolve_env(manifest, spec.agent_name) # Everything that should reach the bottle by-name (so its value - # never lands on argv or in env_file) goes into one dict. The - # rename from CLAUDE_BOTTLE_OAUTH_TOKEN to CLAUDE_CODE_OAUTH_TOKEN - # happens here; nothing mutates the host os.environ. + # never lands on argv or in env_file) goes into one dict. Nothing + # mutates the host os.environ. forwarded_env: dict[str, str] = dict(resolved.forwarded) - if spec.forward_oauth_token: + has_anthropic_token = any(t.Kind == "anthropic" for t in bottle.tokens) + if spec.forward_oauth_token and not has_anthropic_token: + # Pre-PRD 0010 behavior: agent reads CLAUDE_CODE_OAUTH_TOKEN + # directly. Still the path when bottle.tokens has no anthropic + # entry; the cred-proxy sidecar holds the token otherwise. forwarded_env["CLAUDE_CODE_OAUTH_TOKEN"] = os.environ["CLAUDE_BOTTLE_OAUTH_TOKEN"] + if has_anthropic_token: + # Point claude-code at the cred-proxy. The sidecar holds the + # OAuth token; the agent's environ does not. + forwarded_env["ANTHROPIC_BASE_URL"] = f"{cred_proxy_url()}/anthropic" + # claude-code refuses to start without *some* credential in + # its env. 
The proxy strips inbound Authorization on every + # request and injects the real one — so a non-secret + # placeholder is sufficient and the SC1 test still holds + # (the placeholder is not a `bottle.tokens[].TokenRef` + # value). Exfiltrating this string is harmless because + # it carries no meaning to api.anthropic.com. + forwarded_env["CLAUDE_CODE_OAUTH_TOKEN"] = "cred-proxy-placeholder" + # Belt-and-braces: turn off telemetry endpoints that don't + # route through ANTHROPIC_BASE_URL (statsig, error reporting). + # PRD 0010 open question default. + forwarded_env.setdefault("CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC", "1") + forwarded_env.setdefault("DISABLE_ERROR_REPORTING", "1") _write_env_file(resolved, env_file) prompt_file.write_text(agent.prompt) @@ -109,6 +132,7 @@ def resolve_plan( prompt_file=prompt_file, proxy_plan=proxy_plan, git_gate_plan=git_gate_plan, + cred_proxy_plan=cred_proxy_plan, allowlist_summary=allowlist_summary, use_runsc=use_runsc, ) -- 2.52.0 From 051896ba4cd1e91780b2fac25b3b1c53902c361a Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 16:22:44 -0400 Subject: [PATCH 10/24] feat(pipelock): auto-allowlist cred-proxy upstream hosts (PRD 0010) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit bottle.tokens declarations contribute their upstream hosts to both pipelock's allowlist (so cred-proxy can reach them) and passthrough_domains (so pipelock doesn't MITM the connection — cred-proxy validates real upstream certs with the system CA bundle). Mapping: anthropic -> api.anthropic.com (already on defaults); github -> api.github.com + github.com; gitea -> the entry's host; npm -> registry.npmjs.org.
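
The mapping above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the code in this patch: `Token`, `FIXED_HOSTS`, and `token_hosts` are hypothetical names standing in for the manifest's token entries and the helper added to pipelock.py.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for one bottle.tokens[] entry."""
    kind: str
    upstream_host: str = ""  # only meaningful for gitea entries


# Kinds whose upstream hosts are fixed; gitea is handled per entry.
FIXED_HOSTS: dict[str, tuple[str, ...]] = {
    "anthropic": ("api.anthropic.com",),
    "github": ("api.github.com", "github.com"),
    "npm": ("registry.npmjs.org",),
}


def token_hosts(tokens: list[Token]) -> list[str]:
    """Return the sorted, deduplicated upstream hosts for the tokens."""
    hosts: set[str] = set()
    for t in tokens:
        if t.kind == "gitea":
            # A gitea entry contributes its own host, if one is set.
            if t.upstream_host:
                hosts.add(t.upstream_host)
        else:
            # Unknown kinds contribute nothing in this sketch.
            hosts.update(FIXED_HOSTS.get(t.kind, ()))
    return sorted(hosts)
```

The same host list feeds both the allowlist and the passthrough list, so a host cannot be allowed for egress yet still be intercepted.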
--- claude_bottle/pipelock.py | 47 ++++++++++- tests/unit/test_pipelock_allowlist.py | 113 ++++++++++++++++++++++---- 2 files changed, 140 insertions(+), 20 deletions(-) diff --git a/claude_bottle/pipelock.py b/claude_bottle/pipelock.py index 867d54f..597fef1 100644 --- a/claude_bottle/pipelock.py +++ b/claude_bottle/pipelock.py @@ -55,8 +55,35 @@ def pipelock_bottle_allowlist(bottle: Bottle) -> list[str]: return list(bottle.egress.allowlist) +def pipelock_token_hosts(bottle: Bottle) -> list[str]: + """Hostnames the cred-proxy sidecar (PRD 0010) talks to upstream + on the agent's behalf. Derived from `bottle.tokens[]`. Returned + sorted+deduped. + + These hosts must be on pipelock's allowlist so cred-proxy's + outbound HTTPS traffic can leave the egress network, and on + pipelock's TLS-passthrough list so pipelock does not MITM them — + cred-proxy validates real upstream certs with the system CA store, + so a pipelock-bumped cert would fail trust.""" + hosts: set[str] = set() + for t in bottle.tokens: + if t.Kind == "github": + hosts.add("api.github.com") + hosts.add("github.com") + elif t.Kind == "gitea": + if t.UpstreamHost: + hosts.add(t.UpstreamHost) + elif t.Kind == "npm": + hosts.add("registry.npmjs.org") + elif t.Kind == "anthropic": + # Already on DEFAULT_ALLOWLIST + DEFAULT_TLS_PASSTHROUGH. + hosts.add("api.anthropic.com") + return sorted(hosts) + + def pipelock_effective_allowlist(bottle: Bottle) -> list[str]: - """Deduplicated union of: baked-in defaults, bottle.egress.allowlist. + """Deduplicated union of: baked-in defaults, bottle.egress.allowlist, + and the cred-proxy upstream hosts derived from bottle.tokens. Sorted for stability. 
Git upstreams declared in `bottle.git` do NOT contribute here — git traffic flows through the per-agent git-gate sidecar (PRD 0008), not pipelock.""" @@ -66,6 +93,22 @@ def pipelock_effective_allowlist(bottle: Bottle) -> list[str]: for h in pipelock_bottle_allowlist(bottle): if h: seen.setdefault(h, None) + for h in pipelock_token_hosts(bottle): + seen.setdefault(h, None) + return sorted(seen.keys()) + + +def pipelock_effective_tls_passthrough(bottle: Bottle) -> list[str]: + """Hostnames pipelock should pass through (no TLS MITM, no body + scan). Default carries the LLM API endpoint (its request bodies + legitimately trip DLP); cred-proxy upstream hosts are added so + cred-proxy's HTTPS client (which trusts only the real CA bundle) + can complete the upstream handshake.""" + seen: dict[str, None] = {} + for h in DEFAULT_TLS_PASSTHROUGH: + seen.setdefault(h, None) + for h in pipelock_token_hosts(bottle): + seen.setdefault(h, None) return sorted(seen.keys()) @@ -135,7 +178,7 @@ def pipelock_build_config( "enabled": True, "ca_cert": ca_cert_path, "ca_key": ca_key_path, - "passthrough_domains": list(DEFAULT_TLS_PASSTHROUGH), + "passthrough_domains": pipelock_effective_tls_passthrough(bottle), } return cfg diff --git a/tests/unit/test_pipelock_allowlist.py b/tests/unit/test_pipelock_allowlist.py index e50d9d6..d5a5cf5 100644 --- a/tests/unit/test_pipelock_allowlist.py +++ b/tests/unit/test_pipelock_allowlist.py @@ -1,36 +1,113 @@ -"""Unit: pipelock_effective_allowlist — the union of baked-in defaults -and bottle.egress.allowlist. Git upstreams declared in bottle.git do not +"""Unit: pipelock_effective_allowlist — the union of baked-in defaults, +bottle.egress.allowlist, and cred-proxy upstream hosts derived from +bottle.tokens (PRD 0010). 
Git upstreams declared in bottle.git do not contribute here; they flow through the per-agent git-gate (PRD 0008).""" import unittest from claude_bottle.manifest import Manifest -from claude_bottle.pipelock import pipelock_effective_allowlist +from claude_bottle.pipelock import ( + pipelock_effective_allowlist, + pipelock_effective_tls_passthrough, + pipelock_token_hosts, +) + + +def _bottle(spec): + return Manifest.from_json_obj({ + "bottles": {"dev": spec}, + "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, + }).bottles["dev"] class TestEffectiveAllowlist(unittest.TestCase): def test_union_and_dedup(self): - manifest = Manifest.from_json_obj({ - "bottles": { - "dev": { - "egress": { - "allowlist": [ - "registry.npmjs.org", - # Duplicate of a baked default; the union - # must dedupe. - "api.anthropic.com", - ], - }, - }, + eff = pipelock_effective_allowlist(_bottle({ + "egress": { + "allowlist": [ + "registry.npmjs.org", + # Duplicate of a baked default; the union must dedupe. 
+ "api.anthropic.com", + ], }, - "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, - }) - eff = pipelock_effective_allowlist(manifest.bottles["dev"]) + })) self.assertIn("api.anthropic.com", eff, "baked default present") self.assertIn("registry.npmjs.org", eff, "egress.allowlist present") self.assertEqual(len(eff), len(set(eff)), "deduplicated") self.assertEqual(eff, sorted(eff), "sorted") +class TestTokenHosts(unittest.TestCase): + def test_github_yields_both_hosts(self): + hosts = pipelock_token_hosts(_bottle({ + "tokens": [{"Kind": "github", "TokenRef": "GH"}], + })) + self.assertEqual(["api.github.com", "github.com"], hosts) + + def test_gitea_yields_configured_host(self): + hosts = pipelock_token_hosts(_bottle({ + "tokens": [{"Kind": "gitea", "TokenRef": "T", + "Url": "https://gitea.dideric.is"}], + })) + self.assertEqual(["gitea.dideric.is"], hosts) + + def test_npm_yields_registry(self): + hosts = pipelock_token_hosts(_bottle({ + "tokens": [{"Kind": "npm", "TokenRef": "N"}], + })) + self.assertEqual(["registry.npmjs.org"], hosts) + + def test_anthropic_yields_api_host(self): + hosts = pipelock_token_hosts(_bottle({ + "tokens": [{"Kind": "anthropic", "TokenRef": "A"}], + })) + self.assertEqual(["api.anthropic.com"], hosts) + + def test_no_tokens_empty(self): + self.assertEqual([], pipelock_token_hosts(_bottle({}))) + + +class TestAllowlistWithTokens(unittest.TestCase): + def test_token_hosts_added_to_allowlist(self): + eff = pipelock_effective_allowlist(_bottle({ + "tokens": [ + {"Kind": "npm", "TokenRef": "N"}, + {"Kind": "github", "TokenRef": "G"}, + ], + })) + self.assertIn("registry.npmjs.org", eff) + self.assertIn("api.github.com", eff) + self.assertIn("github.com", eff) + + def test_gitea_host_added(self): + eff = pipelock_effective_allowlist(_bottle({ + "tokens": [{"Kind": "gitea", "TokenRef": "T", + "Url": "https://gitea.dideric.is"}], + })) + self.assertIn("gitea.dideric.is", eff) + + +class TestTlsPassthrough(unittest.TestCase): + 
def test_default_includes_api_anthropic(self): + passthrough = pipelock_effective_tls_passthrough(_bottle({})) + self.assertEqual(["api.anthropic.com"], passthrough) + + def test_token_hosts_added_to_passthrough(self): + # cred-proxy validates upstream certs with the real CA bundle; + # pipelock must not MITM these or the handshake fails. + passthrough = pipelock_effective_tls_passthrough(_bottle({ + "tokens": [ + {"Kind": "github", "TokenRef": "G"}, + {"Kind": "npm", "TokenRef": "N"}, + {"Kind": "gitea", "TokenRef": "T", + "Url": "https://gitea.dideric.is"}, + ], + })) + for host in ("api.anthropic.com", "api.github.com", "github.com", + "registry.npmjs.org", "gitea.dideric.is"): + self.assertIn(host, passthrough) + self.assertEqual(passthrough, sorted(passthrough), "sorted") + + if __name__ == "__main__": unittest.main() -- 2.52.0 From 07da4366ad9a13408a055fff8bca2387dae809b1 Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 16:29:10 -0400 Subject: [PATCH 11/24] test(cred_proxy): integration tests for header inject + strip (PRD 0010) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Drives DockerCredProxy.start through the production code path against a fake upstream container running on the same egress network. The "agent" is a curl container on the bottle's internal network — same access topology the agent uses in production. Covers PRD 0010 success criteria: - SC3 (the request reaches upstream, header round-trip works) - SC6 (inbound Authorization stripped; the proxy injects the configured token even when the agent tries to smuggle one in) - partial SC2 (cred-proxy reachable by the alias from the internal network) - 404 for unconfigured routes Live-network tests against real Anthropic / GitHub / Gitea / npm upstreams (SC4 and SC5 specifically) are deferred — the fake-upstream shape covers the routing + header layer that's actually under test here. 
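For readers skimming the series, the strip-and-inject behaviour these tests assert can be sketched roughly as below. This is an illustration only, not the code in `claude_bottle/cred_proxy_server.py`: the helper names `strip_route_prefix` and `rewrite_headers` are hypothetical, and the route and token values are borrowed from the test's one-route `routes.json`.

```python
from __future__ import annotations


def strip_route_prefix(path: str, route: str) -> str | None:
    """Hypothetical helper: map an inbound path to the upstream path.

    Returns None when the route does not match; the proxy would
    answer 404 in that case rather than forward anywhere.
    """
    if not path.startswith(route):
        return None
    # "/fake/v1/messages" with route "/fake/" becomes "/v1/messages".
    return "/" + path[len(route):]


def rewrite_headers(
    inbound: list[tuple[str, str]],
    auth_scheme: str,
    token: str,
    upstream_host: str,
) -> list[tuple[str, str]]:
    """Hypothetical helper: build the header list sent upstream.

    Any Authorization or Host the caller supplied is dropped
    (case-insensitively); the configured credential and the
    upstream's Host are added in their place. Other headers pass
    through untouched.
    """
    dropped = {"authorization", "host"}
    outbound = [(k, v) for k, v in inbound if k.lower() not in dropped]
    outbound.append(("Host", upstream_host))
    outbound.append(("Authorization", f"{auth_scheme} {token}"))
    return outbound
```

Under these assumptions, a request for `/fake/v1/messages` carrying `Authorization: Bearer stolen-by-prompt-injection` would go upstream as `/v1/messages` with only the configured `Bearer` value, which is the property the end-to-end test checks at the wire level.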
--- tests/integration/_fake_upstream.py | 91 ++++++ tests/integration/test_cred_proxy_sidecar.py | 295 +++++++++++++++++++ 2 files changed, 386 insertions(+) create mode 100644 tests/integration/_fake_upstream.py create mode 100644 tests/integration/test_cred_proxy_sidecar.py diff --git a/tests/integration/_fake_upstream.py b/tests/integration/_fake_upstream.py new file mode 100644 index 0000000..f5c2264 --- /dev/null +++ b/tests/integration/_fake_upstream.py @@ -0,0 +1,91 @@ +"""A capture-and-echo HTTP server used as a fake upstream behind the +cred-proxy in integration tests. + +Captures the last request's method, path, and headers under +/__last_request (as JSON). Returns a fixed 200 OK with a deterministic +body for every other path. Tests probe /__last_request to assert on +header injection (PRD 0010 SC3/SC6). + +Stdlib-only; runs inside a python:alpine container with a single +bind-mount. +""" + +from __future__ import annotations + +import http.server +import json +import os +import socketserver +import sys +import threading + + +_lock = threading.Lock() +_last_request: dict[str, object] = {} + + +class Handler(http.server.BaseHTTPRequestHandler): + def log_message(self, format: str, *args: object) -> None: + # Quiet — the test reads the capture endpoint, not stderr. + return + + def _capture_and_respond(self) -> None: + # Skip capturing the inspection endpoints so the test's own + # query to /__last_request doesn't overwrite the request it + # came in to inspect. 
+ if not self.path.startswith("/__"): + with _lock: + global _last_request + _last_request = { + "method": self.command, + "path": self.path, + "headers": [[k, v] for k, v in self.headers.items()], + } + if self.path == "/__last_request": + body = json.dumps(_last_request, indent=2).encode("utf-8") + self.send_response(200) + self.send_header("Content-Type", "application/json") + self.send_header("Content-Length", str(len(body))) + self.end_headers() + self.wfile.write(body) + return + if self.path == "/__sse": + # SSE-style streaming response. Used by the no-buffering + # test: three events with short flushes between them. + self.send_response(200) + self.send_header("Content-Type", "text/event-stream") + self.send_header("Cache-Control", "no-cache") + self.end_headers() + for i in range(3): + self.wfile.write(f"data: event-{i}\n\n".encode("utf-8")) + self.wfile.flush() + return + body = b'{"upstream":"fake","ok":true}\n' + self.send_response(200) + self.send_header("Content-Type", "application/json") + self.send_header("Content-Length", str(len(body))) + self.end_headers() + self.wfile.write(body) + + def do_GET(self) -> None: self._capture_and_respond() + def do_POST(self) -> None: self._capture_and_respond() + def do_PUT(self) -> None: self._capture_and_respond() + def do_DELETE(self) -> None: self._capture_and_respond() + def do_PATCH(self) -> None: self._capture_and_respond() + + +class FakeServer(socketserver.ThreadingMixIn, http.server.HTTPServer): + allow_reuse_address = True + daemon_threads = True + + +def main() -> None: + port = int(os.environ.get("FAKE_UPSTREAM_PORT", "8080")) + server = FakeServer(("0.0.0.0", port), Handler) + sys.stderr.write(f"fake-upstream listening on :{port}\n") + sys.stderr.flush() + server.serve_forever() + + +if __name__ == "__main__": + main() diff --git a/tests/integration/test_cred_proxy_sidecar.py b/tests/integration/test_cred_proxy_sidecar.py new file mode 100644 index 0000000..a121780 --- /dev/null +++ 
b/tests/integration/test_cred_proxy_sidecar.py @@ -0,0 +1,295 @@ +"""Integration: drive `DockerCredProxy.prepare` → `.start` against a +fake upstream container, then verify header injection / strip-and- +replace at the wire level (PRD 0010 SC2, SC3, SC6). + +Topology mirrors production: a per-bottle internal docker network (no +default gateway) for the agent ↔ cred-proxy leg, and an egress network +for cred-proxy ↔ upstream. The "agent" is a curl container on the +internal net; the "upstream" is the fake-upstream container on the +egress net. cred-proxy straddles both. +""" + +from __future__ import annotations + +import dataclasses +import json +import os +import shutil +import subprocess +import tempfile +import unittest +from pathlib import Path + +from claude_bottle.backend.docker.cred_proxy import ( + CRED_PROXY_HOSTNAME, + CRED_PROXY_PORT, + DockerCredProxy, + build_cred_proxy_image, + cred_proxy_container_name, +) +from claude_bottle.backend.docker.network import ( + network_create_egress, + network_create_internal, + network_remove, +) +from claude_bottle.cred_proxy import CredProxy +from claude_bottle.manifest import Manifest +from tests._docker import skip_unless_docker + + +CURL_IMAGE = "curlimages/curl:latest" +FAKE_UPSTREAM_IMAGE = "python:3.13-alpine" +FAKE_UPSTREAM_HOST = "fake-upstream" +FAKE_UPSTREAM_PORT = "8080" + + +def _bottle(tokens): + return Manifest.from_json_obj({ + "bottles": {"dev": {"tokens": tokens}}, + "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, + }).bottles["dev"] + + +class _StubCredProxy(CredProxy): + """CredProxy.prepare's render uses the Kind defaults, but the + integration test needs the cred-proxy to forward to the fake + upstream — not api.anthropic.com / github.com / npmjs.org. 
We + pass a one-route plan in directly via DockerCredProxy.start + rather than going through the manifest path.""" + + def start(self, plan): raise NotImplementedError + def stop(self, target): return None + + +def _make_routes_json(upstream_host: str, upstream_port: str) -> str: + payload = { + "routes": [ + { + "path": "/fake/", + "upstream": f"http://{upstream_host}:{upstream_port}", + "auth_scheme": "Bearer", + "token_env": "CRED_PROXY_TOKEN_0", + }, + ], + } + return json.dumps(payload, indent=2) + "\n" + + +@skip_unless_docker() +class TestCredProxySidecar(unittest.TestCase): + @classmethod + def setUpClass(cls): + # Pre-pull the probe + fake-upstream base images so per-test + # retries don't race the registry. Skip if pulls fail (the + # canary suite separately probes registry health). + for image in (CURL_IMAGE, FAKE_UPSTREAM_IMAGE): + r = subprocess.run( + ["docker", "pull", image], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ) + if r.returncode != 0: + raise unittest.SkipTest(f"could not pull {image}") + build_cred_proxy_image() + + def setUp(self): + self.slug = f"cb-test-cp-{os.getpid()}" + self.proxy_name = "" + self.fake_name = f"fake-upstream-{self.slug}" + self.internal_net = "" + self.egress_net = "" + self.work_dir = Path(tempfile.mkdtemp()) + + def tearDown(self): + for name in (self.proxy_name, self.fake_name): + if name: + subprocess.run( + ["docker", "rm", "-f", name], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ) + for n in (self.internal_net, self.egress_net): + if n: + network_remove(n) + shutil.rmtree(self.work_dir, ignore_errors=True) + + def _bring_up_fake_upstream(self) -> None: + """Run the fake-upstream container on the egress network with + the host stable name `fake-upstream`. 
Bind-mount the script + from tests/integration/.""" + repo_dir = str(Path(__file__).resolve().parent.parent.parent) + script = "tests/integration/_fake_upstream.py" + r = subprocess.run( + [ + "docker", "run", "-d", + "--name", self.fake_name, + "--hostname", FAKE_UPSTREAM_HOST, + "--network", self.egress_net, + "--network-alias", FAKE_UPSTREAM_HOST, + "-v", f"{repo_dir}/{script}:/srv.py:ro", + "-e", f"FAKE_UPSTREAM_PORT={FAKE_UPSTREAM_PORT}", + FAKE_UPSTREAM_IMAGE, + "python3", "/srv.py", + ], + capture_output=True, text=True, check=False, + ) + if r.returncode != 0: + self.fail(f"failed to start fake-upstream: {r.stderr}") + + def _start_cred_proxy_via_production_code(self) -> str: + """Run DockerCredProxy.start with a plan that points at the + fake upstream. We bypass the manifest path (which fixes + upstreams by Kind) by handing .start an already-rendered + routes.json.""" + from claude_bottle.cred_proxy import ( + CredProxyPlan, + CredProxyUpstream, + ) + routes_path = self.work_dir / "routes.json" + routes_path.write_text(_make_routes_json(FAKE_UPSTREAM_HOST, FAKE_UPSTREAM_PORT)) + routes_path.chmod(0o600) + plan = CredProxyPlan( + slug=self.slug, + routes_path=routes_path, + upstreams=(CredProxyUpstream( + kind="fake", + path="/fake/", + upstream=f"http://{FAKE_UPSTREAM_HOST}:{FAKE_UPSTREAM_PORT}", + auth_scheme="Bearer", + token_env="CRED_PROXY_TOKEN_0", + token_ref="TEST_TOKEN", + ),), + token_env_map={"CRED_PROXY_TOKEN_0": "TEST_TOKEN"}, + internal_network=self.internal_net, + egress_network=self.egress_net, + ) + # Inject the host-side TEST_TOKEN into our process env so the + # production resolver picks it up. 
+ os.environ["TEST_TOKEN"] = "real-token-injected-by-proxy" + try: + return DockerCredProxy().start(plan) + finally: + os.environ.pop("TEST_TOKEN", None) + + def _curl_via_internal_net(self, path: str, *extra: str) -> str: + """Run a sibling curl container on the internal network — same + access topology the agent uses in production — to hit the + cred-proxy. Returns stdout.""" + r = subprocess.run( + [ + "docker", "run", "--rm", + "--network", self.internal_net, + CURL_IMAGE, + "-s", "--max-time", "10", + "--retry", "20", "--retry-delay", "1", "--retry-connrefused", + *extra, + f"http://{CRED_PROXY_HOSTNAME}:{CRED_PROXY_PORT}{path}", + ], + capture_output=True, text=True, timeout=60, check=False, + ) + self.assertEqual(0, r.returncode, + f"curl failed: stdout={r.stdout!r} stderr={r.stderr!r}") + return r.stdout + + def _query_fake_capture(self) -> dict: + """Read the fake upstream's /__last_request endpoint to see + what headers it received.""" + r = subprocess.run( + [ + "docker", "run", "--rm", + "--network", self.egress_net, + CURL_IMAGE, + "-s", "--max-time", "10", + "--retry", "5", "--retry-delay", "1", "--retry-connrefused", + f"http://{FAKE_UPSTREAM_HOST}:{FAKE_UPSTREAM_PORT}/__last_request", + ], + capture_output=True, text=True, timeout=30, check=False, + ) + self.assertEqual(0, r.returncode, f"capture query failed: {r.stderr}") + return json.loads(r.stdout) + + @unittest.skipIf( + os.environ.get("GITEA_ACTIONS") == "true", + "skipped under act_runner: docker socket mount topology breaks " + "in-process visibility of networks created on the host daemon", + ) + def test_end_to_end_header_injection_and_strip(self): + """Full bring-up via the production DockerCredProxy code path, + then send a request from a sibling curl container with the + agent's `Authorization` header. 
The fake upstream's capture + must show: + - the agent's Authorization was stripped (no `stolen` token) + - the cred-proxy injected `Bearer real-token-injected-by-proxy` + - the request reached the upstream at all + """ + self.internal_net = network_create_internal(self.slug) + self.egress_net = network_create_egress(self.slug) + self._bring_up_fake_upstream() + self.proxy_name = self._start_cred_proxy_via_production_code() + self.assertEqual(cred_proxy_container_name(self.slug), self.proxy_name) + + # Agent → cred-proxy with a smuggled Authorization header. + body = self._curl_via_internal_net( + "/fake/v1/messages", + "-H", "Authorization: Bearer stolen-by-prompt-injection", + "-X", "POST", + "-H", "Content-Type: application/json", + "--data-binary", '{"hello":"world"}', + ) + # The fake upstream responds with a fixed body. + self.assertIn('"upstream":"fake"', body) + + # Now ask the fake upstream what headers it actually saw. + captured = self._query_fake_capture() + self.assertEqual("POST", captured["method"]) + self.assertEqual("/v1/messages", captured["path"], + "the /fake/ prefix should be stripped before forwarding") + + headers = {k.lower(): v for k, v in captured["headers"]} + self.assertEqual( + "Bearer real-token-injected-by-proxy", + headers.get("authorization"), + "cred-proxy must strip the inbound Authorization and inject " + "the configured value", + ) + self.assertNotIn("stolen", headers.get("authorization", ""), + "the agent's smuggled token must NOT reach upstream") + self.assertEqual( + FAKE_UPSTREAM_HOST, + headers.get("host"), + "Host header should point at the upstream, not the proxy", + ) + + @unittest.skipIf( + os.environ.get("GITEA_ACTIONS") == "true", + "skipped under act_runner: docker socket mount topology breaks " + "in-process visibility of networks created on the host daemon", + ) + def test_unknown_path_returns_404(self): + """An agent reaching for an unconfigured route gets a 404, + not a silent forward to anywhere.""" + 
self.internal_net = network_create_internal(self.slug) + self.egress_net = network_create_egress(self.slug) + self._bring_up_fake_upstream() + self.proxy_name = self._start_cred_proxy_via_production_code() + + r = subprocess.run( + [ + "docker", "run", "--rm", + "--network", self.internal_net, + CURL_IMAGE, + "-s", "-o", "/dev/null", "-w", "%{http_code}", + "--max-time", "10", + "--retry", "20", "--retry-delay", "1", "--retry-connrefused", + f"http://{CRED_PROXY_HOSTNAME}:{CRED_PROXY_PORT}/not-a-route", + ], + capture_output=True, text=True, timeout=60, check=False, + ) + self.assertEqual(0, r.returncode) + self.assertEqual("404", r.stdout.strip()) + + +if __name__ == "__main__": + unittest.main() -- 2.52.0 From 431e7481ef28a7affa88610ceacd9f5a218f9f81 Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 16:32:46 -0400 Subject: [PATCH 12/24] docs: README + example.json for cred-proxy (PRD 0010) - Architecture diagram gains the cred-proxy lane (agent talks plain HTTP via bearer-auth-injection; sidecar talks HTTPS to the real upstream with the manifest token). - Adds a cred-proxy entry under the sidecar bullet list, with a pointer to PRD 0010. - Manifest example illustrates the `tokens` array on a bottle. - Auth section notes that declaring an `anthropic` token routes CLAUDE_BOTTLE_OAUTH_TOKEN through the sidecar instead of into the agent's environ. - claude-bottle.example.json gains an `agentic` bottle declaring all four token kinds, plus a paired `agentic-helper` agent. --- README.md | 62 ++++++++++++++++++++++++++++++-------- claude-bottle.example.json | 20 ++++++++++++ 2 files changed, 70 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index b35310c..7af84b4 100644 --- a/README.md +++ b/README.md @@ -72,11 +72,13 @@ pieces of v1. A bottle is the agent container plus up to three per-protocol egress sidecars on a per-agent Docker `--internal` network. 
The agent has no default route off-box; its only way out is through the pipelock -sidecar (for HTTP/HTTPS), the ssh-gate sidecar (for SSH), or the -git-gate sidecar (for git operations against declared upstreams). -Each sidecar also sits on an egress network that does have internet -access, so the agent's traffic always passes through a container -that enforces the manifest before it leaves the host. +sidecar (for HTTP/HTTPS), the git-gate sidecar (for git operations +against declared upstreams), or the cred-proxy sidecar (for API +calls that need a manifest-declared token — Anthropic OAuth, GitHub +PAT, Gitea PAT, npm). Each sidecar also sits on an egress network +that does have internet access, so the agent's traffic always passes +through a container that enforces the manifest before it leaves the +host. ``` host ( ./cli.py ) @@ -91,12 +93,17 @@ that enforces the manifest before it leaves the host. │ │ built locally) │ │ (TLS bump, DLP,│ │ hosts │ │ │ │ allowlist) │ │ │ │ skills, env, │ └────────────────┘ │ - │ │ ~/.gitconfig │ │ - │ │ │ git ops ┌────────────────┐ │ SSH (push/ + │ │ ~/.gitconfig, │ │ + │ │ ~/.npmrc, tea │ git ops ┌────────────────┐ │ SSH (push/ │ │ │ ───────────────► │ git-gate image │──┼──► fetch) to │ │ │ │ (gitleaks + │ │ bottle.git - │ │ │ │ git daemon) │ │ upstreams - │ └──────────────────┘ └────────────────┘ │ + │ │ environ: URLs │ │ git daemon) │ │ upstreams + │ │ only, no real │ └────────────────┘ │ + │ │ tokens │ bearer-auth ┌────────────────┐ │ HTTPS to + │ │ │ ───────────────► │ cred-proxy │──┼──► bottle.tokens + │ │ │ HTTP, plain │ (strips/injects│ │ upstreams + │ │ │ │ Authorization)│ │ (with the + │ └──────────────────┘ └────────────────┘ │ real token) │ │ │ agent on internal network (no default route); │ │ sidecars also attached to an egress network. │ @@ -129,6 +136,20 @@ that enforces the manifest before it leaves the host. `insteadOf` rewrite still keys off the original hostname. Brought up only when `bottle.git` has entries. 
Design in `docs/prds/0008-git-gate.md`. +- **cred-proxy image** — per-bottle sidecar (`python:3.13-alpine` + base, stdlib-only) that holds API tokens declared in + `bottle.tokens`. The agent dials it as plain HTTP at + `http://cred-proxy:9099//...`; the proxy strips any + inbound `Authorization` header, injects the configured one using + a token held only in its own container's environ, and forwards + to the real upstream over HTTPS. SSE responses stream back + unbuffered. `ANTHROPIC_BASE_URL`, `~/.npmrc`, `~/.gitconfig` + `insteadOf` rules for `https://github.com/` and any declared + Gitea hosts, and `~/.config/tea/config.yml` all get written to + point at the proxy. The agent's `printenv` shows only those + URLs — none of the real token values. Brought up only when + `bottle.tokens` has entries. Design in + `docs/prds/0010-cred-proxy.md`. When the agent exits, `cli.py` tears down every sidecar that was brought up and the two networks; nothing about a bottle persists @@ -172,6 +193,19 @@ project entries overriding home entries on key conflict). } ], + // Tokens declared here are held by a per-bottle cred-proxy + // sidecar, not the agent. Each entry names the host env var + // (`TokenRef`) the CLI reads at launch time; the value goes + // into the sidecar's environ via `docker create -e`, never + // touches argv or disk. Inside the bottle, the agent's + // ANTHROPIC_BASE_URL / npm registry / git insteadOf rules + // point at the proxy. See `docs/prds/0010-cred-proxy.md`. 
+ "tokens": [ + { "Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN" }, + { "Kind": "github", "TokenRef": "GITHUB_PAT" }, + { "Kind": "npm", "TokenRef": "NPM_TOKEN" } + ], + // Egress is forced through a per-agent // [pipelock](https://github.com/luckyPipewrench/pipelock) sidecar // on a Docker `--internal` network — without the proxy the agent @@ -231,9 +265,13 @@ as `CLAUDE_BOTTLE_OAUTH_TOKEN`: export CLAUDE_BOTTLE_OAUTH_TOKEN="" ``` -`cli.py` automatically forwards it to every container as -`CLAUDE_CODE_OAUTH_TOKEN` via `docker run -e` — no manifest wiring -required, and the value is never written to disk or placed on argv. +By default `cli.py` forwards the token into the agent container as +`CLAUDE_CODE_OAUTH_TOKEN`. Declare an `anthropic` entry in +`bottle.tokens` to route via cred-proxy instead: the token then lives +only in the cred-proxy sidecar's environ, the agent's +`ANTHROPIC_BASE_URL` points at the proxy, and `printenv` inside the +agent does not surface the real token. Either way the value is never +written to disk or placed on argv on the host. Inside the container, `claude` picks up `CLAUDE_CODE_OAUTH_TOKEN` and authenticates against your subscription. Caveats: the token is bound diff --git a/claude-bottle.example.json b/claude-bottle.example.json index 1ac6163..7403473 100644 --- a/claude-bottle.example.json +++ b/claude-bottle.example.json @@ -36,6 +36,20 @@ "files.pythonhosted.org" ] } + }, + + "agentic": { + "env": { + "GIT_AUTHOR_NAME": "Eric Diderich", + "NODE_ENV": "development" + }, + "tokens": [ + { "Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN" }, + { "Kind": "github", "TokenRef": "GH_PAT" }, + { "Kind": "gitea", "TokenRef": "GITEA_TOKEN", + "Url": "https://gitea.dideric.is" }, + { "Kind": "npm", "TokenRef": "NPM_TOKEN" } + ] } }, @@ -52,6 +66,12 @@ "prompt": "You help maintain Gitea-hosted projects. Prefer small, focused commits. Follow Conventional Commits. Run tests before pushing." 
}, + "agentic-helper": { + "bottle": "agentic", + "skills": [], + "prompt": "You operate against APIs whose credentials live in a per-bottle cred-proxy sidecar. Your environ carries only proxy URLs." + }, + "minimal": { "bottle": "default", "skills": [], -- 2.52.0 From c8ab90d01d217e2ca508a1675d2ad4bc653de2b1 Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 16:38:36 -0400 Subject: [PATCH 13/24] fix(manifest): allow token + git on the same host (PRD 0010) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit git-gate holds an SSH IdentityFile for push/fetch; cred-proxy holds a PAT for HTTPS REST API calls. The two brokers are orthogonal — the common dev setup names both on the same host (e.g. gitea.dideric.is SSH for push, gitea.dideric.is PAT for `tea pr create`). The original PRD 0010 wording called this a "configuration smell" and rejected it at parse time. That was wrong; this drops the overlap rejection from the validator and updates the PRD prose to match. Tests flip from "rejection" to "coexistence" assertions. --- claude_bottle/manifest.py | 20 ++++------ docs/prds/0010-cred-proxy.md | 12 +++--- tests/unit/test_manifest_tokens.py | 62 ++++++++++++++++-------------- 3 files changed, 46 insertions(+), 48 deletions(-) diff --git a/claude_bottle/manifest.py b/claude_bottle/manifest.py index 9e35da9..58abc47 100644 --- a/claude_bottle/manifest.py +++ b/claude_bottle/manifest.py @@ -570,11 +570,14 @@ def _validate_tokens( - At most one entry per Kind, except `gitea` which may have multiple entries (one per Gitea instance) with distinct Urls. - - No overlap with `bottle.git` hosts: a `github` or `gitea` token - whose host matches a `bottle.git` upstream host would put two - credential brokers on the same remote (git-gate's gitleaks- - scanning gate AND cred-proxy's bearer injection). Pick one. 
+ + A `github` or `gitea` token MAY name the same host as a + `bottle.git` entry: the two paths broker different protocols + (git-gate handles SSH push/fetch with an IdentityFile; cred-proxy + handles HTTPS REST API calls with a PAT), so declaring both on + one host is a legitimate dev setup, not a configuration error. """ + del git # cross-host overlap is intentionally not rejected. by_kind: dict[str, list[TokenEntry]] = {} for t in tokens: by_kind.setdefault(t.Kind, []).append(t) @@ -595,15 +598,6 @@ def _validate_tokens( f"that may have multiple entries)." ) - git_hosts = {g.UpstreamHost for g in git} - for t in tokens: - if t.Kind in ("github", "gitea") and t.UpstreamHost in git_hosts: - die( - f"bottle '{bottle_name}' token ({t.Kind}, host {t.UpstreamHost!r}) " - f"overlaps a bottle.git upstream on the same host. git-gate already " - f"brokers this remote; drop the token entry or remove the git entry." - ) - def _validate_unique_git_names(bottle_name: str, git: tuple[GitEntry, ...]) -> None: seen: dict[str, None] = {} diff --git a/docs/prds/0010-cred-proxy.md b/docs/prds/0010-cred-proxy.md index 760c2a8..4737051 100644 --- a/docs/prds/0010-cred-proxy.md +++ b/docs/prds/0010-cred-proxy.md @@ -315,12 +315,12 @@ Validation: documented upstream. - At most one entry per `Kind` except `gitea`, which may have multiple distinct `Url`s. -- No silent overlap with `bottle.git` upstreams that already - flow through git-gate; if a `tokens[].Kind: github|gitea` - entry's `Url` collides with a `git[].Upstream`'s host, parse - fails with a "git-gate already brokers this remote, drop one" - hint. (Both paths broker credentials; doubling up is a - configuration smell, not a feature.) +- A `github` or `gitea` token MAY name the same host as a + `bottle.git` entry. The two paths broker different protocols — + git-gate holds an SSH `IdentityFile` for push/fetch and runs + gitleaks; cred-proxy holds a PAT for HTTPS REST API calls (`tea`, + `gh`, octokit). 
The common dev setup uses both on the same host + and is not a configuration error. ### Routing table diff --git a/tests/unit/test_manifest_tokens.py b/tests/unit/test_manifest_tokens.py index 388c591..d99bf4b 100644 --- a/tests/unit/test_manifest_tokens.py +++ b/tests/unit/test_manifest_tokens.py @@ -129,37 +129,41 @@ class TestTokenEntryValidation(unittest.TestCase): ])) -class TestTokenGitOverlap(unittest.TestCase): - def test_github_token_collides_with_github_git_entry(self): - # bottle.git already brokers github.com via the gate; declaring - # a github token on top would put two credential brokers on - # the same remote. - with self.assertRaises(Die): - Manifest.from_json_obj(_manifest( - tokens=[{"Kind": "github", "TokenRef": "GITHUB_TOKEN"}], - git=[{ - "Name": "myrepo", - "Upstream": "ssh://git@github.com/me/myrepo.git", - "IdentityFile": "/dev/null", - }], - )) +class TestTokenGitCoexistence(unittest.TestCase): + """git-gate brokers SSH push/fetch via an IdentityFile; cred-proxy + brokers HTTPS REST API calls via a PAT. 
Declaring both on the same + host is the common dev setup (SSH key for git ops, PAT for `tea` / + `gh` API calls), not a configuration error.""" - def test_gitea_token_collides_with_same_host_git_entry(self): - with self.assertRaises(Die): - Manifest.from_json_obj(_manifest( - tokens=[{ - "Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "https://gitea.dideric.is", - }], - git=[{ - "Name": "myrepo", - "Upstream": "ssh://git@gitea.dideric.is:30009/me/myrepo.git", - "IdentityFile": "/dev/null", - }], - )) + def test_github_token_and_github_git_entry_coexist(self): + m = Manifest.from_json_obj(_manifest( + tokens=[{"Kind": "github", "TokenRef": "GITHUB_TOKEN"}], + git=[{ + "Name": "myrepo", + "Upstream": "ssh://git@github.com/me/myrepo.git", + "IdentityFile": "/dev/null", + }], + )) + self.assertEqual(1, len(m.bottles["dev"].tokens)) + self.assertEqual(1, len(m.bottles["dev"].git)) - def test_anthropic_token_does_not_collide_with_git(self): - # api.anthropic.com isn't a git host; no overlap possible. + def test_gitea_token_and_same_host_git_entry_coexist(self): + m = Manifest.from_json_obj(_manifest( + tokens=[{ + "Kind": "gitea", "TokenRef": "GITEA_TOKEN", + "Url": "https://gitea.dideric.is", + }], + git=[{ + "Name": "myrepo", + "Upstream": "ssh://git@gitea.dideric.is:30009/me/myrepo.git", + "IdentityFile": "/dev/null", + }], + )) + self.assertEqual("gitea.dideric.is", m.bottles["dev"].tokens[0].UpstreamHost) + self.assertEqual("gitea.dideric.is", m.bottles["dev"].git[0].UpstreamHost) + + def test_anthropic_token_and_git_unrelated(self): + # api.anthropic.com isn't a git host; coexistence is trivial. 
m = Manifest.from_json_obj(_manifest( tokens=[{"Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN"}], git=[{ -- 2.52.0 From 27b2d78b112e310102a5806bacf21236b3551757 Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 21:09:33 -0400 Subject: [PATCH 14/24] fix(cred_proxy): close git-push bypass + route through pipelock (PRD 0010) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three coupled fixes that close a documented bypass of git-gate's gitleaks pre-receive hook: 1. cred-proxy refuses git smart-HTTP push at runtime. Any path ending in /git-receive-pack or /info/refs?service=git-receive-pack returns 403 with a pointer at the bottle.git SSH path. Fetch (upload-pack) is still allowed — the bypass we're closing is push, where gitleaks is the load-bearing scanner. Hard guarantee. 2. The provisioner suppresses the cred-proxy `~/.gitconfig` insteadOf rewrite for any host already declared in bottle.git. git-gate is the canonical git path there; we don't write a competing rule that would let `git clone https:///...` succeed in ways that confuse on push. Defense in depth — (1) is the hard guarantee. 3. cred-proxy routes its outbound HTTPS through pipelock. The sidecar's environ now sets HTTPS_PROXY=, and the image's entrypoint runs `update-ca-certificates` over the per-bottle pipelock CA (docker cp'd into /usr/local/share/ca-certificates/pipelock.crt before start) so the proxy's HTTPS client trusts pipelock's bumped certs. Consequence: pipelock's allowlist + body scanner now sit in the cred-proxy egress path the same way they sit in front of direct agent traffic. The cred-proxy upstream hosts (api.github.com, github.com, gitea hosts, registry.npmjs.org) come OFF pipelock's passthrough_domains. Only api.anthropic.com remains on passthrough (LLM body content legitimately trips DLP). PRD 0010 updated to reflect all three. 
Tests adjusted: the "cred-proxy hosts go on passthrough" assertion in test_pipelock_allowlist flips to "they don't", a new TestIsGitPushRequest exercises the smart-HTTP refusal predicate, and the gitconfig renderer tests cover the per-host suppression matrix. --- Dockerfile.cred-proxy | 17 +++- claude_bottle/backend/docker/cred_proxy.py | 79 +++++++++++++++---- claude_bottle/backend/docker/launch.py | 14 +++- .../backend/docker/provision/cred_proxy.py | 44 +++++++---- claude_bottle/cred_proxy.py | 20 +++-- claude_bottle/cred_proxy_server.py | 33 ++++++++ claude_bottle/pipelock.py | 32 +++++--- docs/prds/0010-cred-proxy.md | 37 +++++++-- tests/unit/test_cred_proxy_server.py | 44 +++++++++++ tests/unit/test_docker_cred_proxy.py | 23 ++++++ tests/unit/test_pipelock_allowlist.py | 15 ++-- tests/unit/test_provision_cred_proxy.py | 34 ++++++++ 12 files changed, 329 insertions(+), 63 deletions(-) diff --git a/Dockerfile.cred-proxy b/Dockerfile.cred-proxy index 451d7cc..82f3769 100644 --- a/Dockerfile.cred-proxy +++ b/Dockerfile.cred-proxy @@ -16,6 +16,14 @@ # image bytes deterministic. FROM python@sha256:420cd0bf0f3998275875e02ecd5808168cf0843cbb4d3c536432f729247b2acc +# `ca-certificates` ships /usr/sbin/update-ca-certificates and the +# system trust store. The backend's start step `docker cp`s the +# per-bottle pipelock CA into /usr/local/share/ca-certificates/ so +# the entrypoint's update-ca-certificates picks it up — cred-proxy's +# outbound HTTPS then trusts pipelock's bumped certs and outbound +# traffic routes through pipelock (HTTPS_PROXY in the environ). +RUN apk add --no-cache ca-certificates + # The proxy script ships as a single file. Tests in tests/unit/ import # it as `claude_bottle.cred_proxy_server`; the container runs it # directly as a script. No package install, no other modules pulled. @@ -32,4 +40,11 @@ RUN mkdir -p /run/cred-proxy # for the internal network to route to it. 
EXPOSE 9099 -ENTRYPOINT ["python3", "/app/cred_proxy_server.py"] +# Entry runs update-ca-certificates so the per-bottle pipelock CA +# docker-cp'd by the backend's start step is folded into +# /etc/ssl/certs/ca-certificates.crt before python comes up. Then +# exec into the server so PID 1 is python (clean signal handling +# and exit codes). Output of update-ca-certificates is silenced — +# the entry script prints one line per cert under normal operation, +# which the test suite would otherwise treat as a log smell. +ENTRYPOINT ["sh", "-c", "update-ca-certificates >/dev/null 2>&1 && exec python3 /app/cred_proxy_server.py"] diff --git a/claude_bottle/backend/docker/cred_proxy.py b/claude_bottle/backend/docker/cred_proxy.py index 4e28bab..54213cc 100644 --- a/claude_bottle/backend/docker/cred_proxy.py +++ b/claude_bottle/backend/docker/cred_proxy.py @@ -42,6 +42,13 @@ CRED_PROXY_HOSTNAME = "cred-proxy" # file directly. CRED_PROXY_ROUTES_IN_CONTAINER = "/run/cred-proxy/routes.json" +# In-container path for the per-bottle pipelock CA. Alpine's +# update-ca-certificates picks anything ending in `.crt` under +# /usr/local/share/ca-certificates/ and folds it into the system +# trust store at boot — so cred-proxy's HTTPS client trusts +# pipelock's bumped certs when pipelock MITMs the outbound leg. +CRED_PROXY_PIPELOCK_CA_IN_CONTAINER = "/usr/local/share/ca-certificates/pipelock.crt" + # Repo root, for `docker build` context. Resolved from this file's # location: claude_bottle/backend/docker/cred_proxy.py → repo root. _REPO_DIR = str(Path(__file__).resolve().parent.parent.parent.parent) @@ -96,6 +103,23 @@ class DockerCredProxy(CredProxy): f"cred-proxy routes file missing at {plan.routes_path}; " f"CredProxy.prepare must run first" ) + # pipelock fields are populated by launch.py in production; both + # must be present (URL + CA) or both absent. Mixing is a wiring + # bug. 
Both-absent is supported only as a test escape hatch: + # the integration tests in tests/integration/ exercise header + # injection in isolation and do not bring pipelock up. + route_via_pipelock = bool(plan.pipelock_proxy_url) or plan.pipelock_ca_host_path != Path() + if route_via_pipelock: + if not plan.pipelock_proxy_url: + die( + "DockerCredProxy.start: pipelock_ca_host_path is set but " + "pipelock_proxy_url is empty; populate both or neither." + ) + if not plan.pipelock_ca_host_path.is_file(): + die( + f"DockerCredProxy.start: pipelock CA missing at " + f"{plan.pipelock_ca_host_path}; pipelock_tls_init must run first" + ) # Resolve host env vars into concrete values. This must # happen at start time (not prepare) — the values flow into @@ -114,6 +138,16 @@ class DockerCredProxy(CredProxy): "--network", plan.internal_network, "--network-alias", CRED_PROXY_HOSTNAME, ] + if route_via_pipelock: + # Route cred-proxy's outbound HTTPS through pipelock so + # the egress allowlist + DLP body scanner apply to its + # traffic. Pipelock MITMs each handshake with the + # per-bottle CA we docker cp in below. + create_args.extend([ + "-e", f"HTTPS_PROXY={plan.pipelock_proxy_url}", + "-e", f"HTTP_PROXY={plan.pipelock_proxy_url}", + "-e", "NO_PROXY=localhost,127.0.0.1", + ]) # One -e flag per token slot; values arrive via subprocess env. # docker create with `-e NAME` (no =VALUE) reads NAME from the # current process env at create time. 
We pass `env=child_env` @@ -136,24 +170,37 @@ class DockerCredProxy(CredProxy): ).returncode != 0: die(f"failed to create cred-proxy sidecar {name}") - cp_result = subprocess.run( - ["docker", "cp", str(plan.routes_path), - f"{name}:{CRED_PROXY_ROUTES_IN_CONTAINER}"], - capture_output=True, - text=True, - check=False, - ) - if cp_result.returncode != 0: - subprocess.run( - ["docker", "rm", "-f", name], - stdout=subprocess.DEVNULL, - stderr=subprocess.DEVNULL, + cps: list[tuple[str, str, str]] = [ + (str(plan.routes_path), CRED_PROXY_ROUTES_IN_CONTAINER, "routes.json"), + ] + if route_via_pipelock: + # CA must land BEFORE `docker start` so the entrypoint's + # update-ca-certificates picks it up. Docker cp's the + # file in even on the stopped container — that's the + # whole reason this works without a custom build step. + cps.append(( + str(plan.pipelock_ca_host_path), + CRED_PROXY_PIPELOCK_CA_IN_CONTAINER, + "pipelock CA", + )) + for src, dst, label in cps: + cp_result = subprocess.run( + ["docker", "cp", src, f"{name}:{dst}"], + capture_output=True, + text=True, check=False, ) - die( - f"failed to copy routes.json into {name}: " - f"{cp_result.stderr.strip()}" - ) + if cp_result.returncode != 0: + subprocess.run( + ["docker", "rm", "-f", name], + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + check=False, + ) + die( + f"failed to copy {label} into {name}: " + f"{cp_result.stderr.strip()}" + ) if subprocess.run( ["docker", "network", "connect", plan.egress_network, name], diff --git a/claude_bottle/backend/docker/launch.py b/claude_bottle/backend/docker/launch.py index a274333..a32747d 100644 --- a/claude_bottle/backend/docker/launch.py +++ b/claude_bottle/backend/docker/launch.py @@ -105,15 +105,21 @@ def launch( stack.callback(git_gate.stop, git_gate_name) # Cred-proxy (PRD 0010). One sidecar per bottle when - # bottle.tokens declares any kind. 
Must come up before the - # agent so DNS resolution for `cred-proxy` succeeds on the - # agent's first call; tokens flow from the host env into the - # sidecar's environ, not the agent's. + # bottle.tokens declares any kind. Must come up AFTER pipelock + # — cred-proxy routes its outbound HTTPS through pipelock + # (HTTPS_PROXY in environ + the per-bottle CA in its trust + # store) so the egress allowlist + body scanner sit in the + # cred-proxy path too. Must come up BEFORE the agent so DNS + # resolution for `cred-proxy` succeeds on the agent's first + # call; tokens flow from the host env into the sidecar's + # environ, not the agent's. if plan.cred_proxy_plan.upstreams: cred_proxy_plan = dataclasses.replace( plan.cred_proxy_plan, internal_network=internal_network, egress_network=egress_network, + pipelock_ca_host_path=ca_cert_host, + pipelock_proxy_url=pipelock_proxy_url(plan.slug), ) plan = dataclasses.replace(plan, cred_proxy_plan=cred_proxy_plan) cred_proxy_name = cred_proxy.start(plan.cred_proxy_plan) diff --git a/claude_bottle/backend/docker/provision/cred_proxy.py b/claude_bottle/backend/docker/provision/cred_proxy.py index 5fecde9..e946be3 100644 --- a/claude_bottle/backend/docker/provision/cred_proxy.py +++ b/claude_bottle/backend/docker/provision/cred_proxy.py @@ -35,8 +35,10 @@ def provision_cred_proxy(plan: DockerBottlePlan, target: str) -> None: upstreams = plan.cred_proxy_plan.upstreams if not upstreams: return + bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name) + git_gate_hosts = {g.UpstreamHost for g in bottle.git} _provision_npmrc(plan, target, upstreams) - _provision_gitconfig(plan, target, upstreams) + _provision_gitconfig(plan, target, upstreams, git_gate_hosts) _provision_tea_config(plan, target, upstreams) @@ -82,29 +84,41 @@ def _provision_npmrc( # --- git config ------------------------------------------------------------- -def render_cred_proxy_gitconfig(upstreams: tuple[CredProxyUpstream, ...]) -> str: +def 
render_cred_proxy_gitconfig( + upstreams: tuple[CredProxyUpstream, ...], + git_gate_hosts: set[str] = frozenset(), # type: ignore[assignment] +) -> str: """Render the `~/.gitconfig` fragment for cred-proxy insteadOf rewrites. Empty string when no github / gitea routes are declared. - github expands to two rewrites: https://github.com/... → /gh-git/... - (the git transport endpoint), and the agent's git client reaches - api.github.com over the same proxy via the /gh-api/ route, but - that's used by tools that call the GitHub API directly (gh, tea, - octokit) rather than `git` itself. + The rewrite is suppressed for any host that's also declared in + `bottle.git`. git-gate is the canonical git path on those hosts — + its pre-receive runs gitleaks before forwarding the push. A + cred-proxy https:/// rewrite would route HTTPS git ops + around the gate. cred-proxy still refuses smart-HTTP push at + runtime (defense in depth), but suppressing the rewrite means + `git clone https:///...` doesn't have a tempting shortcut + that just confuses on push. - Gitea entries get one rewrite per declared host, pointing at - /gitea//. The path component scopes the credential - so multiple gitea instances coexist on one proxy.""" + github expands to one rewrite (https://github.com/... → /gh-git/..., + the git transport endpoint); /gh-api/ stays unmapped here because + tools call api.github.com directly rather than through git. + Gitea entries get one rewrite per declared host.""" rules: list[str] = [] for u in upstreams: if u.kind == "github" and u.path == "/gh-git/": + if "github.com" in git_gate_hosts: + continue rules.append( f'[url "{cred_proxy_url()}/gh-git/"]\n' f"\tinsteadOf = https://github.com/\n" ) elif u.kind == "gitea": - # u.upstream is the configured gitea URL (e.g. - # https://gitea.dideric.is) and u.path is /gitea//. + # u.path is /gitea//; derive the host the same way + # the route table did so we match git_gate's UpstreamHost. 
+ host = u.path[len("/gitea/"):].rstrip("/") + if host in git_gate_hosts: + continue rules.append( f'[url "{cred_proxy_url()}{u.path}"]\n' f"\tinsteadOf = {u.upstream}/\n" @@ -123,11 +137,13 @@ def _provision_gitconfig( plan: DockerBottlePlan, target: str, upstreams: tuple[CredProxyUpstream, ...], + git_gate_hosts: set[str], ) -> None: """Append the cred-proxy insteadOf rules to ~/.gitconfig. Runs after `provision_git`, so any git-gate rules already live in the - file; we append rather than overwrite.""" - content = render_cred_proxy_gitconfig(upstreams) + file; we append rather than overwrite. Hosts already brokered by + git-gate are skipped — git-gate is the canonical git path there.""" + content = render_cred_proxy_gitconfig(upstreams, git_gate_hosts) if not content: return container_home = os.environ.get("CLAUDE_BOTTLE_CONTAINER_HOME", "/home/node") diff --git a/claude_bottle/cred_proxy.py b/claude_bottle/cred_proxy.py index ab53db4..672c9d5 100644 --- a/claude_bottle/cred_proxy.py +++ b/claude_bottle/cred_proxy.py @@ -64,16 +64,24 @@ class CredProxyPlan: The slug + routes_path + upstreams + token_env_map fields are filled at prepare time (host-side, side-effect-free on docker). - The network fields are populated by the backend's launch step - via `dataclasses.replace` once those networks exist. Empty - defaults are sentinels meaning "not yet set"; `.start` validates - that they are populated. + The network + pipelock fields are populated by the backend's + launch step via `dataclasses.replace` once those resources + exist. Empty defaults are sentinels meaning "not yet set"; + `.start` validates that they are populated. `token_env_map` is `{: }`. The backend's start step reads `os.environ[TokenRef]` and forwards the value into the cred-proxy container's environ under `token_env`. The plan itself never holds token values — secrets - never land in a dataclass that might be logged.""" + never land in a dataclass that might be logged. 
+ + `pipelock_ca_host_path` is the host path of the per-bottle CA + pipelock will present on bumped TLS handshakes; the cred-proxy + image's entrypoint runs `update-ca-certificates` over it so the + proxy's HTTPS client trusts pipelock's CA. `pipelock_proxy_url` + is the URL cred-proxy sets as `HTTPS_PROXY` in its environ so + outbound HTTPS traverses pipelock — making pipelock's body + scanner part of the cred-proxy egress path.""" slug: str routes_path: Path @@ -81,6 +89,8 @@ class CredProxyPlan: token_env_map: dict[str, str] internal_network: str = "" egress_network: str = "" + pipelock_ca_host_path: Path = Path() + pipelock_proxy_url: str = "" # Hardcoded upstream URLs for the non-gitea Kinds. Gitea's URL is per- diff --git a/claude_bottle/cred_proxy_server.py b/claude_bottle/cred_proxy_server.py index 6d756bb..1a0f4a3 100644 --- a/claude_bottle/cred_proxy_server.py +++ b/claude_bottle/cred_proxy_server.py @@ -114,6 +114,31 @@ def select_route(routes: typing.Sequence[Route], request_path: str) -> Route | N return None +def is_git_push_request(path: str, query: str) -> bool: + """Return True if the request is a git smart-HTTP push. + + git push over HTTPS hits two endpoints: + GET /info/refs?service=git-receive-pack (capabilities) + POST /git-receive-pack (the push) + + Fetches use `service=git-upload-pack` / `/git-upload-pack` and are + not blocked. cred-proxy refuses push because git-gate's pre-receive + gitleaks scan is the gate for outbound git data; routing push + through cred-proxy would bypass that. Use the bottle.git SSH path + if you need to push. + """ + if path.endswith("/git-receive-pack"): + return True + if path.endswith("/info/refs"): + # Query string is parsed leniently — `service=git-receive-pack` + # may appear with other params in any order. 
+ for pair in query.split("&"): + k, _, v = pair.partition("=") + if k == "service" and v == "git-receive-pack": + return True + return False + + # --- Header handling -------------------------------------------------------- @@ -223,6 +248,14 @@ class CredProxyHandler(http.server.BaseHTTPRequestHandler): def _proxy(self) -> None: server = typing.cast("CredProxyServer", self.server) path, _, query = self.path.partition("?") + if is_git_push_request(path, query): + self.send_error( + 403, + "cred-proxy: git push over HTTPS is not supported; " + "use the bottle.git SSH path (gitleaks-scanned by " + "git-gate's pre-receive hook)", + ) + return route = select_route(server.routes, path) if route is None: self.send_error(404, f"no route for {path!r}") diff --git a/claude_bottle/pipelock.py b/claude_bottle/pipelock.py index 597fef1..6b8abf0 100644 --- a/claude_bottle/pipelock.py +++ b/claude_bottle/pipelock.py @@ -100,16 +100,28 @@ def pipelock_effective_allowlist(bottle: Bottle) -> list[str]: def pipelock_effective_tls_passthrough(bottle: Bottle) -> list[str]: """Hostnames pipelock should pass through (no TLS MITM, no body - scan). Default carries the LLM API endpoint (its request bodies - legitimately trip DLP); cred-proxy upstream hosts are added so - cred-proxy's HTTPS client (which trusts only the real CA bundle) - can complete the upstream handshake.""" - seen: dict[str, None] = {} - for h in DEFAULT_TLS_PASSTHROUGH: - seen.setdefault(h, None) - for h in pipelock_token_hosts(bottle): - seen.setdefault(h, None) - return sorted(seen.keys()) + scan). Default carries the LLM API endpoint — its request bodies + are user-authored conversation text that legitimately trips DLP + scanners (notably pipelock's BIP-39 seed-phrase detector). Every + other allowlisted host is MITM'd by pipelock's per-bottle CA so + its body scanner sees the cleartext. + + cred-proxy upstream hosts (github, gitea, npm) are deliberately + NOT auto-added here. 
cred-proxy's HTTPS client trusts pipelock's + CA at runtime (folded into its trust store via docker cp + + update-ca-certificates), so pipelock can MITM the cred-proxy → + upstream leg and body-scan it the same way it body-scans the + agent's direct HTTPS traffic. Without this, an agent that pushed + a secret via cred-proxy's /gh-git/ path would have no body + scanner in front of it. The PRD's earlier reasoning that + cred-proxy hosts needed passthrough was a workaround for the + cert-trust gap that no longer exists. + + `bottle` is kept on the signature for forward-compat (a future + knob might let a manifest opt a host into passthrough); today + the returned list is independent of the bottle.""" + del bottle # not consulted; see docstring. + return sorted(DEFAULT_TLS_PASSTHROUGH) def pipelock_allowlist_summary(bottle: Bottle) -> str: diff --git a/docs/prds/0010-cred-proxy.md b/docs/prds/0010-cred-proxy.md index 4737051..5c92ef5 100644 --- a/docs/prds/0010-cred-proxy.md +++ b/docs/prds/0010-cred-proxy.md @@ -130,7 +130,16 @@ supported kinds (anthropic, github, gitea, npm): the agent's environ - `~/.npmrc` `registry = http://cred-proxy:/npm/` - `~/.gitconfig` `[url …] insteadOf = …` for each declared - `github` / `gitea` upstream + `github` / `gitea` upstream, **except** when a `bottle.git` + entry already brokers the same host. git-gate is the canonical + git path on those hosts — its pre-receive runs gitleaks before + forwarding the push; a cred-proxy `https:///` rewrite + would route HTTPS git ops around the gate, and `git push` over + HTTPS to the same host via cred-proxy carries no gitleaks + equivalent. (cred-proxy independently refuses smart-HTTP push + paths at runtime — see "Smart-HTTP push refused" below — but + suppressing the rewrite means `git clone https:///...` + doesn't have a tempting shortcut that just confuses later.) 
- `~/.config/tea/config.yml` with the proxy URL for each declared `gitea` entry - **Sidecar lifecycle.** Mirrors `DockerGitGate` / @@ -141,11 +150,27 @@ supported kinds (anthropic, github, gitea, npm): `claude-bottle-cred-proxy-`. The agent container starts after the sidecar is up so DNS resolution succeeds on the agent's first call. -- **pipelock interop.** cred-proxy's outbound HTTPS still - traverses pipelock — pipelock keeps its egress-allowlist role - for the four upstream hosts. Drop `api.anthropic.com` from - pipelock's TLS-MITM list (cred-proxy is now the trust endpoint - for that host); the host stays on the plain HTTPS allowlist. +- **pipelock interop.** cred-proxy's outbound HTTPS traverses + pipelock: the sidecar's environ sets `HTTPS_PROXY` / + `HTTP_PROXY` to the per-bottle pipelock URL, and the cred-proxy + image's entrypoint runs `update-ca-certificates` over the + per-bottle pipelock CA (`docker cp`'d into + `/usr/local/share/ca-certificates/pipelock.crt` before start) + so cred-proxy's HTTPS client trusts pipelock's bumped certs. + Pipelock's allowlist + body scanner therefore apply to + cred-proxy → upstream the same way they apply to direct agent + traffic. Only `api.anthropic.com` stays on + `passthrough_domains` (its bodies are LLM conversation text + that legitimately trips DLP heuristics); github / gitea / npm + hosts are auto-added to the allowlist (so cred-proxy can reach + them) but NOT to passthrough, so pipelock body-scans them. +- **Smart-HTTP push refused.** cred-proxy returns 403 for paths + matching `/info/refs?service=git-receive-pack` and any path + ending in `/git-receive-pack`. Fetch (upload-pack) is allowed. + Push must go through `bottle.git` / git-gate, where the + gitleaks pre-receive hook runs. This holds even when no + matching `bottle.git` entry exists — the proxy is not a + scanned-push path, period. 
- **Plan rendering.** `bottle_plan.py` and the y/N preflight show: which tokens are configured (kind + ref name, not the value), the proxy port, the routes the proxy will publish. diff --git a/tests/unit/test_cred_proxy_server.py b/tests/unit/test_cred_proxy_server.py index f3f22fd..ce22889 100644 --- a/tests/unit/test_cred_proxy_server.py +++ b/tests/unit/test_cred_proxy_server.py @@ -7,6 +7,7 @@ from claude_bottle.cred_proxy_server import ( Route, build_forward_headers, filter_response_headers, + is_git_push_request, load_tokens, parse_routes, select_route, @@ -183,6 +184,49 @@ class TestFilterResponseHeaders(unittest.TestCase): self.assertNotIn("transfer-encoding", names) +class TestIsGitPushRequest(unittest.TestCase): + """git push over HTTPS goes through /info/refs?service=git-receive-pack + (capabilities probe) then POST /git-receive-pack (the push body). + Fetches use /git-upload-pack and are not blocked — the bypass we're + closing is push, since git-gate's gitleaks pre-receive is the scanner + for outbound git data.""" + + def test_push_capabilities_probe_blocked(self): + self.assertTrue(is_git_push_request( + "/gh-git/owner/repo.git/info/refs", + "service=git-receive-pack", + )) + + def test_push_body_blocked(self): + self.assertTrue(is_git_push_request( + "/gh-git/owner/repo.git/git-receive-pack", "", + )) + + def test_fetch_capabilities_allowed(self): + self.assertFalse(is_git_push_request( + "/gh-git/owner/repo.git/info/refs", + "service=git-upload-pack", + )) + + def test_fetch_body_allowed(self): + self.assertFalse(is_git_push_request( + "/gh-git/owner/repo.git/git-upload-pack", "", + )) + + def test_rest_api_allowed(self): + # tea/gh-style REST calls hit /api/v1/... — unrelated. + self.assertFalse(is_git_push_request( + "/gitea/gitea.dideric.is/api/v1/repos/x/y", "", + )) + + def test_push_with_extra_query_params(self): + # `service` may appear with other params in any order. 
+ self.assertTrue(is_git_push_request( + "/gh-git/owner/repo.git/info/refs", + "trace=1&service=git-receive-pack", + )) + + class TestLoadTokens(unittest.TestCase): def test_reads_per_route_env(self): routes = ( diff --git a/tests/unit/test_docker_cred_proxy.py b/tests/unit/test_docker_cred_proxy.py index f292996..5a0be20 100644 --- a/tests/unit/test_docker_cred_proxy.py +++ b/tests/unit/test_docker_cred_proxy.py @@ -4,6 +4,7 @@ The full docker lifecycle is exercised by integration tests; here we cover the pure helpers and the validation checks `.start` runs before touching docker.""" +import tempfile import unittest from pathlib import Path @@ -26,6 +27,8 @@ def _empty_plan(**overrides): "token_env_map": {}, "internal_network": "", "egress_network": "", + "pipelock_ca_host_path": Path(), + "pipelock_proxy_url": "", } base.update(overrides) return CredProxyPlan(**base) @@ -77,6 +80,26 @@ class TestStartGuards(unittest.TestCase): routes_path=Path("/tmp/cred-proxy-test-does-not-exist.json"), )) + def test_pipelock_url_without_ca_dies(self): + # URL set + CA path empty/missing is a wiring bug: either both + # populated (production) or both empty (test escape hatch). 
+ upstream = CredProxyUpstream( + kind="anthropic", path="/anthropic/", + upstream="https://api.anthropic.com", + auth_scheme="Bearer", token_env="CRED_PROXY_TOKEN_0", + token_ref="T", + ) + with tempfile.NamedTemporaryFile() as routes: + with self.assertRaises(Die): + self.proxy.start(_empty_plan( + upstreams=(upstream,), + internal_network="net-x", + egress_network="egress-x", + routes_path=Path(routes.name), + pipelock_proxy_url="http://pipelock:8888", + pipelock_ca_host_path=Path("/tmp/cred-proxy-no-ca.pem"), + )) + if __name__ == "__main__": unittest.main() diff --git a/tests/unit/test_pipelock_allowlist.py b/tests/unit/test_pipelock_allowlist.py index d5a5cf5..bd8bb31 100644 --- a/tests/unit/test_pipelock_allowlist.py +++ b/tests/unit/test_pipelock_allowlist.py @@ -92,9 +92,13 @@ class TestTlsPassthrough(unittest.TestCase): passthrough = pipelock_effective_tls_passthrough(_bottle({})) self.assertEqual(["api.anthropic.com"], passthrough) - def test_token_hosts_added_to_passthrough(self): - # cred-proxy validates upstream certs with the real CA bundle; - # pipelock must not MITM these or the handshake fails. + def test_token_hosts_NOT_added_to_passthrough(self): + # cred-proxy now trusts pipelock's per-bottle CA (loaded into + # its container's trust store via docker cp + update-ca- + # certificates at start time), so pipelock can MITM the + # cred-proxy -> upstream leg and body-scan it. Auto-adding + # cred-proxy hosts to passthrough would silently disable that + # second scanner for github / gitea / npm. 
passthrough = pipelock_effective_tls_passthrough(_bottle({ "tokens": [ {"Kind": "github", "TokenRef": "G"}, @@ -103,10 +107,7 @@ class TestTlsPassthrough(unittest.TestCase): "Url": "https://gitea.dideric.is"}, ], })) - for host in ("api.anthropic.com", "api.github.com", "github.com", - "registry.npmjs.org", "gitea.dideric.is"): - self.assertIn(host, passthrough) - self.assertEqual(passthrough, sorted(passthrough), "sorted") + self.assertEqual(["api.anthropic.com"], passthrough) if __name__ == "__main__": diff --git a/tests/unit/test_provision_cred_proxy.py b/tests/unit/test_provision_cred_proxy.py index dbf7730..5093cc6 100644 --- a/tests/unit/test_provision_cred_proxy.py +++ b/tests/unit/test_provision_cred_proxy.py @@ -83,6 +83,40 @@ class TestRenderGitconfig(unittest.TestCase): self.assertIn("gitea.dideric.is/", out) self.assertIn("gitea.example.com/", out) + def test_github_suppressed_when_git_gate_covers_host(self): + # When bottle.git brokers github.com over SSH, git-gate is the + # canonical git path. The cred-proxy https://github.com/ + # rewrite would let the agent push over HTTPS — bypassing + # gitleaks. Suppress it. + out = render_cred_proxy_gitconfig( + _upstreams([{"Kind": "github", "TokenRef": "GH"}]), + {"github.com"}, + ) + self.assertEqual("", out) + + def test_gitea_suppressed_when_git_gate_covers_host(self): + out = render_cred_proxy_gitconfig( + _upstreams([{"Kind": "gitea", "TokenRef": "T", + "Url": "https://gitea.dideric.is"}]), + {"gitea.dideric.is"}, + ) + self.assertEqual("", out) + + def test_partial_suppression_keeps_other_giteas(self): + # Two gitea instances; git-gate brokers one. The other still + # gets the cred-proxy rewrite. 
+ out = render_cred_proxy_gitconfig( + _upstreams([ + {"Kind": "gitea", "TokenRef": "T1", + "Url": "https://gitea.dideric.is"}, + {"Kind": "gitea", "TokenRef": "T2", + "Url": "https://gitea.example.com"}, + ]), + {"gitea.dideric.is"}, + ) + self.assertNotIn("gitea.dideric.is/", out) + self.assertIn("gitea.example.com/", out) + class TestRenderTeaConfig(unittest.TestCase): def test_empty_when_no_gitea(self): -- 2.52.0 From fcbbc4484d6ac6f4e0b0c53cc44bc737aba6fad1 Mon Sep 17 00:00:00 2001 From: didericis Date: Wed, 13 May 2026 21:49:55 -0400 Subject: [PATCH 15/24] refactor(cred_proxy): flat routes, role-driven provisioning (PRD 0010) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replace bottle.tokens (with Kind enum and hardcoded per-kind route/auth tables) with bottle.cred_proxy.routes — each route declares its own path, upstream, auth_scheme, token_ref, and optional role[]. The manifest is now the source of truth for the proxy's runtime route table; adding an upstream is a manifest edit, not a code change. Agent-side rewrites move from per-kind dispatch to per-role tags on routes: anthropic-base-url -> set ANTHROPIC_BASE_URL= npm-registry -> write ~/.npmrc registry= git-insteadof -> write ~/.gitconfig [url] insteadOf, keyed off route.upstream (suppressed when bottle.git brokers the same host) tea-login -> add a ~/.config/tea/config.yml login Roles are a list (string accepted as sugar). A gitea route typically carries ["git-insteadof", "tea-login"]. Singleton roles (anthropic-base-url, npm-registry) appear on at most one route. token_env slots are assigned per distinct TokenRef in declaration order — two routes sharing a token_ref (e.g. github API + git endpoints) share a slot. Drops: TOKEN_KINDS, _KIND_ROUTES, _KIND_AUTH_SCHEME, _TOKEN_DEFAULT_HOST, cred_proxy_route_path_for_gitea, the kind field on CredProxyUpstream, and the kind-based hardcoding in pipelock_token_hosts (now derives from route.UpstreamHost). 
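As an illustration of the slot rule stated above (one token_env slot per distinct token_ref, assigned in declaration order, with routes that share a token_ref sharing a slot), here is a minimal sketch. The helper name `assign_token_env_slots` is hypothetical, and the `CRED_PROXY_TOKEN_<n>` naming is assumed from the slot name that appears in the unit tests; the real implementation in claude_bottle/cred_proxy.py may differ.

```python
def assign_token_env_slots(token_refs):
    """Map each distinct token_ref to a slot name, in first-seen order.

    Hypothetical helper for illustration only. The slot naming
    (CRED_PROXY_TOKEN_<n>) is an assumption, not the confirmed scheme.
    """
    slots = {}
    for ref in token_refs:
        # A token_ref that was already seen keeps its existing slot,
        # so two routes with the same token_ref share one env var.
        if ref not in slots:
            slots[ref] = f"CRED_PROXY_TOKEN_{len(slots)}"
    return slots


# token_ref values in route declaration order; the two GitHub routes
# (API and git endpoints) reuse GITHUB_PAT.
refs = ["CLAUDE_BOTTLE_OAUTH_TOKEN", "GITHUB_PAT", "GITHUB_PAT", "NPM_TOKEN"]
slots = assign_token_env_slots(refs)
```

With this rule the sidecar needs one `-e` flag per distinct host env var rather than one per route, so a token shared by several routes is forwarded into the sidecar's environ only once.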
Legacy bottle.tokens manifests now die with a hint pointing at bottle.cred_proxy.routes + this PRD. Tests rewritten end-to-end. Docs + example.json + the dev ~/claude-bottle.json updated to match. --- README.md | 72 +++-- claude-bottle.example.json | 38 ++- claude_bottle/backend/docker/bottle_plan.py | 15 +- claude_bottle/backend/docker/prepare.py | 27 +- .../backend/docker/provision/cred_proxy.py | 92 +++--- claude_bottle/cred_proxy.py | 114 +++---- claude_bottle/manifest.py | 281 +++++++++++------- claude_bottle/pipelock.py | 27 +- docs/prds/0010-cred-proxy.md | 209 +++++++------ tests/integration/test_cred_proxy_sidecar.py | 1 - tests/unit/test_cred_proxy.py | 147 ++++----- tests/unit/test_docker_cred_proxy.py | 6 +- tests/unit/test_manifest_tokens.py | 245 +++++++-------- tests/unit/test_pipelock_allowlist.py | 91 +++--- tests/unit/test_provision_cred_proxy.py | 128 ++++---- 15 files changed, 798 insertions(+), 695 deletions(-) diff --git a/README.md b/README.md index 7af84b4..9326be2 100644 --- a/README.md +++ b/README.md @@ -138,17 +138,23 @@ host. `docs/prds/0008-git-gate.md`. - **cred-proxy image** — per-bottle sidecar (`python:3.13-alpine` base, stdlib-only) that holds API tokens declared in - `bottle.tokens`. The agent dials it as plain HTTP at - `http://cred-proxy:9099//...`; the proxy strips any - inbound `Authorization` header, injects the configured one using - a token held only in its own container's environ, and forwards - to the real upstream over HTTPS. SSE responses stream back - unbuffered. `ANTHROPIC_BASE_URL`, `~/.npmrc`, `~/.gitconfig` - `insteadOf` rules for `https://github.com/` and any declared - Gitea hosts, and `~/.config/tea/config.yml` all get written to - point at the proxy. The agent's `printenv` shows only those - URLs — none of the real token values. Brought up only when - `bottle.tokens` has entries. Design in + `bottle.cred_proxy.routes`. 
Each route names a `path`, + `upstream`, `auth_scheme`, and `token_ref` (host env var); the + agent dials `http://cred-proxy:9099...` over plain HTTP + and the proxy strips any inbound `Authorization`, injects + ` ` using the value held only in its own + container's environ, and forwards to the real upstream over + HTTPS. SSE responses stream back unbuffered. The cred-proxy's + outbound HTTPS routes through pipelock (it trusts pipelock's + per-bottle CA), so pipelock's egress allowlist + body scanner + apply to cred-proxy traffic the same way they apply to direct + agent traffic. Smart-HTTP push paths (`/git-receive-pack`, + `/info/refs?service=git-receive-pack`) are refused at the + proxy — push must go through `bottle.git` / git-gate where + gitleaks runs. Optional per-route `role` tags drive agent-side + rewrites: `anthropic-base-url`, `npm-registry`, `git-insteadof`, + `tea-login`. The agent's `printenv` shows only proxy URLs — + none of the real token values. Design in `docs/prds/0010-cred-proxy.md`. When the agent exits, `cli.py` tears down every sidecar that was @@ -193,18 +199,31 @@ project entries overriding home entries on key conflict). } ], - // Tokens declared here are held by a per-bottle cred-proxy - // sidecar, not the agent. Each entry names the host env var - // (`TokenRef`) the CLI reads at launch time; the value goes - // into the sidecar's environ via `docker create -e`, never - // touches argv or disk. Inside the bottle, the agent's - // ANTHROPIC_BASE_URL / npm registry / git insteadOf rules - // point at the proxy. See `docs/prds/0010-cred-proxy.md`. - "tokens": [ - { "Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN" }, - { "Kind": "github", "TokenRef": "GITHUB_PAT" }, - { "Kind": "npm", "TokenRef": "NPM_TOKEN" } - ], + // Routes declared here are held by a per-bottle cred-proxy + // sidecar, not the agent. 
Each route names a path the agent + // dials, the upstream the proxy forwards to, an auth_scheme, + // and a token_ref (host env var). The value goes into the + // sidecar's environ via `docker create -e`, never touches + // argv or disk. Optional `role` tags drive agent-side + // rewrites: `anthropic-base-url` (sets ANTHROPIC_BASE_URL), + // `npm-registry` (writes ~/.npmrc), `git-insteadof` (writes + // ~/.gitconfig), `tea-login` (writes ~/.config/tea/config.yml). + // See `docs/prds/0010-cred-proxy.md`. + "cred_proxy": { + "routes": [ + { "path": "/anthropic/", "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_ref": "CLAUDE_BOTTLE_OAUTH_TOKEN", + "role": "anthropic-base-url" }, + { "path": "/gh-api/", "upstream": "https://api.github.com", + "auth_scheme": "Bearer", "token_ref": "GITHUB_PAT" }, + { "path": "/gh-git/", "upstream": "https://github.com", + "auth_scheme": "Bearer", "token_ref": "GITHUB_PAT", + "role": "git-insteadof" }, + { "path": "/npm/", "upstream": "https://registry.npmjs.org", + "auth_scheme": "Bearer", "token_ref": "NPM_TOKEN", + "role": "npm-registry" } + ] + }, // Egress is forced through a per-agent // [pipelock](https://github.com/luckyPipewrench/pipelock) sidecar @@ -266,9 +285,10 @@ export CLAUDE_BOTTLE_OAUTH_TOKEN="" ``` By default `cli.py` forwards the token into the agent container as -`CLAUDE_CODE_OAUTH_TOKEN`. Declare an `anthropic` entry in -`bottle.tokens` to route via cred-proxy instead: the token then lives -only in the cred-proxy sidecar's environ, the agent's +`CLAUDE_CODE_OAUTH_TOKEN`. Declare a `bottle.cred_proxy.routes` entry +with `role: "anthropic-base-url"` and `token_ref: +"CLAUDE_BOTTLE_OAUTH_TOKEN"` to route via cred-proxy instead: the +token then lives only in the cred-proxy sidecar's environ, the agent's `ANTHROPIC_BASE_URL` points at the proxy, and `printenv` inside the agent does not surface the real token. Either way the value is never written to disk or placed on argv on the host. 
diff --git a/claude-bottle.example.json b/claude-bottle.example.json index 7403473..c6be907 100644 --- a/claude-bottle.example.json +++ b/claude-bottle.example.json @@ -43,13 +43,37 @@ "GIT_AUTHOR_NAME": "Eric Diderich", "NODE_ENV": "development" }, - "tokens": [ - { "Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN" }, - { "Kind": "github", "TokenRef": "GH_PAT" }, - { "Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "https://gitea.dideric.is" }, - { "Kind": "npm", "TokenRef": "NPM_TOKEN" } - ] + "cred_proxy": { + "routes": [ + { "path": "/anthropic/", + "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", + "token_ref": "CLAUDE_BOTTLE_OAUTH_TOKEN", + "role": "anthropic-base-url" }, + + { "path": "/gh-api/", + "upstream": "https://api.github.com", + "auth_scheme": "Bearer", + "token_ref": "GH_PAT" }, + { "path": "/gh-git/", + "upstream": "https://github.com", + "auth_scheme": "Bearer", + "token_ref": "GH_PAT", + "role": "git-insteadof" }, + + { "path": "/gitea/dideric/", + "upstream": "https://gitea.dideric.is", + "auth_scheme": "token", + "token_ref": "GITEA_TOKEN", + "role": ["git-insteadof", "tea-login"] }, + + { "path": "/npm/", + "upstream": "https://registry.npmjs.org", + "auth_scheme": "Bearer", + "token_ref": "NPM_TOKEN", + "role": "npm-registry" } + ] + } } }, diff --git a/claude_bottle/backend/docker/bottle_plan.py b/claude_bottle/backend/docker/bottle_plan.py index c3f2af5..286d64d 100644 --- a/claude_bottle/backend/docker/bottle_plan.py +++ b/claude_bottle/backend/docker/bottle_plan.py @@ -107,16 +107,11 @@ class DockerBottlePlan(BottlePlan): else: info(" git remotes : (none)") if self.cred_proxy_plan.upstreams: - kinds: list[str] = [] - seen: set[str] = set() - for u in self.cred_proxy_plan.upstreams: - key = u.kind if u.kind != "gitea" else f"gitea ({u.upstream})" - if key in seen: - continue - seen.add(key) - kinds.append(key) + routes = [f"{u.path}→{u.upstream}" for u in self.cred_proxy_plan.upstreams] refs = 
sorted({u.token_ref for u in self.cred_proxy_plan.upstreams}) - info(f" cred-proxy : {', '.join(kinds)}; tokens: {', '.join(refs)}") + info(f" cred-proxy : {len(routes)} route(s); tokens: {', '.join(refs)}") + for line in routes: + info(f" {line}") else: info(" cred-proxy : (none)") info(f" egress : {self.allowlist_summary}") @@ -153,11 +148,11 @@ class DockerBottlePlan(BottlePlan): ], "cred_proxy": [ { - "kind": u.kind, "path": u.path, "upstream": u.upstream, "auth_scheme": u.auth_scheme, "token_ref": u.token_ref, + "roles": list(u.roles), } for u in self.cred_proxy_plan.upstreams ], diff --git a/claude_bottle/backend/docker/prepare.py b/claude_bottle/backend/docker/prepare.py index 66d6d76..46a81ea 100644 --- a/claude_bottle/backend/docker/prepare.py +++ b/claude_bottle/backend/docker/prepare.py @@ -89,21 +89,32 @@ def resolve_plan( # never lands on argv or in env_file) goes into one dict. Nothing # mutates the host os.environ. forwarded_env: dict[str, str] = dict(resolved.forwarded) - has_anthropic_token = any(t.Kind == "anthropic" for t in bottle.tokens) - if spec.forward_oauth_token and not has_anthropic_token: + # Find the (at most one) cred-proxy route claiming the + # anthropic-base-url role. Manifest validation enforces the + # singleton constraint. + anthropic_route = next( + (u for u in cred_proxy_plan.upstreams if "anthropic-base-url" in u.roles), + None, + ) + if spec.forward_oauth_token and anthropic_route is None: # Pre-PRD 0010 behavior: agent reads CLAUDE_CODE_OAUTH_TOKEN - # directly. Still the path when bottle.tokens has no anthropic - # entry; the cred-proxy sidecar holds the token otherwise. + # directly. Still the path when no cred_proxy.routes entry + # is tagged anthropic-base-url; otherwise the sidecar holds + # the token. forwarded_env["CLAUDE_CODE_OAUTH_TOKEN"] = os.environ["CLAUDE_BOTTLE_OAUTH_TOKEN"] - if has_anthropic_token: + if anthropic_route is not None: # Point claude-code at the cred-proxy. 
The sidecar holds the - # OAuth token; the agent's environ does not. - forwarded_env["ANTHROPIC_BASE_URL"] = f"{cred_proxy_url()}/anthropic" + # OAuth token; the agent's environ does not. Strip the + # trailing slash so claude-code's path-join produces e.g. + # http://cred-proxy:9099/anthropic/v1/messages. + forwarded_env["ANTHROPIC_BASE_URL"] = ( + f"{cred_proxy_url()}{anthropic_route.path}".rstrip("/") + ) # claude-code refuses to start without *some* credential in # its env. The proxy strips inbound Authorization on every # request and injects the real one — so a non-secret # placeholder is sufficient and the SC1 test still holds - # (the placeholder is not a `bottle.tokens[].TokenRef` + # (the placeholder is not a `cred_proxy.routes[].TokenRef` # value). The agent cannot exfiltrate this string because # it carries no meaning to api.anthropic.com. forwarded_env["CLAUDE_CODE_OAUTH_TOKEN"] = "cred-proxy-placeholder" diff --git a/claude_bottle/backend/docker/provision/cred_proxy.py b/claude_bottle/backend/docker/provision/cred_proxy.py index e946be3..a375cd1 100644 --- a/claude_bottle/backend/docker/provision/cred_proxy.py +++ b/claude_bottle/backend/docker/provision/cred_proxy.py @@ -46,14 +46,17 @@ def provision_cred_proxy(plan: DockerBottlePlan, target: str) -> None: def render_npmrc(upstreams: tuple[CredProxyUpstream, ...]) -> str: - """Render `~/.npmrc` content. No-op (empty string) when no npm - route is declared, so callers can branch on emptiness. + """Render `~/.npmrc` content. Driven by the `npm-registry` role: + finds the (single) route that claims it and writes a registry= + line at the proxy. Empty string when no such route exists, so + callers can branch on emptiness. The proxy strips inbound Authorization and injects its own — the npmrc deliberately carries no `_authToken`. The registry alone - is enough.""" + is enough. 
Manifest validation enforces that the role is a + singleton, so the first match is the only match.""" for u in upstreams: - if u.kind == "npm": + if "npm-registry" in u.roles: return f"registry={cred_proxy_url()}{u.path}\n" return "" @@ -89,40 +92,37 @@ def render_cred_proxy_gitconfig( git_gate_hosts: set[str] = frozenset(), # type: ignore[assignment] ) -> str: """Render the `~/.gitconfig` fragment for cred-proxy insteadOf - rewrites. Empty string when no github / gitea routes are declared. + rewrites. Driven by the `git-insteadof` role: each route that + claims it produces a `[url ""] insteadOf = + /` block. Empty string when no such route exists. - The rewrite is suppressed for any host that's also declared in - `bottle.git`. git-gate is the canonical git path on those hosts — - its pre-receive runs gitleaks before forwarding the push. A - cred-proxy https:/// rewrite would route HTTPS git ops - around the gate. cred-proxy still refuses smart-HTTP push at - runtime (defense in depth), but suppressing the rewrite means - `git clone https:///...` doesn't have a tempting shortcut - that just confuses on push. + The rewrite is suppressed for any route whose upstream host is + also declared in `bottle.git`. git-gate is the canonical git + path on those hosts — its pre-receive runs gitleaks before + forwarding the push. A cred-proxy `https:///` rewrite + would route HTTPS git ops around the gate. cred-proxy still + refuses smart-HTTP push at runtime (defense in depth), but + suppressing the rewrite means `git clone https:///...` + doesn't have a tempting shortcut that just confuses on push. - github expands to one rewrite (https://github.com/... → /gh-git/..., - the git transport endpoint); /gh-api/ stays unmapped here because - tools call api.github.com directly rather than through git. 
- Gitea entries get one rewrite per declared host.""" + The insteadOf left-hand side comes from `upstream` (with a + trailing `/` so insteadOf matches at the directory boundary), + so the same renderer handles github.com, gitea.dideric.is, and + any future host the user wires up.""" rules: list[str] = [] for u in upstreams: - if u.kind == "github" and u.path == "/gh-git/": - if "github.com" in git_gate_hosts: - continue - rules.append( - f'[url "{cred_proxy_url()}/gh-git/"]\n' - f"\tinsteadOf = https://github.com/\n" - ) - elif u.kind == "gitea": - # u.path is /gitea//; derive the host the same way - # the route table did so we match git_gate's UpstreamHost. - host = u.path[len("/gitea/"):].rstrip("/") - if host in git_gate_hosts: - continue - rules.append( - f'[url "{cred_proxy_url()}{u.path}"]\n' - f"\tinsteadOf = {u.upstream}/\n" - ) + if "git-insteadof" not in u.roles: + continue + # Strip scheme to derive the host for the git-gate overlap + # check. urllib.parse-free parse: same shape we accept in + # manifest validation. + host = u.upstream.removeprefix("https://").partition("/")[0].partition(":")[0] + if host in git_gate_hosts: + continue + rules.append( + f'[url "{cred_proxy_url()}{u.path}"]\n' + f"\tinsteadOf = {u.upstream}/\n" + ) if not rules: return "" return ( @@ -180,19 +180,21 @@ def _provision_gitconfig( def render_tea_config(upstreams: tuple[CredProxyUpstream, ...]) -> str: - """Render `~/.config/tea/config.yml`. One `logins:` entry per - gitea route, pointing at the cred-proxy. The proxy substitutes - the real token; the value in `token:` here is a placeholder and - is replaced by the proxy on every request, but `tea` won't make - calls without a non-empty token field.""" - giteas = [u for u in upstreams if u.kind == "gitea"] - if not giteas: + """Render `~/.config/tea/config.yml`. Driven by the `tea-login` + role: each route that claims it produces one `logins:` entry + pointing at the cred-proxy. 
The proxy substitutes the real + token at request time; the value in `token:` here is a + placeholder. `tea` refuses to make calls without a non-empty + token field, so the placeholder is necessary.""" + tea_routes = [u for u in upstreams if "tea-login" in u.roles] + if not tea_routes: return "" lines = ["logins:"] - for u in giteas: - # Derive a stable login name from the host (the part of the - # path between /gitea/ and the trailing /). - host = u.path[len("/gitea/"):].rstrip("/") + for u in tea_routes: + # Derive a stable login name from the upstream host. The + # path may not encode the host (e.g. `/gitea/dideric/` vs + # upstream gitea.dideric.is), so we read it off `upstream`. + host = u.upstream.removeprefix("https://").partition("/")[0].partition(":")[0] lines.extend([ f"- name: {host}", f" url: {cred_proxy_url()}{u.path}", diff --git a/claude_bottle/cred_proxy.py b/claude_bottle/cred_proxy.py index 672c9d5..1bd4492 100644 --- a/claude_bottle/cred_proxy.py +++ b/claude_bottle/cred_proxy.py @@ -28,34 +28,37 @@ from dataclasses import dataclass from pathlib import Path from .log import die -from .manifest import Bottle, TokenEntry +from .manifest import Bottle @dataclass(frozen=True) class CredProxyUpstream: """One route on the cred-proxy sidecar. Maps a path under the - proxy to a real upstream, an auth scheme, and the env-var slot - that holds the token inside the proxy container. + proxy to a real upstream, an auth scheme, an in-container env-var + slot, and optional provisioner roles. - `kind` is the originating `TokenEntry.Kind`; `path` is the agent- - facing prefix (e.g. `/anthropic/`); `upstream` is the upstream - base URL with scheme; `auth_scheme` is the literal word that - precedes the token in the injected header (`Bearer` for all kinds - except `gitea`, which uses `token` to sidestep go-gitea/gitea#16734). + `path` is the agent-facing prefix (e.g. `/anthropic/`). + `upstream` is the upstream base URL with scheme. 
`auth_scheme` + is the literal word that precedes the token in the injected + header (`Bearer` for most upstreams; `token` for Gitea — + sidesteps go-gitea/gitea#16734). `token_env` is the env-var name inside the cred-proxy container (e.g. `CRED_PROXY_TOKEN_0`); `token_ref` is the host env var the CLI reads at launch and forwards into the container's environ - under `token_env`. Two routes that share a TokenRef (the github - Kind expands into two routes — gh-api and gh-git) carry the same - `token_env`.""" + under `token_env`. Routes that share a TokenRef coalesce to one + `token_env` slot. + + `roles` are the provisioner tags from the manifest route (see + `manifest.CRED_PROXY_ROLES`). Each tag drives one agent-side + rewrite when this upstream's dotfile family is written.""" - kind: str path: str upstream: str auth_scheme: str token_env: str token_ref: str + roles: tuple[str, ...] = () @dataclass(frozen=True) @@ -93,64 +96,35 @@ class CredProxyPlan: pipelock_proxy_url: str = "" -# Hardcoded upstream URLs for the non-gitea Kinds. Gitea's URL is per- -# entry (`TokenEntry.Url`). -_KIND_ROUTES: dict[str, tuple[tuple[str, str], ...]] = { - # kind -> ((path, upstream), ...) — a Kind can produce multiple - # routes; today only `github` does (api + git endpoints). - "anthropic": (("/anthropic/", "https://api.anthropic.com"),), - "github": ( - ("/gh-api/", "https://api.github.com"), - ("/gh-git/", "https://github.com"), - ), - "npm": (("/npm/", "https://registry.npmjs.org"),), -} - -# Per-Kind auth header value prefix. Gitea uses `token` (not Bearer); -# everyone else uses Bearer. -_KIND_AUTH_SCHEME: dict[str, str] = { - "anthropic": "Bearer", - "github": "Bearer", - "gitea": "token", - "npm": "Bearer", -} - - -def cred_proxy_route_path_for_gitea(host: str) -> str: - """Agent-facing path for a single Gitea instance. 
The host segment - disambiguates routes when multiple gitea entries are declared.""" - return f"/gitea/{host}/" - - def cred_proxy_upstreams_for_bottle( bottle: Bottle, ) -> tuple[CredProxyUpstream, ...]: - """Lift every `bottle.tokens[]` entry into one or more - CredProxyUpstreams. Order is preserved so route lookup is stable. - Manifest validation already enforced uniqueness rules.""" + """Lift each `bottle.cred_proxy.routes[]` entry into a + CredProxyUpstream. Order is preserved so route lookup is stable. + + Token-env slots are assigned per distinct TokenRef: the first + route with TokenRef "GH_PAT" gets `CRED_PROXY_TOKEN_0`; a second + route with the same TokenRef shares slot 0. The launch step + forwards each TokenRef's value from the host environ into the + sidecar's environ under the matching slot name once. + + Manifest validation already enforced uniqueness rules (no + duplicate paths, singleton-role enforcement).""" out: list[CredProxyUpstream] = [] - for i, t in enumerate(bottle.tokens): - token_env = f"CRED_PROXY_TOKEN_{i}" - scheme = _KIND_AUTH_SCHEME[t.Kind] - if t.Kind == "gitea": - out.append(CredProxyUpstream( - kind="gitea", - path=cred_proxy_route_path_for_gitea(t.UpstreamHost), - upstream=t.Url.rstrip("/"), - auth_scheme=scheme, - token_env=token_env, - token_ref=t.TokenRef, - )) - else: - for path, upstream in _KIND_ROUTES[t.Kind]: - out.append(CredProxyUpstream( - kind=t.Kind, - path=path, - upstream=upstream, - auth_scheme=scheme, - token_env=token_env, - token_ref=t.TokenRef, - )) + slot_for_token: dict[str, str] = {} + for r in bottle.cred_proxy.routes: + token_env = slot_for_token.get(r.TokenRef) + if token_env is None: + token_env = f"CRED_PROXY_TOKEN_{len(slot_for_token)}" + slot_for_token[r.TokenRef] = token_env + out.append(CredProxyUpstream( + path=r.Path, + upstream=r.Upstream.rstrip("/"), + auth_scheme=r.AuthScheme, + token_env=token_env, + token_ref=r.TokenRef, + roles=r.Role, + )) return tuple(out) @@ -212,14 +186,14 @@ def 
cred_proxy_resolve_token_values( if value is None: die( f"cred-proxy: host env var '{token_ref}' is unset. Set it " - f"before launching, or remove the corresponding token entry " - f"from bottle.tokens." + f"before launching, or remove the corresponding route from " + f"bottle.cred_proxy.routes." ) if not value: die( f"cred-proxy: host env var '{token_ref}' is empty. The " f"cred-proxy will not inject an empty token; set it to the " - f"real value or remove the token entry." + f"real value or remove the route." ) out[token_env] = value return out @@ -269,10 +243,8 @@ __all__ = [ "CredProxy", "CredProxyPlan", "CredProxyUpstream", - "TokenEntry", "cred_proxy_render_routes", "cred_proxy_resolve_token_values", - "cred_proxy_route_path_for_gitea", "cred_proxy_token_env_map", "cred_proxy_upstreams_for_bottle", ] diff --git a/claude_bottle/manifest.py b/claude_bottle/manifest.py index 58abc47..dc332fe 100644 --- a/claude_bottle/manifest.py +++ b/claude_bottle/manifest.py @@ -5,10 +5,10 @@ Schema (see CLAUDE.md "Intended design"): { "bottles": { "": { - "env": { "": , ... }, - "git": [ , ... ], - "tokens": [ , ... ], - "egress": { "allowlist": [ "", ... ] } + "env": { "": , ... }, + "git": [ , ... ], + "cred_proxy": { "routes": [ , ... ] }, + "egress": { "allowlist": [ "", ... ] } } }, "agents": { @@ -114,92 +114,152 @@ class GitEntry: ) -TOKEN_KINDS = ("anthropic", "github", "gitea", "npm") +CRED_PROXY_AUTH_SCHEMES = ("Bearer", "token") + +# Provisioner role tags a route may carry. Each tag drives one +# agent-side rewrite when the cred-proxy sidecar comes up. +# anthropic-base-url: set ANTHROPIC_BASE_URL= +# npm-registry: write ~/.npmrc registry= +# git-insteadof: write ~/.gitconfig [url ""] +# insteadOf = / +# tea-login: add an entry to ~/.config/tea/config.yml +# (login url = ) +# Routes without a `role` are pure proxy entries with no agent-side +# rewrite — useful for upstreams whose tools the user wires up by +# hand. 
+CRED_PROXY_ROLES = frozenset({ + "anthropic-base-url", + "npm-registry", + "git-insteadof", + "tea-login", +}) + +# Roles whose semantics imply a single route can carry them. A second +# route claiming the same role would make the provisioner's choice +# ambiguous (which path goes into ANTHROPIC_BASE_URL?). +CRED_PROXY_SINGLETON_ROLES = frozenset({ + "anthropic-base-url", + "npm-registry", +}) @dataclass(frozen=True) -class TokenEntry: - """One credential the per-bottle cred-proxy sidecar (PRD 0010) - holds and injects on the agent's behalf. +class CredProxyRoute: + """One route on the per-bottle cred-proxy sidecar (PRD 0010). - `Kind` selects the route handler: `anthropic` / `github` / `npm` - have fixed upstream URLs; `gitea` requires an explicit `Url` - because the upstream is per-instance. + The agent dials `http://cred-proxy:...`; the sidecar + strips any inbound `Authorization` header, injects + ` ` using the value of the host env var named + by `TokenRef`, and forwards the rest of the request to `Upstream`. - `TokenRef` is the name of the host env var the CLI resolves at - launch time. The value is forwarded into the cred-proxy - container's environ via `docker run -e NAME` — never onto argv, - never into a file. The value does NOT land in the agent's - environ. + `Path` is the agent-facing prefix (must start and end with `/`). + `Upstream` is the upstream base URL (https only) — the request + path after `Path` is appended to it. `AuthScheme` is the literal + word that precedes the token in the injected header (`Bearer` for + most upstreams, `token` for Gitea — sidesteps go-gitea/gitea#16734). + `TokenRef` names the host env var holding the credential value; + the CLI reads it at launch and forwards into the sidecar's environ. + `Role` carries optional provisioner tags (see CRED_PROXY_ROLES). - `UpstreamHost` is parsed from `Url` for `gitea` entries (or the - documented default for the other kinds). 
It exists so the - cross-validator can spot collisions with `bottle.git` upstreams - without re-parsing URLs at every call site.""" + `UpstreamHost` is parsed from `Upstream` for the pipelock allowlist + + the git-insteadof suppression check.""" - Kind: str + Path: str + Upstream: str + AuthScheme: str TokenRef: str - Url: str = "" + Role: tuple[str, ...] = () UpstreamHost: str = "" @classmethod - def from_dict(cls, bottle_name: str, idx: int, raw: object) -> "TokenEntry": - d = _as_json_object(raw, f"bottle '{bottle_name}' tokens[{idx}]") - kind = d.get("Kind") - if not isinstance(kind, str) or not kind: + def from_dict(cls, bottle_name: str, idx: int, raw: object) -> "CredProxyRoute": + label = f"bottle '{bottle_name}' cred_proxy.routes[{idx}]" + d = _as_json_object(raw, label) + path = d.get("path") + if not isinstance(path, str) or not path: + die(f"{label} missing required string field 'path'") + if not (path.startswith("/") and path.endswith("/")): + die(f"{label} path {path!r} must start and end with '/'") + upstream = d.get("upstream") + if not isinstance(upstream, str) or not upstream: + die(f"{label} missing required string field 'upstream'") + host = _parse_https_host(upstream, f"{label} upstream") + auth_scheme = d.get("auth_scheme") + if not isinstance(auth_scheme, str) or not auth_scheme: + die(f"{label} missing required string field 'auth_scheme'") + if auth_scheme not in CRED_PROXY_AUTH_SCHEMES: die( - f"bottle '{bottle_name}' tokens[{idx}] missing required string field " - f"'Kind'" + f"{label} auth_scheme {auth_scheme!r} is not one of " + f"{', '.join(CRED_PROXY_AUTH_SCHEMES)}" ) - if kind not in TOKEN_KINDS: - die( - f"bottle '{bottle_name}' tokens[{idx}] Kind {kind!r} is not one of " - f"{', '.join(TOKEN_KINDS)}" - ) - token_ref = d.get("TokenRef") + token_ref = d.get("token_ref") if not isinstance(token_ref, str) or not token_ref: die( - f"bottle '{bottle_name}' tokens[{idx}] ({kind}) missing required " - f"string field 'TokenRef' (name of the 
host env var to forward)" + f"{label} missing required string field 'token_ref' " + f"(name of the host env var holding the token value)" ) - url_raw = d.get("Url") - if url_raw is None: - url = "" - elif isinstance(url_raw, str): - url = url_raw + role_raw = d.get("role") + roles: tuple[str, ...] = () + if role_raw is None: + roles = () + elif isinstance(role_raw, str): + roles = (role_raw,) + elif isinstance(role_raw, list): + role_list = cast(list[object], role_raw) + collected: list[str] = [] + for r in role_list: + if not isinstance(r, str): + die(f"{label} role items must be strings (got {type(r).__name__})") + collected.append(r) + roles = tuple(collected) else: die( - f"bottle '{bottle_name}' tokens[{idx}] ({kind}) Url must be a string " - f"(was {type(url_raw).__name__})" + f"{label} role must be a string or a list of strings " + f"(was {type(role_raw).__name__})" ) - if kind == "gitea": - if not url: + for r in roles: + if r not in CRED_PROXY_ROLES: die( - f"bottle '{bottle_name}' tokens[{idx}] (gitea) requires a Url " - f"(the Gitea instance, e.g. https://gitea.dideric.is)" + f"{label} role {r!r} is not one of " + f"{', '.join(sorted(CRED_PROXY_ROLES))}" ) - host = _parse_https_host( - url, f"bottle '{bottle_name}' tokens[{idx}] (gitea) Url" - ) - else: - if url: - die( - f"bottle '{bottle_name}' tokens[{idx}] ({kind}) cannot set Url; " - f"the upstream for this Kind is fixed by cred-proxy. Drop the " - f"'Url' field." - ) - host = _TOKEN_DEFAULT_HOST[kind] - return cls(Kind=kind, TokenRef=token_ref, Url=url, UpstreamHost=host) + return cls( + Path=path, + Upstream=upstream, + AuthScheme=auth_scheme, + TokenRef=token_ref, + Role=roles, + UpstreamHost=host, + ) -# Hostnames the cred-proxy talks to upstream for the non-gitea kinds. -# Used both for the proxy's route table and for the manifest cross- -# validator that rejects overlap with `bottle.git`. 
-_TOKEN_DEFAULT_HOST: dict[str, str] = { - "anthropic": "api.anthropic.com", - "github": "github.com", - "npm": "registry.npmjs.org", -} +@dataclass(frozen=True) +class CredProxyConfig: + """Per-bottle cred-proxy configuration. Today this is just the + route table; the nesting under `cred_proxy:` leaves room for + per-bottle proxy settings (port override, log level, etc.) in + follow-ups.""" + + routes: tuple[CredProxyRoute, ...] = () + + @classmethod + def from_dict(cls, bottle_name: str, raw: object) -> "CredProxyConfig": + d = _as_json_object(raw, f"bottle '{bottle_name}' cred_proxy") + routes_raw = d.get("routes") + routes: tuple[CredProxyRoute, ...] = () + if routes_raw is not None: + if not isinstance(routes_raw, list): + die( + f"bottle '{bottle_name}' cred_proxy.routes must be an array " + f"(was {type(routes_raw).__name__})" + ) + routes_list = cast(list[object], routes_raw) + routes = tuple( + CredProxyRoute.from_dict(bottle_name, i, entry) + for i, entry in enumerate(routes_list) + ) + _validate_cred_proxy_routes(bottle_name, routes) + return cls(routes=routes) DLP_ACTIONS = ("block", "warn") @@ -257,7 +317,7 @@ class BottleEgress: class Bottle: env: Mapping[str, str] = field(default_factory=_empty_str_dict) git: tuple[GitEntry, ...] = () - tokens: tuple[TokenEntry, ...] = () + cred_proxy: CredProxyConfig = field(default_factory=CredProxyConfig) egress: BottleEgress = field(default_factory=BottleEgress) @classmethod @@ -305,20 +365,19 @@ class Bottle: ) _validate_unique_git_names(name, git) - tokens: tuple[TokenEntry, ...] = () - tokens_raw = d.get("tokens") - if tokens_raw is not None: - if not isinstance(tokens_raw, list): - die( - f"bottle '{name}' tokens must be an array " - f"(was {type(tokens_raw).__name__})" - ) - tokens_list = cast(list[object], tokens_raw) - tokens = tuple( - TokenEntry.from_dict(name, i, entry) - for i, entry in enumerate(tokens_list) + if "tokens" in d: + die( + f"bottle '{name}' has a 'tokens' field. 
The shape was reworked: " + f"each route now lives under 'cred_proxy.routes' with explicit " + f"path / upstream / auth_scheme / token_ref / role[]. See " + f"docs/prds/0010-cred-proxy.md." ) - _validate_tokens(name, tokens, git) + + cred_proxy = ( + CredProxyConfig.from_dict(name, d["cred_proxy"]) + if "cred_proxy" in d + else CredProxyConfig() + ) egress_raw = d.get("egress") egress = ( @@ -327,7 +386,7 @@ class Bottle: else BottleEgress() ) - return cls(env=env, git=git, tokens=tokens, egress=egress) + return cls(env=env, git=git, cred_proxy=cred_proxy, egress=egress) @dataclass(frozen=True) @@ -561,41 +620,41 @@ def _parse_https_host(url: str, label: str) -> str: return host -def _validate_tokens( +def _validate_cred_proxy_routes( bottle_name: str, - tokens: tuple[TokenEntry, ...], - git: tuple[GitEntry, ...], + routes: tuple[CredProxyRoute, ...], ) -> None: - """Cross-validation for `bottle.tokens`: + """Cross-validation for `bottle.cred_proxy.routes`: - - At most one entry per Kind, except `gitea` which may have - multiple entries (one per Gitea instance) with distinct Urls. + - Paths must be unique within the bottle (the proxy routes by + longest-prefix match; duplicate paths leave the choice + undefined). + - Singleton roles (`anthropic-base-url`, `npm-registry`) may + appear on at most one route — the provisioner uses them to + write a single dotfile entry, so two routes claiming the role + would make the choice ambiguous. - A `github` or `gitea` token MAY name the same host as a - `bottle.git` entry: the two paths broker different protocols - (git-gate handles SSH push/fetch with an IdentityFile; cred-proxy - handles HTTPS REST API calls with a PAT), so declaring both on - one host is a legitimate dev setup, not a configuration error. + No cross-validation against `bottle.git` is performed. 
git-gate + (SSH push/fetch) and cred-proxy (HTTPS REST + git smart-HTTP + fetch) broker different protocols; declaring both on the same + host is a legitimate dev setup. """ - del git # cross-host overlap is intentionally not rejected. - by_kind: dict[str, list[TokenEntry]] = {} - for t in tokens: - by_kind.setdefault(t.Kind, []).append(t) - for kind, entries in by_kind.items(): - if kind == "gitea": - seen: dict[str, None] = {} - for e in entries: - if e.Url in seen: - die( - f"bottle '{bottle_name}' tokens has duplicate gitea Url " - f"{e.Url!r}; one entry per Gitea instance." - ) - seen[e.Url] = None - elif len(entries) > 1: + seen_paths: dict[str, None] = {} + for r in routes: + if r.Path in seen_paths: die( - f"bottle '{bottle_name}' tokens has {len(entries)} entries with " - f"Kind {kind!r}; at most one is allowed (gitea is the only Kind " - f"that may have multiple entries)." + f"bottle '{bottle_name}' cred_proxy.routes has duplicate path " + f"{r.Path!r}; each path must be unique on the proxy." + ) + seen_paths[r.Path] = None + for role in CRED_PROXY_SINGLETON_ROLES: + with_role = [r for r in routes if role in r.Role] + if len(with_role) > 1: + paths = ", ".join(r.Path for r in with_role) + die( + f"bottle '{bottle_name}' cred_proxy.routes has {len(with_role)} " + f"routes with role {role!r} (paths: {paths}); this role drives a " + f"single agent-side rewrite — pick one." ) diff --git a/claude_bottle/pipelock.py b/claude_bottle/pipelock.py index 6b8abf0..3ae11e0 100644 --- a/claude_bottle/pipelock.py +++ b/claude_bottle/pipelock.py @@ -57,27 +57,18 @@ def pipelock_bottle_allowlist(bottle: Bottle) -> list[str]: def pipelock_token_hosts(bottle: Bottle) -> list[str]: """Hostnames the cred-proxy sidecar (PRD 0010) talks to upstream - on the agent's behalf. Derived from `bottle.tokens[]`. Returned + on the agent's behalf. Derived from each route's + `upstream.UpstreamHost` in `bottle.cred_proxy.routes`. Returned sorted+deduped. 
These hosts must be on pipelock's allowlist so cred-proxy's - outbound HTTPS traffic can leave the egress network, and on - pipelock's TLS-passthrough list so pipelock does not MITM them — - cred-proxy validates real upstream certs with the system CA store, - so a pipelock-bumped cert would fail trust.""" - hosts: set[str] = set() - for t in bottle.tokens: - if t.Kind == "github": - hosts.add("api.github.com") - hosts.add("github.com") - elif t.Kind == "gitea": - if t.UpstreamHost: - hosts.add(t.UpstreamHost) - elif t.Kind == "npm": - hosts.add("registry.npmjs.org") - elif t.Kind == "anthropic": - # Already on DEFAULT_ALLOWLIST + DEFAULT_TLS_PASSTHROUGH. - hosts.add("api.anthropic.com") + outbound HTTPS traffic can leave the egress network. They are + NOT auto-added to passthrough_domains: cred-proxy's HTTPS client + trusts pipelock's per-bottle CA at runtime (installed via + docker cp + update-ca-certificates in the cred-proxy image), + so pipelock MITMs and body-scans the cred-proxy → upstream leg + the same way it does direct agent traffic.""" + hosts = {r.UpstreamHost for r in bottle.cred_proxy.routes if r.UpstreamHost} return sorted(hosts) diff --git a/docs/prds/0010-cred-proxy.md b/docs/prds/0010-cred-proxy.md index 5c92ef5..35789ec 100644 --- a/docs/prds/0010-cred-proxy.md +++ b/docs/prds/0010-cred-proxy.md @@ -51,12 +51,13 @@ This PRD is the build. ## Goals / Success Criteria Each test runs inside a bottle whose manifest declares the four -supported kinds (anthropic, github, gitea, npm): +common upstreams (Anthropic, GitHub, Gitea, npm) as +`bottle.cred_proxy.routes` entries: 1. **No plaintext tokens in the agent's environ.** `printenv` and `cat /proc/self/environ` from the agent's shell return only URLs pointing at `cred-proxy:/...`. None of the - `bottle.tokens[].TokenRef` values appear. + `cred_proxy.routes[].token_ref` host env-var values appear. 2. 
**Container boundary holds.** From the agent's shell, `ps aux` does not list the cred-proxy process; there is no `/proc/` entry for it to read. The sidecar's hostname (`cred-proxy`) @@ -67,9 +68,11 @@ supported kinds (anthropic, github, gitea, npm): `cred-proxy:/anthropic`. SSE chunks arrive without buffering; `anthropic-version`, `anthropic-beta`, and `X-Claude-Code-Session-Id` headers round-trip untouched. -4. **Git push to declared remotes works.** `git push` against a - `bottle.tokens[].Kind: github` or `gitea` upstream succeeds; - the upstream sees the gate's token, not the agent's. +4. **`tea` / REST API against declared upstreams works.** + `tea pr list` against a route's upstream succeeds; the + upstream sees the proxy-injected token, not the agent's. + `git push` is *not* on the cred-proxy path — that goes + through `bottle.git` / git-gate (where gitleaks runs). 5. **npm install works.** `npm install ` succeeds against the registry pointed at the proxy. A scoped install that requires the token (e.g. against a private @@ -78,7 +81,10 @@ supported kinds (anthropic, github, gitea, npm): If the agent tries to send its own `Authorization: …` header, the proxy strips and replaces with the configured one. A manifest token revoked at the upstream produces a 401 to the - agent, not a 5xx. + agent, not a 5xx. Git smart-HTTP push paths + (`/git-receive-pack`, `/info/refs?service=git-receive-pack`) + return 403 unconditionally — push must go through git-gate's + gitleaks-scanned SSH path. ## Non-goals @@ -113,35 +119,49 @@ supported kinds (anthropic, github, gitea, npm): ### In scope -- **Manifest field.** `bottle.tokens: [TokenEntry, ...]`. Each - entry carries `Kind` (`anthropic` | `github` | `gitea` | - `npm`), an optional `Url` (required for `gitea`, defaulted for - the others), and `TokenRef` (the name of a host env var the - CLI resolves at launch time). +- **Manifest field.** `bottle.cred_proxy.routes: [Route, ...]`. 
+ Each route carries `path` (agent-facing prefix), `upstream` + (HTTPS upstream URL), `auth_scheme` (`Bearer` or `token`), + `token_ref` (name of a host env var the CLI resolves at launch + time), and an optional `role` (string or list of strings — see + "Agent-side rewrites" below). Routes are independent — there is + no `Kind` enum or per-kind hardcoded path/upstream mapping; the + manifest is the source of truth for the proxy's runtime route + table. - **cred-proxy sidecar.** Runs as its own container on the bottle's internal docker network with hostname `cred-proxy`, listening on `0.0.0.0:` bound to the internal interface. No host port published. Holds the tokens in the sidecar container's environ — never on argv, never written to disk. - Per-`Kind` route handler: inject the right header, forward - over TLS, stream the response back without buffering. -- **Agent-side rewrites.** Provisioner writes: - - `ANTHROPIC_BASE_URL=http://cred-proxy:/anthropic` to - the agent's environ - - `~/.npmrc` `registry = http://cred-proxy:/npm/` - - `~/.gitconfig` `[url …] insteadOf = …` for each declared - `github` / `gitea` upstream, **except** when a `bottle.git` - entry already brokers the same host. git-gate is the canonical - git path on those hosts — its pre-receive runs gitleaks before - forwarding the push; a cred-proxy `https:///` rewrite - would route HTTPS git ops around the gate, and `git push` over - HTTPS to the same host via cred-proxy carries no gitleaks - equivalent. (cred-proxy independently refuses smart-HTTP push + Per-route handler: inject the configured header, forward over + TLS, stream the response back without buffering. +- **Agent-side rewrites.** A route's `role` (string or list of + strings) drives optional agent-side dotfile/env writes when the + sidecar comes up. Known roles: + - `anthropic-base-url` (singleton): sets + `ANTHROPIC_BASE_URL=http://cred-proxy:` in + the agent's environ. Used for the Anthropic OAuth path. 
+ - `npm-registry` (singleton): writes + `registry=http://cred-proxy:` to `~/.npmrc`. + - `git-insteadof`: writes a `[url "http://cred-proxy:"] + insteadOf = /` block to `~/.gitconfig`. + Suppressed when `bottle.git` already brokers the same host: + git-gate is the canonical git path there — its pre-receive + runs gitleaks before forwarding pushes; a cred-proxy + `https:///` rewrite would route HTTPS git ops around + the gate. (cred-proxy independently refuses smart-HTTP push paths at runtime — see "Smart-HTTP push refused" below — but suppressing the rewrite means `git clone https:///...` doesn't have a tempting shortcut that just confuses later.) - - `~/.config/tea/config.yml` with the proxy URL for each - declared `gitea` entry + - `tea-login`: adds a `logins:` entry to + `~/.config/tea/config.yml` pointing at the proxy. Used for + Gitea instances; combine with `git-insteadof` for full agent + coverage. + + Routes without a `role` are pure proxy entries — the proxy + handles them at runtime, but no agent-side rewrite happens. The + singleton roles must appear on at most one route per bottle + (manifest validation enforces this). - **Sidecar lifecycle.** Mirrors `DockerGitGate` / `DockerPipelockProxy` in shape: `prepare` is host-side and side-effect-free; `start` does `docker create` + `docker start` @@ -282,80 +302,98 @@ Why the agent can't reach the sidecar's environ: ### Existing code touched -- **`claude_bottle/manifest.py`** — add `TokenEntry`, - `Bottle.tokens: tuple[TokenEntry, ...] = ()`, parse + validate - (at most one entry per `Kind` except `gitea`, which may - carry multiple Urls). -- **`claude_bottle/backend/docker/prepare.py`** — delete the - `CLAUDE_BOTTLE_OAUTH_TOKEN` → `CLAUDE_CODE_OAUTH_TOKEN` branch - in the agent's forwarded env. The OAuth token is forwarded - into the cred-proxy sidecar's environ at sidecar `docker create` - time instead. 
+- **`claude_bottle/manifest.py`** — add `CredProxyRoute`, + `CredProxyConfig`, `Bottle.cred_proxy: CredProxyConfig`. Parse + + validate route shape, role enum, path uniqueness, singleton- + role constraints. +- **`claude_bottle/backend/docker/prepare.py`** — switch the + agent's OAuth handling: when a route claims the + `anthropic-base-url` role, write `ANTHROPIC_BASE_URL` (pointing + at the proxy) plus a non-secret placeholder for + `CLAUDE_CODE_OAUTH_TOKEN` (claude-code refuses to start + otherwise; the proxy strips & replaces on every request). + When no such route exists, fall back to the pre-PRD-0010 path + (forward `CLAUDE_BOTTLE_OAUTH_TOKEN` as `CLAUDE_CODE_OAUTH_TOKEN`). - **`claude_bottle/backend/docker/backend.py`** — instantiate `DockerCredProxy` alongside `DockerPipelockProxy` and `DockerGitGate`; thread its `prepare` / `start` / `stop` through `resolve_plan` / `launch`. - **`claude_bottle/backend/docker/launch.py`** — add cred-proxy - start/stop to the `ExitStack` alongside pipelock and git-gate; - the sidecar must be up before the agent container starts so - DNS resolution for `cred-proxy` succeeds on first contact. + start/stop to the `ExitStack` after pipelock and before the + agent; populate `pipelock_proxy_url` + `pipelock_ca_host_path` + on the cred-proxy plan so its outbound HTTPS routes through + pipelock. - **`claude_bottle/backend/docker/bottle_plan.py`** — new - `CredProxyPlan` field; preflight shows kind + ref name + - port + route table. -- **`claude_bottle/pipelock.py`** — drop the `api.anthropic.com` - TLS-MITM branch; the host stays on the allowlist as a plain - HTTPS destination. Confirm the four upstream hosts are - allowlisted by default when `bottle.tokens` declares them. -- **`README.md`** — replace the architecture diagram with the - one above; document the `bottle.tokens` field. -- **`claude-bottle.example.json`** — add a `tokens` array to - one bottle showing each Kind. 
-- **Tests** — new unit tests for manifest parsing, route table - generation, header injection; new integration tests for the - six success criteria. Delete the bits of `prepare.py` tests - that asserted on `CLAUDE_CODE_OAUTH_TOKEN` landing in the - agent's env. + `cred_proxy_plan` field; preflight shows route count + token + refs + a path→upstream line per route; `to_dict` emits a + `cred_proxy` array of `{path, upstream, auth_scheme, token_ref, + roles}`. +- **`claude_bottle/pipelock.py`** — `pipelock_token_hosts` derives + from each route's `UpstreamHost` (not a hardcoded Kind→hosts + map). Allowlist auto-includes them; passthrough does not (the + proxy trusts pipelock's CA so MITM works). +- **`README.md`** — architecture diagram includes the cred-proxy + lane; manifest section documents `bottle.cred_proxy.routes`. +- **`claude-bottle.example.json`** — one bottle demonstrates the + four common routes (Anthropic, GitHub, Gitea, npm). +- **Tests** — manifest parsing/validation, route lift + token-env + slot assignment, role-based dispatch in the provisioner, + pipelock allowlist derivation from routes. Integration test + exercises header inject + smart-HTTP push refusal. ### Data model changes ```python @dataclass(frozen=True) -class TokenEntry: - Kind: Literal["anthropic", "github", "gitea", "npm"] - TokenRef: str # name of host env var - Url: str | None = None # required for gitea; defaulted otherwise +class CredProxyRoute: + Path: str # "/anthropic/" — must start and end with / + Upstream: str # "https://api.anthropic.com" — https only + AuthScheme: str # "Bearer" or "token" + TokenRef: str # name of host env var + Role: tuple[str, ...] = () # provisioner tags; see CRED_PROXY_ROLES + UpstreamHost: str = "" # derived from Upstream + +@dataclass(frozen=True) +class CredProxyConfig: + routes: tuple[CredProxyRoute, ...] = () @dataclass(frozen=True) class Bottle: ... - tokens: tuple[TokenEntry, ...] 
= () + cred_proxy: CredProxyConfig = field(default_factory=CredProxyConfig) ``` Validation: -- `Kind` must be one of the four supported values. -- `TokenRef` must resolve against `os.environ` at launch (fail - fast with a clear "host env var X is unset" if missing). -- `gitea` entries require `Url`; others fall back to the - documented upstream. -- At most one entry per `Kind` except `gitea`, which may have - multiple distinct `Url`s. -- A `github` or `gitea` token MAY name the same host as a - `bottle.git` entry. The two paths broker different protocols — - git-gate holds an SSH `IdentityFile` for push/fetch and runs - gitleaks; cred-proxy holds a PAT for HTTPS REST API calls (`tea`, - `gh`, octokit). The common dev setup uses both on the same host - and is not a configuration error. +- `Path` non-empty, starts and ends with `/`; unique across all + routes in a bottle (the proxy routes by longest-prefix match). +- `Upstream` is `https://...` with a non-empty host. +- `AuthScheme` is one of `Bearer`, `token`. +- `TokenRef` non-empty; its value is resolved against + `os.environ` at launch (fail fast with a clear "host env var X + is unset" if missing). +- `Role` items are one of `anthropic-base-url`, `npm-registry`, + `git-insteadof`, `tea-login`. Single string accepted as sugar + for a one-item list. +- Singleton roles (`anthropic-base-url`, `npm-registry`) appear + on at most one route per bottle. +- A route MAY name the same host as a `bottle.git` entry. The + two paths broker different protocols — git-gate holds an SSH + `IdentityFile` for push/fetch and runs gitleaks; cred-proxy + holds a PAT for HTTPS REST API calls (`tea`, `gh`, octokit). + The common dev setup uses both on the same host. The + provisioner's `git-insteadof` role is suppressed in that case + (see Agent-side rewrites). 
-### Routing table +### Example routes -| Kind | Proxy path | Upstream | Header | -|-----------|----------------|-------------------------|----------------------------| -| anthropic | `/anthropic/` | `api.anthropic.com` | `Authorization: Bearer …` | -| github | `/gh-api/` | `api.github.com` | `Authorization: Bearer …` | -| github | `/gh-git/` | `github.com` | `Authorization: Bearer …` | -| gitea | `/gitea/` | configured `Url` | `Authorization: token …` | -| npm | `/npm/` | `registry.npmjs.org` | `Authorization: Bearer …` | +| Common upstream | Route | +|------------------------|-------------------------------------------------------------------------------------------------------------------------------------| +| Anthropic API | `{path: "/anthropic/", upstream: "https://api.anthropic.com", auth_scheme: "Bearer", token_ref: "…", role: "anthropic-base-url"}` | +| GitHub REST API | `{path: "/gh-api/", upstream: "https://api.github.com", auth_scheme: "Bearer", token_ref: "…"}` | +| GitHub git transport | `{path: "/gh-git/", upstream: "https://github.com", auth_scheme: "Bearer", token_ref: "…", role: "git-insteadof"}` | +| Gitea instance | `{path: "/gitea//", upstream: "https://", auth_scheme: "token", token_ref: "…", role: ["git-insteadof", "tea-login"]}` | +| npm registry | `{path: "/npm/", upstream: "https://registry.npmjs.org", auth_scheme: "Bearer", token_ref: "…", role: "npm-registry"}` | Gitea uses `Authorization: token` rather than `Bearer` to sidestep `go-gitea/gitea#16734`. The proxy strips any incoming @@ -443,11 +481,12 @@ Rejected because: ## Open questions -- **Field name.** `bottle.tokens` is the working name. The - research note used `bottle.forge` for the gitea/github - generalization, but "forge" doesn't fit `anthropic` or - `npm`. Alternatives: `bottle.brokered`, `bottle.upstreams`, - `bottle.cred_proxy`. Default: `bottle.tokens`. 
+- **~~Field name.~~** Resolved during iteration: routes live at + `bottle.cred_proxy.routes` (the nested object reserves room for + per-bottle proxy settings later). Each route is independent; + no `Kind` enum on the route. A `role` field drives the + optional agent-side rewrites — see "Agent-side rewrites" in + Scope. - **Python vs Go for the proxy.** Default: Python, revisit during implementation if SSE pass-through is unreliable. - **Sidecar image base.** Distroless (smallest, no shell — hardest diff --git a/tests/integration/test_cred_proxy_sidecar.py b/tests/integration/test_cred_proxy_sidecar.py index a121780..7e1d468 100644 --- a/tests/integration/test_cred_proxy_sidecar.py +++ b/tests/integration/test_cred_proxy_sidecar.py @@ -154,7 +154,6 @@ class TestCredProxySidecar(unittest.TestCase): slug=self.slug, routes_path=routes_path, upstreams=(CredProxyUpstream( - kind="fake", path="/fake/", upstream=f"http://{FAKE_UPSTREAM_HOST}:{FAKE_UPSTREAM_PORT}", auth_scheme="Bearer", diff --git a/tests/unit/test_cred_proxy.py b/tests/unit/test_cred_proxy.py index 68794b9..d238ab8 100644 --- a/tests/unit/test_cred_proxy.py +++ b/tests/unit/test_cred_proxy.py @@ -14,79 +14,77 @@ from claude_bottle.log import Die from claude_bottle.manifest import Manifest -def _bottle(tokens): +def _bottle(routes): return Manifest.from_json_obj({ - "bottles": {"dev": {"tokens": tokens}}, + "bottles": {"dev": {"cred_proxy": {"routes": routes}}}, "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, }).bottles["dev"] class TestUpstreamLift(unittest.TestCase): - def test_anthropic_yields_one_route(self): - b = _bottle([{"Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN"}]) + def test_single_route_yields_single_upstream(self): + b = _bottle([ + {"path": "/anthropic/", "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_ref": "CLAUDE_BOTTLE_OAUTH_TOKEN", + "role": "anthropic-base-url"}, + ]) upstreams = cred_proxy_upstreams_for_bottle(b) 
self.assertEqual(1, len(upstreams)) u = upstreams[0] - self.assertEqual("anthropic", u.kind) self.assertEqual("/anthropic/", u.path) self.assertEqual("https://api.anthropic.com", u.upstream) self.assertEqual("Bearer", u.auth_scheme) self.assertEqual("CRED_PROXY_TOKEN_0", u.token_env) self.assertEqual("CLAUDE_BOTTLE_OAUTH_TOKEN", u.token_ref) + self.assertEqual(("anthropic-base-url",), u.roles) - def test_github_yields_two_routes_sharing_token_env(self): - b = _bottle([{"Kind": "github", "TokenRef": "GITHUB_TOKEN"}]) + def test_shared_token_ref_collapses_to_one_slot(self): + # Two github routes share GH_PAT — they share token_env. + b = _bottle([ + {"path": "/gh-api/", "upstream": "https://api.github.com", + "auth_scheme": "Bearer", "token_ref": "GH_PAT"}, + {"path": "/gh-git/", "upstream": "https://github.com", + "auth_scheme": "Bearer", "token_ref": "GH_PAT", + "role": "git-insteadof"}, + ]) upstreams = cred_proxy_upstreams_for_bottle(b) self.assertEqual(2, len(upstreams)) - paths = [u.path for u in upstreams] - self.assertIn("/gh-api/", paths) - self.assertIn("/gh-git/", paths) - self.assertEqual({"CRED_PROXY_TOKEN_0"}, {u.token_env for u in upstreams}) - for u in upstreams: - self.assertEqual("Bearer", u.auth_scheme) - self.assertEqual("GITHUB_TOKEN", u.token_ref) + self.assertEqual({"CRED_PROXY_TOKEN_0"}, + {u.token_env for u in upstreams}) - def test_gitea_uses_token_scheme_and_host_path(self): + def test_distinct_token_refs_get_distinct_slots(self): b = _bottle([ - {"Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "https://gitea.dideric.is"}, - ]) - u = cred_proxy_upstreams_for_bottle(b)[0] - self.assertEqual("/gitea/gitea.dideric.is/", u.path) - self.assertEqual("https://gitea.dideric.is", u.upstream) - self.assertEqual("token", u.auth_scheme) - - def test_gitea_url_trailing_slash_stripped(self): - b = _bottle([ - {"Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "https://gitea.dideric.is/"}, - ]) - u = cred_proxy_upstreams_for_bottle(b)[0] - 
self.assertEqual("https://gitea.dideric.is", u.upstream) - - def test_npm_yields_one_route(self): - b = _bottle([{"Kind": "npm", "TokenRef": "NPM_TOKEN"}]) - u = cred_proxy_upstreams_for_bottle(b)[0] - self.assertEqual("/npm/", u.path) - self.assertEqual("https://registry.npmjs.org", u.upstream) - - def test_four_kinds_get_distinct_token_envs(self): - b = _bottle([ - {"Kind": "anthropic", "TokenRef": "A"}, - {"Kind": "github", "TokenRef": "G"}, - {"Kind": "gitea", "TokenRef": "T", - "Url": "https://gitea.dideric.is"}, - {"Kind": "npm", "TokenRef": "N"}, + {"path": "/a/", "upstream": "https://a.example", + "auth_scheme": "Bearer", "token_ref": "T1"}, + {"path": "/b/", "upstream": "https://b.example", + "auth_scheme": "Bearer", "token_ref": "T2"}, + {"path": "/c/", "upstream": "https://c.example", + "auth_scheme": "Bearer", "token_ref": "T1"}, ]) upstreams = cred_proxy_upstreams_for_bottle(b) - # 1 anthropic + 2 github + 1 gitea + 1 npm = 5 routes - self.assertEqual(5, len(upstreams)) - # github shares one token_env across its two routes -> 4 distinct - envs = {u.token_env for u in upstreams} - self.assertEqual({"CRED_PROXY_TOKEN_0", "CRED_PROXY_TOKEN_1", - "CRED_PROXY_TOKEN_2", "CRED_PROXY_TOKEN_3"}, envs) + # T1 -> slot 0, T2 -> slot 1, T1 reuses slot 0. 
+ self.assertEqual("CRED_PROXY_TOKEN_0", upstreams[0].token_env) + self.assertEqual("CRED_PROXY_TOKEN_1", upstreams[1].token_env) + self.assertEqual("CRED_PROXY_TOKEN_0", upstreams[2].token_env) - def test_empty_tokens_yields_empty_upstreams(self): + def test_upstream_trailing_slash_stripped(self): + b = _bottle([ + {"path": "/x/", "upstream": "https://gitea.dideric.is/", + "auth_scheme": "token", "token_ref": "T"}, + ]) + self.assertEqual("https://gitea.dideric.is", + cred_proxy_upstreams_for_bottle(b)[0].upstream) + + def test_roles_list_passes_through(self): + b = _bottle([ + {"path": "/gitea/x/", "upstream": "https://gitea.example.com", + "auth_scheme": "token", "token_ref": "T", + "role": ["git-insteadof", "tea-login"]}, + ]) + self.assertEqual(("git-insteadof", "tea-login"), + cred_proxy_upstreams_for_bottle(b)[0].roles) + + def test_empty_routes_yields_empty_upstreams(self): b = _bottle([]) self.assertEqual((), cred_proxy_upstreams_for_bottle(b)) @@ -94,43 +92,48 @@ class TestUpstreamLift(unittest.TestCase): class TestTokenEnvMap(unittest.TestCase): def test_distinct_envs_yield_full_map(self): b = _bottle([ - {"Kind": "anthropic", "TokenRef": "A"}, - {"Kind": "github", "TokenRef": "G"}, + {"path": "/a/", "upstream": "https://a.example", + "auth_scheme": "Bearer", "token_ref": "A"}, + {"path": "/b/", "upstream": "https://b.example", + "auth_scheme": "Bearer", "token_ref": "B"}, ]) m = cred_proxy_token_env_map(cred_proxy_upstreams_for_bottle(b)) self.assertEqual({"CRED_PROXY_TOKEN_0": "A", - "CRED_PROXY_TOKEN_1": "G"}, m) + "CRED_PROXY_TOKEN_1": "B"}, m) - def test_github_two_routes_coalesce_to_one_env(self): - b = _bottle([{"Kind": "github", "TokenRef": "G"}]) + def test_shared_token_ref_yields_one_env(self): + b = _bottle([ + {"path": "/gh-api/", "upstream": "https://api.github.com", + "auth_scheme": "Bearer", "token_ref": "GH"}, + {"path": "/gh-git/", "upstream": "https://github.com", + "auth_scheme": "Bearer", "token_ref": "GH"}, + ]) m = 
cred_proxy_token_env_map(cred_proxy_upstreams_for_bottle(b)) - self.assertEqual({"CRED_PROXY_TOKEN_0": "G"}, m) + self.assertEqual({"CRED_PROXY_TOKEN_0": "GH"}, m) class TestRoutesRender(unittest.TestCase): def test_renders_json_with_expected_shape(self): b = _bottle([ - {"Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN"}, - {"Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "https://gitea.dideric.is"}, + {"path": "/anthropic/", "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_ref": "CLAUDE_BOTTLE_OAUTH_TOKEN"}, + {"path": "/gitea/x/", "upstream": "https://gitea.dideric.is", + "auth_scheme": "token", "token_ref": "GITEA_TOKEN"}, ]) rendered = cred_proxy_render_routes(cred_proxy_upstreams_for_bottle(b)) payload = json.loads(rendered) self.assertEqual(["routes"], list(payload.keys())) self.assertEqual(2, len(payload["routes"])) - anthropic = payload["routes"][0] + first = payload["routes"][0] self.assertEqual({"path", "upstream", "auth_scheme", "token_env"}, - set(anthropic.keys())) - self.assertEqual("/anthropic/", anthropic["path"]) - self.assertEqual("https://api.anthropic.com", anthropic["upstream"]) - self.assertEqual("Bearer", anthropic["auth_scheme"]) - self.assertEqual("CRED_PROXY_TOKEN_0", anthropic["token_env"]) + set(first.keys())) def test_routes_carry_no_token_values_or_host_env_names(self): # routes.json lives mode-600 in the staging dir and gets # docker cp'd into the sidecar — it must not leak secret values - # or even the host-side TokenRef name. - b = _bottle([{"Kind": "github", "TokenRef": "GITHUB_TOKEN"}]) + # or the host-side TokenRef name. 
+ b = _bottle([{"path": "/x/", "upstream": "https://x.example", + "auth_scheme": "Bearer", "token_ref": "GITHUB_TOKEN"}]) rendered = cred_proxy_render_routes(cred_proxy_upstreams_for_bottle(b)) self.assertNotIn("GITHUB_TOKEN", rendered) @@ -173,7 +176,13 @@ class TestCredProxyPrepare(unittest.TestCase): def start(self, plan): return "" def stop(self, target): return None - b = _bottle([{"Kind": "github", "TokenRef": "GITHUB_TOKEN"}]) + b = _bottle([ + {"path": "/gh-api/", "upstream": "https://api.github.com", + "auth_scheme": "Bearer", "token_ref": "GITHUB_TOKEN"}, + {"path": "/gh-git/", "upstream": "https://github.com", + "auth_scheme": "Bearer", "token_ref": "GITHUB_TOKEN", + "role": "git-insteadof"}, + ]) with tempfile.TemporaryDirectory() as td: stage = Path(td) plan = StubCredProxy().prepare(b, "test-slug", stage) diff --git a/tests/unit/test_docker_cred_proxy.py b/tests/unit/test_docker_cred_proxy.py index 5a0be20..d1c6c42 100644 --- a/tests/unit/test_docker_cred_proxy.py +++ b/tests/unit/test_docker_cred_proxy.py @@ -57,7 +57,7 @@ class TestStartGuards(unittest.TestCase): def test_missing_internal_network_dies(self): upstream = CredProxyUpstream( - kind="anthropic", path="/anthropic/", + path="/anthropic/", upstream="https://api.anthropic.com", auth_scheme="Bearer", token_env="CRED_PROXY_TOKEN_0", token_ref="T", @@ -67,7 +67,7 @@ class TestStartGuards(unittest.TestCase): def test_missing_routes_file_dies(self): upstream = CredProxyUpstream( - kind="anthropic", path="/anthropic/", + path="/anthropic/", upstream="https://api.anthropic.com", auth_scheme="Bearer", token_env="CRED_PROXY_TOKEN_0", token_ref="T", @@ -84,7 +84,7 @@ class TestStartGuards(unittest.TestCase): # URL set + CA path empty/missing is a wiring bug: either both # populated (production) or both empty (test escape hatch). 
upstream = CredProxyUpstream( - kind="anthropic", path="/anthropic/", + path="/anthropic/", upstream="https://api.anthropic.com", auth_scheme="Bearer", token_env="CRED_PROXY_TOKEN_0", token_ref="T", diff --git a/tests/unit/test_manifest_tokens.py b/tests/unit/test_manifest_tokens.py index d99bf4b..c6cd8ab 100644 --- a/tests/unit/test_manifest_tokens.py +++ b/tests/unit/test_manifest_tokens.py @@ -1,4 +1,4 @@ -"""Unit: Bottle.tokens manifest parsing + validation (PRD 0010).""" +"""Unit: bottle.cred_proxy.routes manifest parsing + validation (PRD 0010).""" import unittest @@ -6,8 +6,8 @@ from claude_bottle.log import Die from claude_bottle.manifest import Manifest -def _manifest(tokens, git=None): - bottle: dict[str, object] = {"tokens": tokens} +def _manifest(routes, git=None): + bottle: dict[str, object] = {"cred_proxy": {"routes": routes}} if git is not None: bottle["git"] = git return { @@ -16,177 +16,156 @@ def _manifest(tokens, git=None): } -class TestTokenEntryParsing(unittest.TestCase): - def test_parses_anthropic_entry(self): +class TestCredProxyRouteParsing(unittest.TestCase): + def test_parses_minimal_route(self): m = Manifest.from_json_obj(_manifest([ - {"Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN"}, + {"path": "/anthropic/", + "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", + "token_ref": "CLAUDE_BOTTLE_OAUTH_TOKEN"}, ])) - entries = m.bottles["dev"].tokens - self.assertEqual(1, len(entries)) - e = entries[0] - self.assertEqual("anthropic", e.Kind) - self.assertEqual("CLAUDE_BOTTLE_OAUTH_TOKEN", e.TokenRef) - self.assertEqual("", e.Url) - self.assertEqual("api.anthropic.com", e.UpstreamHost) + routes = m.bottles["dev"].cred_proxy.routes + self.assertEqual(1, len(routes)) + r = routes[0] + self.assertEqual("/anthropic/", r.Path) + self.assertEqual("https://api.anthropic.com", r.Upstream) + self.assertEqual("Bearer", r.AuthScheme) + self.assertEqual("CLAUDE_BOTTLE_OAUTH_TOKEN", r.TokenRef) + self.assertEqual((), r.Role) 
+ self.assertEqual("api.anthropic.com", r.UpstreamHost) - def test_parses_github_entry(self): + def test_role_string_normalizes_to_tuple(self): m = Manifest.from_json_obj(_manifest([ - {"Kind": "github", "TokenRef": "GITHUB_TOKEN"}, + {"path": "/anthropic/", "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_ref": "T", + "role": "anthropic-base-url"}, ])) - e = m.bottles["dev"].tokens[0] - self.assertEqual("github", e.Kind) - self.assertEqual("github.com", e.UpstreamHost) + self.assertEqual(("anthropic-base-url",), + m.bottles["dev"].cred_proxy.routes[0].Role) - def test_parses_npm_entry(self): + def test_role_list_supported(self): m = Manifest.from_json_obj(_manifest([ - {"Kind": "npm", "TokenRef": "NPM_TOKEN"}, + {"path": "/gitea/x/", "upstream": "https://gitea.example.com", + "auth_scheme": "token", "token_ref": "T", + "role": ["git-insteadof", "tea-login"]}, ])) - e = m.bottles["dev"].tokens[0] - self.assertEqual("registry.npmjs.org", e.UpstreamHost) + self.assertEqual(("git-insteadof", "tea-login"), + m.bottles["dev"].cred_proxy.routes[0].Role) - def test_parses_gitea_entry_with_url(self): + def test_upstream_host_extracted(self): m = Manifest.from_json_obj(_manifest([ - {"Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "https://gitea.dideric.is"}, + {"path": "/gitea/x/", "upstream": "https://gitea.dideric.is:30443", + "auth_scheme": "token", "token_ref": "T"}, ])) - e = m.bottles["dev"].tokens[0] - self.assertEqual("gitea", e.Kind) - self.assertEqual("https://gitea.dideric.is", e.Url) - self.assertEqual("gitea.dideric.is", e.UpstreamHost) - - def test_gitea_url_with_port_strips_port_from_host(self): - m = Manifest.from_json_obj(_manifest([ - {"Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "https://gitea.dideric.is:30009"}, - ])) - self.assertEqual("gitea.dideric.is", m.bottles["dev"].tokens[0].UpstreamHost) + self.assertEqual("gitea.dideric.is", + m.bottles["dev"].cred_proxy.routes[0].UpstreamHost) -class 
TestTokenEntryValidation(unittest.TestCase): - def test_unknown_kind_dies(self): +class TestCredProxyRouteValidation(unittest.TestCase): + def _route(self, **overrides): + base = { + "path": "/x/", + "upstream": "https://example.com", + "auth_scheme": "Bearer", + "token_ref": "TOK", + } + base.update(overrides) + return base + + def test_missing_path_dies(self): with self.assertRaises(Die): - Manifest.from_json_obj(_manifest([ - {"Kind": "aws", "TokenRef": "AWS_TOKEN"}, - ])) + Manifest.from_json_obj(_manifest([self._route(path=None)])) - def test_missing_kind_dies(self): + def test_path_without_trailing_slash_dies(self): with self.assertRaises(Die): - Manifest.from_json_obj(_manifest([ - {"TokenRef": "GITHUB_TOKEN"}, - ])) + Manifest.from_json_obj(_manifest([self._route(path="/no-slash")])) + + def test_path_without_leading_slash_dies(self): + with self.assertRaises(Die): + Manifest.from_json_obj(_manifest([self._route(path="no-slash/")])) + + def test_missing_upstream_dies(self): + with self.assertRaises(Die): + Manifest.from_json_obj(_manifest([self._route(upstream=None)])) + + def test_non_https_upstream_dies(self): + with self.assertRaises(Die): + Manifest.from_json_obj(_manifest([self._route(upstream="http://x.example")])) + + def test_unknown_auth_scheme_dies(self): + with self.assertRaises(Die): + Manifest.from_json_obj(_manifest([self._route(auth_scheme="Basic")])) def test_missing_token_ref_dies(self): with self.assertRaises(Die): - Manifest.from_json_obj(_manifest([ - {"Kind": "github"}, - ])) + Manifest.from_json_obj(_manifest([self._route(token_ref=None)])) - def test_gitea_without_url_dies(self): + def test_unknown_role_dies(self): + with self.assertRaises(Die): + Manifest.from_json_obj(_manifest([self._route(role="something-made-up")])) + + +class TestCredProxyCrossValidation(unittest.TestCase): + def test_duplicate_path_dies(self): with self.assertRaises(Die): Manifest.from_json_obj(_manifest([ - {"Kind": "gitea", "TokenRef": "GITEA_TOKEN"}, + 
{"path": "/x/", "upstream": "https://a.example", + "auth_scheme": "Bearer", "token_ref": "T1"}, + {"path": "/x/", "upstream": "https://b.example", + "auth_scheme": "Bearer", "token_ref": "T2"}, ])) - def test_gitea_with_non_https_url_dies(self): + def test_two_routes_same_anthropic_role_dies(self): with self.assertRaises(Die): Manifest.from_json_obj(_manifest([ - {"Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "http://gitea.dideric.is"}, + {"path": "/anthropic/", "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_ref": "A1", + "role": "anthropic-base-url"}, + {"path": "/anthropic-2/", "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_ref": "A2", + "role": "anthropic-base-url"}, ])) - def test_non_gitea_kind_with_url_dies(self): - # Url is fixed for anthropic / github / npm — passing one is a - # configuration smell, not an override knob. - with self.assertRaises(Die): - Manifest.from_json_obj(_manifest([ - {"Kind": "github", "TokenRef": "GITHUB_TOKEN", - "Url": "https://api.example.com"}, - ])) - - def test_duplicate_non_gitea_kind_dies(self): - with self.assertRaises(Die): - Manifest.from_json_obj(_manifest([ - {"Kind": "github", "TokenRef": "A"}, - {"Kind": "github", "TokenRef": "B"}, - ])) - - def test_two_gitea_with_distinct_urls_ok(self): + def test_multiple_git_insteadof_ok(self): + # git-insteadof is not a singleton role — each route can + # independently rewrite its own host. 
m = Manifest.from_json_obj(_manifest([ - {"Kind": "gitea", "TokenRef": "T1", - "Url": "https://gitea.dideric.is"}, - {"Kind": "gitea", "TokenRef": "T2", - "Url": "https://gitea.example.com"}, + {"path": "/gh-git/", "upstream": "https://github.com", + "auth_scheme": "Bearer", "token_ref": "GH", + "role": "git-insteadof"}, + {"path": "/gitea/x/", "upstream": "https://gitea.example.com", + "auth_scheme": "token", "token_ref": "GT", + "role": "git-insteadof"}, ])) - self.assertEqual(2, len(m.bottles["dev"].tokens)) + self.assertEqual(2, len(m.bottles["dev"].cred_proxy.routes)) - def test_two_gitea_with_same_url_dies(self): + +class TestLegacyTokensField(unittest.TestCase): + def test_legacy_tokens_field_dies_with_hint(self): + # The PRD-iteration shape ({"tokens": [{Kind: ...}]}) was + # replaced by cred_proxy.routes; old manifests must fail + # loudly with a pointer. with self.assertRaises(Die): - Manifest.from_json_obj(_manifest([ - {"Kind": "gitea", "TokenRef": "T1", - "Url": "https://gitea.dideric.is"}, - {"Kind": "gitea", "TokenRef": "T2", - "Url": "https://gitea.dideric.is"}, - ])) + Manifest.from_json_obj({ + "bottles": {"dev": {"tokens": [ + {"Kind": "anthropic", "TokenRef": "T"}, + ]}}, + "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, + }) -class TestTokenGitCoexistence(unittest.TestCase): - """git-gate brokers SSH push/fetch via an IdentityFile; cred-proxy - brokers HTTPS REST API calls via a PAT. 
Declaring both on the same - host is the common dev setup (SSH key for git ops, PAT for `tea` / - `gh` API calls), not a configuration error.""" - - def test_github_token_and_github_git_entry_coexist(self): - m = Manifest.from_json_obj(_manifest( - tokens=[{"Kind": "github", "TokenRef": "GITHUB_TOKEN"}], - git=[{ - "Name": "myrepo", - "Upstream": "ssh://git@github.com/me/myrepo.git", - "IdentityFile": "/dev/null", - }], - )) - self.assertEqual(1, len(m.bottles["dev"].tokens)) - self.assertEqual(1, len(m.bottles["dev"].git)) - - def test_gitea_token_and_same_host_git_entry_coexist(self): - m = Manifest.from_json_obj(_manifest( - tokens=[{ - "Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "https://gitea.dideric.is", - }], - git=[{ - "Name": "myrepo", - "Upstream": "ssh://git@gitea.dideric.is:30009/me/myrepo.git", - "IdentityFile": "/dev/null", - }], - )) - self.assertEqual("gitea.dideric.is", m.bottles["dev"].tokens[0].UpstreamHost) - self.assertEqual("gitea.dideric.is", m.bottles["dev"].git[0].UpstreamHost) - - def test_anthropic_token_and_git_unrelated(self): - # api.anthropic.com isn't a git host; coexistence is trivial. 
- m = Manifest.from_json_obj(_manifest( - tokens=[{"Kind": "anthropic", "TokenRef": "CLAUDE_BOTTLE_OAUTH_TOKEN"}], - git=[{ - "Name": "myrepo", - "Upstream": "ssh://git@gitea.dideric.is:30009/me/myrepo.git", - "IdentityFile": "/dev/null", - }], - )) - self.assertEqual(1, len(m.bottles["dev"].tokens)) - - -class TestEmptyTokensField(unittest.TestCase): - def test_no_tokens_field_yields_empty_tuple(self): +class TestEmptyCredProxy(unittest.TestCase): + def test_no_cred_proxy_field_yields_empty_routes(self): m = Manifest.from_json_obj({ "bottles": {"dev": {}}, "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, }) - self.assertEqual((), m.bottles["dev"].tokens) + self.assertEqual((), m.bottles["dev"].cred_proxy.routes) - def test_tokens_array_type_required(self): + def test_routes_array_type_required(self): with self.assertRaises(Die): Manifest.from_json_obj({ - "bottles": {"dev": {"tokens": "not-a-list"}}, + "bottles": {"dev": {"cred_proxy": {"routes": "not-a-list"}}}, "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, }) diff --git a/tests/unit/test_pipelock_allowlist.py b/tests/unit/test_pipelock_allowlist.py index bd8bb31..0743974 100644 --- a/tests/unit/test_pipelock_allowlist.py +++ b/tests/unit/test_pipelock_allowlist.py @@ -37,54 +37,43 @@ class TestEffectiveAllowlist(unittest.TestCase): self.assertEqual(eff, sorted(eff), "sorted") +def _routes(routes): + return {"cred_proxy": {"routes": routes}} + + class TestTokenHosts(unittest.TestCase): - def test_github_yields_both_hosts(self): - hosts = pipelock_token_hosts(_bottle({ - "tokens": [{"Kind": "github", "TokenRef": "GH"}], - })) + def test_each_route_contributes_its_upstream_host(self): + hosts = pipelock_token_hosts(_bottle(_routes([ + {"path": "/gh-api/", "upstream": "https://api.github.com", + "auth_scheme": "Bearer", "token_ref": "GH"}, + {"path": "/gh-git/", "upstream": "https://github.com", + "auth_scheme": "Bearer", "token_ref": "GH"}, + ]))) 
self.assertEqual(["api.github.com", "github.com"], hosts) - def test_gitea_yields_configured_host(self): - hosts = pipelock_token_hosts(_bottle({ - "tokens": [{"Kind": "gitea", "TokenRef": "T", - "Url": "https://gitea.dideric.is"}], - })) - self.assertEqual(["gitea.dideric.is"], hosts) + def test_dedupe_across_routes(self): + hosts = pipelock_token_hosts(_bottle(_routes([ + {"path": "/a/", "upstream": "https://x.example", + "auth_scheme": "Bearer", "token_ref": "T1"}, + {"path": "/b/", "upstream": "https://x.example", + "auth_scheme": "Bearer", "token_ref": "T2"}, + ]))) + self.assertEqual(["x.example"], hosts) - def test_npm_yields_registry(self): - hosts = pipelock_token_hosts(_bottle({ - "tokens": [{"Kind": "npm", "TokenRef": "N"}], - })) - self.assertEqual(["registry.npmjs.org"], hosts) - - def test_anthropic_yields_api_host(self): - hosts = pipelock_token_hosts(_bottle({ - "tokens": [{"Kind": "anthropic", "TokenRef": "A"}], - })) - self.assertEqual(["api.anthropic.com"], hosts) - - def test_no_tokens_empty(self): + def test_no_routes_empty(self): self.assertEqual([], pipelock_token_hosts(_bottle({}))) class TestAllowlistWithTokens(unittest.TestCase): - def test_token_hosts_added_to_allowlist(self): - eff = pipelock_effective_allowlist(_bottle({ - "tokens": [ - {"Kind": "npm", "TokenRef": "N"}, - {"Kind": "github", "TokenRef": "G"}, - ], - })) + def test_route_hosts_added_to_allowlist(self): + eff = pipelock_effective_allowlist(_bottle(_routes([ + {"path": "/npm/", "upstream": "https://registry.npmjs.org", + "auth_scheme": "Bearer", "token_ref": "N"}, + {"path": "/gh-api/", "upstream": "https://api.github.com", + "auth_scheme": "Bearer", "token_ref": "G"}, + ]))) self.assertIn("registry.npmjs.org", eff) self.assertIn("api.github.com", eff) - self.assertIn("github.com", eff) - - def test_gitea_host_added(self): - eff = pipelock_effective_allowlist(_bottle({ - "tokens": [{"Kind": "gitea", "TokenRef": "T", - "Url": "https://gitea.dideric.is"}], - })) - 
self.assertIn("gitea.dideric.is", eff) class TestTlsPassthrough(unittest.TestCase): @@ -92,21 +81,17 @@ class TestTlsPassthrough(unittest.TestCase): passthrough = pipelock_effective_tls_passthrough(_bottle({})) self.assertEqual(["api.anthropic.com"], passthrough) - def test_token_hosts_NOT_added_to_passthrough(self): - # cred-proxy now trusts pipelock's per-bottle CA (loaded into - # its container's trust store via docker cp + update-ca- - # certificates at start time), so pipelock can MITM the - # cred-proxy -> upstream leg and body-scan it. Auto-adding - # cred-proxy hosts to passthrough would silently disable that - # second scanner for github / gitea / npm. - passthrough = pipelock_effective_tls_passthrough(_bottle({ - "tokens": [ - {"Kind": "github", "TokenRef": "G"}, - {"Kind": "npm", "TokenRef": "N"}, - {"Kind": "gitea", "TokenRef": "T", - "Url": "https://gitea.dideric.is"}, - ], - })) + def test_route_hosts_NOT_added_to_passthrough(self): + # cred-proxy now trusts pipelock's per-bottle CA, so pipelock + # can MITM the cred-proxy -> upstream leg and body-scan it. + # Auto-adding cred-proxy hosts to passthrough would silently + # disable that second scanner. 
+ passthrough = pipelock_effective_tls_passthrough(_bottle(_routes([ + {"path": "/gh-api/", "upstream": "https://api.github.com", + "auth_scheme": "Bearer", "token_ref": "G"}, + {"path": "/npm/", "upstream": "https://registry.npmjs.org", + "auth_scheme": "Bearer", "token_ref": "N"}, + ]))) self.assertEqual(["api.anthropic.com"], passthrough) diff --git a/tests/unit/test_provision_cred_proxy.py b/tests/unit/test_provision_cred_proxy.py index 5093cc6..d58735a 100644 --- a/tests/unit/test_provision_cred_proxy.py +++ b/tests/unit/test_provision_cred_proxy.py @@ -14,103 +14,106 @@ from claude_bottle.cred_proxy import cred_proxy_upstreams_for_bottle from claude_bottle.manifest import Manifest -def _bottle(tokens): +def _bottle(routes): return Manifest.from_json_obj({ - "bottles": {"dev": {"tokens": tokens}}, + "bottles": {"dev": {"cred_proxy": {"routes": routes}}}, "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, }).bottles["dev"] -def _upstreams(tokens): - return cred_proxy_upstreams_for_bottle(_bottle(tokens)) +def _upstreams(routes): + return cred_proxy_upstreams_for_bottle(_bottle(routes)) class TestRenderNpmrc(unittest.TestCase): - def test_empty_when_no_npm_route(self): + def test_empty_when_no_role(self): self.assertEqual("", render_npmrc(_upstreams([]))) self.assertEqual("", render_npmrc(_upstreams([ - {"Kind": "anthropic", "TokenRef": "A"}, + {"path": "/x/", "upstream": "https://x.example", + "auth_scheme": "Bearer", "token_ref": "T"}, ]))) - def test_writes_registry_line(self): + def test_writes_registry_line_for_npm_registry_role(self): out = render_npmrc(_upstreams([ - {"Kind": "npm", "TokenRef": "NPM_TOKEN"}, + {"path": "/npm/", "upstream": "https://registry.npmjs.org", + "auth_scheme": "Bearer", "token_ref": "NPM_TOKEN", + "role": "npm-registry"}, ])) self.assertEqual("registry=http://cred-proxy:9099/npm/\n", out) def test_omits_authtoken(self): - # The proxy injects Authorization at request time. 
The npmrc - # deliberately carries no _authToken — a stale token there - # would just get stripped, but it also creates the false - # impression that the agent holds a credential. + # The proxy injects Authorization at request time. out = render_npmrc(_upstreams([ - {"Kind": "npm", "TokenRef": "NPM_TOKEN"}, + {"path": "/npm/", "upstream": "https://registry.npmjs.org", + "auth_scheme": "Bearer", "token_ref": "NPM_TOKEN", + "role": "npm-registry"}, ])) self.assertNotIn("_authToken", out) self.assertNotIn("NPM_TOKEN", out) class TestRenderGitconfig(unittest.TestCase): - def test_empty_when_no_github_or_gitea(self): + def test_empty_when_no_role(self): self.assertEqual("", render_cred_proxy_gitconfig(_upstreams([ - {"Kind": "anthropic", "TokenRef": "A"}, - {"Kind": "npm", "TokenRef": "N"}, + {"path": "/anthropic/", "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_ref": "A"}, ]))) - def test_github_writes_https_insteadof(self): + def test_writes_insteadof_for_git_insteadof_role(self): out = render_cred_proxy_gitconfig(_upstreams([ - {"Kind": "github", "TokenRef": "GITHUB_TOKEN"}, + {"path": "/gh-git/", "upstream": "https://github.com", + "auth_scheme": "Bearer", "token_ref": "GH", + "role": "git-insteadof"}, ])) self.assertIn('[url "http://cred-proxy:9099/gh-git/"]', out) self.assertIn("insteadOf = https://github.com/", out) def test_gitea_writes_per_host_insteadof(self): out = render_cred_proxy_gitconfig(_upstreams([ - {"Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "https://gitea.dideric.is"}, + {"path": "/gitea/dideric/", "upstream": "https://gitea.dideric.is", + "auth_scheme": "token", "token_ref": "GITEA", + "role": "git-insteadof"}, ])) - self.assertIn('[url "http://cred-proxy:9099/gitea/gitea.dideric.is/"]', out) + self.assertIn('[url "http://cred-proxy:9099/gitea/dideric/"]', out) self.assertIn("insteadOf = https://gitea.dideric.is/", out) - def test_two_giteas_yield_two_rules(self): + def test_two_routes_yield_two_rules(self): 
out = render_cred_proxy_gitconfig(_upstreams([ - {"Kind": "gitea", "TokenRef": "G1", - "Url": "https://gitea.dideric.is"}, - {"Kind": "gitea", "TokenRef": "G2", - "Url": "https://gitea.example.com"}, + {"path": "/gh-git/", "upstream": "https://github.com", + "auth_scheme": "Bearer", "token_ref": "GH", + "role": "git-insteadof"}, + {"path": "/gitea/x/", "upstream": "https://gitea.example.com", + "auth_scheme": "token", "token_ref": "GT", + "role": "git-insteadof"}, ])) self.assertEqual(2, out.count("insteadOf")) - self.assertIn("gitea.dideric.is/", out) - self.assertIn("gitea.example.com/", out) + self.assertIn("github.com", out) + self.assertIn("gitea.example.com", out) - def test_github_suppressed_when_git_gate_covers_host(self): + def test_suppressed_when_git_gate_covers_host(self): # When bottle.git brokers github.com over SSH, git-gate is the # canonical git path. The cred-proxy https://github.com/ # rewrite would let the agent push over HTTPS — bypassing # gitleaks. Suppress it. out = render_cred_proxy_gitconfig( - _upstreams([{"Kind": "github", "TokenRef": "GH"}]), + _upstreams([ + {"path": "/gh-git/", "upstream": "https://github.com", + "auth_scheme": "Bearer", "token_ref": "GH", + "role": "git-insteadof"}, + ]), {"github.com"}, ) self.assertEqual("", out) - def test_gitea_suppressed_when_git_gate_covers_host(self): - out = render_cred_proxy_gitconfig( - _upstreams([{"Kind": "gitea", "TokenRef": "T", - "Url": "https://gitea.dideric.is"}]), - {"gitea.dideric.is"}, - ) - self.assertEqual("", out) - - def test_partial_suppression_keeps_other_giteas(self): - # Two gitea instances; git-gate brokers one. The other still - # gets the cred-proxy rewrite. 
+ def test_partial_suppression_keeps_other_hosts(self): out = render_cred_proxy_gitconfig( _upstreams([ - {"Kind": "gitea", "TokenRef": "T1", - "Url": "https://gitea.dideric.is"}, - {"Kind": "gitea", "TokenRef": "T2", - "Url": "https://gitea.example.com"}, + {"path": "/gitea/a/", "upstream": "https://gitea.dideric.is", + "auth_scheme": "token", "token_ref": "T1", + "role": "git-insteadof"}, + {"path": "/gitea/b/", "upstream": "https://gitea.example.com", + "auth_scheme": "token", "token_ref": "T2", + "role": "git-insteadof"}, ]), {"gitea.dideric.is"}, ) @@ -119,24 +122,39 @@ class TestRenderGitconfig(unittest.TestCase): class TestRenderTeaConfig(unittest.TestCase): - def test_empty_when_no_gitea(self): + def test_empty_when_no_role(self): self.assertEqual("", render_tea_config(_upstreams([ - {"Kind": "github", "TokenRef": "G"}, + {"path": "/gh-git/", "upstream": "https://github.com", + "auth_scheme": "Bearer", "token_ref": "G"}, ]))) - def test_single_gitea_login_block(self): + def test_single_login_block(self): out = render_tea_config(_upstreams([ - {"Kind": "gitea", "TokenRef": "GITEA_TOKEN", - "Url": "https://gitea.dideric.is"}, + {"path": "/gitea/dideric/", "upstream": "https://gitea.dideric.is", + "auth_scheme": "token", "token_ref": "GITEA", + "role": "tea-login"}, ])) self.assertIn("logins:", out) + # Login name comes from the upstream host, not the path — + # the path may not encode the host. self.assertIn("- name: gitea.dideric.is", out) - self.assertIn("url: http://cred-proxy:9099/gitea/gitea.dideric.is/", out) - # Placeholder token, not the host env var name (which is not a - # secret but also not useful) or the real value (which the - # provisioner does not have). 
+        self.assertIn("url: http://cred-proxy:9099/gitea/dideric/", out)
         self.assertIn("token: cred-proxy-placeholder", out)
-        self.assertNotIn("GITEA_TOKEN", out)
+        self.assertNotIn("GITEA", out)
+
+
+class TestCombinedRoles(unittest.TestCase):
+    """A single gitea route typically carries both `git-insteadof`
+    and `tea-login`; the renderers should each fire independently."""
+
+    def test_gitea_route_fires_both_renderers(self):
+        routes = _upstreams([
+            {"path": "/gitea/x/", "upstream": "https://gitea.example.com",
+             "auth_scheme": "token", "token_ref": "T",
+             "role": ["git-insteadof", "tea-login"]},
+        ])
+        self.assertIn("insteadOf", render_cred_proxy_gitconfig(routes))
+        self.assertIn("logins:", render_tea_config(routes))


 if __name__ == "__main__":
-- 
2.52.0

From 2990c3c90319d5b68c67565fe7800a8b69776b43 Mon Sep 17 00:00:00 2001
From: didericis
Date: Fri, 15 May 2026 02:39:10 -0400
Subject: [PATCH 16/24] refactor(cred_proxy): rename Upstream -> Route, fix tea-login AttributeError
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Three leftovers from the manifest refactor:

1. provision/cred_proxy.py:223 referenced u.kind == 'gitea' for the
   tea login count; kind was removed from the runtime class, so any
   bottle with a tea-login route raised AttributeError at provision
   time. Switch to `'tea-login' in r.roles`.

2. The runtime class CredProxyUpstream is renamed to CredProxyRoute
   (its data is a route on the proxy, not an "upstream"; the field
   route.upstream is the upstream URL). Module's own naming now aligns
   with manifest.CredProxyRoute and routes.json.

3. cred_proxy_upstreams_for_bottle -> cred_proxy_routes_for_bottle;
   CredProxyPlan.upstreams -> CredProxyPlan.routes; local `upstreams`
   collections become `routes`. Callers in backend.py, launch.py,
   prepare.py, bottle_plan.py, provision/cred_proxy.py, and tests
   updated.

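For item 1, a minimal sketch of the role-based count, assuming a simplified stand-in for the runtime route class. `Route` and `count_tea_logins` are hypothetical names used only for illustration; they are not identifiers from this patch, and only the fields needed for the count are modeled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical stand-in for the resolved route (illustration only)."""

    path: str
    roles: tuple[str, ...] = ()


def count_tea_logins(routes: tuple[Route, ...]) -> int:
    # Count by role membership. A check against a `kind` attribute
    # would raise AttributeError here, because this class has no
    # such field.
    return len([r for r in routes if "tea-login" in r.roles])


routes = (
    Route(path="/gh-git/", roles=("git-insteadof",)),
    Route(path="/gitea/x/", roles=("git-insteadof", "tea-login")),
)
print(count_tea_logins(routes))
```

Under these assumptions, a route that carries both `git-insteadof` and `tea-login` is counted once, with no per-host special case.
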
Also strips lingering `bottle.tokens` references from docstrings
(pipelock.py, cred_proxy.py prepare(), manifest._parse_https_host,
test_pipelock_allowlist.py module doc) and removes dead helpers from
the integration test (the _bottle helper used a tokens field that no
longer parses).
---
 claude_bottle/backend/docker/bottle_plan.py   |  22 ++--
 claude_bottle/backend/docker/cred_proxy.py    |   6 +-
 claude_bottle/backend/docker/launch.py        |   4 +-
 claude_bottle/backend/docker/prepare.py       |   2 +-
 .../backend/docker/provision/cred_proxy.py    |  63 +++++-----
 claude_bottle/cred_proxy.py                   | 109 ++++++++++--------
 claude_bottle/manifest.py                     |   5 +-
 claude_bottle/pipelock.py                     |   2 +-
 tests/integration/test_cred_proxy_sidecar.py  |  31 +----
 tests/unit/test_cred_proxy.py                 |  24 ++--
 tests/unit/test_docker_cred_proxy.py          |  16 +--
 tests/unit/test_pipelock_allowlist.py         |   4 +-
 tests/unit/test_provision_cred_proxy.py       |   4 +-
 13 files changed, 141 insertions(+), 151 deletions(-)

diff --git a/claude_bottle/backend/docker/bottle_plan.py b/claude_bottle/backend/docker/bottle_plan.py
index 286d64d..e02ca9c 100644
--- a/claude_bottle/backend/docker/bottle_plan.py
+++ b/claude_bottle/backend/docker/bottle_plan.py
@@ -106,11 +106,11 @@ class DockerBottlePlan(BottlePlan):
             info(f" git gate : {'; '.join(git_lines)}")
         else:
             info(" git remotes : (none)")
-        if self.cred_proxy_plan.upstreams:
-            routes = [f"{u.path}→{u.upstream}" for u in self.cred_proxy_plan.upstreams]
-            refs = sorted({u.token_ref for u in self.cred_proxy_plan.upstreams})
-            info(f" cred-proxy : {len(routes)} route(s); tokens: {', '.join(refs)}")
-            for line in routes:
+        if self.cred_proxy_plan.routes:
+            lines = [f"{r.path}→{r.upstream}" for r in self.cred_proxy_plan.routes]
+            refs = sorted({r.token_ref for r in self.cred_proxy_plan.routes})
+            info(f" cred-proxy : {len(lines)} route(s); tokens: {', '.join(refs)}")
+            for line in lines:
                 info(f" {line}")
         else:
             info(" cred-proxy : (none)")
@@ -148,13 +148,13 @@ class DockerBottlePlan(BottlePlan):
             ],
"cred_proxy": [ { - "path": u.path, - "upstream": u.upstream, - "auth_scheme": u.auth_scheme, - "token_ref": u.token_ref, - "roles": list(u.roles), + "path": r.path, + "upstream": r.upstream, + "auth_scheme": r.auth_scheme, + "token_ref": r.token_ref, + "roles": list(r.roles), } - for u in self.cred_proxy_plan.upstreams + for r in self.cred_proxy_plan.routes ], "egress": { "host_count": len(hosts), diff --git a/claude_bottle/backend/docker/cred_proxy.py b/claude_bottle/backend/docker/cred_proxy.py index 54213cc..8c40ced 100644 --- a/claude_bottle/backend/docker/cred_proxy.py +++ b/claude_bottle/backend/docker/cred_proxy.py @@ -1,6 +1,6 @@ """DockerCredProxy — the Docker-specific lifecycle for the per-bottle cred-proxy sidecar (PRD 0010). Inherits the platform-agnostic prepare -step (upstream lift + routes.json render + token-env-map derivation) +step (route lift + routes.json render + token-env-map derivation) from `CredProxy`.""" from __future__ import annotations @@ -91,8 +91,8 @@ class DockerCredProxy(CredProxy): reach the real upstream over HTTPS. 6. `docker start`. Returns the container name (the target passed to `.stop`).""" - if not plan.upstreams: - die("DockerCredProxy.start called with no upstreams; caller should skip") + if not plan.routes: + die("DockerCredProxy.start called with no routes; caller should skip") if not plan.internal_network or not plan.egress_network: die( "DockerCredProxy.start: internal_network / egress_network must be " diff --git a/claude_bottle/backend/docker/launch.py b/claude_bottle/backend/docker/launch.py index a32747d..407f925 100644 --- a/claude_bottle/backend/docker/launch.py +++ b/claude_bottle/backend/docker/launch.py @@ -105,7 +105,7 @@ def launch( stack.callback(git_gate.stop, git_gate_name) # Cred-proxy (PRD 0010). One sidecar per bottle when - # bottle.tokens declares any kind. Must come up AFTER pipelock + # bottle.cred_proxy.routes is non-empty. 
Must come up AFTER pipelock # — cred-proxy routes its outbound HTTPS through pipelock # (HTTPS_PROXY in environ + the per-bottle CA in its trust # store) so the egress allowlist + body scanner sit in the @@ -113,7 +113,7 @@ def launch( # resolution for `cred-proxy` succeeds on the agent's first # call; tokens flow from the host env into the sidecar's # environ, not the agent's. - if plan.cred_proxy_plan.upstreams: + if plan.cred_proxy_plan.routes: cred_proxy_plan = dataclasses.replace( plan.cred_proxy_plan, internal_network=internal_network, diff --git a/claude_bottle/backend/docker/prepare.py b/claude_bottle/backend/docker/prepare.py index 46a81ea..2e22f7b 100644 --- a/claude_bottle/backend/docker/prepare.py +++ b/claude_bottle/backend/docker/prepare.py @@ -93,7 +93,7 @@ def resolve_plan( # anthropic-base-url role. Manifest validation enforces the # singleton constraint. anthropic_route = next( - (u for u in cred_proxy_plan.upstreams if "anthropic-base-url" in u.roles), + (r for r in cred_proxy_plan.routes if "anthropic-base-url" in r.roles), None, ) if spec.forward_oauth_token and anthropic_route is None: diff --git a/claude_bottle/backend/docker/provision/cred_proxy.py b/claude_bottle/backend/docker/provision/cred_proxy.py index a375cd1..53da4ea 100644 --- a/claude_bottle/backend/docker/provision/cred_proxy.py +++ b/claude_bottle/backend/docker/provision/cred_proxy.py @@ -22,7 +22,7 @@ import os import subprocess from pathlib import Path -from ....cred_proxy import CredProxyUpstream +from ....cred_proxy import CredProxyRoute from ....log import info from .. import util as docker_mod from ..bottle_plan import DockerBottlePlan @@ -31,21 +31,21 @@ from ..cred_proxy import cred_proxy_url def provision_cred_proxy(plan: DockerBottlePlan, target: str) -> None: """Drop the agent-side dotfiles for each declared cred-proxy - route. No-op when the bottle has no tokens.""" - upstreams = plan.cred_proxy_plan.upstreams - if not upstreams: + route. 
No-op when the bottle has no routes.""" + routes = plan.cred_proxy_plan.routes + if not routes: return bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name) git_gate_hosts = {g.UpstreamHost for g in bottle.git} - _provision_npmrc(plan, target, upstreams) - _provision_gitconfig(plan, target, upstreams, git_gate_hosts) - _provision_tea_config(plan, target, upstreams) + _provision_npmrc(plan, target, routes) + _provision_gitconfig(plan, target, routes, git_gate_hosts) + _provision_tea_config(plan, target, routes) # --- npm -------------------------------------------------------------------- -def render_npmrc(upstreams: tuple[CredProxyUpstream, ...]) -> str: +def render_npmrc(routes: tuple[CredProxyRoute, ...]) -> str: """Render `~/.npmrc` content. Driven by the `npm-registry` role: finds the (single) route that claims it and writes a registry= line at the proxy. Empty string when no such route exists, so @@ -55,18 +55,18 @@ def render_npmrc(upstreams: tuple[CredProxyUpstream, ...]) -> str: npmrc deliberately carries no `_authToken`. The registry alone is enough. 
Manifest validation enforces that the role is a singleton, so the first match is the only match.""" - for u in upstreams: - if "npm-registry" in u.roles: - return f"registry={cred_proxy_url()}{u.path}\n" + for r in routes: + if "npm-registry" in r.roles: + return f"registry={cred_proxy_url()}{r.path}\n" return "" def _provision_npmrc( plan: DockerBottlePlan, target: str, - upstreams: tuple[CredProxyUpstream, ...], + routes: tuple[CredProxyRoute, ...], ) -> None: - content = render_npmrc(upstreams) + content = render_npmrc(routes) if not content: return container_home = os.environ.get("CLAUDE_BOTTLE_CONTAINER_HOME", "/home/node") @@ -88,7 +88,7 @@ def _provision_npmrc( def render_cred_proxy_gitconfig( - upstreams: tuple[CredProxyUpstream, ...], + routes: tuple[CredProxyRoute, ...], git_gate_hosts: set[str] = frozenset(), # type: ignore[assignment] ) -> str: """Render the `~/.gitconfig` fragment for cred-proxy insteadOf @@ -105,23 +105,23 @@ def render_cred_proxy_gitconfig( suppressing the rewrite means `git clone https:///...` doesn't have a tempting shortcut that just confuses on push. - The insteadOf left-hand side comes from `upstream` (with a + The insteadOf left-hand side comes from `route.upstream` (with a trailing `/` so insteadOf matches at the directory boundary), so the same renderer handles github.com, gitea.dideric.is, and any future host the user wires up.""" rules: list[str] = [] - for u in upstreams: - if "git-insteadof" not in u.roles: + for r in routes: + if "git-insteadof" not in r.roles: continue # Strip scheme to derive the host for the git-gate overlap # check. urllib.parse-free parse: same shape we accept in # manifest validation. 
- host = u.upstream.removeprefix("https://").partition("/")[0].partition(":")[0] + host = r.upstream.removeprefix("https://").partition("/")[0].partition(":")[0] if host in git_gate_hosts: continue rules.append( - f'[url "{cred_proxy_url()}{u.path}"]\n' - f"\tinsteadOf = {u.upstream}/\n" + f'[url "{cred_proxy_url()}{r.path}"]\n' + f"\tinsteadOf = {r.upstream}/\n" ) if not rules: return "" @@ -136,14 +136,14 @@ def render_cred_proxy_gitconfig( def _provision_gitconfig( plan: DockerBottlePlan, target: str, - upstreams: tuple[CredProxyUpstream, ...], + routes: tuple[CredProxyRoute, ...], git_gate_hosts: set[str], ) -> None: """Append the cred-proxy insteadOf rules to ~/.gitconfig. Runs after `provision_git`, so any git-gate rules already live in the file; we append rather than overwrite. Hosts already brokered by git-gate are skipped — git-gate is the canonical git path there.""" - content = render_cred_proxy_gitconfig(upstreams, git_gate_hosts) + content = render_cred_proxy_gitconfig(routes, git_gate_hosts) if not content: return container_home = os.environ.get("CLAUDE_BOTTLE_CONTAINER_HOME", "/home/node") @@ -179,25 +179,25 @@ def _provision_gitconfig( # --- tea -------------------------------------------------------------------- -def render_tea_config(upstreams: tuple[CredProxyUpstream, ...]) -> str: +def render_tea_config(routes: tuple[CredProxyRoute, ...]) -> str: """Render `~/.config/tea/config.yml`. Driven by the `tea-login` role: each route that claims it produces one `logins:` entry pointing at the cred-proxy. The proxy substitutes the real token at request time; the value in `token:` here is a placeholder. 
`tea` refuses to make calls without a non-empty token field, so the placeholder is necessary.""" - tea_routes = [u for u in upstreams if "tea-login" in u.roles] + tea_routes = [r for r in routes if "tea-login" in r.roles] if not tea_routes: return "" lines = ["logins:"] - for u in tea_routes: + for r in tea_routes: # Derive a stable login name from the upstream host. The # path may not encode the host (e.g. `/gitea/dideric/` vs # upstream gitea.dideric.is), so we read it off `upstream`. - host = u.upstream.removeprefix("https://").partition("/")[0].partition(":")[0] + host = r.upstream.removeprefix("https://").partition("/")[0].partition(":")[0] lines.extend([ f"- name: {host}", - f" url: {cred_proxy_url()}{u.path}", + f" url: {cred_proxy_url()}{r.path}", " token: cred-proxy-placeholder", " default: false", " ssh_host: \"\"", @@ -210,9 +210,9 @@ def render_tea_config(upstreams: tuple[CredProxyUpstream, ...]) -> str: def _provision_tea_config( plan: DockerBottlePlan, target: str, - upstreams: tuple[CredProxyUpstream, ...], + routes: tuple[CredProxyRoute, ...], ) -> None: - content = render_tea_config(upstreams) + content = render_tea_config(routes) if not content: return container_home = os.environ.get("CLAUDE_BOTTLE_CONTAINER_HOME", "/home/node") @@ -220,7 +220,10 @@ def _provision_tea_config( cfg = plan.stage_dir / "agent_tea_config.yml" cfg.write_text(content) cfg.chmod(0o600) - info(f"writing {container_tea} ({len([u for u in upstreams if u.kind == 'gitea'])} gitea login(s))") + info( + f"writing {container_tea} " + f"({len([r for r in routes if 'tea-login' in r.roles])} tea login(s))" + ) docker_mod.docker_exec_root( target, ["mkdir", "-p", str(Path(container_tea).parent)] ) diff --git a/claude_bottle/cred_proxy.py b/claude_bottle/cred_proxy.py index 1bd4492..79ee4e5 100644 --- a/claude_bottle/cred_proxy.py +++ b/claude_bottle/cred_proxy.py @@ -14,8 +14,8 @@ escaping the agent container, the same threshold pipelock and git-gate already rely on. 
 This module defines the abstract proxy (`CredProxy`), its plan
-dataclass (`CredProxyPlan`), and the per-route shape
-(`CredProxyUpstream`). The sidecar's start/stop lifecycle is backend-
+dataclass (`CredProxyPlan`), and the resolved per-route shape
+(`CredProxyRoute`). The sidecar's start/stop lifecycle is backend-
 specific and lives on concrete subclasses (see
 `claude_bottle/backend/docker/cred_proxy.py`).
 """
@@ -32,10 +32,15 @@ from .manifest import Bottle


 @dataclass(frozen=True)
-class CredProxyUpstream:
-    """One route on the cred-proxy sidecar. Maps a path under the
-    proxy to a real upstream, an auth scheme, an in-container env-var
-    slot, and optional provisioner roles.
+class CredProxyRoute:
+    """One resolved route on the cred-proxy sidecar. Maps a path
+    under the proxy to a real upstream, an auth scheme, an
+    in-container env-var slot, and optional provisioner roles.
+
+    Distinct from `manifest.CredProxyRoute` (the declaration shape
+    with Capitalized fields): this is the runtime view after the
+    abstract `CredProxy.prepare` step assigns token slots and
+    normalizes URLs. Modules that need both alias one on import.

     `path` is the agent-facing prefix (e.g. `/anthropic/`).
     `upstream` is the upstream base URL with scheme. `auth_scheme`
@@ -46,12 +51,12 @@ class CredProxyUpstream:
     `token_env` is the env-var name inside the cred-proxy container
     (e.g. `CRED_PROXY_TOKEN_0`); `token_ref` is the host env var the
     CLI reads at launch and forwards into the container's environ
-    under `token_env`. Routes that share a TokenRef coalesce to one
-    `token_env` slot.
+    under `token_env`. Routes that share a `token_ref` coalesce to
+    one `token_env` slot.

     `roles` are the provisioner tags from the manifest route (see
     `manifest.CRED_PROXY_ROLES`).
Each tag drives one agent-side - rewrite when this upstream's dotfile family is written.""" + rewrite when this route's dotfile family is written.""" path: str upstream: str @@ -65,16 +70,16 @@ class CredProxyUpstream: class CredProxyPlan: """Output of CredProxy.prepare; consumed by .start. - The slug + routes_path + upstreams + token_env_map fields are + The slug + routes_path + routes + token_env_map fields are filled at prepare time (host-side, side-effect-free on docker). The network + pipelock fields are populated by the backend's launch step via `dataclasses.replace` once those resources exist. Empty defaults are sentinels meaning "not yet set"; `.start` validates that they are populated. - `token_env_map` is `{: }`. - The backend's start step reads `os.environ[TokenRef]` and forwards - the value into the cred-proxy container's environ under + `token_env_map` is `{: }`. + The backend's start step reads `os.environ[token_ref]` and + forwards the value into the cred-proxy container's environ under `token_env`. The plan itself never holds token values — secrets never land in a dataclass that might be logged. @@ -88,7 +93,7 @@ class CredProxyPlan: slug: str routes_path: Path - upstreams: tuple[CredProxyUpstream, ...] + routes: tuple[CredProxyRoute, ...] token_env_map: dict[str, str] internal_network: str = "" egress_network: str = "" @@ -96,28 +101,29 @@ class CredProxyPlan: pipelock_proxy_url: str = "" -def cred_proxy_upstreams_for_bottle( +def cred_proxy_routes_for_bottle( bottle: Bottle, -) -> tuple[CredProxyUpstream, ...]: - """Lift each `bottle.cred_proxy.routes[]` entry into a - CredProxyUpstream. Order is preserved so route lookup is stable. +) -> tuple[CredProxyRoute, ...]: + """Lift each `bottle.cred_proxy.routes[]` manifest entry into a + resolved CredProxyRoute. Order is preserved so route lookup at + the proxy is stable. 
- Token-env slots are assigned per distinct TokenRef: the first - route with TokenRef "GH_PAT" gets `CRED_PROXY_TOKEN_0`; a second - route with the same TokenRef shares slot 0. The launch step - forwards each TokenRef's value from the host environ into the - sidecar's environ under the matching slot name once. + Token-env slots are assigned per distinct `token_ref`: the first + route with `token_ref` "GH_PAT" gets `CRED_PROXY_TOKEN_0`; a + second route with the same `token_ref` shares slot 0. The launch + step forwards each `token_ref`'s value from the host environ into + the sidecar's environ under the matching slot name once. Manifest validation already enforced uniqueness rules (no duplicate paths, singleton-role enforcement).""" - out: list[CredProxyUpstream] = [] + out: list[CredProxyRoute] = [] slot_for_token: dict[str, str] = {} for r in bottle.cred_proxy.routes: token_env = slot_for_token.get(r.TokenRef) if token_env is None: token_env = f"CRED_PROXY_TOKEN_{len(slot_for_token)}" slot_for_token[r.TokenRef] = token_env - out.append(CredProxyUpstream( + out.append(CredProxyRoute( path=r.Path, upstream=r.Upstream.rstrip("/"), auth_scheme=r.AuthScheme, @@ -129,27 +135,27 @@ def cred_proxy_upstreams_for_bottle( def cred_proxy_token_env_map( - upstreams: tuple[CredProxyUpstream, ...], + routes: tuple[CredProxyRoute, ...], ) -> dict[str, str]: - """Collapse the upstream list into `{token_env: TokenRef}`. Two + """Collapse the route list into `{token_env: token_ref}`. Two routes that share a token (gh-api + gh-git) coalesce; the result is the set of env vars the backend's start step must forward into the sidecar's environ.""" out: dict[str, str] = {} - for u in upstreams: - existing = out.get(u.token_env) - if existing is not None and existing != u.token_ref: + for r in routes: + existing = out.get(r.token_env) + if existing is not None and existing != r.token_ref: die( - f"cred-proxy plan conflict: {u.token_env} maps to both " - f"{existing!r} and {u.token_ref!r}. 
Two routes sharing a " + f"cred-proxy plan conflict: {r.token_env} maps to both " + f"{existing!r} and {r.token_ref!r}. Two routes sharing a " f"token slot must reference the same host env var." ) - out[u.token_env] = u.token_ref + out[r.token_env] = r.token_ref return out def cred_proxy_render_routes( - upstreams: tuple[CredProxyUpstream, ...], + routes: tuple[CredProxyRoute, ...], ) -> str: """Serialize the route table for the cred-proxy server to read. JSON, no token values, no host env-var names — the only thing @@ -159,12 +165,12 @@ def cred_proxy_render_routes( payload = { "routes": [ { - "path": u.path, - "upstream": u.upstream, - "auth_scheme": u.auth_scheme, - "token_env": u.token_env, + "path": r.path, + "upstream": r.upstream, + "auth_scheme": r.auth_scheme, + "token_env": r.token_env, } - for u in upstreams + for r in routes ], } return json.dumps(payload, indent=2, sort_keys=False) + "\n" @@ -201,29 +207,30 @@ def cred_proxy_resolve_token_values( class CredProxy(ABC): """The per-bottle credential proxy. Encapsulates the host-side - prepare (upstream lift + routes.json render + token-env-map + prepare (route lift + routes.json render + token-env-map derivation); the sidecar's start/stop lifecycle is backend- specific and lives on concrete subclasses.""" def prepare(self, bottle: Bottle, slug: str, stage_dir: Path) -> CredProxyPlan: - """Lift `bottle.tokens` into the upstream table, render the - routes.json (mode 600) under `stage_dir`, and return the plan. - Pure host-side, no docker subprocess. The token-env map records - the mapping the launch step uses to forward values from the - host's environ into the sidecar's environ. + """Lift `bottle.cred_proxy.routes` into resolved routes, + render the routes.json (mode 600) under `stage_dir`, and + return the plan. Pure host-side, no docker subprocess. The + token-env map records the mapping the launch step uses to + forward values from the host's environ into the sidecar's + environ. 
Returned plan is incomplete: the launch step must fill `internal_network` / `egress_network` via `dataclasses.replace` before passing it to `.start`.""" - upstreams = cred_proxy_upstreams_for_bottle(bottle) + routes = cred_proxy_routes_for_bottle(bottle) routes_path = stage_dir / "cred_proxy_routes.json" - routes_path.write_text(cred_proxy_render_routes(upstreams)) + routes_path.write_text(cred_proxy_render_routes(routes)) routes_path.chmod(0o600) return CredProxyPlan( slug=slug, routes_path=routes_path, - upstreams=upstreams, - token_env_map=cred_proxy_token_env_map(upstreams), + routes=routes, + token_env_map=cred_proxy_token_env_map(routes), ) @abstractmethod @@ -242,9 +249,9 @@ class CredProxy(ABC): __all__ = [ "CredProxy", "CredProxyPlan", - "CredProxyUpstream", + "CredProxyRoute", "cred_proxy_render_routes", "cred_proxy_resolve_token_values", + "cred_proxy_routes_for_bottle", "cred_proxy_token_env_map", - "cred_proxy_upstreams_for_bottle", ] diff --git a/claude_bottle/manifest.py b/claude_bottle/manifest.py index dc332fe..f5dba36 100644 --- a/claude_bottle/manifest.py +++ b/claude_bottle/manifest.py @@ -608,8 +608,9 @@ def _parse_git_upstream(url: str, label: str) -> tuple[str, str, str, str]: def _parse_https_host(url: str, label: str) -> str: """Extract the host from an `https://host[:port][/path]` URL. Dies if `url` is not an https:// URL or the host segment is empty. 
- Used to derive `TokenEntry.UpstreamHost` from a gitea Url so the - cross-validator can spot collisions with `bottle.git` hosts.""" + Used to derive `CredProxyRoute.UpstreamHost` from a route's + `upstream` so pipelock's allowlist (and the provisioner's git-gate + overlap check) can match on host alone.""" if not url.startswith("https://"): die(f"{label} must be an https:// URL (was {url!r})") rest = url[len("https://"):] diff --git a/claude_bottle/pipelock.py b/claude_bottle/pipelock.py index 3ae11e0..52f5b23 100644 --- a/claude_bottle/pipelock.py +++ b/claude_bottle/pipelock.py @@ -74,7 +74,7 @@ def pipelock_token_hosts(bottle: Bottle) -> list[str]: def pipelock_effective_allowlist(bottle: Bottle) -> list[str]: """Deduplicated union of: baked-in defaults, bottle.egress.allowlist, - and the cred-proxy upstream hosts derived from bottle.tokens. + and the cred-proxy upstream hosts derived from bottle.cred_proxy.routes. Sorted for stability. Git upstreams declared in `bottle.git` do NOT contribute here — git traffic flows through the per-agent git-gate sidecar (PRD 0008), not pipelock.""" diff --git a/tests/integration/test_cred_proxy_sidecar.py b/tests/integration/test_cred_proxy_sidecar.py index 7e1d468..c407380 100644 --- a/tests/integration/test_cred_proxy_sidecar.py +++ b/tests/integration/test_cred_proxy_sidecar.py @@ -11,7 +11,6 @@ egress net. cred-proxy straddles both. 
from __future__ import annotations -import dataclasses import json import os import shutil @@ -32,8 +31,6 @@ from claude_bottle.backend.docker.network import ( network_create_internal, network_remove, ) -from claude_bottle.cred_proxy import CredProxy -from claude_bottle.manifest import Manifest from tests._docker import skip_unless_docker @@ -43,24 +40,6 @@ FAKE_UPSTREAM_HOST = "fake-upstream" FAKE_UPSTREAM_PORT = "8080" -def _bottle(tokens): - return Manifest.from_json_obj({ - "bottles": {"dev": {"tokens": tokens}}, - "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, - }).bottles["dev"] - - -class _StubCredProxy(CredProxy): - """CredProxy.prepare's render uses the Kind defaults, but the - integration test needs the cred-proxy to forward to the fake - upstream — not api.anthropic.com / github.com / npmjs.org. We - pass a one-route plan in directly via DockerCredProxy.start - rather than going through the manifest path.""" - - def start(self, plan): raise NotImplementedError - def stop(self, target): return None - - def _make_routes_json(upstream_host: str, upstream_port: str) -> str: payload = { "routes": [ @@ -140,12 +119,12 @@ class TestCredProxySidecar(unittest.TestCase): def _start_cred_proxy_via_production_code(self) -> str: """Run DockerCredProxy.start with a plan that points at the - fake upstream. We bypass the manifest path (which fixes - upstreams by Kind) by handing .start an already-rendered - routes.json.""" + fake upstream. 
We bypass the manifest path so we can route + the proxy at a test-only upstream (the fake-upstream + container) without going through the parser.""" from claude_bottle.cred_proxy import ( CredProxyPlan, - CredProxyUpstream, + CredProxyRoute, ) routes_path = self.work_dir / "routes.json" routes_path.write_text(_make_routes_json(FAKE_UPSTREAM_HOST, FAKE_UPSTREAM_PORT)) @@ -153,7 +132,7 @@ class TestCredProxySidecar(unittest.TestCase): plan = CredProxyPlan( slug=self.slug, routes_path=routes_path, - upstreams=(CredProxyUpstream( + routes=(CredProxyRoute( path="/fake/", upstream=f"http://{FAKE_UPSTREAM_HOST}:{FAKE_UPSTREAM_PORT}", auth_scheme="Bearer", diff --git a/tests/unit/test_cred_proxy.py b/tests/unit/test_cred_proxy.py index d238ab8..b62cd7c 100644 --- a/tests/unit/test_cred_proxy.py +++ b/tests/unit/test_cred_proxy.py @@ -1,4 +1,4 @@ -"""Unit: CredProxy upstream lift + routes.json render + token resolution +"""Unit: CredProxy route lift + routes.json render + token resolution (PRD 0010).""" import json @@ -8,7 +8,7 @@ from claude_bottle.cred_proxy import ( cred_proxy_render_routes, cred_proxy_resolve_token_values, cred_proxy_token_env_map, - cred_proxy_upstreams_for_bottle, + cred_proxy_routes_for_bottle, ) from claude_bottle.log import Die from claude_bottle.manifest import Manifest @@ -28,7 +28,7 @@ class TestUpstreamLift(unittest.TestCase): "auth_scheme": "Bearer", "token_ref": "CLAUDE_BOTTLE_OAUTH_TOKEN", "role": "anthropic-base-url"}, ]) - upstreams = cred_proxy_upstreams_for_bottle(b) + upstreams = cred_proxy_routes_for_bottle(b) self.assertEqual(1, len(upstreams)) u = upstreams[0] self.assertEqual("/anthropic/", u.path) @@ -47,7 +47,7 @@ class TestUpstreamLift(unittest.TestCase): "auth_scheme": "Bearer", "token_ref": "GH_PAT", "role": "git-insteadof"}, ]) - upstreams = cred_proxy_upstreams_for_bottle(b) + upstreams = cred_proxy_routes_for_bottle(b) self.assertEqual(2, len(upstreams)) self.assertEqual({"CRED_PROXY_TOKEN_0"}, {u.token_env for u in 
upstreams}) @@ -61,7 +61,7 @@ class TestUpstreamLift(unittest.TestCase): {"path": "/c/", "upstream": "https://c.example", "auth_scheme": "Bearer", "token_ref": "T1"}, ]) - upstreams = cred_proxy_upstreams_for_bottle(b) + upstreams = cred_proxy_routes_for_bottle(b) # T1 -> slot 0, T2 -> slot 1, T1 reuses slot 0. self.assertEqual("CRED_PROXY_TOKEN_0", upstreams[0].token_env) self.assertEqual("CRED_PROXY_TOKEN_1", upstreams[1].token_env) @@ -73,7 +73,7 @@ class TestUpstreamLift(unittest.TestCase): "auth_scheme": "token", "token_ref": "T"}, ]) self.assertEqual("https://gitea.dideric.is", - cred_proxy_upstreams_for_bottle(b)[0].upstream) + cred_proxy_routes_for_bottle(b)[0].upstream) def test_roles_list_passes_through(self): b = _bottle([ @@ -82,11 +82,11 @@ class TestUpstreamLift(unittest.TestCase): "role": ["git-insteadof", "tea-login"]}, ]) self.assertEqual(("git-insteadof", "tea-login"), - cred_proxy_upstreams_for_bottle(b)[0].roles) + cred_proxy_routes_for_bottle(b)[0].roles) def test_empty_routes_yields_empty_upstreams(self): b = _bottle([]) - self.assertEqual((), cred_proxy_upstreams_for_bottle(b)) + self.assertEqual((), cred_proxy_routes_for_bottle(b)) class TestTokenEnvMap(unittest.TestCase): @@ -97,7 +97,7 @@ class TestTokenEnvMap(unittest.TestCase): {"path": "/b/", "upstream": "https://b.example", "auth_scheme": "Bearer", "token_ref": "B"}, ]) - m = cred_proxy_token_env_map(cred_proxy_upstreams_for_bottle(b)) + m = cred_proxy_token_env_map(cred_proxy_routes_for_bottle(b)) self.assertEqual({"CRED_PROXY_TOKEN_0": "A", "CRED_PROXY_TOKEN_1": "B"}, m) @@ -108,7 +108,7 @@ class TestTokenEnvMap(unittest.TestCase): {"path": "/gh-git/", "upstream": "https://github.com", "auth_scheme": "Bearer", "token_ref": "GH"}, ]) - m = cred_proxy_token_env_map(cred_proxy_upstreams_for_bottle(b)) + m = cred_proxy_token_env_map(cred_proxy_routes_for_bottle(b)) self.assertEqual({"CRED_PROXY_TOKEN_0": "GH"}, m) @@ -120,7 +120,7 @@ class TestRoutesRender(unittest.TestCase): {"path": 
"/gitea/x/", "upstream": "https://gitea.dideric.is", "auth_scheme": "token", "token_ref": "GITEA_TOKEN"}, ]) - rendered = cred_proxy_render_routes(cred_proxy_upstreams_for_bottle(b)) + rendered = cred_proxy_render_routes(cred_proxy_routes_for_bottle(b)) payload = json.loads(rendered) self.assertEqual(["routes"], list(payload.keys())) self.assertEqual(2, len(payload["routes"])) @@ -134,7 +134,7 @@ class TestRoutesRender(unittest.TestCase): # or the host-side TokenRef name. b = _bottle([{"path": "/x/", "upstream": "https://x.example", "auth_scheme": "Bearer", "token_ref": "GITHUB_TOKEN"}]) - rendered = cred_proxy_render_routes(cred_proxy_upstreams_for_bottle(b)) + rendered = cred_proxy_render_routes(cred_proxy_routes_for_bottle(b)) self.assertNotIn("GITHUB_TOKEN", rendered) def test_empty_upstreams_renders_empty_routes_array(self): diff --git a/tests/unit/test_docker_cred_proxy.py b/tests/unit/test_docker_cred_proxy.py index d1c6c42..69b3cee 100644 --- a/tests/unit/test_docker_cred_proxy.py +++ b/tests/unit/test_docker_cred_proxy.py @@ -15,7 +15,7 @@ from claude_bottle.backend.docker.cred_proxy import ( cred_proxy_container_name, cred_proxy_url, ) -from claude_bottle.cred_proxy import CredProxyPlan, CredProxyUpstream +from claude_bottle.cred_proxy import CredProxyPlan, CredProxyRoute from claude_bottle.log import Die @@ -23,7 +23,7 @@ def _empty_plan(**overrides): base = { "slug": "demo", "routes_path": Path("/nonexistent"), - "upstreams": (), + "routes": (), "token_env_map": {}, "internal_network": "", "egress_network": "", @@ -56,17 +56,17 @@ class TestStartGuards(unittest.TestCase): self.proxy.start(_empty_plan()) def test_missing_internal_network_dies(self): - upstream = CredProxyUpstream( + upstream = CredProxyRoute( path="/anthropic/", upstream="https://api.anthropic.com", auth_scheme="Bearer", token_env="CRED_PROXY_TOKEN_0", token_ref="T", ) with self.assertRaises(Die): - self.proxy.start(_empty_plan(upstreams=(upstream,))) + 
self.proxy.start(_empty_plan(routes=(upstream,))) def test_missing_routes_file_dies(self): - upstream = CredProxyUpstream( + upstream = CredProxyRoute( path="/anthropic/", upstream="https://api.anthropic.com", auth_scheme="Bearer", token_env="CRED_PROXY_TOKEN_0", @@ -74,7 +74,7 @@ class TestStartGuards(unittest.TestCase): ) with self.assertRaises(Die): self.proxy.start(_empty_plan( - upstreams=(upstream,), + routes=(upstream,), internal_network="net-x", egress_network="egress-x", routes_path=Path("/tmp/cred-proxy-test-does-not-exist.json"), @@ -83,7 +83,7 @@ class TestStartGuards(unittest.TestCase): def test_pipelock_url_without_ca_dies(self): # URL set + CA path empty/missing is a wiring bug: either both # populated (production) or both empty (test escape hatch). - upstream = CredProxyUpstream( + upstream = CredProxyRoute( path="/anthropic/", upstream="https://api.anthropic.com", auth_scheme="Bearer", token_env="CRED_PROXY_TOKEN_0", @@ -92,7 +92,7 @@ class TestStartGuards(unittest.TestCase): with tempfile.NamedTemporaryFile() as routes: with self.assertRaises(Die): self.proxy.start(_empty_plan( - upstreams=(upstream,), + routes=(upstream,), internal_network="net-x", egress_network="egress-x", routes_path=Path(routes.name), diff --git a/tests/unit/test_pipelock_allowlist.py b/tests/unit/test_pipelock_allowlist.py index 0743974..29125aa 100644 --- a/tests/unit/test_pipelock_allowlist.py +++ b/tests/unit/test_pipelock_allowlist.py @@ -1,7 +1,7 @@ """Unit: pipelock_effective_allowlist — the union of baked-in defaults, bottle.egress.allowlist, and cred-proxy upstream hosts derived from -bottle.tokens (PRD 0010). Git upstreams declared in bottle.git do not -contribute here; they flow through the per-agent git-gate (PRD 0008).""" +bottle.cred_proxy.routes (PRD 0010). 
Git upstreams declared in bottle.git +do not contribute here; they flow through the per-agent git-gate (PRD 0008).""" import unittest diff --git a/tests/unit/test_provision_cred_proxy.py b/tests/unit/test_provision_cred_proxy.py index d58735a..6fc026a 100644 --- a/tests/unit/test_provision_cred_proxy.py +++ b/tests/unit/test_provision_cred_proxy.py @@ -10,7 +10,7 @@ from claude_bottle.backend.docker.provision.cred_proxy import ( render_npmrc, render_tea_config, ) -from claude_bottle.cred_proxy import cred_proxy_upstreams_for_bottle +from claude_bottle.cred_proxy import cred_proxy_routes_for_bottle from claude_bottle.manifest import Manifest @@ -22,7 +22,7 @@ def _bottle(routes): def _upstreams(routes): - return cred_proxy_upstreams_for_bottle(_bottle(routes)) + return cred_proxy_routes_for_bottle(_bottle(routes)) class TestRenderNpmrc(unittest.TestCase): -- 2.52.0 From 0eb482daf0a901407e7b0b0f1d74cd072d444914 Mon Sep 17 00:00:00 2001 From: didericis Date: Sun, 24 May 2026 12:33:54 -0400 Subject: [PATCH 17/24] fix(docker): surface sidecar docker errors + probe for name orphans MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two failure-clarity paper cuts from the cred-proxy debugging: 1. Every docker create / start / network-connect call on the three sidecars (pipelock, git-gate, cred-proxy) was piping stderr to DEVNULL. A stuck orphan from a previous run produced "failed to create pipelock sidecar claude-bottle-pipelock-demo" with no pointer at the real cause ("Conflict. The container name ... is already in use ..."). Switch each call to capture_output=True and include the stripped stderr in the die() message. 2. The agent container had a container_exists() probe in resolve_plan that fails fast with a hint, but the sidecars (whose names are deterministic from the slug) didn't. So an orphan caused launch() to bail deep inside docker create. 
Add a probe in resolve_plan for each sidecar this launch will actually try to create: pipelock always; git-gate when bottle.git is non-empty; cred-proxy when bottle.cred_proxy.routes is non-empty. Die with a "./cli.py cleanup" pointer. Smoke-tested with an orphaned pipelock- container — the new probe fires with the expected hint before any sidecar build/start work begins. --- claude_bottle/backend/docker/cred_proxy.py | 42 +++++++++---------- claude_bottle/backend/docker/git_gate.py | 41 +++++++++--------- claude_bottle/backend/docker/pipelock.py | 49 ++++++++++++++-------- claude_bottle/backend/docker/prepare.py | 33 +++++++++++++-- 4 files changed, 104 insertions(+), 61 deletions(-) diff --git a/claude_bottle/backend/docker/cred_proxy.py b/claude_bottle/backend/docker/cred_proxy.py index 8c40ced..6da6fce 100644 --- a/claude_bottle/backend/docker/cred_proxy.py +++ b/claude_bottle/backend/docker/cred_proxy.py @@ -161,14 +161,14 @@ class DockerCredProxy(CredProxy): child_env: dict[str, str] = {**os.environ, **token_values} - if subprocess.run( - create_args, - stdout=subprocess.DEVNULL, - stderr=subprocess.DEVNULL, - env=child_env, - check=False, - ).returncode != 0: - die(f"failed to create cred-proxy sidecar {name}") + create_result = subprocess.run( + create_args, capture_output=True, text=True, env=child_env, check=False, + ) + if create_result.returncode != 0: + die( + f"failed to create cred-proxy sidecar {name}: " + f"{create_result.stderr.strip()}" + ) cps: list[tuple[str, str, str]] = [ (str(plan.routes_path), CRED_PROXY_ROUTES_IN_CONTAINER, "routes.json"), @@ -202,12 +202,11 @@ class DockerCredProxy(CredProxy): f"{cp_result.stderr.strip()}" ) - if subprocess.run( + connect_result = subprocess.run( ["docker", "network", "connect", plan.egress_network, name], - stdout=subprocess.DEVNULL, - stderr=subprocess.DEVNULL, - check=False, - ).returncode != 0: + capture_output=True, text=True, check=False, + ) + if connect_result.returncode != 0: subprocess.run( 
["docker", "rm", "-f", name], stdout=subprocess.DEVNULL, @@ -216,22 +215,23 @@ class DockerCredProxy(CredProxy): ) die( f"failed to attach cred-proxy sidecar {name} to egress network " - f"{plan.egress_network}" + f"{plan.egress_network}: {connect_result.stderr.strip()}" ) - if subprocess.run( - ["docker", "start", name], - stdout=subprocess.DEVNULL, - stderr=subprocess.DEVNULL, - check=False, - ).returncode != 0: + start_result = subprocess.run( + ["docker", "start", name], capture_output=True, text=True, check=False, + ) + if start_result.returncode != 0: subprocess.run( ["docker", "rm", "-f", name], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False, ) - die(f"failed to start cred-proxy sidecar {name}") + die( + f"failed to start cred-proxy sidecar {name}: " + f"{start_result.stderr.strip()}" + ) return name diff --git a/claude_bottle/backend/docker/git_gate.py b/claude_bottle/backend/docker/git_gate.py index 5ad312c..935968b 100644 --- a/claude_bottle/backend/docker/git_gate.py +++ b/claude_bottle/backend/docker/git_gate.py @@ -110,13 +110,14 @@ class DockerGitGate(GitGate): for host, ip in git_gate_aggregate_extra_hosts(plan.upstreams).items(): create_args.extend(["--add-host", f"{host}:{ip}"]) create_args.append(GIT_GATE_IMAGE) - if subprocess.run( - create_args, - stdout=subprocess.DEVNULL, - stderr=subprocess.DEVNULL, - check=False, - ).returncode != 0: - die(f"failed to create git-gate sidecar {name}") + create_result = subprocess.run( + create_args, capture_output=True, text=True, check=False, + ) + if create_result.returncode != 0: + die( + f"failed to create git-gate sidecar {name}: " + f"{create_result.stderr.strip()}" + ) # Order matters: entrypoint + hook first so they're present # when docker start fires. Per-upstream creds afterwards. 
@@ -166,12 +167,11 @@ class DockerGitGate(GitGate): f"{cp_result.stderr.strip()}" ) - if subprocess.run( + connect_result = subprocess.run( ["docker", "network", "connect", plan.egress_network, name], - stdout=subprocess.DEVNULL, - stderr=subprocess.DEVNULL, - check=False, - ).returncode != 0: + capture_output=True, text=True, check=False, + ) + if connect_result.returncode != 0: subprocess.run( ["docker", "rm", "-f", name], stdout=subprocess.DEVNULL, @@ -180,22 +180,23 @@ class DockerGitGate(GitGate): ) die( f"failed to attach git-gate sidecar {name} to egress network " - f"{plan.egress_network}" + f"{plan.egress_network}: {connect_result.stderr.strip()}" ) - if subprocess.run( - ["docker", "start", name], - stdout=subprocess.DEVNULL, - stderr=subprocess.DEVNULL, - check=False, - ).returncode != 0: + start_result = subprocess.run( + ["docker", "start", name], capture_output=True, text=True, check=False, + ) + if start_result.returncode != 0: subprocess.run( ["docker", "rm", "-f", name], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False, ) - die(f"failed to start git-gate sidecar {name}") + die( + f"failed to start git-gate sidecar {name}: " + f"{start_result.stderr.strip()}" + ) return name diff --git a/claude_bottle/backend/docker/pipelock.py b/claude_bottle/backend/docker/pipelock.py index 7359da4..89ce9f2 100644 --- a/claude_bottle/backend/docker/pipelock.py +++ b/claude_bottle/backend/docker/pipelock.py @@ -110,8 +110,14 @@ class DockerPipelockProxy(PipelockProxy): "run", "--config", "/etc/pipelock.yaml", "--listen", f"0.0.0.0:{PIPELOCK_PORT}", ] - if subprocess.run(create_args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False).returncode != 0: - die(f"failed to create pipelock sidecar {name}") + create_result = subprocess.run( + create_args, capture_output=True, text=True, check=False, + ) + if create_result.returncode != 0: + die( + f"failed to create pipelock sidecar {name}: " + f"{create_result.stderr.strip()}" + ) for src, 
dst, label in ( (plan.yaml_path, "/etc/pipelock.yaml", "yaml"), @@ -131,23 +137,32 @@ class DockerPipelockProxy(PipelockProxy): ) die(f"failed to copy pipelock {label} into {name}: {cp_result.stderr.strip()}") - if subprocess.run( + connect_result = subprocess.run( ["docker", "network", "connect", plan.egress_network, name], - stdout=subprocess.DEVNULL, - stderr=subprocess.DEVNULL, - check=False, - ).returncode != 0: - subprocess.run(["docker", "rm", "-f", name], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False) - die(f"failed to attach pipelock sidecar {name} to egress network {plan.egress_network}") + capture_output=True, text=True, check=False, + ) + if connect_result.returncode != 0: + subprocess.run( + ["docker", "rm", "-f", name], + stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False, + ) + die( + f"failed to attach pipelock sidecar {name} to egress network " + f"{plan.egress_network}: {connect_result.stderr.strip()}" + ) - if subprocess.run( - ["docker", "start", name], - stdout=subprocess.DEVNULL, - stderr=subprocess.DEVNULL, - check=False, - ).returncode != 0: - subprocess.run(["docker", "rm", "-f", name], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False) - die(f"failed to start pipelock sidecar {name}") + start_result = subprocess.run( + ["docker", "start", name], capture_output=True, text=True, check=False, + ) + if start_result.returncode != 0: + subprocess.run( + ["docker", "rm", "-f", name], + stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False, + ) + die( + f"failed to start pipelock sidecar {name}: " + f"{start_result.stderr.strip()}" + ) return name diff --git a/claude_bottle/backend/docker/prepare.py b/claude_bottle/backend/docker/prepare.py index 2e22f7b..be0ddb5 100644 --- a/claude_bottle/backend/docker/prepare.py +++ b/claude_bottle/backend/docker/prepare.py @@ -19,9 +19,13 @@ from ...log import die from .. import BottleSpec from . 
import util as docker_mod from .bottle_plan import DockerBottlePlan -from .cred_proxy import DockerCredProxy, cred_proxy_url -from .git_gate import DockerGitGate -from .pipelock import DockerPipelockProxy +from .cred_proxy import ( + DockerCredProxy, + cred_proxy_container_name, + cred_proxy_url, +) +from .git_gate import DockerGitGate, git_gate_container_name +from .pipelock import DockerPipelockProxy, pipelock_container_name def resolve_plan( @@ -76,6 +80,29 @@ def resolve_plan( f"clean up old containers with 'docker rm -f '" ) + # Probe sidecar container names for orphans from a previous run. + # Sidecar names are deterministic from the slug; an orphan would + # surface as a docker-create conflict deep inside launch() with no + # actionable hint. Fail fast here with a cleanup pointer instead. + # Only probe sidecars this launch will actually try to create: + # pipelock always; git-gate when bottle.git is non-empty; cred-proxy + # when bottle.cred_proxy.routes is non-empty. + sidecar_probes: list[tuple[str, str]] = [ + ("pipelock", pipelock_container_name(slug)), + ] + if bottle.git: + sidecar_probes.append(("git-gate", git_gate_container_name(slug))) + if bottle.cred_proxy.routes: + sidecar_probes.append(("cred-proxy", cred_proxy_container_name(slug))) + for label, sidecar_name in sidecar_probes: + if docker_mod.container_exists(sidecar_name): + die( + f"{label} sidecar container '{sidecar_name}' already exists. " + f"This is an orphan from a previous run; clean it up with " + f"'./cli.py cleanup' (or 'docker rm -f {sidecar_name}') and " + f"retry." 
+ ) + env_file = stage_dir / "agent.env" prompt_file = stage_dir / "prompt.txt" prompt_file.write_text("") -- 2.52.0 From 32b62cbacc657d1072e83e05e9c4dc24081a4413 Mon Sep 17 00:00:00 2001 From: didericis Date: Sun, 24 May 2026 12:56:09 -0400 Subject: [PATCH 18/24] feat(cred_proxy)!: cred-proxy is the only Anthropic auth path MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Removes the legacy `CLAUDE_BOTTLE_OAUTH_TOKEN` -> `CLAUDE_CODE_OAUTH_TOKEN` forward in prepare.py. Bottles that need claude-code to authenticate must declare a cred_proxy route with role: "anthropic-base-url" — there is no fallback that hands the token to the agent directly. Drops the now-dead BottleSpec.forward_oauth_token field, the CLI setter that read CLAUDE_BOTTLE_OAUTH_TOKEN from the host env at prepare time, and the forward_oauth_token=False arg in the six pipelock integration tests. PRD 0010 and README updated; the dev ~/claude-bottle.json gains an anthropic-base-url route so the implementer/researcher agents keep working. BREAKING: bottles previously relying on the implicit OAuth forward will now produce an agent environ without any Anthropic credential. Verified with --dry-run: a bottle with no anthropic-base-url route yields env_names: [] (no token at all); a bottle that declares the route yields ANTHROPIC_BASE_URL plus a non-secret placeholder for CLAUDE_CODE_OAUTH_TOKEN. 
--- README.md | 42 +++++++++++++------ claude_bottle/backend/__init__.py | 1 - claude_bottle/backend/docker/prepare.py | 12 +++--- claude_bottle/cli/start.py | 1 - docs/prds/0010-cred-proxy.md | 14 ++++--- tests/integration/test_pipelock_allow_node.py | 1 - .../test_pipelock_allows_normal_https.py | 1 - tests/integration/test_pipelock_block_node.py | 1 - .../test_pipelock_blocks_secret_https_post.py | 1 - .../test_pipelock_blocks_secret_post.py | 1 - .../test_pipelock_llm_passthrough.py | 1 - 11 files changed, 42 insertions(+), 34 deletions(-) diff --git a/README.md b/README.md index 9326be2..43bbfc3 100644 --- a/README.md +++ b/README.md @@ -284,20 +284,36 @@ as `CLAUDE_BOTTLE_OAUTH_TOKEN`: export CLAUDE_BOTTLE_OAUTH_TOKEN="" ``` -By default `cli.py` forwards the token into the agent container as -`CLAUDE_CODE_OAUTH_TOKEN`. Declare a `bottle.cred_proxy.routes` entry -with `role: "anthropic-base-url"` and `token_ref: -"CLAUDE_BOTTLE_OAUTH_TOKEN"` to route via cred-proxy instead: the -token then lives only in the cred-proxy sidecar's environ, the agent's -`ANTHROPIC_BASE_URL` points at the proxy, and `printenv` inside the -agent does not surface the real token. Either way the value is never -written to disk or placed on argv on the host. +The bottle reaches the Anthropic API only through the cred-proxy +sidecar. To let `claude` authenticate, declare a route in +`bottle.cred_proxy.routes` with `role: "anthropic-base-url"` and +`token_ref: "CLAUDE_BOTTLE_OAUTH_TOKEN"`: -Inside the container, `claude` picks up `CLAUDE_CODE_OAUTH_TOKEN` and -authenticates against your subscription. Caveats: the token is bound -to your subscription tier (Pro/Max/Team/Enterprise), it does not work -with `claude --bare` (which only reads `ANTHROPIC_API_KEY`), and if it -leaks, regenerate via `claude setup-token` again. 
Reference: +```jsonc +{ + "path": "/anthropic/", + "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", + "token_ref": "CLAUDE_BOTTLE_OAUTH_TOKEN", + "role": "anthropic-base-url" +} +``` + +At launch, `cli.py` reads `CLAUDE_BOTTLE_OAUTH_TOKEN` from the host +env and forwards it into the cred-proxy container's environ — never +into the agent's. The agent receives `ANTHROPIC_BASE_URL` pointing at +`http://cred-proxy:9099/anthropic` and a non-secret placeholder for +`CLAUDE_CODE_OAUTH_TOKEN` (claude-code refuses to start without one; +the proxy strips and replaces the header on every request). `printenv` +inside the agent does not surface the real token, and the value is +never written to disk or placed on argv on the host. + +A bottle without an `anthropic-base-url` route has no path to the +Anthropic API — there is no fallback that forwards the token directly +to the agent. Caveats: the token is bound to your subscription tier +(Pro/Max/Team/Enterprise), it does not work with `claude --bare` +(which only reads `ANTHROPIC_API_KEY`), and if it leaks, regenerate +via `claude setup-token` again. Reference: . ## Trademarks diff --git a/claude_bottle/backend/__init__.py b/claude_bottle/backend/__init__.py index cb04496..45c65ac 100644 --- a/claude_bottle/backend/__init__.py +++ b/claude_bottle/backend/__init__.py @@ -53,7 +53,6 @@ class BottleSpec: agent_name: str copy_cwd: bool user_cwd: str - forward_oauth_token: bool @dataclass(frozen=True) diff --git a/claude_bottle/backend/docker/prepare.py b/claude_bottle/backend/docker/prepare.py index be0ddb5..8c23f38 100644 --- a/claude_bottle/backend/docker/prepare.py +++ b/claude_bottle/backend/docker/prepare.py @@ -118,17 +118,15 @@ def resolve_plan( forwarded_env: dict[str, str] = dict(resolved.forwarded) # Find the (at most one) cred-proxy route claiming the # anthropic-base-url role. Manifest validation enforces the - # singleton constraint. + # singleton constraint. 
cred-proxy is the only path the Anthropic + # OAuth token reaches the bottle — there is no fallback that + # forwards it into the agent's environ directly. Bottles that + # need claude-code to authenticate must declare an + # anthropic-base-url route. anthropic_route = next( (r for r in cred_proxy_plan.routes if "anthropic-base-url" in r.roles), None, ) - if spec.forward_oauth_token and anthropic_route is None: - # Pre-PRD 0010 behavior: agent reads CLAUDE_CODE_OAUTH_TOKEN - # directly. Still the path when no cred_proxy.routes entry - # is tagged anthropic-base-url; otherwise the sidecar holds - # the token. - forwarded_env["CLAUDE_CODE_OAUTH_TOKEN"] = os.environ["CLAUDE_BOTTLE_OAUTH_TOKEN"] if anthropic_route is not None: # Point claude-code at the cred-proxy. The sidecar holds the # OAuth token; the agent's environ does not. Strip the diff --git a/claude_bottle/cli/start.py b/claude_bottle/cli/start.py index 585bd75..a98e330 100644 --- a/claude_bottle/cli/start.py +++ b/claude_bottle/cli/start.py @@ -42,7 +42,6 @@ def cmd_start(argv: list[str]) -> int: agent_name=args.name, copy_cwd=args.cwd, user_cwd=USER_CWD, - forward_oauth_token=bool(os.environ.get("CLAUDE_BOTTLE_OAUTH_TOKEN")), ) stage_dir = Path(tempfile.mkdtemp(prefix="claude-bottle-stage.")) diff --git a/docs/prds/0010-cred-proxy.md b/docs/prds/0010-cred-proxy.md index 35789ec..4d58c83 100644 --- a/docs/prds/0010-cred-proxy.md +++ b/docs/prds/0010-cred-proxy.md @@ -306,14 +306,16 @@ Why the agent can't reach the sidecar's environ: `CredProxyConfig`, `Bottle.cred_proxy: CredProxyConfig`. Parse + validate route shape, role enum, path uniqueness, singleton- role constraints. 
-- **`claude_bottle/backend/docker/prepare.py`** — switch the - agent's OAuth handling: when a route claims the - `anthropic-base-url` role, write `ANTHROPIC_BASE_URL` (pointing - at the proxy) plus a non-secret placeholder for +- **`claude_bottle/backend/docker/prepare.py`** — drop the + legacy `CLAUDE_BOTTLE_OAUTH_TOKEN` → `CLAUDE_CODE_OAUTH_TOKEN` + forward entirely. cred-proxy is the only path the Anthropic + OAuth token reaches the bottle. When a route claims the + `anthropic-base-url` role, write `ANTHROPIC_BASE_URL` + (pointing at the proxy) plus a non-secret placeholder for `CLAUDE_CODE_OAUTH_TOKEN` (claude-code refuses to start otherwise; the proxy strips & replaces on every request). - When no such route exists, fall back to the pre-PRD-0010 path - (forward `CLAUDE_BOTTLE_OAUTH_TOKEN` as `CLAUDE_CODE_OAUTH_TOKEN`). + Bottles that need claude-code to authenticate must declare + the route; there is no fallback. - **`claude_bottle/backend/docker/backend.py`** — instantiate `DockerCredProxy` alongside `DockerPipelockProxy` and `DockerGitGate`; thread its `prepare` / `start` / `stop` diff --git a/tests/integration/test_pipelock_allow_node.py b/tests/integration/test_pipelock_allow_node.py index 1d68d57..20bf1d1 100644 --- a/tests/integration/test_pipelock_allow_node.py +++ b/tests/integration/test_pipelock_allow_node.py @@ -79,7 +79,6 @@ class TestPipelockAllowsNode(unittest.TestCase): agent_name="demo", copy_cwd=False, user_cwd=str(stage_dir), - forward_oauth_token=False, ) plan = backend.prepare(spec, stage_dir=stage_dir) with backend.launch(plan) as bottle: diff --git a/tests/integration/test_pipelock_allows_normal_https.py b/tests/integration/test_pipelock_allows_normal_https.py index 97b1732..41acabe 100644 --- a/tests/integration/test_pipelock_allows_normal_https.py +++ b/tests/integration/test_pipelock_allows_normal_https.py @@ -44,7 +44,6 @@ class TestPipelockAllowsNormalHttps(unittest.TestCase): agent_name="demo", copy_cwd=False, 
user_cwd=str(stage_dir), - forward_oauth_token=False, ) plan = backend.prepare(spec, stage_dir=stage_dir) with backend.launch(plan) as bottle: diff --git a/tests/integration/test_pipelock_block_node.py b/tests/integration/test_pipelock_block_node.py index ba95888..62708f2 100644 --- a/tests/integration/test_pipelock_block_node.py +++ b/tests/integration/test_pipelock_block_node.py @@ -75,7 +75,6 @@ class TestPipelockBlocksNode(unittest.TestCase): agent_name="demo", copy_cwd=False, user_cwd=str(stage_dir), - forward_oauth_token=False, ) plan = backend.prepare(spec, stage_dir=stage_dir) with backend.launch(plan) as bottle: diff --git a/tests/integration/test_pipelock_blocks_secret_https_post.py b/tests/integration/test_pipelock_blocks_secret_https_post.py index 92f9f80..2b597ae 100644 --- a/tests/integration/test_pipelock_blocks_secret_https_post.py +++ b/tests/integration/test_pipelock_blocks_secret_https_post.py @@ -63,7 +63,6 @@ class TestPipelockBlocksSecretHttpsPost(unittest.TestCase): agent_name="demo", copy_cwd=False, user_cwd=str(stage_dir), - forward_oauth_token=False, ) plan = backend.prepare(spec, stage_dir=stage_dir) with backend.launch(plan) as bottle: diff --git a/tests/integration/test_pipelock_blocks_secret_post.py b/tests/integration/test_pipelock_blocks_secret_post.py index 6d6fb72..8c58bb6 100644 --- a/tests/integration/test_pipelock_blocks_secret_post.py +++ b/tests/integration/test_pipelock_blocks_secret_post.py @@ -99,7 +99,6 @@ class TestPipelockBlocksSecretPost(unittest.TestCase): agent_name="demo", copy_cwd=False, user_cwd=str(stage_dir), - forward_oauth_token=False, ) plan = backend.prepare(spec, stage_dir=stage_dir) with backend.launch(plan) as bottle: diff --git a/tests/integration/test_pipelock_llm_passthrough.py b/tests/integration/test_pipelock_llm_passthrough.py index 7fecb3b..bca19b7 100644 --- a/tests/integration/test_pipelock_llm_passthrough.py +++ b/tests/integration/test_pipelock_llm_passthrough.py @@ -60,7 +60,6 @@ class 
TestPipelockLlmPassthrough(unittest.TestCase): agent_name="demo", copy_cwd=False, user_cwd=str(stage_dir), - forward_oauth_token=False, ) plan = backend.prepare(spec, stage_dir=stage_dir) with backend.launch(plan) as bottle: -- 2.52.0 From f4452b391d107bed4762a2e176dbdf9fbf959346 Mon Sep 17 00:00:00 2001 From: didericis Date: Sun, 24 May 2026 13:25:21 -0400 Subject: [PATCH 19/24] fix(pipelock): auto-allow cred-proxy hostname when routes are declared MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The agent's HTTP_PROXY env points at pipelock, so an ANTHROPIC_BASE_URL like http://cred-proxy:9099/anthropic doesn't short-circuit through Docker's embedded DNS — it gets forwarded through pipelock, which then checks its api_allowlist for the hostname `cred-proxy` and 403's because the name isn't there. The agent surfaces the failure as "API Error: 403 blocked: domain not in allowlist: cred-proxy" on Claude's first call. Fix: pipelock_effective_allowlist auto-adds CRED_PROXY_HOSTNAME when bottle.cred_proxy.routes is non-empty (i.e., when the sidecar will actually be running and reachable). Move CRED_PROXY_HOSTNAME from backend/docker/cred_proxy.py to the backend-agnostic claude_bottle/cred_proxy.py so pipelock can reference it without a layering violation; the docker concrete imports it from the same place. 
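
A rough sketch of why the hostname has to be on the allowlist,
assuming pipelock compares only the hostname of the requested URL
against `api_allowlist` by exact match (pipelock's real matcher may
differ; `host_allowed` and `effective_allowlist` are hypothetical
helpers for illustration, not claude_bottle or pipelock APIs):

```python
from urllib.parse import urlsplit

CRED_PROXY_HOSTNAME = "cred-proxy"


def host_allowed(url: str, allowlist: list[str]) -> bool:
    """Hypothetical stand-in for a forward proxy's allowlist check:
    only the hostname of the requested URL is compared."""
    return urlsplit(url).hostname in allowlist


def effective_allowlist(base: list[str], has_routes: bool) -> list[str]:
    """Illustrative auto-allow: add the sidecar's hostname only when
    at least one cred-proxy route is declared."""
    seen = dict.fromkeys(base)  # preserves order, drops duplicates
    if has_routes:
        seen.setdefault(CRED_PROXY_HOSTNAME, None)
    return sorted(seen)


url = "http://cred-proxy:9099/anthropic"
before = effective_allowlist(["api.anthropic.com"], has_routes=False)
after = effective_allowlist(["api.anthropic.com"], has_routes=True)
```

Under those assumptions, `host_allowed(url, before)` is false (the 403
case) and `host_allowed(url, after)` is true.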
--- claude_bottle/backend/docker/cred_proxy.py | 8 +------- claude_bottle/cred_proxy.py | 11 +++++++++++ claude_bottle/pipelock.py | 18 ++++++++++++++---- tests/unit/test_pipelock_allowlist.py | 16 ++++++++++++++++ 4 files changed, 42 insertions(+), 11 deletions(-) diff --git a/claude_bottle/backend/docker/cred_proxy.py b/claude_bottle/backend/docker/cred_proxy.py index 6da6fce..d0cfd69 100644 --- a/claude_bottle/backend/docker/cred_proxy.py +++ b/claude_bottle/backend/docker/cred_proxy.py @@ -10,6 +10,7 @@ import subprocess from pathlib import Path from ...cred_proxy import ( + CRED_PROXY_HOSTNAME, CredProxy, CredProxyPlan, cred_proxy_resolve_token_values, @@ -30,13 +31,6 @@ CRED_PROXY_DOCKERFILE = "Dockerfile.cred-proxy" # both reference it. CRED_PROXY_PORT = int(os.environ.get("CLAUDE_BOTTLE_CRED_PROXY_PORT", "9099")) -# DNS name agents use to reach the sidecar. Attached as a -# --network-alias on the internal docker network so the URL the -# provisioner writes into the agent's environ is stable across -# bottles (the container name carries the per-bottle slug; the alias -# does not). -CRED_PROXY_HOSTNAME = "cred-proxy" - # In-container path the proxy server reads its route table from. # Pre-created in Dockerfile.cred-proxy so `docker cp` can drop the # file directly. diff --git a/claude_bottle/cred_proxy.py b/claude_bottle/cred_proxy.py index 79ee4e5..0856d85 100644 --- a/claude_bottle/cred_proxy.py +++ b/claude_bottle/cred_proxy.py @@ -31,6 +31,16 @@ from .log import die from .manifest import Bottle +# DNS name agents use to reach the per-bottle cred-proxy sidecar. +# Backend-agnostic by contract: every concrete backend (Docker today, +# others later) attaches this name to its sidecar on the bottle's +# internal network so the agent's manifest-driven URLs (`http:// +# cred-proxy:9099/...`) work without a backend-specific hostname. +# pipelock's allowlist also references this when adding the +# auto-allow entry for cred-proxy traffic from the agent. 
+CRED_PROXY_HOSTNAME = "cred-proxy" + + @dataclass(frozen=True) class CredProxyRoute: """One resolved route on the cred-proxy sidecar. Maps a path @@ -247,6 +257,7 @@ class CredProxy(ABC): __all__ = [ + "CRED_PROXY_HOSTNAME", "CredProxy", "CredProxyPlan", "CredProxyRoute", diff --git a/claude_bottle/pipelock.py b/claude_bottle/pipelock.py index 52f5b23..67234ce 100644 --- a/claude_bottle/pipelock.py +++ b/claude_bottle/pipelock.py @@ -17,6 +17,7 @@ from dataclasses import dataclass from pathlib import Path from typing import cast +from .cred_proxy import CRED_PROXY_HOSTNAME from .manifest import Bottle # Baked-in default allowlist for hosts Claude Code itself needs. @@ -74,10 +75,17 @@ def pipelock_token_hosts(bottle: Bottle) -> list[str]: def pipelock_effective_allowlist(bottle: Bottle) -> list[str]: """Deduplicated union of: baked-in defaults, bottle.egress.allowlist, - and the cred-proxy upstream hosts derived from bottle.cred_proxy.routes. - Sorted for stability. Git upstreams declared in `bottle.git` do NOT - contribute here — git traffic flows through the per-agent git-gate - sidecar (PRD 0008), not pipelock.""" + the cred-proxy upstream hosts derived from bottle.cred_proxy.routes, + and the cred-proxy sidecar's own hostname when any cred_proxy route + is declared. Sorted for stability. Git upstreams declared in + `bottle.git` do NOT contribute here — git traffic flows through the + per-agent git-gate sidecar (PRD 0008), not pipelock. + + The cred-proxy hostname is auto-added because the agent's + HTTP_PROXY points at pipelock, so a manifest-driven URL like + `http://cred-proxy:9099/anthropic/...` arrives at pipelock as a + request for hostname `cred-proxy`. 
Without this auto-allow, + pipelock would 403 the request before it reached the sidecar.""" seen: dict[str, None] = {} for h in DEFAULT_ALLOWLIST: seen.setdefault(h, None) @@ -86,6 +94,8 @@ def pipelock_effective_allowlist(bottle: Bottle) -> list[str]: seen.setdefault(h, None) for h in pipelock_token_hosts(bottle): seen.setdefault(h, None) + if bottle.cred_proxy.routes: + seen.setdefault(CRED_PROXY_HOSTNAME, None) return sorted(seen.keys()) diff --git a/tests/unit/test_pipelock_allowlist.py b/tests/unit/test_pipelock_allowlist.py index 29125aa..a10fae1 100644 --- a/tests/unit/test_pipelock_allowlist.py +++ b/tests/unit/test_pipelock_allowlist.py @@ -75,6 +75,22 @@ class TestAllowlistWithTokens(unittest.TestCase): self.assertIn("registry.npmjs.org", eff) self.assertIn("api.github.com", eff) + def test_cred_proxy_hostname_auto_added_when_routes_exist(self): + # The agent's HTTP_PROXY points at pipelock, so a request for + # http://cred-proxy:9099/... arrives at pipelock as a request + # for hostname `cred-proxy`. pipelock must allow it or the + # agent can't reach its own sidecar. + eff = pipelock_effective_allowlist(_bottle(_routes([ + {"path": "/x/", "upstream": "https://x.example", + "auth_scheme": "Bearer", "token_ref": "T"}, + ]))) + self.assertIn("cred-proxy", eff) + + def test_cred_proxy_hostname_NOT_added_when_no_routes(self): + # No cred-proxy sidecar, no auto-allow. + eff = pipelock_effective_allowlist(_bottle({})) + self.assertNotIn("cred-proxy", eff) + class TestTlsPassthrough(unittest.TestCase): def test_default_includes_api_anthropic(self): -- 2.52.0 From 51b20340a91fec6435491c48f2ba153360273df7 Mon Sep 17 00:00:00 2001 From: didericis Date: Sun, 24 May 2026 13:39:27 -0400 Subject: [PATCH 20/24] fix(pipelock): allow agent->sidecar traffic via SSRF exception MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The agent's HTTP_PROXY points at pipelock, so a request to http://cred-proxy:9099/... 
arrives at pipelock; pipelock resolves the host, sees an RFC1918
address (the bottle's internal Docker network sits in 172.x), and
403's "SSRF blocked: cred-proxy resolves to internal IP 172.20.0.4".

Bypassing pipelock entirely would also remove its body scanner from
the agent->cred-proxy leg; we want to keep that DLP coverage. Pipelock
has `ssrf.ip_allowlist` for exactly this: CIDRs that override the
built-in internal-IP block while api_allowlist + body scanning +
tls_interception keep firing.

Wiring:
- `pipelock_build_config` accepts `ssrf_ip_allowlist`; when non-empty,
  emits an `ssrf: { ip_allowlist: [...] }` block.
- `pipelock_render_yaml` renders that block.
- `PipelockProxyPlan` gains `internal_network_cidr`.
- New `network_inspect_cidr(name)` helper reads the Docker-assigned
  subnet via `docker network inspect`.
- launch.py: after `network_create_internal`, inspect the CIDR,
  re-render the yaml with `ssrf_ip_allowlist=(cidr,)`, overwrite the
  file in place; `DockerPipelockProxy.start` then docker-cp's the
  updated content. Prepare's initial render stays unchanged (CIDR
  isn't known yet at prepare time).

The exception scope is the bottle's own internal network only: agent ↔
pipelock / git-gate / cred-proxy. Body scanning still applies to the
bytes flowing through pipelock; pipelock just no longer treats those
internal IPs as exfil targets.
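
To make the exception concrete, here is a minimal model of an SSRF
guard with a CIDR allowlist, built on the standard `ipaddress` module.
It is a sketch under assumptions, not pipelock's implementation:
`ssrf_blocked` is a hypothetical name, and the real guard may treat
more ranges as internal than `is_private` does.

```python
import ipaddress


def ssrf_blocked(ip: str, ip_allowlist: tuple[str, ...] = ()) -> bool:
    """Hypothetical SSRF check: block private destinations unless
    they fall inside an allowlisted CIDR."""
    addr = ipaddress.ip_address(ip)
    # An allowlisted CIDR overrides the internal-IP block.
    if any(addr in ipaddress.ip_network(cidr) for cidr in ip_allowlist):
        return False
    # Otherwise fall back to the default: private space is blocked.
    return addr.is_private


sidecar_ip = "172.20.0.4"           # address from the 403 message
bottle_cidr = ("172.20.0.0/16",)    # example subnet, as used in the tests
```

In this model `ssrf_blocked(sidecar_ip)` is true while
`ssrf_blocked(sidecar_ip, bottle_cidr)` is false, and an unrelated
private address such as `10.0.0.1` stays blocked because the
allowlist covers only the bottle's own subnet.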
--- claude_bottle/backend/docker/launch.py | 33 +++++++++++++++++++++- claude_bottle/backend/docker/network.py | 23 +++++++++++++++ claude_bottle/pipelock.py | 37 +++++++++++++++++++++---- tests/unit/test_pipelock_yaml.py | 27 ++++++++++++++++++ 4 files changed, 114 insertions(+), 6 deletions(-) diff --git a/claude_bottle/backend/docker/launch.py b/claude_bottle/backend/docker/launch.py index 407f925..c59fb7f 100644 --- a/claude_bottle/backend/docker/launch.py +++ b/claude_bottle/backend/docker/launch.py @@ -18,13 +18,20 @@ from pathlib import Path from typing import Callable, Generator from ...log import die, info +from ...pipelock import pipelock_build_config, pipelock_render_yaml from . import network as network_mod from . import util as docker_mod from .bottle import DockerBottle from .bottle_plan import DockerBottlePlan from .cred_proxy import DockerCredProxy from .git_gate import DockerGitGate -from .pipelock import DockerPipelockProxy, pipelock_proxy_url, pipelock_tls_init +from .pipelock import ( + PIPELOCK_CA_CERT_IN_CONTAINER, + PIPELOCK_CA_KEY_IN_CONTAINER, + DockerPipelockProxy, + pipelock_proxy_url, + pipelock_tls_init, +) from .provision.ca import AGENT_CA_BUNDLE, AGENT_CA_PATH @@ -68,6 +75,14 @@ def launch( egress_network = network_mod.network_create_egress(plan.slug) stack.callback(network_mod.network_remove, egress_network) + # Docker assigns a CIDR to the new internal network. Pipelock's + # SSRF guard otherwise rejects any destination resolving into + # RFC1918 space — which includes the cred-proxy / git-gate / + # pipelock sidecars themselves. Allowlist the bottle's own + # internal subnet so the agent can reach its sidecars via + # pipelock; api_allowlist + body-scanning still apply. + internal_cidr = network_mod.network_inspect_cidr(internal_network) + # Per-bottle ephemeral CA for pipelock's TLS interception # (PRD 0006). 
One-shot pipelock container writes ca.pem + # ca-key.pem under plan.stage_dir; .start docker-cp's them @@ -75,9 +90,25 @@ def launch( # stage dir, which start.py's outer finally `shutil.rmtree`s # after the sidecar is torn down. ca_cert_host, ca_key_host = pipelock_tls_init(plan.stage_dir) + + # Re-render the pipelock yaml with the SSRF allowlist now that + # we know the internal CIDR. Prepare wrote the yaml without + # the ssrf block (CIDR wasn't known yet); overwrite the same + # path so .start docker-cp's the updated content. + bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name) + cfg = pipelock_build_config( + bottle, + ca_cert_path=PIPELOCK_CA_CERT_IN_CONTAINER, + ca_key_path=PIPELOCK_CA_KEY_IN_CONTAINER, + ssrf_ip_allowlist=(internal_cidr,), + ) + plan.proxy_plan.yaml_path.write_text(pipelock_render_yaml(cfg)) + plan.proxy_plan.yaml_path.chmod(0o600) + proxy_plan = dataclasses.replace( plan.proxy_plan, internal_network=internal_network, + internal_network_cidr=internal_cidr, egress_network=egress_network, ca_cert_host_path=ca_cert_host, ca_key_host_path=ca_key_host, diff --git a/claude_bottle/backend/docker/network.py b/claude_bottle/backend/docker/network.py index 1d082d8..9cc0981 100644 --- a/claude_bottle/backend/docker/network.py +++ b/claude_bottle/backend/docker/network.py @@ -81,6 +81,29 @@ def network_create_egress(slug: str) -> str: return _network_create_with_prefix(network_egress_name_for_slug(slug), internal=False) +def network_inspect_cidr(name: str) -> str: + """Return the IPv4 CIDR Docker assigned to a user-defined network. + + Used by pipelock's SSRF guard exception: the bottle's internal + network sits in RFC1918 space, so pipelock's `internal:` list + would block any agent request whose destination resolves there + — including the cred-proxy sidecar's address. 
Adding the + network's CIDR to pipelock's `ssrf.ip_allowlist` lets traffic + targeted at the bottle's own sidecars through while pipelock + still body-scans and api_allowlist-gates as usual.""" + result = subprocess.run( + ["docker", "network", "inspect", + "--format", "{{range .IPAM.Config}}{{.Subnet}}{{end}}", name], + capture_output=True, text=True, check=False, + ) + if result.returncode != 0: + die(f"docker network inspect {name} failed: {result.stderr.strip()}") + cidr = result.stdout.strip() + if not cidr: + die(f"network {name!r} has no IPAM subnet configured") + return cidr + + def network_attach(network: str, container: str) -> None: result = subprocess.run( ["docker", "network", "connect", network, container], diff --git a/claude_bottle/pipelock.py b/claude_bottle/pipelock.py index 67234ce..c4f30cf 100644 --- a/claude_bottle/pipelock.py +++ b/claude_bottle/pipelock.py @@ -152,6 +152,7 @@ def pipelock_build_config( *, ca_cert_path: str = "", ca_key_path: str = "", + ssrf_ip_allowlist: tuple[str, ...] = (), ) -> dict[str, object]: """Build the structured pipelock config dict the sidecar will load. @@ -166,7 +167,17 @@ def pipelock_build_config( Pass both or neither: both → emit `tls_interception` block with `enabled: true`; neither → omit the block entirely (pipelock falls back to its built-in default of `enabled: false`). Used - by PRD 0006 to turn on pipelock's native TLS interception.""" + by PRD 0006 to turn on pipelock's native TLS interception. + + `ssrf_ip_allowlist` is the list of IPs / CIDRs that bypass + pipelock's SSRF guard. Pipelock blocks RFC1918-resolved + destinations by default, which would catch the agent's + cred-proxy traffic (cred-proxy sits on the bottle's internal + Docker network in 172.x space). Pass the bottle's internal + network CIDR here so `cred-proxy:9099` requests get through + pipelock while api_allowlist + body-scanning still apply. 
Empty + by default; omitted from the rendered yaml when empty so + pipelock keeps its built-in SSRF defaults.""" cfg: dict[str, object] = { "version": 1, "mode": "strict", @@ -193,6 +204,8 @@ def pipelock_build_config( "ca_key": ca_key_path, "passthrough_domains": pipelock_effective_tls_passthrough(bottle), } + if ssrf_ip_allowlist: + cfg["ssrf"] = {"ip_allowlist": list(ssrf_ip_allowlist)} return cfg @@ -236,6 +249,13 @@ def pipelock_render_yaml(cfg: dict[str, object]) -> str: lines.append(" passthrough_domains:") for d in passthrough: lines.append(f' - "{d}"') + if "ssrf" in cfg: + lines.append("") + lines.append("ssrf:") + ssrf = cast(dict[str, object], cfg["ssrf"]) + lines.append(" ip_allowlist:") + for ip in cast(list[str], ssrf["ip_allowlist"]): + lines.append(f' - "{ip}"') return "\n".join(lines) + "\n" @@ -252,14 +272,21 @@ class PipelockProxyPlan: already so it doesn't need the host paths to be valid). The remaining fields are populated by the backend's launch step via `dataclasses.replace`: internal/egress networks once - those networks exist, and the CA host paths once the - one-shot `pipelock tls init` has run. Empty defaults are - sentinels meaning "not yet set"; `.start` validates that - they are populated.""" + those networks exist, the CA host paths once the one-shot + `pipelock tls init` has run, and `internal_network_cidr` once + Docker has assigned a subnet to the internal network. Empty + defaults are sentinels meaning "not yet set"; `.start` validates + that they are populated. 
+ + `internal_network_cidr` ends up on pipelock's `ssrf.ip_allowlist` + so the agent's requests at `cred-proxy:9099` (or any other + bottle-internal sidecar) bypass pipelock's RFC1918 SSRF guard + while api_allowlist and body-scanning still apply.""" yaml_path: Path slug: str internal_network: str = "" + internal_network_cidr: str = "" egress_network: str = "" ca_cert_host_path: Path = Path() ca_key_host_path: Path = Path() diff --git a/tests/unit/test_pipelock_yaml.py b/tests/unit/test_pipelock_yaml.py index caa1105..ac3bb36 100644 --- a/tests/unit/test_pipelock_yaml.py +++ b/tests/unit/test_pipelock_yaml.py @@ -77,6 +77,21 @@ class TestBuildConfig(unittest.TestCase): ca_cert_path="/etc/pipelock-ca.pem", ) + def test_ssrf_block_omitted_when_no_allowlist(self): + cfg = pipelock_build_config(fixture_minimal().bottles["dev"]) + self.assertNotIn("ssrf", cfg) + + def test_ssrf_block_emitted_when_allowlist_supplied(self): + # The bottle's internal Docker subnet lands here at launch + # time so cred-proxy:9099 (172.x.x.x) doesn't trip pipelock's + # RFC1918 SSRF guard. 
+ cfg = pipelock_build_config( + fixture_minimal().bottles["dev"], + ssrf_ip_allowlist=("172.20.0.0/16",), + ) + self.assertIn("ssrf", cfg) + self.assertEqual({"ip_allowlist": ["172.20.0.0/16"]}, cfg["ssrf"]) + class TestRenderAndWrite(unittest.TestCase): def setUp(self): @@ -148,6 +163,18 @@ class TestRenderAndWrite(unittest.TestCase): self.assertIn("passthrough_domains:", content) self.assertIn('- "api.anthropic.com"', content) + def test_render_emits_ssrf_block_when_allowlist_given(self): + cfg = pipelock_build_config( + fixture_minimal().bottles["dev"], + ca_cert_path="/etc/pipelock-ca.pem", + ca_key_path="/etc/pipelock-ca-key.pem", + ssrf_ip_allowlist=("172.20.0.0/16",), + ) + text = pipelock_render_yaml(cfg) + self.assertIn("ssrf:", text) + self.assertIn("ip_allowlist:", text) + self.assertIn('- "172.20.0.0/16"', text) + if __name__ == "__main__": unittest.main() -- 2.52.0 From c5d729e25dee3c39a93cb6d82949192567f40f21 Mon Sep 17 00:00:00 2001 From: didericis Date: Sun, 24 May 2026 13:49:31 -0400 Subject: [PATCH 21/24] fix(pipelock): suppress BIP-39 detector on cred-proxy anthropic path MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit claude-code's chat bodies legitimately trip pipelock's BIP-39 seed- phrase detector — any 12+ English words that pass the BIP-39 checksum match. The direct path to api.anthropic.com already sits on tls_interception.passthrough_domains so no body scan runs there, but the cred-proxy hop is plain HTTP through pipelock and the body scanner fires. Add an anthropic-route-specific suppress entry: suppress: - rule: "BIP-39 Seed Phrase" path: "/anthropic/**" Just this one detector, only on this one path. Every other DLP pattern (AKIA, gh*_, sk-ant-, etc.) keeps firing — those are unambiguous credential shapes with no legitimate reason to appear in a chat completion. Other detectors that fire on natural language can be added to the suppress list when/if they surface. 
Wiring: pipelock_effective_suppress(bottle) computes the entries from bottle.cred_proxy.routes; pipelock_build_config accepts them and emits a `suppress:` block; pipelock_render_yaml renders it. Probed schema with `pipelock check --config` to confirm the {rule, path} shape; full yaml validates clean. --- claude_bottle/pipelock.py | 34 ++++++++++++++++++++++++++ tests/unit/test_pipelock_yaml.py | 41 ++++++++++++++++++++++++++++++++ 2 files changed, 75 insertions(+) diff --git a/claude_bottle/pipelock.py b/claude_bottle/pipelock.py index c4f30cf..7721968 100644 --- a/claude_bottle/pipelock.py +++ b/claude_bottle/pipelock.py @@ -99,6 +99,31 @@ def pipelock_effective_allowlist(bottle: Bottle) -> list[str]: return sorted(seen.keys()) +def pipelock_effective_suppress(bottle: Bottle) -> list[dict[str, str]]: + """Per-bottle pipelock detector suppressions. + + Pipelock's `suppress:` block silences a named rule on a path glob. + LLM conversation bodies legitimately trip detectors that look for + natural-language token shapes — most famously the BIP-39 seed- + phrase detector, which fires on any 12+ English words that pass + the BIP-39 checksum. The direct path to `api.anthropic.com` is + already on tls_interception.passthrough_domains so no body scan + runs there, but the cred-proxy hop (where the agent dials + `http://cred-proxy:9099/anthropic/...`) is plain HTTP through + pipelock — body scanning fires. + + For each route with the `anthropic-base-url` role, suppress + BIP-39 on `**` so claude-code's chat bodies make it + through. All other detectors (credit-card, IBAN, token regexes, + etc.) 
keep firing — those are unambiguous credential shapes + that have no legitimate reason to appear in a chat completion.""" + out: list[dict[str, str]] = [] + for r in bottle.cred_proxy.routes: + if "anthropic-base-url" in r.Role: + out.append({"rule": "BIP-39 Seed Phrase", "path": f"{r.Path}**"}) + return out + + def pipelock_effective_tls_passthrough(bottle: Bottle) -> list[str]: """Hostnames pipelock should pass through (no TLS MITM, no body scan). Default carries the LLM API endpoint — its request bodies @@ -185,6 +210,9 @@ def pipelock_build_config( "api_allowlist": pipelock_effective_allowlist(bottle), "forward_proxy": {"enabled": True}, } + suppress = pipelock_effective_suppress(bottle) + if suppress: + cfg["suppress"] = suppress cfg["dlp"] = {"include_defaults": True, "scan_env": True} # Body-scan enforcement is a separate pipelock section (each DLP # "surface" — body, MCP, response — has its own action). Pipelock's @@ -225,6 +253,12 @@ def pipelock_render_yaml(cfg: dict[str, object]) -> str: for h in cast(list[str], cfg["api_allowlist"]): lines.append(f' - "{h}"') lines.append("") + if "suppress" in cfg: + lines.append("suppress:") + for entry in cast(list[dict[str, str]], cfg["suppress"]): + lines.append(f' - rule: "{entry["rule"]}"') + lines.append(f' path: "{entry["path"]}"') + lines.append("") lines.append("forward_proxy:") fp = cast(dict[str, object], cfg["forward_proxy"]) lines.append(f" enabled: {_bool(fp['enabled'])}") diff --git a/tests/unit/test_pipelock_yaml.py b/tests/unit/test_pipelock_yaml.py index ac3bb36..f6fefd8 100644 --- a/tests/unit/test_pipelock_yaml.py +++ b/tests/unit/test_pipelock_yaml.py @@ -92,6 +92,31 @@ class TestBuildConfig(unittest.TestCase): self.assertIn("ssrf", cfg) self.assertEqual({"ip_allowlist": ["172.20.0.0/16"]}, cfg["ssrf"]) + def test_suppress_absent_when_no_anthropic_route(self): + cfg = pipelock_build_config(fixture_minimal().bottles["dev"]) + self.assertNotIn("suppress", cfg) + + def 
test_suppress_emits_bip39_for_anthropic_route(self): + # claude-code's chat bodies trip pipelock's BIP-39 detector + # (12+ English words). Suppress just that detector on the + # cred-proxy's anthropic path — all the other DLP patterns + # keep firing. + from claude_bottle.manifest import Manifest + bottle = Manifest.from_json_obj({ + "bottles": {"dev": {"cred_proxy": {"routes": [ + {"path": "/anthropic/", + "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_ref": "T", + "role": "anthropic-base-url"}, + ]}}}, + "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, + }).bottles["dev"] + cfg = pipelock_build_config(bottle) + self.assertEqual( + [{"rule": "BIP-39 Seed Phrase", "path": "/anthropic/**"}], + cfg["suppress"], + ) + class TestRenderAndWrite(unittest.TestCase): def setUp(self): @@ -175,6 +200,22 @@ class TestRenderAndWrite(unittest.TestCase): self.assertIn("ip_allowlist:", text) self.assertIn('- "172.20.0.0/16"', text) + def test_render_emits_suppress_block_for_anthropic_route(self): + from claude_bottle.manifest import Manifest + bottle = Manifest.from_json_obj({ + "bottles": {"dev": {"cred_proxy": {"routes": [ + {"path": "/anthropic/", + "upstream": "https://api.anthropic.com", + "auth_scheme": "Bearer", "token_ref": "T", + "role": "anthropic-base-url"}, + ]}}}, + "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, + }).bottles["dev"] + text = pipelock_render_yaml(pipelock_build_config(bottle)) + self.assertIn("suppress:", text) + self.assertIn('rule: "BIP-39 Seed Phrase"', text) + self.assertIn('path: "/anthropic/**"', text) + if __name__ == "__main__": unittest.main() -- 2.52.0 From 4662087b329f496b356906018c100e8415f03002 Mon Sep 17 00:00:00 2001 From: didericis Date: Sun, 24 May 2026 13:59:05 -0400 Subject: [PATCH 22/24] fix(pipelock): disable seed_phrase_detection for anthropic bottles MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The previous attempt 
added a `suppress: [{rule, path}]` entry. The yaml validated and the entry showed up in the live pipelock's config, but the BIP-39 detector kept firing — `suppress` only silences alerts, not enforcement. Reproduced the failure in isolation, probed three knobs against a real pipelock with a canonical BIP-39 body (`abandon abandon ... about`): suppress: [{rule: "BIP-39 Seed Phrase", path: "/anthropic/**"}] -> still 403 rules.disabled: ["dlp:BIP-39 Seed Phrase"] -> still 403 seed_phrase_detection: { enabled: false } -> 200 (forwarded) Only the global toggle actually stops the block. Pipelock 2.3.0 has no per-path / per-host knob for this detector, so the trade-off is: when the bottle declares an `anthropic-base-url` route, BIP-39 detection comes off globally for that bottle. Every other DLP pattern (gh*_, sk-ant-, AKIA, etc.) keeps firing — the ones that actually map to claude-bottle's threat model. Drops the `suppress:` emitter from pipelock_build_config / pipelock_render_yaml; replaces with a `seed_phrase_detection: { enabled: false }` block driven by `pipelock_seed_phrase_detection_enabled(bottle)`. Tests flip from suppress-shape to seed_phrase shape. End-to-end probe through the real pipelock image confirms BIP-39 bodies forward. --- claude_bottle/pipelock.py | 62 +++++++++++++++++--------------- tests/unit/test_pipelock_yaml.py | 30 ++++++++-------- 2 files changed, 49 insertions(+), 43 deletions(-) diff --git a/claude_bottle/pipelock.py b/claude_bottle/pipelock.py index 7721968..db85926 100644 --- a/claude_bottle/pipelock.py +++ b/claude_bottle/pipelock.py @@ -99,29 +99,35 @@ def pipelock_effective_allowlist(bottle: Bottle) -> list[str]: return sorted(seen.keys()) -def pipelock_effective_suppress(bottle: Bottle) -> list[dict[str, str]]: - """Per-bottle pipelock detector suppressions. +def pipelock_seed_phrase_detection_enabled(bottle: Bottle) -> bool: + """Whether pipelock's BIP-39 seed-phrase detector stays on for + this bottle. 
- Pipelock's `suppress:` block silences a named rule on a path glob. - LLM conversation bodies legitimately trip detectors that look for - natural-language token shapes — most famously the BIP-39 seed- - phrase detector, which fires on any 12+ English words that pass - the BIP-39 checksum. The direct path to `api.anthropic.com` is - already on tls_interception.passthrough_domains so no body scan - runs there, but the cred-proxy hop (where the agent dials - `http://cred-proxy:9099/anthropic/...`) is plain HTTP through - pipelock — body scanning fires. + LLM conversation bodies legitimately trip the detector — any 12+ + English words that pass the BIP-39 checksum match — so any + bottle that routes claude through pipelock's body scanner gets + blocked on the first real chat. We tried two narrower knobs + first: - For each route with the `anthropic-base-url` role, suppress - BIP-39 on `**` so claude-code's chat bodies make it - through. All other detectors (credit-card, IBAN, token regexes, - etc.) keep firing — those are unambiguous credential shapes - that have no legitimate reason to appear in a chat completion.""" - out: list[dict[str, str]] = [] - for r in bottle.cred_proxy.routes: - if "anthropic-base-url" in r.Role: - out.append({"rule": "BIP-39 Seed Phrase", "path": f"{r.Path}**"}) - return out + - `suppress: [{rule, path}]` — pipelock accepts the schema + but the entry only silences the alert; the body_dlp block + still fires. + - `rules.disabled: ["dlp:BIP-39 Seed Phrase"]` — same shape, + same outcome: 403 still returned. + + Empirically only `seed_phrase_detection.enabled: false` + actually stops the block (verified by sending a 12-word BIP-39 + body through three pipelock instances). It is a global toggle + — there is no per-path / per-host knob in pipelock 2.3.0 — so + we turn the detector off for the entire bottle when an + `anthropic-base-url` route is declared. 
The trade-off is + accepted: BIP-39 detection has little value in claude-bottle's + threat model (the agent has no access to a user's crypto + wallet seeds; the patterns that matter — gh*_, sk-ant-, AKIA, + etc. — keep firing).""" + return not any( + "anthropic-base-url" in r.Role for r in bottle.cred_proxy.routes + ) def pipelock_effective_tls_passthrough(bottle: Bottle) -> list[str]: @@ -210,9 +216,8 @@ def pipelock_build_config( "api_allowlist": pipelock_effective_allowlist(bottle), "forward_proxy": {"enabled": True}, } - suppress = pipelock_effective_suppress(bottle) - if suppress: - cfg["suppress"] = suppress + if not pipelock_seed_phrase_detection_enabled(bottle): + cfg["seed_phrase_detection"] = {"enabled": False} cfg["dlp"] = {"include_defaults": True, "scan_env": True} # Body-scan enforcement is a separate pipelock section (each DLP # "surface" — body, MCP, response — has its own action). Pipelock's @@ -253,11 +258,10 @@ def pipelock_render_yaml(cfg: dict[str, object]) -> str: for h in cast(list[str], cfg["api_allowlist"]): lines.append(f' - "{h}"') lines.append("") - if "suppress" in cfg: - lines.append("suppress:") - for entry in cast(list[dict[str, str]], cfg["suppress"]): - lines.append(f' - rule: "{entry["rule"]}"') - lines.append(f' path: "{entry["path"]}"') + if "seed_phrase_detection" in cfg: + lines.append("seed_phrase_detection:") + spd = cast(dict[str, object], cfg["seed_phrase_detection"]) + lines.append(f" enabled: {_bool(spd['enabled'])}") lines.append("") lines.append("forward_proxy:") fp = cast(dict[str, object], cfg["forward_proxy"]) diff --git a/tests/unit/test_pipelock_yaml.py b/tests/unit/test_pipelock_yaml.py index f6fefd8..68caed6 100644 --- a/tests/unit/test_pipelock_yaml.py +++ b/tests/unit/test_pipelock_yaml.py @@ -92,15 +92,21 @@ class TestBuildConfig(unittest.TestCase): self.assertIn("ssrf", cfg) self.assertEqual({"ip_allowlist": ["172.20.0.0/16"]}, cfg["ssrf"]) - def test_suppress_absent_when_no_anthropic_route(self): + def 
test_seed_phrase_detection_left_at_default_when_no_anthropic_route(self): + # No override emitted -> pipelock keeps its built-in default + # (BIP-39 detection enabled). Bottles that don't carry an + # Anthropic route don't need the false-positive workaround. cfg = pipelock_build_config(fixture_minimal().bottles["dev"]) - self.assertNotIn("suppress", cfg) + self.assertNotIn("seed_phrase_detection", cfg) - def test_suppress_emits_bip39_for_anthropic_route(self): + def test_seed_phrase_detection_disabled_for_anthropic_route(self): # claude-code's chat bodies trip pipelock's BIP-39 detector - # (12+ English words). Suppress just that detector on the - # cred-proxy's anthropic path — all the other DLP patterns - # keep firing. + # (12+ English words that pass the checksum). pipelock 2.3.0 + # has no per-path knob for this detector, and both `suppress` + # and `rules.disabled` only silence alerts — the block still + # fires. The only knob that actually skips the block is the + # global on/off, so we flip it off whenever the bottle is set + # up to route claude through pipelock. 
from claude_bottle.manifest import Manifest bottle = Manifest.from_json_obj({ "bottles": {"dev": {"cred_proxy": {"routes": [ @@ -112,10 +118,7 @@ class TestBuildConfig(unittest.TestCase): "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, }).bottles["dev"] cfg = pipelock_build_config(bottle) - self.assertEqual( - [{"rule": "BIP-39 Seed Phrase", "path": "/anthropic/**"}], - cfg["suppress"], - ) + self.assertEqual({"enabled": False}, cfg["seed_phrase_detection"]) class TestRenderAndWrite(unittest.TestCase): @@ -200,7 +203,7 @@ class TestRenderAndWrite(unittest.TestCase): self.assertIn("ip_allowlist:", text) self.assertIn('- "172.20.0.0/16"', text) - def test_render_emits_suppress_block_for_anthropic_route(self): + def test_render_emits_seed_phrase_off_for_anthropic_route(self): from claude_bottle.manifest import Manifest bottle = Manifest.from_json_obj({ "bottles": {"dev": {"cred_proxy": {"routes": [ @@ -212,9 +215,8 @@ class TestRenderAndWrite(unittest.TestCase): "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}}, }).bottles["dev"] text = pipelock_render_yaml(pipelock_build_config(bottle)) - self.assertIn("suppress:", text) - self.assertIn('rule: "BIP-39 Seed Phrase"', text) - self.assertIn('path: "/anthropic/**"', text) + self.assertIn("seed_phrase_detection:", text) + self.assertIn("enabled: false", text) if __name__ == "__main__": -- 2.52.0 From 77a51702fcb7e9e46000b53203efbccfdefecea7 Mon Sep 17 00:00:00 2001 From: didericis Date: Sun, 24 May 2026 14:08:35 -0400 Subject: [PATCH 23/24] fix(cred_proxy): force identity encoding on upstream requests MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit claude-code sends Accept-Encoding: gzip, deflate, br on every request. api.anthropic.com honors it and returns gzip-compressed SSE responses. 
Pipelock 2.3.0 has no decompression path; its response scanner fails closed with "blocked: compressed sse_stream response cannot be scanned" — and that gate fires even with response_scanning.enabled=false and sse_streaming disabled. Verified empirically against the real pipelock image. Cleanest fix that preserves DLP coverage end-to-end: have cred-proxy ask upstream for uncompressed bytes. Strip the agent's Accept-Encoding when building the upstream headers and inject `Accept-Encoding: identity`. Upstream returns plaintext; pipelock can scan; no 403. Bandwidth cost is the gzip ratio one-way (cred-proxy ↔ upstream through pipelock). For LLM SSE streams that's a few KB extra per turn — trivial compared to the alternative of leaving pipelock's response scanner blind. --- claude_bottle/cred_proxy_server.py | 15 ++++++++++++++- tests/unit/test_cred_proxy_server.py | 13 +++++++++++++ 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/claude_bottle/cred_proxy_server.py b/claude_bottle/cred_proxy_server.py index 1a0f4a3..5fc1c8a 100644 --- a/claude_bottle/cred_proxy_server.py +++ b/claude_bottle/cred_proxy_server.py @@ -157,7 +157,16 @@ _HOP_BY_HOP = frozenset({ "upgrade", }) -_STRIPPED = _HOP_BY_HOP | frozenset({"host", "authorization", "content-length"}) +# Strip the agent's Accept-Encoding on the upstream leg and force +# `identity` instead. The response then flows back uncompressed, +# which lets pipelock's response scanner read the body — pipelock +# 2.3.0 has no decompression path and otherwise blocks with +# "compressed sse_stream response cannot be scanned". The cost is +# bandwidth from upstream; for LLM SSE streams this is negligible +# and the DLP coverage on the agent leg is the win. +_STRIPPED = _HOP_BY_HOP | frozenset({ + "host", "authorization", "content-length", "accept-encoding", +}) def build_forward_headers( @@ -177,6 +186,9 @@ def build_forward_headers( every listed header name. 
- Inject `Authorization: ` and a Host header pointing at the upstream. + - Force `Accept-Encoding: identity` so the upstream returns + uncompressed bytes — pipelock's response scanner can't read + gzip/br/deflate and would otherwise 403 the response. """ incoming_list = list(incoming) # Headers listed in `Connection:` are also hop-by-hop for this hop. @@ -193,6 +205,7 @@ def build_forward_headers( forwarded.append((name, value)) forwarded.append(("Host", upstream_host)) forwarded.append(("Authorization", f"{auth_scheme} {token}")) + forwarded.append(("Accept-Encoding", "identity")) return forwarded diff --git a/tests/unit/test_cred_proxy_server.py b/tests/unit/test_cred_proxy_server.py index ce22889..bace39a 100644 --- a/tests/unit/test_cred_proxy_server.py +++ b/tests/unit/test_cred_proxy_server.py @@ -141,6 +141,19 @@ class TestBuildForwardHeaders(unittest.TestCase): self.assertNotIn("x-custom", names) # listed in Connection: -> hop-by-hop self.assertIn("x-real", names) + def test_forces_identity_accept_encoding(self): + # The agent's gzip/br Accept-Encoding gets replaced with + # `identity` so the upstream returns uncompressed bytes — + # pipelock's response scanner can't read compressed bodies + # and would 403 with "compressed sse_stream response cannot + # be scanned". + headers = build_forward_headers( + [("Accept-Encoding", "gzip, deflate, br")], + auth_scheme="Bearer", token="t", upstream_host="x.example", + ) + ae = [v for n, v in headers if n.lower() == "accept-encoding"] + self.assertEqual(["identity"], ae) + def test_strips_content_length(self): # http.client recomputes Content-Length; passing it through # double-counts and breaks the upstream. 
-- 2.52.0 From 6b915067066dd2d518e7b97f4eceea4c2410b377 Mon Sep 17 00:00:00 2001 From: didericis Date: Sun, 24 May 2026 14:23:26 -0400 Subject: [PATCH 24/24] docs: redraw README architecture to show pipelock as HTTP/S chokepoint MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The previous diagram showed three parallel egress lanes — agent ↔ pipelock, agent ↔ git-gate, agent ↔ cred-proxy — each going off-box independently. That was true of an earlier shape but is now wrong on two counts: 1. cred-proxy's outbound HTTPS routes through pipelock (set when the SSRF / CA-trust wiring landed). All cred-proxy upstream bytes pass pipelock's allowlist + body scanner. 2. git-gate's SSH push/fetch is direct out the egress network and has never gone through pipelock — pipelock is HTTP-only. Reflect both: the diagram now collapses to one HTTP/HTTPS chokepoint (pipelock) that the agent and cred-proxy share, plus a separate SSH lane for git-gate. Prose paragraph above the diagram updated to call out the "everything except SSH" framing explicitly. Verified against the current code: HTTPS_PROXY=pipelock set on the agent in launch.py and on cred-proxy in DockerCredProxy.start; git-gate's create-args carry no proxy env vars. --- README.md | 72 +++++++++++++++++++++++++++++++------------------------ 1 file changed, 41 insertions(+), 31 deletions(-) diff --git a/README.md b/README.md index 43bbfc3..32b2469 100644 --- a/README.md +++ b/README.md @@ -71,43 +71,53 @@ pieces of v1. A bottle is the agent container plus up to three per-protocol egress sidecars on a per-agent Docker `--internal` network. The agent has no -default route off-box; its only way out is through the pipelock -sidecar (for HTTP/HTTPS), the git-gate sidecar (for git operations -against declared upstreams), or the cred-proxy sidecar (for API -calls that need a manifest-declared token — Anthropic OAuth, GitHub -PAT, Gitea PAT, npm). 
Each sidecar also sits on an egress network -that does have internet access, so the agent's traffic always passes -through a container that enforces the manifest before it leaves the -host. +default route off-box. All HTTP and HTTPS egress — from the agent +*and* from cred-proxy when it dials an upstream — funnels through +pipelock, where the egress allowlist, TLS interception, and +request-body DLP scanner enforce the manifest before any byte leaves +the host. The only egress that doesn't traverse pipelock is git-gate's +SSH push/fetch to `bottle.git` upstreams — pipelock can't proxy SSH, +so git-gate is its own L4-style egress path with gitleaks doing the +pre-receive scan. ``` host ( ./cli.py ) │ starts │ stops ▼ - ┌─────────────────────────── bottle ──────────────────────────┐ - │ │ - │ ┌──────────────────┐ │ - │ │ agent image │ HTTPS_PROXY ┌────────────────┐ │ HTTPS to - │ │ (claude-code, │ ───────────────► │ pipelock image │──┼──► allowlisted - │ │ built locally) │ │ (TLS bump, DLP,│ │ hosts - │ │ │ │ allowlist) │ │ - │ │ skills, env, │ └────────────────┘ │ - │ │ ~/.gitconfig, │ │ - │ │ ~/.npmrc, tea │ git ops ┌────────────────┐ │ SSH (push/ - │ │ │ ───────────────► │ git-gate image │──┼──► fetch) to - │ │ │ │ (gitleaks + │ │ bottle.git - │ │ environ: URLs │ │ git daemon) │ │ upstreams - │ │ only, no real │ └────────────────┘ │ - │ │ tokens │ bearer-auth ┌────────────────┐ │ HTTPS to - │ │ │ ───────────────► │ cred-proxy │──┼──► bottle.tokens - │ │ │ HTTP, plain │ (strips/injects│ │ upstreams - │ │ │ │ Authorization)│ │ (with the - │ └──────────────────┘ └────────────────┘ │ real token) - │ │ - │ agent on internal network (no default route); │ - │ sidecars also attached to an egress network. 
│ - └─────────────────────────────────────────────────────────────┘ + ┌─────────────────────────── bottle ──────────────────────────────────┐ + │ │ + │ ┌──────────────────┐ │ + │ │ agent image │ HTTPS_PROXY │ + │ │ (claude-code, │ ────────────────────────┐ │ + │ │ built locally) │ │ │ + │ │ │ plain HTTP │ │ + │ │ skills, env, │ (token injection) ┌────▼─────────┐ │ + │ │ ~/.gitconfig, │ ──────────────────►│ cred-proxy │ │ + │ │ ~/.npmrc, tea │ │ (strips/inj │ │ + │ │ │ │ Authoriz.) │ │ + │ │ environ: URLs │ └─────┬────────┘ │ + │ │ only, no real │ HTTPS_PROXY │ │ + │ │ tokens │ ▼ │ + │ │ │ ┌────────────────┐ │ HTTPS to + │ │ │ │ pipelock image │──────────┼──► allowlisted + │ │ │ │ (TLS bump, DLP │ │ hosts (incl. + │ │ │ │ body scan, │ │ cred-proxy + │ │ │ │ allowlist) │ │ upstreams) + │ │ │ └────────────────┘ │ + │ │ │ │ + │ │ │ git:// ┌────────────────┐ │ SSH push/fetch + │ │ │ ────────────────►│ git-gate image │──────────┼──► to bottle.git + │ │ │ │ (gitleaks + │ │ upstreams + │ └──────────────────┘ │ git daemon) │ │ (direct — not + │ └────────────────┘ │ via pipelock) + │ │ + │ agent on internal network (no default route); pipelock, │ + │ cred-proxy, and git-gate straddle internal + egress networks. │ + │ pipelock is the single HTTP/HTTPS chokepoint — cred-proxy's │ + │ outbound traverses it too. git-gate's SSH egress is direct │ + │ because pipelock is HTTP-only. │ + └─────────────────────────────────────────────────────────────────────┘ ``` - **agent image** — built from the repo `Dockerfile` (`node:22-slim` -- 2.52.0