PRD 0010: Credential proxy for agent-bound API tokens

  • Status: Draft
  • Author: didericis
  • Created: 2026-05-13

Summary

Per-bottle reverse proxy that holds API tokens (Anthropic OAuth, GitHub PAT, Gitea PAT, npm token) in a root-owned process inside the agent container. The agent (node, UID 1000) keeps only URLs in its environ; the proxy injects the right Authorization header and forwards over TLS. The boundary that makes this meaningful is the kernel's ptrace_may_access check: node cannot read root's /proc/<pid>/environ and cannot ptrace attach without CAP_SYS_PTRACE / CAP_PERFMON, which claude-bottle does not grant.

AWS / SigV4 is explicitly out of scope — it is per-request signing, not header injection, and does not fit this proxy's shape. If a bottle needs AWS credentials later, that lives in a separate PRD.

Problem

Today CLAUDE_CODE_OAUTH_TOKEN (and any bottle.env secrets such as a Gitea PAT, GitHub PAT, or npm token) gets docker run -e'd straight into the agent's environ. Inside the bottle the agent runs as node with --dangerously-skip-permissions; its Bash tool can do printenv, cat /proc/self/environ, or node -e 'console.log(process.env)' and capture every value into the conversation. From there a prompt-injected or hijacked agent can exfil over any allowed egress (api.anthropic.com itself if nothing else).

Linux has no per-env-var ACL — once a variable is in a process's environ, the process and its descendants own it. The credible boundary is process-level: hold the credential in a different process the agent cannot read. Default Docker already enforces that boundary at the kernel line via ptrace_may_access, the same property the (removed) ssh-gate and the current git-gate rely on.

The research note agent-credential-proxy-landscape.md surveys the existing tools and concludes that a small claude-bottle-specific reverse proxy is less work and less risk than either adopting nono (alpha, unaudited) or Infisical Agent Vault (TLS-MITM topology that doubles up on pipelock's CA stack). This PRD is the build.

Goals / Success Criteria

Each test runs inside a bottle whose manifest declares the four supported kinds (anthropic, github, gitea, npm):

  1. No plaintext tokens in the agent's environ. printenv and cat /proc/self/environ from the agent's shell return only URLs pointing at 127.0.0.1:<PORT>/.... None of the bottle.tokens[].TokenRef values appear.
  2. Kernel boundary holds. From the agent's shell, cat /proc/<cred-proxy-pid>/environ fails with EACCES, and both gdb -p <cred-proxy-pid> and strace -p <cred-proxy-pid> fail with EPERM.
  3. Anthropic API works. claude makes a successful streaming tool-use round-trip via ANTHROPIC_BASE_URL=http://127.0.0.1:<PORT>/anthropic. SSE chunks arrive without buffering; anthropic-version, anthropic-beta, and X-Claude-Code-Session-Id headers round-trip untouched.
  4. Git push to declared remotes works. git push against a bottle.tokens[].Kind: github or gitea upstream succeeds; the upstream sees the token the proxy injected, not one supplied by the agent.
  5. npm install works. npm install <public-package> succeeds against the registry pointed at the proxy. A scoped install that requires the token (e.g. against a private registry) also succeeds.
  6. Wrong token rejected at the source, not silently swapped. If the agent tries to send its own Authorization: … header, the proxy strips it and injects the configured one. A manifest token revoked at the upstream produces a 401 to the agent, not a 5xx.
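
Criterion 1 lends itself to a mechanical check. The following is a minimal sketch, not part of the existing codebase: leaked_secrets is a hypothetical helper, and the shape of its secrets argument (TokenRef name mapped to the resolved value) is an assumption made for illustration. An integration test could feed it the agent's environ as captured from inside the bottle.

```python
from collections.abc import Mapping


def leaked_secrets(environ: Mapping[str, str], secrets: Mapping[str, str]) -> list[str]:
    """Return the names of secrets whose values appear anywhere in environ.

    `secrets` maps a TokenRef name to its resolved value (hypothetical shape).
    """
    leaked = []
    for ref_name, value in secrets.items():
        if not value:
            continue  # an empty value would match every variable
        # Substring match, so a token embedded in a URL is caught as well.
        if any(value in env_value for env_value in environ.values()):
            leaked.append(ref_name)
    return sorted(leaked)
```

An empty result means none of the supplied token values were found in the environ that was passed in; it says nothing about files or other processes.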

Non-goals

  • AWS / SigV4. Per-request signing is a different shape; a bearer-injecting proxy doesn't help. Hold for a future PRD (likely an IMDS emulator sidecar handing out short-lived STS credentials).
  • DB-backed credential store. Flat env / mode-600 file only. The LiteLLM CVE-2026-42208 incident is the cautionary tale: any DB-backed credential gateway is itself a high-value attack target.
  • Generic LLM-gateway features. No cost tracking, no fallbacks, no virtual keys, no multi-tenant routing, no usage metering. The proxy is a credential-injection trust endpoint, not a gateway.
  • Subsuming pipelock. pipelock keeps its egress-allowlist role. It drops the api.anthropic.com TLS-MITM job because cred-proxy is now the trust endpoint for that host; everything else pipelock does stays.
  • TLS interception inside the bottle. The agent talks plain HTTP to loopback; cred-proxy speaks real HTTPS outbound. No container-local CA, no golang/go#28866 loopback workaround.
  • Cross-bottle credential sharing. One proxy per bottle, same one-sidecar-per-agent posture as pipelock and git-gate.
  • claude --bare mode. Reads only ANTHROPIC_API_KEY, not the OAuth token. Not in claude-bottle's flow today.
  • MCP-server tokens, package-installer tokens for languages beyond npm. PyPI / Bun / cargo can land in a follow-up if needed; the routing pattern generalizes.

Scope

In scope

  • Manifest field. bottle.tokens: [TokenEntry, ...]. Each entry carries Kind (anthropic | github | gitea | npm), an optional Url (required for gitea, defaulted for the others), and TokenRef (the name of a host env var the CLI resolves at launch time).
  • cred-proxy process. Runs as root inside the agent container, listens on 127.0.0.1:<PORT>. Holds the tokens in its own environ — never on argv, never written to disk. Per-Kind route handler: inject the right header, forward over TLS, stream the response back to the client without buffering.
  • Agent-side rewrites. Provisioner writes:
    • ANTHROPIC_BASE_URL=http://127.0.0.1:<PORT>/anthropic to the agent's environ
    • ~/.npmrc registry = http://127.0.0.1:<PORT>/npm/
    • ~/.gitconfig [url …] insteadOf = … for each declared github / gitea upstream
    • ~/.config/tea/config.yml with the proxy URL for each declared gitea entry
  • Process lifecycle. Container entrypoint launches the proxy first as root, waits for it to bind, then exec setpriv … --reuid=node --regid=node … for the claude child. Proxy death is fatal (the container exits); this is also the PID-1-zombie story.
  • pipelock interop. Drop api.anthropic.com from pipelock's TLS-MITM list; keep it on the allowlist as a plain HTTPS host (cred-proxy is the trust endpoint now). Verify pipelock still lets cred-proxy's HTTPS connections out to the upstream hosts of the four kinds.
  • Plan rendering. bottle_plan.py and the y/N preflight show: which tokens are configured (kind + ref name, not the value), the proxy port, the routes the proxy will publish.
  • Drop the existing CLAUDE_CODE_OAUTH_TOKEN forward in prepare.py. Today it lands in the agent's environ; once this PRD ships, it lands in the proxy's environ instead.
  • Tests. Integration tests for each of the six success criteria; unit tests for manifest parsing, route table generation, header injection.
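
The agent-side rewrites listed above amount to rendering a few small text fragments from the proxy port. A rough sketch under stated assumptions: the function names are hypothetical, the rewrites mapping (upstream URL prefix to proxy path) is an illustrative shape rather than the provisioner's real input, and the tea config is left out because its layout is not described here.

```python
def render_agent_env(port: int) -> dict[str, str]:
    """Environment the agent receives: a URL, never a token."""
    return {"ANTHROPIC_BASE_URL": f"http://127.0.0.1:{port}/anthropic"}


def render_npmrc(port: int) -> str:
    """Contents of ~/.npmrc pointing npm at the proxy's /npm/ route."""
    return f"registry=http://127.0.0.1:{port}/npm/\n"


def render_gitconfig(port: int, rewrites: dict[str, str]) -> str:
    """Render one [url ...] insteadOf block per declared upstream.

    `rewrites` maps an upstream URL prefix to a proxy path (hypothetical shape),
    e.g. {"https://github.com/": "/gh-git/"}.
    """
    blocks = []
    for upstream_prefix, proxy_path in sorted(rewrites.items()):
        blocks.append(
            f'[url "http://127.0.0.1:{port}{proxy_path}"]\n'
            f"\tinsteadOf = {upstream_prefix}\n"
        )
    return "".join(blocks)
```

Keeping the renderers pure (port in, text out) would let unit tests cover them without a container.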

Out of scope

  • AWS / SigV4 (see Non-goals).
  • Per-method / per-path allowlist inside a kind. Defer to a follow-up once observed traffic stabilizes.
  • Replacing bottle.env for non-token secrets. The proxy handles the four kinds listed above; other env vars keep their current path.
  • Migrating an in-flight bottle from "token in agent env" to "token via proxy" mid-session. Restart required.
  • Audit logging. The proxy doesn't write request logs in v1. Add only if a concrete debugging need surfaces.

Proposed Design

Architecture

┌── Host (macOS) ──────────────────────────────────────────────────┐
│   Secrets at rest (keychain / .env):                             │
│     CLAUDE_BOTTLE_OAUTH_TOKEN, GITHUB_TOKEN,                     │
│     GITEA_SERVER_TOKEN, NPM_TOKEN                                │
│        │ docker run -e KEY  (no =VALUE on argv)                  │
│        ▼                                                         │
│   ┌── Bottle container ────────────────────────────────────────┐ │
│   │                                                            │ │
│   │   ┌── UID 1000 (node) ─────────────────────────────────┐   │ │
│   │   │  claude --dangerously-skip-permissions             │   │ │
│   │   │  environ: URLs only, no plaintext tokens           │   │ │
│   │   │    ANTHROPIC_BASE_URL=http://127.0.0.1:PORT/anth.. │   │ │
│   │   │    npm  registry    → http://127.0.0.1:PORT/npm/   │   │ │
│   │   │    git  remote.url  → http://127.0.0.1:PORT/...    │   │ │
│   │   │    tea  --url       → http://127.0.0.1:PORT/gitea  │   │ │
│   │   └────────────┬───────────────────────────────────────┘   │ │
│   │                │ plain HTTP, loopback                      │ │
│   │                ▼                                           │ │
│   │   ┌── UID 0 (root) ────────────────────────────────────┐   │ │
│   │   │  cred-proxy   listens 127.0.0.1:PORT               │   │ │
│   │   │  tokens live ONLY in this process's environ        │   │ │
│   │   │  per-route: inject auth header, forward over TLS   │   │ │
│   │   │    /anthropic → api.anthropic.com   Bearer         │   │ │
│   │   │    /gh-api    → api.github.com      Bearer         │   │ │
│   │   │    /gh-git    → github.com          Bearer         │   │ │
│   │   │    /gitea     → gitea.dideric.is    token          │   │ │
│   │   │    /npm       → registry.npmjs.org  Bearer         │   │ │
│   │   │  SSE pass-through, no buffering                    │   │ │
│   │   └────────────┬───────────────────────────────────────┘   │ │
│   │                │ HTTPS                                     │ │
│   │                ▼                                           │ │
│   │   ┌── pipelock (egress allowlist) ─────────────────────┐   │ │
│   │   │  allow: api.anthropic.com, api.github.com,         │   │ │
│   │   │         github.com, gitea.dideric.is,              │   │ │
│   │   │         registry.npmjs.org                         │   │ │
│   │   │  block: statsig, sentry, autoupdater, *            │   │ │
│   │   └────────────┬───────────────────────────────────────┘   │ │
│   └────────────────┼───────────────────────────────────────────┘ │
│                    ▼                                             │
└────────────────────┼─────────────────────────────────────────────┘
                     ▼
              Upstream APIs


Why node@1000 can't just steal the tokens:
   ┌─────────────────────────────────────────────────────────┐
   │  node tries:                                            │
   │     cat /proc/<cred-proxy-pid>/environ   → EACCES       │
   │     ptrace(PTRACE_ATTACH, <cred-proxy-pid>, ...) → EPERM│
   │  Kernel's ptrace_may_access rejects: UID mismatch       │
   │  and no CAP_SYS_PTRACE / CAP_PERFMON in the container.  │
   └─────────────────────────────────────────────────────────┘
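
The first probe in the box above could be scripted for the integration test. This is a sketch only: probe_environ is a hypothetical helper, and the proc_root parameter exists purely so the function can be exercised against a fake directory tree. Run as node against the proxy's PID, the expectation from this PRD is the string EACCES.

```python
import errno
import os


def probe_environ(pid: int, proc_root: str = "/proc") -> str:
    """Try to read <proc_root>/<pid>/environ.

    Returns "readable" on success, otherwise the symbolic errno name
    (for example "EACCES" or "ENOENT").
    """
    path = os.path.join(proc_root, str(pid), "environ")
    try:
        with open(path, "rb") as handle:
            handle.read(1)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "readable"
```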

New components

  • claude_bottle/cred_proxy.py (new): abstract CredProxy + CredProxyPlan dataclass. prepare is host-side and side-effect-free on Docker; it renders the route table and resolves TokenRefs against host env. Mirrors the existing GitGate / Pipelock shape.
  • claude_bottle/backend/docker/cred_proxy.py (new): DockerCredProxy concrete subclass. Bakes the proxy binary into the agent image; start writes the route table to a mode-600 file under stage_dir and arranges the entrypoint so the proxy boots first.
  • claude_bottle/backend/docker/provision/cred_proxy.py (new): renders ANTHROPIC_BASE_URL, ~/.npmrc, ~/.gitconfig insteadOf blocks, and ~/.config/tea/config.yml into the agent's home for each declared kind.
  • The proxy binary itself. Bundled into the agent image at /usr/local/libexec/cred-proxy. See "External dependencies" for the language choice.
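
"The proxy boots first" implies some readiness check before the agent starts. One simple way to do that, sketched here as an assumption rather than the chosen mechanism, is to poll the loopback port until a TCP connect succeeds. wait_for_bind and its default timings are hypothetical.

```python
import socket
import time


def wait_for_bind(host: str, port: int, timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll until something accepts TCP connections on (host, port).

    Returns True once a connect succeeds, False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not listening yet; retry shortly
    return False
```

A False return would be the point at which the entrypoint aborts instead of dropping privileges and starting claude.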

Existing code touched

  • claude_bottle/manifest.py — add TokenEntry, Bottle.tokens: tuple[TokenEntry, ...] = (), parse + validate (at most one entry per Kind except gitea, which may carry multiple Urls).
  • claude_bottle/backend/docker/prepare.py — delete the CLAUDE_BOTTLE_OAUTH_TOKEN → CLAUDE_CODE_OAUTH_TOKEN branch in the agent's forwarded env. The OAuth token now flows to the proxy's environ via the cred-proxy lifecycle.
  • claude_bottle/backend/docker/backend.py — instantiate DockerCredProxy; thread its prepare / start / stop through resolve_plan / launch.
  • claude_bottle/backend/docker/launch.py — add cred-proxy start before the cred-proxy provisioner runs (provisioner writes URLs that reference the proxy port, so it must be up).
  • claude_bottle/backend/docker/bottle_plan.py — new CredProxyPlan field; preflight shows kind + ref name + port + route table.
  • claude_bottle/pipelock.py — drop the api.anthropic.com TLS-MITM branch; the host stays on the allowlist as a plain HTTPS destination. Confirm the upstream hosts of the four kinds are allowlisted by default when bottle.tokens declares them.
  • README.md — replace the architecture diagram with the one above; document the bottle.tokens field.
  • claude-bottle.example.json — add a tokens array to one bottle showing each Kind.
  • Tests — new unit tests for manifest parsing, route table generation, header injection; new integration tests for the six success criteria. Delete the bits of prepare.py tests that asserted on CLAUDE_CODE_OAUTH_TOKEN landing in the agent's env.
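
For the preflight change in bottle_plan.py, the key property is that only the kind, route, and ref name are ever formatted, so the token value cannot leak into the plan output. A minimal sketch, assuming a hypothetical PlannedToken row and an illustrative output layout; the real preflight format is not specified in this PRD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedToken:
    """One preflight row (hypothetical). Deliberately has no field for the value."""
    kind: str
    token_ref: str
    route: str


def render_preflight(port: int, tokens: list[PlannedToken]) -> str:
    """Format the proxy port plus one line per configured token."""
    lines = [f"cred-proxy: 127.0.0.1:{port}"]
    for token in tokens:
        lines.append(f"  {token.kind:<10} {token.route:<12} token from ${token.token_ref}")
    return "\n".join(lines)
```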

Data model changes

@dataclass(frozen=True)
class TokenEntry:
    Kind: Literal["anthropic", "github", "gitea", "npm"]
    TokenRef: str             # name of host env var
    Url: str | None = None    # required for gitea; defaulted otherwise

@dataclass(frozen=True)
class Bottle:
    ...
    tokens: tuple[TokenEntry, ...] = ()

Validation:

  • Kind must be one of the four supported values.
  • TokenRef must resolve against os.environ at launch (fail fast with a clear "host env var X is unset" if missing).
  • gitea entries require Url; others fall back to the documented upstream.
  • At most one entry per Kind except gitea, which may have multiple distinct Urls.
  • No silent overlap with bottle.git upstreams that already flow through git-gate; if a tokens[].Kind: github|gitea entry's Url collides with a git[].Upstream's host, parse fails with a "git-gate already brokers this remote, drop one" hint. (Both paths broker credentials; doubling up is a configuration smell, not a feature.)

Routing table

Kind       Proxy path     Upstream             Header
anthropic  /anthropic/    api.anthropic.com    Authorization: Bearer …
github     /gh-api/       api.github.com       Authorization: Bearer …
github     /gh-git/       github.com           Authorization: Bearer …
gitea      /gitea/<Url>   configured Url       Authorization: token …
npm        /npm/          registry.npmjs.org   Authorization: Bearer …

Gitea uses Authorization: token rather than Bearer to sidestep go-gitea/gitea#16734. The proxy strips any incoming Authorization header before injecting its own — the agent cannot smuggle a stolen token through this path.

External dependencies

The proxy binary. Two real options:

  • Python (stdlib): http.server + urllib/http.client, no new pip packages. Matches CLAUDE.md's "bash-first, low-deps" posture. SSE pass-through is fiddly but doable.
  • Go single binary: cleaner SSE story, smaller runtime, one static binary baked into the image. New build dependency.

Default: Python, baked into the agent image. Reconsider in the implementation PR if SSE behavior is troublesome under load.
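
To give a sense of what SSE pass-through means with the stdlib, here is one possible relay loop. It is a sketch, not the planned implementation: relay is a hypothetical name, upstream is assumed to be an http.client.HTTPResponse (which provides read1), and downstream is assumed to be the request handler's wfile.

```python
import io
from typing import BinaryIO


def relay(upstream: io.BufferedIOBase, downstream: BinaryIO, chunk_size: int = 4096) -> int:
    """Copy bytes from upstream to downstream as they arrive.

    read1 returns after at most one underlying read instead of waiting to
    fill `chunk_size`, and every chunk is flushed immediately, so events
    are not held back by buffering in this loop. Returns bytes copied.
    """
    total = 0
    while True:
        chunk = upstream.read1(chunk_size)
        if not chunk:
            return total  # upstream closed the stream
        downstream.write(chunk)
        downstream.flush()
        total += len(chunk)
```

Whether this holds up under load, and how it interacts with chunked transfer encoding on the client-facing side, is the kind of thing the implementation PR would need to measure.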

No new Python packages. No DB. No admin API. The proxy's configuration is a single mode-600 JSON file passed in via /run/cred-proxy/routes.json.
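
A loader for the route file could refuse to start when the file is more permissive than mode 600. The sketch below assumes the route table is a JSON object; the schema of routes.json is not defined in this PRD, and load_routes is a hypothetical name.

```python
import json
import os
import stat


def load_routes(path: str) -> dict:
    """Load the route table, rejecting files readable by group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        raise PermissionError(f"{path} must be mode 600, got {mode:o}")
    with open(path, encoding="utf-8") as handle:
        routes = json.load(handle)
    # Assumption: the top level is a JSON object keyed by route.
    if not isinstance(routes, dict):
        raise ValueError("route table must be a JSON object")
    return routes
```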

Future work

  • AWS / SigV4. Likely an IMDS emulator sidecar handing out short-lived STS tokens. Different threat model (the agent ends up holding the STS creds — the proxy just shortens their lifetime). Separate PRD.
  • Per-method / per-path allowlist inside a kind. Once the set of API operations claude actually performs is observed, reject everything else. Narrows the within-allowlist surface.
  • Short-lived token minting. For services that support it (GitHub Apps, GitLab project-access tokens, fine-grained PATs with TTL), have the proxy mint a fresh per-session child credential from a long-lived parent.
  • Smolmachines colocation. Same packing question as pipelock / git-gate; the cred-proxy can sit inside the agent VM (current shape) or in a separate VM (stricter isolation, per-bottle TCP hop). Backend decision, not a manifest decision.
  • More kinds. PyPI, Bun, cargo, Docker Hub. The routing pattern generalizes; add as needed.

Open questions

  • Field name. bottle.tokens is the working name. The research note used bottle.forge for the gitea/github generalization, but "forge" doesn't fit anthropic or npm. Alternatives: bottle.brokered, bottle.upstreams, bottle.cred_proxy. Default: bottle.tokens.
  • Python vs Go for the proxy. Default: Python, revisit during implementation if SSE pass-through is unreliable.
  • Process inside the agent container vs sidecar container. v1: inside (simpler lifecycle, no extra container; ptrace boundary is enough). The sidecar option becomes attractive only if we want a network-layer split between proxy and agent on top of the UID split.
  • Belt-and-braces on outbound telemetry. Set CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 and DISABLE_ERROR_REPORTING=1 in the agent's environ by default? Default: yes — they don't route through ANTHROPIC_BASE_URL, so the proxy doesn't catch them; the flags are the only off switch.
  • git push over a rewritten URL vs. credential-helper shim. [url "http://…"] insteadOf = "https://github.com/" captures push/fetch/clone/pull/ls-remote in one config knob; a credential helper would need separate wiring. Default: insteadOf.
  • Token-refresh story for the Anthropic OAuth token. The token is valid for ~1 year and there's no client-side refresh, so the proxy holds a static value. The 1-year blast radius is the cost, documented in claude-code-token-revocation.md. No design change here; flagged for awareness.
  • anthropics/claude-code#36998. Older claude-code versions bypassed ANTHROPIC_BASE_URL for some startup calls (auth validation, org lookup). Marked closed upstream; the implementation PR verifies with strace -e connect against the pinned claude-code build before trusting the isolation.

References