From 00649d27e93f5bbb4304f06583a1bf06a77dba17 Mon Sep 17 00:00:00 2001 From: didericis Date: Tue, 12 May 2026 23:25:12 -0400 Subject: [PATCH] docs(research): add credential-proxy landscape and DLP-minimization framing Consolidates oauth-token-exposure-to-claude.md and tea-token-isolation-via-proxy.md into agent-credential-proxy-landscape.md, adding a May-2026 survey of existing tools (Docker AI Sandboxes, Cloudflare Sandbox Auth, Infisical Agent Vault, nono, Aembit, LiteLLM CVE-2026-42208, Portkey, Helicone, etc.) and a build-vs-adopt verdict. Adds secret-minimization-over-dlp.md explaining why pipelock's body DLP and gitleaks's pre-receive scan cannot stop encoding/splitting exfil, and why moving credentials out of the bottle (the git-gate pattern, generalized) is the only robust answer. Updates git-secret-scanning-hardening.md's reference to point at the new consolidated landscape doc. Co-Authored-By: Claude Opus 4.7 --- .../agent-credential-proxy-landscape.md | 392 ++++++++++++++++++ .../research/git-secret-scanning-hardening.md | 2 +- .../oauth-token-exposure-to-claude.md | 161 ------- docs/research/secret-minimization-over-dlp.md | 317 ++++++++++++++ .../research/tea-token-isolation-via-proxy.md | 189 --------- 5 files changed, 710 insertions(+), 351 deletions(-) create mode 100644 docs/research/agent-credential-proxy-landscape.md delete mode 100644 docs/research/oauth-token-exposure-to-claude.md create mode 100644 docs/research/secret-minimization-over-dlp.md delete mode 100644 docs/research/tea-token-isolation-via-proxy.md diff --git a/docs/research/agent-credential-proxy-landscape.md b/docs/research/agent-credential-proxy-landscape.md new file mode 100644 index 0000000..98a9d1e --- /dev/null +++ b/docs/research/agent-credential-proxy-landscape.md @@ -0,0 +1,392 @@ +# Agent credential proxy landscape + +Consolidated research on running an auth-header-injecting proxy in +front of an AI agent so API tokens stay out of the agent's process +space. 
Folds in the per-service mechanics for the Anthropic OAuth +token and the Gitea PAT — the two cases claude-bottle hits first — +and surveys existing tools as of May 2026. + +Companion to +[`secret-minimization-over-dlp.md`](secret-minimization-over-dlp.md) +(the architectural framing — why this matters), and to +[`local-vs-remote-agent-execution.md`](local-vs-remote-agent-execution.md) +(the broader threat model that flagged long-lived static tokens as +the biggest credential risk). + +## Summary + +Today every claude-bottle agent gets `CLAUDE_CODE_OAUTH_TOKEN` (and +any `bottle.env` secrets like a Gitea PAT) injected as env vars, +which means the agent process can read them with `printenv` or +`/proc/self/environ`. A prompt-injected or hijacked agent can ship +those bytes to any allowed host. Linux has no primitive for +"this env var exists in my process but I can't read it" — the only +credible boundary is to put the credential in a *different* process +that the agent cannot read, and let the agent talk to it over a +narrow API. Default Docker enforces that boundary at the kernel +level via `ptrace_may_access`; a future smolmachines backend +enforces it harder, at the VM line. + +Several existing tools implement this pattern, but none of them are +a clean drop-in for claude-bottle today: the most architecturally +aligned (nono) is alpha; the most mature open-source +(Infisical Agent Vault) requires TLS MITM and would double up on +pipelock's TLS-interception stack. For the Anthropic-token slice, a +small claude-bottle-specific reverse proxy modeled on the +phantom-token shape is probably the right call. For Gitea / GitHub / +GitLab, the same proxy generalizes by config. + +## The shared problem + +Linux has no per-env-var ACL. Once a var is in a process's +`environ`, the process and its descendants own it. The deeper +boundary is **process-level**: hold the credential in a process the +agent cannot read. + +Default Docker enforces that boundary for you. 
The kernel's +`ptrace_may_access` check rejects `/proc/<pid>/environ` reads when +the caller's UID/GID don't match the target's and the caller lacks +`CAP_SYS_PTRACE` or `CAP_PERFMON`. A `node`-uid claude attempting to +read a root-owned proxy's environ gets `EACCES`. Escape hatches +(`--cap-add=SYS_PTRACE`, `--cap-add=PERFMON`, `--privileged`) are +not used by claude-bottle. Yama `ptrace_scope` is irrelevant — it +only tightens the *same-UID* relationship check; the cross-UID +match requirement still blocks the read. On a smolmachines backend +the boundary becomes the VM line; same property, harder. + +claude-code's `apiKeyHelper` setting is **not** a boundary. The +helper is invoked by claude's own process, so claude can just call +it via Bash and capture stdout. Same trust domain. + +The remaining credible designs reduce to three: + +- **Header-injecting reverse proxy** — agent points at a localhost + URL; proxy holds the credential; proxy adds the auth header and + forwards. Cleanest fit for services that support a `BASE_URL`-style + override (Anthropic, OpenAI, Portkey, etc.). +- **Forward proxy with TLS termination** — agent keeps the real + service URL; an `HTTPS_PROXY` MITM intercepts, terminates TLS with + a container-local CA, injects the header, re-encrypts. Heavier; + required when the agent's tool can't be pointed at an explicit URL. +- **Don't ship the token at all** — fall back to per-session login + or short-lived child tokens. Operationally heavier; the long-lived + OAuth token was chosen precisely because it's portable + (Keychain on macOS, file on Linux). + +## Per-service mechanics + +### Anthropic / Claude Code + +**Today's wiring** (`claude_bottle/cli/start.py`): the host's +`CLAUDE_BOTTLE_OAUTH_TOKEN` is forwarded into the bottle as +`CLAUDE_CODE_OAUTH_TOKEN` via `docker run -e CLAUDE_CODE_OAUTH_TOKEN` +(no `=value`, so the value never lands on argv — good). 
Inside the +bottle, claude runs as `node` (UID 1000) with +`--dangerously-skip-permissions`. Its Bash tool can do +`printenv CLAUDE_CODE_OAUTH_TOKEN`, `cat /proc/self/environ`, +`node -e 'console.log(process.env)'` and capture the value into +the conversation. The DLP / egress story +([`secret-minimization-over-dlp.md`](secret-minimization-over-dlp.md)) +explains why scanning on the way out doesn't save you here. + +**Routing primitive:** `ANTHROPIC_BASE_URL` is documented as a +generic proxy/gateway override, not just Bedrock/Vertex, and works +alongside bearer auth. The proxy sets +`Authorization: Bearer $TOKEN` and forwards to +`https://api.anthropic.com`. Claude as `node` only sees the URL, +never the token. + +**Confirmed gotchas:** + +- **SSE streaming**: the proxy must not buffer responses (nginx + `proxy_buffering off`, or a streaming-aware proxy). Claude Code + uses SSE only — no websockets. +- **Forward `anthropic-version`, `anthropic-beta`, and + `X-Claude-Code-Session-Id` untouched.** Stripping them breaks tool + use / extended thinking / session aggregation. +- **GitHub issue + [#36998](https://github.com/anthropics/claude-code/issues/36998)**: + interactive mode historically bypassed `ANTHROPIC_BASE_URL` for + some startup calls (auth validation / org lookup), connecting + directly to api.anthropic.com. Marked closed but verify with + `tcpdump` or `strace -e connect` against the pinned claude-code + build before trusting the isolation. +- **Tool search** (`ENABLE_TOOL_SEARCH`) is disabled by default when + `ANTHROPIC_BASE_URL` is non-Anthropic; re-enable explicitly if + needed. 
+- **Out-of-band outbound traffic** does *not* route through + `ANTHROPIC_BASE_URL`: + - `statsig.anthropic.com` — telemetry + (disable: `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1`, + `DISABLE_TELEMETRY=1`) + - Sentry error reporting (disable: `DISABLE_ERROR_REPORTING=1`) + - `registry.npmjs.org`, `github.com`, + `release-assets.githubusercontent.com` — MCP installs and + autoupdater + - `pypi.org`, `bun.sh` — if the Bash tool installs Python or Bun + packages during a session + + A hijacked claude could exfil the captured token (or any other + data) through any of these even with the proxy in place. Pair + the proxy with an explicit egress allowlist for the full benefit + (claude-bottle does this via pipelock). +- **Token refresh**: `claude setup-token` issues a ~1-year OAuth + token with no client-side refresh, so a static proxy value is + fine. The flip side is a one-year blast radius if the token leaks + — see + [`claude-code-token-revocation.md`](claude-code-token-revocation.md). +- **No request signing / anti-replay** on the Messages API; header + rewriting is safe. +- **`--bare` mode** reads only `ANTHROPIC_API_KEY`, not + `CLAUDE_CODE_OAUTH_TOKEN`. Not relevant to the interactive flow + claude-bottle ships, but worth noting if `--bare` is ever wired in. + +### Gitea (`tea` + git HTTPS) + +**Token sources, in precedence order:** + +1. **`GITEA_SERVER_TOKEN` env var** — registered via + `cli.EnvVars("GITEA_SERVER_TOKEN")` in `cmd/login/add.go`. +2. **`~/.config/tea/config.yml`** (XDG) or `~/.tea/tea.yml` + (legacy fallback) — plaintext YAML, `token` field under the + login entry. +3. **OS credstore** — OAuth logins only; PAT-based logins go to + the YAML file. + +There is **no `credential.helper` analogue**: no `--token-file`, +no FD-passing, no socket-based credential protocol. So the token +can't be hidden *inside* `tea`'s process — it has to be held by a +*different* process the agent cannot read. 
For HTTPS git +operations, `tea` uses `go-git` directly with +`BasicAuth{Username: token, Password: ""}` +(`modules/git/auth.go`), bypassing git's own credential.helper +machinery. A credential-helper shim alone won't intercept +`tea repo clone` — the proxy has to sit on the HTTP path itself. + +**Header form:** use `Authorization: token <…>`, **not** `Bearer`. +[go-gitea/gitea#16734](https://github.com/go-gitea/gitea/issues/16734) +emitted spurious "missing CSRF token" errors for `Bearer` on some +endpoints. The fix landed upstream, but `token` has always been +the header-safe choice. + +**No CSRF / no per-request nonce** on the Gitea API for token +auth, so a header-rewriting proxy is safe. + +**Plain `git push`** from claude can use either the proxy +(rewritten remote URL) or a credential-helper shim that calls the +proxy. The rewritten-remote approach keeps the token bytes out of +git's credential negotiation entirely. (Note: this is parallel to +the existing git-gate in PRD 0008, which solves the SSH-push case +via a per-bottle mirror.) + +### GitHub / GitLab + +Structurally identical to Gitea for PAT auth: stateless +`Authorization: Bearer <…>` (GitHub PATs and GitLab PATs both +accept Bearer, and GitHub also accepts `token <…>` for legacy +clients), no CSRF, no signing. Per-route allowlisting at the +proxy is the lever for narrowing blast radius. GitHub fine-grained +PATs and GitLab project-access tokens are the issuance-side +mitigation. Either composes cleanly with the same proxy. + +## Proxy architectures + +Four shapes worth comparing. The first is the lowest-friction +match for claude-bottle today. + +| Shape | Pros | Cons | +|---|---|---| +| **In-container reverse proxy** (recommended) | Self-contained per agent, no host changes, no MITM CA, no Go-loopback workaround. Works for any service with a `BASE_URL`-style override (Anthropic, OpenAI, Portkey). 
| Doesn't work for services that hardcode the upstream URL — requires either rewriting the client config or moving up to a forward proxy. | +| **In-container forward proxy + TLS termination** | Transparent to the agent's tooling — every HTTPS request gets intercepted regardless of base-URL support. | Needs a container-local CA in the trust store (same machinery PRD 0006 set up for pipelock). Has the `golang/go#28866` loopback gotcha: `net/http` ignores `HTTPS_PROXY` when set to `127.0.0.1`/`localhost`, so the proxy must bind on a non-loopback address (Docker bridge IP, `host.docker.internal`, or `ip addr add 10.0.0.1/32 dev lo`). | +| **Host-side proxy** | Token stays entirely outside the Linux VM. This is the Docker AI Sandbox shape. | A host daemon to maintain; the published port is reachable by any container on the host unless firewalled. UDS-across-VM doesn't work on Docker Desktop on macOS (no AF_UNIX `connect()` over the VM), but `host.docker.internal:` over TCP works fine. | +| **Sidecar container** | Clean isolation; portable across hosts. Matches the existing pipelock / ssh-gate / git-gate topology. | Another container to orchestrate per agent; the token is in another container's env, which is a lateral move unless the sidecar runs with stricter isolation than the agent container does. | + +For claude-bottle today — local Docker, per-agent containers, the +root-owned-helper pattern already established by the SSH agent — +the **in-container reverse proxy** is the lowest-friction option +that gives the desired property. The sidecar-container shape is +the natural evolution if the proxy needs the same per-bottle +isolation that pipelock has. + +## Landscape of existing tools (May 2026) + +Two categories: + +- **A. Generic LLM / API gateways** that happen to support credential + injection as a side feature. +- **B. Purpose-built agent credential brokers** — newer, closer to + what claude-bottle wants. 
+ +| Tool | Category | License | Topology | Injection mechanism | `ANTHROPIC_BASE_URL` compatible | Per-route allowlist | Maturity | +|---|---|---|---|---|---|---|---| +| **Docker AI Sandboxes** | B | Proprietary | Host-side proxy | Header overwrite, OS keychain | No (intercepts by domain) | Domain only | GA (Mar 2026) | +| **Cloudflare Sandbox Auth** | B | Proprietary | Sandbox sidecar + ephemeral CA | TLS intercept + Outbound Worker | No (platform-specific) | Host/IP/method | GA (Apr 2026) | +| **Infisical Agent Vault** | B | MIT (EE carve-out) | In-process HTTPS_PROXY forward proxy | TLS MITM, dummy-to-real swap | No — HTTPS_PROXY model | Service-level | Active; v0.19.0 May 2026, ~1k⭐ | +| **nono** | B | Apache-2.0 | In-process reverse proxy | Phantom token, explicit URL routing | **Yes** — `BASE_URL=http://127.0.0.1:PORT/…` | Host + endpoint | Early alpha; v0.53.0 May 2026, 2.4k⭐ | +| **Aegis** | B | Apache-2.0 | In-process reverse proxy | Path routing (`localhost:3100/{svc}/…`) | Configurable, undocumented for Anthropic | Method/path/rate/time | Very new, 10⭐ | +| **OneCLI** | B | Apache-2.0 | Reverse proxy + management UI | Host/path matching, Bitwarden integration | Configurable | Per-agent scoping | Active; v1.23.0 May 2026, 2.1k⭐ | +| **Aembit** | B | Proprietary | Sidecar + cloud control plane | TLS intercept, SPIFFE, JIT creds | No — intercepts by destination | Policy-based | GA (Apr 2026) | +| **LiteLLM Proxy** | A | MIT | Reverse proxy | Virtual key → upstream key | Yes — set base URL to LiteLLM | Route-level | 45k⭐; **CVE-2026-42208 exploited Apr 2026**, patch v1.83.7 | +| **Portkey Gateway** | A | MIT (OSS core) | Reverse proxy | Virtual key vault (cloud or Enterprise self-host) | Yes — documented for Claude Code | Config-based | Production; virtual-key vault needs Enterprise for self-host | +| **Helicone** | A | Apache-2.0 | Reverse proxy | Proxy header auth; agent still holds own key | Yes | No | Maintenance mode (Mintlify acq. 
Mar 2026) | +| **LangSmith LLM Auth Proxy** | A | OSS Helm | Envoy sidecar | JWT + ext_authz upstream key injection | Yes | URL allowlist | Enterprise (LangSmith ≥ v0.13.33) | +| **Kong AI Gateway** | A | Apache-2.0 | Reverse proxy | Plugin per-route/consumer | Yes | Plugin-level | Production, heavy | +| **AWS IMDSv2** | — | n/a | Link-local | Per-instance metadata | n/a | n/a | Conceptual analog only | + +### Cluster commentary + +- **The phantom-token pattern** (nono) is the cleanest architectural + fit for claude-bottle. The agent receives a per-session + cryptographically random token scoped to the localhost proxy; + the proxy validates and swaps for the real upstream credential. + No TLS interception, no CA trust setup, works directly with + `ANTHROPIC_BASE_URL`. **Blocker:** nono is explicitly + "early alpha, not security audited." + +- **TLS-MITM forward proxies** (Infisical Agent Vault, Cloudflare + Sandbox Auth, Aembit, the existing pipelock) all double up on + the CA-trust machinery PRD 0006 already built for pipelock. + Adopting Agent Vault would mean two MITM proxies in each bottle + unless one is dropped. Also subject to `golang/go#28866` — must + bind on a non-loopback address. + +- **LLM gateways** (LiteLLM, Portkey, Helicone, Kong) all support + credential injection but are built for cost / observability / + fallback, not isolation. **Specific concern:** the LiteLLM + CVE-2026-42208 (CVSS 9.3, pre-auth SQL injection on the Bearer + auth path, exploited within 36 hours of disclosure) is a + reminder that any self-hosted DB-backed credential gateway is + itself a high-value attack target. Prefer a flat-file or + env-only credential store on the sidecar over a database. + +- **Helicone is in maintenance mode** since the Mintlify + acquisition in March 2026 (security fixes only, no features). + Treat as legacy. + +- **Portkey's virtual-key vault** — the actual credential-injection + feature — requires the Enterprise plan for self-host. 
The + open-source gateway alone does routing without injection. + +## Build-vs-adopt synthesis + +**Architecturally aligned:** nono. Phantom-token + explicit-URL +routing matches the design recommended here exactly; zero TLS +work. But "not security audited" + "early alpha" means adopting it +is a bet on the project rather than a buy-vs-build win. + +**Most mature OSS purpose-built:** Infisical Agent Vault. MIT, +v0.19.0 active, v0.17.0 added a containerized agent mode that +maps directly to claude-bottle. Friction is the TLS-MITM topology +— another container-local CA, the Go-loopback workaround, +duplication with pipelock's existing TLS interception layer. + +**For the immediate Anthropic-token slice, a ~100-line Rust or Go +reverse proxy modeled on nono's phantom-token shape is probably +less work and less risk than adopting either.** The surface is +small: hold the token, inject one header, forward to +api.anthropic.com over TLS, pass through SSE without buffering. +For Gitea / GitHub / GitLab the same proxy generalizes by config. + +The build path also keeps the credential store flat (env file or +mode-600 YAML on the sidecar), which sidesteps the +"DB-backed-gateway as attack surface" concern the LiteLLM CVE +exposed. + +## Recommended path forward + +In priority order: + +1. **In-container reverse proxy holding `CLAUDE_CODE_OAUTH_TOKEN`.** + Highest-leverage change: credential isolation **and** the + ability to drop the `api.anthropic.com` TLS passthrough in + pipelock (see + [`secret-minimization-over-dlp.md`](secret-minimization-over-dlp.md) + §2). Proxy runs as root inside the agent container, listens on + `127.0.0.1` (no Go-loopback issue for the reverse-proxy case — + the agent isn't using `HTTPS_PROXY`), injects + `Authorization: Bearer …`, sets the bottle's + `ANTHROPIC_BASE_URL` to the local URL. + +2. 
**Layer in + `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1`** plus the existing + pipelock egress allowlist (api.anthropic.com only, plus the + per-agent set of MCP / git / package-registry hosts). A + hijacked claude can no longer exfil through statsig / Sentry / + npm even if it captures something. Also disable Sentry error + reporting via `DISABLE_ERROR_REPORTING=1`. + +3. **Generalize the same proxy to forge tokens.** Add a manifest + field along the lines of + `forge: { kind: "gitea", url, tokenRef }` so a per-bottle token + reference resolves at launch, the proxy starts as root before + `node` is exec'd, and `tea` plus git HTTPS remotes are + pre-configured to point at the proxy. Use + `Authorization: token <…>` for Gitea, `Bearer` for GitHub / + GitLab. + +4. **Scope-narrow the tokens at issuance.** `repo:write` only, no + `admin`, no user management. Fine-grained GitHub PATs, GitLab + project-access tokens, Gitea per-repo tokens. Cheapest single + thing to do; bounds blast radius regardless of whether the + proxy ships. + +5. **Allowlist at the proxy** once usage is stable. Method + path + filter keyed off the agent's actual API calls; reject + everything else. Doesn't prevent abuse within the allowlist but + narrows the surface to known good operations. + +The current `docker run -e CLAUDE_CODE_OAUTH_TOKEN` pattern is +fine for argv hygiene on the host, but inside the bottle the +token is fully exposed. The proxy pattern moves it across a +kernel-enforced boundary — the same property the SSH agent +already gives us for keys, and the same property the git-gate +already gives us for upstream push credentials. 
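The reverse-proxy shape recommended above can be sketched in a few dozen lines. This is a minimal illustration rather than the claude-bottle implementation: it is written in Python for brevity (the text suggests Rust or Go for the real thing), and the `PHANTOM_TOKEN` variable, the port `8080`, and the names `rewrite_headers` and `ProxyHandler` are all assumptions made for the example. It follows the phantom-token idea described for nono: the agent presents a throwaway bearer value, and the proxy swaps in the real credential before forwarding to `api.anthropic.com`.

```python
"""Sketch of a phantom-token, header-injecting reverse proxy (illustrative names)."""
import hmac
import http.client
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM_HOST = "api.anthropic.com"  # upstream named in the text above
# Headers the proxy regenerates itself or must not forward verbatim.
DROP = {"authorization", "host", "content-length", "connection",
        "keep-alive", "proxy-authorization", "te", "trailer",
        "transfer-encoding", "upgrade", "date", "server"}


def rewrite_headers(incoming, phantom_token, real_token):
    """Swap the phantom bearer token for the real one.

    Returns the headers to send upstream, or None when the caller did not
    present the expected phantom token. Anything not in DROP (for example
    anthropic-version and anthropic-beta) passes through untouched.
    """
    presented = ""
    for name, value in incoming.items():
        if name.lower() == "authorization":
            presented = value
    expected = f"Bearer {phantom_token}"
    if not hmac.compare_digest(presented.encode(), expected.encode()):
        return None
    out = {k: v for k, v in incoming.items() if k.lower() not in DROP}
    out["Authorization"] = f"Bearer {real_token}"
    return out


class ProxyHandler(BaseHTTPRequestHandler):
    phantom_token = ""  # hypothetical: per-session random value handed to the agent
    real_token = ""     # real credential, present only in this process

    def do_POST(self):
        headers = rewrite_headers(dict(self.headers.items()),
                                  self.phantom_token, self.real_token)
        if headers is None:
            self.send_error(401, "unknown phantom token")
            return
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        upstream = http.client.HTTPSConnection(UPSTREAM_HOST, timeout=600)
        try:
            upstream.request("POST", self.path, body=body, headers=headers)
            resp = upstream.getresponse()
            self.send_response(resp.status)
            for name, value in resp.getheaders():
                if name.lower() not in DROP:
                    self.send_header(name, value)
            self.end_headers()
            # Relay bytes as they arrive so SSE events are not buffered.
            while chunk := resp.read1(8192):
                self.wfile.write(chunk)
                self.wfile.flush()
        finally:
            upstream.close()


if __name__ == "__main__":
    ProxyHandler.real_token = os.environ["CLAUDE_CODE_OAUTH_TOKEN"]
    ProxyHandler.phantom_token = os.environ["PHANTOM_TOKEN"]  # hypothetical variable
    ThreadingHTTPServer(("127.0.0.1", 8080), ProxyHandler).serve_forever()
```

Under these assumptions the agent would be started with `ANTHROPIC_BASE_URL=http://127.0.0.1:8080` and the phantom value as its bearer token, while the proxy runs as root so the `node` user cannot read its environment. The sketch handles only `POST`, answers in HTTP/1.0 (the response ends when the connection closes), and omits logging, route allowlisting, and error handling for upstream failures.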
+ +## Sources + +### Mechanics + +- [Authentication — Claude Code docs](https://code.claude.com/docs/en/authentication) +- [LLM gateway configuration — Claude Code docs](https://code.claude.com/docs/en/llm-gateway) +- [Claude Code environment variables](https://code.claude.com/docs/en/env-vars) +- [GitHub issue anthropics/claude-code#36998 — interactive mode bypasses ANTHROPIC_BASE_URL](https://github.com/anthropics/claude-code/issues/36998) +- [GitHub issue anthropics/claude-code#11587 — apiKeyHelper vs CLAUDE_CODE_OAUTH_TOKEN](https://github.com/anthropics/claude-code/issues/11587) +- [`proc_pid_environ(5)` man page](https://man7.org/linux/man-pages/man5/proc_pid_environ.5.html) +- [Documenting ptrace access mode checking — LWN](https://lwn.net/Articles/692203/) +- [StepSecurity — Claude Code Action outbound network analysis](https://www.stepsecurity.io/blog/anthropics-claude-code-action-security-how-to-secure-claude-code-in-github-actions-with-harden-runner) +- [Manage API key environment variables — Claude Help Center](https://support.claude.com/en/articles/12304248-manage-api-key-environment-variables-in-claude-code) +- [tea source — `cmd/login/add.go`](https://gitea.com/gitea/tea/src/branch/main/cmd/login/add.go) +- [tea source — `modules/config/config.go`](https://gitea.com/gitea/tea/src/branch/main/modules/config/config.go) +- [tea source — `modules/git/auth.go`](https://gitea.com/gitea/tea/src/branch/main/modules/git/auth.go) +- [Gitea API usage docs](https://docs.gitea.com/development/api-usage) +- [go-gitea/gitea#16734 — `Authorization: Bearer` triggers spurious CSRF](https://github.com/go-gitea/gitea/issues/16734) +- [golang/go#28866 — `net/http` ignores `HTTPS_PROXY` for `127.0.0.1`/`localhost`](https://github.com/golang/go/issues/28866) +- [git credential helper docs](https://git-scm.com/docs/gitcredentials) + +### Landscape + +- [Docker AI Sandboxes — credentials](https://docs.docker.com/ai/sandboxes/security/credentials/) +- 
[docker/desktop-feedback#130 — custom injection rules](https://github.com/docker/desktop-feedback/issues/130) +- [Cloudflare Sandbox Auth blog](https://blog.cloudflare.com/sandbox-auth/) +- [Cloudflare Outbound Workers GA changelog](https://developers.cloudflare.com/changelog/post/2026-04-13-sandbox-outbound-workers-tls-auth/) +- [Cloudflare Sandboxes GA — InfoQ](https://www.infoq.com/news/2026/04/cloudflare-sandboxes-ga/) +- [Infisical agent-vault — GitHub](https://github.com/Infisical/agent-vault) +- [Infisical agent-vault — releases](https://github.com/Infisical/agent-vault/releases) +- [Infisical agent-vault — blog](https://infisical.com/blog/agent-vault-the-open-source-credential-proxy-and-vault-for-agents) +- [nono — GitHub](https://github.com/always-further/nono) +- [nono — phantom token blog](https://nono.sh/blog/blog-credential-injection) +- [Aegis — GitHub](https://github.com/getaegis/aegis) +- [OneCLI — GitHub](https://github.com/onecli/onecli) +- [Sandbox0 — GitHub](https://github.com/sandbox0-ai/sandbox0) +- [Buildkite Cleanroom — GitHub](https://github.com/buildkite/cleanroom) +- [Aembit IAM for Agentic AI — GA](https://aembit.io/blog/aembit-iam-for-agentic-ai-is-now-generally-available/) +- [Aembit Claude integration docs](https://docs.aembit.io/user-guide/access-policies/server-workloads/guides/claude) +- [LiteLLM CVE-2026-42208 — Sysdig writeup](https://www.sysdig.com/blog/cve-2026-42208-targeted-sql-injection-against-litellms-authentication-path-discovered-36-hours-following-vulnerability-disclosure/) +- [LiteLLM — GitHub](https://github.com/BerriAI/litellm) +- [Portkey + Claude Code](https://portkey.ai/docs/virtual_key_old/integrations/libraries/claude-code) +- [Portkey gateway — GitHub](https://github.com/Portkey-ai/gateway) +- [Helicone maintenance mode announcement](https://dev.to/torrixai/helicone-is-now-in-maintenance-mode-here-is-how-to-switch-to-a-self-hosted-alternative-in-5-4li0) +- [LangSmith LLM auth proxy 
docs](https://docs.langchain.com/langsmith/llm-auth-proxy-self-hosted) +- [AWS IMDSv2 docs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html) +- [Pipelock — Help Net Security](https://www.helpnetsecurity.com/2026/05/04/pipelock-open-source-ai-agent-firewall/) +- [CB4A IETF draft — Credential Broker for Agents](https://www.ietf.org/archive/id/draft-hartman-credential-broker-4-agents-00.html) +- [List of coding agent sandboxes (May 2026)](https://gist.github.com/wincent/2752d8d97727577050c043e4ff9e386e) diff --git a/docs/research/git-secret-scanning-hardening.md b/docs/research/git-secret-scanning-hardening.md index a0028e8..99d762f 100644 --- a/docs/research/git-secret-scanning-hardening.md +++ b/docs/research/git-secret-scanning-hardening.md @@ -126,7 +126,7 @@ forwarding to the real remote: unprivileged agent cannot read or modify. It holds the real push credential (deploy key, PAT, ssh agent socket) — the bottle never sees it, same as the auth-injecting proxy for `ANTHROPIC_BASE_URL` - in `oauth-token-exposure-to-claude.md`. + in `agent-credential-proxy-landscape.md`. - On receive, the gate runs `gitleaks detect` against the incoming refs (and their message text) in a temporary working tree. Clean pushes are forwarded to the real remote. Findings cause the push to diff --git a/docs/research/oauth-token-exposure-to-claude.md b/docs/research/oauth-token-exposure-to-claude.md deleted file mode 100644 index 6b82a1d..0000000 --- a/docs/research/oauth-token-exposure-to-claude.md +++ /dev/null @@ -1,161 +0,0 @@ -# OAuth token exposure to claude inside the bottle - -Research into whether `CLAUDE_CODE_OAUTH_TOKEN` — as currently forwarded into -each claude-bottle container — is reachable by claude itself, what (if any) -deeper-than-prompt mechanisms could hide it, and whether routing claude -through an auth-injecting proxy is viable. - -## Summary - -Yes, claude can read `CLAUDE_CODE_OAUTH_TOKEN` trivially today. 
There is no -Linux primitive for "this env var exists in my process but I cannot read it"; -the only credible boundary is to put the credential in a *different* process -that claude cannot read. Default Docker enforces that boundary at the kernel -level (a non-root process cannot read `/proc/<pid>/environ`), so a -root-owned auth-injecting reverse proxy listening on `127.0.0.1` is a -realistic design. Claude Code's `ANTHROPIC_BASE_URL` officially supports -this routing pattern with bearer auth, with documented caveats around SSE, -header passthrough, and out-of-band outbound traffic (telemetry, npm, etc.) -that does not route through `ANTHROPIC_BASE_URL` at all. - -## How the token reaches claude today - -1. `claude_bottle/cli/start.py` (around line 237–238) — host's - `CLAUDE_BOTTLE_OAUTH_TOKEN` is exported into the launcher process as - `CLAUDE_CODE_OAUTH_TOKEN`, then forwarded with - `docker run -e CLAUDE_CODE_OAUTH_TOKEN` (no `=value`, so the value - never lands on argv — good). -2. `claude_bottle/cli/start.py` (around line 318–325) — claude is launched via - `docker exec -it claude …`, which inherits the container - PID 1's env, including the token. -3. claude runs as `node` (UID 1000) with `--dangerously-skip-permissions`. - Its Bash tool can run `printenv CLAUDE_CODE_OAUTH_TOKEN`, - `cat /proc/self/environ`, `node -e 'console.log(process.env)'`, etc. - and capture the value into the conversation. - -A prompt-injection vector — a poisoned skill, a malicious string in a file -claude reads, or a hijacked MCP server — can extract the token and -exfiltrate it through any allowed outbound channel. The -`local-vs-remote-agent-execution.md` note already flags static long-lived -tokens as the biggest credential risk; this is exactly that risk, present -in the local topology today. - -## Hiding env vars "at a deeper level" - -Linux has no primitive to mark an individual env var as unreadable to the -process that holds it. 
Once a var is in a process's `environ`, the process -and its descendants have full access. The deeper-level lever is process -boundary, not env-var ACL: put the credential in a *different* process -that claude cannot read. - -Default Docker enforces this for you. The kernel's `ptrace_may_access` -check rejects `/proc/<pid>/environ` reads when the caller's UID/GID don't -match the target's and the caller lacks `CAP_SYS_PTRACE` or `CAP_PERFMON`. -A `node`-uid claude process attempting to read a root-owned proxy's -environ gets `EACCES`. Escape hatches are explicit and not used by -claude-bottle: `--cap-add=SYS_PTRACE`, `--cap-add=PERFMON`, -`--privileged`. Yama `ptrace_scope` is irrelevant here because it only -relaxes the *same-UID* relationship check; the cross-UID UID-match -requirement still blocks the read. - -The `apiKeyHelper` setting in claude-code is **not** a boundary. The -helper is invoked by claude's own process, so claude can just call it via -Bash and capture stdout. Same trust domain. - -The only credible designs: - -- **Header-injecting reverse proxy** — claude points at a localhost URL; - proxy holds the credential; proxy adds `Authorization: Bearer` and - forwards. (See next section.) -- **Network namespace + outbound proxy** — claude runs with - `--network none` and a unix-socket proxy that holds the credential and - enforces an egress allowlist. Anthropic's secure-deployment docs - describe this pattern; the existing research note on remote agents - recommends adding it locally first as the highest-leverage change. -- **Don't ship the OAuth token at all** — fall back to per-session login - or short-lived tokens. Operationally heavier, and the long-lived OAuth - token is the chosen design here precisely because it's portable across - hosts (Keychain on macOS, file on Linux). 
- -## Proxy auth: viable, with caveats - -Pattern: - -- Run a small reverse proxy as **root** inside the container, listening on - `127.0.0.1:N` (or a root-owned unix socket with `SO_PEERCRED` checks). -- Set `ANTHROPIC_BASE_URL=http://127.0.0.1:N` (or the socket path) in - claude's env. Claude as `node` only sees the URL, not the token. -- The proxy injects `Authorization: Bearer $TOKEN` and forwards to - `https://api.anthropic.com`. -- Token lives only in the root proxy's env; node-uid claude cannot read - `/proc/<pid>/environ` (kernel-enforced). - -`ANTHROPIC_BASE_URL` is documented as routing for proxies/gateways, not -just Bedrock/Vertex, and works alongside bearer auth. Confirmed gotchas: - -- **SSE streaming**: proxy must not buffer responses (nginx - `proxy_buffering off`, or use a streaming-aware proxy). Claude Code - uses SSE only — no websockets. -- **Forward `anthropic-version`, `anthropic-beta`, and - `X-Claude-Code-Session-Id` untouched** — stripping them breaks tool - use / extended thinking / session aggregation. -- **GitHub issue [#36998](https://github.com/anthropics/claude-code/issues/36998)**: - interactive mode historically bypassed `ANTHROPIC_BASE_URL` for some - startup calls (auth validation / org lookup), connecting directly to - `api.anthropic.com`. Marked closed but verify with `tcpdump` or - `strace -e connect` against the pinned 2.1.126 build before trusting - the isolation. -- **Tool search** (`ENABLE_TOOL_SEARCH`): disabled by default when - `ANTHROPIC_BASE_URL` is non-Anthropic; re-enable explicitly if needed. -- **Out-of-band outbound traffic** is the weak link. 
None of these route - through `ANTHROPIC_BASE_URL`: - - `statsig.anthropic.com` — telemetry - (disable: `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1`, - `DISABLE_TELEMETRY=1`) - - Sentry error reporting (disable: `DISABLE_ERROR_REPORTING=1`) - - `registry.npmjs.org`, `github.com`, `release-assets.githubusercontent.com` - — MCP installs and autoupdater - - `pypi.org`, `bun.sh` — if the Bash tool installs Python or Bun - packages during a session - - A hijacked claude could exfiltrate the captured token (or any other - data) through these channels even with the proxy in place. Pair the - proxy with an explicit egress allowlist (iptables / Docker network - policy) for the full benefit. -- **Token refresh**: `claude setup-token` issues a ~1-year token with no - client-side refresh, so a static proxy value is fine. -- **No request signing / anti-replay** on the Messages API; header - rewriting is safe. -- **`--bare` mode** does not read `CLAUDE_CODE_OAUTH_TOKEN` at all (only - `ANTHROPIC_API_KEY`). Not relevant to the interactive flow claude-bottle - ships, but worth noting if `--bare` is ever wired in. - -## Recommended path forward - -In priority order: - -1. **`--network none` + a localhost (or unix-socket) auth-injecting - proxy** that holds the token. Highest-leverage change: credential - isolation **and** egress containment in one pass. Aligns with the - recommendation already in `local-vs-remote-agent-execution.md`. -2. Layer in `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1` plus an explicit - egress allowlist (api.anthropic.com only, plus the per-agent set of - MCP / git / package-registry hosts) so a hijacked claude can't - exfiltrate through statsig / Sentry / npm. - -The current `docker run -e CLAUDE_CODE_OAUTH_TOKEN` pattern is fine for -argv hygiene on the host, but inside the container the token is fully -exposed to claude. The proxy pattern moves it across a real -kernel-enforced boundary. 
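The header-rewriting step at the core of the recommended proxy can be sketched as a pure function. This is an illustration under stated assumptions, not the claude-bottle implementation: `upstream_headers` and `_DROP` are hypothetical names, and a real proxy would also have to stream the SSE response body without buffering.

```python
from typing import Mapping

# Headers the proxy should not forward verbatim: any credential the client
# sent (replaced below) and Host (the upstream expects its own hostname).
_DROP = {"authorization", "host"}


def upstream_headers(incoming: Mapping[str, str], token: str) -> dict[str, str]:
    """Build the header set to send upstream.

    Everything else, including anthropic-version, anthropic-beta and
    X-Claude-Code-Session-Id, is passed through untouched.
    """
    out = {k: v for k, v in incoming.items() if k.lower() not in _DROP}
    out["Authorization"] = f"Bearer {token}"
    return out


if __name__ == "__main__":
    # Placeholder values for illustration only.
    client = {
        "Host": "127.0.0.1:8080",
        "Authorization": "Bearer placeholder",
        "anthropic-beta": "example-flag",
    }
    print(upstream_headers(client, token="real-token-held-by-proxy"))
```

Dropping the client-supplied `Authorization` header rather than merging it means the agent only needs a dummy value in its environment, which is the property the proxy pattern is after.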
- -## Sources - -- [Authentication — Claude Code Docs](https://code.claude.com/docs/en/authentication) -- [LLM gateway configuration — Claude Code Docs](https://code.claude.com/docs/en/llm-gateway) -- [Claude Code Environment Variables](https://code.claude.com/docs/en/env-vars) -- [GitHub issue #36998 — Interactive mode ignores ANTHROPIC_BASE_URL](https://github.com/anthropics/claude-code/issues/36998) -- [GitHub issue #11587 — Auth conflict: CLAUDE_CODE_OAUTH_TOKEN and apiKeyHelper](https://github.com/anthropics/claude-code/issues/11587) -- [proc_pid_environ(5) Linux manual page](https://man7.org/linux/man-pages/man5/proc_pid_environ.5.html) -- [Documenting ptrace access mode checking — LWN.net](https://lwn.net/Articles/692203/) -- [StepSecurity — Claude Code Action outbound network analysis](https://www.stepsecurity.io/blog/anthropics-claude-code-action-security-how-to-secure-claude-code-in-github-actions-with-harden-runner) -- [Manage API key environment variables — Claude Help Center](https://support.claude.com/en/articles/12304248-manage-api-key-environment-variables-in-claude-code) diff --git a/docs/research/secret-minimization-over-dlp.md b/docs/research/secret-minimization-over-dlp.md new file mode 100644 index 0000000..f98fc24 --- /dev/null +++ b/docs/research/secret-minimization-over-dlp.md @@ -0,0 +1,317 @@ +# Secret minimization beats DLP for the encoding/splitting attack class + +Why pipelock's body DLP and gitleaks's pre-receive scan cannot stop a +determined exfil attempt, and what architectural shifts actually +bound the damage. Companion to +[`secret-exfil-tripwire-encodings.md`](secret-exfil-tripwire-encodings.md) +(which covers the inverse — detecting a known secret across encodings) +and to +[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md) +(which establishes the gate / auth-injecting-proxy pattern, the +per-service mechanics for Anthropic / Gitea / GitHub, and the +existing tools in that space). 
+ +## Summary + +claude-bottle's v1 egress story is: pipelock allowlists hostnames, +intercepts TLS, body-scans every request against 48 builtin DLP +patterns, and blocks on hit. Gitleaks does the analog on `git push`. +Both are signature-based. Against a *determined* compromised or +prompt-injected agent, both lose: base64 / hex / gzip / split-across- +requests / custom-encoding all evade signature match, and one +exfil-channel — the Anthropic API conversation body — is deliberately +not scanned at all because pipelock's BIP-39 detector false-positives +on ordinary English prose. + +The fix is not "scan harder." It is the same shift that the git-gate +already makes for upstream push credentials: **keep the secret out of +the bottle in the first place.** The agent talks to a per-bottle +gate; the gate holds the credential and uses it on the agent's behalf. +Whatever the agent can encode and exfil, it cannot encode bytes it +never had. + +The encoding/splitting attack is fundamentally unbounded for any +secret the bottle holds, but it stops mattering once the bottle holds +fewer secrets. + +## Why egress scanning is a losing position + +Three independent failure modes, each sufficient on its own: + +### 1. Encoding evades signature match + +The 48 builtin pipelock patterns (and gitleaks's analog set) recognize +known credential *shapes*: `ghp_…`, `sk_live_…`, AWS access-key prefixes, +PEM block markers, etc. 
The same bytes under any of: + +- base64 / base64url / base32 / hex / ascii-85 +- gzip + base64 +- per-char numeric encoding +- ROT13, custom alphabet substitution +- splitting across request boundaries (first 20 chars in request A, + last 20 in request B) +- chunking via path segments instead of body bytes +- DNS subdomain encoding (out-of-scope for pipelock entirely; only + the v1 dnsmasq / iptables layer in `network-egress-guard.md` would + catch it) + +…do not match any pipelock or gitleaks pattern, because the patterns +are byte-literal on the canonical form. `secret-exfil-tripwire-encodings.md` +documents the detection side of this gap: no off-the-shelf tool emits +N encoded forms of a known secret for downstream matching, and even +the multi-encoding decoders in gitleaks 8.20+ / TruffleHog only undo +one layer. + +A motivated attacker controls the encoding. A static signature set +cannot win that race. + +### 2. The `api.anthropic.com` passthrough is unscanned by design + +`pipelock.py:DEFAULT_TLS_PASSTHROUGH` exempts `api.anthropic.com` +from TLS interception because pipelock's BIP-39 seed-phrase detector +false-positives on any 12+ consecutive English words from the BIP-39 +wordlist that happen to checksum cleanly — i.e., on a lot of normal +prose. The recommended treatment per pipelock's own docs is +`passthrough_domains`: the CONNECT host gate still applies, but body +is not decrypted. + +The agent's conversation channel is therefore wide open as an exfil +path. A prompt-injected agent that has been told a secret can ship +it to Anthropic as conversation text, formatted however it likes, +and pipelock sees only `CONNECT api.anthropic.com:443`. The +`CLAUDE_BOTTLE_OAUTH_TOKEN` itself rides this exact path. + +### 3. 
Out-of-band channels exist regardless + +Even with body DLP fully on, the v1 design knowingly leaves several +non-HTTP egress paths unscanned (called out as non-goals in PRD 0001 +§Non-goals): SSH session content via ssh-gate, raw TCP, UDP, ICMP, +TLS-SNI domain fronting on allowlisted hosts. The ssh-gate is dumb +L4 socat — `scp`, `-L`/`-R`/`-D` port forwards, and arbitrary +remote-shell content are all opaque to it. Any one of these is +sufficient to ship a secret the agent holds. + +## The shift: stop holding the secret + +The README already states the operative principle: + +> The container itself adds a layer between the agent and the host, +> but the v1 design leans more on secret minimization and egress +> allowlisting than on the container as a hardened boundary. + +The encoding/splitting class is the residual that *egress allowlisting +cannot solve* and that *secret minimization can*. The git-gate is the +proof-of-concept: the agent's effective capability is "push to +upstream," not "hold an SSH key." Moving the key from the agent into +the gate left the capability intact and the credential bytes +unreachable to the agent. The same move applies to anything where +the agent's actual need is *perform operation X against service Y*, +not *hold a credential*. + +[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md) +covers the gate pattern in depth: kernel-boundary mechanics, the +per-service wiring for `CLAUDE_CODE_OAUTH_TOKEN` / Gitea PAT / +GitHub PAT, the four proxy topologies, and the landscape of +existing tools. What follows is the same idea generalized to the +full secret set in a bottle. + +## Concrete mitigations, ordered by leverage + +### 1. Generalize the gate pattern to API credentials (highest leverage) + +For every credential the agent currently holds, ask: does the agent +*need* the bytes, or does it need to *perform an operation* the +credential authorizes? 
If the latter, the credential moves into a +per-bottle gate that: + +- Holds the secret in a process the agent cannot read + (kernel-enforced cross-UID `/proc/<pid>/environ` deny on Docker; + separate VM on smolmachines). +- Exposes a narrow surface (a localhost URL, or a service name on + the internal network). +- Optionally enforces a method/path allowlist so the credential's + full scope isn't blanket-granted. + +Two concrete instances worth implementing: + +**Anthropic-API gate.** Holds `CLAUDE_BOTTLE_OAUTH_TOKEN`. Agent's +`ANTHROPIC_BASE_URL` points at the gate; gate injects +`Authorization: Bearer …` and forwards to api.anthropic.com. The +token is no longer in the bottle's env. Once the token is out, +`DEFAULT_TLS_PASSTHROUGH` for api.anthropic.com can be dropped and +pipelock can body-scan the conversation channel with the noisy +patterns (BIP-39) disabled and only high-confidence ones (long +high-entropy strings, known token formats) left on. The known +gotchas — SSE streaming, header passthrough for `anthropic-version` +/ `anthropic-beta` / `X-Claude-Code-Session-Id`, out-of-band +telemetry to `statsig.anthropic.com` — are documented in +[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md) +§Anthropic / Claude Code. + +**Forge-API gate (Gitea / GitHub / GitLab).** Holds the PAT; +exposes a narrow REST surface. Token auth on all three is +stateless `Authorization`-header injection — no CSRF, no request +signing, no per-request nonce — so one proxy generalizes by +config. Per-service mechanics (Gitea uses +`Authorization: token <…>` not Bearer; `tea` bypasses git's +credential.helper; etc.) are in +[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md) +§Gitea / §GitHub / GitLab. + +Both compose with the existing git-gate (which addresses the push +credential) rather than replacing it. + +### 2. 
Stop passthrough'ing `api.anthropic.com` + +A direct consequence of #1: once the token is out of the bottle, +pipelock can decrypt and inspect the conversation channel. Replace +the broad BIP-39 detector with a tighter ruleset — high-entropy +strings ≥40 chars, known credential format prefixes, long base64 +blobs — and accept the residual false-positive rate. The unscanned +channel was the largest single exfil hole; closing it is worth a +modest reduction in DLP precision. + +If the gate work is deferred, an interim is to body-scan even with +the token in place: the token will be in every Anthropic request's +`Authorization` header anyway, and pipelock's header DLP is already +on. The risk is false-positive blocks on legitimate conversation +prose; opt-in initially. + +### 3. Egress data budget + +PRD 0001 §Resolution rules reserves `data_budget` as a future +field under the existing `egress` block. Cap small (e.g. 1 MiB +outbound per session to non-Anthropic hosts). This doesn't stop +one-secret one-time exfil — a 4 KiB token still fits — but: + +- Bulk exfil (`.git` history dumps, full credential stores, model + weights, large datasets) becomes visibly noisy or impossible. +- Repeated small exfil attempts trip the budget and produce a + log line / alert. +- Forces the attacker to be efficient; efficient channels are + easier to fingerprint after the fact. + +### 4. Pre-clean `.git` history when `--cwd` is used + +`provision/git.py:_provision_cwd_git` copies the host cwd's `.git` +into `/home/node/workspace/.git` wholesale. Git history contains +every secret ever committed, even if removed from HEAD. The agent +reads them with `git log --all -p` — a free, unscanned exfil +channel via any allowed host. + +Add a provision-time gitleaks scan of the host `.git` before copy, +and refuse to launch (or quarantine the offending commits into a +read-only orphan branch view) on hit. 
Doesn't help the legitimate +case where the user wants to debug a leaked-secret commit, but +gates the common case where the user is unaware of historical +secrets in the repo. Pairs with the push-side gitleaks already in +the git-gate. + +### 5. Manifest hygiene: explicit secret tagging + +Today `bottle.env` accepts any entry. A `?prompt-at-runtime` or +`${HOST_VAR}` interpolation has no special status vs. a literal — +the manifest can't tell which entries are credential material and +which are configuration. + +Add a `secret: true` flag (or a `secrets:` sibling of `env:`) that: + +- Surfaces those entries prominently in the y/N preflight + ("this bottle carries 3 secret values: GITEA_TOKEN, GH_PAT, + AWS_SECRET_ACCESS_KEY"). +- Refuses to launch if `egress.allowlist` contains any host that + is not source-controlled by the user (heuristic: not on a + built-in `KNOWN_FORGE_HOSTS` list). +- Forces an explicit acknowledgement that a credential is being + placed into the bottle rather than behind a gate. + +Forces the design pressure toward gates for any new credential +type, rather than reaching for `bottle.env`. + +### 6. Detection over prevention for the residual + +For everything that remains after the above — and a residual will +always remain — log every outbound request to allowlisted hosts in +structured form: timestamp, host, path, byte count, content +hash, request shape. Don't block; alert. Offline analysis surfaces +per-host byte-count outliers, entropy histograms, base64-shaped +path/body distributions, and repeat patterns. + +This doesn't catch the first careful exfil. It does catch repeated +abuse, scaled exfil, and the second-attempt-after-discovery, and +provides forensic trail for the first incident. Pipelock's +stdout-only logging today (PRD 0001 §Out of scope: "Audit logging +or persistent log storage of pipelock decisions") is the precursor; +this would be its persistent / structured follow-up. + +### 7. 
Shorten OAuth token lifetime + +The `claude setup-token`-issued OAuth token is ~1 year. If a leak +happens, blast radius is a year of access to the user's Pro/Max +subscription. Even without the gate (#1), encouraging users to +rotate monthly bounds the damage. With the gate, the gate can in +principle negotiate fresh short-lived tokens per session (depends +on whether Anthropic ever ships an OAuth refresh / child-token flow +— closed today per `claude-code-token-revocation.md`). + +## What you don't get + +Even with all of the above: + +- An agent that can talk to *any* LLM can use the conversation as a + covert channel — encode bits in word choice, response length, + formatting. Pipelock can scan for known patterns but cannot stop + arbitrary content over a channel whose purpose is arbitrary content. +- An agent with shell access to any remote host via ssh-gate can + exfil through the SSH session. The ssh-gate is L4 by design (PRD + 0007 §Non-goals: "SSH-protocol awareness"); generalizing the + gate pattern to SSH content inspection is the analog of building + the git-gate for arbitrary SSH operations, which is out of scope. +- A determined adversary always wins against a single boundary. + The point of stacking the above is to make the cost of the *first* + exfil higher than the value of the secrets in scope, and to make + *repeated* exfil detectable. + +## Recommended path forward + +In priority order: + +1. **Anthropic-API gate** + ([`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md) + §Recommended). Removes the highest-value secret and closes the + passthrough hole as a side effect. +2. **Forge-API gate** (same doc, same section — one proxy + generalizes across Gitea / GitHub / GitLab by config). +3. **Egress data budget** in pipelock — small lift, large damage + bound. +4. **Pre-launch `.git` gitleaks scan** for `--cwd` bottles. +5. **`secret: true` manifest flag** + preflight surfacing. +6. 
**Structured pipelock logs** persisted somewhere durable; offline + analysis as a separate follow-up. + +The encoding/splitting attack stops being load-bearing once the +bottle doesn't *have* the secret the attacker wants to encode. Every +item above is a step toward that property. + +## References + +- `agent-credential-proxy-landscape.md` — gate pattern, kernel + boundary mechanics, per-service wiring (Anthropic / Gitea / + GitHub / GitLab), proxy topologies, landscape of existing + tools (Docker AI Sandboxes, Cloudflare Sandbox Auth, + Infisical Agent Vault, nono, Aembit, LiteLLM, Portkey, …) + and a build-vs-adopt verdict. +- `secret-exfil-tripwire-encodings.md` — inverse problem: detecting + a known secret across encodings; explains why generic DLP can't + enumerate the encoded forms. +- `network-egress-guard.md` — v1 iptables / dnsmasq baseline; the + layer below pipelock that catches DNS-subdomain exfil pipelock + doesn't see. +- `pipelock-assessment.md` — why pipelock was chosen, what its DLP + scanners do and don't cover. +- `claude-code-token-revocation.md` — Anthropic OAuth refresh + story; informs the token-lifetime mitigation. +- PRD 0001 — pipelock topology and the `data_budget` reservation. +- PRD 0006 — TLS interception; defines the + `DEFAULT_TLS_PASSTHROUGH` set this doc proposes to drop. +- PRD 0008 — git-gate; the proof-of-concept for the gate pattern. diff --git a/docs/research/tea-token-isolation-via-proxy.md b/docs/research/tea-token-isolation-via-proxy.md deleted file mode 100644 index d565ebb..0000000 --- a/docs/research/tea-token-isolation-via-proxy.md +++ /dev/null @@ -1,189 +0,0 @@ -# Isolating the Gitea `tea` token via an auth-injecting proxy - -Research into whether authentication for the `tea` CLI (Gitea's command-line -client) can be brokered by a proxy so the access token never enters the -container — not as an env var, not in `~/.config/tea/config.yml`. 
Parallel -question to `oauth-token-exposure-to-claude.md`, but for the Gitea credential -rather than the Anthropic one. - -## Summary - -Yes. `tea` itself has no credential-helper hook, so the leverage point is on -the wire: a root-owned reverse proxy inside the container holds the token and -injects `Authorization: token <…>` on every forwarded request. `tea` is -configured with `--url` pointing at the proxy and a dummy/empty token; the -proxy talks to the real Gitea over TLS. The same kernel boundary that hides -the OAuth token in the proxy pattern (a `node`-uid claude cannot read -`/proc//environ` under default Docker) hides the Gitea token here. -Git HTTPS push gets the same treatment by rewriting the remote URL to go -through the proxy. The unavoidable tradeoff is that a hijacked claude can -*use* the token's full scope but cannot *exfiltrate* the bytes — a strict -improvement over the env-var status quo, not a panacea. - -## How `tea` authenticates - -`tea` reads the token from three places, in precedence order: - -1. **`GITEA_SERVER_TOKEN` env var** — - `cmd/login/add.go` registers it via `cli.EnvVars("GITEA_SERVER_TOKEN")`. -2. **`~/.config/tea/config.yml`** (XDG) or `~/.tea/tea.yml` (legacy fallback) - — the token is stored in plaintext YAML under the `token` field of the - login entry. -3. **OS credstore** — only for OAuth logins; PAT-based logins go to the - YAML file. - -There is **no `credential.helper` analogue**: no `--token-file`, no -FD-passing, no socket-based credential protocol. The only ways to feed `tea` -a token are env var or config file, both of which are readable by the -process holding them. So the token can't be hidden *inside* `tea`'s -process — it has to be held by a *different* process the agent cannot -read. - -For HTTPS git operations, `tea` uses `go-git` directly with -`BasicAuth{Username: token, Password: ""}` (`modules/git/auth.go`), bypassing -git's own credential.helper machinery. 
This matters: a credential-helper -shim alone won't intercept `tea repo clone` — the proxy has to sit on the -HTTP path itself. - -## Why a proxy is the only credible boundary - -Same logic as the OAuth note. Linux has no per-env-var ACL: once a var is in -a process's `environ`, the process owns it. The lever is process boundary, -not env-var ACL. Default Docker enforces that boundary at the kernel level -via `ptrace_may_access`: a `node`-uid claude trying to read a root-owned -proxy's `/proc//environ` gets `EACCES` without `CAP_SYS_PTRACE`, -`CAP_PERFMON`, or `--privileged`. claude-bottle uses none of those. - -Gitea's API is friendly to header-injecting proxies: token auth is -stateless, no CSRF, no request signing, no per-request nonces. Use the -`Authorization: token <…>` form; an old Gitea bug -([go-gitea/gitea#16734](https://github.com/go-gitea/gitea/issues/16734)) -emitted spurious "missing CSRF token" errors for the `Bearer` form on some -endpoints. The fix landed upstream, but `token` has always been the -header-safe choice. - -## Proxy architectures - -Four shapes worth comparing: - -- **In-container reverse proxy (recommended).** Root-owned process inside - the container listens on a non-loopback address (e.g. a Docker bridge IP - or an alias). Token is passed via `docker run -e GITEA_TOKEN`, inherited - only by the root proxy, never by `node`. `tea login add --url - http://:` writes a config file whose `token` field is empty - or a dummy. Git HTTPS uses a rewritten remote pointing at the same proxy. - Pros: simple, self-contained per agent, no host changes, no MITM CA. - Cons: requires a non-loopback bind (see below). - -- **In-container forward proxy with TLS termination.** Root-owned mitmproxy - intercepts outbound HTTPS, terminates with a container-local CA, injects - the header, re-encrypts. `tea` keeps the real Gitea URL and `HTTPS_PROXY` - points at the proxy. 
**Critical Go quirk**: `net/http` ignores - `HTTPS_PROXY` when the proxy address is `127.0.0.1` or `localhost` - ([golang/go#28866](https://github.com/golang/go/issues/28866)). Workaround - is the same — bind on a non-loopback address — and you also pay for CA - trust setup. Worth it only if you need transparent interception of - multiple unrelated hosts. - -- **Host-side proxy.** Proxy runs on macOS; container reaches it via - `host.docker.internal:`. The UDS-across-VM constraint already noted - in CLAUDE.md (Docker Desktop on macOS does not forward unix-socket - `connect()` across the VM) does not apply — TCP via `host.docker.internal` - works fine, and the Go loopback bypass isn't an issue because the target - is not `127.0.0.1`. Pros: token stays entirely outside the Linux VM. - Cons: a host daemon to maintain, and the published port is reachable by - any container on the host unless firewalled. This is the architecture - Docker's own AI Sandbox product uses. - -- **Sidecar container.** Token-holding container in a shared Docker - network. Pros: clean isolation, portable across hosts. Cons: a second - container to orchestrate per agent; the token is in another container's - env, which is a lateral move rather than a deeper boundary unless the - sidecar runs with stricter isolation than the agent container. - -For claude-bottle's threat model — local Docker, per-agent containers, -already comfortable with root-owned helpers (the SSH agent precedent) — -the in-container reverse proxy is the lowest-friction option that gives -the desired property. - -## Caveats and gotchas - -- **Bind on a non-loopback address.** Required for forward-proxy use because - of golang/go#28866; harmless for the reverse-proxy case but worth doing - consistently so the same proxy works for both shapes. A Docker network - alias or `ip addr add 10.0.0.1/32 dev lo` works. -- **Use `Authorization: token <…>`, not `Bearer`.** Avoids the legacy - CSRF-error path on older Gitea versions. 
-- **`tea` git operations bypass git's credential.helper.** A credential - helper shim is not enough; the proxy must sit on the HTTP path. Plain - `git push` from claude can use either the proxy (rewritten remote URL) or - a credential-helper shim that calls the proxy — the rewritten-remote - approach keeps the token bytes out of git's credential negotiation - entirely. -- **Token scope is the blast radius.** A pass-through proxy grants the - agent the token's full API scope. Mitigate with fine-grained Gitea token - scopes (`repo:write` only, no `admin`), an HTTP method/path allowlist at - the proxy, rate limits, and audit logging. None of these prevent abuse — - they bound and observe it. -- **No exfil isn't no harm.** A hijacked claude can still push branches, - open PRs, and do whatever the token's scope permits. Pair the proxy with - the egress-guard work in `network-egress-guard.md` for the full benefit; - the two compose cleanly because the proxy is itself an explicit egress - endpoint. -- **`tea` config file is no longer authoritative.** The launcher must run - `tea login add` against the proxy URL (or write a config file directly) - before claude starts, otherwise the agent will hit "no logins configured." - Empty-token configs are accepted. - -## Prior art - -This is a known pattern with several recent named implementations: - -- **Docker AI Sandboxes** — host-side intercepting proxy that overwrites the - auth header; token stays on host, container sees a `proxy-managed` - placeholder. Closest analog to what claude-bottle would build. -- **Cloudflare Sandbox Auth** — programmable egress with per-sandbox MITM - CA for credential injection. -- **Infisical agent-vault** — open-source TLS-intercepting forward proxy - purpose-built for AI-agent workloads. Research preview as of early 2026. -- **AWS IMDSv2** — the canonical credential broker on - `169.254.169.254`; same shape, different problem domain. - -## Recommended path forward - -In priority order: - -1. 
**In-container reverse proxy holding the Gitea token.** Add a manifest - field (e.g. `gitea: { url, tokenRef }`) so a per-agent token reference - resolves at launch time, the proxy starts as root before `node` is - exec'd, and `tea` plus git remotes are pre-configured to point at the - proxy. Reuse the same root-owned-helper pattern the SSH agent already - establishes. -2. **Scope-narrow the Gitea token** at issuance — `repo:write` for the - target repo, no `admin`, no user management. This is the cheapest single - thing to do and bounds blast radius regardless of whether the proxy - ships. -3. **Allowlist at the proxy** once usage is stable. Method + path filter - keyed off the agent's actual Gitea calls; reject everything else. -4. **Compose with `network-egress-guard.md`.** The proxy is one egress - endpoint; the egress guard enforces that nothing else escapes. - -The current `docker run -e GITEA_TOKEN`-style pattern is fine for argv -hygiene, but inside the container the token is fully exposed to claude. -The proxy moves it across a kernel-enforced boundary — same property the -SSH agent already gives us for keys. 
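Steps 1 and 3 above (inject the Gitea header, then filter by method and path) can be sketched together. This is an assumption-laden illustration: the `ALLOWED` rules, `is_allowed`, and `gitea_auth_header` are hypothetical names, and the real rule set would be derived from the agent's observed Gitea calls.

```python
import re

# Hypothetical allowlist: (HTTP method, regex the whole path must match).
# [^/]+ keeps each placeholder to a single path segment.
ALLOWED = [
    ("GET", re.compile(r"/api/v1/repos/[^/]+/[^/]+/pulls")),
    ("POST", re.compile(r"/api/v1/repos/[^/]+/[^/]+/pulls")),
    ("GET", re.compile(r"/api/v1/repos/[^/]+/[^/]+/issues")),
]


def is_allowed(method: str, path: str) -> bool:
    """Return True only if (method, path) matches an allowlist entry."""
    path = path.split("?", 1)[0]  # ignore the query string when matching
    return any(m == method.upper() and p.fullmatch(path) for m, p in ALLOWED)


def gitea_auth_header(token: str) -> dict[str, str]:
    """Use the `token` scheme, which the note recommends over `Bearer`."""
    return {"Authorization": f"token {token}"}


if __name__ == "__main__":
    print(is_allowed("GET", "/api/v1/repos/me/proj/pulls?state=open"))
    print(is_allowed("DELETE", "/api/v1/repos/me/proj"))
```

A default-deny list like this bounds what a hijacked agent can do with the token's scope; it does not replace issuing a narrowly scoped token in the first place.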
- -## Sources - -- [tea source — `cmd/login/add.go`](https://gitea.com/gitea/tea/src/branch/main/cmd/login/add.go) -- [tea source — `modules/config/config.go`](https://gitea.com/gitea/tea/src/branch/main/modules/config/config.go) -- [tea source — `modules/git/auth.go`](https://gitea.com/gitea/tea/src/branch/main/modules/git/auth.go) -- [Gitea API Usage docs](https://docs.gitea.com/development/api-usage) -- [go-gitea/gitea#16734 — `Authorization: Bearer` triggers spurious CSRF error](https://github.com/go-gitea/gitea/issues/16734) -- [golang/go#28866 — `net/http` ignores `HTTPS_PROXY` for `127.0.0.1`/`localhost`](https://github.com/golang/go/issues/28866) -- [Docker AI Sandbox credentials docs](https://docs.docker.com/ai/sandboxes/security/credentials/) -- [Cloudflare Sandbox Auth blog](https://blog.cloudflare.com/sandbox-auth/) -- [Infisical agent-vault — GitHub](https://github.com/Infisical/agent-vault) -- [Infisical agent-vault — blog post](https://infisical.com/blog/agent-vault-the-open-source-credential-proxy-and-vault-for-agents) -- [AWS IMDSv2 documentation](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html) -- [git credential helper docs](https://git-scm.com/docs/gitcredentials)