docs(research): add credential-proxy landscape and DLP-minimization framing

Consolidates oauth-token-exposure-to-claude.md and
tea-token-isolation-via-proxy.md into agent-credential-proxy-landscape.md,
adding a May-2026 survey of existing tools (Docker AI Sandboxes, Cloudflare
Sandbox Auth, Infisical Agent Vault, nono, Aembit, LiteLLM CVE-2026-42208,
Portkey, Helicone, etc.) and a build-vs-adopt verdict.

Adds secret-minimization-over-dlp.md explaining why pipelock's body DLP and
gitleaks's pre-receive scan cannot stop encoding/splitting exfil, and why
moving credentials out of the bottle (the git-gate pattern, generalized) is
the only robust answer.

Updates git-secret-scanning-hardening.md's reference to point at the new
consolidated landscape doc.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@@ -0,0 +1,392 @@
# Agent credential proxy landscape

Consolidated research on running an auth-header-injecting proxy in
front of an AI agent so API tokens stay out of the agent's process
space. Folds in the per-service mechanics for the Anthropic OAuth
token and the Gitea PAT (the two cases claude-bottle hits first)
and surveys existing tools as of May 2026.

Companion to
[`secret-minimization-over-dlp.md`](secret-minimization-over-dlp.md)
(the architectural framing: why this matters), and to
[`local-vs-remote-agent-execution.md`](local-vs-remote-agent-execution.md)
(the broader threat model that flagged long-lived static tokens as
the biggest credential risk).

## Summary

Today every claude-bottle agent gets `CLAUDE_CODE_OAUTH_TOKEN` (and
any `bottle.env` secrets like a Gitea PAT) injected as env vars,
which means the agent process can read them with `printenv` or
`/proc/self/environ`. A prompt-injected or hijacked agent can ship
those bytes to any allowed host. Linux has no primitive for
"this env var exists in my process but I can't read it". The only
credible boundary is to put the credential in a *different* process
that the agent cannot read, and let the agent talk to it over a
narrow API. Default Docker enforces that boundary at the kernel
level via `ptrace_may_access`; a future smolmachines backend
would enforce it more strongly, at the VM boundary.

Several existing tools implement this pattern, but none is
a clean drop-in for claude-bottle today: the most architecturally
aligned (nono) is alpha; the most mature open-source option
(Infisical Agent Vault) requires TLS MITM and would duplicate
pipelock's TLS-interception stack. For the Anthropic-token slice, a
small claude-bottle-specific reverse proxy modeled on the
phantom-token shape is probably the right call. For Gitea / GitHub /
GitLab, the same proxy generalizes by config.

## The shared problem
|
||||
|
||||
Linux has no per-env-var ACL. Once a var is in a process's
|
||||
`environ`, the process and its descendants own it. The deeper
|
||||
boundary is **process-level**: hold the credential in a process the
|
||||
agent cannot read.

Default Docker enforces that boundary for you. The kernel's
`ptrace_may_access` check rejects `/proc/<pid>/environ` reads when
the caller's UID/GID don't match the target's and the caller lacks
`CAP_SYS_PTRACE` or `CAP_PERFMON`. A `node`-uid claude attempting to
read a root-owned proxy's environ gets `EACCES`. Escape hatches
(`--cap-add=SYS_PTRACE`, `--cap-add=PERFMON`, `--privileged`) are
not used by claude-bottle. Yama `ptrace_scope` is irrelevant here:
it only adds restrictions on top of the *same-UID* case, so it
cannot loosen the cross-UID check that already blocks the read.
On a smolmachines backend the boundary becomes the VM boundary:
the same property, enforced more strongly.
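
The read that this boundary blocks can be probed directly. The following is a minimal sketch, assuming a Linux-style `/proc` layout. The helper name `read_environ` and its `proc_root` parameter are hypothetical; the parameter exists only so the parsing can be exercised against a fake directory.

```python
import os


def read_environ(pid, proc_root="/proc"):
    """Return the target's environment as a dict, or None if the read is denied."""
    path = os.path.join(proc_root, str(pid), "environ")
    try:
        with open(path, "rb") as fh:
            raw = fh.read()
    except PermissionError:
        # The expected outcome for a node-uid caller and a root-owned target.
        return None
    env = {}
    # Entries are NUL-separated KEY=VALUE pairs; values may contain '='.
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env
```

Run as `node` inside the bottle, `read_environ("self")` should return the agent's own variables, while a call with the PID of a root-owned proxy should return `None`. That pair of results is worth asserting in a smoke test before trusting the isolation.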
|
||||
|
||||
claude-code's `apiKeyHelper` setting is **not** a boundary. The
|
||||
helper is invoked by claude's own process, so claude can just call
|
||||
it via Bash and capture stdout. Same trust domain.
|
||||
|
||||
The remaining credible designs reduce to three:
|
||||
|
||||
- **Header-injecting reverse proxy** — agent points at a localhost
|
||||
URL; proxy holds the credential; proxy adds the auth header and
|
||||
forwards. Cleanest fit for services that support a `BASE_URL`-style
|
||||
override (Anthropic, OpenAI, Portkey, etc.).
|
||||
- **Forward proxy with TLS termination** — agent keeps the real
|
||||
service URL; an `HTTPS_PROXY` MITM intercepts, terminates TLS with
|
||||
a container-local CA, injects the header, re-encrypts. Heavier;
|
||||
required when the agent's tool can't be pointed at an explicit URL.
|
||||
- **Don't ship the token at all** — fall back to per-session login
|
||||
or short-lived child tokens. Operationally heavier; the long-lived
|
||||
OAuth token was chosen precisely because it's portable
|
||||
(Keychain on macOS, file on Linux).
|
||||
|
||||
## Per-service mechanics
|
||||
|
||||
### Anthropic / Claude Code
|
||||
|
||||
**Today's wiring** (`claude_bottle/cli/start.py`): the host's
|
||||
`CLAUDE_BOTTLE_OAUTH_TOKEN` is forwarded into the bottle as
|
||||
`CLAUDE_CODE_OAUTH_TOKEN` via `docker run -e CLAUDE_CODE_OAUTH_TOKEN`
|
||||
(no `=value`, so the value never lands on argv — good). Inside the
|
||||
bottle, claude runs as `node` (UID 1000) with
|
||||
`--dangerously-skip-permissions`. Its Bash tool can do
|
||||
`printenv CLAUDE_CODE_OAUTH_TOKEN`, `cat /proc/self/environ`,
|
||||
`node -e 'console.log(process.env)'` and capture the value into
|
||||
the conversation. The DLP / egress story
|
||||
([`secret-minimization-over-dlp.md`](secret-minimization-over-dlp.md))
|
||||
explains why scanning on the way out doesn't save you here.
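
The argv-hygiene point above can be illustrated with a short sketch. It assumes the launcher shells out to `docker run`; the function names `docker_run_argv` and `launcher_env` are hypothetical and are not taken from `start.py`.

```python
import os


def docker_run_argv(image, forwarded_names):
    """Build a `docker run` argv that forwards env vars by name only."""
    argv = ["docker", "run", "--rm"]
    for name in forwarded_names:
        # `-e NAME` with no `=value` makes docker copy the value from its
        # own environment, so the secret never appears on the command line.
        argv += ["-e", name]
    argv.append(image)
    return argv


def launcher_env(token, base_env=None):
    """Environment for the docker CLI process; the value travels here, not on argv."""
    env = dict(os.environ if base_env is None else base_env)
    env["CLAUDE_CODE_OAUTH_TOKEN"] = token
    return env
```

A launcher would pass both to `subprocess.run(argv, env=env)`. This keeps the token out of the host's process list only; inside the container the variable is still readable by the agent, which is the problem the rest of this note addresses.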
|
||||
|
||||
**Routing primitive:** `ANTHROPIC_BASE_URL` is documented as a
|
||||
generic proxy/gateway override, not just Bedrock/Vertex, and works
|
||||
alongside bearer auth. The proxy sets
|
||||
`Authorization: Bearer $TOKEN` and forwards to
|
||||
`https://api.anthropic.com`. Claude as `node` only sees the URL,
|
||||
never the token.
|
||||
|
||||
**Confirmed gotchas:**
|
||||
|
||||
- **SSE streaming**: the proxy must not buffer responses (nginx
|
||||
`proxy_buffering off`, or a streaming-aware proxy). Claude Code
|
||||
uses SSE only — no websockets.
|
||||
- **Forward `anthropic-version`, `anthropic-beta`, and
|
||||
`X-Claude-Code-Session-Id` untouched.** Stripping them breaks tool
|
||||
use / extended thinking / session aggregation.
|
||||
- **GitHub issue
|
||||
[#36998](https://github.com/anthropics/claude-code/issues/36998)**:
|
||||
interactive mode historically bypassed `ANTHROPIC_BASE_URL` for
|
||||
some startup calls (auth validation / org lookup), connecting
|
||||
directly to api.anthropic.com. Marked closed but verify with
|
||||
`tcpdump` or `strace -e connect` against the pinned claude-code
|
||||
build before trusting the isolation.
|
||||
- **Tool search** (`ENABLE_TOOL_SEARCH`) is disabled by default when
|
||||
`ANTHROPIC_BASE_URL` is non-Anthropic; re-enable explicitly if
|
||||
needed.
|
||||
- **Out-of-band outbound traffic** does *not* route through
|
||||
`ANTHROPIC_BASE_URL`:
|
||||
- `statsig.anthropic.com` — telemetry
|
||||
(disable: `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1`,
|
||||
`DISABLE_TELEMETRY=1`)
|
||||
- Sentry error reporting (disable: `DISABLE_ERROR_REPORTING=1`)
|
||||
- `registry.npmjs.org`, `github.com`,
|
||||
`release-assets.githubusercontent.com` — MCP installs + autoupdater
|
||||
- `pypi.org`, `bun.sh` — if the Bash tool installs Python or Bun
|
||||
packages during a session
|
||||
|
||||
A hijacked claude could exfil the captured token (or any other
|
||||
data) through any of these even with the proxy in place. Pair
|
||||
the proxy with an explicit egress allowlist for the full benefit
|
||||
(claude-bottle does this via pipelock).
|
||||
- **Token refresh**: `claude setup-token` issues a ~1-year OAuth
|
||||
token with no client-side refresh, so a static proxy value is
|
||||
fine. The flip side is a one-year blast radius if the token leaks
|
||||
— see
|
||||
[`claude-code-token-revocation.md`](claude-code-token-revocation.md).
|
||||
- **No request signing / anti-replay** on the Messages API; header
|
||||
rewriting is safe.
|
||||
- **`--bare` mode** reads only `ANTHROPIC_API_KEY`, not
|
||||
`CLAUDE_CODE_OAUTH_TOKEN`. Not relevant to the interactive flow
|
||||
claude-bottle ships, but worth noting if `--bare` is ever wired in.
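
Several of the gotchas above (header injection, untouched `anthropic-*` headers, unbuffered streaming) fit in a small amount of code. The following is a rough sketch using only the Python standard library, not a hardened implementation. The names `build_upstream_headers` and `make_handler` are hypothetical, and answering as HTTP/1.0 (body delimited by connection close) is a simplification chosen here to avoid re-chunking; it has no error handling or timeouts tuned for production.

```python
import http.client
import http.server
import urllib.parse

# Headers the proxy regenerates itself rather than copying through.
SKIP_HEADERS = {"connection", "keep-alive", "proxy-authenticate",
                "proxy-authorization", "te", "trailers", "transfer-encoding",
                "upgrade", "host", "content-length"}


def build_upstream_headers(incoming, token):
    """Copy client headers, drop any client-supplied auth, add the real credential."""
    out = {}
    for name, value in incoming:
        if name.lower() in SKIP_HEADERS or name.lower() == "authorization":
            continue
        out[name] = value  # anthropic-version, anthropic-beta, etc. pass through
    out["Authorization"] = f"Bearer {token}"
    return out


def make_handler(upstream, token):
    target = urllib.parse.urlsplit(upstream)
    conn_cls = (http.client.HTTPSConnection if target.scheme == "https"
                else http.client.HTTPConnection)

    class Handler(http.server.BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.0"  # close after each response

        def _forward(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length) if length else None
            conn = conn_cls(target.netloc, timeout=600)
            conn.request(self.command, self.path, body=body,
                         headers=build_upstream_headers(self.headers.items(), token))
            resp = conn.getresponse()
            self.send_response(resp.status)
            for name, value in resp.getheaders():
                if name.lower() not in SKIP_HEADERS:
                    self.send_header(name, value)
            self.end_headers()
            # Relay bytes as they arrive so SSE events are not buffered.
            while chunk := resp.read1(65536):
                self.wfile.write(chunk)
                self.wfile.flush()
            conn.close()

        do_GET = do_POST = _forward

    return Handler
```

Serving it could look like `http.server.ThreadingHTTPServer(("127.0.0.1", 8787), make_handler("https://api.anthropic.com", token)).serve_forever()` with `ANTHROPIC_BASE_URL=http://127.0.0.1:8787` in the agent's environment; the port is an arbitrary example. The issue #36998 check with `tcpdump` or `strace -e connect` still applies regardless of how the proxy is written.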
|
||||
|
||||
### Gitea (`tea` + git HTTPS)
|
||||
|
||||
**Token sources, in precedence order:**
|
||||
|
||||
1. **`GITEA_SERVER_TOKEN` env var** — registered via
|
||||
`cli.EnvVars("GITEA_SERVER_TOKEN")` in `cmd/login/add.go`.
|
||||
2. **`~/.config/tea/config.yml`** (XDG) or `~/.tea/tea.yml`
|
||||
(legacy fallback) — plaintext YAML, `token` field under the
|
||||
login entry.
|
||||
3. **OS credstore** — OAuth logins only; PAT-based logins go to
|
||||
the YAML file.
|
||||
|
||||
There is **no `credential.helper` analogue**: no `--token-file`,
|
||||
no FD-passing, no socket-based credential protocol. So the token
|
||||
can't be hidden *inside* `tea`'s process — it has to be held by a
|
||||
*different* process the agent cannot read. For HTTPS git
|
||||
operations, `tea` uses `go-git` directly with
|
||||
`BasicAuth{Username: token, Password: ""}`
|
||||
(`modules/git/auth.go`), bypassing git's own credential.helper
|
||||
machinery. A credential-helper shim alone won't intercept
|
||||
`tea repo clone` — the proxy has to sit on the HTTP path itself.
|
||||
|
||||
**Header form:** use `Authorization: token <…>`, **not** `Bearer`.
[go-gitea/gitea#16734](https://github.com/go-gitea/gitea/issues/16734)
reported that Gitea returned spurious "missing CSRF token" errors for
`Bearer` on some endpoints. The fix landed upstream, but `token` has
always been the safe choice of scheme.
|
||||
|
||||
**No CSRF / no per-request nonce** on the Gitea API for token
|
||||
auth, so a header-rewriting proxy is safe.
|
||||
|
||||
**Plain `git push`** from claude can use either the proxy
|
||||
(rewritten remote URL) or a credential-helper shim that calls the
|
||||
proxy. The rewritten-remote approach keeps the token bytes out of
|
||||
git's credential negotiation entirely. (Note: this is parallel to
|
||||
the existing git-gate in PRD 0008, which solves the SSH-push case
|
||||
via a per-bottle mirror.)
|
||||
|
||||
### GitHub / GitLab
|
||||
|
||||
Structurally identical to Gitea for PAT auth: stateless
|
||||
`Authorization: Bearer <…>` (GitHub PATs and GitLab PATs both
|
||||
accept Bearer, and GitHub also accepts `token <…>` for legacy
|
||||
clients), no CSRF, no signing. Per-route allowlisting at the
|
||||
proxy is the lever for narrowing blast radius. GitHub fine-grained
|
||||
PATs and GitLab project-access tokens are the issuance-side
|
||||
mitigation. Either composes cleanly with the same proxy.
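
The per-forge difference in header form is small enough to capture in one lookup. A minimal sketch follows; the function name `forge_auth_header` is hypothetical, and the `kind` strings simply mirror the forge names used in this note.

```python
def forge_auth_header(kind, token):
    """Return the (name, value) auth header a proxy would inject for a forge."""
    schemes = {
        "gitea": "token",    # Gitea: `Authorization: token <...>`
        "github": "Bearer",  # GitHub and GitLab PATs both accept Bearer
        "gitlab": "Bearer",
    }
    try:
        scheme = schemes[kind]
    except KeyError:
        raise ValueError(f"unknown forge kind: {kind!r}") from None
    return ("Authorization", f"{scheme} {token}")
```

Failing loudly on an unknown kind is a deliberate choice in this sketch: silently defaulting to one scheme would send the credential in a form the upstream may reject or mishandle.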
|
||||
|
||||
## Proxy architectures
|
||||
|
||||
Four shapes worth comparing. The first is the lowest-friction
|
||||
match for claude-bottle today.
|
||||
|
||||
| Shape | Pros | Cons |
|---|---|---|
| **In-container reverse proxy** (recommended) | Self-contained per agent, no host changes, no MITM CA, no Go-loopback workaround. Works for any service with a `BASE_URL`-style override (Anthropic, OpenAI, Portkey). | Doesn't work for services that hardcode the upstream URL; those require either rewriting the client config or moving up to a forward proxy. |
| **In-container forward proxy + TLS termination** | Transparent to the agent's tooling: every HTTPS request gets intercepted regardless of base-URL support. | Needs a container-local CA in the trust store (same machinery PRD 0006 set up for pipelock). Has the `golang/go#28866` loopback gotcha: `net/http` ignores `HTTPS_PROXY` when set to `127.0.0.1`/`localhost`, so the proxy must bind on a non-loopback address (Docker bridge IP, `host.docker.internal`, or `ip addr add 10.0.0.1/32 dev lo`). |
| **Host-side proxy** | Token stays entirely outside the Linux VM. This is the Docker AI Sandbox shape. | A host daemon to maintain; the published port is reachable by any container on the host unless firewalled. UDS-across-VM doesn't work on Docker Desktop on macOS (no AF_UNIX `connect()` over the VM), but `host.docker.internal:<port>` over TCP works fine. |
| **Sidecar container** | Clean isolation; portable across hosts. Matches the existing pipelock / ssh-gate / git-gate topology. | Another container to orchestrate per agent; the token is in another container's env, which is a lateral move unless the sidecar runs with stricter isolation than the agent container does. |
|
||||
|
||||
For claude-bottle today — local Docker, per-agent containers, the
|
||||
root-owned-helper pattern already established by the SSH agent —
|
||||
the **in-container reverse proxy** is the lowest-friction option
|
||||
that gives the desired property. The sidecar-container shape is
|
||||
the natural evolution if the proxy needs the same per-bottle
|
||||
isolation that pipelock has.
|
||||
|
||||
## Landscape of existing tools (May 2026)
|
||||
|
||||
Two categories:
|
||||
|
||||
- **A. Generic LLM / API gateways** that happen to support credential
|
||||
injection as a side feature.
|
||||
- **B. Purpose-built agent credential brokers** — newer, closer to
|
||||
what claude-bottle wants.
|
||||
|
||||
| Tool | Category | License | Topology | Injection mechanism | `ANTHROPIC_BASE_URL` compatible | Per-route allowlist | Maturity |
|---|---|---|---|---|---|---|---|
| **Docker AI Sandboxes** | B | Proprietary | Host-side proxy | Header overwrite, OS keychain | No (intercepts by domain) | Domain only | GA (Mar 2026) |
| **Cloudflare Sandbox Auth** | B | Proprietary | Sandbox sidecar + ephemeral CA | TLS intercept + Outbound Worker | No (platform-specific) | Host/IP/method | GA (Apr 2026) |
| **Infisical Agent Vault** | B | MIT (EE carve-out) | In-process HTTPS_PROXY forward proxy | TLS MITM, dummy-to-real swap | No (HTTPS_PROXY model) | Service-level | Active; v0.19.0 May 2026, ~1k⭐ |
| **nono** | B | Apache-2.0 | In-process reverse proxy | Phantom token, explicit URL routing | **Yes**: `BASE_URL=http://127.0.0.1:PORT/…` | Host + endpoint | Early alpha; v0.53.0 May 2026, 2.4k⭐ |
| **Aegis** | B | Apache-2.0 | In-process reverse proxy | Path routing (`localhost:3100/{svc}/…`) | Configurable, undocumented for Anthropic | Method/path/rate/time | Very new, 10⭐ |
| **OneCLI** | B | Apache-2.0 | Reverse proxy + management UI | Host/path matching, Bitwarden integration | Configurable | Per-agent scoping | Active; v1.23.0 May 2026, 2.1k⭐ |
| **Aembit** | B | Proprietary | Sidecar + cloud control plane | TLS intercept, SPIFFE, JIT creds | No (intercepts by destination) | Policy-based | GA (Apr 2026) |
| **LiteLLM Proxy** | A | MIT | Reverse proxy | Virtual key → upstream key | Yes (set base URL to LiteLLM) | Route-level | 45k⭐; **CVE-2026-42208 exploited Apr 2026**, patch v1.83.7 |
| **Portkey Gateway** | A | MIT (OSS core) | Reverse proxy | Virtual key vault (cloud or Enterprise self-host) | Yes (documented for Claude Code) | Config-based | Production; virtual-key vault needs Enterprise for self-host |
| **Helicone** | A | Apache-2.0 | Reverse proxy | Proxy header auth; agent still holds own key | Yes | No | Maintenance mode (Mintlify acq. Mar 2026) |
| **LangSmith LLM Auth Proxy** | A | OSS Helm | Envoy sidecar | JWT + ext_authz upstream key injection | Yes | URL allowlist | Enterprise (LangSmith ≥ v0.13.33) |
| **Kong AI Gateway** | A | Apache-2.0 | Reverse proxy | Plugin per-route/consumer | Yes | Plugin-level | Production, heavy |
| **AWS IMDSv2** | n/a | n/a | Link-local | Per-instance metadata | n/a | n/a | Conceptual analog only |
|
||||
|
||||
### Cluster commentary
|
||||
|
||||
- **The phantom-token pattern** (nono) is the cleanest architectural
|
||||
fit for claude-bottle. The agent receives a per-session
|
||||
cryptographically random token scoped to the localhost proxy;
|
||||
the proxy validates and swaps for the real upstream credential.
|
||||
No TLS interception, no CA trust setup, works directly with
|
||||
`ANTHROPIC_BASE_URL`. **Blocker:** nono is explicitly
|
||||
"early alpha, not security audited."
|
||||
|
||||
- **TLS-MITM forward proxies** (Infisical Agent Vault, Cloudflare
|
||||
Sandbox Auth, Aembit, the existing pipelock) all double up on
|
||||
the CA-trust machinery PRD 0006 already built for pipelock.
|
||||
Adopting Agent Vault would mean two MITM proxies in each bottle
|
||||
unless one is dropped. Also subject to `golang/go#28866` — must
|
||||
bind on a non-loopback address.
|
||||
|
||||
- **LLM gateways** (LiteLLM, Portkey, Helicone, Kong) all support
|
||||
credential injection but are built for cost / observability /
|
||||
fallback, not isolation. **Specific concern:** the LiteLLM
|
||||
CVE-2026-42208 (CVSS 9.3, pre-auth SQL injection on the Bearer
|
||||
auth path, exploited within 36 hours of disclosure) is a
|
||||
reminder that any self-hosted DB-backed credential gateway is
|
||||
itself a high-value attack target. Prefer a flat-file or
|
||||
env-only credential store on the sidecar over a database.
|
||||
|
||||
- **Helicone is in maintenance mode** since the Mintlify
|
||||
acquisition in March 2026 (security fixes only, no features).
|
||||
Treat as legacy.
|
||||
|
||||
- **Portkey's virtual-key vault** — the actual credential-injection
|
||||
feature — requires the Enterprise plan for self-host. The
|
||||
open-source gateway alone does routing without injection.
|
||||
|
||||
## Build-vs-adopt synthesis
|
||||
|
||||
**Architecturally aligned:** nono. Phantom-token + explicit-URL
|
||||
routing matches the design recommended here exactly; zero TLS
|
||||
work. But "not security audited" + "early alpha" means adopting it
|
||||
is a bet on the project rather than a buy-vs-build win.
|
||||
|
||||
**Most mature OSS purpose-built:** Infisical Agent Vault. MIT,
|
||||
v0.19.0 active, v0.17.0 added a containerized agent mode that
|
||||
maps directly to claude-bottle. Friction is the TLS-MITM topology
|
||||
— another container-local CA, the Go-loopback workaround,
|
||||
duplication with pipelock's existing TLS interception layer.
|
||||
|
||||
**For the immediate Anthropic-token slice, a ~100-line Rust or Go
|
||||
reverse proxy modeled on nono's phantom-token shape is probably
|
||||
less work and less risk than adopting either.** The surface is
|
||||
small: hold the token, inject one header, forward to
|
||||
api.anthropic.com over TLS, pass through SSE without buffering.
|
||||
For Gitea / GitHub / GitLab the same proxy generalizes by config.
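
The phantom-token swap at the core of that proxy is only a few lines. The sketch below is an illustration of the general shape described above, not nono's actual implementation; `new_phantom_token` and `swap_credential` are hypothetical names, and how the agent is handed the phantom value (for example through its usual auth variable) is an assumption left open here.

```python
import hmac
import secrets


def new_phantom_token():
    """Per-session random token handed to the agent in place of the real credential."""
    return secrets.token_urlsafe(32)


def swap_credential(auth_header, phantom, real_token):
    """Return the upstream Authorization value, or None if the phantom token is absent or wrong."""
    scheme, _, presented = (auth_header or "").partition(" ")
    if scheme.lower() != "bearer":
        return None
    # Constant-time comparison so the check does not leak the token via timing.
    if not hmac.compare_digest(presented.encode(), phantom.encode()):
        return None
    return f"Bearer {real_token}"
```

A proxy would call `swap_credential` on each request and answer 401 when it returns `None`. The phantom value is worthless outside the bottle: it only authenticates to the localhost proxy, so capturing and exfiltrating it gains an attacker nothing once the session ends.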
|
||||
|
||||
The build path also keeps the credential store flat (env file or
|
||||
mode-600 YAML on the sidecar), which sidesteps the
|
||||
"DB-backed-gateway as attack surface" concern the LiteLLM CVE
|
||||
exposed.
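
A flat credential store can still enforce its own file permissions at load time. A minimal sketch, assuming one token per file rather than the YAML layout mentioned above; `load_token` is a hypothetical helper name.

```python
import os
import stat


def load_token(path):
    """Read a credential from a flat file, refusing files accessible to group or others."""
    with open(path, encoding="utf-8") as fh:
        # fstat on the open descriptor avoids a stat-then-open race.
        mode = stat.S_IMODE(os.fstat(fh.fileno()).st_mode)
        if mode & 0o077:
            raise PermissionError(f"{path} must be mode 600 or stricter, got {mode:o}")
        return fh.read().strip()
```

Refusing to start on a loose mode turns a silent misconfiguration into an immediate, visible failure, which is cheap to do when there is no database layer to manage access.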
|
||||
|
||||
## Recommended path forward
|
||||
|
||||
In priority order:
|
||||
|
||||
1. **In-container reverse proxy holding `CLAUDE_CODE_OAUTH_TOKEN`.**
|
||||
Highest-leverage change: credential isolation **and** the
|
||||
ability to drop the `api.anthropic.com` TLS passthrough in
|
||||
pipelock (see
|
||||
[`secret-minimization-over-dlp.md`](secret-minimization-over-dlp.md)
|
||||
§2). Proxy runs as root inside the agent container, listens on
|
||||
`127.0.0.1` (no Go-loopback issue for the reverse-proxy case —
|
||||
the agent isn't using `HTTPS_PROXY`), injects
|
||||
`Authorization: Bearer …`, sets the bottle's
|
||||
`ANTHROPIC_BASE_URL` to the local URL.
|
||||
|
||||
2. **Layer in
|
||||
`CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1`** plus the existing
|
||||
pipelock egress allowlist (api.anthropic.com only, plus the
|
||||
per-agent set of MCP / git / package-registry hosts). A
|
||||
hijacked claude can no longer exfil through statsig / Sentry /
|
||||
npm even if it captures something. Also disable Sentry error
|
||||
reporting via `DISABLE_ERROR_REPORTING=1`.
|
||||
|
||||
3. **Generalize the same proxy to forge tokens.** Add a manifest
|
||||
field along the lines of
|
||||
`forge: { kind: "gitea", url, tokenRef }` so a per-bottle token
|
||||
reference resolves at launch, the proxy starts as root before
|
||||
`node` is exec'd, and `tea` plus git HTTPS remotes are
|
||||
pre-configured to point at the proxy. Use
|
||||
`Authorization: token <…>` for Gitea, `Bearer` for GitHub /
|
||||
GitLab.
|
||||
|
||||
4. **Scope-narrow the tokens at issuance.** `repo:write` only, no
|
||||
`admin`, no user management. Fine-grained GitHub PATs, GitLab
|
||||
project-access tokens, Gitea per-repo tokens. Cheapest single
|
||||
thing to do; bounds blast radius regardless of whether the
|
||||
proxy ships.
|
||||
|
||||
5. **Allowlist at the proxy** once usage is stable. Method + path
|
||||
filter keyed off the agent's actual API calls; reject
|
||||
everything else. Doesn't prevent abuse within the allowlist but
|
||||
narrows the surface to known good operations.
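
The method + path filter in step 5 could be as simple as the following sketch. The rule list is illustrative only; real rules would be derived from the agent's observed API calls, and the names `ALLOWED` and `is_allowed` are hypothetical.

```python
import re

# Hypothetical rules: (method, path pattern).
ALLOWED = [
    ("POST", re.compile(r"/v1/messages")),
    ("GET", re.compile(r"/api/v1/repos/[^/]+/[^/]+/pulls")),
]


def is_allowed(method, path, rules=ALLOWED):
    """True if the request matches a rule; everything else is rejected."""
    # Match on the path alone; the query string is ignored in this sketch.
    path_only = path.split("?", 1)[0]
    return any(
        method.upper() == allowed_method and pattern.fullmatch(path_only)
        for allowed_method, pattern in rules
    )
```

Using `fullmatch` rather than a prefix test is a deliberate choice: it keeps a rule for `/v1/messages` from also admitting longer paths that merely start with that string.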
|
||||
|
||||
The current `docker run -e CLAUDE_CODE_OAUTH_TOKEN` pattern is
|
||||
fine for argv hygiene on the host, but inside the bottle the
|
||||
token is fully exposed. The proxy pattern moves it across a
|
||||
kernel-enforced boundary — the same property the SSH agent
|
||||
already gives us for keys, and the same property the git-gate
|
||||
already gives us for upstream push credentials.
|
||||
|
||||
## Sources
|
||||
|
||||
### Mechanics
|
||||
|
||||
- [Authentication — Claude Code docs](https://code.claude.com/docs/en/authentication)
|
||||
- [LLM gateway configuration — Claude Code docs](https://code.claude.com/docs/en/llm-gateway)
|
||||
- [Claude Code environment variables](https://code.claude.com/docs/en/env-vars)
|
||||
- [GitHub issue anthropics/claude-code#36998 — interactive mode bypasses ANTHROPIC_BASE_URL](https://github.com/anthropics/claude-code/issues/36998)
|
||||
- [GitHub issue anthropics/claude-code#11587 — apiKeyHelper vs CLAUDE_CODE_OAUTH_TOKEN](https://github.com/anthropics/claude-code/issues/11587)
|
||||
- [`proc_pid_environ(5)` man page](https://man7.org/linux/man-pages/man5/proc_pid_environ.5.html)
|
||||
- [Documenting ptrace access mode checking — LWN](https://lwn.net/Articles/692203/)
|
||||
- [StepSecurity — Claude Code Action outbound network analysis](https://www.stepsecurity.io/blog/anthropics-claude-code-action-security-how-to-secure-claude-code-in-github-actions-with-harden-runner)
|
||||
- [Manage API key environment variables — Claude Help Center](https://support.claude.com/en/articles/12304248-manage-api-key-environment-variables-in-claude-code)
|
||||
- [tea source — `cmd/login/add.go`](https://gitea.com/gitea/tea/src/branch/main/cmd/login/add.go)
|
||||
- [tea source — `modules/config/config.go`](https://gitea.com/gitea/tea/src/branch/main/modules/config/config.go)
|
||||
- [tea source — `modules/git/auth.go`](https://gitea.com/gitea/tea/src/branch/main/modules/git/auth.go)
|
||||
- [Gitea API usage docs](https://docs.gitea.com/development/api-usage)
|
||||
- [go-gitea/gitea#16734 — `Authorization: Bearer` triggers spurious CSRF](https://github.com/go-gitea/gitea/issues/16734)
|
||||
- [golang/go#28866 — `net/http` ignores `HTTPS_PROXY` for `127.0.0.1`/`localhost`](https://github.com/golang/go/issues/28866)
|
||||
- [git credential helper docs](https://git-scm.com/docs/gitcredentials)
|
||||
|
||||
### Landscape
|
||||
|
||||
- [Docker AI Sandboxes — credentials](https://docs.docker.com/ai/sandboxes/security/credentials/)
|
||||
- [docker/desktop-feedback#130 — custom injection rules](https://github.com/docker/desktop-feedback/issues/130)
|
||||
- [Cloudflare Sandbox Auth blog](https://blog.cloudflare.com/sandbox-auth/)
|
||||
- [Cloudflare Outbound Workers GA changelog](https://developers.cloudflare.com/changelog/post/2026-04-13-sandbox-outbound-workers-tls-auth/)
|
||||
- [Cloudflare Sandboxes GA — InfoQ](https://www.infoq.com/news/2026/04/cloudflare-sandboxes-ga/)
|
||||
- [Infisical agent-vault — GitHub](https://github.com/Infisical/agent-vault)
|
||||
- [Infisical agent-vault — releases](https://github.com/Infisical/agent-vault/releases)
|
||||
- [Infisical agent-vault — blog](https://infisical.com/blog/agent-vault-the-open-source-credential-proxy-and-vault-for-agents)
|
||||
- [nono — GitHub](https://github.com/always-further/nono)
|
||||
- [nono — phantom token blog](https://nono.sh/blog/blog-credential-injection)
|
||||
- [Aegis — GitHub](https://github.com/getaegis/aegis)
|
||||
- [OneCLI — GitHub](https://github.com/onecli/onecli)
|
||||
- [Sandbox0 — GitHub](https://github.com/sandbox0-ai/sandbox0)
|
||||
- [Buildkite Cleanroom — GitHub](https://github.com/buildkite/cleanroom)
|
||||
- [Aembit IAM for Agentic AI — GA](https://aembit.io/blog/aembit-iam-for-agentic-ai-is-now-generally-available/)
|
||||
- [Aembit Claude integration docs](https://docs.aembit.io/user-guide/access-policies/server-workloads/guides/claude)
|
||||
- [LiteLLM CVE-2026-42208 — Sysdig writeup](https://www.sysdig.com/blog/cve-2026-42208-targeted-sql-injection-against-litellms-authentication-path-discovered-36-hours-following-vulnerability-disclosure/)
|
||||
- [LiteLLM — GitHub](https://github.com/BerriAI/litellm)
|
||||
- [Portkey + Claude Code](https://portkey.ai/docs/virtual_key_old/integrations/libraries/claude-code)
|
||||
- [Portkey gateway — GitHub](https://github.com/Portkey-ai/gateway)
|
||||
- [Helicone maintenance mode announcement](https://dev.to/torrixai/helicone-is-now-in-maintenance-mode-here-is-how-to-switch-to-a-self-hosted-alternative-in-5-4li0)
|
||||
- [LangSmith LLM auth proxy docs](https://docs.langchain.com/langsmith/llm-auth-proxy-self-hosted)
|
||||
- [AWS IMDSv2 docs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html)
|
||||
- [Pipelock — Help Net Security](https://www.helpnetsecurity.com/2026/05/04/pipelock-open-source-ai-agent-firewall/)
|
||||
- [CB4A IETF draft — Credential Broker for Agents](https://www.ietf.org/archive/id/draft-hartman-credential-broker-4-agents-00.html)
|
||||
- [List of coding agent sandboxes (May 2026)](https://gist.github.com/wincent/2752d8d97727577050c043e4ff9e386e)
|
||||
@@ -126,7 +126,7 @@ forwarding to the real remote:
|
||||
unprivileged agent cannot read or modify. It holds the real push
|
||||
credential (deploy key, PAT, ssh agent socket) — the bottle never
|
||||
sees it, same as the auth-injecting proxy for `ANTHROPIC_BASE_URL`
|
||||
-  in `oauth-token-exposure-to-claude.md`.
+  in `agent-credential-proxy-landscape.md`.
|
||||
- On receive, the gate runs `gitleaks detect` against the incoming
|
||||
refs (and their message text) in a temporary working tree. Clean
|
||||
pushes are forwarded to the real remote. Findings cause the push to
|
||||
|
||||
@@ -1,161 +0,0 @@
|
||||
# OAuth token exposure to claude inside the bottle
|
||||
|
||||
Research into whether `CLAUDE_CODE_OAUTH_TOKEN` — as currently forwarded into
|
||||
each claude-bottle container — is reachable by claude itself, what (if any)
|
||||
deeper-than-prompt mechanisms could hide it, and whether routing claude
|
||||
through an auth-injecting proxy is viable.
|
||||
|
||||
## Summary
|
||||
|
||||
Yes, claude can read `CLAUDE_CODE_OAUTH_TOKEN` trivially today. There is no
|
||||
Linux primitive for "this env var exists in my process but I cannot read it";
|
||||
the only credible boundary is to put the credential in a *different* process
|
||||
that claude cannot read. Default Docker enforces that boundary at the kernel
|
||||
level (a non-root process cannot read `/proc/<root-pid>/environ`), so a
|
||||
root-owned auth-injecting reverse proxy listening on `127.0.0.1` is a
|
||||
realistic design. Claude Code's `ANTHROPIC_BASE_URL` officially supports
|
||||
this routing pattern with bearer auth, with documented caveats around SSE,
|
||||
header passthrough, and out-of-band outbound traffic (telemetry, npm, etc.)
|
||||
that does not route through `ANTHROPIC_BASE_URL` at all.
|
||||
|
||||
## How the token reaches claude today
|
||||
|
||||
1. `claude_bottle/cli/start.py` (around line 237–238) — host's
|
||||
`CLAUDE_BOTTLE_OAUTH_TOKEN` is exported into the launcher process as
|
||||
`CLAUDE_CODE_OAUTH_TOKEN`, then forwarded with
|
||||
`docker run -e CLAUDE_CODE_OAUTH_TOKEN` (no `=value`, so the value
|
||||
never lands on argv — good).
|
||||
2. `claude_bottle/cli/start.py` (around line 318–325) — claude is launched via
|
||||
`docker exec -it <container> claude …`, which inherits the container
|
||||
PID 1's env, including the token.
|
||||
3. claude runs as `node` (UID 1000) with `--dangerously-skip-permissions`.
|
||||
Its Bash tool can run `printenv CLAUDE_CODE_OAUTH_TOKEN`,
|
||||
`cat /proc/self/environ`, `node -e 'console.log(process.env)'`, etc.
|
||||
and capture the value into the conversation.
|
||||
|
||||
A prompt-injection vector — a poisoned skill, a malicious string in a file
|
||||
claude reads, or a hijacked MCP server — can extract the token and
|
||||
exfiltrate it through any allowed outbound channel. The
|
||||
`local-vs-remote-agent-execution.md` note already flags static long-lived
|
||||
tokens as the biggest credential risk; this is exactly that risk, present
|
||||
in the local topology today.
|
||||
|
||||
## Hiding env vars "at a deeper level"
|
||||
|
||||
Linux has no primitive to mark an individual env var as unreadable to the
|
||||
process that holds it. Once a var is in a process's `environ`, the process
|
||||
and its descendants have full access. The deeper-level lever is process
|
||||
boundary, not env-var ACL: put the credential in a *different* process
|
||||
that claude cannot read.
|
||||
|
||||
Default Docker enforces this for you. The kernel's `ptrace_may_access`
|
||||
check rejects `/proc/<pid>/environ` reads when the caller's UID/GID don't
|
||||
match the target's and the caller lacks `CAP_SYS_PTRACE` or `CAP_PERFMON`.
|
||||
A `node`-uid claude process attempting to read a root-owned proxy's
|
||||
environ gets `EACCES`. Escape hatches are explicit and not used by
|
||||
claude-bottle: `--cap-add=SYS_PTRACE`, `--cap-add=PERFMON`,
|
||||
`--privileged`. Yama `ptrace_scope` is irrelevant here because it only
|
||||
relaxes the *same-UID* relationship check; the cross-UID UID-match
|
||||
requirement still blocks the read.
|
||||
|
||||
The `apiKeyHelper` setting in claude-code is **not** a boundary. The
|
||||
helper is invoked by claude's own process, so claude can just call it via
|
||||
Bash and capture stdout. Same trust domain.
|
||||
|
||||
The only credible designs:
|
||||
|
||||
- **Header-injecting reverse proxy** — claude points at a localhost URL;
|
||||
proxy holds the credential; proxy adds `Authorization: Bearer` and
|
||||
forwards. (See next section.)
|
||||
- **Network namespace + outbound proxy** — claude runs with
|
||||
`--network none` and a unix-socket proxy that holds the credential and
|
||||
enforces an egress allowlist. Anthropic's secure-deployment docs
|
||||
describe this pattern; the existing research note on remote agents
|
||||
recommends adding it locally first as the highest-leverage change.
|
||||
- **Don't ship the OAuth token at all** — fall back to per-session login
|
||||
or short-lived tokens. Operationally heavier, and the long-lived OAuth
|
||||
token is the chosen design here precisely because it's portable across
|
||||
hosts (Keychain on macOS, file on Linux).
|
||||
|
||||
## Proxy auth: viable, with caveats
|
||||
|
||||
Pattern:
|
||||
|
||||
- Run a small reverse proxy as **root** inside the container, listening on
|
||||
`127.0.0.1:N` (or a root-owned unix socket with `SO_PEERCRED` checks).
|
||||
- Set `ANTHROPIC_BASE_URL=http://127.0.0.1:N` (or the socket path) in
|
||||
claude's env. Claude as `node` only sees the URL, not the token.
|
||||
- The proxy injects `Authorization: Bearer $TOKEN` and forwards to
|
||||
`https://api.anthropic.com`.
|
||||
- Token lives only in the root proxy's env; node-uid claude cannot read
|
||||
`/proc/<root-pid>/environ` (kernel-enforced).
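
The pattern above can be sketched with the Python standard library. This is an illustration under stated assumptions, not the claude-bottle implementation: the port, the `serve` entry point, and the choice to handle only `POST` are all assumptions, and a request body sent without `Content-Length` is not handled.

```python
import http.client
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM_HOST = "api.anthropic.com"
# Hop-by-hop headers are connection-scoped and must not be forwarded.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}
# Credentials the client may have sent; the proxy always replaces them.
CLIENT_CREDENTIALS = {"authorization", "x-api-key"}


def build_upstream_headers(incoming, token):
    """Copy client headers, drop client credentials, inject the real token."""
    out = {}
    for name, value in incoming:
        lower = name.lower()
        if lower in HOP_BY_HOP or lower in CLIENT_CREDENTIALS or lower == "host":
            continue
        out[name] = value  # anthropic-version, anthropic-beta, etc. pass through
    out["Host"] = UPSTREAM_HOST
    out["Authorization"] = f"Bearer {token}"
    return out


class InjectingProxy(BaseHTTPRequestHandler):
    token = ""  # set by serve(); lives only in the proxy process

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        headers = build_upstream_headers(self.headers.items(), self.token)
        conn = http.client.HTTPSConnection(UPSTREAM_HOST, timeout=600)
        try:
            conn.request("POST", self.path, body=body, headers=headers)
            resp = conn.getresponse()
            self.send_response(resp.status)
            for name, value in resp.getheaders():
                if name.lower() not in HOP_BY_HOP | {"content-length"}:
                    self.send_header(name, value)
            self.send_header("Connection", "close")
            self.end_headers()
            # Relay bytes as they arrive so SSE events are not buffered.
            while chunk := resp.read1(8192):
                self.wfile.write(chunk)
                self.wfile.flush()
        finally:
            conn.close()


def serve(token, port=8080):
    """Hypothetical entry point: run as root, bound to loopback only."""
    InjectingProxy.token = token
    ThreadingHTTPServer(("127.0.0.1", port), InjectingProxy).serve_forever()
```

With this running, claude's environment would carry only `ANTHROPIC_BASE_URL=http://127.0.0.1:8080`. The header-rewriting step is a pure function so it can be unit-tested without a network.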
|
||||
|
||||
`ANTHROPIC_BASE_URL` is documented as routing for proxies/gateways, not
|
||||
just Bedrock/Vertex, and works alongside bearer auth. Confirmed gotchas:
|
||||
|
||||
- **SSE streaming**: proxy must not buffer responses (nginx
|
||||
`proxy_buffering off`, or use a streaming-aware proxy). Claude Code
|
||||
uses SSE only — no websockets.
|
||||
- **Forward `anthropic-version`, `anthropic-beta`, and
|
||||
`X-Claude-Code-Session-Id` untouched** — stripping them breaks tool
|
||||
use / extended thinking / session aggregation.
|
||||
- **GitHub issue [#36998](https://github.com/anthropics/claude-code/issues/36998)**:
|
||||
interactive mode historically bypassed `ANTHROPIC_BASE_URL` for some
|
||||
startup calls (auth validation / org lookup), connecting directly to
|
||||
`api.anthropic.com`. Marked closed but verify with `tcpdump` or
|
||||
`strace -e connect` against the pinned 2.1.126 build before trusting
|
||||
the isolation.
|
||||
- **Tool search** (`ENABLE_TOOL_SEARCH`): disabled by default when
|
||||
`ANTHROPIC_BASE_URL` is non-Anthropic; re-enable explicitly if needed.
|
||||
- **Out-of-band outbound traffic** is the weak link. None of these route
|
||||
through `ANTHROPIC_BASE_URL`:
|
||||
- `statsig.anthropic.com` — telemetry
|
||||
(disable: `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1`,
|
||||
`DISABLE_TELEMETRY=1`)
|
||||
- Sentry error reporting (disable: `DISABLE_ERROR_REPORTING=1`)
|
||||
- `registry.npmjs.org`, `github.com`, `release-assets.githubusercontent.com`
|
||||
— MCP installs and autoupdater
|
||||
- `pypi.org`, `bun.sh` — if the Bash tool installs Python or Bun
|
||||
packages during a session
|
||||
|
||||
A hijacked claude could exfiltrate the captured token (or any other
|
||||
data) through these channels even with the proxy in place. Pair the
|
||||
proxy with an explicit egress allowlist (iptables / Docker network
|
||||
policy) for the full benefit.
|
||||
- **Token refresh**: `claude setup-token` issues a ~1-year token with no
|
||||
client-side refresh, so a static proxy value is fine.
|
||||
- **No request signing / anti-replay** on the Messages API; header
|
||||
rewriting is safe.
|
||||
- **`--bare` mode** does not read `CLAUDE_CODE_OAUTH_TOKEN` at all (only
|
||||
`ANTHROPIC_API_KEY`). Not relevant to the interactive flow claude-bottle
|
||||
ships, but worth noting if `--bare` is ever wired in.
|
||||
|
||||
## Recommended path forward
|
||||
|
||||
In priority order:
|
||||
|
||||
1. **`--network none` + a localhost (or unix-socket) auth-injecting
|
||||
proxy** that holds the token. Highest-leverage change: credential
|
||||
isolation **and** egress containment in one pass. Aligns with the
|
||||
recommendation already in `local-vs-remote-agent-execution.md`.
|
||||
2. Layer in `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1` plus an explicit
|
||||
egress allowlist (api.anthropic.com only, plus the per-agent set of
|
||||
MCP / git / package-registry hosts) so a hijacked claude can't
|
||||
exfiltrate through statsig / Sentry / npm.
|
||||
|
||||
The current `docker run -e CLAUDE_CODE_OAUTH_TOKEN` pattern is fine for
|
||||
argv hygiene on the host, but inside the container the token is fully
|
||||
exposed to claude. The proxy pattern moves it across a real
|
||||
kernel-enforced boundary.
|
||||
|
||||
## Sources
|
||||
|
||||
- [Authentication — Claude Code Docs](https://code.claude.com/docs/en/authentication)
|
||||
- [LLM gateway configuration — Claude Code Docs](https://code.claude.com/docs/en/llm-gateway)
|
||||
- [Claude Code Environment Variables](https://code.claude.com/docs/en/env-vars)
|
||||
- [GitHub issue #36998 — Interactive mode ignores ANTHROPIC_BASE_URL](https://github.com/anthropics/claude-code/issues/36998)
|
||||
- [GitHub issue #11587 — Auth conflict: CLAUDE_CODE_OAUTH_TOKEN and apiKeyHelper](https://github.com/anthropics/claude-code/issues/11587)
|
||||
- [proc_pid_environ(5) Linux manual page](https://man7.org/linux/man-pages/man5/proc_pid_environ.5.html)
|
||||
- [Documenting ptrace access mode checking — LWN.net](https://lwn.net/Articles/692203/)
|
||||
- [StepSecurity — Claude Code Action outbound network analysis](https://www.stepsecurity.io/blog/anthropics-claude-code-action-security-how-to-secure-claude-code-in-github-actions-with-harden-runner)
|
||||
- [Manage API key environment variables — Claude Help Center](https://support.claude.com/en/articles/12304248-manage-api-key-environment-variables-in-claude-code)
|
||||
@@ -0,0 +1,317 @@
|
||||
# Secret minimization beats DLP for the encoding/splitting attack class
|
||||
|
||||
Why pipelock's body DLP and gitleaks's pre-receive scan cannot stop a
|
||||
determined exfil attempt, and what architectural shifts actually
|
||||
bound the damage. Companion to
|
||||
[`secret-exfil-tripwire-encodings.md`](secret-exfil-tripwire-encodings.md)
|
||||
(which covers the inverse — detecting a known secret across encodings)
|
||||
and to
|
||||
[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md)
|
||||
(which establishes the gate / auth-injecting-proxy pattern, the
|
||||
per-service mechanics for Anthropic / Gitea / GitHub, and the
|
||||
existing tools in that space).
|
||||
|
||||
## Summary
|
||||
|
||||
claude-bottle's v1 egress story is: pipelock allowlists hostnames,
|
||||
intercepts TLS, body-scans every request against 48 builtin DLP
|
||||
patterns, and blocks on hit. Gitleaks does the analogous scan on `git push`.
|
||||
Both are signature-based. Against a *determined* compromised or
|
||||
prompt-injected agent, both lose: base64 / hex / gzip / split-across-
|
||||
requests / custom-encoding all evade signature match, and one
|
||||
exfil-channel — the Anthropic API conversation body — is deliberately
|
||||
not scanned at all because pipelock's BIP-39 detector false-positives
|
||||
on ordinary English prose.
|
||||
|
||||
The fix is not "scan harder." It is the same shift that the git-gate
|
||||
already makes for upstream push credentials: **keep the secret out of
|
||||
the bottle in the first place.** The agent talks to a per-bottle
|
||||
gate; the gate holds the credential and uses it on the agent's behalf.
|
||||
Whatever the agent can encode and exfil, it cannot encode bytes it
|
||||
never had.
|
||||
|
||||
The encoding/splitting attack is fundamentally unbounded for any
|
||||
secret the bottle holds, but it stops mattering once the bottle holds
|
||||
fewer secrets.
|
||||
|
||||
## Why egress scanning is a losing position
|
||||
|
||||
Three independent failure modes, each sufficient on its own:
|
||||
|
||||
### 1. Encoding evades signature match
|
||||
|
||||
The 48 builtin pipelock patterns (and gitleaks's analog set) recognize
|
||||
known credential *shapes*: `ghp_…`, `sk_live_…`, AWS access-key prefixes,
|
||||
PEM block markers, etc. The same bytes under any of:
|
||||
|
||||
- base64 / base64url / base32 / hex / ascii-85
|
||||
- gzip + base64
|
||||
- per-char numeric encoding
|
||||
- ROT13, custom alphabet substitution
|
||||
- splitting across request boundaries (first 20 chars in request A,
|
||||
last 20 in request B)
|
||||
- chunking via path segments instead of body bytes
|
||||
- DNS subdomain encoding (out-of-scope for pipelock entirely; only
|
||||
the v1 dnsmasq / iptables layer in `network-egress-guard.md` would
|
||||
catch it)
|
||||
|
||||
…do not match any pipelock or gitleaks pattern, because the patterns
|
||||
are byte-literal on the canonical form. `secret-exfil-tripwire-encodings.md`
|
||||
documents the detection side of this gap: no off-the-shelf tool emits
|
||||
N encoded forms of a known secret for downstream matching, and even
|
||||
the multi-encoding decoders in gitleaks 8.20+ / TruffleHog only undo
|
||||
one layer.
|
||||
|
||||
A motivated attacker controls the encoding. A static signature set
|
||||
cannot win that race.
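
The evasion is easy to demonstrate. The sketch below uses a made-up token and a simplified `ghp_` shape regex standing in for a real pipelock or gitleaks rule (both are assumptions for illustration), and checks which encoded forms still match.

```python
import base64
import codecs
import gzip
import re

# Simplified stand-in for a signature rule; not a real pipelock pattern.
GHP_PATTERN = re.compile(r"ghp_[A-Za-z0-9]{36}")
# Fabricated token with the right shape; not a real credential.
FAKE_TOKEN = "ghp_" + "A1b2C3d4E5f6G7h8I9j0K1l2M3n4O5p6Q7r8"


def encoded_forms(secret: str) -> dict[str, list[str]]:
    """Return a few trivially reversible encodings of the secret."""
    raw = secret.encode()
    half = len(secret) // 2
    return {
        "base64": [base64.b64encode(raw).decode()],
        "hex": [raw.hex()],
        "rot13": [codecs.encode(secret, "rot13")],
        "gzip+base64": [base64.b64encode(gzip.compress(raw)).decode()],
        "split": [secret[:half], secret[half:]],  # two separate requests
    }


def evading_forms(secret: str) -> list[str]:
    """Names of encodings where no payload matches the signature."""
    return [
        name
        for name, payloads in encoded_forms(secret).items()
        if not any(GHP_PATTERN.search(p) for p in payloads)
    ]
```

The canonical token matches the pattern; every encoded form listed here does not, and the attacker can pick from an unbounded set of further encodings.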
|
||||
|
||||
### 2. The `api.anthropic.com` passthrough is unscanned by design
|
||||
|
||||
`pipelock.py:DEFAULT_TLS_PASSTHROUGH` exempts `api.anthropic.com`
|
||||
from TLS interception because pipelock's BIP-39 seed-phrase detector
|
||||
false-positives on any 12+ consecutive English words from the BIP-39
|
||||
wordlist that happen to checksum cleanly — i.e., on a lot of normal
|
||||
prose. The recommended treatment per pipelock's own docs is
|
||||
`passthrough_domains`: the CONNECT host gate still applies, but body
|
||||
is not decrypted.
|
||||
|
||||
The agent's conversation channel is therefore wide open as an exfil
|
||||
path. A prompt-injected agent that has been told a secret can ship
|
||||
it to Anthropic as conversation text, formatted however it likes,
|
||||
and pipelock sees only `CONNECT api.anthropic.com:443`. The
|
||||
`CLAUDE_BOTTLE_OAUTH_TOKEN` itself rides this exact path.
|
||||
|
||||
### 3. Out-of-band channels exist regardless
|
||||
|
||||
Even with body DLP fully on, the v1 design knowingly leaves several
|
||||
non-HTTP egress paths unscanned (called out as non-goals in PRD 0001
|
||||
§Non-goals): SSH session content via ssh-gate, raw TCP, UDP, ICMP,
|
||||
TLS-SNI domain fronting on allowlisted hosts. The ssh-gate is dumb
|
||||
L4 socat — `scp`, `-L`/`-R`/`-D` port forwards, and arbitrary
|
||||
remote-shell content are all opaque to it. Any one of these is
|
||||
sufficient to ship a secret the agent holds.
|
||||
|
||||
## The shift: stop holding the secret
|
||||
|
||||
The README already states the operative principle:
|
||||
|
||||
> The container itself adds a layer between the agent and the host,
|
||||
> but the v1 design leans more on secret minimization and egress
|
||||
> allowlisting than on the container as a hardened boundary.
|
||||
|
||||
The encoding/splitting class is the residual that *egress allowlisting
|
||||
cannot solve* and that *secret minimization can*. The git-gate is the
|
||||
proof-of-concept: the agent's effective capability is "push to
|
||||
upstream," not "hold an SSH key." Moving the key from the agent into
|
||||
the gate left the capability intact and the credential bytes
|
||||
unreachable to the agent. The same move applies to anything where
|
||||
the agent's actual need is *perform operation X against service Y*,
|
||||
not *hold a credential*.
|
||||
|
||||
[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md)
|
||||
covers the gate pattern in depth: kernel-boundary mechanics, the
|
||||
per-service wiring for `CLAUDE_CODE_OAUTH_TOKEN` / Gitea PAT /
|
||||
GitHub PAT, the four proxy topologies, and the landscape of
|
||||
existing tools. What follows is the same idea generalized to the
|
||||
full secret set in a bottle.
|
||||
|
||||
## Concrete mitigations, ordered by leverage
|
||||
|
||||
### 1. Generalize the gate pattern to API credentials (highest leverage)
|
||||
|
||||
For every credential the agent currently holds, ask: does the agent
|
||||
*need* the bytes, or does it need to *perform an operation* the
|
||||
credential authorizes? If the latter, the credential moves into a
|
||||
per-bottle gate that:
|
||||
|
||||
- Holds the secret in a process the agent cannot read
|
||||
(kernel-enforced cross-UID `/proc/<pid>/environ` deny on Docker;
|
||||
separate VM on smolmachines).
|
||||
- Exposes a narrow surface (a localhost URL, or a service name on
|
||||
the internal network).
|
||||
- Optionally enforces a method/path allowlist so the credential's
|
||||
full scope isn't blanket-granted.
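
A method/path allowlist of the kind mentioned in the last bullet might look like the following sketch. The rule set is illustrative only (Gitea-style `/api/v1` paths chosen as an assumption); a real gate would derive it from the agent's observed calls.

```python
import re

# Illustrative rules: (method, compiled path pattern). Not a vetted policy.
ALLOWED = [
    ("GET", re.compile(r"^/api/v1/repos/[^/]+/[^/]+/(pulls|issues)(/\d+)?$")),
    ("POST", re.compile(r"^/api/v1/repos/[^/]+/[^/]+/(pulls|issues)$")),
    ("POST", re.compile(r"^/api/v1/repos/[^/]+/[^/]+/issues/\d+/comments$")),
]


def is_allowed(method: str, path: str) -> bool:
    """Default-deny check applied by the gate before injecting the credential."""
    path_only = path.split("?", 1)[0]
    # Refuse dot segments and percent-encoding outright instead of
    # trying to normalize them; conservative, but hard to bypass.
    if ".." in path_only.split("/") or "%" in path_only:
        return False
    return any(
        method.upper() == allowed_method and pattern.match(path_only)
        for allowed_method, pattern in ALLOWED
    )
```

Anything not matched is rejected before the credential is attached, so the token's wider scope (admin endpoints, repository deletion) is never exercised through the gate.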
|
||||
|
||||
Two concrete instances worth implementing:
|
||||
|
||||
**Anthropic-API gate.** Holds `CLAUDE_BOTTLE_OAUTH_TOKEN`. Agent's
|
||||
`ANTHROPIC_BASE_URL` points at the gate; gate injects
|
||||
`Authorization: Bearer …` and forwards to api.anthropic.com. The
|
||||
token is no longer in the bottle's env. Once the token is out,
|
||||
`DEFAULT_TLS_PASSTHROUGH` for api.anthropic.com can be dropped and
|
||||
pipelock can body-scan the conversation channel with the noisy
|
||||
patterns (BIP-39) disabled and only high-confidence ones (long
|
||||
high-entropy strings, known token formats) left on. The known
|
||||
gotchas — SSE streaming, header passthrough for `anthropic-version`
|
||||
/ `anthropic-beta` / `X-Claude-Code-Session-Id`, out-of-band
|
||||
telemetry to `statsig.anthropic.com` — are documented in
|
||||
[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md)
|
||||
§Anthropic / Claude Code.
|
||||
|
||||
**Forge-API gate (Gitea / GitHub / GitLab).** Holds the PAT;
|
||||
exposes a narrow REST surface. Token auth on all three is
|
||||
stateless `Authorization`-header injection — no CSRF, no request
|
||||
signing, no per-request nonce — so one proxy generalizes by
|
||||
config. Per-service mechanics (Gitea uses
|
||||
`Authorization: token <…>` not Bearer; `tea` bypasses git's
|
||||
credential.helper; etc.) are in
|
||||
[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md)
|
||||
§Gitea / §GitHub / GitLab.
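
"Generalizes by config" can be made concrete with a small lookup table. This is a sketch under the assumption that each forge is addressed by a short key; the table and function names are hypothetical.

```python
# Per-forge Authorization value templates. Gitea uses the `token` scheme
# as noted above; GitHub and GitLab accept Bearer for PATs.
FORGE_AUTH_SCHEMES = {
    "gitea": "token {token}",
    "github": "Bearer {token}",
    "gitlab": "Bearer {token}",
}


def auth_header(forge: str, token: str) -> tuple[str, str]:
    """Return the (name, value) header pair the gate injects for a forge."""
    try:
        template = FORGE_AUTH_SCHEMES[forge]
    except KeyError:
        raise ValueError(f"unsupported forge: {forge!r}") from None
    return ("Authorization", template.format(token=token))
```

Adding a forge is then a one-line config change rather than a new proxy.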
|
||||
|
||||
Both compose with the existing git-gate (which addresses the push
credential) rather than replacing it.
|
||||
|
||||
### 2. Stop passthrough'ing `api.anthropic.com`
|
||||
|
||||
A direct consequence of #1: once the token is out of the bottle,
|
||||
pipelock can decrypt and inspect the conversation channel. Replace
|
||||
the broad BIP-39 detector with a tighter ruleset — high-entropy
|
||||
strings ≥40 chars, known credential format prefixes, long base64
|
||||
blobs — and accept the residual false-positive rate. The unscanned
|
||||
channel was the largest single exfil hole; closing it is worth a
|
||||
modest reduction in DLP precision.
|
||||
|
||||
If the gate work is deferred, an interim step is to body-scan even with
the token in place: the token will be in every Anthropic request's
`Authorization` header anyway, and pipelock's header DLP is already
on. The risk is false-positive blocks on legitimate conversation
prose, so this should be opt-in initially.
|
||||
|
||||
### 3. Egress data budget
|
||||
|
||||
PRD 0001 §Resolution rules reserves `data_budget` as a future
|
||||
field under the existing `egress` block. Cap small (e.g. 1 MiB
|
||||
outbound per session to non-Anthropic hosts). This doesn't stop
|
||||
one-secret one-time exfil — a 4 KiB token still fits — but:
|
||||
|
||||
- Bulk exfil (`.git` history dumps, full credential stores, model
|
||||
weights, large datasets) becomes visibly noisy or impossible.
|
||||
- Repeated small exfil attempts trip the budget and produce a
|
||||
log line / alert.
|
||||
- Forces the attacker to be efficient; efficient channels are
|
||||
easier to fingerprint after the fact.
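
The budget logic itself is small. The sketch below assumes a per-session byte counter that exempts the Anthropic host, matching the 1 MiB example above; the class name and the semantics of the reserved `data_budget` field are assumptions, since the PRD only reserves the name.

```python
from dataclasses import dataclass


@dataclass
class EgressBudget:
    """Per-session outbound byte budget for non-exempt hosts."""

    limit_bytes: int = 1 << 20  # 1 MiB, as in the example above
    exempt_hosts: frozenset = frozenset({"api.anthropic.com"})
    spent: int = 0

    def charge(self, host: str, nbytes: int) -> bool:
        """Record an outbound request; return False if it would exceed the cap."""
        if host in self.exempt_hosts:
            return True
        if self.spent + nbytes > self.limit_bytes:
            return False  # caller blocks the request and emits a log line
        self.spent += nbytes
        return True
```

A refused `charge` is the point at which the proxy would both block the request and raise the alert described in the second bullet.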
|
||||
|
||||
### 4. Pre-clean `.git` history when `--cwd` is used
|
||||
|
||||
`provision/git.py:_provision_cwd_git` copies the host cwd's `.git`
|
||||
into `/home/node/workspace/.git` wholesale. Git history contains
|
||||
every secret ever committed, even if removed from HEAD. The agent
|
||||
reads them with `git log --all -p` — a free, unscanned exfil
|
||||
channel via any allowed host.
|
||||
|
||||
Add a provision-time gitleaks scan of the host `.git` before copy,
|
||||
and refuse to launch (or quarantine the offending commits into a
|
||||
read-only orphan branch view) on hit. Doesn't help the legitimate
|
||||
case where the user wants to debug a leaked-secret commit, but
|
||||
gates the common case where the user is unaware of historical
|
||||
secrets in the repo. Pairs with the push-side gitleaks already in
|
||||
the git-gate.
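
A provision-time check along these lines could wrap the gitleaks CLI. This is a sketch only: the subcommand and flags are assumed from gitleaks v8 and should be verified against the installed version, and `history_is_clean` is a hypothetical helper, not an existing function in `provision/git.py`.

```python
import subprocess


def gitleaks_argv(repo_path: str) -> list[str]:
    """Build the scan command (flags assumed from gitleaks v8)."""
    return [
        "gitleaks", "detect",
        "--source", repo_path,
        "--no-banner",
        "--redact",          # keep found secrets out of our own logs
        "--exit-code", "1",  # non-zero exit when leaks are found
    ]


def history_is_clean(repo_path: str, run=subprocess.run) -> bool:
    """Return True only if the scan ran and reported nothing.

    Fails closed: a non-zero exit covers both "leaks found" and
    "scan failed", and either one should stop the launch.
    """
    result = run(gitleaks_argv(repo_path), capture_output=True, text=True)
    return result.returncode == 0
```

The `run` parameter exists so the check can be tested without gitleaks installed; the launcher would call `history_is_clean(host_cwd)` before copying `.git` and refuse to proceed on `False`.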
|
||||
|
||||
### 5. Manifest hygiene: explicit secret tagging
|
||||
|
||||
Today `bottle.env` accepts any entry. A `?prompt-at-runtime` or
|
||||
`${HOST_VAR}` interpolation has no special status vs. a literal —
|
||||
the manifest can't tell which entries are credential material and
|
||||
which are configuration.
|
||||
|
||||
Add a `secret: true` flag (or a `secrets:` sibling of `env:`) that:
|
||||
|
||||
- Surfaces those entries prominently in the y/N preflight
|
||||
("this bottle carries 3 secret values: GITEA_TOKEN, GH_PAT,
|
||||
AWS_SECRET_ACCESS_KEY").
|
||||
- Refuses to launch if `egress.allowlist` contains any host that
|
||||
is not source-controlled by the user (heuristic: not on a
|
||||
built-in `KNOWN_FORGE_HOSTS` list).
|
||||
- Forces an explicit acknowledgement that a credential is being
|
||||
placed into the bottle rather than behind a gate.
|
||||
|
||||
The net effect is design pressure toward gates for any new credential
type, rather than reaching for `bottle.env`.
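
The preflight behavior could be sketched as follows. The manifest shape (a `secrets:` list of named entries next to an `egress.allowlist`), the contents of `KNOWN_FORGE_HOSTS`, and the function names are all assumptions for illustration; the real manifest schema may differ.

```python
# Hypothetical built-in list; the real set would be maintained in claude-bottle.
KNOWN_FORGE_HOSTS = frozenset({"github.com", "gitlab.com", "gitea.com"})


class PreflightRefused(Exception):
    """Raised when a secret-carrying bottle may reach an unknown host."""


def preflight_summary(manifest: dict) -> str:
    """Summarize tagged secrets, refusing launch on unknown egress hosts."""
    secrets = [entry["name"] for entry in manifest.get("secrets", [])]
    if not secrets:
        return "this bottle carries no secret values"
    allowlist = manifest.get("egress", {}).get("allowlist", [])
    unknown = sorted(set(allowlist) - KNOWN_FORGE_HOSTS)
    if unknown:
        raise PreflightRefused(
            f"secrets present and egress allows unknown hosts: {', '.join(unknown)}"
        )
    return f"this bottle carries {len(secrets)} secret values: {', '.join(secrets)}"
```

The returned string is what the y/N prompt would display; the explicit acknowledgement step is left to the launcher.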
|
||||
|
||||
### 6. Detection over prevention for the residual
|
||||
|
||||
For everything that remains after the above — and a residual will
|
||||
always remain — log every outbound request to allowlisted hosts in
|
||||
structured form: timestamp, host, path, byte count, content
|
||||
hash, request shape. Don't block; alert. Offline analysis surfaces
|
||||
per-host byte-count outliers, entropy histograms, base64-shaped
|
||||
path/body distributions, and repeat patterns.
|
||||
|
||||
This doesn't catch the first careful exfil. It does catch repeated
|
||||
abuse, scaled exfil, and the second-attempt-after-discovery, and
|
||||
provides a forensic trail for the first incident. Pipelock's
|
||||
stdout-only logging today (PRD 0001 §Out of scope: "Audit logging
|
||||
or persistent log storage of pipelock decisions") is the precursor;
|
||||
this would be its persistent / structured follow-up.
|
||||
|
||||
### 7. Shorten OAuth token lifetime
|
||||
|
||||
The `claude setup-token`-issued OAuth token is valid for ~1 year. If a
leak happens, the blast radius is up to a year of access to the user's Pro/Max
|
||||
subscription. Even without the gate (#1), encouraging users to
|
||||
rotate monthly bounds the damage. With the gate, the gate can in
|
||||
principle negotiate fresh short-lived tokens per session (depends
|
||||
on whether Anthropic ever ships an OAuth refresh / child-token flow
|
||||
— closed today per `claude-code-token-revocation.md`).
|
||||
|
||||
## What you don't get
|
||||
|
||||
Even with all of the above:
|
||||
|
||||
- An agent that can talk to *any* LLM can use the conversation as a
|
||||
covert channel — encode bits in word choice, response length,
|
||||
formatting. Pipelock can scan for known patterns but cannot stop
|
||||
arbitrary content over a channel whose purpose is arbitrary content.
|
||||
- An agent with shell access to any remote host via ssh-gate can
|
||||
exfil through the SSH session. The ssh-gate is L4 by design (PRD
|
||||
0007 §Non-goals: "SSH-protocol awareness"); generalizing the
|
||||
gate pattern to SSH content inspection is the analog of building
|
||||
the git-gate for arbitrary SSH operations, which is out of scope.
|
||||
- A determined adversary always wins against a single boundary.
|
||||
The point of stacking the above is to make the cost of the *first*
|
||||
exfil higher than the value of the secrets in scope, and to make
|
||||
*repeated* exfil detectable.
|
||||
|
||||
## Recommended path forward
|
||||
|
||||
In priority order:
|
||||
|
||||
1. **Anthropic-API gate**
|
||||
([`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md)
|
||||
§Recommended). Removes the highest-value secret and closes the
|
||||
passthrough hole as a side effect.
|
||||
2. **Forge-API gate** (same doc, same section — one proxy
|
||||
generalizes across Gitea / GitHub / GitLab by config).
|
||||
3. **Egress data budget** in pipelock — small lift, large damage
|
||||
bound.
|
||||
4. **Pre-launch `.git` gitleaks scan** for `--cwd` bottles.
|
||||
5. **`secret: true` manifest flag** + preflight surfacing.
|
||||
6. **Structured pipelock logs** persisted somewhere durable; offline
|
||||
analysis as a separate follow-up.
|
||||
|
||||
The encoding/splitting attack stops being load-bearing once the
|
||||
bottle doesn't *have* the secret the attacker wants to encode. Every
|
||||
item above is a step toward that property.
|
||||
|
||||
## References
|
||||
|
||||
- `agent-credential-proxy-landscape.md` — gate pattern, kernel
|
||||
boundary mechanics, per-service wiring (Anthropic / Gitea /
|
||||
GitHub / GitLab), proxy topologies, landscape of existing
|
||||
tools (Docker AI Sandboxes, Cloudflare Sandbox Auth,
|
||||
Infisical Agent Vault, nono, Aembit, LiteLLM, Portkey, …)
|
||||
and a build-vs-adopt verdict.
|
||||
- `secret-exfil-tripwire-encodings.md` — inverse problem: detecting
|
||||
a known secret across encodings; explains why generic DLP can't
|
||||
enumerate the encoded forms.
|
||||
- `network-egress-guard.md` — v1 iptables / dnsmasq baseline; the
|
||||
layer below pipelock that catches DNS-subdomain exfil pipelock
|
||||
doesn't see.
|
||||
- `pipelock-assessment.md` — why pipelock was chosen, what its DLP
|
||||
scanners do and don't cover.
|
||||
- `claude-code-token-revocation.md` — Anthropic OAuth refresh
|
||||
story; informs the token-lifetime mitigation.
|
||||
- PRD 0001 — pipelock topology and the `data_budget` reservation.
|
||||
- PRD 0006 — TLS interception; defines the
|
||||
`DEFAULT_TLS_PASSTHROUGH` set this doc proposes to drop.
|
||||
- PRD 0008 — git-gate; the proof-of-concept for the gate pattern.
|
||||
@@ -1,189 +0,0 @@
|
||||
# Isolating the Gitea `tea` token via an auth-injecting proxy
|
||||
|
||||
Research into whether authentication for the `tea` CLI (Gitea's command-line
|
||||
client) can be brokered by a proxy so the access token never enters the
|
||||
container — not as an env var, not in `~/.config/tea/config.yml`. Parallel
|
||||
question to `oauth-token-exposure-to-claude.md`, but for the Gitea credential
|
||||
rather than the Anthropic one.
|
||||
|
||||
## Summary
|
||||
|
||||
Yes. `tea` itself has no credential-helper hook, so the leverage point is on
|
||||
the wire: a root-owned reverse proxy inside the container holds the token and
|
||||
injects `Authorization: token <…>` on every forwarded request. `tea` is
|
||||
configured with `--url` pointing at the proxy and a dummy/empty token; the
|
||||
proxy talks to the real Gitea over TLS. The same kernel boundary that hides
|
||||
the OAuth token in the proxy pattern (a `node`-uid claude cannot read
|
||||
`/proc/<root-pid>/environ` under default Docker) hides the Gitea token here.
|
||||
Git HTTPS push gets the same treatment by rewriting the remote URL to go
|
||||
through the proxy. The unavoidable tradeoff is that a hijacked claude can
|
||||
*use* the token's full scope but cannot *exfiltrate* the bytes — a strict
|
||||
improvement over the env-var status quo, not a panacea.
|
||||
|
||||
## How `tea` authenticates
|
||||
|
||||
`tea` reads the token from three places, in precedence order:
|
||||
|
||||
1. **`GITEA_SERVER_TOKEN` env var** —
|
||||
`cmd/login/add.go` registers it via `cli.EnvVars("GITEA_SERVER_TOKEN")`.
|
||||
2. **`~/.config/tea/config.yml`** (XDG) or `~/.tea/tea.yml` (legacy fallback)
|
||||
— the token is stored in plaintext YAML under the `token` field of the
|
||||
login entry.
|
||||
3. **OS credstore** — only for OAuth logins; PAT-based logins go to the
|
||||
YAML file.
|
||||
|
||||
There is **no `credential.helper` analogue**: no `--token-file`, no
|
||||
FD-passing, no socket-based credential protocol. The only ways to feed `tea`
|
||||
a token are env var or config file, both of which are readable by the
|
||||
process holding them. So the token can't be hidden *inside* `tea`'s
|
||||
process — it has to be held by a *different* process the agent cannot
|
||||
read.
|
||||
|
||||
For HTTPS git operations, `tea` uses `go-git` directly with
|
||||
`BasicAuth{Username: token, Password: ""}` (`modules/git/auth.go`), bypassing
|
||||
git's own credential.helper machinery. This matters: a credential-helper
|
||||
shim alone won't intercept `tea repo clone` — the proxy has to sit on the
|
||||
HTTP path itself.
|
||||
|
||||
## Why a proxy is the only credible boundary
|
||||
|
||||
Same logic as the OAuth note. Linux has no per-env-var ACL: once a var is in
|
||||
a process's `environ`, the process owns it. The lever is process boundary,
|
||||
not env-var ACL. Default Docker enforces that boundary at the kernel level
|
||||
via `ptrace_may_access`: a `node`-uid claude trying to read a root-owned
|
||||
proxy's `/proc/<pid>/environ` gets `EACCES` without `CAP_SYS_PTRACE`,
|
||||
`CAP_PERFMON`, or `--privileged`. claude-bottle uses none of those.
|
||||
|
||||
Gitea's API is friendly to header-injecting proxies: token auth is
|
||||
stateless, no CSRF, no request signing, no per-request nonces. Use the
|
||||
`Authorization: token <…>` form; an old Gitea bug
|
||||
([go-gitea/gitea#16734](https://github.com/go-gitea/gitea/issues/16734))
|
||||
emitted spurious "missing CSRF token" errors for the `Bearer` form on some
|
||||
endpoints. The fix landed upstream, but `token` has always been the
|
||||
header-safe choice.
|
||||
|
||||
## Proxy architectures
|
||||
|
||||
Four shapes worth comparing:
|
||||
|
||||
- **In-container reverse proxy (recommended).** Root-owned process inside
|
||||
the container listens on a non-loopback address (e.g. a Docker bridge IP
|
||||
or an alias). Token is passed via `docker run -e GITEA_TOKEN`, inherited
|
||||
only by the root proxy, never by `node`. `tea login add --url
|
||||
http://<proxy>:<port>` writes a config file whose `token` field is empty
|
||||
or a dummy. Git HTTPS uses a rewritten remote pointing at the same proxy.
|
||||
Pros: simple, self-contained per agent, no host changes, no MITM CA.
|
||||
Cons: requires a non-loopback bind (see below).
|
||||
|
||||
- **In-container forward proxy with TLS termination.** Root-owned mitmproxy
|
||||
intercepts outbound HTTPS, terminates with a container-local CA, injects
|
||||
the header, re-encrypts. `tea` keeps the real Gitea URL and `HTTPS_PROXY`
|
||||
points at the proxy. **Critical Go quirk**: `net/http` ignores
|
||||
`HTTPS_PROXY` when the proxy address is `127.0.0.1` or `localhost`
|
||||
([golang/go#28866](https://github.com/golang/go/issues/28866)). Workaround
|
||||
is the same — bind on a non-loopback address — and you also pay for CA
|
||||
trust setup. Worth it only if you need transparent interception of
|
||||
multiple unrelated hosts.
|
||||
|
||||
- **Host-side proxy.** Proxy runs on macOS; container reaches it via
|
||||
`host.docker.internal:<port>`. The UDS-across-VM constraint already noted
|
||||
in CLAUDE.md (Docker Desktop on macOS does not forward unix-socket
|
||||
`connect()` across the VM) does not apply — TCP via `host.docker.internal`
|
||||
works fine, and the Go loopback bypass isn't an issue because the target
|
||||
is not `127.0.0.1`. Pros: token stays entirely outside the Linux VM.
|
||||
Cons: a host daemon to maintain, and the published port is reachable by
|
||||
any container on the host unless firewalled. This is the architecture
|
||||
Docker's own AI Sandbox product uses.
|
||||
|
||||
- **Sidecar container.** Token-holding container in a shared Docker
|
||||
network. Pros: clean isolation, portable across hosts. Cons: a second
|
||||
container to orchestrate per agent; the token is in another container's
|
||||
env, which is a lateral move rather than a deeper boundary unless the
|
||||
sidecar runs with stricter isolation than the agent container.
|
||||
|
||||
For claude-bottle's threat model — local Docker, per-agent containers,
|
||||
already comfortable with root-owned helpers (the SSH agent precedent) —
|
||||
the in-container reverse proxy is the lowest-friction option that gives
|
||||
the desired property.
|
||||
|
||||
## Caveats and gotchas
|
||||
|
||||
- **Bind on a non-loopback address.** Required for forward-proxy use because
|
||||
of golang/go#28866; harmless for the reverse-proxy case but worth doing
|
||||
consistently so the same proxy works for both shapes. A Docker network
|
||||
alias or `ip addr add 10.0.0.1/32 dev lo` works.
|
||||
- **Use `Authorization: token <…>`, not `Bearer`.** Avoids the legacy
|
||||
CSRF-error path on older Gitea versions.
|
||||
- **`tea` git operations bypass git's credential.helper.** A credential
|
||||
helper shim is not enough; the proxy must sit on the HTTP path. Plain
|
||||
`git push` from claude can use either the proxy (rewritten remote URL) or
|
||||
a credential-helper shim that calls the proxy — the rewritten-remote
|
||||
approach keeps the token bytes out of git's credential negotiation
|
||||
entirely.
|
||||
- **Token scope is the blast radius.** A pass-through proxy grants the
|
||||
agent the token's full API scope. Mitigate with fine-grained Gitea token
|
||||
scopes (`repo:write` only, no `admin`), an HTTP method/path allowlist at
|
||||
the proxy, rate limits, and audit logging. None of these prevent abuse —
|
||||
they bound and observe it.
|
||||
- **No exfil isn't no harm.** A hijacked claude can still push branches,
|
||||
open PRs, and do whatever the token's scope permits. Pair the proxy with
|
||||
the egress-guard work in `network-egress-guard.md` for the full benefit;
|
||||
the two compose cleanly because the proxy is itself an explicit egress
|
||||
endpoint.
|
||||
- **`tea` config file is no longer authoritative.** The launcher must run
|
||||
`tea login add` against the proxy URL (or write a config file directly)
|
||||
before claude starts, otherwise the agent will hit "no logins configured."
|
||||
Empty-token configs are accepted.
|
||||
|
||||
## Prior art
|
||||
|
||||
This is a known pattern with several recent named implementations:
|
||||
|
||||
- **Docker AI Sandboxes** — host-side intercepting proxy that overwrites the
|
||||
auth header; token stays on host, container sees a `proxy-managed`
|
||||
placeholder. Closest analog to what claude-bottle would build.
|
||||
- **Cloudflare Sandbox Auth** — programmable egress with per-sandbox MITM
|
||||
CA for credential injection.
|
||||
- **Infisical agent-vault** — open-source TLS-intercepting forward proxy
|
||||
purpose-built for AI-agent workloads. Research preview as of early 2026.
|
||||
- **AWS IMDSv2** — the canonical credential broker on
|
||||
`169.254.169.254`; same shape, different problem domain.
|
||||
|
||||
## Recommended path forward
|
||||
|
||||
In priority order:
|
||||
|
||||
1. **In-container reverse proxy holding the Gitea token.** Add a manifest
|
||||
field (e.g. `gitea: { url, tokenRef }`) so a per-agent token reference
|
||||
resolves at launch time, the proxy starts as root before `node` is
|
||||
exec'd, and `tea` plus git remotes are pre-configured to point at the
|
||||
proxy. Reuse the same root-owned-helper pattern the SSH agent already
|
||||
establishes.
|
||||
2. **Scope-narrow the Gitea token** at issuance — `repo:write` for the
|
||||
target repo, no `admin`, no user management. This is the cheapest single
|
||||
thing to do and bounds blast radius regardless of whether the proxy
|
||||
ships.
|
||||
3. **Allowlist at the proxy** once usage is stable. Method + path filter
|
||||
keyed off the agent's actual Gitea calls; reject everything else.
|
||||
4. **Compose with `network-egress-guard.md`.** The proxy is one egress
|
||||
endpoint; the egress guard enforces that nothing else escapes.
|
||||
|
||||
The current `docker run -e GITEA_TOKEN`-style pattern is fine for argv
|
||||
hygiene, but inside the container the token is fully exposed to claude.
|
||||
The proxy moves it across a kernel-enforced boundary — same property the
|
||||
SSH agent already gives us for keys.
|
||||
|
||||
## Sources
|
||||
|
||||
- [tea source — `cmd/login/add.go`](https://gitea.com/gitea/tea/src/branch/main/cmd/login/add.go)
|
||||
- [tea source — `modules/config/config.go`](https://gitea.com/gitea/tea/src/branch/main/modules/config/config.go)
|
||||
- [tea source — `modules/git/auth.go`](https://gitea.com/gitea/tea/src/branch/main/modules/git/auth.go)
|
||||
- [Gitea API Usage docs](https://docs.gitea.com/development/api-usage)
|
||||
- [go-gitea/gitea#16734 — `Authorization: Bearer` triggers spurious CSRF error](https://github.com/go-gitea/gitea/issues/16734)
|
||||
- [golang/go#28866 — `net/http` ignores `HTTPS_PROXY` for `127.0.0.1`/`localhost`](https://github.com/golang/go/issues/28866)
|
||||
- [Docker AI Sandbox credentials docs](https://docs.docker.com/ai/sandboxes/security/credentials/)
|
||||
- [Cloudflare Sandbox Auth blog](https://blog.cloudflare.com/sandbox-auth/)
|
||||
- [Infisical agent-vault — GitHub](https://github.com/Infisical/agent-vault)
|
||||
- [Infisical agent-vault — blog post](https://infisical.com/blog/agent-vault-the-open-source-credential-proxy-and-vault-for-agents)
|
||||
- [AWS IMDSv2 documentation](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html)
|
||||
- [git credential helper docs](https://git-scm.com/docs/gitcredentials)
|
||||