diff --git a/docs/research/tea-token-isolation-via-proxy.md b/docs/research/tea-token-isolation-via-proxy.md
new file mode 100644
index 0000000..d565ebb
--- /dev/null
+++ b/docs/research/tea-token-isolation-via-proxy.md
@@ -0,0 +1,189 @@
+# Isolating the Gitea `tea` token via an auth-injecting proxy
+
+Research into whether authentication for the `tea` CLI (Gitea's command-line
+client) can be brokered by a proxy so the access token is never readable by
+the agent in the container: not as an env var, not in
+`~/.config/tea/config.yml`. Parallel question to
+`oauth-token-exposure-to-claude.md`, but for the Gitea credential rather
+than the Anthropic one.
+
+## Summary
+
+Yes. `tea` itself has no credential-helper hook, so the leverage point is on
+the wire: a root-owned reverse proxy inside the container holds the token and
+injects `Authorization: token <…>` on every forwarded request. `tea` is
+configured with `--url` pointing at the proxy and a dummy/empty token; the
+proxy talks to the real Gitea over TLS. The same kernel boundary that hides
+the OAuth token in the proxy pattern (a `node`-uid claude cannot read the
+root-owned proxy's `/proc/<pid>/environ` under default Docker) hides the
+Gitea token here. Git HTTPS push gets the same treatment by rewriting the
+remote URL to go through the proxy. The unavoidable tradeoff is that a
+hijacked claude can *use* the token's full scope but cannot *exfiltrate* the
+bytes: a strict improvement over the env-var status quo, not a panacea.
+
+## How `tea` authenticates
+
+`tea` reads the token from three places, in precedence order:
+
+1. **`GITEA_SERVER_TOKEN` env var**:
+   `cmd/login/add.go` registers it via `cli.EnvVars("GITEA_SERVER_TOKEN")`.
+2. **`~/.config/tea/config.yml`** (XDG) or `~/.tea/tea.yml` (legacy fallback):
+   the token is stored in plaintext YAML under the `token` field of the
+   login entry.
+3. **OS credstore**: only for OAuth logins; PAT-based logins go to the
+   YAML file.
+
+There is **no `credential.helper` analogue**: no `--token-file`, no
+FD-passing, no socket-based credential protocol. The only ways to feed `tea`
+a token are env var or config file, and both are readable by `tea` and by
+anything else running as the same user. So the token can't be hidden
+*inside* `tea`'s process; it has to be held by a *different* process the
+agent cannot read.
+
+For HTTPS git operations, `tea` uses `go-git` directly with
+`BasicAuth{Username: token, Password: ""}` (`modules/git/auth.go`), bypassing
+git's own credential.helper machinery. This matters: a credential-helper
+shim alone won't intercept `tea repo clone`, so the proxy has to sit on the
+HTTP path itself.
+
+## Why a proxy is the only credible boundary
+
+Same logic as the OAuth note. Linux has no per-env-var ACL: once a var is in
+a process's `environ`, the process owns it. The lever is the process
+boundary, not an env-var ACL. Default Docker enforces that boundary at the
+kernel level via `ptrace_may_access`: a `node`-uid claude trying to read a
+root-owned proxy's `/proc/<pid>/environ` gets `EACCES` without
+`CAP_SYS_PTRACE`, `CAP_PERFMON`, or `--privileged`. claude-bottle uses none
+of those.
+
+Gitea's API is friendly to header-injecting proxies: token auth is
+stateless, no CSRF, no request signing, no per-request nonces. Use the
+`Authorization: token <…>` form; an old Gitea bug
+([go-gitea/gitea#16734](https://github.com/go-gitea/gitea/issues/16734))
+emitted spurious "missing CSRF token" errors for the `Bearer` form on some
+endpoints. The fix landed upstream, but `token` has always been the
+header-safe choice.
+
+## Proxy architectures
+
+Four shapes worth comparing:
+
+- **In-container reverse proxy (recommended).** Root-owned process inside
+  the container listens on a non-loopback address (e.g. a Docker bridge IP
+  or an alias). Token is passed via `docker run -e GITEA_TOKEN`, inherited
+  only by the root proxy, never by `node`.
+  `tea login add --url http://<proxy-host>:<port>` writes a config file
+  whose `token` field is empty or a dummy. Git HTTPS uses a rewritten remote
+  pointing at the same proxy.
+  Pros: simple, self-contained per agent, no host changes, no MITM CA.
+  Cons: needs an extra address to bind if you follow the non-loopback
+  convention (see below).
+
+- **In-container forward proxy with TLS termination.** Root-owned mitmproxy
+  intercepts outbound HTTPS, terminates with a container-local CA, injects
+  the header, re-encrypts. `tea` keeps the real Gitea URL and `HTTPS_PROXY`
+  points at the proxy. **Go quirk to note**: `net/http`'s
+  `ProxyFromEnvironment` skips the proxy whenever the request's destination
+  is `localhost` or a loopback address
+  ([golang/go#28866](https://github.com/golang/go/issues/28866)). The check
+  is on the destination, not on the address the proxy binds, so a
+  loopback-bound forward proxy still sees traffic to a remote Gitea; only
+  requests aimed at a loopback target bypass it. You also pay for CA trust
+  setup. Worth it only if you need transparent interception of multiple
+  unrelated hosts.
+
+- **Host-side proxy.** Proxy runs on macOS; container reaches it via
+  `host.docker.internal:<port>`. The UDS-across-VM constraint already noted
+  in CLAUDE.md (Docker Desktop on macOS does not forward unix-socket
+  `connect()` across the VM) does not apply: TCP via `host.docker.internal`
+  works fine, and the Go loopback bypass isn't an issue because the target
+  is not `127.0.0.1`. Pros: token stays entirely outside the Linux VM.
+  Cons: a host daemon to maintain, and the published port is reachable by
+  any container on the host unless firewalled. This is the architecture
+  Docker's own AI Sandbox product uses.
+
+- **Sidecar container.** Token-holding container in a shared Docker
+  network. Pros: clean isolation, portable across hosts. Cons: a second
+  container to orchestrate per agent; the token is in another container's
+  env, which is a lateral move rather than a deeper boundary unless the
+  sidecar runs with stricter isolation than the agent container.
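The header-injection step shared by all four shapes can be sketched in a few lines. This is a minimal illustration, not an implementation from the repo: the function name `build_upstream_request`, the header filter, and the example hosts are assumptions made for the sketch. It shows only the rewrite a proxy would apply to each request before forwarding it to the real Gitea.

```python
from typing import Dict, Mapping, Tuple

# Headers that describe a single hop and should not be forwarded verbatim.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
})


def build_upstream_request(
    path: str,
    client_headers: Mapping[str, str],
    upstream_base: str,
    token: str,
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for the request the proxy sends to the real Gitea.

    Whatever Authorization value the client sent (a dummy, or nothing) is
    dropped and replaced with the real token, which only the proxy holds.
    """
    headers: Dict[str, str] = {}
    for name, value in client_headers.items():
        lowered = name.lower()
        if lowered in HOP_BY_HOP or lowered in ("authorization", "host"):
            continue  # never forward the client's credentials or Host header
        headers[name] = value
    # The `token` form, as recommended in the note above.
    headers["Authorization"] = f"token {token}"
    url = upstream_base.rstrip("/") + "/" + path.lstrip("/")
    return url, headers
```

A real proxy would call something like this from its request handler, forward the method and body unchanged, and relay the upstream response; those parts are omitted here.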
+
+For claude-bottle's threat model (local Docker, per-agent containers,
+already comfortable with root-owned helpers following the SSH agent
+precedent), the in-container reverse proxy is the lowest-friction option
+that gives the desired property.
+
+## Caveats and gotchas
+
+- **Bind address and the Go loopback special case.** Go's `net/http` never
+  routes a request through an env-configured proxy when the request's
+  destination is `localhost` or a loopback address (golang/go#28866). For
+  the reverse-proxy case this is harmless: `tea` connects straight to a
+  loopback-bound proxy even if `HTTPS_PROXY` is set. A non-loopback bind is
+  therefore a convention rather than a requirement, but worth doing
+  consistently so the same proxy is reached the same way in both shapes. A
+  Docker network alias or `ip addr add 10.0.0.1/32 dev lo` works.
+- **Use `Authorization: token <…>`, not `Bearer`.** Avoids the legacy
+  CSRF-error path on older Gitea versions.
+- **`tea` git operations bypass git's credential.helper.** A credential
+  helper shim is not enough; the proxy must sit on the HTTP path. Plain
+  `git push` from claude can use either the proxy (rewritten remote URL) or
+  a credential-helper shim that calls the proxy. The rewritten-remote
+  approach keeps the token bytes out of git's credential negotiation
+  entirely.
+- **Token scope is the blast radius.** A pass-through proxy grants the
+  agent the token's full API scope. Mitigate with fine-grained Gitea token
+  scopes (`write:repository` only in current Gitea scope naming, no
+  `admin`), an HTTP method/path allowlist at the proxy, rate limits, and
+  audit logging. None of these prevent abuse; they bound and observe it.
+- **No exfil isn't no harm.** A hijacked claude can still push branches,
+  open PRs, and do whatever the token's scope permits. Pair the proxy with
+  the egress-guard work in `network-egress-guard.md` for the full benefit;
+  the two compose cleanly because the proxy is itself an explicit egress
+  endpoint.
+- **`tea` config file is no longer authoritative.** The launcher must run
+  `tea login add` against the proxy URL (or write a config file directly)
+  before claude starts, otherwise the agent will hit "no logins configured."
+  Empty-token configs are accepted.
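The rewritten-remote approach mentioned in the caveats can be illustrated with a small URL mapping. This is a sketch under stated assumptions: the helper name `rewrite_remote`, the example hosts, and the premise that Gitea is served at the root of its host are all hypothetical, not taken from the repo.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_remote(remote_url: str, gitea_base: str, proxy_base: str) -> str:
    """Point an HTTP(S) remote on the Gitea host at the proxy instead.

    Remotes on other hosts, and non-HTTP remotes such as SSH, are returned
    unchanged. Any user:password@ prefix in the original URL is dropped.
    """
    remote = urlsplit(remote_url)
    gitea = urlsplit(gitea_base)
    proxy = urlsplit(proxy_base)
    if remote.scheme not in ("http", "https") or remote.hostname != gitea.hostname:
        return remote_url
    # Using the proxy's netloc (not the remote's) discards embedded credentials.
    return urlunsplit((proxy.scheme, proxy.netloc, remote.path, remote.query, ""))
```

A launcher could apply the result with `git remote set-url`, or get the same effect for every remote at once through git's `url.<base>.insteadOf` configuration, which rewrites matching URL prefixes at fetch and push time.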
+
+## Prior art
+
+This is a known pattern with several recent named implementations:
+
+- **Docker AI Sandboxes**: host-side intercepting proxy that overwrites the
+  auth header; token stays on host, container sees a `proxy-managed`
+  placeholder. Closest analog to what claude-bottle would build.
+- **Cloudflare Sandbox Auth**: programmable egress with per-sandbox MITM
+  CA for credential injection.
+- **Infisical agent-vault**: open-source TLS-intercepting forward proxy
+  purpose-built for AI-agent workloads. Research preview as of early 2026.
+- **AWS IMDSv2**: the canonical credential broker on
+  `169.254.169.254`; same shape, different problem domain.
+
+## Recommended path forward
+
+In priority order:
+
+1. **In-container reverse proxy holding the Gitea token.** Add a manifest
+   field (e.g. `gitea: { url, tokenRef }`) so a per-agent token reference
+   resolves at launch time, the proxy starts as root before `node` is
+   exec'd, and `tea` plus git remotes are pre-configured to point at the
+   proxy. Reuse the same root-owned-helper pattern the SSH agent already
+   establishes.
+2. **Scope-narrow the Gitea token** at issuance: `write:repository` (in
+   current Gitea scope naming) for the target repo, no `admin`, no user
+   management. This is the cheapest single thing to do and bounds blast
+   radius regardless of whether the proxy ships.
+3. **Allowlist at the proxy** once usage is stable. Method + path filter
+   keyed off the agent's actual Gitea calls; reject everything else.
+4. **Compose with `network-egress-guard.md`.** The proxy is one egress
+   endpoint; the egress guard enforces that nothing else escapes.
+
+The current `docker run -e GITEA_TOKEN`-style pattern is fine for argv
+hygiene, but inside the container the token is fully exposed to claude.
+The proxy moves it across a kernel-enforced boundary, the same property the
+SSH agent already gives us for keys.
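The method + path allowlist from step 3 could be as small as the sketch below. The function names and the three example rules are illustrative assumptions; a real rule set would be derived from the agent's observed Gitea calls, as the step says.

```python
import re
from typing import Iterable, List, Tuple

Rule = Tuple[str, re.Pattern]


def compile_rules(rules: Iterable[Tuple[str, str]]) -> List[Rule]:
    """Turn (method, path-regex) pairs into normalized, compiled rules."""
    return [(method.upper(), re.compile(pattern)) for method, pattern in rules]


def is_allowed(method: str, path: str, rules: Iterable[Rule]) -> bool:
    """Allow a request only if its method and full path match some rule."""
    path_only = path.split("?", 1)[0]  # the query string is not matched
    if ".." in path_only.split("/"):
        return False  # refuse dot-dot segments instead of normalizing them
    # Note: percent-encoded segments are not decoded in this sketch.
    return any(
        rule_method == method.upper() and pattern.fullmatch(path_only) is not None
        for rule_method, pattern in rules
    )


# Hypothetical rule set: read the current user, list and open pull requests.
EXAMPLE_RULES = compile_rules([
    ("GET", r"/api/v1/user"),
    ("GET", r"/api/v1/repos/[^/]+/[^/]+/pulls"),
    ("POST", r"/api/v1/repos/[^/]+/[^/]+/pulls"),
])
```

Matching on the full path with a default-deny result keeps the filter aligned with "reject everything else"; anything not listed, including new endpoints, fails closed.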
+
+## Sources
+
+- [tea source: `cmd/login/add.go`](https://gitea.com/gitea/tea/src/branch/main/cmd/login/add.go)
+- [tea source: `modules/config/config.go`](https://gitea.com/gitea/tea/src/branch/main/modules/config/config.go)
+- [tea source: `modules/git/auth.go`](https://gitea.com/gitea/tea/src/branch/main/modules/git/auth.go)
+- [Gitea API Usage docs](https://docs.gitea.com/development/api-usage)
+- [go-gitea/gitea#16734: `Authorization: Bearer` triggers spurious CSRF error](https://github.com/go-gitea/gitea/issues/16734)
+- [golang/go#28866: `net/http` ignores `HTTPS_PROXY` for requests to `127.0.0.1`/`localhost`](https://github.com/golang/go/issues/28866)
+- [Docker AI Sandbox credentials docs](https://docs.docker.com/ai/sandboxes/security/credentials/)
+- [Cloudflare Sandbox Auth blog](https://blog.cloudflare.com/sandbox-auth/)
+- [Infisical agent-vault: GitHub](https://github.com/Infisical/agent-vault)
+- [Infisical agent-vault: blog post](https://infisical.com/blog/agent-vault-the-open-source-credential-proxy-and-vault-for-agents)
+- [AWS IMDSv2 documentation](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html)
+- [git credential helper docs](https://git-scm.com/docs/gitcredentials)