Consolidates oauth-token-exposure-to-claude.md and tea-token-isolation-via-proxy.md into agent-credential-proxy-landscape.md, adding a May-2026 survey of existing tools (Docker AI Sandboxes, Cloudflare Sandbox Auth, Infisical Agent Vault, nono, Aembit, LiteLLM CVE-2026-42208, Portkey, Helicone, etc.) and a build-vs-adopt verdict. Adds secret-minimization-over-dlp.md explaining why pipelock's body DLP and gitleaks's pre-receive scan cannot stop encoding/splitting exfil, and why moving credentials out of the bottle (the git-gate pattern, generalized) is the only robust answer. Updates git-secret-scanning-hardening.md's reference to point at the new consolidated landscape doc. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Secret minimization beats DLP for the encoding/splitting attack class
Why pipelock's body DLP and gitleaks's pre-receive scan cannot stop a
determined exfil attempt, and what architectural shifts actually
bound the damage. Companion to
secret-exfil-tripwire-encodings.md
(which covers the inverse — detecting a known secret across encodings)
and to
agent-credential-proxy-landscape.md
(which establishes the gate / auth-injecting-proxy pattern, the
per-service mechanics for Anthropic / Gitea / GitHub, and the
existing tools in that space).
Summary
claude-bottle's v1 egress story is: pipelock allowlists hostnames,
intercepts TLS, body-scans every request against 48 builtin DLP
patterns, and blocks on hit. Gitleaks does the analog on git push.
Both are signature-based. Against a determined compromised or
prompt-injected agent, both lose: base64, hex, gzip, splitting across
requests, and custom encodings all evade signature matching, and one
exfil channel (the Anthropic API conversation body) is deliberately
not scanned at all because pipelock's BIP-39 detector false-positives
on ordinary English prose.
The fix is not "scan harder." It is the same shift that the git-gate already makes for upstream push credentials: keep the secret out of the bottle in the first place. The agent talks to a per-bottle gate; the gate holds the credential and uses it on the agent's behalf. Whatever the agent can encode and exfil, it cannot encode bytes it never had.
The encoding/splitting attack is fundamentally unbounded for any secret the bottle holds, but it stops mattering once the bottle holds fewer secrets.
Why egress scanning is a losing position
Three independent failure modes, each sufficient on its own:
1. Encoding evades signature match
The 48 builtin pipelock patterns (and gitleaks's analog set) recognize
known credential shapes: ghp_…, sk_live_…, AWS access-key prefixes,
PEM block markers, etc. The same bytes under any of:
- base64 / base64url / base32 / hex / ascii-85
- gzip + base64
- per-char numeric encoding
- ROT13, custom alphabet substitution
- splitting across request boundaries (first 20 chars in request A, last 20 in request B)
- chunking via path segments instead of body bytes
- DNS subdomain encoding (out-of-scope for pipelock entirely; only
the v1 dnsmasq / iptables layer in
network-egress-guard.md would catch it)
…do not match any pipelock or gitleaks pattern, because the patterns
are byte-literal on the canonical form. secret-exfil-tripwire-encodings.md
documents the detection side of this gap: no off-the-shelf tool emits
N encoded forms of a known secret for downstream matching, and even
the multi-encoding decoders in gitleaks 8.20+ / TruffleHog only undo
one layer.
A motivated attacker controls the encoding. A static signature set cannot win that race.
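The gap can be illustrated with a minimal sketch. The regex below is an illustrative GitHub-PAT-shaped signature, not an actual pipelock or gitleaks rule, and `FAKE_TOKEN`, `scan`, and `encoded_forms` are hypothetical names introduced for this example. It assumes a scanner that matches each payload independently against the canonical byte form.

```python
import base64
import codecs
import re

# Illustrative GitHub-PAT-shaped signature; not an actual pipelock or gitleaks rule.
GHP_PATTERN = re.compile(r"ghp_[A-Za-z0-9]{36}")

# Fake value with the right shape (prefix + 36 alphanumerics); not a live credential.
FAKE_TOKEN = "ghp_" + "a1B2c3D4e5F6g7H8i9J0k1L2m3N4o5P6q7R8"


def scan(payloads):
    """Return True if any single payload matches the signature."""
    return any(GHP_PATTERN.search(p) for p in payloads)


def encoded_forms(secret):
    """A few of the transformations an agent could apply before sending."""
    raw = secret.encode()
    half = len(secret) // 2
    return {
        "canonical": [secret],
        "base64": [base64.b64encode(raw).decode()],
        "hex": [raw.hex()],
        "rot13": [codecs.encode(secret, "rot13")],
        "split": [secret[:half], secret[half:]],  # two separate requests
    }


if __name__ == "__main__":
    for name, payloads in encoded_forms(FAKE_TOKEN).items():
        print(f"{name:10s} detected={scan(payloads)}")
```

Only the canonical form is detected. Adding a decoder for each listed encoding would close these particular rows, but the attacker chooses the next encoding, so the list never ends.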
2. The api.anthropic.com passthrough is unscanned by design
pipelock.py:DEFAULT_TLS_PASSTHROUGH exempts api.anthropic.com
from TLS interception because pipelock's BIP-39 seed-phrase detector
false-positives on any 12+ consecutive English words from the BIP-39
wordlist that happen to checksum cleanly — i.e., on a lot of normal
prose. The recommended treatment per pipelock's own docs is
passthrough_domains: the CONNECT host gate still applies, but body
is not decrypted.
The agent's conversation channel is therefore wide open as an exfil
path. A prompt-injected agent that has been told a secret can ship
it to Anthropic as conversation text, formatted however it likes,
and pipelock sees only CONNECT api.anthropic.com:443. The
CLAUDE_BOTTLE_OAUTH_TOKEN itself rides this exact path.
3. Out-of-band channels exist regardless
Even with body DLP fully on, the v1 design knowingly leaves several
non-HTTP egress paths unscanned (called out as non-goals in PRD 0001
§Non-goals): SSH session content via ssh-gate, raw TCP, UDP, ICMP,
TLS-SNI domain fronting on allowlisted hosts. The ssh-gate is dumb
L4 socat — scp, -L/-R/-D port forwards, and arbitrary
remote-shell content are all opaque to it. Any one of these is
sufficient to ship a secret the agent holds.
The shift: stop holding the secret
The README already states the operative principle:
The container itself adds a layer between the agent and the host, but the v1 design leans more on secret minimization and egress allowlisting than on the container as a hardened boundary.
The encoding/splitting class is the residual that egress allowlisting cannot solve and that secret minimization can. The git-gate is the proof-of-concept: the agent's effective capability is "push to upstream," not "hold an SSH key." Moving the key from the agent into the gate left the capability intact and the credential bytes unreachable to the agent. The same move applies to anything where the agent's actual need is perform operation X against service Y, not hold a credential.
agent-credential-proxy-landscape.md
covers the gate pattern in depth: kernel-boundary mechanics, the
per-service wiring for CLAUDE_CODE_OAUTH_TOKEN / Gitea PAT /
GitHub PAT, the four proxy topologies, and the landscape of
existing tools. What follows is the same idea generalized to the
full secret set in a bottle.
Concrete mitigations, ordered by leverage
1. Generalize the gate pattern to API credentials (highest leverage)
For every credential the agent currently holds, ask: does the agent need the bytes, or does it need to perform an operation the credential authorizes? If the latter, the credential moves into a per-bottle gate that:
- Holds the secret in a process the agent cannot read (kernel-enforced cross-UID /proc/<pid>/environ deny on Docker; separate VM on smolmachines).
- Exposes a narrow surface (a localhost URL, or a service name on the internal network).
- Optionally enforces a method/path allowlist so the credential's full scope isn't blanket-granted.
Two concrete instances worth implementing:
Anthropic-API gate. Holds CLAUDE_BOTTLE_OAUTH_TOKEN. Agent's
ANTHROPIC_BASE_URL points at the gate; gate injects
Authorization: Bearer … and forwards to api.anthropic.com. The
token is no longer in the bottle's env. Once the token is out,
DEFAULT_TLS_PASSTHROUGH for api.anthropic.com can be dropped and
pipelock can body-scan the conversation channel with the noisy
patterns (BIP-39) disabled and only high-confidence ones (long
high-entropy strings, known token formats) left on. The known
gotchas — SSE streaming, header passthrough for anthropic-version
/ anthropic-beta / X-Claude-Code-Session-Id, out-of-band
telemetry to statsig.anthropic.com — are documented in
agent-credential-proxy-landscape.md
§Anthropic / Claude Code.
Forge-API gate (Gitea / GitHub / GitLab). Holds the PAT;
exposes a narrow REST surface. Token auth on all three is
stateless Authorization-header injection — no CSRF, no request
signing, no per-request nonce — so one proxy generalizes by
config. Per-service mechanics (Gitea uses
Authorization: token <…> not Bearer; tea bypasses git's
credential.helper; etc.) are in
agent-credential-proxy-landscape.md
§Gitea / §GitHub / GitLab.
Both compose with the existing git-gate (which addresses the push credential) rather than replacing it.
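The core of both gates is the same small transformation: check the request against an allowlist, discard any auth header the agent supplied, and inject the real credential. A minimal sketch follows, assuming a hypothetical `GateConfig` shape, placeholder hosts, and placeholder allowlists; none of these names come from claude-bottle, and the real Anthropic surface has streaming and header gotchas this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateConfig:
    upstream: str        # base URL the gate forwards to
    auth_scheme: str     # "Bearer", or "token" for Gitea
    allowed: tuple       # (method, path-prefix) pairs the agent may use


# Hypothetical per-service config; hosts and allowlists are placeholders.
SERVICES = {
    "anthropic": GateConfig(
        "https://api.anthropic.com", "Bearer", (("POST", "/v1/messages"),)
    ),
    "gitea": GateConfig(
        "https://gitea.example.com", "token",
        (("GET", "/api/v1/repos/"), ("POST", "/api/v1/repos/")),
    ),
}

# Headers the agent must never be able to set on the upstream request.
STRIPPED = {"authorization", "x-api-key", "host"}


class Denied(Exception):
    """Raised when a request falls outside the service's allowlist."""


def build_upstream_request(cfg, secret, method, path, agent_headers):
    """Return (url, headers) for the upstream call, with the credential injected."""
    if ".." in path.split("/"):
        raise Denied("path traversal")
    if not any(method == m and path.startswith(p) for m, p in cfg.allowed):
        raise Denied(f"{method} {path} not allowed")
    # Drop agent-supplied auth headers; pass everything else through unchanged.
    headers = {k: v for k, v in agent_headers.items() if k.lower() not in STRIPPED}
    headers["Authorization"] = f"{cfg.auth_scheme} {secret}"
    return cfg.upstream + path, headers
```

The secret is an argument supplied by the gate process, never by the agent, and headers such as anthropic-version pass through untouched. Switching services is a config change, which is the sense in which one proxy generalizes.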
2. Stop passthrough'ing api.anthropic.com
A direct consequence of #1: once the token is out of the bottle, pipelock can decrypt and inspect the conversation channel. Replace the broad BIP-39 detector with a tighter ruleset — high-entropy strings ≥40 chars, known credential format prefixes, long base64 blobs — and accept the residual false-positive rate. The unscanned channel was the largest single exfil hole; closing it is worth a modest reduction in DLP precision.
If the gate work is deferred, an interim option is to body-scan even
with the token in place: the token will be in every Anthropic
request's Authorization header anyway, and pipelock's header DLP is
already on. The risk is false-positive blocks on legitimate
conversation prose, so make this opt-in initially.
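A tighter ruleset of the kind described above could look roughly like the following sketch. The 40-character minimum comes from the text; the entropy threshold, the character class, and the prefix list are assumptions chosen for illustration, and none of this is pipelock's actual rule syntax.

```python
import math
import re
from collections import Counter

# All thresholds and patterns here are assumptions chosen for illustration.
MIN_RUN = 40
ENTROPY_BITS_PER_CHAR = 4.0
TOKEN_RUN = re.compile(r"[A-Za-z0-9+/=_-]{%d,}" % MIN_RUN)
KNOWN_PREFIX = re.compile(r"(?:ghp_|sk_live_|AKIA)[A-Za-z0-9]{16,}")


def shannon_entropy(s):
    """Shannon entropy of the string, in bits per character."""
    counts = Counter(s)
    return -sum(c / len(s) * math.log2(c / len(s)) for c in counts.values())


def findings(body):
    """Return (rule, matched_text) pairs for one request body."""
    hits = [("known-prefix", m.group()) for m in KNOWN_PREFIX.finditer(body)]
    for m in TOKEN_RUN.finditer(body):
        if shannon_entropy(m.group()) >= ENTROPY_BITS_PER_CHAR:
            hits.append(("high-entropy", m.group()))
    return hits
```

Ordinary prose contains no unbroken 40-character run, so it produces no findings, which is the property the BIP-39 detector lacks. The same limits from the encoding discussion still apply: a secret split into short chunks or spelled out as words passes this ruleset too.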
3. Egress data budget
PRD 0001 §Resolution rules reserves data_budget as a future
field under the existing egress block. Set the cap small (e.g. 1 MiB
outbound per session to non-Anthropic hosts). This doesn't stop
one-secret, one-time exfil (a 4 KiB token still fits), but:
- Bulk exfil (.git history dumps, full credential stores, model weights, large datasets) becomes visibly noisy or impossible.
- Repeated small exfil attempts trip the budget and produce a log line / alert.
- Forces the attacker to be efficient; efficient channels are easier to fingerprint after the fact.
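The accounting behind such a budget is small. The sketch below assumes a per-session counter shared across all non-exempt hosts; `EgressBudget` and its field names are hypothetical, since PRD 0001 only reserves the `data_budget` field and does not define its semantics here.

```python
from dataclasses import dataclass, field


@dataclass
class EgressBudget:
    """Per-session outbound byte budget, shared across non-exempt hosts."""
    limit_bytes: int = 1 * 1024 * 1024               # 1 MiB, as in the example above
    exempt_hosts: frozenset = frozenset({"api.anthropic.com"})
    spent: dict = field(default_factory=dict)        # host -> bytes sent so far

    def total(self):
        return sum(self.spent.values())

    def charge(self, host, nbytes):
        """Record an outbound request; return False if it would exceed the budget."""
        if host in self.exempt_hosts:
            return True
        if self.total() + nbytes > self.limit_bytes:
            return False  # caller should block and emit a log line / alert
        self.spent[host] = self.spent.get(host, 0) + nbytes
        return True
```

A shared total rather than a per-host limit is a deliberate choice in this sketch: it prevents spreading a bulk dump across several allowlisted hosts to stay under each one's cap.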
4. Pre-clean .git history when --cwd is used
provision/git.py:_provision_cwd_git copies the host cwd's .git
into /home/node/workspace/.git wholesale. Git history contains
every secret ever committed, even if removed from HEAD. The agent
reads them with git log --all -p — a free, unscanned exfil
channel via any allowed host.
Add a provision-time gitleaks scan of the host .git before copy,
and refuse to launch (or quarantine the offending commits into a
read-only orphan branch view) on hit. Doesn't help the legitimate
case where the user wants to debug a leaked-secret commit, but
gates the common case where the user is unaware of historical
secrets in the repo. Pairs with the push-side gitleaks already in
the git-gate.
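A provision-time check along these lines could be sketched as follows. It assumes gitleaks v8 is on the host PATH and uses the `detect` subcommand with `--source`, `--no-banner`, `--redact`, and `--exit-code`; newer gitleaks releases prefer a `git` subcommand, so the exact invocation should be checked against the installed version. `history_is_clean` and the injectable `run` parameter are hypothetical names, not part of provision/git.py.

```python
import subprocess
from pathlib import Path


def gitleaks_command(repo: Path) -> list[str]:
    """Build the scan command; flags assume gitleaks v8's `detect` subcommand."""
    return [
        "gitleaks", "detect",
        "--source", str(repo),
        "--no-banner",
        "--redact",          # keep secret values out of provisioning logs
        "--exit-code", "1",  # exit 1 when leaks are found
    ]


def history_is_clean(repo: Path, run=subprocess.run) -> bool:
    """Return True if no leaks were found in the repo's full history.

    `run` is injectable so the decision logic can be exercised without gitleaks.
    """
    result = run(gitleaks_command(repo), capture_output=True, text=True)
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Any other code means the scan itself failed; do not treat that as clean.
    raise RuntimeError(f"gitleaks failed: {result.stderr.strip()}")
```

Treating a scanner failure as an error rather than as a clean result keeps the check fail-closed: a missing or broken gitleaks binary should stop the copy, not silently skip the scan.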
5. Manifest hygiene: explicit secret tagging
Today bottle.env accepts any entry. A ?prompt-at-runtime or
${HOST_VAR} interpolation has no special status vs. a literal —
the manifest can't tell which entries are credential material and
which are configuration.
Add a secret: true flag (or a secrets: sibling of env:) that:
- Surfaces those entries prominently in the y/N preflight ("this bottle carries 3 secret values: GITEA_TOKEN, GH_PAT, AWS_SECRET_ACCESS_KEY").
- Refuses to launch if egress.allowlist contains any host that is not source-controlled by the user (heuristic: not on a built-in KNOWN_FORGE_HOSTS list).
- Forces an explicit acknowledgement that a credential is being placed into the bottle rather than behind a gate.
This pushes any new credential type toward a gate, rather than
toward bottle.env.
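The preflight logic could be sketched as below. The manifest shape is an assumption: it treats `env` entries as either plain strings or dicts carrying a `secret: true` flag, and the contents of `KNOWN_FORGE_HOSTS` are placeholders. Only the `env`, `egress.allowlist`, and `secret` names come from the text above.

```python
# Placeholder contents; the real built-in list is not specified in this doc.
KNOWN_FORGE_HOSTS = frozenset({"github.com", "api.github.com", "gitlab.com"})


def preflight(manifest: dict) -> tuple[list[str], list[str]]:
    """Return (secret entry names, allowlisted hosts that block launch)."""
    env = manifest.get("env", {})
    secrets = sorted(
        name for name, entry in env.items()
        if isinstance(entry, dict) and entry.get("secret")
    )
    allowlist = manifest.get("egress", {}).get("allowlist", [])
    # The host heuristic only applies when the bottle actually carries secrets.
    blocking = [h for h in allowlist if h not in KNOWN_FORGE_HOSTS] if secrets else []
    return secrets, blocking


def summary(secrets: list[str]) -> str:
    """Line to surface in the y/N preflight prompt."""
    return f"this bottle carries {len(secrets)} secret values: {', '.join(secrets)}"
```

A launcher would refuse to start when `blocking` is non-empty and otherwise print `summary(secrets)` and require an explicit acknowledgement before continuing.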
6. Detection over prevention for the residual
For everything that remains after the above — and a residual will always remain — log every outbound request to allowlisted hosts in structured form: timestamp, host, path, byte count, content hash, request shape. Don't block; alert. Offline analysis surfaces per-host byte-count outliers, entropy histograms, base64-shaped path/body distributions, and repeat patterns.
This doesn't catch the first careful exfil. It does catch repeated abuse, scaled exfil, and the second-attempt-after-discovery, and provides forensic trail for the first incident. Pipelock's stdout-only logging today (PRD 0001 §Out of scope: "Audit logging or persistent log storage of pipelock decisions") is the precursor; this would be its persistent / structured follow-up.
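As a rough sketch of the two halves, the first function below emits one structured record per outbound request and the second performs one of the offline analyses mentioned above (per-host byte-count outliers). Field names, the median-based rule, and the factor of 10 are assumptions for illustration; pipelock's actual log format is not described in this doc.

```python
import hashlib
import json
import statistics
import time
from collections import defaultdict


def log_record(host, path, body, now=None):
    """Serialize one outbound request as a JSON log line (body is bytes)."""
    record = {
        "ts": time.time() if now is None else now,
        "host": host,
        "path": path,
        "bytes": len(body),
        "sha256": hashlib.sha256(body).hexdigest(),  # content hash, not content
    }
    return json.dumps(record, sort_keys=True)


def byte_outliers(records, factor=10.0):
    """Return records whose size exceeds factor x the median for their host."""
    by_host = defaultdict(list)
    for r in records:
        by_host[r["host"]].append(r["bytes"])
    medians = {h: statistics.median(sizes) for h, sizes in by_host.items()}
    return [r for r in records if r["bytes"] > factor * medians[r["host"]]]
```

Logging a hash instead of the body keeps the audit trail from becoming a second copy of whatever was exfiltrated, while still letting repeated identical payloads be correlated.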
7. Shorten OAuth token lifetime
The OAuth token issued by claude setup-token is valid for ~1 year. If
it leaks, the blast radius is up to a year of access to the user's
Pro/Max subscription. Even without the gate (#1), encouraging users to
rotate monthly bounds the damage. With the gate, the gate can in
principle negotiate fresh short-lived tokens per session (depends
on whether Anthropic ever ships an OAuth refresh / child-token flow
— closed today per claude-code-token-revocation.md).
What you don't get
Even with all of the above:
- An agent that can talk to any LLM can use the conversation as a covert channel — encode bits in word choice, response length, formatting. Pipelock can scan for known patterns but cannot stop arbitrary content over a channel whose purpose is arbitrary content.
- An agent with shell access to any remote host via ssh-gate can exfil through the SSH session. The ssh-gate is L4 by design (PRD 0007 §Non-goals: "SSH-protocol awareness"); generalizing the gate pattern to SSH content inspection is the analog of building the git-gate for arbitrary SSH operations, which is out of scope.
- A determined adversary always wins against a single boundary. The point of stacking the above is to make the cost of the first exfil higher than the value of the secrets in scope, and to make repeated exfil detectable.
Recommended path forward
In priority order:
- Anthropic-API gate (agent-credential-proxy-landscape.md §Recommended). Removes the highest-value secret and closes the passthrough hole as a side effect.
- Forge-API gate (same doc, same section; one proxy generalizes across Gitea / GitHub / GitLab by config).
- Egress data budget in pipelock — small lift, large damage bound.
- Pre-launch .git gitleaks scan for --cwd bottles.
- secret: true manifest flag + preflight surfacing.
- Structured pipelock logs persisted somewhere durable; offline analysis as a separate follow-up.
The encoding/splitting attack stops being load-bearing once the bottle doesn't have the secret the attacker wants to encode. Every item above is a step toward that property.
References
- agent-credential-proxy-landscape.md: gate pattern, kernel boundary mechanics, per-service wiring (Anthropic / Gitea / GitHub / GitLab), proxy topologies, landscape of existing tools (Docker AI Sandboxes, Cloudflare Sandbox Auth, Infisical Agent Vault, nono, Aembit, LiteLLM, Portkey, …) and a build-vs-adopt verdict.
- secret-exfil-tripwire-encodings.md: the inverse problem (detecting a known secret across encodings); explains why generic DLP can't enumerate the encoded forms.
- network-egress-guard.md: v1 iptables / dnsmasq baseline; the layer below pipelock that catches DNS-subdomain exfil pipelock doesn't see.
- pipelock-assessment.md: why pipelock was chosen, what its DLP scanners do and don't cover.
- claude-code-token-revocation.md: Anthropic OAuth refresh story; informs the token-lifetime mitigation.
- PRD 0001: pipelock topology and the data_budget reservation.
- PRD 0006: TLS interception; defines the DEFAULT_TLS_PASSTHROUGH set this doc proposes to drop.
- PRD 0008: git-gate; the proof-of-concept for the gate pattern.