bot-bottle/docs/research/secret-minimization-over-dlp.md

# Secret minimization beats DLP for the encoding/splitting attack class

Why pipelock's body DLP and gitleaks's pre-receive scan cannot stop a
determined exfil attempt, and what architectural shifts actually
bound the damage. Companion to
[`secret-exfil-tripwire-encodings.md`](secret-exfil-tripwire-encodings.md)
(which covers the inverse — detecting a known secret across encodings)
and to
[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md)
(which establishes the gate / auth-injecting-proxy pattern, the
per-service mechanics for Anthropic / Gitea / GitHub, and the
existing tools in that space).

## Summary

bot-bottle's v1 egress story is: pipelock allowlists hostnames,
intercepts TLS, body-scans every request against 48 builtin DLP
patterns, and blocks on hit. Gitleaks does the analog on `git push`.
Both are signature-based. Against a *determined* compromised or
prompt-injected agent, both lose: base64, hex, gzip, splitting across
requests, and custom encodings all evade signature match, and one
exfil channel (the Anthropic API conversation body) is deliberately
not scanned at all because pipelock's BIP-39 detector false-positives
on ordinary English prose.

The fix is not "scan harder." It is the same shift that the git-gate
already makes for upstream push credentials: **keep the secret out of
the bottle in the first place.** The agent talks to a per-bottle
gate; the gate holds the credential and uses it on the agent's behalf.
Whatever the agent can encode and exfil, it cannot encode bytes it
never had.

The encoding/splitting attack is fundamentally unbounded for any
secret the bottle holds, but it stops mattering once the bottle holds
fewer secrets.

## Why egress scanning is a losing position

Three independent failure modes, each sufficient on its own:

### 1. Encoding evades signature match

The 48 builtin pipelock patterns (and gitleaks's analog set) recognize
known credential *shapes*: `ghp_…`, `sk_live_…`, AWS access-key prefixes,
PEM block markers, etc. The same bytes under any of:

- base64 / base64url / base32 / hex / ascii-85
- gzip + base64
- per-char numeric encoding
- ROT13, custom alphabet substitution
- splitting across request boundaries (first 20 chars in request A,
  last 20 in request B)
- chunking via path segments instead of body bytes
- DNS subdomain encoding (out-of-scope for pipelock entirely; only
  the v1 dnsmasq / iptables layer in `network-egress-guard.md` would
  catch it)

…do not match any pipelock or gitleaks pattern, because the patterns
are byte-literal on the canonical form. `secret-exfil-tripwire-encodings.md`
documents the detection side of this gap: no off-the-shelf tool emits
N encoded forms of a known secret for downstream matching, and even
the multi-encoding decoders in gitleaks 8.20+ / TruffleHog only undo
one layer.

A motivated attacker controls the encoding. A static signature set
cannot win that race.
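
A minimal sketch of the gap, assuming an illustrative regex for the `ghp_…` shape (not pipelock's or gitleaks's actual rule) and a made-up token. The same bytes match in canonical form and stop matching after any trivial transform:

```python
import base64
import re

# Illustrative signature for a GitHub-style token shape. This is an
# assumption for the demo, not a rule copied from pipelock or gitleaks.
GHP_PATTERN = re.compile(r"ghp_[A-Za-z0-9]{36}")

# A fake token with the right shape (not a real credential).
FAKE_TOKEN = "ghp_" + "A1b2C3d4E5f6" * 3


def signature_hit(payload: str) -> bool:
    """Return True if the byte-literal signature matches the payload."""
    return GHP_PATTERN.search(payload) is not None


candidates = {
    "plain": FAKE_TOKEN,
    "base64": base64.b64encode(FAKE_TOKEN.encode()).decode(),
    "hex": FAKE_TOKEN.encode().hex(),
    "split_a": FAKE_TOKEN[:20],  # first half, sent in request A
    "split_b": FAKE_TOKEN[20:],  # second half, sent in request B
}

results = {name: signature_hit(body) for name, body in candidates.items()}
```

Only the `plain` entry hits; every transformed form passes the scanner untouched, and the attacker can add more transforms faster than a ruleset can enumerate them.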

### 2. The `api.anthropic.com` passthrough is unscanned by design

`pipelock.py:DEFAULT_TLS_PASSTHROUGH` exempts `api.anthropic.com`
from TLS interception because pipelock's BIP-39 seed-phrase detector
false-positives on any 12+ consecutive English words from the BIP-39
wordlist that happen to checksum cleanly — i.e., on a lot of normal
prose. The recommended treatment per pipelock's own docs is
`passthrough_domains`: the CONNECT host gate still applies, but body
is not decrypted.
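
For scale, the checksum is a weak filter. Assuming the detector validates the standard BIP-39 checksum (a 12-word phrase encodes 128 bits of entropy plus 4 checksum bits, at 11 bits per word), the chance that a run of 12 wordlist words passes by accident is:

```latex
12 \times 11 = 132 = \underbrace{128}_{\text{entropy}} + \underbrace{4}_{\text{checksum}}
\qquad\Longrightarrow\qquad
\Pr[\text{checksum passes}] = 2^{-4} = \tfrac{1}{16}
```

That is roughly one in sixteen 12-word windows, treating the words as independent and uniform over the wordlist, which natural prose only approximates.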

The agent's conversation channel is therefore wide open as an exfil
path. A prompt-injected agent that has been told a secret can ship
it to Anthropic as conversation text, formatted however it likes,
and pipelock sees only `CONNECT api.anthropic.com:443`. The
`BOT_BOTTLE_CLAUDE_OAUTH_TOKEN` itself rides this exact path.

### 3. Out-of-band channels exist regardless

Even with body DLP fully on, the v1 design knowingly leaves several
non-HTTP egress paths unscanned (called out as non-goals in PRD 0001
§Non-goals): SSH session content via ssh-gate, raw TCP, UDP, ICMP,
TLS-SNI domain fronting on allowlisted hosts. The ssh-gate is dumb
L4 socat — `scp`, `-L`/`-R`/`-D` port forwards, and arbitrary
remote-shell content are all opaque to it. Any one of these is
sufficient to ship a secret the agent holds.

## The shift: stop holding the secret

The README already states the operative principle:

> The container itself adds a layer between the agent and the host,
> but the v1 design leans more on secret minimization and egress
> allowlisting than on the container as a hardened boundary.

The encoding/splitting class is the residual that *egress allowlisting
cannot solve* and that *secret minimization can*. The git-gate is the
proof-of-concept: the agent's effective capability is "push to
upstream," not "hold an SSH key." Moving the key from the agent into
the gate left the capability intact and the credential bytes
unreachable to the agent. The same move applies to anything where
the agent's actual need is *perform operation X against service Y*,
not *hold a credential*.

[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md)
covers the gate pattern in depth: kernel-boundary mechanics, the
per-service wiring for `CLAUDE_CODE_OAUTH_TOKEN` / Gitea PAT /
GitHub PAT, the four proxy topologies, and the landscape of
existing tools. What follows is the same idea generalized to the
full secret set in a bottle.

## Concrete mitigations, ordered by leverage

### 1. Generalize the gate pattern to API credentials (highest leverage)

For every credential the agent currently holds, ask: does the agent
*need* the bytes, or does it need to *perform an operation* the
credential authorizes? If the latter, the credential moves into a
per-bottle gate that:

- Holds the secret in a process the agent cannot read
  (kernel-enforced cross-UID `/proc/<pid>/environ` deny on Docker;
  separate VM on smolmachines).
- Exposes a narrow surface (a localhost URL, or a service name on
  the internal network).
- Optionally enforces a method/path allowlist so the credential's
  full scope isn't blanket-granted.

Two concrete instances worth implementing:

**Anthropic-API gate.** Holds `BOT_BOTTLE_CLAUDE_OAUTH_TOKEN`. Agent's
`ANTHROPIC_BASE_URL` points at the gate; gate injects
`Authorization: Bearer …` and forwards to api.anthropic.com. The
token is no longer in the bottle's env. Once the token is out,
`DEFAULT_TLS_PASSTHROUGH` for api.anthropic.com can be dropped and
pipelock can body-scan the conversation channel with the noisy
patterns (BIP-39) disabled and only high-confidence ones (long
high-entropy strings, known token formats) left on. The known
gotchas — SSE streaming, header passthrough for `anthropic-version`
/ `anthropic-beta` / `X-Claude-Code-Session-Id`, out-of-band
telemetry to `statsig.anthropic.com` — are documented in
[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md)
§Anthropic / Claude Code.

**Git-host-API gate (Gitea / GitHub / GitLab).** Holds the PAT;
exposes a narrow REST surface. Token auth on all three is
stateless `Authorization`-header injection — no CSRF, no request
signing, no per-request nonce — so one proxy generalizes by
config. Per-service mechanics (Gitea uses
`Authorization: token <…>` not Bearer; `tea` bypasses git's
credential.helper; etc.) are in
[`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md)
§Gitea / §GitHub / GitLab.
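
A sketch of how "one proxy generalizes by config" might look. The `token` scheme for Gitea comes from the text above; the `Bearer` scheme for GitHub and GitLab, and every name in the block (`AuthScheme`, `SCHEMES`, `inject_auth`), are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthScheme:
    header: str  # header the upstream service reads
    prefix: str  # scheme prefix placed before the secret


# Per-service config. Gitea's "token" scheme is taken from this document;
# the GitHub and GitLab entries are assumptions to verify per service.
SCHEMES = {
    "gitea": AuthScheme("Authorization", "token "),
    "github": AuthScheme("Authorization", "Bearer "),
    "gitlab": AuthScheme("Authorization", "Bearer "),
}


def inject_auth(service: str, agent_headers: dict, secret: str) -> dict:
    """Return upstream headers with the gate-held credential attached."""
    scheme = SCHEMES[service]
    # Drop whatever auth header the agent sent so it cannot pick the identity.
    headers = {
        name: value
        for name, value in agent_headers.items()
        if name.lower() != scheme.header.lower()
    }
    headers[scheme.header] = scheme.prefix + secret
    return headers
```

Because the injection is stateless, adding a service is a config entry rather than new proxy code. The secret only ever appears in the gate's outbound request, never in anything returned to the agent.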

Both compose with the existing git-gate (which addresses the push
credential) rather than replacing it.

### 2. Stop passthrough'ing `api.anthropic.com`

A direct consequence of #1: once the token is out of the bottle,
pipelock can decrypt and inspect the conversation channel. Replace
the broad BIP-39 detector with a tighter ruleset — high-entropy
strings ≥40 chars, known credential format prefixes, long base64
blobs — and accept the residual false-positive rate. The unscanned
channel was the largest single exfil hole; closing it is worth a
modest reduction in DLP precision.
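
A rough sketch of what such a tighter ruleset could check, written as standalone Python rather than pipelock configuration. The 4.0 bits-per-character threshold, the regexes, and the function names are all assumptions to be tuned against real traffic.

```python
import math
import re
from collections import Counter

# Runs of 40+ characters from the base64 / base64url / token alphabets.
LONG_RUN = re.compile(r"[A-Za-z0-9+/=_-]{40,}")
# Two credential prefixes named in this document; the tail length is a guess.
KNOWN_PREFIX = re.compile(r"(?:ghp_|sk_live_)[A-Za-z0-9]{16,}")
# Assumed threshold in bits per character; tune against real traffic.
ENTROPY_THRESHOLD = 4.0


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per char."""
    counts = Counter(text)
    total = len(text)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def findings(body: str) -> list[tuple[str, str]]:
    """Return (rule, matched text) pairs for a request body."""
    hits = [("known_prefix", m.group()) for m in KNOWN_PREFIX.finditer(body)]
    for m in LONG_RUN.finditer(body):
        if shannon_entropy(m.group()) >= ENTROPY_THRESHOLD:
            hits.append(("high_entropy", m.group()))
    return hits
```

Ordinary prose is broken up by spaces and punctuation, so it rarely produces a 40-character run at all. That is what lets this style of rule replace the BIP-39 detector on the conversation channel at a tolerable false-positive rate.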

If the gate work is deferred, an interim option is to body-scan even with
the token in place: the token will be in every Anthropic request's
`Authorization` header anyway, and pipelock's header DLP is already
on. The risk is false-positive blocks on legitimate conversation
prose, so make it opt-in initially.

### 3. Egress data budget

PRD 0001 §Resolution rules reserves `data_budget` as a future
field under the existing `egress` block. Set the cap small (e.g. 1 MiB
outbound per session to non-Anthropic hosts). This doesn't stop
one-secret one-time exfil (a 4 KiB token still fits), but:

- Bulk exfil (`.git` history dumps, full credential stores, model
  weights, large datasets) becomes visibly noisy or impossible.
- Repeated small exfil attempts trip the budget and produce a
  log line / alert.
- The attacker is forced to be efficient, and efficient channels are
  easier to fingerprint after the fact.
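
The budget itself is simple accounting. A minimal sketch follows, assuming a per-session counter with an exempt-host set; `EgressBudget` and its fields are hypothetical and do not reflect the eventual `data_budget` schema.

```python
from dataclasses import dataclass


@dataclass
class EgressBudget:
    """Per-session outbound byte budget (hypothetical shape)."""

    limit_bytes: int = 1024 * 1024  # 1 MiB, the example cap from the text
    exempt_hosts: frozenset = frozenset({"api.anthropic.com"})
    spent: int = 0

    def charge(self, host: str, nbytes: int) -> bool:
        """Record an outbound request; return False if it would exceed the cap."""
        if host in self.exempt_hosts:
            return True
        if self.spent + nbytes > self.limit_bytes:
            return False  # caller should block and emit a log line / alert
        self.spent += nbytes
        return True
```

A refused `charge` is the tripwire: the request is blocked and the event is worth logging even when the individual payload looks harmless.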

### 4. Pre-clean `.git` history when `--cwd` is used

`provision/git.py:_provision_cwd_git` copies the host cwd's `.git`
into `/home/node/workspace/.git` wholesale. Git history contains
every secret ever committed, even if removed from HEAD. The agent
reads them with `git log --all -p` — a free, unscanned exfil
channel via any allowed host.

Add a provision-time gitleaks scan of the host `.git` before copy,
and refuse to launch (or quarantine the offending commits into a
read-only orphan branch view) on hit. This doesn't help the legitimate
case where the user wants to debug a leaked-secret commit, but it
gates the common case where the user is unaware of historical
secrets in the repo. It pairs with the push-side gitleaks already in
the git-gate.
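
A sketch of the provision-time check, assuming the gitleaks 8.x CLI (`detect`, `--source`, `--no-banner`, `--redact`, `--exit-code`); verify the flags against the installed version. The function names and the choice of exit code 3 are hypothetical, picked so that a finding is distinguishable from a tool failure.

```python
import subprocess
from typing import Callable

# Distinct exit code for "leaks found" so a finding is not confused with
# a gitleaks crash. The value 3 is an arbitrary choice for this sketch.
LEAK_EXIT_CODE = 3


def gitleaks_command(repo_dir: str) -> list[str]:
    """Build the scan command (flags assumed from gitleaks 8.x)."""
    return [
        "gitleaks", "detect",
        "--source", repo_dir,
        "--no-banner",
        "--redact",  # keep secret bytes out of the scan output
        "--exit-code", str(LEAK_EXIT_CODE),
    ]


def history_is_clean(repo_dir: str, run: Callable = subprocess.run) -> bool:
    """Return True if no leaks were found, False on a hit, raise otherwise."""
    result = run(gitleaks_command(repo_dir), capture_output=True, text=True)
    if result.returncode == 0:
        return True
    if result.returncode == LEAK_EXIT_CODE:
        return False
    raise RuntimeError(f"gitleaks failed with exit code {result.returncode}")
```

Provisioning would call `history_is_clean(host_cwd)` before copying `.git` and refuse to launch on `False`. Treating any other exit code as an error keeps a missing or broken gitleaks binary from silently passing the gate.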

### 5. Manifest hygiene: explicit secret tagging

Today `bottle.env` accepts any entry. A `?prompt-at-runtime` or
`${HOST_VAR}` interpolation has no special status vs. a literal —
the manifest can't tell which entries are credential material and
which are configuration.

Add a `secret: true` flag (or a `secrets:` sibling of `env:`) that:

- Surfaces those entries prominently in the y/N preflight
  ("this bottle carries 3 secret values: GITEA_TOKEN, GH_PAT,
  AWS_SECRET_ACCESS_KEY").
- Refuses to launch if `egress.allowlist` contains any host that
  is not source-controlled by the user (heuristic: not on a
  built-in `KNOWN_GIT_HOSTS` list).
- Forces an explicit acknowledgement that a credential is being
  placed into the bottle rather than behind a gate.

This pushes the design toward gates for any new credential
type, rather than toward reaching for `bottle.env`.
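
The preflight behaviour can be sketched against an in-memory manifest. The manifest shape (a dict entry with `secret: True`), the contents of `KNOWN_GIT_HOSTS`, and the function names are assumptions for illustration; the real schema is still undecided.

```python
# Hypothetical contents; the real list would live in bot-bottle.
KNOWN_GIT_HOSTS = frozenset({"github.com", "gitlab.com", "codeberg.org"})


def secret_names(manifest: dict) -> list[str]:
    """Names of env entries tagged secret, in manifest order."""
    return [
        name
        for name, entry in manifest.get("env", {}).items()
        if isinstance(entry, dict) and entry.get("secret")
    ]


def preflight_banner(manifest: dict) -> str:
    """Line to show in the y/N preflight."""
    names = secret_names(manifest)
    if not names:
        return "this bottle carries no secret values"
    noun = "value" if len(names) == 1 else "values"
    return f"this bottle carries {len(names)} secret {noun}: {', '.join(names)}"


def launch_blockers(manifest: dict) -> list[str]:
    """Allowlisted hosts that should block launch while secrets are present."""
    if not secret_names(manifest):
        return []
    allowlist = manifest.get("egress", {}).get("allowlist", [])
    return [host for host in allowlist if host not in KNOWN_GIT_HOSTS]
```

An untagged entry such as a plain string is treated as configuration, so the only way to get a credential into the bottle without the banner is to mislabel it, which is an explicit act rather than a default.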

### 6. Detection over prevention for the residual

For everything that remains after the above — and a residual will
always remain — log every outbound request to allowlisted hosts in
structured form: timestamp, host, path, byte count, content
hash, request shape. Don't block; alert. Offline analysis surfaces
per-host byte-count outliers, entropy histograms, base64-shaped
path/body distributions, and repeat patterns.

This doesn't catch the first careful exfil. It does catch repeated
abuse, scaled exfil, and the second-attempt-after-discovery, and
provides a forensic trail for the first incident. Pipelock's
stdout-only logging today (PRD 0001 §Out of scope: "Audit logging
or persistent log storage of pipelock decisions") is the precursor;
this would be its persistent / structured follow-up.
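
One possible shape for such a record, as a hedged sketch: the field names and the `log_record` helper are hypothetical, and the request-shape field is left out for brevity. Hashing the body rather than storing it keeps the log from becoming a second copy of whatever was exfiltrated.

```python
import hashlib
import json
import time
from typing import Optional


def log_record(
    host: str,
    path: str,
    body: bytes,
    method: str = "POST",
    now: Optional[float] = None,
) -> str:
    """Serialize one outbound request as a JSON line (hypothetical schema)."""
    record = {
        "ts": time.time() if now is None else now,
        "host": host,
        "method": method,
        "path": path,
        "bytes": len(body),
        # Content hash only: enough to correlate repeats, not to recover data.
        "sha256": hashlib.sha256(body).hexdigest(),
    }
    return json.dumps(record, sort_keys=True)
```

One JSON object per line is easy to append durably and to aggregate offline, for example grouping by `host` to find byte-count outliers or by `sha256` to find repeated payloads.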

### 7. Shorten OAuth token lifetime

The `claude setup-token`-issued OAuth token is valid for ~1 year. If a
leak happens, the blast radius is a year of access to the user's Pro/Max
subscription. Even without the gate (#1), encouraging users to
rotate monthly bounds the damage. With the gate, the gate can in
principle negotiate fresh short-lived tokens per session (depends
on whether Anthropic ever ships an OAuth refresh / child-token flow
— closed today per `claude-code-token-revocation.md`).

## What you don't get

Even with all of the above:

- An agent that can talk to *any* LLM can use the conversation as a
  covert channel — encode bits in word choice, response length,
  formatting. Pipelock can scan for known patterns but cannot stop
  arbitrary content over a channel whose purpose is arbitrary content.
- An agent with shell access to any remote host via ssh-gate can
  exfil through the SSH session. The ssh-gate is L4 by design (PRD
  0007 §Non-goals: "SSH-protocol awareness"); generalizing the
  gate pattern to SSH content inspection is the analog of building
  the git-gate for arbitrary SSH operations, which is out of scope.
- A determined adversary always wins against a single boundary.
  The point of stacking the above is to make the cost of the *first*
  exfil higher than the value of the secrets in scope, and to make
  *repeated* exfil detectable.

## Recommended path forward

In priority order:

1. **Anthropic-API gate**
   ([`agent-credential-proxy-landscape.md`](agent-credential-proxy-landscape.md)
   §Recommended). Removes the highest-value secret and closes the
   passthrough hole as a side effect.
2. **Git-host-API gate** (same doc, same section — one proxy
   generalizes across Gitea / GitHub / GitLab by config).
3. **Egress data budget** in pipelock — small lift, large damage
   bound.
4. **Pre-launch `.git` gitleaks scan** for `--cwd` bottles.
5. **`secret: true` manifest flag** + preflight surfacing.
6. **Structured pipelock logs** persisted somewhere durable; offline
   analysis as a separate follow-up.

The encoding/splitting attack stops being load-bearing once the
bottle doesn't *have* the secret the attacker wants to encode. Every
item above is a step toward that property.

## References

- `agent-credential-proxy-landscape.md` — gate pattern, kernel
  boundary mechanics, per-service wiring (Anthropic / Gitea /
  GitHub / GitLab), proxy topologies, landscape of existing
  tools (Docker AI Sandboxes, Cloudflare Sandbox Auth,
  Infisical Agent Vault, nono, Aembit, LiteLLM, Portkey, …)
  and a build-vs-adopt verdict.
- `secret-exfil-tripwire-encodings.md` — inverse problem: detecting
  a known secret across encodings; explains why generic DLP can't
  enumerate the encoded forms.
- `network-egress-guard.md` — v1 iptables / dnsmasq baseline; the
  layer below pipelock that catches DNS-subdomain exfil pipelock
  doesn't see.
- `pipelock-assessment.md` — why pipelock was chosen, what its DLP
  scanners do and don't cover.
- `claude-code-token-revocation.md` — Anthropic OAuth refresh
  story; informs the token-lifetime mitigation.
- PRD 0001 — pipelock topology and the `data_budget` reservation.
- PRD 0006 — TLS interception; defines the
  `DEFAULT_TLS_PASSTHROUGH` set this doc proposes to drop.
- PRD 0008 — git-gate; the proof-of-concept for the gate pattern.