fix(pipelock): disable seed_phrase_detection for anthropic bottles

The previous attempt added a `suppress: [{rule, path}]` entry. The yaml validated and the entry showed up in the live pipelock's config, but the BIP-39 detector kept firing: `suppress` only silences alerts, not enforcement.

Reproduced the failure in isolation, then probed three knobs against a real pipelock with a canonical BIP-39 body (`abandon abandon ... about`):

  suppress: [{rule: "BIP-39 Seed Phrase", path: "/anthropic/**"}]  -> still 403
  rules.disabled: ["dlp:BIP-39 Seed Phrase"]                       -> still 403
  seed_phrase_detection: { enabled: false }                        -> 200 (forwarded)

Only the global toggle actually stops the block. Pipelock 2.3.0 has no per-path / per-host knob for this detector, so the trade-off is: when the bottle declares an `anthropic-base-url` route, BIP-39 detection comes off globally for that bottle. Every other DLP pattern (gh*_, sk-ant-, AKIA, etc.) keeps firing; those are the ones that actually map to claude-bottle's threat model.

Drops the `suppress:` emitter from pipelock_build_config / pipelock_render_yaml and replaces it with a `seed_phrase_detection: { enabled: false }` block driven by `pipelock_seed_phrase_detection_enabled(bottle)`.

Tests flip from suppress-shape to seed_phrase shape. End-to-end probe through the real pipelock image confirms BIP-39 bodies forward.
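
For reviewers, the new gating can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this commit: `Route` is a hypothetical stand-in for the real cred-proxy route type (only the `Path` and `Role` field names come from the diff), and the lowercase `false` assumes that the `_bool` helper renders YAML booleans that way.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Route:
    # Hypothetical stand-in for a cred-proxy route; the real type lives
    # elsewhere in the repo. Field names mirror those used in the diff.
    Path: str
    Role: list[str] = field(default_factory=list)


def seed_phrase_detection_enabled(routes: list[Route]) -> bool:
    # The detector stays on unless some route carries the
    # `anthropic-base-url` role.
    return not any("anthropic-base-url" in r.Role for r in routes)


def render_seed_phrase_block(routes: list[Route]) -> list[str]:
    # Emit the YAML lines only when the detector must be turned off;
    # otherwise emit nothing so pipelock keeps its default behaviour.
    if seed_phrase_detection_enabled(routes):
        return []
    # Assumption: booleans are rendered as lowercase `false`.
    return ["seed_phrase_detection:", "  enabled: false", ""]
```

A bottle with no `anthropic-base-url` route gets no `seed_phrase_detection` section at all, so the generated config only differs from before when such a route is declared.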
2026-05-24 13:59:05 -04:00
parent c5d729e25d
commit 4662087b32
2 changed files with 49 additions and 43 deletions
@@ -99,29 +99,35 @@ def pipelock_effective_allowlist(bottle: Bottle) -> list[str]:
    return sorted(seen.keys())


-def pipelock_effective_suppress(bottle: Bottle) -> list[dict[str, str]]:
-    """Per-bottle pipelock detector suppressions.
+def pipelock_seed_phrase_detection_enabled(bottle: Bottle) -> bool:
+    """Whether pipelock's BIP-39 seed-phrase detector stays on for
+    this bottle.

-    Pipelock's `suppress:` block silences a named rule on a path glob.
-    LLM conversation bodies legitimately trip detectors that look for
-    natural-language token shapes — most famously the BIP-39 seed-
-    phrase detector, which fires on any 12+ English words that pass
-    the BIP-39 checksum. The direct path to `api.anthropic.com` is
-    already on tls_interception.passthrough_domains so no body scan
-    runs there, but the cred-proxy hop (where the agent dials
-    `http://cred-proxy:9099/anthropic/...`) is plain HTTP through
-    pipelock — body scanning fires.
+    LLM conversation bodies legitimately trip the detector — any 12+
+    English words that pass the BIP-39 checksum match — so any
+    bottle that routes claude through pipelock's body scanner gets
+    blocked on the first real chat. We tried two narrower knobs
+    first:

-    For each route with the `anthropic-base-url` role, suppress
-    BIP-39 on `<path>**` so claude-code's chat bodies make it
-    through. All other detectors (credit-card, IBAN, token regexes,
-    etc.) keep firing — those are unambiguous credential shapes
-    that have no legitimate reason to appear in a chat completion."""
-    out: list[dict[str, str]] = []
-    for r in bottle.cred_proxy.routes:
-        if "anthropic-base-url" in r.Role:
-            out.append({"rule": "BIP-39 Seed Phrase", "path": f"{r.Path}**"})
-    return out
+      - `suppress: [{rule, path}]` — pipelock accepts the schema
+        but the entry only silences the alert; the body_dlp block
+        still fires.
+      - `rules.disabled: ["dlp:BIP-39 Seed Phrase"]` — same shape,
+        same outcome: 403 still returned.
+
+    Empirically only `seed_phrase_detection.enabled: false`
+    actually stops the block (verified by sending a 12-word BIP-39
+    body through three pipelock instances). It is a global toggle
+    — there is no per-path / per-host knob in pipelock 2.3.0 — so
+    we turn the detector off for the entire bottle when an
+    `anthropic-base-url` route is declared. The trade-off is
+    accepted: BIP-39 detection has little value in claude-bottle's
+    threat model (the agent has no access to a user's crypto
+    wallet seeds; the patterns that matter — gh*_, sk-ant-, AKIA,
+    etc. — keep firing)."""
+    return not any(
+        "anthropic-base-url" in r.Role for r in bottle.cred_proxy.routes
+    )


 def pipelock_effective_tls_passthrough(bottle: Bottle) -> list[str]:
@@ -210,9 +216,8 @@ def pipelock_build_config(
        "api_allowlist": pipelock_effective_allowlist(bottle),
        "forward_proxy": {"enabled": True},
    }
-    suppress = pipelock_effective_suppress(bottle)
-    if suppress:
-        cfg["suppress"] = suppress
+    if not pipelock_seed_phrase_detection_enabled(bottle):
+        cfg["seed_phrase_detection"] = {"enabled": False}
    cfg["dlp"] = {"include_defaults": True, "scan_env": True}
    # Body-scan enforcement is a separate pipelock section (each DLP
    # "surface" — body, MCP, response — has its own action). Pipelock's
@@ -253,11 +258,10 @@ def pipelock_render_yaml(cfg: dict[str, object]) -> str:
    for h in cast(list[str], cfg["api_allowlist"]):
        lines.append(f'  - "{h}"')
    lines.append("")
-    if "suppress" in cfg:
-        lines.append("suppress:")
-        for entry in cast(list[dict[str, str]], cfg["suppress"]):
-            lines.append(f'  - rule: "{entry["rule"]}"')
-            lines.append(f'    path: "{entry["path"]}"')
+    if "seed_phrase_detection" in cfg:
+        lines.append("seed_phrase_detection:")
+        spd = cast(dict[str, object], cfg["seed_phrase_detection"])
+        lines.append(f"  enabled: {_bool(spd['enabled'])}")
        lines.append("")
    lines.append("forward_proxy:")
    fp = cast(dict[str, object], cfg["forward_proxy"])