fix(pipelock): scan all request headers + fix attack-3 destination

Two related changes surfaced by the PRD 0022 sandbox-escape test:

1. `pipelock_build_config` now emits `request_body_scanning.scan_headers: true` and `header_mode: all`. Pipelock's default `header_mode: sensitive` only checks Authorization / Cookie / X-Api-Key / X-Token / Proxy-Authorization / X-Goog-Api-Key, so an agent attempting exfil could trivially pick a non-sensitive header (`X-Custom: $SECRET`) and slip through. `all` closes the gap; pipelock caps it by the same max_body_bytes the body scan uses.

2. Test 3 (HTTP exfil shapes) now targets raw.githubusercontent.com instead of api.anthropic.com. api.anthropic.com is in `DEFAULT_TLS_PASSTHROUGH`: pipelock can't MITM it because real LLM conversation bodies false-positive on DLP scanners (BIP-39 etc.). The trade-off is documented in `pipelock.DEFAULT_TLS_PASSTHROUGH`; the test now exercises a host where the sandbox is actually supposed to block.

All 5 sandbox-escape attacks now produce HTTP 403 with the expected sandbox marker (`egress:`, `pipelock`, or `blocked:`):

- Attack 1 (non-allowlisted host) ✓ egress
- Attack 2 (non-allowlisted IP + spoof) ✓ egress
- Attack 3a (URL path) ✓ pipelock DLP
- Attack 3b (URL query) ✓ pipelock DLP
- Attack 3c (request body) ✓ pipelock DLP
- Attack 3d (request header) ✓ pipelock DLP (scan_headers)
- Attack 4a (crafted subdomain) ✓ egress
- Attack 4b (direct dig @8.8.8.8) ✓ network isolation
- Attack 5 (README push, 3 secret shapes) ✓ gitleaks (pre-upstream)

489 unit tests pass (1 updated for the new request_body_scanning shape). Full integration suite passes in ~6s.
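For reference, the block check that every attack above is held to can be sketched as follows. This is a minimal illustration, not the code in the test file: `split_curl_output` and `is_sandbox_block` are hypothetical helper names, and the sketch assumes curl was run with a `-w` format that appends `\nHTTP_CODE:<code>` after the response body, as the test's `wfmt` does.

```python
from typing import Tuple

# Markers taken from the commit: egress rejects start with "egress:",
# pipelock DLP rejects start with "blocked:", and "pipelock" is kept
# from the previous marker set.
SANDBOX_BLOCK_MARKERS = ("egress:", "pipelock", "blocked:")


def split_curl_output(out: str) -> Tuple[int, str]:
    """Split curl stdout into (http_code, body).

    Hypothetical helper: assumes the output ends with
    "\nHTTP_CODE:<code>" as produced by curl's -w option.
    """
    body, sep, code = out.rpartition("\nHTTP_CODE:")
    if not sep:
        raise ValueError("curl output has no HTTP_CODE trailer")
    return int(code), body


def is_sandbox_block(http_code: int, body: str) -> bool:
    """True only for an HTTP 403 whose body carries a sandbox marker."""
    return http_code == 403 and any(m in body for m in SANDBOX_BLOCK_MARKERS)
```

Requiring both the 403 and a marker keeps an ordinary upstream 403 (for example a plain `Forbidden` from the destination host) from being counted as a sandbox block.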
2026-05-26 22:38:38 -04:00
parent e2231f46a3
commit 23f50f7720
3 changed files with 44 additions and 8 deletions
@@ -228,10 +228,13 @@ class TestSandboxEscape(unittest.TestCase):
    # ---- attack 3: HTTP exfil shapes ---------------------------------

    # Sandbox-block signature: pipelock / egress return HTTP 403 on
-    # policy reject; the response body carries `"egress:"` (egress
-    # sidecar) or `"pipelock"` (pipelock sidecar). Both are
-    # observable from inside the agent via curl.
-    _SANDBOX_BLOCK_MARKERS = ("egress:", "pipelock")
+    # policy reject; the response body carries a recognizable
+    # marker. Egress's reject message starts `"egress: host '...'
+    # is not in the bottle's egress.routes allowlist"`; pipelock's
+    # DLP rejects start `"blocked: "` (e.g.
+    # `"blocked: DLP match: Anthropic API Key (critical)"`,
+    # `"blocked: request body contains secret"`).
+    _SANDBOX_BLOCK_MARKERS = ("egress:", "pipelock", "blocked:")

    def _assert_sandbox_block(self, label: str, r) -> None:
        """A real sandbox block produces an HTTP 403 with a
@@ -276,10 +279,20 @@ class TestSandboxEscape(unittest.TestCase):
        PRD 0022 Q1 resolution: this assertion is AUTHORITATIVE.
        If a shape fails here, the leak is real and the
        remediation lands as its own PRD before this test merges.
-        DON'T mark expectedFailure to silence it."""
+        DON'T mark expectedFailure to silence it.
+
+        Destination note: we use `raw.githubusercontent.com` (one
+        of the DEFAULT_ALLOWLIST hosts) rather than
+        api.anthropic.com because pipelock passes through the
+        Anthropic API endpoint specifically — its DLP scanners
+        false-positive on real LLM conversation bodies (BIP-39
+        seed phrases, etc.). That trade-off is documented in
+        `pipelock.DEFAULT_TLS_PASSTHROUGH`. For non-passthrough
+        hosts pipelock MITMs and the DLP scan applies, which is
+        what this attack exercises."""
        # Capture HTTP code via curl's -w; don't use --fail so
        # we get the response body even on 4xx.
-        url_base = "https://api.anthropic.com"
+        url_base = "https://raw.githubusercontent.com"
        wfmt = '\\nHTTP_CODE:%{http_code}'
        shapes = [
            (