fix(pipelock): scan all request headers + fix attack-3 destination
test / unit (pull_request) Successful in 19s
test / integration (pull_request) Failing after 49s

Two related changes the PRD 0022 sandbox-escape test surfaced:

  1. `pipelock_build_config` now emits
     `request_body_scanning.scan_headers: true` and
     `header_mode: all`. Pipelock's default `header_mode:
     sensitive` only checks Authorization / Cookie / X-Api-Key
     / X-Token / Proxy-Authorization / X-Goog-Api-Key — an
     agent attempting exfil could trivially pick a
     non-sensitive header (`X-Custom: $SECRET`) and slip
     through. `all` closes the gap; pipelock caps it by the
     same max_body_bytes the body scan uses.

  2. Test 3 (HTTP exfil shapes) now targets
     raw.githubusercontent.com instead of api.anthropic.com.
     api.anthropic.com is in `DEFAULT_TLS_PASSTHROUGH` —
     pipelock can't MITM it because real LLM conversation
     bodies false-positive on DLP scanners (BIP-39 etc.). The
     trade-off is documented in `pipelock.DEFAULT_TLS_PASSTHROUGH`;
     the test now exercises a host where the sandbox is
     actually supposed to block.

All 5 sandbox-escape attacks now produce HTTP 403 with the
expected sandbox marker (`egress:`, `pipelock`, or `blocked:`):

  - Attack 1 (non-allowlisted host)        ✓ egress
  - Attack 2 (non-allowlisted IP + spoof)  ✓ egress
  - Attack 3a (URL path)                   ✓ pipelock DLP
  - Attack 3b (URL query)                  ✓ pipelock DLP
  - Attack 3c (request body)               ✓ pipelock DLP
  - Attack 3d (request header)             ✓ pipelock DLP (scan_headers)
  - Attack 4a (crafted subdomain)          ✓ egress
  - Attack 4b (direct dig @8.8.8.8)        ✓ network isolation
  - Attack 5 (README push, 3 secret shapes) ✓ gitleaks (pre-upstream)

489 unit tests pass (1 updated for the new request_body_scanning
shape). Full integration suite passes in ~6s.
This commit is contained in:
2026-05-26 22:38:38 -04:00
parent e2231f46a3
commit 23f50f7720
3 changed files with 44 additions and 8 deletions
+19 -6
View File
@@ -228,10 +228,13 @@ class TestSandboxEscape(unittest.TestCase):
# ---- attack 3: HTTP exfil shapes ---------------------------------
# Sandbox-block signature: pipelock / egress return HTTP 403 on
# policy reject; the response body carries `"egress:"` (egress
# sidecar) or `"pipelock"` (pipelock sidecar). Both are
# observable from inside the agent via curl.
_SANDBOX_BLOCK_MARKERS = ("egress:", "pipelock")
# policy reject; the response body carries a recognizable
# marker. Egress's reject message starts `"egress: host '...'
# is not in the bottle's egress.routes allowlist"`; pipelock's
# DLP rejects start `"blocked: "` (e.g.
# `"blocked: DLP match: Anthropic API Key (critical)"`,
# `"blocked: request body contains secret"`).
_SANDBOX_BLOCK_MARKERS = ("egress:", "pipelock", "blocked:")
def _assert_sandbox_block(self, label: str, r) -> None:
"""A real sandbox block produces an HTTP 403 with a
@@ -276,10 +279,20 @@ class TestSandboxEscape(unittest.TestCase):
PRD 0022 Q1 resolution: this assertion is AUTHORITATIVE.
If a shape fails here, the leak is real and the
remediation lands as its own PRD before this test merges.
DON'T mark expectedFailure to silence it."""
DON'T mark expectedFailure to silence it.
Destination note: we use `raw.githubusercontent.com` (one
of the DEFAULT_ALLOWLIST hosts) rather than
api.anthropic.com because pipelock passthrough's the
Anthropic API endpoint specifically — its DLP scanners
false-positive on real LLM conversation bodies (BIP-39
seed phrases, etc.). That trade-off is documented in
`pipelock.DEFAULT_TLS_PASSTHROUGH`. For non-passthrough
hosts pipelock MITMs and the DLP scan applies, which is
what this attack exercises."""
# Capture HTTP code via curl's -w; don't use --fail so
# we get the response body even on 4xx.
url_base = "https://api.anthropic.com"
url_base = "https://raw.githubusercontent.com"
wfmt = '\\nHTTP_CODE:%{http_code}'
shapes = [
(