Stop scanning the request body for CRLF injection
A 403 "egress DLP: URL-encoded CRLF (%0d%0a)" was firing on legitimate requests (e.g. the Claude Code login flow) and bypassing the on-match policy entirely, because CRLF blocks carry no matched value and were routed straight to a hard 403. Root cause: CRLF injection is only an attack in the request line and headers. An HTTP body is delimited by Content-Length, so CRLF bytes in the body cannot split the request — but the scan flattened the body into the same blob it checked, so form-encoded / multi-line body content (which legitimately contains %0d%0a) tripped it. Fix: - scan_outbound takes a crlf_text param; the addon scans CRLF only over the body-excluded request line + headers. crlf_text=None keeps the old full-blob behavior for host-side callers/tests; the websocket path passes "" since a data frame is not a request line. - The redact policy now also scrubs CRLF (new strip_crlf helper) from the path and headers, so redact is a complete escape hatch and structural CRLF in the URL/headers can be forwarded when a route opts into it. Tests: strip_crlf unit tests; scan_outbound crlf_text body-exclusion and backward-compat tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01HnvBjPZC5V7qeQpFbQdDmS
This commit is contained in:
@@ -283,6 +283,14 @@ _CRLF_ENCODED_RE = re.compile(r"%0[dD]%0[aA]", re.ASCII)
|
||||
_CRLF_HEADER_INJECT_RE = re.compile(r"\r\n[A-Za-z][A-Za-z0-9\-]+\s*:", re.ASCII)
|
||||
|
||||
|
||||
def strip_crlf(text: str) -> str:
|
||||
"""Remove URL-encoded and literal CRLF injection sequences from a request
|
||||
surface (PRD 0062 redact policy). Used to scrub the request line / headers
|
||||
so the request can be forwarded instead of hard-blocked."""
|
||||
text = _CRLF_ENCODED_RE.sub("", text)
|
||||
return _CRLF_HEADER_INJECT_RE.sub(lambda m: m.group(0)[2:], text)
|
||||
|
||||
|
||||
def scan_crlf_injection(text: str) -> ScanResult | None:
|
||||
if _CRLF_ENCODED_RE.search(text):
|
||||
return ScanResult(
|
||||
@@ -306,4 +314,5 @@ __all__ = [
|
||||
"scan_known_secrets",
|
||||
"scan_naive_injection",
|
||||
"scan_token_patterns",
|
||||
"strip_crlf",
|
||||
]
|
||||
|
||||
Reference in New Issue
Block a user