- _alnum_projection(): strip non-alphanumeric chars for separator-injection detection
- scan_known_secrets() gains two extra passes per secret after exact-variant matching:
alnum-projection exact match (catches hyphens/spaces between secret chars) and a
sliding-window partial-match scan (catches chunked substrings ≥ PARTIAL_MATCH_MIN_LEN)
- scan_known_secrets() accepts sensitive_prefixes param (default ("EGRESS_TOKEN_",))
so redact_tokens and call-sites can extend the scanned env-var prefix set
- scan_entropy() warn-only detector flagging windows with Shannon entropy ≥ 5.5 bits/char
- "entropy" added to OUTBOUND_DETECTOR_NAMES; scan_outbound opts it in only when
explicitly listed in dlp.outbound_detectors (never part of the default "all" set)
- scan_outbound reads BOT_BOTTLE_SENSITIVE_PREFIXES from environ to extend
scan_known_secrets beyond EGRESS_TOKEN_* without schema changes
- Binary bodies decoded via latin-1 fallback (bijective byte↔codepoint) instead
of utf-8 errors=replace, preserving ASCII secret strings in binary payloads
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each DLP block/warn now reports where the match was found (body,
authorization header, response body) and includes a context snippet:
SNIPPET_CONTEXT chars before and after the match, with the matched
value replaced by REDACT ("********").
scan_token_patterns/scan_known_secrets/scan_naive_injection all gain
`location` and `context` fields on their ScanResult returns. The
outbound scanner takes `auth_header` as a separate kwarg so the two
locations are scanned and reported independently.
redact_tokens() is added to dlp_detectors and used in egress_addon.py
to scrub token patterns and provisioned secrets from host/path fields
before they appear in any log output (level 1 and 2).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Token detection is already handled by the token_patterns detector
running separately — calling it again from scan_naive_injection was
redundant. New logic:
- Warn on any disclosure phrase
- Warn on any jailbreak phrase
- Block when both appear within 500 chars of each other
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>