bot-bottle

Author	SHA1	Message	Date
didericis-claude	451e6fc2fc	feat(dlp): add 7 token patterns, Unicode normalization, CRLF injection detection (PRD 0053) Token patterns: HuggingFace (hf_), Databricks (dapi), Slack (xox[baprs]-), npm (npm_), SendGrid (SG.x.y), PyPI (pypi-), HashiCorp Vault (hvs.). Unicode normalization (_normalize_text) applies NFKD + strips combining marks and control chars before pattern matching, defeating fullwidth-char and combining-mark evasion. CRLF injection (scan_crlf_injection) detects %0d%0a in URLs and literal \r\n header-injection patterns; runs unconditionally in scan_outbound regardless of outbound_detectors config.	2026-06-07 23:19:11 -04:00
didericis-claude	1ecef55fea	feat(dlp): websocket scanning, response headers, extended encoding variants, sk-proj pattern (PRD 0053)	2026-06-07 23:19:11 -04:00
didericis	545ff3582f	fix(lint): resolve pylint and pyright issues on egress-log-option lint / lint (push) Failing after 1m34s Details test / unit (pull_request) Successful in 32s Details test / integration (pull_request) Successful in 44s Details - egress.py: extract _render_match_entry helper to reduce nesting depth - egress_addon_core.py: make request_method/request_headers keyword-only to satisfy too-many-positional-arguments; wrap long lazy import lines - egress_addon.py: remove unused Route import; add pylint disable for import-error on sidecar-only mitmproxy/egress_addon_core imports - dlp_detectors.py: remove dead _min_distance function (superseded by _closest_pair) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 20:10:32 -04:00
didericis	86b0a4d285	feat(egress): add location, context snippets, and token redaction to DLP logging Each DLP block/warn now reports where the match was found (body, authorization header, response body) and includes a context snippet: SNIPPET_CONTEXT chars before and after the match, with the matched value replaced by REDACT ("********"). scan_token_patterns/scan_known_secrets/scan_naive_injection all gain `location` and `context` fields on their ScanResult returns. The outbound scanner takes `auth_header` as a separate kwarg so the two locations are scanned and reported independently. redact_tokens() is added to dlp_detectors and used in egress_addon.py to scrub token patterns and provisioned secrets from host/path fields before they appear in any log output (level 1 and 2). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 14:41:27 -04:00
didericis-claude	abcb336e7c	fix(dlp): rework naive injection to proximity-based disclosure+jailbreak lint / lint (push) Failing after 1m24s Details test / unit (pull_request) Successful in 30s Details test / integration (pull_request) Successful in 44s Details Token detection is already handled by the token_patterns detector running separately — calling it again from scan_naive_injection was redundant. New logic: - Warn on any disclosure phrase - Warn on any jailbreak phrase - Block when both appear within 500 chars of each other Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-05 20:34:21 +00:00
didericis-claude	726713d081	feat(egress): implement PRD 0053 — DLP addon with Gateway API matches lint / lint (push) Failing after 1m43s Details test / unit (pull_request) Successful in 40s Details test / integration (pull_request) Successful in 50s Details Replace path_allowlist with Gateway API HTTPRoute match vocabulary (paths, methods, headers with AND/OR semantics) and add DLP scanning to the egress proxy: - Token pattern detection (AWS, GitHub, Anthropic, OpenAI, Stripe, JWT) - Known secret detection (EGRESS_TOKEN_* with base64/URL/hex variants) - Naive prompt injection detection (disclosure + credential, jailbreak) - Per-route DLP configuration via manifest dlp block - Inbound response scanning with block/warn severity Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-05 19:53:23 +00:00

6 Commits