PRD: Strengthen outbound exfiltration detection #263

Merged
didericis merged 10 commits from strengthen-outbound-exfil-detection into main 2026-06-25 00:15:33 -04:00
Collaborator

Closes #259.

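PRD: https://gitea.dideric.is/didericis/bot-bottle/src/branch/strengthen-outbound-exfil-detection/docs/prds/prd-new-strengthen-outbound-exfil-detection.md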

Summary

  • Inject a per-session canary token (BOT_BOTTLE_CANARY) into the agent environment and EGRESS_TOKEN_CANARY into the egress sidecar so any outbound leak is detected with zero false positives
  • Broaden scan_known_secrets with a sensitive_prefixes parameter; scan_outbound reads BOT_BOTTLE_SENSITIVE_PREFIXES from environ to support arbitrary additional credential prefixes
  • Add alnum-projection and sliding-window partial-match passes to scan_known_secrets to defeat separator-injection and chunked-substring evasion
  • Add opt-in scan_entropy warn-only detector for high-entropy blobs; register "entropy" in OUTBOUND_DETECTOR_NAMES
  • Decode binary bodies via latin-1 (bijective, byte-preserving) instead of utf-8 errors=replace
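As an illustration of the warn-only entropy detector listed above, here is a minimal sketch of a Shannon-entropy window scan. The 5.5 bits/char threshold is the figure given in this PR's commit message; the window size, step, and the function names `shannon_entropy` and `find_high_entropy_windows` are assumptions made for this example and are not taken from the PR's code.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_high_entropy_windows(
    body: str,
    window: int = 256,       # assumed window size, not from the PR
    step: int = 64,          # assumed stride, not from the PR
    threshold: float = 5.5,  # threshold quoted in the commit message
) -> list[int]:
    """Return start offsets of windows whose entropy meets the threshold."""
    if len(body) < window:
        return []
    return [
        start
        for start in range(0, len(body) - window + 1, step)
        if shannon_entropy(body[start:start + window]) >= threshold
    ]
```

A window of n characters can reach at most log2(n) bits/char, and a hex string tops out at 4 bits/char, so a 5.5 threshold can only fire on denser alphabets such as base64. That is one plausible reason to keep such a detector opt-in and warn-only.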
didericis force-pushed strengthen-outbound-exfil-detection from affd628df6 to 00e50973c2 2026-06-24 22:33:51 -04:00
didericis force-pushed strengthen-outbound-exfil-detection from 00e50973c2 to 217eadf9a1 2026-06-24 22:45:56 -04:00
didericis added 6 commits 2026-06-24 23:09:19 -04:00
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- _alnum_projection(): strip non-alphanumeric chars for separator-injection detection
- scan_known_secrets() gains two extra passes per secret after exact-variant matching:
  alnum-projection exact match (catches hyphens/spaces between secret chars) and a
  sliding-window partial-match scan (catches chunked substrings ≥ PARTIAL_MATCH_MIN_LEN)
- scan_known_secrets() accepts sensitive_prefixes param (default ("EGRESS_TOKEN_",))
  so redact_tokens and call-sites can extend the scanned env-var prefix set
- scan_entropy() warn-only detector flagging windows with Shannon entropy ≥ 5.5 bits/char
- "entropy" added to OUTBOUND_DETECTOR_NAMES; scan_outbound opts it in only when
  explicitly listed in dlp.outbound_detectors (never part of the default "all" set)
- scan_outbound reads BOT_BOTTLE_SENSITIVE_PREFIXES from environ to extend
  scan_known_secrets beyond EGRESS_TOKEN_* without schema changes
- Binary bodies decoded via latin-1 fallback (bijective byte↔codepoint) instead
  of utf-8 errors=replace, preserving ASCII secret strings in binary payloads
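The latin-1 fallback in the last bullet can be sketched as follows. The helper name `decode_body` and the "try strict UTF-8 first" ordering are assumptions for illustration; the commit message only states that binary bodies fall back to latin-1.

```python
def decode_body(raw: bytes) -> str:
    """Decode a request body for scanning without dropping any bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte 0x00-0xFF to the code point with the same
        # value, so this cannot fail and the original bytes stay recoverable.
        return raw.decode("latin-1")


raw = b"\x89PNG\xff\xfe token=abc123 \x00\x80"
text = decode_body(raw)

# The round trip is lossless, and the ASCII run is still searchable.
assert text.encode("latin-1") == raw
assert "token=abc123" in text

# By contrast, errors="replace" substitutes U+FFFD for each invalid byte,
# so the original payload cannot be reconstructed from the decoded text.
lossy = raw.decode("utf-8", errors="replace")
assert "\ufffd" in lossy
assert lossy.encode("utf-8") != raw
```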

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
EgressPlan gains a `canary: str` field (default "") populated in Egress.prepare()
using secrets.token_urlsafe(32).  Each launched bottle:

  - sidecar receives EGRESS_TOKEN_CANARY=<value> (literal env entry, scanned by
    existing known-secrets detector without any detector code changes)
  - agent receives BOT_BOTTLE_CANARY=<value> (visible fake secret that signals
    exfiltration with zero false positives if it appears in outbound traffic)

Docker compose and macos-container backends updated; smolmachines shares docker
compose and so picks this up automatically.  Unit tests cover canary uniqueness,
detection via scan_known_secrets, and EgressPlan backward-compat default.
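A minimal sketch of the canary flow as described in this commit message (the env var names were later randomized in this PR). `EgressPlanSketch`, `prepare_plan`, and `launch_env` are hypothetical stand-ins; only the `canary: str` field with default `""`, the `secrets.token_urlsafe(32)` call, and the two env var names come from the text above.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressPlanSketch:
    """Hypothetical stand-in for EgressPlan; only the canary field is shown."""

    canary: str = ""  # empty default keeps older call sites working


def prepare_plan() -> EgressPlanSketch:
    """Mint a fresh canary for each prepared plan."""
    # 32 random bytes, URL-safe base64 encoded without padding.
    return EgressPlanSketch(canary=secrets.token_urlsafe(32))


def launch_env(plan: EgressPlanSketch) -> tuple[list[str], list[str]]:
    """Return (sidecar_env, agent_env) entries carrying the same canary value."""
    if not plan.canary:
        return [], []
    # The EGRESS_TOKEN_ prefix means the known-secrets scan already covers it.
    sidecar_env = [f"EGRESS_TOKEN_CANARY={plan.canary}"]
    # The agent sees the value as bait; it has no legitimate reason to send it.
    agent_env = [f"BOT_BOTTLE_CANARY={plan.canary}"]
    return sidecar_env, agent_env
```

Because the value is random and has no legitimate use, any appearance of it in outbound traffic indicates a leak, which is the basis for the "zero false positives" claim.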

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a supervisor-approved safe-token exactly matched an env secret
(Pass 1), Passes 2 & 3 (alnum projection) still ran and re-blocked on
the same value.  Track whether any variant was found-and-approved and
skip the projection passes for that secret in that case.
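The three passes and the approved-token skip described in this fix can be sketched roughly as below. This is a simplified illustration, not the PR's `scan_known_secrets`: Pass 1 here checks only the literal secret rather than a set of variants, `scan_secret` and `safe_tokens` are hypothetical names, and the value of `PARTIAL_MATCH_MIN_LEN` is assumed because the PR names the constant without giving its value.

```python
from typing import Optional

PARTIAL_MATCH_MIN_LEN = 12  # assumed value; the PR does not state it


def alnum_projection(text: str) -> str:
    """Drop every non-alphanumeric character."""
    return "".join(ch for ch in text if ch.isalnum())


def scan_secret(
    body: str, secret: str, safe_tokens: frozenset = frozenset()
) -> Optional[str]:
    """Return a label describing how `secret` appears in `body`, else None."""
    # Pass 1: exact match against the raw body.
    if secret in body:
        if secret in safe_tokens:
            # Approved by the supervisor: skip Passes 2 and 3, which would
            # otherwise re-block on the very same value.
            return None
        return "exact"

    projected_body = alnum_projection(body)
    projected_secret = alnum_projection(secret)

    # Pass 2: whole secret after stripping separators from both sides.
    if projected_secret and projected_secret in projected_body:
        return "alnum-projection"

    # Pass 3: any window of the secret at least PARTIAL_MATCH_MIN_LEN long.
    n = PARTIAL_MATCH_MIN_LEN
    for start in range(len(projected_secret) - n + 1):
        if projected_secret[start:start + n] in projected_body:
            return "partial"
    return None
```

For example, with the secret `sk-live-ABCDEF1234567890`, a body that spells it out with spaces and dots between characters is caught by Pass 2, while a body containing only `ABCDEF123456` is caught by Pass 3.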
fix(tests): remove unused dlp entropy import
lint / lint (push) Successful in 2m8s
test / unit (pull_request) Successful in 40s
test / integration (pull_request) Successful in 23s
0a7e166b35
didericis force-pushed strengthen-outbound-exfil-detection from fcfe2f0849 to 0a7e166b35 2026-06-24 23:09:19 -04:00
Owner

@didericis-codex

Let's not name the canary "Canary"; it's a bit too obvious. Let's try creating a random name with the suffix SECRET as the env var, something like CANON_ALPHA_SECRET. Have it change on startup. If there's no stdlib way of generating random words, create a fixed set of random words and choose two at random for the secret name. If you think of a better way to do this, or see any problems with this approach, let me know.

didericis added 1 commit 2026-06-24 23:26:05 -04:00
fix(egress): randomize canary secret env name
lint / lint (push) Successful in 2m15s
test / unit (pull_request) Successful in 45s
test / integration (pull_request) Successful in 26s
4808ef557a
Collaborator

Done in 4808ef5.

The canary now uses a per-launch random WORD_WORD_SECRET env var name generated from a fixed word list. Both Docker and macOS launch paths inject that same name/value into the agent and sidecar, and the sidecar also gets BOT_BOTTLE_SENSITIVE_PREFIXES=<that exact env name> so the existing known-secret detector scans it without exposing a fixed canary-looking name.

Validation: pyright clean; focused egress/rendering tests pass; full unit suite passed before the final rebase.

didericis added 1 commit 2026-06-24 23:32:03 -04:00
fix(egress): wire canary env for smolmachines
lint / lint (push) Successful in 2m16s
test / unit (pull_request) Successful in 42s
test / integration (pull_request) Successful in 23s
14ae89580a
Collaborator

Good catch: smolmachines was not covered by the Docker compose launch path. It shares the sidecar image and EgressPlan, but has separate docker run bundle env assembly plus separate smolvm guest env stamping.

I fixed that in 14ae895:

  • smolmachines sidecar bundle now receives <random>_SECRET=<canary> and BOT_BOTTLE_SENSITIVE_PREFIXES=<random>_SECRET
  • smolvm guest/agent env now receives the same <random>_SECRET=<canary>
  • added smolmachines tests for both paths

Validation: pyright clean; focused smolmachines/egress/rendering tests passed; full unit suite passed.

didericis reviewed 2026-06-24 23:33:10 -04:00
@@ -353,6 +353,9 @@ def _sidecar_env_entries(plan: MacosContainerBottlePlan) -> tuple[str, ...]:
     env: list[str] = []
     if plan.egress_plan.routes:
         env.extend(sorted(plan.egress_plan.token_env_map.keys()))
+    if plan.egress_plan.canary and plan.egress_plan.canary_env:
Owner

The fact that we needed to update all the backends to propagate the env vars here is a red flag; this should be backend-agnostic. Can we move this sidecar env provisioning to a location that is shared between backends?

didericis marked this conversation as resolved
didericis added 1 commit 2026-06-24 23:35:35 -04:00
refactor(egress): centralize launch env entries
lint / lint (push) Successful in 2m12s
test / unit (pull_request) Successful in 43s
test / integration (pull_request) Successful in 25s
5204b98777
Collaborator

Addressed the backend-agnostic concern from https://gitea.dideric.is/didericis/bot-bottle/pulls/263/files#issuecomment-2326 in 5204b98.

The egress launch env rules now live in shared helpers:

  • egress_sidecar_env_entries(plan.egress_plan) returns route token env slots plus the canary env and BOT_BOTTLE_SENSITIVE_PREFIXES
  • egress_agent_env_entries(plan.egress_plan) returns the agent-visible canary bait env

Docker, macOS container, and smolmachines all call those helpers now, so the canary/sensitive-prefix propagation no longer has per-backend policy copies. Added direct unit coverage for the shared helpers.
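A rough sketch of what such shared helpers might look like. The helper names and the `routes`, `token_env_map`, `canary`, and `canary_env` fields appear in this PR; the bodies below and the `EgressPlanSketch` class are assumptions written for illustration, not the merged code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EgressPlanSketch:
    """Hypothetical stand-in for EgressPlan with only the fields used here."""

    routes: tuple = ()
    token_env_map: dict = field(default_factory=dict)
    canary: str = ""
    canary_env: str = ""


def egress_sidecar_env_entries(plan: EgressPlanSketch) -> tuple:
    """Env entries every backend passes to the egress sidecar."""
    entries: list = []
    if plan.routes:
        # Bare names: the token values are taken from the launching environment.
        entries.extend(sorted(plan.token_env_map.keys()))
    if plan.canary and plan.canary_env:
        entries.append(f"{plan.canary_env}={plan.canary}")
        entries.append(f"BOT_BOTTLE_SENSITIVE_PREFIXES={plan.canary_env}")
    return tuple(entries)


def egress_agent_env_entries(plan: EgressPlanSketch) -> tuple:
    """Env entries every backend passes to the agent (the canary bait only)."""
    if plan.canary and plan.canary_env:
        return (f"{plan.canary_env}={plan.canary}",)
    return ()
```

With helpers like these, each backend only decides how to hand the entries to its runtime (compose file, `docker run` flags, guest env), while the policy of which entries exist lives in one place.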

Validation: pyright clean; focused backend/rendering tests passed; full unit suite passed.

didericis reviewed 2026-06-24 23:56:18 -04:00
@@ -1,4 +1,4 @@
-"""DLP detectors for the egress proxy (PRD 0053).
+"""DLP detectors for the egress proxy (PRD 0053, prd-new).
Owner

Remove the prd-new citations in docstrings.

didericis added 1 commit 2026-06-24 23:57:57 -04:00
docs: remove prd-new code citations
test / integration (pull_request) Successful in 46s
test / unit (pull_request) Successful in 1m4s
lint / lint (push) Successful in 2m36s
prd-number / assign-numbers (push) Successful in 1m24s
test / integration (push) Successful in 34s
test / unit (push) Successful in 52s
Update Quality Badges / update-badges (push) Successful in 2m11s
d9a9eef276
Collaborator

Addressed in d9a9eef.

Removed the prd-new placeholder citations from Python docstrings/section comments in the detector and related tests. I also scanned bot_bottle and tests Python files to confirm no prd-new references remain.

Validation: pyright clean; focused detector/egress tests passed.

didericis approved these changes 2026-06-24 23:59:22 -04:00
didericis merged commit d9a9eef276 into main 2026-06-25 00:15:33 -04:00
didericis deleted branch strengthen-outbound-exfil-detection 2026-06-25 00:15:33 -04:00