PRD: Strengthen outbound exfiltration detection #263

2026-06-24T20:55:05-04:00

didericis-claude commented

2026-06-24 20:55:05 -04:00

Closes #259.

[PRD](https://gitea.dideric.is/didericis/bot-bottle/src/branch/strengthen-outbound-exfil-detection/docs/prds/prd-new-strengthen-outbound-exfil-detection.md)

## Summary

- Inject a per-session canary token (`BOT_BOTTLE_CANARY`) into the agent environment and `EGRESS_TOKEN_CANARY` into the egress sidecar, so that any appearance of the canary in outbound traffic signals exfiltration with zero false positives
- Broaden `scan_known_secrets` with a `sensitive_prefixes` parameter; `scan_outbound` reads `BOT_BOTTLE_SENSITIVE_PREFIXES` from `environ` to support arbitrary additional credential prefixes
- Add alnum-projection and sliding-window partial-match passes to `scan_known_secrets` to defeat separator-injection and chunked-substring evasion
- Add an opt-in, warn-only `scan_entropy` detector for high-entropy blobs; register `"entropy"` in `OUTBOUND_DETECTOR_NAMES`
- Decode binary bodies via `latin-1` (bijective, byte-preserving) instead of `utf-8 errors=replace`
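
The `latin-1` point in the last bullet can be checked directly. The following is a minimal sketch, not code from the PR; the payload and the `SECRET` value are made up for illustration. It shows that `latin-1` decoding round-trips arbitrary bytes, while `utf-8` with `errors="replace"` does not.

```python
# Hypothetical payload: an ASCII secret embedded in bytes that are not valid UTF-8.
SECRET = "sk-test-0123456789abcdef"  # illustrative value, not from the PR
payload = b"\xff\xfe\x80" + SECRET.encode("ascii") + b"\x00\xc3"

# latin-1 maps every byte 0x00-0xFF to the code point with the same value,
# so decoding never fails and re-encoding restores the original bytes.
text = payload.decode("latin-1")
roundtrip_ok = text.encode("latin-1") == payload
secret_visible = SECRET in text

# utf-8 with errors="replace" substitutes U+FFFD for invalid sequences,
# so the original bytes cannot be recovered from the decoded text.
lossy = payload.decode("utf-8", errors="replace")
lossy_roundtrip_ok = lossy.encode("utf-8") == payload
```

Here `roundtrip_ok` and `secret_visible` are true and `lossy_roundtrip_ok` is false, which is the byte-preserving property the bullet refers to.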

didericis force-pushed strengthen-outbound-exfil-detection from affd628df6 to 00e50973c2

2026-06-24 22:33:51 -04:00

didericis force-pushed strengthen-outbound-exfil-detection from 00e50973c2 to 217eadf9a1

2026-06-24 22:45:56 -04:00

didericis added 6 commits 2026-06-24 23:09:19 -04:00

docs: draft PRD prd-new for strengthen-outbound-exfil-detection ea6bc5a170

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(dlp): fragmentation resistance, entropy detector, broadened known-value scan 701df6cb2f

- _alnum_projection(): strip non-alphanumeric chars for separator-injection detection
- scan_known_secrets() gains two extra passes per secret after exact-variant matching:
  alnum-projection exact match (catches hyphens/spaces between secret chars) and a
  sliding-window partial-match scan (catches chunked substrings ≥ PARTIAL_MATCH_MIN_LEN)
- scan_known_secrets() accepts sensitive_prefixes param (default ("EGRESS_TOKEN_",))
  so redact_tokens and call-sites can extend the scanned env-var prefix set
- scan_entropy() warn-only detector flagging windows with Shannon entropy ≥ 5.5 bits/char
- "entropy" added to OUTBOUND_DETECTOR_NAMES; scan_outbound opts it in only when
  explicitly listed in dlp.outbound_detectors (never part of the default "all" set)
- scan_outbound reads BOT_BOTTLE_SENSITIVE_PREFIXES from environ to extend
  scan_known_secrets beyond EGRESS_TOKEN_* without schema changes
- Binary bodies decoded via latin-1 fallback (bijective byte↔codepoint) instead
  of utf-8 errors=replace, preserving ASCII secret strings in binary payloads
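
The two extra passes described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the value of `PARTIAL_MATCH_MIN_LEN` is not stated in the thread (12 is a guess), the function names are hypothetical, and whether the real partial-match pass runs on the raw body or on its projection is not specified.

```python
PARTIAL_MATCH_MIN_LEN = 12  # assumed value; the thread does not state the constant


def alnum_projection(text: str) -> str:
    """Keep only alphanumeric characters, dropping separators such as '-' or ' '."""
    return "".join(ch for ch in text if ch.isalnum())


def projection_match(secret: str, body: str) -> bool:
    """True if the secret appears in the body once separators are stripped from both."""
    needle = alnum_projection(secret)
    return bool(needle) and needle in alnum_projection(body)


def partial_match(secret: str, body: str, min_len: int = PARTIAL_MATCH_MIN_LEN) -> bool:
    """True if any min_len-character window of the secret occurs in the body."""
    if len(secret) < min_len:
        return False
    # Any leaked chunk of length >= min_len contains a window of exactly
    # min_len characters, so checking fixed-size windows is sufficient.
    return any(
        secret[i : i + min_len] in body
        for i in range(len(secret) - min_len + 1)
    )
```

With a made-up secret such as `"ghp_A1b2C3d4E5f6G7h8I9j0"`, `projection_match` catches the body `"-".join(secret)`, and `partial_match` catches a body that carries only `secret[4:18]`.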

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
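
For the entropy detector, a windowed Shannon-entropy check could look like the sketch below. The 5.5 bits/char threshold comes from the commit message; the window size of 64 and both function names are assumptions made for illustration.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 5.5  # bits/char, as cited in the commit message
WINDOW = 64              # assumed window size; not specified in the thread


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the character distribution of s, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def high_entropy_offsets(
    text: str, window: int = WINDOW, threshold: float = ENTROPY_THRESHOLD
) -> list[int]:
    """Return the start offsets of windows whose entropy meets the threshold."""
    return [
        i
        for i in range(max(len(text) - window + 1, 0))
        if shannon_entropy(text[i : i + window]) >= threshold
    ]
```

A window of n characters has at most log2(n) bits of entropy per character, so a window needs at least 46 characters (log2(46) ≈ 5.52) before it can reach 5.5 at all. Ordinary prose draws on too few distinct characters to get there, which fits the warn-only, opt-in framing.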

feat(egress): inject per-session canary token into sidecar and agent environments 11cf12188d

EgressPlan gains a `canary: str` field (default "") populated in Egress.prepare()
using secrets.token_urlsafe(32).  Each launched bottle:

  - sidecar receives EGRESS_TOKEN_CANARY=<value> (literal env entry, scanned by
    existing known-secrets detector without any detector code changes)
  - agent receives BOT_BOTTLE_CANARY=<value> (visible fake secret that signals
    exfiltration with zero false positives if it appears in outbound traffic)

Docker compose and macos-container backends updated; smolmachines shares docker
compose and so picks this up automatically.  Unit tests cover canary uniqueness,
detection via scan_known_secrets, and EgressPlan backward-compat default.
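
The canary mechanism in this commit can be illustrated with a short sketch. Only `secrets.token_urlsafe(32)` is taken from the commit message; `new_canary` and `contains_canary` are hypothetical helpers, and the real detection goes through `scan_known_secrets` rather than a bare substring check.

```python
import secrets


def new_canary() -> str:
    """Generate a fresh URL-safe token from 32 random bytes (43 characters)."""
    return secrets.token_urlsafe(32)


def contains_canary(body: bytes, canary: str) -> bool:
    """Exact-match check for the canary in an outbound body."""
    # An empty canary (the EgressPlan default) disables the check.
    return bool(canary) and canary.encode("ascii") in body
```

Because the value carries 256 bits of randomness and has no legitimate use, a match in outbound traffic is, for practical purposes, never accidental.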

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs(prd): flip prd-new-strengthen-outbound-exfil-detection Draft → Active e02fab15d0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(dlp): skip projection passes when exact variant is safe-listed a920203730

When a supervisor-approved safe token exactly matched an env secret
(Pass 1), Passes 2 and 3 (the alnum-projection and sliding-window
partial-match passes) still ran and re-blocked on the same value.
Track whether any variant was found and approved, and skip those
passes for that secret in that case.
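
The control flow of this fix might be sketched as follows. Everything here is hypothetical: `scan_one_secret`, its string return values, and the `safe_tokens` parameter are stand-ins for whatever the real detector uses, and the exact-variant and partial-match logic is reduced to the minimum needed to show the early exit.

```python
def _alnum(text: str) -> str:
    """Strip every non-alphanumeric character."""
    return "".join(ch for ch in text if ch.isalnum())


def scan_one_secret(secret: str, body: str, safe_tokens: frozenset[str]) -> str:
    """Return "approved", "block", or "clean" for a single secret."""
    # Pass 1: exact match (the real detector also checks encoded variants).
    if secret in body:
        # An approved exact hit returns here, so the later passes cannot
        # match the same value again and block it.
        return "approved" if secret in safe_tokens else "block"
    # Pass 2: alnum-projection match.
    needle = _alnum(secret)
    if needle and needle in _alnum(body):
        return "block"
    # Pass 3 (sliding-window partial match) is omitted for brevity.
    return "clean"
```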

fix(tests): remove unused dlp entropy import 0a7e166b35

lint / lint (push) Successful in 2m8s
test / unit (pull_request) Successful in 40s
test / integration (pull_request) Successful in 23s

didericis force-pushed strengthen-outbound-exfil-detection from fcfe2f0849 to 0a7e166b35

2026-06-24 23:09:19 -04:00

didericis commented

2026-06-24 23:19:33 -04:00

@didericis-codex Let's not name the canary "Canary"; it's a bit too obvious. Let's try creating a random name with postfix `SECRET` as the env var, something like `CANON_ALPHA_SECRET`. Have it change on startup. If there's no stdlib way of generating random words, create a fixed set of random words and choose two at random for the secret name. If you think of a better way to do this, or see any problems with this approach, let me know.

didericis added 1 commit 2026-06-24 23:26:05 -04:00

fix(egress): randomize canary secret env name 4808ef557a

lint / lint (push) Successful in 2m15s
test / unit (pull_request) Successful in 45s
test / integration (pull_request) Successful in 26s

didericis-codex commented

2026-06-24 23:26:11 -04:00

Done in `4808ef5`.

The canary now uses a per-launch random `WORD_WORD_SECRET` env var name generated from a fixed word list. Both Docker and macOS launch paths inject that same name/value into the agent and sidecar, and the sidecar also gets `BOT_BOTTLE_SENSITIVE_PREFIXES=<that exact env name>` so the existing known-secret detector scans it without exposing a fixed canary-looking name.

Validation: pyright clean; focused egress/rendering tests pass; full unit suite passed before the final rebase.
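
A minimal sketch of the name generation described above, assuming a small illustrative word list (the actual list in the PR is not shown in this thread) and a hypothetical function name. The standard library has no random-word generator, so a fixed list plus `secrets.choice` is the straightforward route.

```python
import secrets

# Illustrative word list; the PR's actual list is not shown in this thread.
WORDS = ("CANON", "ALPHA", "EMBER", "DELTA", "HARBOR", "LUMEN", "ORBIT", "QUARTZ")


def random_secret_env_name(words: tuple[str, ...] = WORDS) -> str:
    """Pick two distinct words at random and join them as WORD_WORD_SECRET."""
    first = secrets.choice(words)
    second = secrets.choice([w for w in words if w != first])
    return f"{first}_{second}_SECRET"
```

With eight words this yields only 56 possible names, so the name adds camouflage rather than secrecy; the detection guarantee still rests on the randomness of the canary value itself.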

didericis added 1 commit 2026-06-24 23:32:03 -04:00

fix(egress): wire canary env for smolmachines 14ae89580a

lint / lint (push) Successful in 2m16s
test / unit (pull_request) Successful in 42s
test / integration (pull_request) Successful in 23s

didericis-codex commented

2026-06-24 23:32:12 -04:00

Good catch: smolmachines was **not** covered by the Docker compose launch path. It shares the sidecar image and `EgressPlan`, but has separate `docker run` bundle env assembly plus separate smolvm guest env stamping.

I fixed that in `14ae895`:

- smolmachines sidecar bundle now receives `<random>_SECRET=<canary>` and `BOT_BOTTLE_SENSITIVE_PREFIXES=<random>_SECRET`
- smolvm guest/agent env now receives the same `<random>_SECRET=<canary>`
- added smolmachines tests for both paths

Validation: pyright clean; focused smolmachines/egress/rendering tests passed; full unit suite passed.

didericis reviewed 2026-06-24 23:33:10 -04:00

bot_bottle/backend/macos_container/launch.py Outdated

						
@@ -353,6 +353,9 @@ def _sidecar_env_entries(plan: MacosContainerBottlePlan) -> tuple[str, ...]:
     env: list[str] = []
     if plan.egress_plan.routes:
         env.extend(sorted(plan.egress_plan.token_env_map.keys()))
+    if plan.egress_plan.canary and plan.egress_plan.canary_env:

didericis commented

2026-06-24 23:33:09 -04:00

The fact that we needed to update all the backends to propagate the env vars here is a red flag; this should be backend-agnostic. Can we move this sidecar env provisioning to a location that gets shared between backends?

didericis marked this conversation as resolved

didericis added 1 commit 2026-06-24 23:35:35 -04:00

refactor(egress): centralize launch env entries 5204b98777

lint / lint (push) Successful in 2m12s
test / unit (pull_request) Successful in 43s
test / integration (pull_request) Successful in 25s

didericis-codex commented

2026-06-24 23:35:45 -04:00

Addressed the backend-agnostic concern from https://gitea.dideric.is/didericis/bot-bottle/pulls/263/files#issuecomment-2326 in `5204b98`.

The egress launch env rules now live in shared helpers:

- `egress_sidecar_env_entries(plan.egress_plan)` returns route token env slots plus the canary env and `BOT_BOTTLE_SENSITIVE_PREFIXES`
- `egress_agent_env_entries(plan.egress_plan)` returns the agent-visible canary bait env

Docker, macOS container, and smolmachines all call those helpers now, so the canary/sensitive-prefix propagation no longer has per-backend policy copies. Added direct unit coverage for the shared helpers.

Validation: pyright clean; focused backend/rendering tests passed; full unit suite passed.
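
The shape of those shared helpers could be sketched as below. The helper names and the `routes`, `token_env_map`, `canary`, and `canary_env` fields appear in this thread; the field types, the stand-in `EgressPlan` class, and the exact entry format are assumptions and may differ from the merged code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EgressPlan:
    """Minimal stand-in holding only the fields mentioned in this thread."""
    routes: tuple[str, ...] = ()
    token_env_map: dict[str, str] = field(default_factory=dict)
    canary: str = ""
    canary_env: str = ""


def egress_sidecar_env_entries(plan: EgressPlan) -> tuple[str, ...]:
    """Env entries for the egress sidecar, identical for every backend."""
    env: list[str] = []
    if plan.routes:
        # Bare names: assumed here to be passed through from the host environment.
        env.extend(sorted(plan.token_env_map.keys()))
    if plan.canary and plan.canary_env:
        env.append(f"{plan.canary_env}={plan.canary}")
        env.append(f"BOT_BOTTLE_SENSITIVE_PREFIXES={plan.canary_env}")
    return tuple(env)


def egress_agent_env_entries(plan: EgressPlan) -> tuple[str, ...]:
    """Env entries for the agent: only the bait canary."""
    if plan.canary and plan.canary_env:
        return (f"{plan.canary_env}={plan.canary}",)
    return ()
```

Each backend then only translates the returned entries into its own launch syntax, so the policy about what gets injected lives in one place.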

didericis reviewed 2026-06-24 23:56:18 -04:00

bot_bottle/dlp_detectors.py Outdated

						
@@ -1,4 +1,4 @@
-"""DLP detectors for the egress proxy (PRD 0053).
+"""DLP detectors for the egress proxy (PRD 0053, prd-new).

didericis commented

2026-06-24 23:56:18 -04:00

Remove `prd-new` citations from docstrings.

didericis added 1 commit 2026-06-24 23:57:57 -04:00

docs: remove prd-new code citations d9a9eef276

test / integration (pull_request) Successful in 46s
test / unit (pull_request) Successful in 1m4s
lint / lint (push) Successful in 2m36s
prd-number / assign-numbers (push) Successful in 1m24s
test / integration (push) Successful in 34s
test / unit (push) Successful in 52s
Update Quality Badges / update-badges (push) Successful in 2m11s

didericis-codex commented

2026-06-24 23:58:02 -04:00

Addressed in `d9a9eef`.

Removed the `prd-new` placeholder citations from Python docstrings/section comments in the detector and related tests. I also scanned `bot_bottle` and `tests` Python files to confirm no `prd-new` references remain.

Validation: pyright clean; focused detector/egress tests passed.

didericis approved these changes 2026-06-24 23:59:22 -04:00

didericis merged commit d9a9eef276 into main

2026-06-25 00:15:33 -04:00

didericis deleted branch strengthen-outbound-exfil-detection

2026-06-25 00:15:33 -04:00

3 Participants

Reference: didericis/bot-bottle#263