PRD: Egress token-block policy (supervise / redact / block) #262

Merged
didericis merged 5 commits from prd-0062-egress-supervisor-token-allow into main 2026-06-24 21:17:00 -04:00

Closes #261.

PRD: https://gitea.dideric.is/didericis/bot-bottle/src/branch/prd-0062-egress-supervisor-token-allow/docs/prds/0062-egress-supervisor-token-override.md

Summary

Gives each egress route a policy for what happens when an outbound DLP detector matches a token — dlp.outbound_on_match: block | redact | supervise. The goal (issue #261) is cutting false-positive friction without weakening default-deny.

  • supervise (default for manifest routes) — hold the request and surface an egress-token-allow proposal in ./cli.py supervise. On approval the matched value joins an in-memory safelist for the life of the proxy, so the request and later ones carrying it flow through. Fails closed on rejection / timeout / missing supervise wiring.
  • redact — scrub the matched value(s) from the body, non-host headers, and path/query, then forward. Fails closed (403) if a match lands on a surface redaction can't rewrite (the hostname, or a unicode-evasion token).
  • block — the original hard 403, never overridable.

Structural blocks (such as CRLF, which carry no value that could be safelisted) stay hard 403s under block and supervise. The supervise flow mirrors git-gate's gitleaks-allow (PRD 0061): the addon writes the proposal directly to the shared queue and polls for the response on an async hook so the proxy event loop isn't stalled.
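To make the three outcomes concrete, here is a minimal sketch of the decision table described above. It is not the addon's code: `OnMatch`, `decide`, and the `approved` / `redactable` flags are hypothetical names used only for illustration.

```python
from enum import Enum
from typing import Optional


class OnMatch(str, Enum):
    """Hypothetical enum mirroring dlp.outbound_on_match."""
    BLOCK = "block"
    REDACT = "redact"
    SUPERVISE = "supervise"


def decide(policy: OnMatch, matched: Optional[str],
           approved: bool = False, redactable: bool = True) -> str:
    """Return "forward" or "403" for one outbound DLP match.

    matched is None for a structural block (e.g. CRLF) that carries no
    value to safelist. approved stands in for the operator's answer and
    redactable for "every match sits on a surface redaction can rewrite".
    """
    if policy is OnMatch.BLOCK:
        return "403"  # never overridable
    if policy is OnMatch.REDACT:
        # Scrub and forward; fail closed if something cannot be rewritten.
        return "forward" if redactable else "403"
    # supervise: nothing to safelist means there is nothing to approve.
    if matched is None:
        return "403"
    return "forward" if approved else "403"
```

Every path that is not an explicit success returns a 403, which is the fail-closed property the summary relies on.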

Provider routes default to redact

Agent-provider routes (the agent talking to its own LLM API — api.anthropic.com, the Codex backend, etc.) carry the whole conversation payload, which is the worst source of token-shaped false positives. They default to redact (filled centrally in egress_routes_for_bottle), so a match there is scrubbed and forwarded rather than blocked or queued. A provider that sets outbound_on_match explicitly keeps its choice; user-declared manifest routes still default to supervise. (Tradeoff: token-shaped values in a prompt are scrubbed before reaching the LLM — a leak-prevention plus, but the model won't see a real token you ask it to debug.)
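The defaulting rule can be sketched as follows. This is an assumption-laden illustration, not the body of egress_routes_for_bottle: the `Route` dataclass, its `is_provider` flag, and `fill_default` are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Route:
    """Hypothetical stand-in for a resolved egress route."""
    host: str
    is_provider: bool = False
    outbound_on_match: Optional[str] = None  # None means "not set"


def fill_default(route: Route) -> Route:
    """Fill the on-match policy only when the route did not set one."""
    if route.outbound_on_match is not None:
        return route  # an explicit choice always wins
    default = "redact" if route.is_provider else "supervise"
    return replace(route, outbound_on_match=default)
```

Filling the default in one place keeps the provider and manifest code paths from drifting apart on what "unset" means.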

CRLF false-positive fix

The URL-encoded CRLF (%0d%0a) detector was firing on legitimate requests (the Claude Code login flow) and bypassing the policy as a hard 403. Root cause: CRLF injection is only an attack in the request line + headers — an HTTP body is delimited by Content-Length, so CRLF bytes there can't split anything — but the scan flattened the body into the same blob. CRLF is now scanned only over the body-excluded request line + headers (websocket data frames excluded too), and redact also scrubs CRLF sequences.
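A rough sketch of the body-exclusion idea, assuming a simple case-insensitive match on `%0d%0a`. The helper names (`crlf_scan_text`, `has_crlf`) and the regex are illustrative and are not taken from the addon.

```python
import re
from typing import Mapping

# Assumed pattern: the URL-encoded CRLF sequence, in either letter case.
CRLF_RE = re.compile(r"%0d%0a", re.IGNORECASE)


def crlf_scan_text(method: str, path: str, headers: Mapping[str, str]) -> str:
    """Build the text the CRLF check sees: request line plus headers only."""
    lines = [f"{method} {path}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\n".join(lines)


def has_crlf(method: str, path: str, headers: Mapping[str, str],
             body: str = "") -> bool:
    """Report CRLF in the request line or headers; the body is ignored.

    The body parameter exists only to show that it never reaches the scan.
    """
    return CRLF_RE.search(crlf_scan_text(method, path, headers)) is not None
```

With this shape a form-encoded body containing `%0D%0A` no longer trips the detector, while the same sequence in the path or a header value still does.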

Schema

New route field, threaded manifest → rendered routes.yaml → addon and round-tripped via list-egress-routes:

egress:
  routes:
    - host: logs.example.com
      dlp:
        outbound_on_match: redact   # block | redact | supervise
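For illustration, validation of the new field could look roughly like the sketch below. It operates on an already-parsed route mapping; `parse_outbound_on_match` and `VALID_POLICIES` are hypothetical names, and the real manifest and core parsers may report errors differently.

```python
from typing import Any, Mapping, Optional

VALID_POLICIES = ("block", "redact", "supervise")


def parse_outbound_on_match(route: Mapping[str, Any]) -> Optional[str]:
    """Return the route's on-match policy, or None when it is unset.

    Raises ValueError for anything outside the three documented values.
    """
    dlp = route.get("dlp") or {}
    value = dlp.get("outbound_on_match")
    if value is None:
        return None  # caller applies the default
    if value not in VALID_POLICIES:
        raise ValueError(
            f"dlp.outbound_on_match must be one of {VALID_POLICIES}, got {value!r}"
        )
    return value
```

Returning None for an unset key leaves the choice of default to the caller, which matters here because provider and manifest routes default differently.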

Scope / non-goals

  • Only outbound token blocks are policy-driven. Not-in-allowlist host blocks and git blocks keep their existing agent-driven allow / egress-block MCP path; inbound prompt-injection blocks are unchanged.
  • The safelist is in-memory only; a restart re-prompts (persistence deferred per the issue).
  • EGRESS_TOKEN_ALLOW_TIMEOUT_SECONDS (default 300) bounds the supervise wait.
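The bounded supervise wait can be pictured as a small async polling loop. This is a sketch under stated assumptions: `wait_for_decision`, the `poll` callable, and the `"approve"` answer string are hypothetical, and the real addon reads the shared queue rather than calling a function.

```python
import asyncio
import time
from typing import Callable, Optional


async def wait_for_decision(poll: Callable[[], Optional[str]],
                            timeout_seconds: float = 300.0,
                            interval: float = 0.5) -> bool:
    """Poll for the operator's answer without blocking the event loop.

    Returns True only on an explicit approval. Rejection, an unrecognised
    answer, or hitting the timeout all return False (fail closed).
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        decision = poll()
        if decision is not None:
            return decision == "approve"
        # Yield to the loop so other proxied requests keep flowing.
        await asyncio.sleep(interval)
    return False
```

Using `asyncio.sleep` between polls is what keeps a held request from stalling unrelated traffic while the operator decides.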

Tests

1117 unit tests pass. Coverage: matched/safe_tokens plumbing and build_token_allow_payload (asserts the raw token never appears), the egress-token-allow tool round-trip and TUI approve/reason/suffix paths, outbound_on_match parse+validation at the manifest and core layers, a full manifest→render→addon round-trip for redact, the provider default (plus explicit-override preservation), and CRLF body-exclusion / strip_crlf. The pre-existing integration failures (test_smolmachines_launch, test_sandbox_escape) need real backends and fail identically on a clean tree.

didericis-claude added 1 commit 2026-06-24 16:13:20 -04:00
PRD 0062: supervisor override for egress token blocks
lint / lint (push) Successful in 1m42s
test / unit (pull_request) Successful in 31s
test / integration (pull_request) Successful in 16s
7f2352287e
When the outbound DLP catches a token, route the block through the
existing supervisor approval queue instead of returning 403 outright.
The egress proxy holds the request open until the operator answers, then
remembers an approved value for the life of the proxy so the request --
and later ones carrying it -- flow through. Fails closed on rejection,
timeout, malformed response, or when supervise is disabled.

- ScanResult.matched carries the raw matched substring (sidecar-only;
  never logged or written to the proposal). scan_outbound and the token
  detectors take a safe_tokens set and skip approved values, continuing
  past a safelisted match so a second secret in the same request is
  still caught.
- New egress-token-allow proposal tool, written directly to the queue by
  the addon (the gitleaks-allow pattern from PRD 0061).
  build_token_allow_payload renders host/method/path/detector reason +
  redacted context.
- Async request hook polls the queue without stalling the proxy event
  loop; EGRESS_TOKEN_ALLOW_TIMEOUT_SECONDS (default 300) bounds the wait.
- Supervisor TUI renders egress-token-allow like gitleaks-allow: report
  only, modify unavailable, approval requires a recorded reason.
- Unit tests for the matched/safe-tokens plumbing, payload builder, tool
  constant round-trip, and TUI paths; README + PRD 0062.

Closes #261.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HnvBjPZC5V7qeQpFbQdDmS
didericis added 1 commit 2026-06-24 16:50:18 -04:00
Add dlp.outbound_on_match policy (block | redact | supervise)
lint / lint (push) Successful in 1m41s
test / unit (pull_request) Successful in 30s
test / integration (pull_request) Successful in 18s
cdfaaa3de8
Give each egress route a policy for what the proxy does when an outbound
DLP detector matches a token, defaulting to the supervise flow added in
the previous commit. The goal is cutting false-positive friction without
weakening default-deny.

- redact: scrub the matched value(s) from the body, non-host headers, and
  path/query via redact_tokens, then re-scan. Forward if clean; fail
  closed with a 403 if a match remains on a surface redaction can't
  rewrite (the hostname, or a unicode-evasion token). For routes where a
  token-shaped value is noise the upstream doesn't need.
- block: the original hard 403, never overridable.
- supervise (default, unset): hold the request for operator approval.

Structural blocks (CRLF, no safelist-able value) stay hard 403s under
every policy.

Threads outbound_on_match from the bottle manifest (manifest_egress)
through the resolved EgressRoute and rendered routes.yaml (egress.py) to
the addon's Route (egress_addon_core), and round-trips it via the
list-egress-routes introspection endpoint. The allow/egress-block tool
descriptions document the new key.

Tests: manifest parse/validation, core parse/validation, full
manifest->render->addon round-trip for redact. README + PRD 0062 updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HnvBjPZC5V7qeQpFbQdDmS
didericis-claude changed title from PRD 0062: Supervisor override for egress token blocks to PRD 0062: Egress token-block policy (supervise / redact / block) 2026-06-24 16:50:36 -04:00
didericis added 1 commit 2026-06-24 20:37:30 -04:00
Stop scanning the request body for CRLF injection
lint / lint (push) Successful in 1m41s
test / unit (pull_request) Successful in 31s
test / integration (pull_request) Successful in 18s
b411577e76
A 403 "egress DLP: URL-encoded CRLF (%0d%0a)" was firing on legitimate
requests (e.g. the Claude Code login flow) and bypassing the on-match
policy entirely, because CRLF blocks carry no matched value and were
routed straight to a hard 403.

Root cause: CRLF injection is only an attack in the request line and
headers. An HTTP body is delimited by Content-Length, so CRLF bytes in
the body cannot split the request — but the scan flattened the body into
the same blob it checked, so form-encoded / multi-line body content
(which legitimately contains %0d%0a) tripped it.

Fix:
- scan_outbound takes a crlf_text param; the addon scans CRLF only over
  the body-excluded request line + headers. crlf_text=None keeps the
  old full-blob behavior for host-side callers/tests; the websocket path
  passes "" since a data frame is not a request line.
- The redact policy now also scrubs CRLF (new strip_crlf helper) from the
  path and headers, so redact is a complete escape hatch and structural
  CRLF in the URL/headers can be forwarded when a route opts into it.

Tests: strip_crlf unit tests; scan_outbound crlf_text body-exclusion and
backward-compat tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HnvBjPZC5V7qeQpFbQdDmS
didericis added 1 commit 2026-06-24 20:40:39 -04:00
Default agent-provider routes to the redact on-match policy
lint / lint (push) Successful in 1m42s
test / unit (pull_request) Successful in 34s
test / integration (pull_request) Successful in 16s
1ad710a041
Provider routes (the agent talking to its own LLM API — api.anthropic.com,
the Codex backend, etc.) carry the whole conversation payload, which is the
worst source of token-shaped false positives. egress_routes_for_bottle now
fills outbound_on_match=redact on any provider route that doesn't set it
explicitly, so a match there is scrubbed and forwarded rather than blocked
or queued for the operator. A provider that sets the policy keeps its
choice; manifest routes still default to supervise.

Tests: provider route gets redact default, explicit provider policy
preserved, manifest route unaffected. README + PRD 0062 updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HnvBjPZC5V7qeQpFbQdDmS
didericis added 1 commit 2026-06-24 21:10:37 -04:00
Restructure PRD 0062 to the init-prd template
test / unit (pull_request) Successful in 39s
test / integration (pull_request) Successful in 25s
lint / lint (push) Successful in 2m8s
test / unit (push) Successful in 41s
test / integration (push) Successful in 25s
Update Quality Badges / update-badges (push) Successful in 1m32s
d2d50be65a
Conform the PRD to the standard PRD-new skeleton: add a Scope section
(In scope / Out of scope), rename Design -> Proposed Design and split
its prose into New services / Existing code touched / Data model
changes / External dependencies, fold the old Implementation chunks
into In scope, and add a References section. No change in substance.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
didericis-claude changed title from PRD 0062: Egress token-block policy (supervise / redact / block) to PRD: Egress token-block policy (supervise / redact / block) 2026-06-24 21:15:41 -04:00
didericis merged commit d2d50be65a into main 2026-06-24 21:17:00 -04:00
didericis deleted branch prd-0062-egress-supervisor-token-allow 2026-06-24 21:17:01 -04:00