# PRD 0062: Supervisor override for egress token blocks

- **Status:** Active
- **Author:** claude
- **Created:** 2026-06-24
- **Issue:** #261

## Summary

When the egress proxy blocks an outbound request because a DLP detector
matched a token/secret, route that block through the existing supervisor
approval queue instead of returning `403` immediately. The proxy holds the
request open until the operator approves or rejects it. On approval, the
matched token is added to an in-memory "safe tokens" set so the request — and
any later request carrying the same token — flows through without re-prompting.

## Problem

The outbound DLP detectors (`token_patterns`, `known_secrets`) are
deliberately aggressive: any string that looks like a credential is blocked
before it leaves the bottle. That is the right default, but it produces false
positives — a token-shaped value that is not actually a secret, or a credential
the agent legitimately needs to send to a declared host. Today the only
recovery is for the operator to notice the `egress DLP` 403 in the logs and
hand-edit the route's `dlp.outbound_detectors`, which disables the detector for
the whole route rather than allowing the one value.

The operator has no in-the-loop signal that a token block happened and no
fine-grained way to say "this specific value is fine."

## Goals / Success Criteria

1. An outbound DLP **token** block (a `ScanResult` carrying a matched secret
   value) creates a supervisor proposal instead of an immediate `403`.
2. The egress proxy holds the blocked request open, polling for the operator's
   response up to a bounded timeout.
3. The proposal shows the operator the host, method, path, the detector reason,
   and a **redacted** context snippet — never the raw token value.
4. On `approved`/`modified`, the matched token value is added to an in-memory
   safe-tokens set and the request proceeds normally; later requests carrying
   the same value skip the block.
5. On `rejected`, timeout, malformed response, or missing supervisor wiring,
   the request fails closed with the same `403` as today.
6. Structural blocks that carry no token value (CRLF injection) and the
   route-not-allowlisted / git blocks are unchanged — they stay hard `403`s and
   keep their existing agent-driven `allow` / `egress-block` MCP path.
7. The proxy event loop is not stalled while waiting: the wait is asynchronous,
   so other flows keep being served.

## Non-goals

- Persisting the safe-tokens set across egress restarts. It lives in process
  memory only; a restart re-prompts. (The issue explicitly defers persistence.)
- Supervising inbound (prompt-injection) blocks or WebSocket frame blocks.
  WebSocket frames still honour the safe-tokens set for already-approved values
  but cannot wait for approval (there is no response surface after upgrade).
- Generalising an approved secret across encodings. The safe-tokens set matches
  the exact value the detector found.
- Replacing the per-route `dlp.outbound_detectors` override. That remains the
  way to turn a detector off wholesale.

## Design

### Detected-value plumbing

`ScanResult` gains a `matched: str = ""` field carrying the raw substring the
detector matched. The token detectors (`scan_token_patterns`,
`scan_known_secrets`) populate it; the structural CRLF detector leaves it
empty. The value stays inside the egress sidecar process: it is never written
to a log line (logs already use the redacted `context`) or to the proposal
file.

`scan_outbound` (and the token detectors it calls) accept a `safe_tokens`
set. A match whose value is in `safe_tokens` is skipped, so an approved token
no longer blocks. The scanners keep searching past a safelisted match so a
second, un-approved secret in the same request is still caught.
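
The plumbing above can be sketched as follows. This is a minimal illustration, not the actual detector code: the regex, the `blocked` and `reason` fields, and the redaction format are assumptions made for the example. Only `matched`, `context`, and the `safe_tokens` parameter come from this PRD.

```python
import re
from dataclasses import dataclass
from typing import AbstractSet

# Hypothetical pattern for illustration only; the real detector patterns
# are not specified in this PRD.
_TOKEN_RE = re.compile(r"sk-[A-Za-z0-9]{20,}")


@dataclass(frozen=True)
class ScanResult:
    blocked: bool        # assumed field name
    reason: str = ""     # assumed field name
    context: str = ""    # redacted snippet, safe to log
    matched: str = ""    # raw matched value; must never be logged


def scan_token_patterns(
    text: str, safe_tokens: AbstractSet[str] = frozenset()
) -> ScanResult:
    for m in _TOKEN_RE.finditer(text):
        value = m.group(0)
        if value in safe_tokens:
            # Already approved: skip it, but keep searching so a second,
            # un-approved secret in the same text is still caught.
            continue
        start, end = m.span()
        # Replace the match itself with a fixed mask in the context snippet.
        context = f"...{text[max(0, start - 10):start]}********{text[end:end + 10]}..."
        return ScanResult(True, "token-shaped value found", context, value)
    return ScanResult(False)
```

Using `finditer` rather than a single `search` is what lets the scan continue past a safelisted match.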

### Supervisor proposal

A new proposal tool constant `egress-token-allow` is added to
`supervise.TOOLS`. The egress addon writes the proposal directly to
`SUPERVISE_QUEUE_DIR` (the queue is bind-mounted into the sidecar bundle and
shared by every daemon, exactly as git-gate's `gitleaks-allow` proposal in PRD
0061 does). The proposal's `proposed_file` is a human-readable text payload:

```
egress blocked an outbound request carrying a detected token
host: api.example.com
method: POST
path: /v1/ingest
detector: OpenAI API key found in body
context: ...before ******** after...
```

The justification tells the operator to approve only if the value is a false
positive or a credential the request legitimately needs.
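
A text payload of this shape could be assembled by `build_token_allow_payload` roughly as sketched here. The parameter names and signature are assumptions; the point of the sketch is that the builder only ever receives the redacted `context`, so the raw token cannot reach the proposal file.

```python
def build_token_allow_payload(
    host: str, method: str, path: str, reason: str, context: str
) -> str:
    """Render the human-readable proposal text (assumed signature).

    `context` must already be redacted; the raw matched value is
    deliberately not a parameter.
    """
    lines = [
        "egress blocked an outbound request carrying a detected token",
        f"host: {host}",
        f"method: {method}",
        f"path: {path}",
        f"detector: {reason}",
        f"context: {context}",
    ]
    return "\n".join(lines) + "\n"
```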

The addon then polls for `<proposal-id>.response.json` for up to
`EGRESS_TOKEN_ALLOW_TIMEOUT_SECONDS` seconds (default 300). `approved` and
`modified` allow the request and add the value to the safe-tokens set;
`rejected`, malformed responses, and timeout fail the request closed. The
proposal and response are archived to `processed/` after a decision.

Because the wait happens inside mitmproxy's asyncio loop, the addon's
`request` hook is async and polls with `asyncio.sleep`, so concurrent flows
are unaffected.
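
The non-blocking wait might look like the sketch below. The function name, the `status` key in the response JSON, and the poll interval are assumptions for illustration; the real response schema is defined by the supervise queue and is not shown in this PRD.

```python
import asyncio
import json
import time
from pathlib import Path

# Statuses that allow the request; anything else fails closed.
ALLOW_STATUSES = ("approved", "modified")


async def wait_for_decision(
    queue_dir: Path,
    proposal_id: str,
    timeout: float = 300.0,
    interval: float = 0.5,
) -> bool:
    """Return True only on an explicit allow; False on reject, bad data, or timeout."""
    response_path = queue_dir / f"{proposal_id}.response.json"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if response_path.exists():
            try:
                data = json.loads(response_path.read_text())
            except (OSError, ValueError):
                return False  # unreadable or malformed response: fail closed
            # "status" is an assumed key name.
            return isinstance(data, dict) and data.get("status") in ALLOW_STATUSES
        # Yield to the event loop so other flows keep being served.
        await asyncio.sleep(interval)
    return False  # timed out: fail closed
```

An async `request` hook would `await` this coroutine and, on `True`, add the matched value to the safe-tokens set before letting the flow continue. The sketch treats a half-written response file as malformed, so it assumes the writer creates the file atomically.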

### Supervisor UI

`cli/supervise.py` renders `egress-token-allow` like `gitleaks-allow`: the
text payload is shown, modify is unavailable (there is no file patch to edit),
and approval prompts for a non-empty reason that is recorded in the response
notes. There is no on-disk config diff, so — like `gitleaks-allow` and
`capability-block` — it writes no egress audit-log entry.

### Failure handling

If `SUPERVISE_QUEUE_DIR` / `SUPERVISE_BOTTLE_SLUG` are unset (supervise
disabled for the bottle), the addon skips the queue and returns the existing
`403`. Any error writing the proposal or reading the response also fails
closed.
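
As a rough sketch of the wiring check, the addon could resolve both environment variables up front and treat a missing or empty value as "supervise disabled". The helper name `supervise_target` is hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple


def supervise_target(
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[str, str]]:
    """Return (queue_dir, bottle_slug), or None if supervise is not wired up."""
    queue_dir = env.get("SUPERVISE_QUEUE_DIR", "")
    slug = env.get("SUPERVISE_BOTTLE_SLUG", "")
    if not queue_dir or not slug:
        # Missing wiring: the caller skips the queue and returns the 403.
        return None
    return queue_dir, slug
```

A caller would return the existing `403` whenever this yields `None`, and would wrap the proposal write and response read in a handler that also returns the `403` on any exception.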

## Implementation chunks

1. **Core** — `ScanResult.matched`; thread `safe_tokens` through
   `scan_outbound` / token detectors; `build_token_allow_payload`.
2. **Supervise + TUI** — `TOOL_EGRESS_TOKEN_ALLOW`; TUI suffix, modify guard,
   required approval reason.
3. **Addon glue** — async `request`, safe-tokens set, proposal write + async
   poll, allow/block decision; pass `safe_tokens` into the WebSocket path.
4. **Tests + docs** — core/supervise/TUI unit tests; README egress + supervisor
   notes.

## Open questions

- Should `known_secrets` (provisioned `EGRESS_TOKEN_*` exfiltration) be
  override-able at all, or only `token_patterns`? This PRD allows both —
  approval is an explicit operator decision and the safe-tokens set matches the
  exact found value — but a future revision could restrict `known_secrets` to
  reject-only.