bot-bottle/docs/prds/0055-egress-traffic-logging.md

# PRD 0055: Egress traffic logging

- **Status:** Active
- **Author:** claude
- **Created:** 2026-06-06
- **PR:** #207

## Summary

Adds structured log levels to the egress proxy so operators can observe
traffic and security decisions without modifying any application code.
Three integer levels control verbosity: `0` (off), `1` (security events
only), and `2` (full request/response capture). All output is JSON lines
written to stderr.

## Problem

The egress proxy makes per-request allow/block decisions and runs DLP scans,
but until now those decisions have been invisible unless a request is actually
blocked and the caller inspects the 403 body. Debugging unexpected blocks,
auditing what an agent sends upstream, and verifying DLP detector behaviour
all require adding ad-hoc instrumentation or tailing sidecar container logs
that have no structure to grep against.

## Goals / Success Criteria

1. **Level 0 (off, default):** no egress output to stderr beyond the boot
   line. This preserves existing behaviour for production deployments.
2. **Level 1 (blocks):** every block or DLP warn event is emitted to stderr
   as a JSON line with the event type, human-readable reason (including the
   secret type detected for DLP hits), and the request context (host, method,
   path; plus upstream status code for response-phase events). No traffic
   bodies are logged.
3. **Level 2 (full):** all level-1 events, plus an `egress_request` JSON line
   for every forwarded request (method, path, headers, body after auth
   injection) and an `egress_response` JSON line for every response that
   passes DLP (status, headers, body).
4. The log level is a single integer field `log` at the top of the egress
   config (`routes.yaml` in the sidecar; `egress.log` in the bottle manifest).
   Values other than 0, 1, or 2 are rejected at parse time by both the
   manifest parser and the sidecar.
5. The boot message includes the active log level label (`off`, `blocks`,
   `full`).
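
The mapping between levels and boot-message labels can be sketched as below. The constant names `LOG_OFF`, `LOG_BLOCKS`, and `LOG_FULL` come from the Implementation section; `LOG_LABELS` and `level_label()` are hypothetical names used only for illustration, and the actual boot message format is not specified here.

```python
# Constant names follow the Implementation section of this PRD.
LOG_OFF = 0
LOG_BLOCKS = 1
LOG_FULL = 2

# Hypothetical lookup table; the real addon may store labels differently.
LOG_LABELS = {LOG_OFF: "off", LOG_BLOCKS: "blocks", LOG_FULL: "full"}


def level_label(level: int) -> str:
    """Return the label shown in the boot message, or raise on an unknown level."""
    try:
        return LOG_LABELS[level]
    except KeyError:
        raise ValueError(f"unknown egress log level: {level!r}") from None
```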

## Non-goals

- Log rotation or file sinks — stderr output is captured by the container
  runtime (Docker, smolmachines) and goes wherever the operator routes it.
- Per-route log levels — all routes share the global level.
- Redacting secrets from the level-2 body dump — at level 2 the operator
  has explicitly requested full visibility; redaction belongs in the
  log consumer, not the proxy.

## Design

### Wire format

`routes.yaml` gains an optional top-level `log` key:

```yaml
log: 1          # 0 = off (default), 1 = blocks, 2 = full
routes:
  - host: "api.anthropic.com"
    ...
```

The field is omitted entirely when the level is 0 (default).
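
A minimal sketch of that omission rule follows. `render_routes_yaml` is a hypothetical stand-in for `egress_render_routes`, and it assumes each route carries only a `host` key; the real renderer presumably emits more fields.

```python
def render_routes_yaml(hosts: list[str], *, log: int = 0) -> str:
    """Render a simplified routes.yaml; `log` appears only when it is above 0."""
    lines = []
    if log > 0:
        # Level 0 is the default, so the key is left out entirely.
        lines.append(f"log: {log}")
    lines.append("routes:")
    for host in hosts:
        lines.append(f'  - host: "{host}"')
    return "\n".join(lines) + "\n"
```

Omitting the key at level 0 means a rendered file for a default configuration is the same as one produced before the `log` field existed.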

### Manifest format

```yaml
egress:
  log: 1
  routes:
    - host: "api.anthropic.com"
      ...
```

`egress.log` accepts integers 0, 1, or 2. Booleans and strings are rejected.
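
The validation rule can be sketched as below. `parse_log_level` is a hypothetical helper, not the actual parser in `manifest_egress.py` or `parse_config()`; it only illustrates why booleans need an explicit check in Python.

```python
def parse_log_level(value: object) -> int:
    """Validate a raw `log` value (illustrative helper, not the real parser)."""
    # bool is a subclass of int in Python, so it must be rejected explicitly;
    # otherwise `log: true` would be accepted as level 1.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"egress.log must be an integer 0, 1, or 2, got {value!r}")
    if value not in (0, 1, 2):
        raise ValueError(f"egress.log must be 0, 1, or 2, got {value!r}")
    return value
```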

### Log events

**Block / DLP block (level ≥ 1):**
```json
{
  "event": "egress_block",
  "reason": "egress DLP: GitHub token (classic) found in request",
  "host": "api.github.com",
  "method": "POST",
  "path": "/gists"
}
```

Response-phase blocks also include `"response_status"`.
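
As a rough sketch of how such an event could be produced: `emit_block` and its parameters are hypothetical and do not reproduce the real `_block()` signature; only the field names and the `LOG_BLOCKS` value are taken from this document.

```python
import json
import sys
from typing import Optional, TextIO

LOG_BLOCKS = 1  # value taken from the Implementation section


def emit_block(
    level: int,
    reason: str,
    ctx: dict,
    response_status: Optional[int] = None,
    stream: Optional[TextIO] = None,
) -> None:
    """Write one `egress_block` JSON line if the level allows it."""
    if level < LOG_BLOCKS:
        return  # level 0: stay silent
    # ctx is assumed to hold the request context: host, method, path.
    event = {"event": "egress_block", "reason": reason, **ctx}
    if response_status is not None:
        # Only response-phase blocks carry the upstream status code.
        event["response_status"] = response_status
    out = stream if stream is not None else sys.stderr
    out.write(json.dumps(event) + "\n")
```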

**DLP warn (level ≥ 1):**
```json
{
  "event": "egress_warn",
  "reason": "egress DLP: possible prompt injection detected",
  "host": "api.anthropic.com",
  "method": "POST",
  "path": "/v1/messages",
  "response_status": 200
}
```

**Forwarded request (level 2):**
```json
{
  "event": "egress_request",
  "host": "api.anthropic.com",
  "method": "POST",
  "path": "/v1/messages",
  "headers": { "authorization": "Bearer sk-ant-...", "content-type": "application/json" },
  "body": "{\"model\": \"claude-opus-4-8\", ...}"
}
```

The request is logged after auth injection, so the logged `Authorization`
header is the one the proxy sends upstream. The agent's original
`Authorization` header is stripped before logging.

**Response (level 2):**
```json
{
  "event": "egress_response",
  "host": "api.anthropic.com",
  "status": 200,
  "headers": { "content-type": "application/json" },
  "body": "{\"id\": \"msg_...\", ...}"
}
```

Responses are logged after DLP scanning, and only those that pass, so the
body is always the raw upstream response.
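
Because level 2 leaves credentials in the log and the Non-goals section assigns redaction to the log consumer, a consumer-side filter might look like the sketch below. `redact_line` and the `SENSITIVE_HEADERS` set are assumptions made for illustration; they are not part of the proxy.

```python
import json

# Assumed list of header names to mask; adjust to the deployment.
SENSITIVE_HEADERS = {"authorization", "x-api-key", "cookie"}


def redact_line(line: str) -> str:
    """Mask sensitive header values in one JSON log line from the proxy."""
    event = json.loads(line)
    headers = event.get("headers")
    if isinstance(headers, dict):
        event["headers"] = {
            name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
            for name, value in headers.items()
        }
    # Events without headers (egress_block, egress_warn) pass through unchanged.
    return json.dumps(event)
```

This sketch masks headers only; secrets embedded in the `body` field would need separate handling.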

### Implementation

- **`egress_addon_core.py`**: `Config.log: int = LOG_OFF` (`LOG_OFF=0`,
  `LOG_BLOCKS=1`, `LOG_FULL=2`). `parse_config()` validates the integer and
  rejects booleans.
- **`egress_addon.py`**: `_block()` emits JSON when `log >= LOG_BLOCKS`. The
  `_req_ctx()` helper builds `{host, method, path}` for every call site.
  `_log_request()` / `_log_response()` fire when `log >= LOG_FULL`.
- **`manifest_egress.py`**: `EgressConfig.log: int = 0`, parsed from
  `egress.log`, validated against `{0, 1, 2}`.
- **`egress.py`**: `egress_render_routes(routes, *, log: int = 0)` emits
  `log: N` at the top of `routes.yaml` when N > 0. `EgressPlan.log: int = 0`.