feat(egress): replace log bool with integer log levels (0/1/2)

Level 0 (off, default): no stderr output beyond boot line.
Level 1 (blocks): each block/warn emitted as JSON with reason and request context (host, method, path, response_status for inbound).
Level 2 (full): level-1 events + egress_request and egress_response JSON lines for every forwarded connection.

Block logging at level 1+ replaces the previous plain-text stderr write. DLP warn logging is also gated on level 1+. All block call sites now pass _req_ctx(flow) so the blocked request is visible in the log entry. Boot message shows log level label (off/blocks/full).

Adds PRD 0053 documenting wire format, manifest format, and all log event shapes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,148 @@
# PRD 0053: Egress traffic logging

- **Status:** Active
- **Author:** claude
- **Created:** 2026-06-06
- **PR:** #207

## Summary

Adds structured log levels to the egress proxy so operators can observe
traffic and security decisions without modifying any application code.
Three integer levels control verbosity: `0` (off), `1` (security events
only), and `2` (full request/response capture). All output is JSON lines
written to stderr.

## Problem

The egress proxy makes per-request allow/block decisions and DLP scans, but
until now those decisions have been invisible unless something is actively
blocked and the caller inspects the 403 body. Debugging unexpected blocks,
auditing what an agent is sending upstream, and verifying DLP detector
behaviour all require adding ad-hoc instrumentation or tailing the sidecar
container logs with no structure to grep against.

## Goals / Success Criteria

1. **Level 0 (off, default):** no egress output to stderr beyond the boot
   line. Existing behaviour for production deployments.
2. **Level 1 (blocks):** every block or DLP warn event is emitted to stderr
   as a JSON line with the event type, human-readable reason (including the
   secret type detected for DLP hits), and the request context (host, method,
   path; plus upstream status code for response-phase events). No traffic
   bodies are logged.
3. **Level 2 (full):** all level-1 events, plus an `egress_request` JSON line
   for every forwarded request (method, path, headers, body after auth
   injection) and an `egress_response` JSON line for every response that
   passes DLP (status, headers, body).
4. The log level is a single integer field `log` at the top of the egress
   config (`routes.yaml` in the sidecar; `egress.log` in the bottle manifest).
   Values other than 0, 1, 2 are rejected at parse time on both sides.
5. The boot message includes the active log level label (`off`, `blocks`,
   `full`).
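
The level semantics above can be sketched as a small lookup. This is an illustration only, not the addon's code: the `LOG_*` constants mirror the names in the Implementation section, while `LEVEL_LABELS`, `_MIN_LEVEL`, and `should_emit` are hypothetical names introduced here.

```python
# Constant names follow the Implementation section; everything else is
# a hypothetical helper for illustration.
LOG_OFF, LOG_BLOCKS, LOG_FULL = 0, 1, 2

# Labels shown in the boot message (goal 5).
LEVEL_LABELS = {LOG_OFF: "off", LOG_BLOCKS: "blocks", LOG_FULL: "full"}

# Minimum level at which each event type is written to stderr.
_MIN_LEVEL = {
    "egress_block": LOG_BLOCKS,
    "egress_warn": LOG_BLOCKS,
    "egress_request": LOG_FULL,
    "egress_response": LOG_FULL,
}


def should_emit(event: str, level: int) -> bool:
    """Return True if `event` is logged at the configured `level`."""
    required = _MIN_LEVEL.get(event)
    return required is not None and level >= required
```

Because the levels are ordered integers, level 2 includes every level-1 event without any extra bookkeeping.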

## Non-goals

- Log rotation or file sinks: stderr output is captured by the container
  runtime (Docker, smolmachines) and goes wherever the operator routes it.
- Per-route log levels: all routes share the global level.
- Redacting secrets from the level-2 body dump: at level 2 the operator
  has explicitly requested full visibility; redaction belongs in the
  log consumer, not the proxy.

## Design

### Wire format

`routes.yaml` gains an optional top-level `log` key:

```yaml
log: 1   # 0 = off (default), 1 = blocks, 2 = full
routes:
  - host: "api.anthropic.com"
    ...
```

The field is omitted entirely when the level is 0 (default).
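
A minimal sketch of the omit-when-zero rule, assuming routes are reduced to bare host names. The real `egress_render_routes()` takes full route objects and its output has more keys; `render_routes_sketch` is a hypothetical stand-in that builds the text by hand to stay dependency-free.

```python
def render_routes_sketch(hosts: list[str], *, log: int = 0) -> str:
    """Render a simplified routes.yaml; `log:` appears only when log > 0."""
    lines = []
    if log > 0:
        # Level 0 is the default, so the key is left out entirely.
        lines.append(f"log: {log}")
    lines.append("routes:")
    for host in hosts:
        lines.append(f'  - host: "{host}"')
    return "\n".join(lines) + "\n"
```

Leaving the key out at level 0 keeps rendered files for default deployments unchanged from what they were before the `log` field existed.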

### Manifest format

```yaml
egress:
  log: 1
  routes:
    - host: "api.anthropic.com"
      ...
```

`egress.log` accepts integers 0, 1, or 2. Booleans and strings are rejected.
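
The boolean rejection needs an explicit check in Python, because `bool` is a subclass of `int` and `True` would otherwise pass as `1`. The sketch below shows one way to write that validation; `parse_log_level` is a hypothetical name, and the error wording is an assumption rather than the project's actual message.

```python
def parse_log_level(value: object) -> int:
    """Validate an `egress.log` value: only the integers 0, 1, 2 pass."""
    # bool must be tested first: isinstance(True, int) is True in Python.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"egress.log must be an integer 0, 1, or 2; got {value!r}")
    if value not in (0, 1, 2):
        raise ValueError(f"egress.log must be 0, 1, or 2; got {value}")
    return value
```

A caller would typically supply the default itself, for example `parse_log_level(cfg.get("log", 0))`, so that a missing key maps to level 0.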

### Log events

**Block / DLP block (level ≥ 1):**

```json
{
  "event": "egress_block",
  "reason": "egress DLP: GitHub token (classic) found in request",
  "host": "api.github.com",
  "method": "POST",
  "path": "/gists"
}
```

Response-phase blocks also include `"response_status"`.

**DLP warn (level ≥ 1):**

```json
{
  "event": "egress_warn",
  "reason": "egress DLP: possible prompt injection detected",
  "host": "api.anthropic.com",
  "method": "POST",
  "path": "/v1/messages",
  "response_status": 200
}
```
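
The block and warn shapes above could be produced along these lines. This is a sketch under assumptions, not the addon's `_block()` or `_req_ctx()`: the helpers here take plain strings instead of a flow object, and `req_ctx`, `format_event`, and `emit_event` are hypothetical names.

```python
import json
import sys
from typing import Optional, TextIO

LOG_BLOCKS = 1  # mirrors the constant named in the Implementation section


def req_ctx(host: str, method: str, path: str,
            response_status: Optional[int] = None) -> dict:
    """Build the request context; add the status for response-phase events."""
    ctx = {"host": host, "method": method, "path": path}
    if response_status is not None:
        ctx["response_status"] = response_status
    return ctx


def format_event(event: str, reason: str, ctx: dict, *, level: int) -> Optional[str]:
    """Return one JSON line for a block/warn event, or None below level 1."""
    if level < LOG_BLOCKS:
        return None
    return json.dumps({"event": event, "reason": reason, **ctx})


def emit_event(event: str, reason: str, ctx: dict, *, level: int,
               stream: TextIO = sys.stderr) -> None:
    """Write the event to `stream` (stderr by default) if the level allows it."""
    line = format_event(event, reason, ctx, level=level)
    if line is not None:
        stream.write(line + "\n")
```

Keeping formatting separate from writing makes the gating easy to test without capturing stderr.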

**Forwarded request (level 2):**

```json
{
  "event": "egress_request",
  "host": "api.anthropic.com",
  "method": "POST",
  "path": "/v1/messages",
  "headers": { "authorization": "Bearer sk-ant-...", "content-type": "application/json" },
  "body": "{\"model\": \"claude-opus-4-8\", ...}"
}
```

The request is logged after auth injection, so the outgoing `Authorization`
header is present. The agent's original `Authorization` header is stripped
before logging.

**Response (level 2):**

```json
{
  "event": "egress_response",
  "host": "api.anthropic.com",
  "status": 200,
  "headers": { "content-type": "application/json" },
  "body": "{\"id\": \"msg_...\", ...}"
}
```

Responses are logged before DLP scanning, so the body is always the raw
upstream response.
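
Since redaction is left to the log consumer, a consumer-side filter might look like the sketch below. It is an assumption-laden example, not part of the proxy: `redact_lines` and `SENSITIVE_HEADERS` are hypothetical, the header list is illustrative, and bodies are passed through untouched.

```python
import json
from typing import Iterable, Iterator

# Illustrative set of header names to mask; adjust for the deployment.
SENSITIVE_HEADERS = {"authorization", "x-api-key", "cookie"}


def redact_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Parse JSON-lines log output and mask sensitive header values."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON object
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or interleaved output
        headers = record.get("headers")
        if isinstance(headers, dict):
            record["headers"] = {
                name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
                for name, value in headers.items()
            }
        yield record
```

In practice the input would be the captured stderr stream, for example `redact_lines(sys.stdin)` in a small script placed after the container runtime's log output.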

### Implementation

- **`egress_addon_core.py`**: `Config.log: int = LOG_OFF` (`LOG_OFF=0`,
  `LOG_BLOCKS=1`, `LOG_FULL=2`). `parse_config()` validates the integer and
  rejects booleans.
- **`egress_addon.py`**: `_block()` emits JSON when `log >= LOG_BLOCKS`. The
  `_req_ctx()` helper builds `{host, method, path}` for every call site.
  `_log_request()` / `_log_response()` fire when `log >= LOG_FULL`.
- **`manifest_egress.py`**: `EgressConfig.log: int = 0`, parsed from
  `egress.log`, validated against `{0, 1, 2}`.
- **`egress.py`**: `egress_render_routes(routes, *, log: int = 0)` emits
  `log: N` at the top of `routes.yaml` when N > 0. `EgressPlan.log: int = 0`.