# PRD 0055: Egress traffic logging

- **Status:** Active
- **Author:** claude
- **Created:** 2026-06-06
- **PR:** #207

## Summary

Adds structured log levels to the egress proxy so operators can observe
traffic and security decisions without modifying any application code.
Three integer levels control verbosity: `0` (off), `1` (security events
only), and `2` (full request/response capture). All output is JSON lines
written to stderr.

## Problem

The egress proxy makes per-request allow/block decisions and runs DLP scans,
but until now those decisions have been invisible unless something is actively
blocked and the caller inspects the 403 body. Debugging unexpected blocks,
auditing what an agent is sending upstream, and verifying DLP detector
behaviour all require adding ad-hoc instrumentation or tailing the sidecar
container logs with no structure to grep against.

## Goals / Success Criteria

1. **Level 0 (off, default):** no egress output to stderr beyond the boot
   line. This preserves the existing behaviour for production deployments.
2. **Level 1 (blocks):** every block or DLP warn event is emitted to stderr
   as a JSON line with the event type, human-readable reason (including the
   secret type detected for DLP hits), and the request context (host, method,
   path; plus upstream status code for response-phase events). No traffic
   bodies are logged.
3. **Level 2 (full):** all level-1 events, plus an `egress_request` JSON line
   for every forwarded request (method, path, headers, body after auth
   injection) and an `egress_response` JSON line for every response that
   passes DLP (status, headers, body).
4. The log level is a single integer field `log` at the top of the egress
   config (`routes.yaml` in the sidecar; `egress.log` in the bottle manifest).
   Values other than 0, 1, 2 are rejected at parse time on both sides.
5. The boot message includes the active log level label (`off`, `blocks`,
   `full`).

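
The level semantics above can be sketched as two threshold checks plus a label table. This is a minimal illustration, not the shipped code: the constant names follow the Implementation section, while `LOG_LABELS`, the two predicate functions, and the wording of `boot_line()` are assumptions made for the example.

```python
# Constant names follow the Implementation section; everything else here
# (LOG_LABELS, the predicates, the boot-line wording) is hypothetical.
LOG_OFF, LOG_BLOCKS, LOG_FULL = 0, 1, 2

LOG_LABELS = {LOG_OFF: "off", LOG_BLOCKS: "blocks", LOG_FULL: "full"}


def should_log_security_event(level: int) -> bool:
    """Block and DLP warn events are emitted at level 1 and above."""
    return level >= LOG_BLOCKS


def should_log_traffic(level: int) -> bool:
    """Full request/response capture happens only at level 2."""
    return level >= LOG_FULL


def boot_line(level: int) -> str:
    """Assumed boot message format; only the label set comes from the PRD."""
    return f"egress proxy started (log={LOG_LABELS[level]})"
```

Because the levels are ordered, level 2 implies every level-1 event without any extra bookkeeping.
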
## Non-goals

- Log rotation or file sinks: stderr output is captured by the container
  runtime (Docker, smolmachines) and goes wherever the operator routes it.
- Per-route log levels: all routes share the global level.
- Redacting secrets from the level-2 body dump: at level 2 the operator
  has explicitly requested full visibility; redaction belongs in the
  log consumer, not the proxy.

## Design

### Wire format

`routes.yaml` gains an optional top-level `log` key:

```yaml
log: 1  # 0 = off (default), 1 = blocks, 2 = full
routes:
  - host: "api.anthropic.com"
    ...
```

The field is omitted entirely when the level is 0 (default).

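
The omit-when-zero rule could be rendered roughly as follows. Both functions are hypothetical, simplified stand-ins for `egress_render_routes()`; in particular, the real route entries carry more than a `host` key, and the exact YAML layout here is an assumption.

```python
def render_log_line(log: int = 0) -> str:
    """Return the `log: N` line for routes.yaml, or "" when the level is 0."""
    return f"log: {log}\n" if log > 0 else ""


def render_routes_sketch(hosts: list[str], *, log: int = 0) -> str:
    """Hypothetical, simplified stand-in for egress_render_routes()."""
    lines = [render_log_line(log), "routes:\n"]
    # Only the host key is rendered in this sketch.
    lines += [f'  - host: "{host}"\n' for host in hosts]
    return "".join(lines)
```

Omitting the key at level 0 keeps the rendered file byte-identical to what a deployment without logging would have produced.
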
### Manifest format

```yaml
egress:
  log: 1
  routes:
    - host: "api.anthropic.com"
      ...
```

`egress.log` accepts integers 0, 1, or 2. Booleans and strings are rejected.

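
Rejecting booleans needs an explicit check in Python, because `bool` is a subclass of `int` and `True in {0, 1, 2}` evaluates to `True`. A minimal sketch of the validation, assuming a hypothetical helper `parse_log_level()` that receives the already-decoded YAML value (the real parsers may be structured differently):

```python
VALID_LOG_LEVELS = {0, 1, 2}


def parse_log_level(raw: object) -> int:
    """Validate a raw `log` value; hypothetical helper, not the project's API."""
    if raw is None:
        return 0  # key absent: default to off
    # bool is a subclass of int in Python, so rule it out before the int test.
    if isinstance(raw, bool) or not isinstance(raw, int):
        raise ValueError(f"egress.log must be an integer 0, 1, or 2; got {raw!r}")
    if raw not in VALID_LOG_LEVELS:
        raise ValueError(f"egress.log must be 0, 1, or 2; got {raw}")
    return raw
```

With this check, `log: true` and `log: "1"` fail at parse time instead of silently enabling level 1.
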
### Log events

**Block / DLP block (level ≥ 1):**

```json
{
  "event": "egress_block",
  "reason": "egress DLP: GitHub token (classic) found in request",
  "host": "api.github.com",
  "method": "POST",
  "path": "/gists"
}
```

A response-phase block also includes `"response_status"`.

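
One way to assemble and emit such an event is sketched below. The function names `req_ctx()`, `block_event()`, and `emit()` are illustrative only; `req_ctx()` loosely mirrors the `_req_ctx()` helper named under Implementation, whose actual signature is not shown in this document.

```python
import json
import sys
from typing import Optional


def req_ctx(host: str, method: str, path: str) -> dict:
    """Hypothetical analogue of the _req_ctx() helper: the shared request context."""
    return {"host": host, "method": method, "path": path}


def block_event(reason: str, ctx: dict, response_status: Optional[int] = None) -> dict:
    """Build an egress_block event; response_status is set only in the response phase."""
    event = {"event": "egress_block", "reason": reason, **ctx}
    if response_status is not None:
        event["response_status"] = response_status
    return event


def emit(event: dict, stream=sys.stderr) -> None:
    """Write one event as a single JSON line (stderr by default)."""
    stream.write(json.dumps(event) + "\n")
```

`json.dumps` escapes newlines inside string values, so each event stays on one line and the output remains valid JSON lines.
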
**DLP warn (level ≥ 1):**

```json
{
  "event": "egress_warn",
  "reason": "egress DLP: possible prompt injection detected",
  "host": "api.anthropic.com",
  "method": "POST",
  "path": "/v1/messages",
  "response_status": 200
}
```

**Forwarded request (level 2):**

```json
{
  "event": "egress_request",
  "host": "api.anthropic.com",
  "method": "POST",
  "path": "/v1/messages",
  "headers": { "authorization": "Bearer sk-ant-...", "content-type": "application/json" },
  "body": "{\"model\": \"claude-opus-4-8\", ...}"
}
```

The request is logged after auth injection, so the outgoing `Authorization`
header is present. The agent's original `Authorization` header is stripped
before logging.

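
The ordering described above (strip the agent's header, inject the proxy's credential, then log) might look like the following. `build_request_event()` is a hypothetical function written for this example, and lowercasing header names is an assumption taken from the sample event, not a documented guarantee.

```python
import json


def build_request_event(host, method, path, agent_headers, injected_auth, body):
    """Sketch of an egress_request event built after auth injection (hypothetical)."""
    # Drop the agent's own Authorization header, matching case-insensitively.
    headers = {
        name.lower(): value
        for name, value in agent_headers.items()
        if name.lower() != "authorization"
    }
    # The credential injected by the proxy is what goes upstream and into the log.
    headers["authorization"] = injected_auth
    return {
        "event": "egress_request",
        "host": host,
        "method": method,
        "path": path,
        "headers": headers,
        "body": body,
    }
```

Note that the injected credential appears in the log in full, which matches the non-goal of leaving redaction to the log consumer.
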
**Response (level 2):**

```json
{
  "event": "egress_response",
  "host": "api.anthropic.com",
  "status": 200,
  "headers": { "content-type": "application/json" },
  "body": "{\"id\": \"msg_...\", ...}"
}
```

Responses are logged before DLP scanning, so the body is always the raw
upstream response.

### Implementation

- **`egress_addon_core.py`**: `Config.log: int = LOG_OFF` (`LOG_OFF=0`,
  `LOG_BLOCKS=1`, `LOG_FULL=2`). `parse_config()` validates the integer and
  rejects booleans.
- **`egress_addon.py`**: `_block()` emits JSON when `log >= LOG_BLOCKS`. The
  `_req_ctx()` helper builds `{host, method, path}` for every call site.
  `_log_request()` / `_log_response()` fire when `log >= LOG_FULL`.
- **`manifest_egress.py`**: `EgressConfig.Log: int = 0`, parsed from
  `egress.log`, validated against `{0, 1, 2}`.
- **`egress.py`**: `egress_render_routes(routes, *, log: int = 0)` emits
  `log: N` at the top of `routes.yaml` when N > 0. `EgressPlan.log: int = 0`.