bot-bottle/docs/prds/prd-new-log-full-credential-redaction.md
didericis-claude 1f96619c6a
fix(egress): strip injected Authorization and redact bodies in LOG_FULL path
_log_request and _log_response wrote headers and bodies to stderr verbatim.
_log_request also included the sidecar-injected upstream Authorization value,
exposing live bearer tokens on every allowed request under LOG_FULL.

Apply redact_tokens to all header values and bodies in both log functions;
exclude the authorization header from _log_request entirely since its value
is always a live sidecar-injected credential by the time _log_request runs.

Closes #257
2026-06-24 23:04:22 -04:00


PRD prd-new: LOG_FULL egress logging credential redaction

  • Status: Draft
  • Author: claude
  • Created: 2026-06-25
  • Issue: #257

Summary

The LOG_FULL egress logging path (_log_request and _log_response in egress_addon.py) writes request/response headers and bodies to stderr without redaction and includes the sidecar-injected upstream Authorization header verbatim. This PR applies redact_tokens to header values and bodies in both log functions and strips the injected Authorization header from request logs entirely.

Problem

LOG_FULL (log level 2) is intended for debugging egress traffic. When active, it calls _log_request and _log_response. These functions have two related bugs; the first affects only _log_request, the second affects both:

  1. Injected Authorization header exposure. _log_request is called after the sidecar injects upstream credentials (flow.request.headers["authorization"] = decision.inject_authorization). The full header dict — including the live credential — is serialized to stderr. Any log collector that ingests the egress container's stderr will receive the upstream bearer token in plaintext.

  2. Unredacted bodies and header values. Neither _log_request nor _log_response passes body or header values through redact_tokens. By contrast, _req_ctx (used for block/warn events) already calls redact_tokens on path and host. Any provisioned secret or recognized token pattern that appears in a request body, response body, or non-Authorization header value will be logged verbatim under LOG_FULL.

These two bugs compose. Whenever LOG_FULL is enabled, any request that carries a known token copies that credential into the egress logs, so an agent that can enable LOG_FULL and trigger such a request gains a write path from credentials to egress logs.
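To make the exposure concrete, here is a minimal sketch that mimics the pre-fix behaviour with plain dictionaries. The token value and the `log_verbatim` helper are hypothetical stand-ins for illustration, not code from egress_addon.py.

```python
import json

# Hypothetical credential; stands in for decision.inject_authorization.
INJECTED = "Bearer sk-live-EXAMPLE"


def log_verbatim(headers: dict[str, str], body: str) -> str:
    """Mimic the pre-fix path: serialize headers and body exactly as received."""
    return json.dumps({"event": "egress_request", "headers": headers, "body": body})


# The sidecar injects the upstream credential before the log call runs.
headers = {"content-type": "application/json"}
headers["authorization"] = INJECTED

# The body also happens to carry the same token (bug 2).
line = log_verbatim(headers, '{"note": "api_key=sk-live-EXAMPLE"}')

# Both the injected header and the body copy of the token reach the log line.
leaked_header = INJECTED in line
leaked_body = line.count("sk-live-EXAMPLE") == 2
```

In this sketch both flags are true: the header leaks the injected credential and the body leaks a second copy of the same token.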

Goals / Success Criteria

  • _log_request never logs the authorization header in any form.
  • _log_request applies redact_tokens(value, env=os.environ) to every other header value before serializing.
  • _log_request applies redact_tokens(body, env=os.environ) to the request body before logging.
  • _log_response applies redact_tokens(value, env=os.environ) to every response header value before logging.
  • _log_response applies redact_tokens(body, env=os.environ) to the response body before logging.
  • Unit tests cover each of the five cases above.
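The five cases could be tested along the following lines. This is only a sketch under stated assumptions: `redact_tokens` here is a simplified stand-in that masks any value found in the supplied mapping (the real helper's behaviour is not shown in this document), and `request_record` / `response_record` are hypothetical pure functions that mirror the header and body handling described above, so the example runs without mitmproxy.

```python
from typing import Mapping


def redact_tokens(value: str, env: Mapping[str, str]) -> str:
    """Stand-in for the project's helper: mask every non-empty value in env."""
    for secret in env.values():
        if secret:
            value = value.replace(secret, "[REDACTED]")
    return value


def request_record(headers: Mapping[str, str], body: str, env: Mapping[str, str]) -> dict:
    """Hypothetical mirror of the request-side goals."""
    return {
        "headers": {
            k: redact_tokens(v, env)
            for k, v in headers.items()
            if k.lower() != "authorization"  # never logged in any form
        },
        "body": redact_tokens(body, env),
    }


def response_record(headers: Mapping[str, str], body: str, env: Mapping[str, str]) -> dict:
    """Hypothetical mirror of the response-side goals."""
    return {
        "headers": {k: redact_tokens(v, env) for k, v in headers.items()},
        "body": redact_tokens(body, env),
    }


# Hypothetical provisioned secret used only by these tests.
ENV = {"UPSTREAM_TOKEN": "tok-12345"}


def test_request_drops_authorization() -> None:
    rec = request_record({"Authorization": "Bearer tok-12345"}, "", ENV)
    assert rec["headers"] == {}


def test_request_redacts_other_headers() -> None:
    rec = request_record({"X-Api-Key": "tok-12345"}, "", ENV)
    assert rec["headers"]["X-Api-Key"] == "[REDACTED]"


def test_request_redacts_body() -> None:
    rec = request_record({}, '{"k": "tok-12345"}', ENV)
    assert rec["body"] == '{"k": "[REDACTED]"}'


def test_response_redacts_headers() -> None:
    rec = response_record({"Set-Cookie": "session=tok-12345"}, "", ENV)
    assert rec["headers"]["Set-Cookie"] == "session=[REDACTED]"


def test_response_redacts_body() -> None:
    rec = response_record({}, "token is tok-12345", ENV)
    assert rec["body"] == "token is [REDACTED]"
```

Real tests would presumably exercise the addon methods against mitmproxy flow objects and capture stderr; the pure-function form above only illustrates what each assertion needs to check.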

Non-goals

  • Changing host or path redaction. _log_request already calls redact_tokens on host and path, and _req_ctx does the same for block/warn events.
  • Suppressing LOG_FULL or adding a new log level.
  • Changing the outbound DLP scan logic.

Design

_log_request

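import json
import os
import sys

from mitmproxy import http

# redact_tokens is the project's existing redaction helper; its import is omitted here.

def _log_request(self, flow: http.HTTPFlow) -> None:
    # Drop the sidecar-injected Authorization header; redact every other value.
    headers = {
        k: redact_tokens(v, env=os.environ)
        for k, v in flow.request.headers.items()
        if k.lower() != "authorization"
    }
    # strict=False decodes leniently instead of raising on undecodable bodies.
    body = redact_tokens(flow.request.get_text(strict=False) or "", env=os.environ)
    # Emit one JSON object per line on stderr.
    sys.stderr.write(
        json.dumps({
            "event": "egress_request",
            "host": redact_tokens(flow.request.pretty_host, env=os.environ),
            "method": flow.request.method,
            "path": redact_tokens(flow.request.path, env=os.environ),
            "headers": headers,
            "body": body,
        })
        + "\n"
    )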

The authorization key is excluded because by the time _log_request is called the sidecar has already injected the upstream credential (decision.inject_authorization). Logging it would write a live bearer token to stderr on every allowed request. There is no safe subset to log — the value is always a live credential or empty.
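HTTP header names are case-insensitive, which is why the filter compares `k.lower()` rather than the raw key. The short sketch below illustrates this with plain dictionaries; `drop_authorization` is a hypothetical helper name, and the header values are placeholders.

```python
def drop_authorization(headers: dict[str, str]) -> dict[str, str]:
    """Remove the Authorization header regardless of how its name is cased."""
    return {k: v for k, v in headers.items() if k.lower() != "authorization"}


# Every casing of the header name is filtered out.
variants = ("authorization", "Authorization", "AUTHORIZATION")
results = [
    drop_authorization({name: "Bearer placeholder", "accept": "*/*"})
    for name in variants
]

# Other headers, such as a hypothetical x-api-key, pass through the filter
# and rely on redact_tokens for scrubbing instead.
kept = drop_authorization({"x-api-key": "placeholder"})
```

An exact-match comparison against "authorization" alone would let `Authorization` or `AUTHORIZATION` through if the header container preserves the sender's casing.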

_log_response

def _log_response(self, flow: http.HTTPFlow) -> None:
    # No header name is suppressed here; every value is redacted.
    headers = {
        k: redact_tokens(v, env=os.environ)
        for k, v in flow.response.headers.items()
    }
    body = redact_tokens(flow.response.get_text(strict=False) or "", env=os.environ)
    sys.stderr.write(
        json.dumps({
            "event": "egress_response",
            # Host is logged unchanged; host/path redaction is listed under Non-goals.
            "host": flow.request.pretty_host,
            "status": flow.response.status_code,
            "headers": headers,
            "body": body,
        })
        + "\n"
    )

Response headers don't carry injected credentials, so no header name is suppressed — only the values are scrubbed by redact_tokens.
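Because both functions write one JSON object per line, a log consumer or test harness can parse the stream line by line and check the result. The sketch below assumes a hypothetical captured stderr string whose records are shaped like the ones built above; the host, path, and header values are placeholders.

```python
import json

# Hypothetical captured stderr: one request record and one response record.
captured = (
    '{"event": "egress_request", "host": "api.example.com", "method": "POST", '
    '"path": "/v1/items", "headers": {"content-type": "application/json"}, "body": "{}"}\n'
    '{"event": "egress_response", "host": "api.example.com", "status": 200, '
    '"headers": {"content-type": "application/json"}, "body": "{}"}\n'
)

# Parse each non-empty line as an independent JSON object.
records = [json.loads(line) for line in captured.splitlines() if line.strip()]


def has_authorization(record: dict) -> bool:
    """True if any logged header name is Authorization, in any casing."""
    return any(k.lower() == "authorization" for k in record["headers"])


request_records = [r for r in records if r["event"] == "egress_request"]
violations = [r for r in request_records if has_authorization(r)]
```

An empty `violations` list is what the first goal requires of every request record.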