diff --git a/docs/prds/0036-codex-auth-redaction-policy.md b/docs/prds/0036-codex-auth-redaction-policy.md
new file mode 100644
index 0000000..cf88a5a
--- /dev/null
+++ b/docs/prds/0036-codex-auth-redaction-policy.md
@@ -0,0 +1,107 @@
+# PRD 0036: Codex Auth Redaction Policy
+
+- **Status:** Draft
+- **Author:** didericis-codex
+- **Created:** 2026-06-02
+- **Issue:** #129
+
+## Summary
+
+Make Codex host-auth redaction explicit and fixture-driven so dummy
+`auth.json` generation cannot accidentally preserve future sensitive fields.
+Keep forwarding only the short-lived host access token through egress, while the
+guest receives a non-secret auth file whose schema remains useful to Codex.
+
+## Problem
+
+`bot_bottle/codex_auth.py` reads the host Codex auth file, extracts the access
+token for egress, and writes a dummy guest `auth.json`. The code redacts JWT
+claims and auth JSON fields with a mix of schema-specific handling and generic
+placeholder behavior.
+
+That is safer than copying raw auth, but it is still coverage-sensitive. If
+Codex adds a new field that carries a token, session identifier, refresh secret,
+or account metadata and the field name does not match current heuristics, the
+dummy auth file could preserve more information than intended. Because this is
+credential-adjacent code, the desired behavior should be allowlist-oriented and
+backed by explicit fixtures.
+
+## Goals / Success Criteria
+
+- Define a durable redaction policy for Codex `auth.json`:
+  - host access token is read for egress only.
+  - guest dummy auth contains no bearer, refresh, session, or secret values.
+  - selected non-secret fields may be preserved only when needed by Codex.
+- Prefer explicit per-field preservation over broad heuristic pass-through.
+- Add representative fixture tests for current Codex auth shapes.
+- Add regression tests for unknown nested fields, sensitive-looking field names,
+  lists, dictionaries, and JWT custom claims.
+- Preserve dummy token expiration alignment with the host access token.
+- Keep existing errors for missing, invalid, non-device, or expired auth.
+
+## Non-goals
+
+- No change to the egress credential-forwarding contract.
+- No attempt to refresh Codex tokens inside the bottle.
+- No copying of refresh tokens or raw host auth into the guest.
+- No dependency on a Codex SDK or external schema package.
+- No user-facing CLI changes.
+
+## Scope
+
+In scope:
+
+- `bot_bottle/codex_auth.py` redaction helpers.
+- Unit tests in `tests/unit/test_codex_auth.py`.
+- Small documentation comments that distinguish preserved non-secret fields from
+  redacted credential material.
+
+Out of scope:
+
+- Provider provisioning outside Codex auth file generation.
+- Egress route construction for Codex.
+- Runtime calls to Codex/OpenAI services.
+
+## Design
+
+Treat the dummy guest `auth.json` as a deliberately synthesized compatibility
+file, not as a redacted copy of the host file. The implementation may continue
+to start from the host object for convenience, but preserved fields should be
+controlled by explicit allowlists at known schema locations.
+
+At the top level, preserve only fields required to keep Codex in the same auth
+branch. In token blocks, replace access, ID, and refresh-like token values with
+dummy values. In JWT payloads, preserve only claims that are known to be
+non-secret and required for Codex behavior; unknown scalar claims should become
+placeholders, unknown lists should become empty lists, and unknown objects
+should recurse or become empty objects according to the local policy.
+
+For the OpenAI auth claim, preserve only currently necessary non-secret values
+such as plan type and selected account id. Everything else should be
+placeholder, empty object, empty list, or omitted according to the policy. The
+policy should be easy to audit from constants or named helper functions.
+
+Tests should use fixture auth objects that include both current expected fields
+and intentionally hostile future-looking fields such as `session_context`,
+`bearer`, `refreshSecret`, nested `token_value`, and opaque arrays. The dummy
+output must not contain the original secret strings.
+
+## Testing Strategy
+
+- Existing `tests/unit/test_codex_auth.py` should continue to pass.
+- Add tests that assert original access/refresh/session strings do not appear in
+  `codex_dummy_auth_json`.
+- Add tests for nested JWT and auth-claim redaction behavior.
+- Add tests that the dummy access/id token `exp` still matches the host access
+  token expiry.
+
+Run:
+
+- `python3 -m unittest tests.unit.test_codex_auth`
+- `python3 -m unittest discover -s tests/unit`
+
+## Open Questions
+
+- Which Codex auth fields are strictly required for the guest CLI to stay in
+  the device-auth branch? If a field is not demonstrably required, the default
+  should be to redact or omit it.
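
The allowlist policy in the Design section of this PRD can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `bot_bottle/codex_auth.py`: the names `PLACEHOLDER`, `PRESERVED_PATHS`, and `redact`, and the field names `auth_mode`, `tokens`, `claims`, `plan_type`, and `opaque`, are hypothetical and chosen only for the example. It picks the "recurse" option for unknown objects and does not cover dummy token `exp` alignment.

```python
import json

# Hypothetical placeholder written in place of any non-allowlisted scalar.
PLACEHOLDER = "redacted"

# Hypothetical allowlist: key paths whose scalar values are copied as-is.
# The real set of required fields is an open question in the PRD.
PRESERVED_PATHS = {
    ("auth_mode",),
    ("claims", "plan_type"),
}


def redact(value, path=()):
    """Synthesize a dummy copy of value, keeping only allowlisted scalars."""
    if isinstance(value, dict):
        # Unknown objects recurse so nested allowlisted paths can survive.
        return {key: redact(child, path + (key,)) for key, child in value.items()}
    if isinstance(value, list):
        # Lists are never allowlisted here, so they become empty lists.
        return []
    if path in PRESERVED_PATHS:
        return value
    # Every other scalar becomes a placeholder, whatever its key is called.
    return PLACEHOLDER


# Fixture mixing expected fields with hostile future-looking ones.
host_auth = {
    "auth_mode": "device",
    "tokens": {
        "access_token": "host-access-secret",
        "refresh_token": "host-refresh-secret",
    },
    "claims": {
        "plan_type": "plus",
        "session_context": {"token_value": "host-session-secret"},
    },
    "bearer": "host-bearer-secret",
    "refreshSecret": "host-refresh-secret-2",
    "opaque": ["host-list-secret"],
}

dummy = redact(host_auth)
serialized = json.dumps(dummy)
print(serialized)
```

Because preservation is keyed on the full path rather than on how a field name looks, a newly added field is redacted by default until someone adds it to the allowlist, which is the behavior the fixture tests are meant to pin down.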