docs(prd): update PRD 0030 design to reflect provisioned_env approach

Revises the Design section to describe the implemented solution: provisioned_env on AgentProvisionPlan rather than an intermediate egress_resolve_token_values_with_provider function. Drops the old sentinel/lazy-import design narrative.
2026-06-02 04:54:09 +00:00
parent 0e29bcc829
commit e528d5c5af
1 changed files with 57 additions and 138 deletions
@@ -7,14 +7,15 @@

 ## Summary

-Extract the duplicated egress token resolution block — which calls
-`egress_resolve_token_values` and then slots in the Codex host
-credential — into a single shared function in `bot_bottle/egress.py`,
-and have both backends call it.
+Eliminate the duplicated egress token resolution block — which resolved
+manifest-declared tokens and the Codex host credential — by moving
+provider-specific token reading into `AgentProvisionPlan.provisioned_env`
+at prepare time, and having both backends merge that map into `os.environ`
+before calling the now-generic `egress_resolve_token_values`.

 ## Problem

-The same logic block appears verbatim in two places:
+The same logic block appeared in two places:

 - `bot_bottle/backend/docker/launch.py` (~line 183): inline inside
  `launch()`, building `token_values` before `compose_up`.
@@ -29,21 +30,25 @@ Both blocks:
   call `codex_host_access_token` and slot the result into any
   `token_env` whose `token_ref` is `CODEX_HOST_CREDENTIAL_TOKEN_REF`.

-The duplication means any change to step 3 — new provider credential
-type, token refresh logic, a new sentinel ref — must be applied twice.
+The duplication means any change to step 3 must be applied twice.
 PRD 0029, which introduced `forward_host_credentials`, already had to
-wire both backends; the next change will too. This is a near-certain
+wire both backends; the next change would too. This is a near-certain
 future sync bug.

+`egress_resolve_token_values` also carried a sentinel `continue` skip
+for `CODEX_HOST_CREDENTIAL_TOKEN_REF`, which tied an otherwise generic
+egress helper to a Codex-specific contract.
+
 ## Goals / Success Criteria

 - The `forward_host_credentials` resolution logic exists in exactly one
  place in the codebase.
- Both `docker/launch.py` and `smolmachines/launch.py` call that
-  shared function.
+- Both `docker/launch.py` and `smolmachines/launch.py` call
+  `egress_resolve_token_values` with no provider-specific arguments.
+- `egress_resolve_token_values` is fully generic — it neither knows nor
+  cares about provider identity or the `CODEX_HOST_CREDENTIAL_TOKEN_REF`
+  sentinel.
 - No behaviour change for either backend.
- The shared function is unit-testable without a plan or backend
-  dependency.

 ## Non-goals

@@ -53,145 +58,59 @@ future sync bug.

 ## Design

-### New function: `egress_resolve_token_values_with_provider`
+### `AgentProvisionPlan.provisioned_env`

-Add to `bot_bottle/egress.py`:
+Add `provisioned_env: dict[str, str]` (default empty) to
+`AgentProvisionPlan`. This map holds host-side secrets that the
+provisioning stage resolved and that egress needs injected into the
+sidecar environ.

-```python
-def egress_resolve_token_values_with_provider(
-    token_env_map: dict[str, str],
-    forward_host_credentials: bool,
-    host_env: dict[str, str],
-) -> dict[str, str]:
-    """Resolve all egress token env-var values from the host environ.
-
-    Combines the manifest-declared token refs (via
-    `egress_resolve_token_values`) with the optional Codex host
-    credential slot (when `forward_host_credentials` is True). Returns
-    an empty dict when `token_env_map` is empty.
-
-    Pure function: `host_env` is passed in so callers can pass a sealed
-    mapping in tests without touching `os.environ`. The Codex access
-    token is read lazily only when the flag is set and a matching slot
-    exists, so callers that don't use `forward_host_credentials` pay no
-    cost."""
-    if not token_env_map:
-        return {}
-    token_values = egress_resolve_token_values(token_env_map, host_env)
-    if forward_host_credentials:
-        from .codex_auth import codex_host_access_token
-        access_token = codex_host_access_token(host_env)
-        for token_env, token_ref in token_env_map.items():
-            if token_ref == CODEX_HOST_CREDENTIAL_TOKEN_REF:
-                token_values[token_env] = access_token
-    return token_values
-```
-
-The import of `codex_auth` is deferred inside the branch to avoid a
-circular import — `egress.py` is imported by `pipelock.py` which
-`codex_auth.py` does not import, but `codex_auth.py` imports from
-`egress.py` transitively via `agent_provider`. The lazy import sidesteps
-the cycle without restructuring the module graph.
+When `forward_host_credentials=True` for Codex, `agent_provision_plan`
+calls `codex_host_access_token(host_env)` and stores the result under
+`CODEX_HOST_CREDENTIAL_TOKEN_REF`. This is already the prepare-time
+stage where `write_codex_dummy_auth_file` runs, so the access-token
+read is colocated with all other Codex-specific provisioning.

 ### Backend call sites

-**`docker/launch.py`** — replace the inline block:
+Both backends merge `provisioned_env` over `os.environ` before calling
+`egress_resolve_token_values`:

 ```python
-# before
-token_values: dict[str, str] = {}
-if plan.egress_plan.routes:
-    token_values = egress_resolve_token_values(
-        plan.egress_plan.token_env_map, dict(os.environ),
-    )
-    if plan.spec.manifest.bottle_for(
-        plan.spec.agent_name,
-    ).agent_provider.forward_host_credentials:
-        access_token = codex_host_access_token(dict(os.environ))
-        for token_env, token_ref in plan.egress_plan.token_env_map.items():
-            if token_ref == CODEX_HOST_CREDENTIAL_TOKEN_REF:
-                token_values[token_env] = access_token
-
-# after
-bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name)
-token_values = egress_resolve_token_values_with_provider(
-    plan.egress_plan.token_env_map,
-    bottle.agent_provider.forward_host_credentials,
-    dict(os.environ),
-)
+effective_env = {**dict(os.environ), **plan.agent_provision.provisioned_env}
+token_values = egress_resolve_token_values(plan.egress_plan.token_env_map, effective_env)
 ```

-**`smolmachines/launch.py`** — replace `_resolve_token_env`:
+`provisioned_env` values take precedence over same-named host env vars,
+so the Codex token slot resolves from the map written at prepare time
+rather than from a raw `os.environ` lookup.

-```python
-# before
-def _resolve_token_env(
-    plan: SmolmachinesBottlePlan, host_env: object
-) -> dict[str, str]:
-    ep = plan.egress_plan
-    if not ep.routes:
-        return {}
-    env = dict(host_env)
-    token_values = egress_resolve_token_values(ep.token_env_map, env)
-    if plan.spec.manifest.bottle_for(
-        plan.spec.agent_name,
-    ).agent_provider.forward_host_credentials:
-        access_token = codex_host_access_token(env)
-        for token_env, token_ref in ep.token_env_map.items():
-            if token_ref == CODEX_HOST_CREDENTIAL_TOKEN_REF:
-                token_values[token_env] = access_token
-    return token_values
+### `egress_resolve_token_values`

-# after
-def _resolve_token_env(
-    plan: SmolmachinesBottlePlan, host_env: dict[str, str],
-) -> dict[str, str]:
-    bottle = plan.spec.manifest.bottle_for(plan.spec.agent_name)
-    return egress_resolve_token_values_with_provider(
-        plan.egress_plan.token_env_map,
-        bottle.agent_provider.forward_host_credentials,
-        host_env,
-    )
-```
+Remove the `CODEX_HOST_CREDENTIAL_TOKEN_REF` sentinel `continue` skip.
+The function now resolves every slot in `token_env_map` from `host_env`
+without special-casing any key name. The `CODEX_HOST_CREDENTIAL_TOKEN_REF`
+key is present in `effective_env` (injected by `provisioned_env`) exactly
+when it is needed.

-Note: the `host_env: object` annotation on the smolmachines version is
-also corrected to `dict[str, str]` — the annotation was wrong; the
-function body immediately called `dict(host_env)` treating it as a
-mapping.
+## Implementation

-### Imports
-
- `docker/launch.py`: drop `codex_host_access_token` and
-  `CODEX_HOST_CREDENTIAL_TOKEN_REF` imports; add
-  `egress_resolve_token_values_with_provider`.
- `smolmachines/launch.py`: same drops; add
-  `egress_resolve_token_values_with_provider`.
- `egress.py`: add `egress_resolve_token_values_with_provider` to
-  `__all__`.
-
-## Implementation chunks
-
-1. **PRD (this commit).** Sets the design.
-2. **Shared function.** Add `egress_resolve_token_values_with_provider`
-   to `bot_bottle/egress.py` with unit tests covering: no routes →
-   empty dict, routes with no `forward_host_credentials` → manifest
-   tokens only, routes with `forward_host_credentials` + matching ref →
-   Codex token slotted in, routes with `forward_host_credentials` + no
-   matching ref → manifest tokens only (Codex call still made but no
-   slot to fill).
-3. **Docker backend.** Replace the inline block in `docker/launch.py`.
-4. **Smolmachines backend.** Replace `_resolve_token_env` in
-   `smolmachines/launch.py`; fix the `host_env: object` annotation.
-5. **Cleanup.** Remove now-unused imports from both backends.
-
-## Testing strategy
-
- **Unit (must):** the four cases above in a new test in
-  `tests/unit/test_egress.py`, using a sealed `host_env` dict so tests
-  don't touch `os.environ` or the filesystem.
- **No integration changes needed:** both backends' existing integration
-  tests exercise the full launch path; a behaviour regression would
-  surface as a token-injection failure in those tests.
+- `bot_bottle/agent_provider.py`: add `provisioned_env` field; populate
+  it for Codex when `forward_host_credentials=True`.
+- `bot_bottle/egress.py`: remove sentinel skip; remove
+  `egress_resolve_token_values_with_provider` (the intermediate function
+  that was introduced and then superseded by this design); drop the
+  `codex_auth` import.
+- `bot_bottle/backend/docker/launch.py`: replace the provider-aware
+  resolution block with the `effective_env` merge.
+- `bot_bottle/backend/smolmachines/launch.py`: same replacement in
+  `_resolve_token_env`.
+- `tests/unit/test_agent_provider.py`: add tests verifying
+  `provisioned_env` is populated (or empty) for Codex with and without
+  `forward_host_credentials`.
+- `tests/unit/test_egress.py`: remove `TestResolveTokenValuesWithProvider`;
+  replace the sentinel-skip test with a test verifying the Codex token
+  ref resolves normally when present in `host_env`.

 ## References