PRD 0029: provision egress routes via AgentProvisionPlan #115
Addresses review comments on #110.
PRD: docs/prds/0029-codex-host-credentials.md
Summary
- Remove `template == "codex"` branching from `egress.py` and `pipelock.py`: the same provider-specific pattern the rest of #110 moved out of the backends.
- Add `EgressRoute.tls_passthrough: bool`; `egress_manifest_routes` lifts the manifest `pipelock.tls_passthrough` flag so all route origins share the same field.
- Add `AgentProvisionPlan.egress_routes`; `agent_provision_plan` populates it for Codex + `forward_host_credentials` with `tls_passthrough=True`.
- Replace the `egress_routes_for_bottle` logic with a generic `_merge_provider_route` helper (conflict detection preserved).
- Update `pipelock_effective_tls_passthrough` to read `route.tls_passthrough` from the merged route set, with no provider check.
- Both backends call `agent_provision_plan` first, then pass `plan.egress_routes` to `Egress.prepare` and `PipelockProxy.prepare`.

Changes (1 commit)
b79b490 refactor: provision egress routes via AgentProvisionPlan

@@ -93,0 +117,4 @@
self.assertEqual(CODEX_HOST_CREDENTIAL_TOKEN_REF, r.token_ref)
self.assertTrue(r.tls_passthrough)
def test_codex_without_forward_host_credentials_has_no_egress_routes(self):

When we don't forward host credentials, there should still be egress routes, just not egress routes with an auto-injected token. Passthrough should also be set to true so that the tokens the user sets after logging in don't get stripped out.
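The route merging described in the summary can be pictured with a minimal sketch. Everything here is an assumption for illustration: the `EgressRoute` fields beyond `tls_passthrough` and `token_ref`, the conflict rule (same host, different settings), and the helper names are hypothetical stand-ins, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EgressRoute:
    """Hypothetical stand-in for the project's EgressRoute."""
    host: str
    token_ref: Optional[str] = None
    tls_passthrough: bool = False


def merge_provider_route(
    routes: list[EgressRoute], provider_route: EgressRoute
) -> list[EgressRoute]:
    """Merge one provider-owned route into the manifest routes.

    Assumed conflict rule: a manifest route for the same host must be
    identical to the provider route, otherwise the merge is rejected.
    """
    merged = []
    for route in routes:
        if route.host != provider_route.host:
            merged.append(route)
        elif route != provider_route:
            raise ValueError(f"egress route conflict for host {route.host!r}")
        # An identical duplicate is dropped; the provider route is added below.
    merged.append(provider_route)
    return merged


def effective_tls_passthrough(routes: list[EgressRoute]) -> bool:
    """Read the flag from the merged routes, with no provider check."""
    return any(route.tls_passthrough for route in routes)
```

With this shape, the backends only ever look at the merged route list, so no `template == "codex"` check is needed to decide whether passthrough applies.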
@@ -170,0 +174,4 @@
# route marker that enables a non-secret placeholder env. Codex is
# intentionally absent here: it should use its device/ChatGPT login
# state, and an OPENAI_API_KEY placeholder would force API-key auth.
has_provider_auth = any(

Lines 177-183 should also be moved into the agent provision step: it's the responsibility of the agent provisioner to determine whether there should be something like "egress-placeholder" in the env and which env var it should go into. This should also remove the need for `placeholder_env` in the provider runtime.

Agreed. The `has_provider_auth` detection and placeholder injection are provider logic that should live in `agent_provision_plan`, not in each backend. Moving them there will also let us drop `placeholder_env` from `AgentProviderRuntime` entirely, since the backends will no longer need to know which env var to fill. Will fix.

@@ -88,7 +99,6 @@ _RUNTIMES = {
image="bot-bottle-claude:latest",
dockerfile=str(_REPO_ROOT / "Dockerfile.claude"),
auth_role="claude_code_oauth",

I'm not sure auth_role needs to be declared in the agent runtime anymore either. In the original plan, we were going to have the user manually specify egress routes. But given all of the provider-specific logic we already need in the agent provider, and the fact that we've already started adding egress routes in there, we should probably do the same for Claude as well, NOT look for this specific auth role on user-defined egress routes.
@@ -113,6 +122,13 @@ def runtime_for(template: str) -> AgentProviderRuntime:
return _RUNTIMES[template]
def placeholder_env_for(template: str) -> str:

This doesn't need to be a separate function, and the organization is wrong: we want the code for a specific provider to make the decision about placeholder envs. A function that switches on different provider templates is the wrong abstraction.
Re: `placeholder_env_for` (#issuecomment-941): agree, wrong abstraction. Will remove it. One option to fix `print_util` without template-switching: add a `hidden_env_names: frozenset[str]` field to `AgentProvisionPlan` and have the Claude branch populate it with `{"CLAUDE_CODE_OAUTH_TOKEN"}` when injecting the placeholder. `print_util` then reads from the plan rather than dispatching on template. Does that direction work, or would you rather handle it another way?

Re: `auth_role` (#issuecomment-939): understood on the direction (Claude injects its own routes rather than searching user routes for `claude_code_oauth`). Question: does "doing that for claude as well" mean injecting a provider route for a specific known host (like Codex does for `api.openai.com`), or should Claude just always inject the placeholder + disable nonessential-traffic flags unconditionally? The `claude_code_oauth` docs say "host doesn't matter to the placeholder logic", so I'm not sure whether a route is even the right gate.

RE #115 (comment)
Yes, this is what the user-defined egress routes are currently:

We also need to allow agent provisioners to declare required tokens in the agent provider template for this to work, so we can pipe `BOT_BOTTLE_CLAUDE_OAUTH_TOKEN` where it needs to go. EX:

Got it, so the end state is: `agent_provider.auth_token: BOT_BOTTLE_CLAUDE_OAUTH_TOKEN` in the manifest drives the provisioner to inject a provider-owned `api.anthropic.com` route (Bearer, tls_passthrough, token_ref from that field), the placeholder + nonessential-traffic flags follow from the provisioner having a route to inject, and `auth_role` + `claude_code_oauth` go away.

One flag though: PRD 0029 has "Supporting host credential forwarding for non-Codex providers" as an explicit non-goal. The `auth_token` field + Claude route injection is new scope beyond what the PRD covers. Do you want me to expand this branch to include it, or land what's here and open a follow-up PRD for the Claude provider auth work?

RE #115 (comment)
Yeah, we should include it and rescope the other PR to be about injecting provider auth credentials and other settings more generally. Need to do both/makes sense to do them together to make sure we get the agent provider abstraction right.
RE `hidden_env_names` on the other issue, sounds good/do that.

Will expand the branch to include it. I'll update the PRD to reflect the broader scope, implement `agent_provider.auth_token`, have the Claude provisioner inject the `api.anthropic.com` route from it, drop `auth_role` from the runtime, remove `placeholder_env_for`, and add `hidden_env_names` to `AgentProvisionPlan` for `print_util`. Will also clean up `claude_code_oauth` role handling once the provider route replaces it.

Operators can now declare:

    agent_provider:
      template: claude
      auth_token: BOT_BOTTLE_CLAUDE_OAUTH_TOKEN

and the provisioner injects a provider-owned api.anthropic.com egress route (Bearer, tls_passthrough) rather than requiring a manually declared route with the former claude_code_oauth role.

Changes:
- Add auth_token field to AgentProvider; validate claude-only.
- Remove claude_code_oauth from EGRESS_ROLES / PROVIDER_EGRESS_ROLES. Manifests that declare the role now fail at parse time with "unknown role"; the provisioner owns the route.
- agent_provision_plan: replace manifest_egress_routes/has_provider_auth with auth_token; Claude branch injects the api.anthropic.com route, placeholder env, and nonessential-traffic flags when auth_token is set.
- Add hidden_env_names: frozenset[str] to AgentProvisionPlan; Claude branch populates it with CLAUDE_CODE_OAUTH_TOKEN.
- Remove auth_role from AgentProviderRuntime and placeholder_env_for().
- print_util.visible_agent_env_names: accept hidden_env_names from the plan instead of dispatching on agent_provider_template.
- Both backends: drop manifest_egress_routes call, pass auth_token.
- PRD 0029 rescoped to cover both Codex and Claude provider auth.

Assisted-by: Claude Code

@@ -174,2 +195,3 @@
 )))
-if template == PROVIDER_CLAUDE and has_provider_auth:
+if template == PROVIDER_CLAUDE and auth_token:
 egress_routes.append(EgressRoute(

Similarly to Codex, we should always include these in egress routes (whether or not auth_token is present), and only have the egress add the auth token when it's present.
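One way to read the request above is sketched here. It is only an illustration under stated assumptions: `claude_provision_plan`, the `auth_scheme` and `env` fields, and the constant names are hypothetical, and the nonessential-traffic flags are left out because the thread does not name them. The route is always emitted; the auth scheme, `token_ref`, placeholder env, and `hidden_env_names` are filled in only when `auth_token` is set.

```python
from dataclasses import dataclass, field
from typing import Optional

ANTHROPIC_HOST = "api.anthropic.com"
PLACEHOLDER_ENV = "CLAUDE_CODE_OAUTH_TOKEN"
PLACEHOLDER_VALUE = "egress-placeholder"  # non-secret marker value


@dataclass(frozen=True)
class EgressRoute:
    """Hypothetical stand-in for the project's EgressRoute."""
    host: str
    tls_passthrough: bool = False
    auth_scheme: Optional[str] = None
    token_ref: Optional[str] = None


@dataclass(frozen=True)
class AgentProvisionPlan:
    """Hypothetical subset of the plan fields discussed in this thread."""
    egress_routes: tuple[EgressRoute, ...] = ()
    env: dict[str, str] = field(default_factory=dict)
    hidden_env_names: frozenset[str] = frozenset()


def claude_provision_plan(auth_token: Optional[str]) -> AgentProvisionPlan:
    """Always emit the Anthropic route; attach auth only when configured."""
    if not auth_token:
        route = EgressRoute(ANTHROPIC_HOST, tls_passthrough=True)
        return AgentProvisionPlan(egress_routes=(route,))
    route = EgressRoute(
        ANTHROPIC_HOST,
        tls_passthrough=True,
        auth_scheme="Bearer",
        token_ref=auth_token,
    )
    return AgentProvisionPlan(
        egress_routes=(route,),
        env={PLACEHOLDER_ENV: PLACEHOLDER_VALUE},
        hidden_env_names=frozenset({PLACEHOLDER_ENV}),
    )


def visible_agent_env_names(
    env: dict[str, str], hidden_env_names: frozenset[str]
) -> list[str]:
    """Filter on the plan's hidden names instead of switching on template."""
    return sorted(name for name in env if name not in hidden_env_names)
```

Because the hidden names travel with the plan, a printing helper like `visible_agent_env_names` never needs to know which provider produced them.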
Done. `api.anthropic.com` is now always emitted as a `tls_passthrough` route for Claude (same pattern as Codex + the OpenAI hosts). The auth scheme and token_ref are only populated when `auth_token` is set; the placeholder env and hidden_env_names likewise.

@@ -194,3 +182,1 @@
-# have a stable role name, but it no longer triggers an
-# OPENAI_API_KEY placeholder. Codex bottles should prefer
-# device/ChatGPT login state today.
+# codex_auth: placeholder marker for Codex egress-held auth flows.

Should remove this; it's no longer necessary.
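The parse-time checks listed in the commit message earlier in this thread (claude-only `auth_token`, and "unknown role" for the removed `claude_code_oauth` role) might look roughly like the sketch below. The function names, the dict-based manifest shape, the error wording beyond "unknown role", and the contents of `EGRESS_ROLES` are all assumptions for illustration, not the project's parser.

```python
from typing import Optional

# Hypothetical role set; the point is only that "claude_code_oauth" is absent.
EGRESS_ROLES = frozenset({"example_role"})


def parse_agent_provider(raw: dict) -> dict:
    """Validate the agent_provider section of a manifest (assumed shape)."""
    template = raw.get("template")
    auth_token: Optional[str] = raw.get("auth_token")
    if auth_token is not None and template != "claude":
        raise ValueError(
            "agent_provider.auth_token is only supported for template 'claude'"
        )
    return {"template": template, "auth_token": auth_token}


def validate_route_role(role: str) -> str:
    """Reject roles that are no longer declared, e.g. claude_code_oauth."""
    if role not in EGRESS_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return role
```

Rejecting the old role at parse time means a stale manifest fails early instead of silently producing a bottle without provider auth.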