Files
bot-bottle/docs/prds/0050-agent-provider-contrib.md
didericis-claude bcdffc8400 refactor(contrib): inline provision steps per-provider, drop shared apply module
Each AgentProvider now owns its skills / prompt / provision /
supervise_mcp end-to-end. The base ABC declares all four as
abstract; ClaudeAgentProvider and CodexAgentProvider each carry
their own copy loop.

Per PR review feedback (review #128): the shared
_provision_apply.py abstraction was weak — Claude and Codex
harnesses already diverge (codex's dummy-auth + login-status
verify has no claude analogue) and forcing both onto one helper
just postpones the split. Duplication is intentional.

Deletes bot_bottle/_provision_apply.py and consolidates testing
under tests/unit/test_contrib_{claude,codex}_provider.py (one
file per provider, covering all four methods).
2026-06-03 21:38:13 -04:00

19 KiB
Raw Permalink Blame History

PRD 0050: Move provider-specific agent logic into contrib

  • Status: Active
  • Author: claude
  • Created: 2026-06-03
  • Issue: #177

Summary

The agent provider module (bot_bottle/agent_provider.py) hard-codes the Claude- and Codex-specific provisioning rules — auth file shapes, trust-dialog markers, egress routes, dummy-auth dance, env vars — in a single if template == "codex": ... if template == "claude": ... chain (lines 154230 today). Other pieces of provider behavior live in each backend's provision/ directory (provision_skills, provision_prompt, provision_provider_auth, provision_supervise), duplicated once per backend, even though almost none of what they do is actually backend-specific.

This PRD reshapes the agent provider into a proper plugin boundary. The two existing providers (Claude, Codex) move out of agent_provider into bot_bottle/contrib/claude/ and bot_bottle/contrib/codex/ — the same contrib/ layout PRD 0048 established for the Gitea deploy-key provisioner. The four provisioner methods backends currently duplicate move into the provider plugin itself; the backend keeps only the bottle-side primitives (cp_in, exec) the plugin calls through. MCP server registration becomes a first-class part of the provider contract so Codex finally gets the supervise sidecar wired in alongside Claude.

The shipping artifact is two new provider plugins under contrib/, a narrower AgentProvider ABC in bot_bottle/agent_provider.py, four fewer provisioner hooks on BottleBackend, and a supervise-MCP entry visible from the Codex agent at launch.

Problem

Three concrete pains, all downstream of the provider abstraction not being where the work happens:

  1. Adding a third provider is a five-file edit. A hypothetical Gemini or Aider provider has to: (a) add a branch in agent_provision_plan, (b) add a runtime entry in _RUNTIMES, (c) thread a prompt_mode enum value, (d) potentially extend provision_provider_auth per backend, (e) wire MCP registration into both backend/docker/provision/supervise.py and backend/smolmachines/provision/supervise.py. Nothing about that spread is load-bearing; it's leftover from when there was one provider.

  2. MCP server registration is Claude-only. Both backend/docker/provision/supervise.py and backend/smolmachines/provision/supervise.py run claude mcp add verbatim. Codex bottles silently get no MCP entry — the sidecar is running, the routes are open, but the agent can't see the tools because nothing wrote them into Codex's TOML config. Today this is a latent gap. The provider plugin is the only layer that knows how a given agent discovers MCP servers, so that's where the registration belongs.

  3. provision_skills / provision_prompt / provision_provider_auth are duplicated between backends. Each backend has its own ~50-line copy. The differences are entirely about which path the backend uses for cp_in and what user it chowns to. Same business logic, two implementations, two test surfaces, two places to update when the rules change.

The agent_provider module is the right home for all of this. It already owns the AgentProvisionPlan (the declarative description of what needs to land in the guest); extending it to own the imperative "actually land it" step is the natural next move. Putting provider-specific code under contrib/ mirrors the convention PRD 0048 established and keeps the core package provider-agnostic.

Goals / Success Criteria

  1. bot_bottle/agent_provider.py contains no Claude- or Codex-specific branches. The Claude and Codex template strings themselves still live in the core module (they're the public manifest values), but everything keyed off them moves out.
  2. bot_bottle/contrib/claude/agent_provider.py and bot_bottle/contrib/codex/agent_provider.py exist and contain the provider-specific behavior previously in lines 154230 of agent_provider.py. Each is reachable from the core registry via a lazy import (the same pattern PRD 0048 used for GiteaDeployKeyProvisioner).
  3. AgentProvider is an ABC (or protocol) with at minimum:
    • provision_plan(...) -> AgentProvisionPlan — what the existing agent_provision_plan produces today, scoped to one provider.
    • provision_skills(bottle, plan) — copy host skills into the guest.
    • provision_prompt(bottle, plan) — copy the prompt file, return the in-guest path (or None).
    • provision_supervise_mcp(bottle, plan, supervise_url) — register the supervise sidecar in the provider's MCP config. No-op when the bottle has no supervise sidecar.
    • The Claude implementation runs claude mcp add. The Codex implementation writes the corresponding entry into ~/.codex/config.toml's [mcp_servers.supervise] table.
  4. BottleBackend loses the four abstract methods being moved (provision_skills, provision_prompt, provision_provider_auth, provision_supervise). BottleBackend.provision_in_bottle calls the provider plugin directly via the bottle and plan it already has. provision_ca, provision_workspace, and provision_git stay on the backend — they're backend infrastructure, not provider behavior.
  5. bot_bottle/backend/docker/provision/{skills,prompt,provider_auth, supervise}.py and bot_bottle/backend/smolmachines/provision/{skills, prompt,provider_auth,supervise}.py are deleted. The backend-specific provisioners that remain (ca, git, workspace) stay.
  6. A Codex bottle launched with --supervise shows the supervise MCP server entry in its Codex config and can call supervise tools from inside the bottle (egress-block, pipelock-block, capability-block).
  7. Existing tests for the moved logic move with the code: provider-specific tests under tests/unit/test_contrib_claude_*.py and tests/unit/test_contrib_codex_*.py, mirroring tests/unit/test_contrib_gitea_deploy_key.py.
  8. PRD 0050's Status flips Draft → Active in the same commit that removes the last if template == "claude" branch from agent_provider.py.

Non-goals

  • A third agent provider. This PRD reshapes the boundary so a third provider is cheap to add. It does not add one.
  • Changing the manifest surface. The agent.provider manifest field still takes "claude" or "codex". The set of valid strings is unchanged.
  • Changing AgentProvisionPlan's shape. The dataclasses (AgentProvisionDir, AgentProvisionFile, AgentProvisionCommand, AgentProvisionPlan itself) stay in the core module and keep their current fields. Provider plugins produce the same plan shape; only the producer moves.
  • Changing the supervise sidecar protocol or the supervise tool surface. PRDs 00130016 stay Active. What changes is how the agent discovers the sidecar's MCP endpoint, not what it does once connected.
  • Per-skill provider differences. A Codex agent and a Claude agent see the same ~/.claude/skills/<name>/ tree today (Codex reads it via its own skills mechanism). This PRD does not change that — provision_skills lands the same content for both.
  • Removing the prompt_args helper from agent_provider.py. It stays at module scope; it's already a pure dispatch on prompt_mode and has no Claude/Codex if chain to extract.
  • provision_provider_auth migration. The issue notes this method is "probably not needed anymore" once each provider owns its own provisioning. After the move, the work that provision_provider_auth did (apply dirs / files / pre_copy / verify from the plan) becomes a shared helper the per-provider provision_skills / provision_prompt calls dispatch through — or, more likely, a single provision(bottle) entry point on the provider. The hook is removed from BottleBackend; whether the underlying loop lives on AgentProvider as a default implementation or as a free function in contrib/_apply.py is decided at implementation time, not in this PRD.

Scope

In scope

  • New AgentProvider ABC in bot_bottle/agent_provider.py with the five methods listed under Goal 3. Existing agent_provision_plan becomes AgentProvider.provision_plan.
  • New bot_bottle/contrib/claude/__init__.py, bot_bottle/contrib/claude/agent_provider.py, bot_bottle/contrib/codex/__init__.py, bot_bottle/contrib/codex/agent_provider.py. Each defines a ClaudeAgentProvider / CodexAgentProvider class.
  • A get_provider(template) -> AgentProvider registry in bot_bottle/agent_provider.py, lazy-imported from contrib/, mirroring get_provisioner(provider, ...) in bot_bottle/deploy_key_provisioner.py.
  • Backend changes:
    • BottleBackend.provision_in_bottle resolves the provider once and calls provider.provision_skills(bottle, plan), provider.provision_prompt(bottle, plan), and provider.provision_supervise_mcp(bottle, plan, url) in place of the current four abstract hooks.
    • BottleBackend.provision_skills, provision_prompt, provision_provider_auth, provision_supervise are removed.
    • Docker and smolmachines backends remove their corresponding provision_* implementations and the backend/<name>/provision/{skills,prompt,provider_auth, supervise}.py modules.
  • Codex MCP wiring: CodexAgentProvider.provision_supervise_mcp writes a [mcp_servers.supervise] block into ~/.codex/config.toml pointing at the same agent-side supervise URL the Claude provider uses. The file already exists from the trust-dialog step; the MCP entry is appended (or the file is rewritten in a single shot, whichever's simpler).
  • Tests migrate. Backend tests that targeted the four moved provisioners are rewritten against the provider plugin, with one test file per provider mirroring tests/unit/test_contrib_gitea_*.py.

Out of scope

  • Adding a manifest field for "extra MCP servers the agent should see". The supervise sidecar is the only MCP server provisioned today, and the issue's "Add mcp server configuring into agent provision" line is about the supervise sidecar specifically. A general-purpose user-declared MCP list is a follow-up if and when the need surfaces.
  • Refactoring AgentProvisionPlan's dataclasses. They stay byte- for-byte the same so the diff is purely "who owns the producer".
  • A BottleBackend.provision_provider_auth shim during transition. The hook is removed in one cut; the only caller is the backend itself, no manifest consumers reference it.
  • Renaming agent_provider.pyagent_providers/. The module still has core dataclasses + the ABC + the registry; it's a single file's worth of code.

Proposed design

Module shape after the cut

bot_bottle/agent_provider.py
  PROVIDER_CLAUDE, PROVIDER_CODEX, PROVIDER_TEMPLATES
  PromptMode  (Literal)
  AgentProvisionDir, AgentProvisionFile, AgentProvisionCommand,
    AgentProvisionPlan  (dataclasses, unchanged)
  AgentProviderRuntime  (dataclass — template/command/image/etc.)
  AgentProvider  (ABC)
    .runtime() -> AgentProviderRuntime
    .provision_plan(state_dir, ..., trusted_project_path, ...) -> AgentProvisionPlan
    .provision_skills(bottle, plan) -> None
    .provision_prompt(bottle, plan) -> str | None
    .provision_supervise_mcp(bottle, plan, supervise_url) -> None
  get_provider(template: str) -> AgentProvider     # lazy-imports contrib
  prompt_args(prompt_mode, prompt_path, *, argv)   # unchanged

bot_bottle/contrib/claude/agent_provider.py
  ClaudeAgentProvider(AgentProvider)
    _RUNTIME = AgentProviderRuntime(template="claude", ...)
    .provision_plan(...)         # owns the lines-204230 chunk
    .provision_skills(...)       # was backend/<name>/provision/skills.py
    .provision_prompt(...)       # was backend/<name>/provision/prompt.py
    .provision_supervise_mcp(...)# was backend/<name>/provision/supervise.py

bot_bottle/contrib/codex/agent_provider.py
  CodexAgentProvider(AgentProvider)
    _RUNTIME = AgentProviderRuntime(template="codex", ...)
    .provision_plan(...)         # owns the lines-154204 chunk
    .provision_skills(...)       # same as Claude impl, factored to shared helper
    .provision_prompt(...)       # same as Claude impl, factored to shared helper
    .provision_supervise_mcp(...)# writes [mcp_servers.supervise] to config.toml

The skills / prompt / provider-auth-apply implementations are 99% identical across providers — cp_in then chown / chmod. They are extracted to small free functions in bot_bottle/contrib/_provision_apply.py (or kept as default implementations on AgentProvider if every concrete subclass would just call them). Picked at implementation time; both options match PRD 0048's contrib convention. The visible contract is that provisioning lives on the provider plugin.

MCP registration for Codex

Codex reads MCP servers from ~/.codex/config.toml (or whatever CODEX_HOME/config.toml resolves to). The provider already writes this file once during provision_plan to set the project trust level. CodexAgentProvider.provision_supervise_mcp extends the existing write: same path, append a [mcp_servers.supervise] table pointing at the agent-side supervise URL.

Two implementation routes worth flagging:

  • Option A: Pre-bake the MCP entry in the same config-write that happens during provision_plan, before bottle launch. Simpler; the supervise URL has to be known at plan time, which means provision_plan needs the supervise URL (or a sentinel that means "fill this in"). The smolmachines backend already plumbs agent_supervise_url through to its provision_supervise step, so the value is available.
  • Option B: Append at bottle-launch time via a bottle.exec that writes to the file inside the guest, matching the claude mcp add flow. Slower but uniform with how ClaudeAgentProvider.provision_supervise_mcp works.

Option B is the symmetric choice and the one this PRD assumes. The implementer can switch to A if Option B turns out to need a TOML-merge primitive the codebase doesn't already have.

Backend after the cut

class BottleBackend:
    def provision_in_bottle(self, plan, bottle, supervise_url):
        provider = get_provider(plan.spec.manifest.agents[
            plan.spec.agent_name].provider)
        self.provision_ca(plan, bottle)
        prompt_path = provider.provision_prompt(bottle, plan)
        provider.provision_skills(bottle, plan)
        self.provision_workspace(plan, bottle)
        self.provision_git(plan, bottle)
        provider.provision_supervise_mcp(bottle, plan, supervise_url)
        return prompt_path

supervise_url is the existing per-backend "where does the agent reach the sidecar from inside the guest" value. The Docker backend passes http://supervise:<port>/; smolmachines passes the http://127.0.0.1:<port>/ it already computed. The backend's only remaining provider-touching duty is "tell the provider what the sidecar URL is".

Registry

# bot_bottle/agent_provider.py
def get_provider(template: str) -> AgentProvider:
    if template == PROVIDER_CLAUDE:
        from bot_bottle.contrib.claude.agent_provider import (
            ClaudeAgentProvider,
        )
        return ClaudeAgentProvider()
    if template == PROVIDER_CODEX:
        from bot_bottle.contrib.codex.agent_provider import (
            CodexAgentProvider,
        )
        return CodexAgentProvider()
    raise ValueError(f"unknown agent provider template: {template!r}")

Lazy imports keep core import-time graph small and match PRD 0048.

Implementation chunks

Each chunk is one commit on the PR; the PR ships as one cut.

  1. Lift AgentProvider ABC + registry. Add the ABC and get_provider next to the existing agent_provision_plan function. Have agent_provision_plan delegate to get_provider(template).provision_plan(...) so callers keep working through the transition.
  2. Move provider-specific provision_plan content into contrib. Create contrib/claude/ and contrib/codex/. The Claude and Codex branches of agent_provision_plan move into the respective provider classes. The shared scaffolding (initial dict setup, final AgentProvisionPlan(...) return) stays in the ABC as a template method or moves into each subclass — whichever needs less indirection.
  3. Move backend provisioners onto the provider. Add provision_skills, provision_prompt, provision_supervise_mcp to AgentProvider (with a shared apply helper for skills / prompt). Update BottleBackend.provision_in_bottle to call them. Delete the four backend hook methods and the eight backend/<name>/provision/{skills,prompt,provider_auth,supervise}.py modules.
  4. Add Codex MCP support. Implement CodexAgentProvider.provision_supervise_mcp against ~/.codex/config.toml. Add a unit test that runs the method against an in-memory FakeBottle and asserts the [mcp_servers.supervise] block is present.
  5. Migrate tests. Per-backend tests for the moved provisioners turn into per-provider tests under tests/unit/test_contrib_claude_*.py and tests/unit/test_contrib_codex_*.py. Keep one integration-style test per backend that confirms provision_in_bottle still reaches every step.
  6. Activate. Flip Status: Draft → Active in this PRD; close #177 on merge.

Open questions (resolved)

  1. codex mcp add exists. Implementation calls codex mcp add --transport http supervise <url> as node — symmetric with claude mcp add (no --scope user; Codex writes ~/.codex/config.toml by default). Failure logs a warning; the bottle still works without the entry.
  2. Each provider owns its apply steps end-to-end. The base ABC declares provision_skills / provision_prompt / provision as abstract; each concrete provider implements its own copy loop. No shared _provision_apply.py. The apply sequences look similar today, but Claude and Codex harnesses diverge over time (codex already grew a dummy-auth dance + a codex login status verify with no Claude analogue) and the "shared because both happen to call cp_in then chown" coupling would just rot. Duplication is intentional.
  3. Env knobs removed. BOT_BOTTLE_CONTAINER_HOME, BOT_BOTTLE_GUEST_HOME, BOT_BOTTLE_CONTAINER_SKILLS_DIR, and BOT_BOTTLE_GUEST_SKILLS_DIR are gone; /home/node is hardcoded everywhere it was read. The values were effectively constants; the knobs added surface area for no real flexibility.

References

  • Issue #177 — the request: move provider logic into contrib, add MCP configuration to agent provision, rename provision_supervise → provision_supervise_mcp, ensure Codex gets MCP provisioned.
  • PRD 0013 — supervise plane foundation (defines the MCP-discoverable block-remediation tools this PRD makes available to Codex).
  • PRD 0048 — SSH deploy key provisioning (the contrib/ convention this PRD follows).
  • Current source: agent_provider.py L154-L230 — the provider-specific block this PRD relocates to contrib.