From 2c6f248cda6d4e8f222932bd9bc192b5168ef620 Mon Sep 17 00:00:00 2001 From: claude Date: Wed, 3 Jun 2026 21:05:31 +0000 Subject: [PATCH] =?UTF-8?q?docs(prd):=20draft=20PRD=200050=20=E2=80=94=20m?= =?UTF-8?q?ove=20provider=20logic=20into=20contrib?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docs/prds/0050-agent-provider-contrib.md | 400 +++++++++++++++++++++++ 1 file changed, 400 insertions(+) create mode 100644 docs/prds/0050-agent-provider-contrib.md diff --git a/docs/prds/0050-agent-provider-contrib.md b/docs/prds/0050-agent-provider-contrib.md new file mode 100644 index 0000000..2c0d5c5 --- /dev/null +++ b/docs/prds/0050-agent-provider-contrib.md @@ -0,0 +1,400 @@ +# PRD 0050: Move provider-specific agent logic into contrib + +- **Status:** Draft +- **Author:** claude +- **Created:** 2026-06-03 +- **Issue:** #177 + +## Summary + +The agent provider module (`bot_bottle/agent_provider.py`) hard-codes +the Claude- and Codex-specific provisioning rules — auth file shapes, +trust-dialog markers, egress routes, dummy-auth dance, env vars — in a +single `if template == "codex": ... if template == "claude": ...` +chain (lines 154–230 today). Other pieces of provider behavior live in +each backend's `provision/` directory (`provision_skills`, +`provision_prompt`, `provision_provider_auth`, `provision_supervise`), +duplicated once per backend, even though almost none of what they do +is actually backend-specific. + +This PRD reshapes the agent provider into a proper plugin boundary. +The two existing providers (Claude, Codex) move out of `agent_provider` +into `bot_bottle/contrib/claude/` and `bot_bottle/contrib/codex/` — +the same `contrib/` layout PRD 0048 established for the Gitea +deploy-key provisioner. The four provisioner methods backends +currently duplicate move into the provider plugin itself; the backend +keeps only the bottle-side primitives (`cp_in`, `exec`) the plugin +calls through. MCP server registration becomes a first-class part of +the provider contract so Codex finally gets the supervise sidecar +wired in alongside Claude. + +The shipping artifact is two new provider plugins under `contrib/`, a +narrower `AgentProvider` ABC in `bot_bottle/agent_provider.py`, four +fewer provisioner hooks on `BottleBackend`, and a supervise-MCP entry +visible from the Codex agent at launch. + +## Problem + +Three concrete pains, all downstream of the provider abstraction not +being where the work happens: + +1. **Adding a third provider is a five-file edit.** A hypothetical + Gemini or Aider provider has to: (a) add a branch in + `agent_provision_plan`, (b) add a runtime entry in `_RUNTIMES`, + (c) thread a `prompt_mode` enum value, (d) potentially extend + `provision_provider_auth` per backend, (e) wire MCP registration + into both `backend/docker/provision/supervise.py` and + `backend/smolmachines/provision/supervise.py`. Nothing about that + spread is load-bearing; it's leftover from when there was one + provider. + +2. **MCP server registration is Claude-only.** Both + `backend/docker/provision/supervise.py` and + `backend/smolmachines/provision/supervise.py` run `claude mcp add` + verbatim. Codex bottles silently get no MCP entry — the sidecar + is running, the routes are open, but the agent can't see the + tools because nothing wrote them into Codex's TOML config. Today + this is a latent gap. The provider plugin is the only layer that + knows how a given agent discovers MCP servers, so that's where + the registration belongs. + +3. **`provision_skills` / `provision_prompt` / `provision_provider_auth` + are duplicated between backends.** Each backend has its own + ~50-line copy. The differences are entirely about which path the + backend uses for `cp_in` and what user it `chown`s to. Same + business logic, two implementations, two test surfaces, two + places to update when the rules change. + +The agent_provider module is the right home for all of this. It already +owns the `AgentProvisionPlan` (the declarative description of what +needs to land in the guest); extending it to own the imperative +"actually land it" step is the natural next move. Putting +provider-specific code under `contrib/` mirrors the convention PRD 0048 +established and keeps the core package provider-agnostic. + +## Goals / Success Criteria + +1. `bot_bottle/agent_provider.py` contains no Claude- or + Codex-specific branches. The Claude and Codex template strings + themselves still live in the core module (they're the public + manifest values), but everything keyed off them moves out. +2. `bot_bottle/contrib/claude/agent_provider.py` and + `bot_bottle/contrib/codex/agent_provider.py` exist and contain + the provider-specific behavior previously in lines 154–230 of + `agent_provider.py`. Each is reachable from the core registry via + a lazy import (the same pattern PRD 0048 used for + `GiteaDeployKeyProvisioner`). +3. `AgentProvider` is an ABC (or protocol) with at minimum: + - `provision_plan(...) -> AgentProvisionPlan` — what the existing + `agent_provision_plan` produces today, scoped to one provider. + - `provision_skills(bottle, plan)` — copy host skills into the guest. + - `provision_prompt(bottle, plan)` — copy the prompt file, return + the in-guest path (or None). + - `provision_supervise_mcp(bottle, plan, supervise_url)` — register + the supervise sidecar in the provider's MCP config. No-op when + the bottle has no supervise sidecar. + - The Claude implementation runs `claude mcp add`. The Codex + implementation writes the corresponding entry into + `~/.codex/config.toml`'s `[mcp_servers.supervise]` table. +4. `BottleBackend` loses the four abstract methods being moved + (`provision_skills`, `provision_prompt`, `provision_provider_auth`, + `provision_supervise`). `BottleBackend.provision_in_bottle` calls + the provider plugin directly via the bottle and plan it already + has. `provision_ca`, `provision_workspace`, and `provision_git` + stay on the backend — they're backend infrastructure, not + provider behavior. +5. `bot_bottle/backend/docker/provision/{skills,prompt,provider_auth, + supervise}.py` and `bot_bottle/backend/smolmachines/provision/{skills, + prompt,provider_auth,supervise}.py` are deleted. The + backend-specific provisioners that remain (`ca`, `git`, + `workspace`) stay. +6. A Codex bottle launched with `--supervise` shows the + supervise MCP server entry in its Codex config and can call + supervise tools from inside the bottle (egress-block, + pipelock-block, capability-block). +7. Existing tests for the moved logic move with the code: + provider-specific tests under `tests/unit/test_contrib_claude_*.py` + and `tests/unit/test_contrib_codex_*.py`, mirroring + `tests/unit/test_contrib_gitea_deploy_key.py`. +8. PRD 0050's Status flips Draft → Active in the same commit that + removes the last `if template == "claude"` branch from + `agent_provider.py`. + +## Non-goals + +- **A third agent provider.** This PRD reshapes the boundary so a + third provider is cheap to add. It does not add one. +- **Changing the manifest surface.** The `agent.provider` + manifest field still takes `"claude"` or `"codex"`. The set of + valid strings is unchanged. +- **Changing `AgentProvisionPlan`'s shape.** The dataclasses + (`AgentProvisionDir`, `AgentProvisionFile`, `AgentProvisionCommand`, + `AgentProvisionPlan` itself) stay in the core module and keep their + current fields. Provider plugins produce the same plan shape; only + the producer moves. +- **Changing the supervise sidecar protocol or the supervise tool + surface.** PRDs 0013–0016 stay Active. What changes is how the + agent discovers the sidecar's MCP endpoint, not what it does once + connected. +- **Per-skill provider differences.** A Codex agent and a Claude + agent see the same `~/.claude/skills//` tree today (Codex + reads it via its own skills mechanism). This PRD does not change + that — `provision_skills` lands the same content for both. +- **Removing the `prompt_args` helper from `agent_provider.py`.** It + stays at module scope; it's already a pure dispatch on `prompt_mode` + and has no Claude/Codex `if` chain to extract. +- **`provision_provider_auth` migration.** The issue notes this method + is "probably not needed anymore" once each provider owns its own + provisioning. After the move, the work that + `provision_provider_auth` did (apply `dirs` / `files` / `pre_copy` / + `verify` from the plan) becomes a shared helper the per-provider + `provision_skills` / `provision_prompt` calls dispatch through — + or, more likely, a single `provision(bottle)` entry point on the + provider. The hook is removed from `BottleBackend`; whether the + underlying loop lives on `AgentProvider` as a default + implementation or as a free function in `contrib/_apply.py` is + decided at implementation time, not in this PRD. + +## Scope + +### In scope + +- New `AgentProvider` ABC in `bot_bottle/agent_provider.py` with the + five methods listed under Goal 3. Existing `agent_provision_plan` + becomes `AgentProvider.provision_plan`. +- New `bot_bottle/contrib/claude/__init__.py`, + `bot_bottle/contrib/claude/agent_provider.py`, + `bot_bottle/contrib/codex/__init__.py`, + `bot_bottle/contrib/codex/agent_provider.py`. Each defines a + `ClaudeAgentProvider` / `CodexAgentProvider` class. +- A `get_provider(template) -> AgentProvider` registry in + `bot_bottle/agent_provider.py`, lazy-imported from `contrib/`, + mirroring `get_provisioner(provider, ...)` in + `bot_bottle/deploy_key_provisioner.py`. +- Backend changes: + - `BottleBackend.provision_in_bottle` resolves the provider once + and calls `provider.provision_skills(bottle, plan)`, + `provider.provision_prompt(bottle, plan)`, and + `provider.provision_supervise_mcp(bottle, plan, url)` in place + of the current four abstract hooks. + - `BottleBackend.provision_skills`, `provision_prompt`, + `provision_provider_auth`, `provision_supervise` are removed. + - Docker and smolmachines backends remove their corresponding + `provision_*` implementations and the + `backend//provision/{skills,prompt,provider_auth, + supervise}.py` modules. +- Codex MCP wiring: `CodexAgentProvider.provision_supervise_mcp` + writes a `[mcp_servers.supervise]` block into + `~/.codex/config.toml` pointing at the same agent-side supervise + URL the Claude provider uses. The file already exists from the + trust-dialog step; the MCP entry is appended (or the file is + rewritten in a single shot, whichever's simpler). +- Tests migrate. Backend tests that targeted the four moved + provisioners are rewritten against the provider plugin, with one + test file per provider mirroring `tests/unit/test_contrib_gitea_*.py`. + +### Out of scope + +- Adding a manifest field for "extra MCP servers the agent should + see". The supervise sidecar is the only MCP server provisioned + today, and the issue's "Add mcp server configuring into agent + provision" line is about the supervise sidecar specifically. A + general-purpose user-declared MCP list is a follow-up if and when + the need surfaces. +- Refactoring `AgentProvisionPlan`'s dataclasses. They stay byte- + for-byte the same so the diff is purely "who owns the producer". +- A `BottleBackend.provision_provider_auth` shim during transition. + The hook is removed in one cut; the only caller is the backend + itself, no manifest consumers reference it. +- Renaming `agent_provider.py` → `agent_providers/`. The module + still has core dataclasses + the ABC + the registry; it's a single + file's worth of code. + +## Proposed design + +### Module shape after the cut + +``` +bot_bottle/agent_provider.py + PROVIDER_CLAUDE, PROVIDER_CODEX, PROVIDER_TEMPLATES + PromptMode (Literal) + AgentProvisionDir, AgentProvisionFile, AgentProvisionCommand, + AgentProvisionPlan (dataclasses, unchanged) + AgentProviderRuntime (dataclass — template/command/image/etc.) + AgentProvider (ABC) + .runtime() -> AgentProviderRuntime + .provision_plan(state_dir, ..., trusted_project_path, ...) -> AgentProvisionPlan + .provision_skills(bottle, plan) -> None + .provision_prompt(bottle, plan) -> str | None + .provision_supervise_mcp(bottle, plan, supervise_url) -> None + get_provider(template: str) -> AgentProvider # lazy-imports contrib + prompt_args(prompt_mode, prompt_path, *, argv) # unchanged + +bot_bottle/contrib/claude/agent_provider.py + ClaudeAgentProvider(AgentProvider) + _RUNTIME = AgentProviderRuntime(template="claude", ...) + .provision_plan(...) # owns the lines-204–230 chunk + .provision_skills(...) # was backend//provision/skills.py + .provision_prompt(...) # was backend//provision/prompt.py + .provision_supervise_mcp(...)# was backend//provision/supervise.py + +bot_bottle/contrib/codex/agent_provider.py + CodexAgentProvider(AgentProvider) + _RUNTIME = AgentProviderRuntime(template="codex", ...) + .provision_plan(...) # owns the lines-154–204 chunk + .provision_skills(...) # same as Claude impl, factored to shared helper + .provision_prompt(...) # same as Claude impl, factored to shared helper + .provision_supervise_mcp(...)# writes [mcp_servers.supervise] to config.toml +``` + +The skills / prompt / provider-auth-apply implementations are 99% +identical across providers — `cp_in` then `chown` / `chmod`. They are +extracted to small free functions in +`bot_bottle/contrib/_provision_apply.py` (or kept as default +implementations on `AgentProvider` if every concrete subclass would +just call them). Picked at implementation time; both options match +PRD 0048's contrib convention. The visible contract is that +provisioning lives on the provider plugin. + +### MCP registration for Codex + +Codex reads MCP servers from `~/.codex/config.toml` (or whatever +`CODEX_HOME/config.toml` resolves to). The provider already writes +this file once during `provision_plan` to set the project trust +level. `CodexAgentProvider.provision_supervise_mcp` extends the +existing write: same path, append a `[mcp_servers.supervise]` table +pointing at the agent-side supervise URL. + +Two implementation routes worth flagging: + +- **Option A:** Pre-bake the MCP entry in the same config-write that + happens during `provision_plan`, before bottle launch. Simpler; + the supervise URL has to be known at plan time, which means + `provision_plan` needs the supervise URL (or a sentinel that means + "fill this in"). The smolmachines backend already plumbs + `agent_supervise_url` through to its provision_supervise step, so + the value is available. +- **Option B:** Append at bottle-launch time via a `bottle.exec` + that writes to the file inside the guest, matching the + `claude mcp add` flow. Slower but uniform with how + `ClaudeAgentProvider.provision_supervise_mcp` works. + +Option B is the symmetric choice and the one this PRD assumes. +The implementer can switch to A if Option B turns out to need a +TOML-merge primitive the codebase doesn't already have. + +### Backend after the cut + +```python +class BottleBackend: + def provision_in_bottle(self, plan, bottle, supervise_url): + provider = get_provider(plan.spec.manifest.agents[ + plan.spec.agent_name].provider) + self.provision_ca(plan, bottle) + prompt_path = provider.provision_prompt(bottle, plan) + provider.provision_skills(bottle, plan) + self.provision_workspace(plan, bottle) + self.provision_git(plan, bottle) + provider.provision_supervise_mcp(bottle, plan, supervise_url) + return prompt_path +``` + +`supervise_url` is the existing per-backend "where does the agent +reach the sidecar from inside the guest" value. The Docker backend +passes `http://supervise:/`; smolmachines passes the +`http://127.0.0.1:/` it already computed. The backend's only +remaining provider-touching duty is "tell the provider what the +sidecar URL is". + +### Registry + +```python +# bot_bottle/agent_provider.py +def get_provider(template: str) -> AgentProvider: + if template == PROVIDER_CLAUDE: + from bot_bottle.contrib.claude.agent_provider import ( + ClaudeAgentProvider, + ) + return ClaudeAgentProvider() + if template == PROVIDER_CODEX: + from bot_bottle.contrib.codex.agent_provider import ( + CodexAgentProvider, + ) + return CodexAgentProvider() + raise ValueError(f"unknown agent provider template: {template!r}") +``` + +Lazy imports keep core import-time graph small and match PRD 0048. + +## Implementation chunks + +Each chunk is one commit on the PR; the PR ships as one cut. + +1. **Lift `AgentProvider` ABC + registry.** Add the ABC and + `get_provider` next to the existing `agent_provision_plan` + function. Have `agent_provision_plan` delegate to + `get_provider(template).provision_plan(...)` so callers keep + working through the transition. +2. **Move provider-specific `provision_plan` content into + contrib.** Create `contrib/claude/` and `contrib/codex/`. The + Claude and Codex branches of `agent_provision_plan` move into + the respective provider classes. The shared scaffolding + (initial dict setup, final `AgentProvisionPlan(...)` return) + stays in the ABC as a template method or moves into each + subclass — whichever needs less indirection. +3. **Move backend provisioners onto the provider.** Add + `provision_skills`, `provision_prompt`, `provision_supervise_mcp` + to `AgentProvider` (with a shared apply helper for skills / + prompt). Update `BottleBackend.provision_in_bottle` to call them. + Delete the four backend hook methods and the eight + `backend//provision/{skills,prompt,provider_auth,supervise}.py` + modules. +4. **Add Codex MCP support.** Implement + `CodexAgentProvider.provision_supervise_mcp` against + `~/.codex/config.toml`. Add a unit test that runs the method + against an in-memory FakeBottle and asserts the + `[mcp_servers.supervise]` block is present. +5. **Migrate tests.** Per-backend tests for the moved + provisioners turn into per-provider tests under + `tests/unit/test_contrib_claude_*.py` and + `tests/unit/test_contrib_codex_*.py`. Keep one integration-style + test per backend that confirms `provision_in_bottle` still + reaches every step. +6. **Activate.** Flip Status: Draft → Active in this PRD; close + #177 on merge. + +## Open questions + +1. **Does Codex's `claude mcp add` equivalent exist as a CLI verb, + or is direct TOML editing the only option?** If a `codex mcp + add` (or similar) is in the binary, Option A in the MCP section + becomes more attractive. The implementation chunk for Codex MCP + should check `codex --help` against the version pinned in + `Dockerfile.codex` before committing to TOML editing. +2. **Shared apply helper vs. default methods on `AgentProvider`?** + Both work. Default methods read cleaner at call sites; a free + function in `contrib/_provision_apply.py` is easier to test in + isolation. Defer to the implementer; not load-bearing for the + PRD. +3. **Should `BOT_BOTTLE_CONTAINER_HOME` / `BOT_BOTTLE_GUEST_HOME` + /`BOT_BOTTLE_CONTAINER_SKILLS_DIR` env knobs survive the move?** + They're only read by the backend-side provision modules being + deleted. The provider plugin can take them too, but the values + are effectively constants today. Leaving the env-read fall-back + in place for compat is one line per knob; this PRD says yes, + carry them over. + +## References + +- Issue + [#177](https://gitea.dideric.is/didericis/bot-bottle/issues/177) + — the request: move provider logic into contrib, add MCP + configuration to agent provision, rename provision_supervise → + provision_supervise_mcp, ensure Codex gets MCP provisioned. +- PRD 0013 — supervise plane foundation (defines the MCP-discoverable + block-remediation tools this PRD makes available to Codex). +- PRD 0048 — SSH deploy key provisioning (the `contrib/` convention + this PRD follows). +- Current source: + [agent_provider.py L154-L230](https://gitea.dideric.is/didericis/bot-bottle/src/branch/main/bot_bottle/agent_provider.py#L154-L230) + — the provider-specific block this PRD relocates to contrib.