docs(prd): draft PRD 0050 — move provider logic into contrib

This commit is contained in:
2026-06-03 21:05:31 +00:00
committed by didericis
parent 98e4e2b7dc
commit 2c6f248cda
+400
View File
@@ -0,0 +1,400 @@
# PRD 0050: Move provider-specific agent logic into contrib
- **Status:** Draft
- **Author:** claude
- **Created:** 2026-06-03
- **Issue:** #177
## Summary
The agent provider module (`bot_bottle/agent_provider.py`) hard-codes
the Claude- and Codex-specific provisioning rules — auth file shapes,
trust-dialog markers, egress routes, dummy-auth dance, env vars — in a
single `if template == "codex": ... if template == "claude": ...`
chain (lines 154230 today). Other pieces of provider behavior live in
each backend's `provision/` directory (`provision_skills`,
`provision_prompt`, `provision_provider_auth`, `provision_supervise`),
duplicated once per backend, even though almost none of what they do
is actually backend-specific.
This PRD reshapes the agent provider into a proper plugin boundary.
The two existing providers (Claude, Codex) move out of `agent_provider`
into `bot_bottle/contrib/claude/` and `bot_bottle/contrib/codex/`
the same `contrib/` layout PRD 0048 established for the Gitea
deploy-key provisioner. The four provisioner methods backends
currently duplicate move into the provider plugin itself; the backend
keeps only the bottle-side primitives (`cp_in`, `exec`) the plugin
calls through. MCP server registration becomes a first-class part of
the provider contract so Codex finally gets the supervise sidecar
wired in alongside Claude.
The shipping artifact is two new provider plugins under `contrib/`, a
narrower `AgentProvider` ABC in `bot_bottle/agent_provider.py`, four
fewer provisioner hooks on `BottleBackend`, and a supervise-MCP entry
visible from the Codex agent at launch.
## Problem
Three concrete pains, all downstream of the provider abstraction not
being where the work happens:
1. **Adding a third provider is a five-file edit.** A hypothetical
Gemini or Aider provider has to: (a) add a branch in
`agent_provision_plan`, (b) add a runtime entry in `_RUNTIMES`,
(c) thread a `prompt_mode` enum value, (d) potentially extend
`provision_provider_auth` per backend, (e) wire MCP registration
into both `backend/docker/provision/supervise.py` and
`backend/smolmachines/provision/supervise.py`. Nothing about that
spread is load-bearing; it's leftover from when there was one
provider.
2. **MCP server registration is Claude-only.** Both
`backend/docker/provision/supervise.py` and
`backend/smolmachines/provision/supervise.py` run `claude mcp add`
verbatim. Codex bottles silently get no MCP entry — the sidecar
is running, the routes are open, but the agent can't see the
tools because nothing wrote them into Codex's TOML config. Today
this is a latent gap. The provider plugin is the only layer that
knows how a given agent discovers MCP servers, so that's where
the registration belongs.
3. **`provision_skills` / `provision_prompt` / `provision_provider_auth`
are duplicated between backends.** Each backend has its own
~50-line copy. The differences are entirely about which path the
backend uses for `cp_in` and what user it `chown`s to. Same
business logic, two implementations, two test surfaces, two
places to update when the rules change.
The agent_provider module is the right home for all of this. It already
owns the `AgentProvisionPlan` (the declarative description of what
needs to land in the guest); extending it to own the imperative
"actually land it" step is the natural next move. Putting
provider-specific code under `contrib/` mirrors the convention PRD 0048
established and keeps the core package provider-agnostic.
## Goals / Success Criteria
1. `bot_bottle/agent_provider.py` contains no Claude- or
Codex-specific branches. The Claude and Codex template strings
themselves still live in the core module (they're the public
manifest values), but everything keyed off them moves out.
2. `bot_bottle/contrib/claude/agent_provider.py` and
`bot_bottle/contrib/codex/agent_provider.py` exist and contain
the provider-specific behavior previously in lines 154230 of
`agent_provider.py`. Each is reachable from the core registry via
a lazy import (the same pattern PRD 0048 used for
`GiteaDeployKeyProvisioner`).
3. `AgentProvider` is an ABC (or protocol) with at minimum:
- `provision_plan(...) -> AgentProvisionPlan` — what the existing
`agent_provision_plan` produces today, scoped to one provider.
- `provision_skills(bottle, plan)` — copy host skills into the guest.
- `provision_prompt(bottle, plan)` — copy the prompt file, return
the in-guest path (or None).
- `provision_supervise_mcp(bottle, plan, supervise_url)` — register
the supervise sidecar in the provider's MCP config. No-op when
the bottle has no supervise sidecar.
- The Claude implementation runs `claude mcp add`. The Codex
implementation writes the corresponding entry into
`~/.codex/config.toml`'s `[mcp_servers.supervise]` table.
4. `BottleBackend` loses the four abstract methods being moved
(`provision_skills`, `provision_prompt`, `provision_provider_auth`,
`provision_supervise`). `BottleBackend.provision_in_bottle` calls
the provider plugin directly via the bottle and plan it already
has. `provision_ca`, `provision_workspace`, and `provision_git`
stay on the backend — they're backend infrastructure, not
provider behavior.
5. `bot_bottle/backend/docker/provision/{skills,prompt,provider_auth,
supervise}.py` and `bot_bottle/backend/smolmachines/provision/{skills,
prompt,provider_auth,supervise}.py` are deleted. The
backend-specific provisioners that remain (`ca`, `git`,
`workspace`) stay.
6. A Codex bottle launched with `--supervise` shows the
supervise MCP server entry in its Codex config and can call
supervise tools from inside the bottle (egress-block,
pipelock-block, capability-block).
7. Existing tests for the moved logic move with the code:
provider-specific tests under `tests/unit/test_contrib_claude_*.py`
and `tests/unit/test_contrib_codex_*.py`, mirroring
`tests/unit/test_contrib_gitea_deploy_key.py`.
8. PRD 0050's Status flips Draft → Active in the same commit that
removes the last `if template == "claude"` branch from
`agent_provider.py`.
## Non-goals
- **A third agent provider.** This PRD reshapes the boundary so a
third provider is cheap to add. It does not add one.
- **Changing the manifest surface.** The `agent.provider`
manifest field still takes `"claude"` or `"codex"`. The set of
valid strings is unchanged.
- **Changing `AgentProvisionPlan`'s shape.** The dataclasses
(`AgentProvisionDir`, `AgentProvisionFile`, `AgentProvisionCommand`,
`AgentProvisionPlan` itself) stay in the core module and keep their
current fields. Provider plugins produce the same plan shape; only
the producer moves.
- **Changing the supervise sidecar protocol or the supervise tool
surface.** PRDs 00130016 stay Active. What changes is how the
agent discovers the sidecar's MCP endpoint, not what it does once
connected.
- **Per-skill provider differences.** A Codex agent and a Claude
agent see the same `~/.claude/skills/<name>/` tree today (Codex
reads it via its own skills mechanism). This PRD does not change
that — `provision_skills` lands the same content for both.
- **Removing the `prompt_args` helper from `agent_provider.py`.** It
stays at module scope; it's already a pure dispatch on `prompt_mode`
and has no Claude/Codex `if` chain to extract.
- **`provision_provider_auth` migration.** The issue notes this method
is "probably not needed anymore" once each provider owns its own
provisioning. After the move, the work that
`provision_provider_auth` did (apply `dirs` / `files` / `pre_copy` /
`verify` from the plan) becomes a shared helper the per-provider
`provision_skills` / `provision_prompt` calls dispatch through —
or, more likely, a single `provision(bottle)` entry point on the
provider. The hook is removed from `BottleBackend`; whether the
underlying loop lives on `AgentProvider` as a default
implementation or as a free function in `contrib/_apply.py` is
decided at implementation time, not in this PRD.
## Scope
### In scope
- New `AgentProvider` ABC in `bot_bottle/agent_provider.py` with the
five methods listed under Goal 3. Existing `agent_provision_plan`
becomes `AgentProvider.provision_plan`.
- New `bot_bottle/contrib/claude/__init__.py`,
`bot_bottle/contrib/claude/agent_provider.py`,
`bot_bottle/contrib/codex/__init__.py`,
`bot_bottle/contrib/codex/agent_provider.py`. Each defines a
`ClaudeAgentProvider` / `CodexAgentProvider` class.
- A `get_provider(template) -> AgentProvider` registry in
`bot_bottle/agent_provider.py`, lazy-imported from `contrib/`,
mirroring `get_provisioner(provider, ...)` in
`bot_bottle/deploy_key_provisioner.py`.
- Backend changes:
- `BottleBackend.provision_in_bottle` resolves the provider once
and calls `provider.provision_skills(bottle, plan)`,
`provider.provision_prompt(bottle, plan)`, and
`provider.provision_supervise_mcp(bottle, plan, url)` in place
of the current four abstract hooks.
- `BottleBackend.provision_skills`, `provision_prompt`,
`provision_provider_auth`, `provision_supervise` are removed.
- Docker and smolmachines backends remove their corresponding
`provision_*` implementations and the
`backend/<name>/provision/{skills,prompt,provider_auth,
supervise}.py` modules.
- Codex MCP wiring: `CodexAgentProvider.provision_supervise_mcp`
writes a `[mcp_servers.supervise]` block into
`~/.codex/config.toml` pointing at the same agent-side supervise
URL the Claude provider uses. The file already exists from the
trust-dialog step; the MCP entry is appended (or the file is
rewritten in a single shot, whichever's simpler).
- Tests migrate. Backend tests that targeted the four moved
provisioners are rewritten against the provider plugin, with one
test file per provider mirroring `tests/unit/test_contrib_gitea_*.py`.
### Out of scope
- Adding a manifest field for "extra MCP servers the agent should
see". The supervise sidecar is the only MCP server provisioned
today, and the issue's "Add mcp server configuring into agent
provision" line is about the supervise sidecar specifically. A
general-purpose user-declared MCP list is a follow-up if and when
the need surfaces.
- Refactoring `AgentProvisionPlan`'s dataclasses. They stay byte-
for-byte the same so the diff is purely "who owns the producer".
- A `BottleBackend.provision_provider_auth` shim during transition.
The hook is removed in one cut; the only caller is the backend
itself, no manifest consumers reference it.
- Renaming `agent_provider.py` → `agent_providers/`. The module
still has core dataclasses + the ABC + the registry; it's a single
file's worth of code.
## Proposed design
### Module shape after the cut
```
bot_bottle/agent_provider.py
PROVIDER_CLAUDE, PROVIDER_CODEX, PROVIDER_TEMPLATES
PromptMode (Literal)
AgentProvisionDir, AgentProvisionFile, AgentProvisionCommand,
AgentProvisionPlan (dataclasses, unchanged)
AgentProviderRuntime (dataclass — template/command/image/etc.)
AgentProvider (ABC)
.runtime() -> AgentProviderRuntime
.provision_plan(state_dir, ..., trusted_project_path, ...) -> AgentProvisionPlan
.provision_skills(bottle, plan) -> None
.provision_prompt(bottle, plan) -> str | None
.provision_supervise_mcp(bottle, plan, supervise_url) -> None
get_provider(template: str) -> AgentProvider # lazy-imports contrib
prompt_args(prompt_mode, prompt_path, *, argv) # unchanged
bot_bottle/contrib/claude/agent_provider.py
ClaudeAgentProvider(AgentProvider)
_RUNTIME = AgentProviderRuntime(template="claude", ...)
.provision_plan(...) # owns the lines-204230 chunk
.provision_skills(...) # was backend/<name>/provision/skills.py
.provision_prompt(...) # was backend/<name>/provision/prompt.py
.provision_supervise_mcp(...)# was backend/<name>/provision/supervise.py
bot_bottle/contrib/codex/agent_provider.py
CodexAgentProvider(AgentProvider)
_RUNTIME = AgentProviderRuntime(template="codex", ...)
.provision_plan(...) # owns the lines-154204 chunk
.provision_skills(...) # same as Claude impl, factored to shared helper
.provision_prompt(...) # same as Claude impl, factored to shared helper
.provision_supervise_mcp(...)# writes [mcp_servers.supervise] to config.toml
```
The skills / prompt / provider-auth-apply implementations are 99%
identical across providers — `cp_in` then `chown` / `chmod`. They are
extracted to small free functions in
`bot_bottle/contrib/_provision_apply.py` (or kept as default
implementations on `AgentProvider` if every concrete subclass would
just call them). Picked at implementation time; both options match
PRD 0048's contrib convention. The visible contract is that
provisioning lives on the provider plugin.
### MCP registration for Codex
Codex reads MCP servers from `~/.codex/config.toml` (or whatever
`CODEX_HOME/config.toml` resolves to). The provider already writes
this file once during `provision_plan` to set the project trust
level. `CodexAgentProvider.provision_supervise_mcp` extends the
existing write: same path, append a `[mcp_servers.supervise]` table
pointing at the agent-side supervise URL.
Two implementation routes worth flagging:
- **Option A:** Pre-bake the MCP entry in the same config-write that
happens during `provision_plan`, before bottle launch. Simpler;
the supervise URL has to be known at plan time, which means
`provision_plan` needs the supervise URL (or a sentinel that means
"fill this in"). The smolmachines backend already plumbs
`agent_supervise_url` through to its provision_supervise step, so
the value is available.
- **Option B:** Append at bottle-launch time via a `bottle.exec`
that writes to the file inside the guest, matching the
`claude mcp add` flow. Slower but uniform with how
`ClaudeAgentProvider.provision_supervise_mcp` works.
Option B is the symmetric choice and the one this PRD assumes.
The implementer can switch to A if Option B turns out to need a
TOML-merge primitive the codebase doesn't already have.
### Backend after the cut
```python
class BottleBackend:
def provision_in_bottle(self, plan, bottle, supervise_url):
provider = get_provider(plan.spec.manifest.agents[
plan.spec.agent_name].provider)
self.provision_ca(plan, bottle)
prompt_path = provider.provision_prompt(bottle, plan)
provider.provision_skills(bottle, plan)
self.provision_workspace(plan, bottle)
self.provision_git(plan, bottle)
provider.provision_supervise_mcp(bottle, plan, supervise_url)
return prompt_path
```
`supervise_url` is the existing per-backend "where does the agent
reach the sidecar from inside the guest" value. The Docker backend
passes `http://supervise:<port>/`; smolmachines passes the
`http://127.0.0.1:<port>/` it already computed. The backend's only
remaining provider-touching duty is "tell the provider what the
sidecar URL is".
### Registry
```python
# bot_bottle/agent_provider.py
def get_provider(template: str) -> AgentProvider:
if template == PROVIDER_CLAUDE:
from bot_bottle.contrib.claude.agent_provider import (
ClaudeAgentProvider,
)
return ClaudeAgentProvider()
if template == PROVIDER_CODEX:
from bot_bottle.contrib.codex.agent_provider import (
CodexAgentProvider,
)
return CodexAgentProvider()
raise ValueError(f"unknown agent provider template: {template!r}")
```
Lazy imports keep core import-time graph small and match PRD 0048.
## Implementation chunks
Each chunk is one commit on the PR; the PR ships as one cut.
1. **Lift `AgentProvider` ABC + registry.** Add the ABC and
`get_provider` next to the existing `agent_provision_plan`
function. Have `agent_provision_plan` delegate to
`get_provider(template).provision_plan(...)` so callers keep
working through the transition.
2. **Move provider-specific `provision_plan` content into
contrib.** Create `contrib/claude/` and `contrib/codex/`. The
Claude and Codex branches of `agent_provision_plan` move into
the respective provider classes. The shared scaffolding
(initial dict setup, final `AgentProvisionPlan(...)` return)
stays in the ABC as a template method or moves into each
subclass — whichever needs less indirection.
3. **Move backend provisioners onto the provider.** Add
`provision_skills`, `provision_prompt`, `provision_supervise_mcp`
to `AgentProvider` (with a shared apply helper for skills /
prompt). Update `BottleBackend.provision_in_bottle` to call them.
Delete the four backend hook methods and the eight
`backend/<name>/provision/{skills,prompt,provider_auth,supervise}.py`
modules.
4. **Add Codex MCP support.** Implement
`CodexAgentProvider.provision_supervise_mcp` against
`~/.codex/config.toml`. Add a unit test that runs the method
against an in-memory FakeBottle and asserts the
`[mcp_servers.supervise]` block is present.
5. **Migrate tests.** Per-backend tests for the moved
provisioners turn into per-provider tests under
`tests/unit/test_contrib_claude_*.py` and
`tests/unit/test_contrib_codex_*.py`. Keep one integration-style
test per backend that confirms `provision_in_bottle` still
reaches every step.
6. **Activate.** Flip Status: Draft → Active in this PRD; close
#177 on merge.
## Open questions
1. **Does Codex's `claude mcp add` equivalent exist as a CLI verb,
or is direct TOML editing the only option?** If a `codex mcp
add` (or similar) is in the binary, Option A in the MCP section
becomes more attractive. The implementation chunk for Codex MCP
should check `codex --help` against the version pinned in
`Dockerfile.codex` before committing to TOML editing.
2. **Shared apply helper vs. default methods on `AgentProvider`?**
Both work. Default methods read cleaner at call sites; a free
function in `contrib/_provision_apply.py` is easier to test in
isolation. Defer to the implementer; not load-bearing for the
PRD.
3. **Should `BOT_BOTTLE_CONTAINER_HOME` / `BOT_BOTTLE_GUEST_HOME`
/`BOT_BOTTLE_CONTAINER_SKILLS_DIR` env knobs survive the move?**
They're only read by the backend-side provision modules being
deleted. The provider plugin can take them too, but the values
are effectively constants today. Leaving the env-read fall-back
in place for compat is one line per knob; this PRD says yes,
carry them over.
## References
- Issue
[#177](https://gitea.dideric.is/didericis/bot-bottle/issues/177)
— the request: move provider logic into contrib, add MCP
configuration to agent provision, rename provision_supervise →
provision_supervise_mcp, ensure Codex gets MCP provisioned.
- PRD 0013 — supervise plane foundation (defines the MCP-discoverable
block-remediation tools this PRD makes available to Codex).
- PRD 0048 — SSH deploy key provisioning (the `contrib/` convention
this PRD follows).
- Current source:
[agent_provider.py L154-L230](https://gitea.dideric.is/didericis/bot-bottle/src/branch/main/bot_bottle/agent_provider.py#L154-L230)
— the provider-specific block this PRD relocates to contrib.