docs(prd): draft PRD 0050 — move provider logic into contrib
This commit is contained in:
@@ -0,0 +1,400 @@
|
||||
# PRD 0050: Move provider-specific agent logic into contrib
|
||||
|
||||
- **Status:** Draft
|
||||
- **Author:** claude
|
||||
- **Created:** 2026-06-03
|
||||
- **Issue:** #177
|
||||
|
||||
## Summary
|
||||
|
||||
The agent provider module (`bot_bottle/agent_provider.py`) hard-codes
|
||||
the Claude- and Codex-specific provisioning rules — auth file shapes,
|
||||
trust-dialog markers, egress routes, dummy-auth dance, env vars — in a
|
||||
single `if template == "codex": ... if template == "claude": ...`
|
||||
chain (lines 154–230 today). Other pieces of provider behavior live in
|
||||
each backend's `provision/` directory (`provision_skills`,
|
||||
`provision_prompt`, `provision_provider_auth`, `provision_supervise`),
|
||||
duplicated once per backend, even though almost none of what they do
|
||||
is actually backend-specific.
|
||||
|
||||
This PRD reshapes the agent provider into a proper plugin boundary.
|
||||
The two existing providers (Claude, Codex) move out of `agent_provider`
|
||||
into `bot_bottle/contrib/claude/` and `bot_bottle/contrib/codex/` —
|
||||
the same `contrib/` layout PRD 0048 established for the Gitea
|
||||
deploy-key provisioner. The four provisioner methods backends
|
||||
currently duplicate move into the provider plugin itself; the backend
|
||||
keeps only the bottle-side primitives (`cp_in`, `exec`) the plugin
|
||||
calls through. MCP server registration becomes a first-class part of
|
||||
the provider contract so Codex finally gets the supervise sidecar
|
||||
wired in alongside Claude.
|
||||
|
||||
The shipping artifact is two new provider plugins under `contrib/`, a
|
||||
narrower `AgentProvider` ABC in `bot_bottle/agent_provider.py`, four
|
||||
fewer provisioner hooks on `BottleBackend`, and a supervise-MCP entry
|
||||
visible from the Codex agent at launch.
|
||||
|
||||
## Problem
|
||||
|
||||
Three concrete pains, all downstream of the provider abstraction not
|
||||
being where the work happens:
|
||||
|
||||
1. **Adding a third provider is a five-file edit.** A hypothetical
|
||||
Gemini or Aider provider has to: (a) add a branch in
|
||||
`agent_provision_plan`, (b) add a runtime entry in `_RUNTIMES`,
|
||||
(c) thread a `prompt_mode` enum value, (d) potentially extend
|
||||
`provision_provider_auth` per backend, (e) wire MCP registration
|
||||
into both `backend/docker/provision/supervise.py` and
|
||||
`backend/smolmachines/provision/supervise.py`. Nothing about that
|
||||
spread is load-bearing; it's leftover from when there was one
|
||||
provider.
|
||||
|
||||
2. **MCP server registration is Claude-only.** Both
|
||||
`backend/docker/provision/supervise.py` and
|
||||
`backend/smolmachines/provision/supervise.py` run `claude mcp add`
|
||||
verbatim. Codex bottles silently get no MCP entry — the sidecar
|
||||
is running, the routes are open, but the agent can't see the
|
||||
tools because nothing wrote them into Codex's TOML config. Today
|
||||
this is a latent gap. The provider plugin is the only layer that
|
||||
knows how a given agent discovers MCP servers, so that's where
|
||||
the registration belongs.
|
||||
|
||||
3. **`provision_skills` / `provision_prompt` / `provision_provider_auth`
|
||||
are duplicated between backends.** Each backend has its own
|
||||
~50-line copy. The differences are entirely about which path the
|
||||
backend uses for `cp_in` and what user it `chown`s to. Same
|
||||
business logic, two implementations, two test surfaces, two
|
||||
places to update when the rules change.
|
||||
|
||||
The agent_provider module is the right home for all of this. It already
|
||||
owns the `AgentProvisionPlan` (the declarative description of what
|
||||
needs to land in the guest); extending it to own the imperative
|
||||
"actually land it" step is the natural next move. Putting
|
||||
provider-specific code under `contrib/` mirrors the convention PRD 0048
|
||||
established and keeps the core package provider-agnostic.
|
||||
|
||||
## Goals / Success Criteria
|
||||
|
||||
1. `bot_bottle/agent_provider.py` contains no Claude- or
|
||||
Codex-specific branches. The Claude and Codex template strings
|
||||
themselves still live in the core module (they're the public
|
||||
manifest values), but everything keyed off them moves out.
|
||||
2. `bot_bottle/contrib/claude/agent_provider.py` and
|
||||
`bot_bottle/contrib/codex/agent_provider.py` exist and contain
|
||||
the provider-specific behavior previously in lines 154–230 of
|
||||
`agent_provider.py`. Each is reachable from the core registry via
|
||||
a lazy import (the same pattern PRD 0048 used for
|
||||
`GiteaDeployKeyProvisioner`).
|
||||
3. `AgentProvider` is an ABC (or protocol) with at minimum:
|
||||
- `provision_plan(...) -> AgentProvisionPlan` — what the existing
|
||||
`agent_provision_plan` produces today, scoped to one provider.
|
||||
- `provision_skills(bottle, plan)` — copy host skills into the guest.
|
||||
- `provision_prompt(bottle, plan)` — copy the prompt file, return
|
||||
the in-guest path (or None).
|
||||
- `provision_supervise_mcp(bottle, plan, supervise_url)` — register
|
||||
the supervise sidecar in the provider's MCP config. No-op when
|
||||
the bottle has no supervise sidecar.
|
||||
- The Claude implementation runs `claude mcp add`. The Codex
|
||||
implementation writes the corresponding entry into
|
||||
`~/.codex/config.toml`'s `[mcp_servers.supervise]` table.
|
||||
4. `BottleBackend` loses the four abstract methods being moved
|
||||
(`provision_skills`, `provision_prompt`, `provision_provider_auth`,
|
||||
`provision_supervise`). `BottleBackend.provision_in_bottle` calls
|
||||
the provider plugin directly via the bottle and plan it already
|
||||
has. `provision_ca`, `provision_workspace`, and `provision_git`
|
||||
stay on the backend — they're backend infrastructure, not
|
||||
provider behavior.
|
||||
5. `bot_bottle/backend/docker/provision/{skills,prompt,provider_auth,
|
||||
supervise}.py` and `bot_bottle/backend/smolmachines/provision/{skills,
|
||||
prompt,provider_auth,supervise}.py` are deleted. The
|
||||
backend-specific provisioners that remain (`ca`, `git`,
|
||||
`workspace`) stay.
|
||||
6. A Codex bottle launched with `--supervise` shows the
|
||||
supervise MCP server entry in its Codex config and can call
|
||||
supervise tools from inside the bottle (egress-block,
|
||||
pipelock-block, capability-block).
|
||||
7. Existing tests for the moved logic move with the code:
|
||||
provider-specific tests under `tests/unit/test_contrib_claude_*.py`
|
||||
and `tests/unit/test_contrib_codex_*.py`, mirroring
|
||||
`tests/unit/test_contrib_gitea_deploy_key.py`.
|
||||
8. PRD 0050's Status flips Draft → Active in the same commit that
|
||||
removes the last `if template == "claude"` branch from
|
||||
`agent_provider.py`.
|
||||
|
||||
## Non-goals
|
||||
|
||||
- **A third agent provider.** This PRD reshapes the boundary so a
|
||||
third provider is cheap to add. It does not add one.
|
||||
- **Changing the manifest surface.** The `agent.provider`
|
||||
manifest field still takes `"claude"` or `"codex"`. The set of
|
||||
valid strings is unchanged.
|
||||
- **Changing `AgentProvisionPlan`'s shape.** The dataclasses
|
||||
(`AgentProvisionDir`, `AgentProvisionFile`, `AgentProvisionCommand`,
|
||||
`AgentProvisionPlan` itself) stay in the core module and keep their
|
||||
current fields. Provider plugins produce the same plan shape; only
|
||||
the producer moves.
|
||||
- **Changing the supervise sidecar protocol or the supervise tool
|
||||
surface.** PRDs 0013–0016 stay Active. What changes is how the
|
||||
agent discovers the sidecar's MCP endpoint, not what it does once
|
||||
connected.
|
||||
- **Per-skill provider differences.** A Codex agent and a Claude
|
||||
agent see the same `~/.claude/skills/<name>/` tree today (Codex
|
||||
reads it via its own skills mechanism). This PRD does not change
|
||||
that — `provision_skills` lands the same content for both.
|
||||
- **Removing the `prompt_args` helper from `agent_provider.py`.** It
|
||||
stays at module scope; it's already a pure dispatch on `prompt_mode`
|
||||
and has no Claude/Codex `if` chain to extract.
|
||||
- **`provision_provider_auth` migration.** The issue notes this method
|
||||
is "probably not needed anymore" once each provider owns its own
|
||||
provisioning. After the move, the work that
|
||||
`provision_provider_auth` did (apply `dirs` / `files` / `pre_copy` /
|
||||
`verify` from the plan) becomes a shared helper the per-provider
|
||||
`provision_skills` / `provision_prompt` calls dispatch through —
|
||||
or, more likely, a single `provision(bottle)` entry point on the
|
||||
provider. The hook is removed from `BottleBackend`; whether the
|
||||
underlying loop lives on `AgentProvider` as a default
|
||||
implementation or as a free function in `contrib/_apply.py` is
|
||||
decided at implementation time, not in this PRD.
|
||||
|
||||
## Scope
|
||||
|
||||
### In scope
|
||||
|
||||
- New `AgentProvider` ABC in `bot_bottle/agent_provider.py` with the
|
||||
five methods listed under Goal 3. Existing `agent_provision_plan`
|
||||
becomes `AgentProvider.provision_plan`.
|
||||
- New `bot_bottle/contrib/claude/__init__.py`,
|
||||
`bot_bottle/contrib/claude/agent_provider.py`,
|
||||
`bot_bottle/contrib/codex/__init__.py`,
|
||||
`bot_bottle/contrib/codex/agent_provider.py`. Each defines a
|
||||
`ClaudeAgentProvider` / `CodexAgentProvider` class.
|
||||
- A `get_provider(template) -> AgentProvider` registry in
|
||||
`bot_bottle/agent_provider.py`, lazy-imported from `contrib/`,
|
||||
mirroring `get_provisioner(provider, ...)` in
|
||||
`bot_bottle/deploy_key_provisioner.py`.
|
||||
- Backend changes:
|
||||
- `BottleBackend.provision_in_bottle` resolves the provider once
|
||||
and calls `provider.provision_skills(bottle, plan)`,
|
||||
`provider.provision_prompt(bottle, plan)`, and
|
||||
`provider.provision_supervise_mcp(bottle, plan, url)` in place
|
||||
of the current four abstract hooks.
|
||||
- `BottleBackend.provision_skills`, `provision_prompt`,
|
||||
`provision_provider_auth`, `provision_supervise` are removed.
|
||||
- Docker and smolmachines backends remove their corresponding
|
||||
`provision_*` implementations and the
|
||||
`backend/<name>/provision/{skills,prompt,provider_auth,
|
||||
supervise}.py` modules.
|
||||
- Codex MCP wiring: `CodexAgentProvider.provision_supervise_mcp`
|
||||
writes a `[mcp_servers.supervise]` block into
|
||||
`~/.codex/config.toml` pointing at the same agent-side supervise
|
||||
URL the Claude provider uses. The file already exists from the
|
||||
trust-dialog step; the MCP entry is appended (or the file is
|
||||
rewritten in a single shot, whichever's simpler).
|
||||
- Tests migrate. Backend tests that targeted the four moved
|
||||
provisioners are rewritten against the provider plugin, with one
|
||||
test file per provider mirroring `tests/unit/test_contrib_gitea_*.py`.
|
||||
|
||||
### Out of scope
|
||||
|
||||
- Adding a manifest field for "extra MCP servers the agent should
|
||||
see". The supervise sidecar is the only MCP server provisioned
|
||||
today, and the issue's "Add mcp server configuring into agent
|
||||
provision" line is about the supervise sidecar specifically. A
|
||||
general-purpose user-declared MCP list is a follow-up if and when
|
||||
the need surfaces.
|
||||
- Refactoring `AgentProvisionPlan`'s dataclasses. They stay byte-
|
||||
for-byte the same so the diff is purely "who owns the producer".
|
||||
- A `BottleBackend.provision_provider_auth` shim during transition.
|
||||
The hook is removed in one cut; the only caller is the backend
|
||||
itself, no manifest consumers reference it.
|
||||
- Renaming `agent_provider.py` → `agent_providers/`. The module
|
||||
still has core dataclasses + the ABC + the registry; it's a single
|
||||
file's worth of code.
|
||||
|
||||
## Proposed design
|
||||
|
||||
### Module shape after the cut
|
||||
|
||||
```
|
||||
bot_bottle/agent_provider.py
|
||||
PROVIDER_CLAUDE, PROVIDER_CODEX, PROVIDER_TEMPLATES
|
||||
PromptMode (Literal)
|
||||
AgentProvisionDir, AgentProvisionFile, AgentProvisionCommand,
|
||||
AgentProvisionPlan (dataclasses, unchanged)
|
||||
AgentProviderRuntime (dataclass — template/command/image/etc.)
|
||||
AgentProvider (ABC)
|
||||
.runtime() -> AgentProviderRuntime
|
||||
.provision_plan(state_dir, ..., trusted_project_path, ...) -> AgentProvisionPlan
|
||||
.provision_skills(bottle, plan) -> None
|
||||
.provision_prompt(bottle, plan) -> str | None
|
||||
.provision_supervise_mcp(bottle, plan, supervise_url) -> None
|
||||
get_provider(template: str) -> AgentProvider # lazy-imports contrib
|
||||
prompt_args(prompt_mode, prompt_path, *, argv) # unchanged
|
||||
|
||||
bot_bottle/contrib/claude/agent_provider.py
|
||||
ClaudeAgentProvider(AgentProvider)
|
||||
_RUNTIME = AgentProviderRuntime(template="claude", ...)
|
||||
.provision_plan(...) # owns the lines-204–230 chunk
|
||||
.provision_skills(...) # was backend/<name>/provision/skills.py
|
||||
.provision_prompt(...) # was backend/<name>/provision/prompt.py
|
||||
.provision_supervise_mcp(...)# was backend/<name>/provision/supervise.py
|
||||
|
||||
bot_bottle/contrib/codex/agent_provider.py
|
||||
CodexAgentProvider(AgentProvider)
|
||||
_RUNTIME = AgentProviderRuntime(template="codex", ...)
|
||||
.provision_plan(...) # owns the lines-154–204 chunk
|
||||
.provision_skills(...) # same as Claude impl, factored to shared helper
|
||||
.provision_prompt(...) # same as Claude impl, factored to shared helper
|
||||
.provision_supervise_mcp(...)# writes [mcp_servers.supervise] to config.toml
|
||||
```
|
||||
|
||||
The skills / prompt / provider-auth-apply implementations are 99%
|
||||
identical across providers — `cp_in` then `chown` / `chmod`. They are
|
||||
extracted to small free functions in
|
||||
`bot_bottle/contrib/_provision_apply.py` (or kept as default
|
||||
implementations on `AgentProvider` if every concrete subclass would
|
||||
just call them). Picked at implementation time; both options match
|
||||
PRD 0048's contrib convention. The visible contract is that
|
||||
provisioning lives on the provider plugin.
|
||||
|
||||
### MCP registration for Codex
|
||||
|
||||
Codex reads MCP servers from `~/.codex/config.toml` (or whatever
|
||||
`CODEX_HOME/config.toml` resolves to). The provider already writes
|
||||
this file once during `provision_plan` to set the project trust
|
||||
level. `CodexAgentProvider.provision_supervise_mcp` extends the
|
||||
existing write: same path, append a `[mcp_servers.supervise]` table
|
||||
pointing at the agent-side supervise URL.
|
||||
|
||||
Two implementation routes worth flagging:
|
||||
|
||||
- **Option A:** Pre-bake the MCP entry in the same config-write that
|
||||
happens during `provision_plan`, before bottle launch. Simpler;
|
||||
the supervise URL has to be known at plan time, which means
|
||||
`provision_plan` needs the supervise URL (or a sentinel that means
|
||||
"fill this in"). The smolmachines backend already plumbs
|
||||
`agent_supervise_url` through to its provision_supervise step, so
|
||||
the value is available.
|
||||
- **Option B:** Append at bottle-launch time via a `bottle.exec`
|
||||
that writes to the file inside the guest, matching the
|
||||
`claude mcp add` flow. Slower but uniform with how
|
||||
`ClaudeAgentProvider.provision_supervise_mcp` works.
|
||||
|
||||
Option B is the symmetric choice and the one this PRD assumes.
|
||||
The implementer can switch to A if Option B turns out to need a
|
||||
TOML-merge primitive the codebase doesn't already have.
|
||||
|
||||
### Backend after the cut
|
||||
|
||||
```python
|
||||
class BottleBackend:
|
||||
def provision_in_bottle(self, plan, bottle, supervise_url):
|
||||
provider = get_provider(plan.spec.manifest.agents[
|
||||
plan.spec.agent_name].provider)
|
||||
self.provision_ca(plan, bottle)
|
||||
prompt_path = provider.provision_prompt(bottle, plan)
|
||||
provider.provision_skills(bottle, plan)
|
||||
self.provision_workspace(plan, bottle)
|
||||
self.provision_git(plan, bottle)
|
||||
provider.provision_supervise_mcp(bottle, plan, supervise_url)
|
||||
return prompt_path
|
||||
```
|
||||
|
||||
`supervise_url` is the existing per-backend "where does the agent
|
||||
reach the sidecar from inside the guest" value. The Docker backend
|
||||
passes `http://supervise:<port>/`; smolmachines passes the
|
||||
`http://127.0.0.1:<port>/` it already computed. The backend's only
|
||||
remaining provider-touching duty is "tell the provider what the
|
||||
sidecar URL is".
|
||||
|
||||
### Registry
|
||||
|
||||
```python
|
||||
# bot_bottle/agent_provider.py
|
||||
def get_provider(template: str) -> AgentProvider:
|
||||
if template == PROVIDER_CLAUDE:
|
||||
from bot_bottle.contrib.claude.agent_provider import (
|
||||
ClaudeAgentProvider,
|
||||
)
|
||||
return ClaudeAgentProvider()
|
||||
if template == PROVIDER_CODEX:
|
||||
from bot_bottle.contrib.codex.agent_provider import (
|
||||
CodexAgentProvider,
|
||||
)
|
||||
return CodexAgentProvider()
|
||||
raise ValueError(f"unknown agent provider template: {template!r}")
|
||||
```
|
||||
|
||||
Lazy imports keep core import-time graph small and match PRD 0048.
|
||||
|
||||
## Implementation chunks
|
||||
|
||||
Each chunk is one commit on the PR; the PR ships as one cut.
|
||||
|
||||
1. **Lift `AgentProvider` ABC + registry.** Add the ABC and
|
||||
`get_provider` next to the existing `agent_provision_plan`
|
||||
function. Have `agent_provision_plan` delegate to
|
||||
`get_provider(template).provision_plan(...)` so callers keep
|
||||
working through the transition.
|
||||
2. **Move provider-specific `provision_plan` content into
|
||||
contrib.** Create `contrib/claude/` and `contrib/codex/`. The
|
||||
Claude and Codex branches of `agent_provision_plan` move into
|
||||
the respective provider classes. The shared scaffolding
|
||||
(initial dict setup, final `AgentProvisionPlan(...)` return)
|
||||
stays in the ABC as a template method or moves into each
|
||||
subclass — whichever needs less indirection.
|
||||
3. **Move backend provisioners onto the provider.** Add
|
||||
`provision_skills`, `provision_prompt`, `provision_supervise_mcp`
|
||||
to `AgentProvider` (with a shared apply helper for skills /
|
||||
prompt). Update `BottleBackend.provision_in_bottle` to call them.
|
||||
Delete the four backend hook methods and the eight
|
||||
`backend/<name>/provision/{skills,prompt,provider_auth,supervise}.py`
|
||||
modules.
|
||||
4. **Add Codex MCP support.** Implement
|
||||
`CodexAgentProvider.provision_supervise_mcp` against
|
||||
`~/.codex/config.toml`. Add a unit test that runs the method
|
||||
against an in-memory FakeBottle and asserts the
|
||||
`[mcp_servers.supervise]` block is present.
|
||||
5. **Migrate tests.** Per-backend tests for the moved
|
||||
provisioners turn into per-provider tests under
|
||||
`tests/unit/test_contrib_claude_*.py` and
|
||||
`tests/unit/test_contrib_codex_*.py`. Keep one integration-style
|
||||
test per backend that confirms `provision_in_bottle` still
|
||||
reaches every step.
|
||||
6. **Activate.** Flip Status: Draft → Active in this PRD; close
|
||||
#177 on merge.
|
||||
|
||||
## Open questions
|
||||
|
||||
1. **Does Codex's `claude mcp add` equivalent exist as a CLI verb,
|
||||
or is direct TOML editing the only option?** If a `codex mcp
|
||||
add` (or similar) is in the binary, Option A in the MCP section
|
||||
becomes more attractive. The implementation chunk for Codex MCP
|
||||
should check `codex --help` against the version pinned in
|
||||
`Dockerfile.codex` before committing to TOML editing.
|
||||
2. **Shared apply helper vs. default methods on `AgentProvider`?**
|
||||
Both work. Default methods read cleaner at call sites; a free
|
||||
function in `contrib/_provision_apply.py` is easier to test in
|
||||
isolation. Defer to the implementer; not load-bearing for the
|
||||
PRD.
|
||||
3. **Should `BOT_BOTTLE_CONTAINER_HOME` / `BOT_BOTTLE_GUEST_HOME`
|
||||
/`BOT_BOTTLE_CONTAINER_SKILLS_DIR` env knobs survive the move?**
|
||||
They're only read by the backend-side provision modules being
|
||||
deleted. The provider plugin can take them too, but the values
|
||||
are effectively constants today. Leaving the env-read fall-back
|
||||
in place for compat is one line per knob; this PRD says yes,
|
||||
carry them over.
|
||||
|
||||
## References
|
||||
|
||||
- Issue
|
||||
[#177](https://gitea.dideric.is/didericis/bot-bottle/issues/177)
|
||||
— the request: move provider logic into contrib, add MCP
|
||||
configuration to agent provision, rename provision_supervise →
|
||||
provision_supervise_mcp, ensure Codex gets MCP provisioned.
|
||||
- PRD 0013 — supervise plane foundation (defines the MCP-discoverable
|
||||
block-remediation tools this PRD makes available to Codex).
|
||||
- PRD 0048 — SSH deploy key provisioning (the `contrib/` convention
|
||||
this PRD follows).
|
||||
- Current source:
|
||||
[agent_provider.py L154-L230](https://gitea.dideric.is/didericis/bot-bottle/src/branch/main/bot_bottle/agent_provider.py#L154-L230)
|
||||
— the provider-specific block this PRD relocates to contrib.
|
||||
Reference in New Issue
Block a user