PRD 0026: Agent Provider Templates #91

Merged
didericis merged 19 commits from prd-0026-agent-provider-templates into main 2026-05-28 20:04:41 -04:00
Showing only changes of commit e03d90962d
@@ -0,0 +1,90 @@
# PRD 0026: Agent Provider Templates
- **Status:** Draft
- **Author:** codex
- **Created:** 2026-05-28
## Summary
Support multiple agent harnesses, starting with Codex and Claude, while keeping agent files provider-agnostic and bottle files responsible for boundaries.
## Problem
Today claude-bottle is hard-wired around Claude Code assumptions. When Claude runs out of quota or is otherwise unavailable, the operator cannot spin up an equivalent Codex-backed bottle from the dashboard or the `./cli.py start` path. Agent files should remain purpose/guidance documents, while bottle files define security boundaries and provider/runtime choices.
## Goals / Success Criteria
- A Codex agent can be started from the dashboard and via `./cli.py start` alongside a Claude agent.
- The manifest can express the agent provider/template and, where needed, a custom agent Dockerfile.
- Claude-specific default egress/auth behavior is no longer implicit; provider-specific auth is expressed through explicit bottle egress routes and roles.
- The launcher preserves required infrastructure behavior for sidecars, egress, pipelock, supervisor MCP, CA handling, git, and shell basics.
- Unit tests cover manifest parsing, provider validation, provider-specific auth role behavior, and launch/prepare plan differences.
## Non-goals
- Do not implement support for additional harnesses such as `pi`, `aider`, or other future providers.
- Do not move security boundaries into agent files.
- Do not allow custom Dockerfiles to remove or bypass required claude-bottle infrastructure.
- Do not add new runtime dependencies unless the existing Docker/Codex tooling cannot satisfy the minimum cut.
## Scope
### In scope
- Add a bottle-level provider/template configuration for Claude and Codex.
- Add a Codex template that can launch a Codex agent from the dashboard and `./cli.py start`.
- Support a custom agent Dockerfile path for the agent environment.
- Make Claude-specific egress/auth defaults explicit in bottle manifests instead of auto-provided.
- Add a Codex-specific auth role and provider-aware role validation.
- Keep existing Claude behavior available through a Claude provider/template.
- Gate Claude-specific crash-state/transcript handling behind a Claude-only flag or provider branch.
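
The provider-aware role validation listed above could look roughly like the following. This is a minimal sketch, not the implementation in `claude_bottle/manifest.py`: the role names `claude-auth` and `codex-auth` and the function `validate_egress_roles` are hypothetical, and the exact Codex auth role name is still an open question in this PRD.

```python
# Hypothetical role names; the PRD leaves the real Codex auth role name open.
PROVIDER_AUTH_ROLES = {
    "claude": {"claude-auth"},
    "codex": {"codex-auth"},
}


def validate_egress_roles(template: str, roles: list[str]) -> None:
    """Reject auth roles that belong to a different provider than `template`."""
    if template not in PROVIDER_AUTH_ROLES:
        raise ValueError(f"unknown agent provider template: {template!r}")
    # Collect every auth role owned by some other provider.
    foreign = set().union(
        *(owned for name, owned in PROVIDER_AUTH_ROLES.items() if name != template)
    )
    bad = sorted(set(roles) & foreign)
    if bad:
        raise ValueError(f"egress roles {bad} are not valid for provider {template!r}")


# Roles that are not provider auth roles (here "git") pass through untouched.
validate_egress_roles("codex", ["codex-auth", "git"])
```

Keeping the provider-to-role mapping in one table means a future provider would add one entry rather than another validation branch.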
### Out of scope
- Implementing providers beyond Claude and Codex.
- Redesigning the agent file format beyond keeping it provider- and bottle-agnostic.
- Reworking the whole state/transcript subsystem in this PRD; provider-specific state handling should be isolated now and refactored in a follow-up.
## Proposed Design
### New services / components
- New `AgentProvider` model for provider/template behavior.
- Bottle manifests use a nested `agent_provider` shape:
```yaml
agent_provider:
  template: codex # or claude
  dockerfile: ./Dockerfile.codex # optional
```
Review comment (conversation marked resolved by didericis): debating whether or not we should call this "type" instead of template...
- Provider-specific launch configuration for Claude and Codex, including command argv, auth placeholder behavior, and default image/Dockerfile selection.
- Provider-aware egress role validation, including a new Codex auth role.
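
As an illustration of the `AgentProvider` model and the nested `agent_provider` manifest shape, here is a minimal parsing sketch. Only the name `AgentProvider` and the `template` and `dockerfile` keys come from this PRD. The function `parse_agent_provider`, the error messages, and the fallback to Claude when the block is missing are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional

SUPPORTED_TEMPLATES = ("claude", "codex")


@dataclass(frozen=True)
class AgentProvider:
    template: str
    dockerfile: Optional[str] = None  # optional custom agent Dockerfile path


def parse_agent_provider(manifest: dict[str, Any]) -> AgentProvider:
    """Parse and validate the nested `agent_provider` block of a bottle manifest."""
    raw = manifest.get("agent_provider")
    if raw is None:
        # Assumption: manifests without the block keep the existing Claude behavior.
        return AgentProvider(template="claude")
    if not isinstance(raw, dict):
        raise ValueError("agent_provider must be a mapping")
    unknown = sorted(str(key) for key in set(raw) - {"template", "dockerfile"})
    if unknown:
        raise ValueError(f"unknown agent_provider keys: {unknown}")
    template = raw.get("template")
    if template not in SUPPORTED_TEMPLATES:
        raise ValueError(
            f"agent_provider.template must be one of {SUPPORTED_TEMPLATES}, got {template!r}"
        )
    dockerfile = raw.get("dockerfile")
    if dockerfile is not None and not isinstance(dockerfile, str):
        raise ValueError("agent_provider.dockerfile must be a string path")
    return AgentProvider(template=template, dockerfile=dockerfile)


provider = parse_agent_provider(
    {"agent_provider": {"template": "codex", "dockerfile": "./Dockerfile.codex"}}
)
```

Rejecting unknown keys early is a design choice for the sketch; it surfaces typos such as `docker_file` at manifest load time instead of at launch.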
### Existing code touched
- `claude_bottle/manifest.py` for provider schema and role validation.
- Docker and smolmachines prepare/launch/provision paths for provider-specific image, command, auth, and state behavior.
- Dashboard/start display paths so the selected provider is visible and usable.
- README and PRD docs for provider/template configuration.
- Unit tests around manifest parsing, backend plans, launch argv, egress roles, and dashboard/start behavior.
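
One way the launch/prepare plan differences could be tabulated and then exercised by unit tests is sketched below. Everything here is hypothetical: `LaunchPlan`, `build_launch_plan`, and the default Dockerfile names are invented for illustration, and the argv entries assume the agent image puts a `claude` or `codex` executable on `PATH`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LaunchPlan:
    argv: tuple[str, ...]  # command run inside the agent container
    dockerfile: str        # Dockerfile used to build the agent image


# Hypothetical per-provider defaults: (command argv, default Dockerfile).
_PROVIDER_DEFAULTS = {
    "claude": (("claude",), "Dockerfile.claude"),
    "codex": (("codex",), "Dockerfile.codex"),
}


def build_launch_plan(template: str, dockerfile: Optional[str] = None) -> LaunchPlan:
    """Pick argv and Dockerfile for a provider; a custom Dockerfile overrides the default."""
    if template not in _PROVIDER_DEFAULTS:
        raise ValueError(f"unknown agent provider template: {template!r}")
    argv, default_dockerfile = _PROVIDER_DEFAULTS[template]
    return LaunchPlan(argv=argv, dockerfile=dockerfile or default_dockerfile)


codex_plan = build_launch_plan("codex")
custom_plan = build_launch_plan("claude", dockerfile="./Dockerfile.custom")
```

The sketch only selects a Dockerfile. It does not check that a custom Dockerfile preserves the required claude-bottle infrastructure, which the non-goals say must not be bypassed.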
### Data model changes
- Manifest schema gains bottle-level provider/template configuration.
- No persistent state migration is expected.
- Existing Claude-specific crash-state/transcript dumping in the state folder should be guarded so it only runs for Claude agents. A broader state/transcript abstraction is a follow-up.
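
The guard around Claude-specific crash-state/transcript dumping could be as small as a provider branch. The sketch below is an assumption about shape only: `dump_state_if_supported` and its parameters are hypothetical, and the existing dump routine is passed in as a callable because its real name and signature are not given in this PRD.

```python
from pathlib import Path
from typing import Callable


def dump_state_if_supported(
    template: str,
    state_dir: Path,
    dump_claude_state: Callable[[Path], None],
) -> bool:
    """Run the Claude-specific dump only for Claude agents; return whether it ran."""
    if template != "claude":
        # Other providers skip the dump until a provider-neutral abstraction exists.
        return False
    dump_claude_state(state_dir)
    return True


# Usage with a stand-in dump routine that just records the directory it was given.
calls: list[Path] = []
ran_for_codex = dump_state_if_supported("codex", Path("state"), calls.append)
ran_for_claude = dump_state_if_supported("claude", Path("state"), calls.append)
```

Returning a boolean makes the skipped case observable in tests and logs without raising for non-Claude providers.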
### External dependencies
- Avoid new runtime dependencies where possible.
- Use existing Docker image build flows and whatever Codex installation the chosen agent image/template already provides.
## Open questions
- What is the exact Codex auth role name and environment-variable contract?
- Which state-folder artifacts are Claude-specific today, and which are provider-neutral?
## References
- Issue #90: Support for different agents