bot-bottle/docs/prds/0013-supervise-plane-foundation.md
didericis 578363bea3 docs(prd-0013): supervise plane foundation
Adds PRD 0013, the shared foundation for the stuck-agent recovery flow
(overview in PRD 0012). Defines the MCP sidecar, the three tool
definitions, the proposal queue, the read-only current-config mount,
the minimal TUI, and the audit log format. Approval handlers are
deliberately no-ops; the actual remediations land in PRDs 0014, 0015,
and 0016.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:20:57 -04:00


PRD 0013: Supervise plane foundation

  • Status: Draft
  • Author: didericis
  • Created: 2026-05-25
  • Parent: PRD 0012

Summary

The shared infrastructure that PRDs 0014–0016 build on. Adds a per-bottle MCP sidecar that exposes three tools (cred-proxy-block, pipelock-block, capability-block) to the agent; a read-only /etc/claude-bottle/current-config/ mount in the agent container that exposes the current routes.json, pipelock allowlist, and Dockerfile; a host-mounted proposal queue; a minimal TUI dashboard that lists pending proposals and supports approve / modify / reject; and the audit log format. After this PRD, an operator can see proposals and approve or reject them, but the approval handlers are no-ops. The remediation engines that actually act on approvals land in 0014, 0015, and 0016.

Problem

See PRD 0012 for the broader stuck-agent problem. This PRD specifically addresses: there is no protocol for the agent to ask the operator for help, no place for the operator to see what the agent is asking, and no audit trail tying agent asks to operator decisions.

Goals / Success Criteria

  • The agent in a bottle can call any of the three MCP tools and receive a structured response that reflects a real operator decision.
  • The operator can list pending proposals across all running bottles in a TUI and approve / modify / reject each one with a single command.
  • Each approve / modify / reject decision writes an entry to the bottle's audit log, capturing the agent's justification and the operator's action.
  • The approval handlers in 0013 are deliberately no-ops: an "approved" response is delivered to the tool, but no host-side config change happens. PRDs 0014–0016 wire in the actual remediations.

Non-goals

  • Any actual remediation: SIGHUP reload, pipelock restart, bottle rebuild are all out of scope for 0013 (covered by 0014, 0015, 0016 respectively).
  • TUI polish beyond the minimum viable dashboard. For v1, a list plus approve/reject is enough.
  • Proactive operator-initiated routes edit <bottle> / pipelock edit <bottle> verbs — they live with the remediation PRDs that own those components.

Scope

In scope

  • A per-bottle MCP sidecar container on the bottle's internal network.
  • MCP tool definitions for cred-proxy-block, pipelock-block, capability-block (input schemas as defined in PRD 0012 Stuck categories).
  • Tool output: {status: "approved"|"modified"|"rejected", notes: "..."}.
  • A read-only mount at /etc/claude-bottle/current-config/ in the agent container exposing the current routes.json, pipelock allowlist, and Dockerfile.
  • A host-mounted per-bottle proposal queue at ~/.claude-bottle/queue/<slug>/ (file-per-proposal, with metadata and proposed file content).
  • A claude-bottle dashboard (or similarly named) TUI that lists running bottles and pending proposals across all of them; supports approve, modify-then-approve, and reject-with-reason for each pending proposal.
  • Audit log files at ~/.claude-bottle/audit/cred-proxy-<slug>.log and ~/.claude-bottle/audit/pipelock-<slug>.log with the agreed-upon format (timestamp, diff before/after, justification text, operator action with notes). Entries are written by the supervisor on each approve/modify/reject decision. (capability-block has no separate audit log — capability changes are captured by the bottle's rebuild record / git history.)
  • Bottle lifecycle script changes to launch the MCP sidecar alongside the other sidecars and mount the read-only current-config directory.
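To illustrate the tool output shape listed above, the following Python sketch models `{status, notes}` as a small validated type. Only the three status values and the `notes` field come from this PRD; the names `ToolResult` and `to_dict` are hypothetical and chosen for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Literal, get_args

# The three statuses named in the scope list.
Status = Literal["approved", "modified", "rejected"]


@dataclass(frozen=True)
class ToolResult:
    """Hypothetical container for the {status, notes} tool output."""

    status: Status
    notes: str = ""

    def __post_init__(self) -> None:
        # Type hints are not enforced at runtime, so check explicitly.
        if self.status not in get_args(Status):
            raise ValueError(f"unknown status: {self.status!r}")

    def to_dict(self) -> dict:
        """Return the plain dict that would be sent back to the agent."""
        return asdict(self)
```

Rejecting unknown statuses at construction time keeps a typo in a handler from reaching the agent as an unrecognised response.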

Out of scope

  • The remediation engines themselves (0014, 0015, 0016).
  • Proactive operator-initiated routes edit <bottle> / pipelock edit <bottle> verbs.

Proposed Design

New services / components

  • MCP sidecar. New per-bottle container on the bottle's internal network. Exposes the three tools to the agent. On a tool call: validates the proposed file syntactically (valid JSON for routes.json, parseable Dockerfile, etc.), writes the proposal to the queue, and holds the tool-call connection open until the supervisor responds. Returns {status, notes} to the agent on response.
  • Read-only current-config mount. /etc/claude-bottle/current-config/ in the agent container exposes routes.json, the pipelock allowlist, and the agent Dockerfile from the host. Read-only — the agent proposes changes via the tool call, never by writing the file directly.
  • Proposal queue. Per-bottle directory under ~/.claude-bottle/queue/<slug>/ on the host. One file per pending proposal with {id, tool, proposed_file, justification, arrival_timestamp, current_file_hash, bottle_slug}.
  • Minimal TUI dashboard. Lists running bottles and pending proposals. For each proposal: shows current vs. proposed diff and justification. Operator actions: approve / modify-then-approve / reject-with-reason. Stdlib only (curses) unless that proves painful.
  • Audit log format. Append-only files at ~/.claude-bottle/audit/<component>-<slug>.log. Each entry: timestamp, diff before/after, agent justification (if from a tool call), operator action + notes. Defines the format; the per-component PRDs (0014, 0015) fill in real entries.
  • No-op approval handlers. Each tool's approve path in 0013 writes an audit entry and returns {status: "approved"} to the agent but doesn't actually change any config. 0014 / 0015 / 0016 replace these with real handlers.
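The proposal queue described above could be sketched as follows. This is a minimal sketch under assumptions: the field names come from the queue bullet, but the JSON encoding, the `<id>.json` file naming, the SHA-256 hash, and the helper names `enqueue_proposal` and `list_pending` are illustrative choices that this PRD does not fix.

```python
import hashlib
import json
import os
import time
import uuid
from pathlib import Path


def enqueue_proposal(queue_root: Path, slug: str, tool: str,
                     proposed_file: str, justification: str,
                     current_file: str) -> Path:
    """Write one pending proposal as its own file and return its path."""
    proposal = {
        "id": uuid.uuid4().hex,
        "tool": tool,
        "proposed_file": proposed_file,
        "justification": justification,
        "arrival_timestamp": time.time(),
        # Assumption: SHA-256 of the config the agent saw when proposing.
        "current_file_hash": hashlib.sha256(current_file.encode()).hexdigest(),
        "bottle_slug": slug,
    }
    bottle_dir = queue_root / slug
    bottle_dir.mkdir(parents=True, exist_ok=True)
    final_path = bottle_dir / f"{proposal['id']}.json"
    tmp_path = final_path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(proposal, indent=2), encoding="utf-8")
    # Rename into place so a reader never sees a half-written proposal.
    os.replace(tmp_path, final_path)
    return final_path


def list_pending(queue_root: Path) -> list[dict]:
    """Load pending proposals across all bottles, oldest first."""
    pending = [
        json.loads(path.read_text(encoding="utf-8"))
        for path in queue_root.glob("*/*.json")
    ]
    return sorted(pending, key=lambda p: p["arrival_timestamp"])
```

Storing `current_file_hash` would let the dashboard notice that the config changed between the agent's proposal and the operator's decision, although this PRD does not say how a mismatch should be handled.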

Existing code touched

  • Bottle lifecycle scripts — launch the MCP sidecar alongside other sidecars; mount /etc/claude-bottle/current-config/ read-only into the agent container.
  • cli.py — adds the dashboard subcommand.
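A rough sketch of how the dashboard subcommand might be registered, assuming cli.py uses `argparse` with subparsers (the PRD does not say which argument parser cli.py uses). The names `build_parser`, `run_dashboard`, and `main` are hypothetical, and the handler is a placeholder that does not start a TUI.

```python
import argparse


def run_dashboard(args: argparse.Namespace) -> int:
    """Placeholder handler; a real one would start the curses TUI."""
    print("dashboard: not implemented in this sketch")
    return 0


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single 'dashboard' subcommand."""
    parser = argparse.ArgumentParser(prog="claude-bottle")
    subcommands = parser.add_subparsers(dest="command", required=True)
    dashboard = subcommands.add_parser(
        "dashboard",
        help="list pending proposals and approve, modify, or reject them",
    )
    # Each subcommand records its own handler so main() stays generic.
    dashboard.set_defaults(handler=run_dashboard)
    return parser


def main(argv=None) -> int:
    """Parse argv and dispatch to the selected subcommand's handler."""
    args = build_parser().parse_args(argv)
    return args.handler(args)
```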

Data model changes

  • A per-bottle pending-proposal queue (see above).
  • Per-bottle audit log files (see above).
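The audit log and the no-op approve path could look roughly like the sketch below. The entry fields (timestamp, diff, justification, operator action and notes) and the `<component>-<slug>.log` naming follow this PRD; the JSON-lines encoding, the unified-diff representation, and the function names `append_audit_entry` and `noop_approve` are assumptions made for illustration.

```python
import difflib
import json
from datetime import datetime, timezone
from pathlib import Path


def append_audit_entry(audit_root: Path, component: str, slug: str,
                       before: str, after: str, justification: str,
                       action: str, notes: str = "") -> Path:
    """Append one decision to <component>-<slug>.log and return the path."""
    diff = "".join(difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile="current", tofile="proposed",
    ))
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "diff": diff,
        "justification": justification,
        "operator_action": action,
        "operator_notes": notes,
    }
    audit_root.mkdir(parents=True, exist_ok=True)
    log_path = audit_root / f"{component}-{slug}.log"
    # Mode "a" only ever appends, matching the append-only requirement.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return log_path


def noop_approve(audit_root: Path, component: str, slug: str,
                 before: str, after: str, justification: str,
                 notes: str = "") -> dict:
    """Record the approval but change no config, as 0013 specifies."""
    append_audit_entry(audit_root, component, slug, before, after,
                       justification, action="approved", notes=notes)
    return {"status": "approved", "notes": notes}
```

One JSON object per line keeps multi-line diffs inside a single record, so entries can be read back without a custom parser.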

External dependencies

  • An MCP server library / framework. Pick the lightest option that lets the sidecar advertise three tools with structured input/output schemas; do not adopt a heavier MCP framework than the three tools justify.
  • A TUI library is a maybe — only if stdlib can't carry the dashboard experience. Default to no new dependency.

Open questions

  • MCP sidecar placement: own container vs. fold into cred-proxy. v1 plan is its own container. Folding saves one sidecar per bottle but mixes the credential plane and the supervise plane. Worth deciding once the sidecar's actual line count is known.
  • Multiple pending proposals from the same bottle. If the agent calls a second tool before the first is answered: replace, append, or refuse? Append feels safest; replace is wrong (loses context); refuse forces the agent to handle a new error mode. Also: can different tools from the same bottle be pending simultaneously?
  • Proposal validation strictness. The sidecar validates syntactically. Should it also do a deeper check — e.g. does the proposed routes.json introduce a route the operator already rejected this session? Probably no for v1; the operator is the gate.
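For the syntactic level of validation discussed above, a minimal sketch might look like this. The function names are hypothetical. The Dockerfile check is deliberately crude: it only verifies that the first instruction is FROM (or an ARG preceding it) and is not a real Dockerfile parser, so line continuations and heredocs are not handled.

```python
import json


def validate_routes_json(text: str) -> list[str]:
    """Return a list of syntax problems; an empty list means it parsed."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"routes.json is not valid JSON: {exc.msg} (line {exc.lineno})"]
    return []


def validate_dockerfile(text: str) -> list[str]:
    """Crude structural check, not a full parse (see caveats above)."""
    # Collect the first word of every non-blank, non-comment line.
    instructions = [
        line.split(None, 1)[0].upper()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    if not instructions:
        return ["Dockerfile contains no instructions"]
    if instructions[0] not in {"FROM", "ARG"}:
        return [f"Dockerfile starts with {instructions[0]}, expected FROM"]
    if "FROM" not in instructions:
        return ["Dockerfile has no FROM instruction"]
    return []
```

Returning a list of problems instead of raising would let the sidecar report every issue to the agent in a single tool response, without involving the operator for proposals that cannot parse.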

References

  • PRD 0010 — cred-proxy.
  • PRD 0012 — stuck-agent recovery flow overview.
  • PRD 0014 / 0015 / 0016 — remediation engines that plug into the foundation laid here.