Draft a PRD that turns the dashboard into the operator's single surface: it collapses today's two-terminal workflow (one for `./cli.py start`, one for `./cli.py dashboard`) into a single dashboard invocation that can spin up new agents, re-attach to ones it already spun up, and explicitly stop them. Picks the "handoff" mechanism from `docs/research/claude-code-pane-in-dashboard.md` (curses.endwin → docker exec -it claude → stdscr.refresh) and crucially decouples the bottle's lifetime from any single claude session: exit claude → back to dashboard with the bottle still running; quit dashboard → tear down every bottle the dashboard owns. Sized into 5 chunks (refactor → picker + new-agent → re-attach → explicit stop → quit-cleanup). Seven open questions called out, the biggest being modal-vs-drop-and-resume for the preflight Y/N inside curses.
PRD 0020: Start and attach to agents from inside the dashboard
- Status: Draft
- Author: didericis
- Created: 2026-05-26
Summary
Today the dashboard is read-only: it surfaces pending proposals
and active agents (PRD 0019) but can't start an agent or
re-enter one. The operator's path is split — they launch
agents from one terminal (./cli.py start <name>), and watch
them from another (./cli.py dashboard).
This PRD collapses that split. The dashboard becomes the
operator's single surface: pressing a key opens an agent picker,
selecting one runs the existing prepare → preflight → launch
flow inside a curses-friendly variant, and on yield drops to a
full-screen docker exec -it … claude session (the "handoff"
shape from docs/research/claude-code-pane-in-dashboard.md).
When the operator exits claude, the dashboard re-renders with
the now-running bottle visible in the agents pane.
Crucially, the bottle's lifetime is owned by the dashboard process, not by the individual claude session. Exit claude → back to dashboard, bottle still running. Start another agent → two bottles up at once. Quit the dashboard → all dashboard-launched bottles tear down.
Problem
Two real frictions today:
- Two terminals for one workflow. The dashboard is the right shape to watch agents (proposals queue, status updates, operator-edit verbs) but the wrong shape to start them. Today you open a second terminal for that. In parallel use (3–5 bottles), the operator has 5+ terminals open and the dashboard's "active agents" pane is hopelessly behind reality because they just spawned three in a row.
- ./cli.py start ties the bottle to a single claude session. The start command's ExitStack brings the bottle up, runs claude, and tears down on Ctrl-D. That is fine for a one-shot session but wrong for "let me bounce in and out of this bottle a few times while triaging proposals." Today the only way to re-enter a bottle after exiting claude is to start a fresh one and lose all in-bottle state.
The dashboard already discovers active bottles, scopes operator-edit verbs to a selected agent (PRD 0019), and captures full-merged logs per bottle (PRD 0018). It already wants to be the primary surface. This PRD finishes that.
Goals / Success Criteria
- From inside ./cli.py dashboard, pressing n (new) opens an agent picker listing every agent defined in the manifest. Selecting one runs prepare → preflight → launch.
- The preflight Y/N summary renders cleanly, either as a curses modal or via curses.endwin() → text-mode prompt → restore, matching the existing editor-flow pattern.
- On launch success, the dashboard performs a handoff (option 1 from the research doc): curses.endwin() → docker exec -it claude-bottle-<slug> claude --dangerously-skip-permissions → on exit, stdscr.refresh() and re-render with the new bottle in the agents pane.
- The bottle's lifetime is owned by the dashboard process, NOT by any single claude session. Exiting claude (Ctrl-D, /exit) returns to the dashboard with the bottle still running. The operator can start more agents and re-enter previous ones.
- Pressing Enter on a selected row in the agents pane re-attaches to that agent's bottle via the same handoff: it drops to full-screen claude and returns on exit.
- Pressing x (or similar; keybinding decided in design) on a selected agent stops just that bottle (compose down + state cleanup) without quitting the dashboard.
- Quitting the dashboard (q) tears down every bottle the dashboard started, unless something has explicitly preserved the state (capability-block, crash). Matches today's start.py teardown semantics.
Non-goals
- A pane that hosts the claude TUI alongside proposals. The embedded-emulator option from the research doc is out of scope. The handoff (option 1) is the v1; option 2 is a separate PRD if and when handoff is observably insufficient.
- Adopting bottles started by an out-of-dashboard ./cli.py start invocation. Those have their own ExitStack owner and the dashboard treats them as read-only-watch (already does today). Re-attach only applies to bottles the current dashboard process started.
- Persisting a "bottle pool" across dashboard runs. When the dashboard quits, its bottles go. Resume across dashboard invocations is ./cli.py resume <identity>, which is unchanged.
- Multi-window UI. Single curses window, two existing panes (proposals + agents); the agent picker is a modal, not a third pane.
- Removing ./cli.py start. Stays as the script-friendly / legacy entry point. The dashboard is the new default.
Scope
In scope
- Manifest-driven agent picker (curses modal): list view with j/k navigation + Enter to confirm, Esc to abort.
- Preflight rendering inside the dashboard's curses surface (modal or drop-and-resume — picked in design).
- A new _dashboard_start_flow that wraps prepare + preflight + launch and returns a DockerBottle handle the dashboard retains alongside its pending and agents lists.
- A bottles: dict[slug, DockerBottle] map on the main loop that owns every dashboard-launched handle. ExitStack tears them all down on dashboard exit.
- Enter on an agents-pane row → re-attach handoff (docker exec -it claude into the existing container).
- x (or similar) on an agents-pane row → explicit per-bottle stop without quitting.
- q (existing quit key) → tear down all dashboard-launched bottles before returning.
Out of scope
- Changes to ./cli.py start itself. It keeps its current shape; the dashboard reuses its internal pieces (backend.prepare / backend.launch) without reaching through the CLI layer.
- Changes to backend.launch's context-manager contract; the dashboard's bottle map just holds the context-manager-yielded Bottle and calls __exit__ on quit / explicit stop.
- New manifest fields. The picker reads what's already there.
- Adopting non-dashboard bottles into the dashboard's owned set.
Proposed design
Bottle ownership
Today's flow:
./cli.py start agent
  └─ with backend.launch(plan) as bottle:       ← bottle alive while inside `with`
       bottle.exec_claude([...], tty=True)      ← blocks until claude exits
     # context exits → compose down → state cleanup
The proposed dashboard-owned flow:
./cli.py dashboard
  └─ stack = ExitStack()
     bottles: dict[str, DockerBottle] = {}
     # operator presses `n`, picks agent
     ctx = backend.launch(plan)
     bottle = stack.enter_context(ctx)          ← bottle stays alive
     bottles[plan.slug] = bottle
     # operator interacts via:
     curses.endwin()
     bottle.exec_claude([...], tty=True)        ← blocks; returns on Ctrl-D
     stdscr.refresh()
     # bottle is STILL ALIVE here; only the claude process exited
     # ... operator does other things, eventually `q`:
     stack.close()                              ← tears down every bottle
The shift is one line of code semantically but the change in operator experience is real: bottles outlive any single claude session.
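The ownership shift can be tried out without Docker. The sketch below is illustrative only: fake_launch and run_session are hypothetical stand-ins for backend.launch(plan) and bottle.exec_claude(...), and the only real API used is contextlib.ExitStack.

```python
from contextlib import ExitStack, contextmanager

events = []  # records lifecycle events so the ordering is visible


@contextmanager
def fake_launch(slug):
    """Hypothetical stand-in for backend.launch(plan): up on enter, down on exit."""
    events.append(f"up:{slug}")
    try:
        yield slug
    finally:
        events.append(f"down:{slug}")


def run_session(bottle):
    """Hypothetical stand-in for bottle.exec_claude(...): returns when claude exits."""
    events.append(f"claude-exit:{bottle}")


with ExitStack() as stack:
    bottles = {}
    for slug in ("researcher-1", "implementer-1"):
        # enter_context keeps the bottle alive past this loop iteration
        bottles[slug] = stack.enter_context(fake_launch(slug))
        run_session(bottles[slug])
    # Both sessions have ended, but no bottle has been torn down yet.
    none_down_after_sessions = not any(e.startswith("down:") for e in events)
# Leaving the `with` block plays the role of stack.close() on `q`.
```

ExitStack unwinds in last-in, first-out order, so under this sketch the most recently started bottle is torn down first on quit.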
Agent picker
Pressing n opens a centered modal listing every agent name
from spec.manifest.agents. j/k navigates; Enter selects; Esc
aborts. Width is the longest name + bottle name + a column for
"already running?" so the operator can see at a glance whether
picking an agent starts a fresh one (different slug suffix) or
not.
┌─ start agent ───────────────────────────┐
│   implementer   dev       (running)     │
│ > researcher    dev                     │
│   triage-bot    sandbox                 │
└─ Enter: start  Esc: cancel ─────────────┘
Starting an agent that already has a running bottle is allowed
— each start mints a fresh slug — but the picker surfaces the
already-running state so the operator doesn't accidentally
double-launch.
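One way to keep the picker testable is to separate key handling and row formatting from the curses drawing. The sketch below assumes the manifest can be flattened to (agent name, bottle name) pairs and that Enter arrives as key code 10 or 13 and Esc as 27; picker_step and picker_rows are hypothetical names, not existing functions.

```python
def picker_step(index, key, count):
    """Return (new_index, action) for one keypress in the picker.

    action is None while navigating, "select" on Enter, "cancel" on Esc.
    """
    if key == 27:  # Esc always aborts
        return index, "cancel"
    if count == 0:  # nothing to navigate or select
        return index, None
    if key == ord("j"):
        return min(index + 1, count - 1), None
    if key == ord("k"):
        return max(index - 1, 0), None
    if key in (10, 13):  # Enter as LF or CR
        return index, "select"
    return index, None


def picker_rows(agents, running, selected):
    """Format one text row per (agent_name, bottle_name) pair."""
    name_w = max((len(a) for a, _ in agents), default=0)
    bottle_w = max((len(b) for _, b in agents), default=0)
    rows = []
    for i, (agent, bottle) in enumerate(agents):
        marker = ">" if i == selected else " "
        status = "(running)" if agent in running else ""
        row = f"{marker} {agent:<{name_w}}  {bottle:<{bottle_w}}  {status}"
        rows.append(row.rstrip())
    return rows


agents = [("implementer", "dev"), ("researcher", "dev"), ("triage-bot", "sandbox")]
rows = picker_rows(agents, running={"implementer"}, selected=1)
```

The curses modal would then only need to draw rows and feed each getch() result through picker_step.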
Preflight Y/N
Two viable shapes:
Modal — render the preflight summary lines (agent / env / skills / bottle / git gate / egress) in a centered curses
modal with [y/N] at the bottom. Capture the next keypress.
Drop-and-resume — curses.endwin(), print the preflight to
stderr, read y/N from stdin, restore curses. Matches the
editor-flow + handoff pattern; lower implementation cost.
Lean toward modal for the y/N because it doesn't flash the terminal between dashboard frames. Drop-and-resume is acceptable if modal proves fiddly.
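For comparison, the drop-and-resume fallback could look roughly like the sketch below. It assumes the summary is already available as a list of strings; parse_yes and prompt_yes_drop_and_resume are hypothetical names. The curses calls used (curses.endwin and a window refresh afterwards) are the standard way to leave and re-enter curses mode temporarily.

```python
import curses
import sys


def parse_yes(answer):
    """Default-deny parse for a [y/N] prompt: only y / yes (any case) confirms."""
    return answer.strip().lower() in ("y", "yes")


def prompt_yes_drop_and_resume(stdscr, summary_lines, *, read_line=input, out=sys.stderr):
    """Leave curses, ask y/N in text mode, then restore the dashboard screen."""
    curses.endwin()  # hand the terminal back to normal line mode
    try:
        for line in summary_lines:
            print(line, file=out)
        print("proceed? [y/N] ", end="", file=out, flush=True)
        try:
            return parse_yes(read_line())
        except EOFError:  # Ctrl-D at the prompt counts as "no"
            return False
    finally:
        stdscr.refresh()  # the first refresh after endwin() re-enters curses mode
```

The try/finally matters more than the prompt itself: if reading the answer fails, the dashboard still gets its screen back.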
Re-attach (Enter on agent)
Same handoff pattern the new-agent flow uses. The dashboard
already holds the DockerBottle for any slug it started —
bottle.exec_claude([...], tty=True) does the right docker exec -it claude … and returns on session exit. Re-attach is
"already-running" + the same exec call; the agent picker isn't
involved.
For agents the dashboard didn't start (read-only watch), Enter
is a no-op with a status hint ("dashboard didn't start this
bottle; resume with ./cli.py resume <identity> outside the
dashboard"). PRD-0019's selection model already differentiates
focus; this layer just gates the action.
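The gating described above can be sketched as a small dispatch function. This is an illustration under assumptions: handoff and on_enter are hypothetical names, the argv passed to exec_claude is a placeholder, and suspend / resume stand in for curses.endwin and stdscr.refresh so the logic runs without a terminal.

```python
def handoff(bottle, argv, *, suspend, resume):
    """Suspend the dashboard UI, run claude in the bottle, always restore the UI."""
    suspend()  # e.g. curses.endwin
    try:
        return bottle.exec_claude(argv, tty=True)  # blocks until the session exits
    finally:
        resume()  # e.g. stdscr.refresh


def on_enter(slug, bottles, argv, *, suspend, resume):
    """Handle Enter on an agents-pane row; returns a status hint or None."""
    bottle = bottles.get(slug)
    if bottle is None:
        # Read-only watch: this dashboard process did not start the bottle.
        return ("dashboard didn't start this bottle; resume with "
                "./cli.py resume <identity> outside the dashboard")
    handoff(bottle, argv, suspend=suspend, resume=resume)
    return None


class FakeBottle:
    """Test double that records exec_claude calls instead of running docker."""

    def __init__(self):
        self.calls = []

    def exec_claude(self, argv, tty=False):
        self.calls.append((list(argv), tty))
        return 0


trace = []
owned = {"researcher-1": FakeBottle()}
hooks = dict(suspend=lambda: trace.append("endwin"),
             resume=lambda: trace.append("refresh"))
hint_owned = on_enter("researcher-1", owned, ["claude"], **hooks)
hint_foreign = on_enter("someone-else", owned, ["claude"], **hooks)
```

Only the owned row triggers a suspend / resume pair; the foreign row returns the hint without touching the terminal.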
Explicit per-bottle stop
x on a selected dashboard-owned agent invokes
stack.pop_callback-style targeted teardown: take that bottle
out of the map, call its close() to tear down compose + state,
update the agents pane on the next refresh. Bottles the
dashboard didn't start (x on a read-only-watch row) → no-op
with a status hint.
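As far as the standard library goes, contextlib.ExitStack offers pop_all() and close() but no call that removes one registered context, so targeted teardown needs some extra structure. One possible shape, shown below as a sketch rather than the chosen design, gives each bottle its own inner ExitStack registered on the dashboard's outer stack. OwnedBottles and fake_launch are hypothetical names; fake_launch stands in for backend.launch(plan).

```python
from contextlib import ExitStack, contextmanager

log = []


@contextmanager
def fake_launch(slug):
    """Hypothetical stand-in for backend.launch(plan)."""
    log.append(f"up:{slug}")
    try:
        yield slug
    finally:
        log.append(f"down:{slug}")


class OwnedBottles:
    """Tracks dashboard-owned bottles; each gets its own inner ExitStack."""

    def __init__(self, outer):
        self._outer = outer
        self._entries = {}  # slug -> (bottle, inner ExitStack)

    def start(self, slug, ctx):
        inner = self._outer.enter_context(ExitStack())
        bottle = inner.enter_context(ctx)
        self._entries[slug] = (bottle, inner)
        return bottle

    def stop(self, slug):
        """Tear down one bottle; returns False for slugs we do not own."""
        entry = self._entries.pop(slug, None)
        if entry is None:
            return False
        entry[1].close()  # runs that bottle's teardown now; later closes are no-ops
        return True


with ExitStack() as outer:
    owned = OwnedBottles(outer)
    owned.start("a", fake_launch("a"))
    owned.start("b", fake_launch("b"))
    stopped = owned.stop("a")       # explicit `x` on bottle a
    not_owned = owned.stop("zzz")   # `x` on a read-only-watch row
    log_before_quit = list(log)
# Leaving the outer stack tears down whatever is still running (only b here).
```

Because a closed inner stack has no callbacks left, the outer stack's later close does not tear bottle a down a second time.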
Dashboard quit
q (existing) calls stack.close() before exit; every
dashboard-launched bottle goes through its normal teardown
(compose down + state settle). Preserve markers (capability-
block, crash) still keep state across teardown. The dashboard
process itself returns 0.
If the operator wants to keep bottles alive past dashboard
exit, the existing path is unchanged: launch them via
./cli.py start in a separate terminal. That ownership stays
out-of-band.
Implementation chunks
Sized for one PR each.
- Refactor _launch_bottle so the launch + exec_claude pieces are separable. Today's cli/start.py runs both inside one function. Extract prepare_with_preflight(spec, *, render_preflight, prompt_yes) and attach_claude(bottle, *, remote_control). The CLI's existing one-shot use binds them as before; the dashboard binds them with curses-aware render + prompt callables. No behavior change.
- Agent picker modal + new-agent flow. New key n opens the picker; prepare_with_preflight runs against the selected agent; on Y, backend.launch(plan) enters the dashboard's ExitStack; handoff invokes attach_claude.
- Re-attach via Enter on owned agents-pane row. Looks up the slug in the dashboard's bottles map; if present → handoff; else → status-line hint pointing at ./cli.py resume.
- Explicit per-bottle stop (x keybinding). Pop the bottle's close callback off the stack, call it, refresh.
- Quit-cleanup (q). Hook stack.close() into the normal return path. Document the "exiting dashboard tears down every bottle it started" contract in dashboard.py's module docstring.
Open questions
- Modal vs. drop-and-resume for preflight Y/N. Both work; modal is nicer if the curses geometry handling is straightforward. Pick during chunk 2 by prototyping the modal in ~30 lines and seeing if it looks right.
- Agent picker: text-filter typing? v1 is j/k navigation only. If the manifest has 20+ agents the picker gets noisy; add fzf-style filter input later if needed.
- What happens if attach_claude exits because the container died (not a clean claude exit, e.g., OOM, panic)? Today's _settle_state marks the bottle preserved for non-zero exit codes. The dashboard's re-render needs to notice the bottle is gone (compose down or container-not-running state) and surface a status line. Probably: transcript snapshot + mark preserved + remove from bottles map + status line "claude session for [slug] ended with exit N; preserved for resume".
- Double-start of the same agent. Allowed by design (slugs are unique per launch), but the picker should make it clear this is a "start a SECOND bottle" decision, not a "re-enter the first." Probably handled by showing the running-count in the picker row.
- Should q confirm before tearing down N running bottles? A 5-bottle dashboard with 5 in-flight sessions loses non-trivial state on accidental q. Probably yes: curses modal "quit and tear down N bottles? [y/N]". Skip confirmation when there are zero owned bottles.
- Race between handoff and 1s refresh tick. While the dashboard's stdscr.timeout is set, a key press fires the handoff and the dashboard sits in docker exec for minutes. discover_active_agents / discover_pending don't poll during that window, which is fine: the moment we call stdscr.refresh() after exec returns, the next loop iteration runs discovery and the panes reflect reality. Worth calling out in the design but no special handling needed.
- Multi-bottle resource use. Five bottles up means five compose projects: 5×(agent + pipelock + egress optional + git-gate optional + supervise optional) containers, plus 5×2 networks. On a 16-GiB host this is fine; on something smaller the operator might want a soft cap or a warning. Out of v1; flag for follow-up if it bites.
References
- PRD 0018: compose-per-instance lifecycle (the backend.launch context-manager contract this PRD layers against)
- PRD 0019: active-agents pane + selection model (the agents-pane row the re-attach + stop verbs hook into)
- docs/research/claude-code-pane-in-dashboard.md: option 1 (handoff) is what attach_claude implements here; options 2 / 3 are out of scope for this PRD
- claude_bottle/cli/start.py:_launch_bottle: the function chunk 1 extracts the prepare + attach pieces out of