# PRD 0020: Start and attach to agents from inside the dashboard

- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-26

## Summary

Today the dashboard is read-only: it surfaces pending proposals and active agents (PRD 0019) but can't *start* an agent or *re-enter* one. The operator's path is split: they launch agents from one terminal (`./cli.py start `) and watch them from another (`./cli.py dashboard`).

This PRD collapses that split. The dashboard becomes the operator's single surface: pressing a key opens an agent picker, selecting one runs the existing prepare → preflight → launch flow inside a curses-friendly variant, and on yield drops to a full-screen `docker exec -it … claude` session (the "handoff" shape from `docs/research/claude-code-pane-in-dashboard.md`). When the operator exits claude, the dashboard re-renders with the now-running bottle visible in the agents pane.

Crucially, the bottle's lifetime is owned by the *dashboard process*, not by the individual claude session. Exit claude → back to dashboard, bottle still running. Start another agent → two bottles up at once. Quit the dashboard → all dashboard-launched bottles tear down.

## Problem

Two real frictions today:

1. **Two terminals for one workflow.** The dashboard is the right shape to *watch* agents (proposals queue, status updates, operator-edit verbs) but it's the wrong shape to *start* them. Today you open a second terminal for that. In parallel use (3–5 bottles), the operator has 5+ terminals open and the dashboard's "active agents" pane is hopelessly behind reality because they just spawned three in a row.

2. **`./cli.py start` ties the bottle to a single claude session.** The start command's `ExitStack` brings the bottle up, runs claude, and tears down on Ctrl-D: fine for a one-shot session, wrong for "let me bounce in and out of this bottle a few times while triaging proposals."
   Today the only way to re-enter a bottle after exiting claude is to start a fresh one and lose all in-bottle state.

The dashboard already discovers active bottles, scopes operator-edit verbs to a selected agent (PRD 0019), and captures full-merged logs per bottle (PRD 0018). It already *wants* to be the primary surface. This PRD finishes that.

## Goals / Success Criteria

1. From inside `./cli.py dashboard`, pressing `n` (new) opens an agent picker listing every agent defined in the manifest. Selecting one runs `prepare → preflight → launch`.
2. The preflight Y/N summary renders cleanly, either as a curses modal or via `curses.endwin() → text-mode prompt → restore`, matching the existing editor-flow pattern.
3. On launch success, the dashboard performs a handoff (option 1 from the research doc): `curses.endwin()` → `docker exec -it claude-bottle- claude --dangerously-skip-permissions` → on exit, `stdscr.refresh()` and re-render with the new bottle in the agents pane.
4. The bottle's lifetime is owned by the dashboard process, NOT by any single claude session. Exiting claude (Ctrl-D, `/exit`) returns to the dashboard with the bottle still running. The operator can start more agents and re-enter previous ones.
5. Pressing Enter on a selected row in the agents pane re-attaches to that agent's bottle via the same handoff: it drops to full-screen claude and returns on exit.
6. Pressing `x` (or similar; keybinding decided in design) on a selected agent stops just that bottle (compose down + state cleanup) without quitting the dashboard.
7. Quitting the dashboard (`q`) tears down every bottle the dashboard started, unless something has explicitly preserved the state (capability-block, crash). Matches today's start.py teardown semantics.

## Non-goals

- **A pane that hosts the claude TUI alongside proposals.** The embedded-emulator option from the research doc is out of scope.
  The handoff (option 1) is the v1; option 2 is a separate PRD if and when handoff is observably insufficient.
- **Adopting bottles started by an out-of-dashboard `./cli.py start` invocation.** Those have their own ExitStack-owner and the dashboard treats them as read-only-watch (already does today). Re-attach only applies to bottles the *current dashboard process* started.
- **Persisting a "bottle pool" across dashboard runs.** When the dashboard quits, its bottles go. Resume across dashboard invocations is `./cli.py resume `, which is unchanged.
- **Multi-window UI.** Single curses window, two existing panes (proposals + agents); the agent picker is a modal, not a third pane.
- **Removing `./cli.py start`.** Stays as the script-friendly / legacy entry point. The dashboard is the new default.

## Scope

### In scope

- Manifest-driven agent picker (curses modal): list view with j/k navigation + Enter to confirm, Esc to abort.
- Preflight rendering inside the dashboard's curses surface (modal or drop-and-resume; picked in design).
- A new `_dashboard_start_flow` that wraps prepare + preflight + launch and returns a `DockerBottle` handle the dashboard retains alongside its `pending` and `agents` lists.
- A `bottles: dict[slug, DockerBottle]` map on the main loop that owns every dashboard-launched handle. ExitStack tears them all down on dashboard exit.
- `Enter` on an agents-pane row → re-attach handoff (docker exec -it claude into the existing container).
- `x` (or similar) on an agents-pane row → explicit per-bottle stop without quitting.
- `q` (existing quit key) → tear down all dashboard-launched bottles before returning.

### Out of scope

- Changes to `./cli.py start` itself. It keeps its current shape; the dashboard reuses its internal pieces (`backend.prepare` / `backend.launch`) without reaching through the CLI layer.
- Changes to `backend.launch`'s context-manager contract; the dashboard's bottle map just holds the context-manager-yielded Bottle and calls `__exit__` on quit / explicit stop.
- New manifest fields. The picker reads what's already there.
- Adopting non-dashboard bottles into the dashboard's owned set.

## Proposed design

### Bottle ownership

Today's flow:

```
./cli.py start agent
  └─ with backend.launch(plan) as bottle:         ← bottle alive while inside `with`
       bottle.exec_claude([...], tty=True)        ← blocks until claude exits
     # context exits → compose down → state cleanup
```

The proposed dashboard-owned flow:

```
./cli.py dashboard
  └─ stack = ExitStack()
     bottles: dict[str, DockerBottle] = {}

     # operator presses `n`, picks agent
     ctx = backend.launch(plan)
     bottle = stack.enter_context(ctx)            ← bottle stays alive
     bottles[plan.slug] = bottle

     # operator interacts via:
     curses.endwin()
     bottle.exec_claude([...], tty=True)          ← blocks; returns on Ctrl-D
     stdscr.refresh()
     # bottle is STILL ALIVE here: only the claude process exited

     # ... operator does other things, eventually `q`:
     stack.close()                                ← tears down every bottle
```

The shift is one line of code semantically but the change in operator experience is real: bottles outlive any single claude session.

### Agent picker

Pressing `n` opens a centered modal listing every agent name from `spec.manifest.agents`. j/k navigates; Enter selects; Esc aborts. Width is the longest name + bottle name + a column for "already running?" so the operator can see at a glance whether picking an agent starts a fresh one (different slug suffix) or not.

```
┌─ start agent ───────────────────────────┐
│   implementer   dev         (running)   │
│ > researcher    dev                     │
│   triage-bot    sandbox                 │
└─ Enter: start  Esc: cancel ─────────────┘
```

Starting an agent that already has a running bottle is allowed (each `start` mints a fresh slug), but the picker surfaces the already-running state so the operator doesn't accidentally double-launch.
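The picker's row layout and j/k navigation can be kept separate from curses, which makes them easy to test. The following is a minimal sketch under stated assumptions: `PickerRow`, `build_picker_rows`, `format_rows`, and `move_selection` are hypothetical names, and the manifest is assumed to reduce to a mapping of agent name to bottle name. The real shape of `spec.manifest.agents` may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PickerRow:
    agent: str    # agent name from the manifest
    bottle: str   # bottle name the agent is configured to use (assumed field)
    running: int  # number of dashboard-owned bottles already running this agent


def build_picker_rows(agents: dict[str, str], running_agents: list[str]) -> list[PickerRow]:
    """Pair every manifest agent with a count of its already-running bottles."""
    return [
        PickerRow(agent=name, bottle=bottle, running=running_agents.count(name))
        for name, bottle in agents.items()
    ]


def format_rows(rows: list[PickerRow], selected: int) -> list[str]:
    """Render rows as fixed-width text: cursor, agent, bottle, running marker."""
    name_w = max((len(r.agent) for r in rows), default=0)
    bottle_w = max((len(r.bottle) for r in rows), default=0)
    lines = []
    for i, row in enumerate(rows):
        cursor = ">" if i == selected else " "
        status = "(running)" if row.running else ""
        line = f"{cursor} {row.agent:<{name_w}}  {row.bottle:<{bottle_w}}  {status}"
        lines.append(line.rstrip())
    return lines


def move_selection(selected: int, key: str, count: int) -> int:
    """Apply j/k navigation, clamping to the list bounds."""
    if count == 0:
        return 0
    if key == "j":
        return min(selected + 1, count - 1)
    if key == "k":
        return max(selected - 1, 0)
    return selected


# Example: the three agents from the mock-up, with one already running.
rows = build_picker_rows(
    {"implementer": "dev", "researcher": "dev", "triage-bot": "sandbox"},
    running_agents=["implementer"],
)
for line in format_rows(rows, selected=1):
    print(line)
```

Keeping a running count rather than a boolean leaves room to show how many bottles of an agent are already up, should the picker need to distinguish a second launch from a first.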
### Preflight Y/N

Two viable shapes:

**Modal.** Render the preflight summary lines (`agent / env / skills / bottle / git gate / egress`) in a centered curses modal with `[y/N]` at the bottom. Capture the next keypress.

**Drop-and-resume.** `curses.endwin()`, print the preflight to stderr, read y/N from stdin, restore curses. Matches the editor-flow + handoff pattern; lower implementation cost.

Lean toward **modal** for the y/N because it doesn't flash the terminal between dashboard frames. Drop-and-resume is acceptable if modal proves fiddly.

### Re-attach (Enter on agent)

Same handoff pattern the new-agent flow uses. The dashboard already holds the `DockerBottle` for any slug it started: `bottle.exec_claude([...], tty=True)` does the right `docker exec -it claude …` and returns on session exit. Re-attach is "already-running" + the same exec call; the agent picker isn't involved.

For agents the dashboard didn't start (read-only watch), Enter is a no-op with a status hint ("dashboard didn't start this bottle; resume with `./cli.py resume ` outside the dashboard"). PRD-0019's selection model already differentiates focus; this layer just gates the action.

### Explicit per-bottle stop

`x` on a selected dashboard-owned agent invokes `stack.pop_callback`-style targeted teardown: take that bottle out of the map, call its `close()` to tear down compose + state, update the agents pane on the next refresh.

Bottles the dashboard didn't start (`x` on a read-only-watch row) → no-op with a status hint.

### Dashboard quit

`q` (existing) calls `stack.close()` before exit; every dashboard-launched bottle goes through its normal teardown (`compose down` + state settle). Preserve markers (capability-block, crash) still keep state across teardown. The dashboard process itself returns 0.

If the operator wants to keep bottles alive past dashboard exit, the existing path is unchanged: launch them via `./cli.py start` in a separate terminal. That ownership stays out-of-band.
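`contextlib.ExitStack` has no method for removing a single registered context, so the targeted teardown needs some structure of its own. One possible shape, sketched below, gives each bottle its own small `ExitStack` and registers that on the outer stack: closing the inner stack tears down one bottle, and closing the outer stack on `q` tears down whatever is left. `BottleOwner` and `fake_launch` are hypothetical names; `fake_launch` only stands in for `backend.launch(plan)` so the sketch runs without Docker.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def fake_launch(slug: str, log: list[str]):
    """Stand-in for backend.launch(plan): 'up' on enter, 'down' on exit."""
    log.append(f"up {slug}")
    try:
        yield slug
    finally:
        log.append(f"down {slug}")


class BottleOwner:
    """Owns every bottle started by one dashboard process (hypothetical)."""

    def __init__(self) -> None:
        self._outer = ExitStack()
        self._per_bottle: dict[str, ExitStack] = {}
        self.bottles: dict[str, object] = {}

    def start(self, slug: str, ctx) -> object:
        # Each bottle gets its own ExitStack, itself registered on the outer
        # stack, so it can be closed alone or with everything else on quit.
        inner = self._outer.enter_context(ExitStack())
        bottle = inner.enter_context(ctx)
        self._per_bottle[slug] = inner
        self.bottles[slug] = bottle
        return bottle

    def stop(self, slug: str) -> bool:
        """Tear down one owned bottle; return False for bottles we did not start."""
        inner = self._per_bottle.pop(slug, None)
        if inner is None:
            return False
        self.bottles.pop(slug, None)
        inner.close()  # runs only this bottle's teardown
        return True

    def quit(self) -> None:
        """Tear down every remaining bottle (the `q` path)."""
        # Inner stacks already closed by stop() are empty, so closing them
        # again through the outer stack does nothing.
        self._outer.close()
        self._per_bottle.clear()
        self.bottles.clear()


log: list[str] = []
owner = BottleOwner()
owner.start("a", fake_launch("a", log))
owner.start("b", fake_launch("b", log))
owner.stop("a")   # `x` on bottle a
owner.quit()      # `q` tears down bottle b
print(log)
```

The `False` return from `stop` maps onto the read-only-watch case: the caller can show the status hint instead of attempting a teardown.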
## Implementation chunks

Sized for one PR each.

1. **Refactor `_launch_bottle` so the launch + exec_claude pieces are separable.** Today's `cli/start.py` runs both inside one function. Extract `prepare_with_preflight(spec, *, render_preflight, prompt_yes)` and `attach_claude(bottle, *, remote_control)`. The CLI's existing one-shot use binds them as before; the dashboard binds them with curses-aware render + prompt callables. No behavior change.
2. **Agent picker modal + new-agent flow.** New key `n` opens the picker; `prepare_with_preflight` runs against the selected agent; on Y, `backend.launch(plan)` enters the dashboard's ExitStack; handoff invokes `attach_claude`.
3. **Re-attach via Enter on owned agents-pane row.** Looks up the slug in the dashboard's `bottles` map; if present → handoff; else → status-line hint pointing at `./cli.py resume`.
4. **Explicit per-bottle stop (`x` keybinding).** Pop the bottle's `close` callback off the stack, call it, refresh.
5. **Quit-cleanup (`q`).** Hook `stack.close()` into the normal return path. Document the "exiting dashboard tears down every bottle it started" contract in `dashboard.py`'s module docstring.

## Open questions

1. **Modal vs. drop-and-resume for preflight Y/N.** Both work; modal is nicer if the curses geometry handling is straightforward. Pick during chunk 2 by prototyping the modal in ~30 lines and seeing if it looks right.
2. **Agent picker: text-filter typing?** v1 is j/k navigation only. If the manifest has 20+ agents the picker gets noisy; add fzf-style filter input later if needed.
3. **What happens if `attach_claude` exits because the container died** (not a clean claude exit; e.g., OOM, panic)? Today's `_settle_state` marks the bottle preserved for non-zero exit codes. The dashboard's re-render needs to notice the bottle is gone (compose down or container-not-running state) and surface a status line.
   Probably: transcript snapshot + mark preserved + remove from `bottles` map + status line "claude session for [slug] ended with exit N; preserved for resume".
4. **Double-start of the same agent.** Allowed by design (slugs are unique per launch), but the picker should make it clear this is a "start a SECOND bottle" decision, not a "re-enter the first." Probably handled by showing the running-count in the picker row.
5. **Should `q` confirm before tearing down N running bottles?** A 5-bottle dashboard with 5 in-flight sessions loses non-trivial state on accidental `q`. Probably yes: curses modal "quit and tear down N bottles? [y/N]". Skip confirmation when there are zero owned bottles.
6. **Race between handoff and 1s refresh tick.** While the dashboard's `stdscr.timeout` is set, a key press fires the handoff and the dashboard sits in `docker exec` for minutes. `discover_active_agents` / `discover_pending` don't poll during that window, which is fine: the moment we `stdscr.refresh()` after exec returns, the next loop iter runs discovery and the panes reflect reality. Worth calling out in the design but no special handling needed.
7. **Multi-bottle resource use.** Five bottles up means five compose projects: 5×(agent + pipelock + egress optional + git-gate optional + supervise optional) containers, plus 5×2 networks. On a 16-GiB host this is fine; on something smaller the operator might want a soft cap or a warning. Out of v1; flag for follow-up if it bites.

## References

- PRD 0018: compose-per-instance lifecycle (the `backend.launch` context-manager contract this PRD layers against)
- PRD 0019: active-agents pane + selection model (the agents-pane row the re-attach + stop verbs hook into)
- `docs/research/claude-code-pane-in-dashboard.md`: option 1 (handoff) is what `attach_claude` implements here; options 2 / 3 are out of scope for this PRD
- `claude_bottle/cli/start.py:_launch_bottle`: the function chunk 1 extracts the prepare + attach pieces out of