docs(prd-0020): start + attach to agents from the dashboard
Draft a PRD that turns the dashboard into the operator's single surface — collapses today's two-terminal workflow (one for `./cli.py start`, one for `./cli.py dashboard`) into a single dashboard invocation that can spin up new agents, re-attach to ones it already spun up, and explicitly stop them. Picks the "handoff" mechanism from `docs/research/claude-code- pane-in-dashboard.md` (curses.endwin → docker exec -it claude → stdscr.refresh) and crucially decouples the bottle's lifetime from any single claude session: exit claude → back to dashboard with the bottle still running; quit dashboard → tear down every bottle the dashboard owns. Sized into 5 chunks (refactor → picker + new-agent → re-attach → explicit stop → quit-cleanup). Seven open questions called out, the biggest being modal-vs-drop-and-resume for the preflight Y/N inside curses.
This commit is contained in:
@@ -0,0 +1,336 @@
|
|||||||
|
# PRD 0020: Start and attach to agents from inside the dashboard
|
||||||
|
|
||||||
|
- **Status:** Draft
|
||||||
|
- **Author:** didericis
|
||||||
|
- **Created:** 2026-05-26
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
Today the dashboard is read-only: it surfaces pending proposals
|
||||||
|
and active agents (PRD 0019) but can't *start* an agent or
|
||||||
|
*re-enter* one. The operator's path is split — they launch
|
||||||
|
agents from one terminal (`./cli.py start <name>`), and watch
|
||||||
|
them from another (`./cli.py dashboard`).
|
||||||
|
|
||||||
|
This PRD collapses that split. The dashboard becomes the
|
||||||
|
operator's single surface: pressing a key opens an agent picker,
|
||||||
|
selecting one runs the existing prepare → preflight → launch
|
||||||
|
flow inside a curses-friendly variant, and on yield drops to a
|
||||||
|
full-screen `docker exec -it … claude` session (the "handoff"
|
||||||
|
shape from `docs/research/claude-code-pane-in-dashboard.md`).
|
||||||
|
When the operator exits claude, the dashboard re-renders with
|
||||||
|
the now-running bottle visible in the agents pane.
|
||||||
|
|
||||||
|
Crucially, the bottle's lifetime is owned by the *dashboard
|
||||||
|
process*, not by the individual claude session. Exit claude →
|
||||||
|
back to dashboard, bottle still running. Start another agent →
|
||||||
|
two bottles up at once. Quit the dashboard → all dashboard-
|
||||||
|
launched bottles tear down.
|
||||||
|
|
||||||
|
## Problem
|
||||||
|
|
||||||
|
Two real frictions today:
|
||||||
|
|
||||||
|
1. **Two terminals for one workflow.** The dashboard is the
|
||||||
|
right shape to *watch* agents — proposals queue, status
|
||||||
|
updates, operator-edit verbs — but it's the wrong shape to
|
||||||
|
*start* them. Today you open a second terminal for that. In
|
||||||
|
parallel use (3–5 bottles), the operator has 5+ terminals
|
||||||
|
open and the dashboard's "active agents" pane is hopelessly
|
||||||
|
behind reality because they just spawned three in a row.
|
||||||
|
|
||||||
|
2. **`./cli.py start` ties the bottle to a single claude
|
||||||
|
session.** The start command's `ExitStack` brings the bottle
|
||||||
|
up, runs claude, and tears down on Ctrl-D — fine for a one-
|
||||||
|
shot session, wrong for "let me bounce in and out of this
|
||||||
|
bottle a few times while triaging proposals." Today the only
|
||||||
|
way to re-enter a bottle after exiting claude is to start a
|
||||||
|
fresh one and lose all in-bottle state.
|
||||||
|
|
||||||
|
The dashboard already discovers active bottles, scopes
|
||||||
|
operator-edit verbs to a selected agent (PRD 0019), and
|
||||||
|
captures full-merged logs per bottle (PRD 0018). It already
|
||||||
|
*wants* to be the primary surface. This PRD finishes that.
|
||||||
|
|
||||||
|
## Goals / Success Criteria
|
||||||
|
|
||||||
|
1. From inside `./cli.py dashboard`, pressing `n` (new) opens
|
||||||
|
an agent picker listing every agent defined in the manifest.
|
||||||
|
Selecting one runs `prepare → preflight → launch`.
|
||||||
|
2. The preflight Y/N summary renders cleanly — either as a
|
||||||
|
curses modal or via `curses.endwin() → text-mode prompt
|
||||||
|
→ restore`, matching the existing editor-flow pattern.
|
||||||
|
3. On launch success, the dashboard performs a handoff (option
|
||||||
|
1 from the research doc): `curses.endwin()` → `docker exec
|
||||||
|
-it claude-bottle-<slug> claude --dangerously-skip-permissions`
|
||||||
|
→ on exit, `stdscr.refresh()` and re-render with the new
|
||||||
|
bottle in the agents pane.
|
||||||
|
4. The bottle's lifetime is owned by the dashboard process, NOT
|
||||||
|
by any single claude session. Exiting claude (Ctrl-D, `/exit`)
|
||||||
|
returns to the dashboard with the bottle still running. The
|
||||||
|
operator can start more agents and re-enter previous ones.
|
||||||
|
5. Pressing Enter on a selected row in the agents pane re-
|
||||||
|
attaches to that agent's bottle via the same handoff — drops
|
||||||
|
to full-screen claude, returns on exit.
|
||||||
|
6. Pressing `x` (or similar — keybinding decided in design)
|
||||||
|
on a selected agent stops just that bottle (compose down +
|
||||||
|
state cleanup) without quitting the dashboard.
|
||||||
|
7. Quitting the dashboard (`q`) tears down every bottle the
|
||||||
|
dashboard started, unless something has explicitly preserved
|
||||||
|
the state (capability-block, crash). Matches today's
|
||||||
|
start.py teardown semantics.
|
||||||
|
|
||||||
|
## Non-goals
|
||||||
|
|
||||||
|
- **A pane that hosts the claude TUI alongside proposals.** The
|
||||||
|
embedded-emulator option from the research doc is out of
|
||||||
|
scope. The handoff (option 1) is the v1; option 2 is a
|
||||||
|
separate PRD if and when handoff is observably insufficient.
|
||||||
|
- **Adopting bottles started by an out-of-dashboard `./cli.py
|
||||||
|
start` invocation.** Those have their own ExitStack-owner and
|
||||||
|
the dashboard treats them as read-only-watch (already does
|
||||||
|
today). Re-attach only applies to bottles the *current
|
||||||
|
dashboard process* started.
|
||||||
|
- **Persisting a "bottle pool" across dashboard runs.** When
|
||||||
|
the dashboard quits, its bottles go. Resume across dashboard
|
||||||
|
invocations is `./cli.py resume <identity>`, which is
|
||||||
|
unchanged.
|
||||||
|
- **Multi-window UI.** Single curses window, two existing
|
||||||
|
panes (proposals + agents); the agent picker is a modal, not
|
||||||
|
a third pane.
|
||||||
|
- **Removing `./cli.py start`.** Stays as the script-friendly /
|
||||||
|
legacy entry point. The dashboard is the new default.
|
||||||
|
|
||||||
|
## Scope
|
||||||
|
|
||||||
|
### In scope
|
||||||
|
|
||||||
|
- Manifest-driven agent picker (curses modal): list view with
|
||||||
|
j/k navigation + Enter to confirm, Esc to abort.
|
||||||
|
- Preflight rendering inside the dashboard's curses surface
|
||||||
|
(modal or drop-and-resume — picked in design).
|
||||||
|
- A new `_dashboard_start_flow` that wraps prepare + preflight
|
||||||
|
+ launch and returns a `DockerBottle` handle the dashboard
|
||||||
|
retains alongside its `pending` and `agents` lists.
|
||||||
|
- A `bottles: dict[slug, DockerBottle]` map on the main loop
|
||||||
|
that owns every dashboard-launched handle. ExitStack tears
|
||||||
|
them all down on dashboard exit.
|
||||||
|
- `Enter` on an agents-pane row → re-attach handoff (docker
|
||||||
|
exec -it claude into the existing container).
|
||||||
|
- `x` (or similar) on an agents-pane row → explicit per-bottle
|
||||||
|
stop without quitting.
|
||||||
|
- `q` (existing quit key) → tear down all dashboard-launched
|
||||||
|
bottles before returning.
|
||||||
|
|
||||||
|
### Out of scope
|
||||||
|
|
||||||
|
- Changes to `./cli.py start` itself. It keeps its current
|
||||||
|
shape; the dashboard reuses its internal pieces (backend.
|
||||||
|
prepare / backend.launch) without reaching through the CLI
|
||||||
|
layer.
|
||||||
|
- Changes to `backend.launch`'s context-manager contract; the
|
||||||
|
dashboard's bottle map just holds the context-manager-yielded
|
||||||
|
Bottle and calls `__exit__` on quit / explicit stop.
|
||||||
|
- New manifest fields. The picker reads what's already there.
|
||||||
|
- Adopting non-dashboard bottles into the dashboard's owned set.
|
||||||
|
|
||||||
|
## Proposed design
|
||||||
|
|
||||||
|
### Bottle ownership
|
||||||
|
|
||||||
|
Today's flow:
|
||||||
|
|
||||||
|
```
|
||||||
|
./cli.py start agent
|
||||||
|
└─ with backend.launch(plan) as bottle: ← bottle alive while inside `with`
|
||||||
|
bottle.exec_claude([...], tty=True) ← blocks until claude exits
|
||||||
|
# context exits → compose down → state cleanup
|
||||||
|
```
|
||||||
|
|
||||||
|
The proposed dashboard-owned flow:
|
||||||
|
|
||||||
|
```
|
||||||
|
./cli.py dashboard
|
||||||
|
└─ stack = ExitStack()
|
||||||
|
bottles: dict[str, DockerBottle] = {}
|
||||||
|
|
||||||
|
# operator presses `n`, picks agent
|
||||||
|
ctx = backend.launch(plan)
|
||||||
|
bottle = stack.enter_context(ctx) ← bottle stays alive
|
||||||
|
bottles[plan.slug] = bottle
|
||||||
|
|
||||||
|
# operator interacts via:
|
||||||
|
curses.endwin()
|
||||||
|
bottle.exec_claude([...], tty=True) ← blocks; returns on Ctrl-D
|
||||||
|
stdscr.refresh()
|
||||||
|
# bottle is STILL ALIVE here — only the claude process exited
|
||||||
|
|
||||||
|
# ... operator does other things, eventually `q`:
|
||||||
|
stack.close() ← tears down every bottle
|
||||||
|
```
|
||||||
|
|
||||||
|
The shift is one line of code semantically but the change in
|
||||||
|
operator experience is real: bottles outlive any single claude
|
||||||
|
session.
|
||||||
|
|
||||||
|
### Agent picker
|
||||||
|
|
||||||
|
Pressing `n` opens a centered modal listing every agent name
|
||||||
|
from `spec.manifest.agents`. j/k navigates; Enter selects; Esc
|
||||||
|
aborts. Width is the longest name + bottle name + a column for
|
||||||
|
"already running?" so the operator can see at a glance whether
|
||||||
|
picking an agent starts a fresh one (different slug suffix) or
|
||||||
|
not.
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─ start agent ───────────────────────────┐
|
||||||
|
│ implementer dev (running) │
|
||||||
|
│ > researcher dev │
|
||||||
|
│ triage-bot sandbox │
|
||||||
|
└─ Enter: start Esc: cancel ─────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
Starting an agent that already has a running bottle is allowed
|
||||||
|
— each `start` mints a fresh slug — but the picker surfaces the
|
||||||
|
already-running state so the operator doesn't accidentally
|
||||||
|
double-launch.
|
||||||
|
|
||||||
|
### Preflight Y/N
|
||||||
|
|
||||||
|
Two viable shapes:
|
||||||
|
|
||||||
|
**Modal** — render the preflight summary lines (`agent / env /
|
||||||
|
skills / bottle / git gate / egress`) in a centered curses
|
||||||
|
modal with `[y/N]` at the bottom. Capture the next keypress.
|
||||||
|
|
||||||
|
**Drop-and-resume** — `curses.endwin()`, print the preflight to
|
||||||
|
stderr, read y/N from stdin, restore curses. Matches the
|
||||||
|
editor-flow + handoff pattern; lower implementation cost.
|
||||||
|
|
||||||
|
Lean toward **modal** for the y/N because it doesn't flash the
|
||||||
|
terminal between dashboard frames. Drop-and-resume is acceptable
|
||||||
|
if modal proves fiddly.
|
||||||
|
|
||||||
|
### Re-attach (Enter on agent)
|
||||||
|
|
||||||
|
Same handoff pattern the new-agent flow uses. The dashboard
|
||||||
|
already holds the `DockerBottle` for any slug it started —
|
||||||
|
`bottle.exec_claude([...], tty=True)` does the right `docker
|
||||||
|
exec -it claude …` and returns on session exit. Re-attach is
|
||||||
|
"already-running" + the same exec call; the agent picker isn't
|
||||||
|
involved.
|
||||||
|
|
||||||
|
For agents the dashboard didn't start (read-only watch), Enter
|
||||||
|
is a no-op with a status hint ("dashboard didn't start this
|
||||||
|
bottle; resume with `./cli.py resume <identity>` outside the
|
||||||
|
dashboard"). PRD-0019's selection model already differentiates
|
||||||
|
focus; this layer just gates the action.
|
||||||
|
|
||||||
|
### Explicit per-bottle stop
|
||||||
|
|
||||||
|
`x` on a selected dashboard-owned agent invokes
|
||||||
|
`stack.pop_callback`-style targeted teardown: take that bottle
|
||||||
|
out of the map, call its `close()` to tear down compose + state,
|
||||||
|
update the agents pane on the next refresh. Bottles the
|
||||||
|
dashboard didn't start (`x` on a read-only-watch row) → no-op
|
||||||
|
with a status hint.
|
||||||
|
|
||||||
|
### Dashboard quit
|
||||||
|
|
||||||
|
`q` (existing) calls `stack.close()` before exit; every
|
||||||
|
dashboard-launched bottle goes through its normal teardown
|
||||||
|
(`compose down` + state settle). Preserve markers (capability-
|
||||||
|
block, crash) still keep state across teardown. The dashboard
|
||||||
|
process itself returns 0.
|
||||||
|
|
||||||
|
If the operator wants to keep bottles alive past dashboard
|
||||||
|
exit, the existing path is unchanged: launch them via
|
||||||
|
`./cli.py start` in a separate terminal. That ownership stays
|
||||||
|
out-of-band.
|
||||||
|
|
||||||
|
## Implementation chunks
|
||||||
|
|
||||||
|
Sized for one PR each.
|
||||||
|
|
||||||
|
1. **Refactor `_launch_bottle` so the launch + exec_claude
|
||||||
|
pieces are separable.** Today's `cli/start.py` runs both
|
||||||
|
inside one function. Extract `prepare_with_preflight(spec,
|
||||||
|
*, render_preflight, prompt_yes)` and `attach_claude(bottle,
|
||||||
|
*, remote_control)`. The CLI's existing one-shot use binds
|
||||||
|
them as before; the dashboard binds them with curses-aware
|
||||||
|
render + prompt callables. No behavior change.
|
||||||
|
2. **Agent picker modal + new-agent flow.** New key `n` opens
|
||||||
|
the picker; `prepare_with_preflight` runs against the
|
||||||
|
selected agent; on Y, `backend.launch(plan)` enters the
|
||||||
|
dashboard's ExitStack; handoff invokes `attach_claude`.
|
||||||
|
3. **Re-attach via Enter on owned agents-pane row.** Looks up
|
||||||
|
the slug in the dashboard's `bottles` map; if present →
|
||||||
|
handoff; else → status-line hint pointing at `./cli.py
|
||||||
|
resume`.
|
||||||
|
4. **Explicit per-bottle stop (`x` keybinding).** Pop the
|
||||||
|
bottle's `close` callback off the stack, call it, refresh.
|
||||||
|
5. **Quit-cleanup (`q`).** Hook `stack.close()` into the
|
||||||
|
normal return path. Document the "exiting dashboard tears
|
||||||
|
down every bottle it started" contract in `dashboard.py`'s
|
||||||
|
module docstring.
|
||||||
|
|
||||||
|
## Open questions
|
||||||
|
|
||||||
|
1. **Modal vs. drop-and-resume for preflight Y/N.** Both work;
|
||||||
|
modal is nicer if the curses geometry handling is
|
||||||
|
straightforward. Pick during chunk 2 by prototyping the
|
||||||
|
modal in ~30 lines and seeing if it looks right.
|
||||||
|
|
||||||
|
2. **Agent picker: text-filter typing?** v1 is j/k navigation
|
||||||
|
only. If the manifest has 20+ agents the picker gets noisy;
|
||||||
|
add fzf-style filter input later if needed.
|
||||||
|
|
||||||
|
3. **What happens if `attach_claude` exits because the
|
||||||
|
container died** (not a clean claude exit — e.g., OOM,
|
||||||
|
panic)? Today's `_settle_state` marks the bottle preserved
|
||||||
|
for non-zero exit codes. The dashboard's re-render needs to
|
||||||
|
notice the bottle is gone (compose down or container-not-
|
||||||
|
running state) and surface a status line. Probably:
|
||||||
|
transcript snapshot + mark preserved + remove from
|
||||||
|
`bottles` map + status line "claude session for [slug]
|
||||||
|
ended with exit N; preserved for resume".
|
||||||
|
|
||||||
|
4. **Double-start of the same agent.** Allowed by design — slugs
|
||||||
|
are unique per launch — but the picker should make it clear
|
||||||
|
this is a "start a SECOND bottle" decision, not a "re-enter
|
||||||
|
the first." Probably handled by showing the running-count in
|
||||||
|
the picker row.
|
||||||
|
|
||||||
|
5. **Should `q` confirm before tearing down N running
|
||||||
|
bottles?** A 5-bottle dashboard with 5 in-flight sessions
|
||||||
|
loses non-trivial state on accidental `q`. Probably yes:
|
||||||
|
curses modal "quit and tear down N bottles? [y/N]". Skip
|
||||||
|
confirmation when there are zero owned bottles.
|
||||||
|
|
||||||
|
6. **Race between handoff and 1s refresh tick.** While the
|
||||||
|
dashboard's `stdscr.timeout` is set, a key press fires the
|
||||||
|
handoff and the dashboard sits in `docker exec` for minutes.
|
||||||
|
`discover_active_agents` / `discover_pending` don't poll
|
||||||
|
during that window, which is fine — the moment we
|
||||||
|
`stdscr.refresh()` after exec returns, the next loop iter
|
||||||
|
runs discovery and the panes reflect reality. Worth calling
|
||||||
|
out in the design but no special handling needed.
|
||||||
|
|
||||||
|
7. **Multi-bottle resource use.** Five bottles up means five
|
||||||
|
compose projects: 5×(agent + pipelock + egress optional +
|
||||||
|
git-gate optional + supervise optional) containers, plus 5×2
|
||||||
|
networks. On a 16-GiB host this is fine; on something
|
||||||
|
smaller the operator might want a soft cap or a warning.
|
||||||
|
Out of v1; flag for follow-up if it bites.
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- PRD 0018 — compose-per-instance lifecycle (the `backend.
|
||||||
|
launch` context-manager contract this PRD layers against)
|
||||||
|
- PRD 0019 — active-agents pane + selection model (the
|
||||||
|
agents-pane row the re-attach + stop verbs hook into)
|
||||||
|
- `docs/research/claude-code-pane-in-dashboard.md` — option 1
|
||||||
|
(handoff) is what `attach_claude` implements here; options 2
|
||||||
|
/ 3 are out of scope for this PRD
|
||||||
|
- `claude_bottle/cli/start.py:_launch_bottle` — the function
|
||||||
|
chunk 1 extracts the prepare + attach pieces out of
|
||||||
Reference in New Issue
Block a user