# PRD 0020: Start and attach to agents from inside the dashboard

- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-26

## Summary

Today the dashboard is read-only: it surfaces pending proposals
and active agents (PRD 0019) but can't *start* an agent or
*re-enter* one. The operator's path is split: they launch
agents from one terminal (`./cli.py start <name>`) and watch
them from another (`./cli.py dashboard`).

This PRD collapses that split. The dashboard becomes the
operator's single surface: pressing a key opens an agent picker,
selecting one runs the existing prepare → preflight → launch
flow inside a curses-friendly variant, and once launch yields a
running bottle the dashboard drops to a full-screen
`docker exec -it … claude` session (the "handoff" shape from
`docs/research/claude-code-pane-in-dashboard.md`). When the
operator exits claude, the dashboard re-renders with the
now-running bottle visible in the agents pane.

Crucially, the bottle's lifetime is decoupled from both the
claude session AND the dashboard process. Exit claude → back to
dashboard, bottle still running. Start another agent → two
bottles up at once. Quit the dashboard → bottles continue
running. Teardown is **always explicit**: the operator presses
`x` on an agent, or runs `./cli.py cleanup` later.

## Problem

Two real frictions today:

1. **Two terminals for one workflow.** The dashboard is the
   right shape to *watch* agents (proposals queue, status
   updates, operator-edit verbs) but it's the wrong shape to
   *start* them. Today you open a second terminal for that. In
   parallel use (3–5 bottles), the operator has 5+ terminals
   open and the dashboard's "active agents" pane is hopelessly
   behind reality because they just spawned three in a row.

2. **`./cli.py start` ties the bottle to a single claude
   session.** The start command's `ExitStack` brings the bottle
   up, runs claude, and tears down on Ctrl-D. That is fine for
   a one-shot session, wrong for "let me bounce in and out of
   this bottle a few times while triaging proposals." Today the
   only way to re-enter a bottle after exiting claude is to
   start a fresh one and lose all in-bottle state.

The dashboard already discovers active bottles, scopes
operator-edit verbs to a selected agent (PRD 0019), and
captures full-merged logs per bottle (PRD 0018). It already
*wants* to be the primary surface. This PRD finishes that.

## Goals / Success Criteria

1. From inside `./cli.py dashboard`, pressing `n` (new) opens
   an agent picker listing every agent defined in the manifest.
   Selecting one runs `prepare → preflight → launch`.
2. The preflight Y/N summary renders cleanly, either as a
   curses modal or via
   `curses.endwin() → text-mode prompt → restore`, matching the
   existing editor-flow pattern.
3. On launch success, the dashboard performs a handoff (option
   1 from the research doc): `curses.endwin()` →
   `docker exec -it claude-bottle-<slug> claude --dangerously-skip-permissions`
   → on exit, `stdscr.refresh()` and re-render with the new
   bottle in the agents pane.
4. The dashboard, NOT any single claude session, manages each
   bottle's enter/exit. Exiting claude (Ctrl-D, `/exit`)
   returns to the dashboard with the bottle still running. The
   operator can start more agents and re-enter previous ones.
5. Pressing Enter on a selected row in the agents pane
   re-attaches to that agent's bottle via the same handoff: it
   drops to full-screen claude and returns on exit.
6. Pressing `x` (or similar; keybinding decided in design)
   on a selected agent stops just that bottle (compose down +
   state cleanup) without quitting the dashboard.
7. Quitting the dashboard (`q`) leaves every running bottle
   running. Bottle teardown is always explicit (per-bottle `x`
   or `./cli.py cleanup`). The next `./cli.py dashboard`
   invocation re-discovers them via `list_active_slugs()` and
   surfaces re-attach for any it can reconstruct context for
   (see "Cross-dashboard re-attach" below).

## Non-goals

- **A pane that hosts the claude TUI alongside proposals.** The
  embedded-emulator option from the research doc is out of
  scope. The handoff (option 1) is the v1; option 2 is a
  separate PRD if and when handoff is observably insufficient.
- **Adopting bottles started by an out-of-dashboard `./cli.py
  start` invocation.** Those have their own ExitStack owner, and
  the dashboard does not take over their lifecycle. Per-bottle
  stop (`x`) only applies to bottles the *current dashboard
  process* started; re-attach is covered by the next item.
- **Resurrecting an out-of-process bottle into a new dashboard
  with full re-attach.** A bottle started by `./cli.py start`
  in another terminal, or by a previous dashboard run that has
  since exited, appears in the agents pane (already does, PRD
  0019) and can be re-attached via `docker exec -it … claude`
  because the agent container is still running `sleep infinity`.
  That's in scope. What's *out* is anything that requires the
  launch-context object to drive teardown, e.g., the
  ExitStack-tracked CA + state cleanup `_settle_state` performs
  today. Cross-dashboard re-attach uses the existing
  `./cli.py cleanup` for teardown, not an `x` keypress (see
  "Cross-dashboard re-attach" below).
- **Multi-window UI.** Single curses window, two existing
  panes (proposals + agents); the agent picker is a modal, not
  a third pane.
- **Removing `./cli.py start`.** Stays as the script-friendly /
  legacy entry point. The dashboard is the new default.

## Scope

### In scope

- Manifest-driven agent picker (curses modal): list view with
  j/k navigation + Enter to confirm, Esc to abort.
- Preflight rendering inside the dashboard's curses surface
  (modal or drop-and-resume; picked in design).
- A new `_dashboard_start_flow` that wraps prepare + preflight
  + launch and returns a `DockerBottle` handle the dashboard
  retains alongside its `pending` and `agents` lists.
- A `bottles: dict[str, tuple[ContextManager, DockerBottle]]`
  map on the main loop that holds every dashboard-launched
  handle. Nothing tears these down on dashboard exit.
- `Enter` on an agents-pane row → re-attach handoff
  (`docker exec -it … claude` into the existing container).
- `x` (or similar) on an agents-pane row → explicit per-bottle
  stop without quitting.
- `q` (existing quit key) → return without tearing down any
  bottle.

### Out of scope

- Changes to `./cli.py start` itself. It keeps its current
  shape; the dashboard reuses its internal pieces
  (`backend.prepare` / `backend.launch`) without reaching
  through the CLI layer.
- Changes to `backend.launch`'s context-manager contract; the
  dashboard's bottle map just holds the context-manager-yielded
  Bottle and calls `__exit__` on explicit stop only.
- New manifest fields. The picker reads what's already there.
- Adopting non-dashboard bottles into the dashboard's owned set.

## Proposed design

### Bottle ownership

Today's flow:

```
./cli.py start agent
  └─ with backend.launch(plan) as bottle:        ← bottle alive while inside `with`
       bottle.exec_claude([...], tty=True)       ← blocks until claude exits
     # context exits → compose down → state cleanup
```

The proposed dashboard-driven flow:

```
./cli.py dashboard
  └─ bottles: dict[str, tuple[ContextManager, DockerBottle]] = {}

     # operator presses `n`, picks agent
     cm = backend.launch(plan)
     bottle = cm.__enter__()                 ← enter but don't bind to a `with`
     bottles[plan.slug] = (cm, bottle)

     # operator interacts via:
     curses.endwin()
     bottle.exec_claude([...], tty=True)     ← blocks; returns on Ctrl-D
     stdscr.refresh()
     # bottle is STILL ALIVE; only the claude process exited

     # ... operator presses `x` on selected agent:
     cm, _ = bottles.pop(slug)
     cm.__exit__(None, None, None)           ← tears down just that one

     # ... operator presses `q`:
     return  # bottles dict still populated; no teardown
```

Two shifts:

1. Bottles outlive any single claude session: the dashboard
   manages enter/exit per bottle, not per attach. Exit claude
   → still in the dashboard with the bottle running.
2. Bottles outlive the dashboard process itself. Quitting the
   dashboard does NOT close the context managers; the docker
   compose project keeps running with the agent container in
   `sleep infinity`. A subsequent dashboard invocation
   re-discovers it via `docker compose ls` (PRD 0019's
   `list_active_slugs`) and surfaces re-attach.

The trade-off: state cleanup that today runs in
`_settle_state` (transcript snapshot, preserve-marker
evaluation, state-dir reap) doesn't fire on a
quit-while-running bottle. It DOES fire when the operator
explicitly stops via `x`, because that calls `cm.__exit__`. For
bottles a previous dashboard quit on, `./cli.py cleanup`
is the path; its compose-down + state-reap logic
already covers the case.

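The ownership flow above can be sketched as a small registry. This is a minimal sketch under stated assumptions, not the dashboard's implementation: `FakeBottle`, `FakeLaunch`, and `BottleRegistry` are hypothetical stand-ins for `DockerBottle`, the context manager returned by `backend.launch`, and the main loop's `bottles` dict, and the `events` list stands in for compose up/down side effects.

```python
from typing import Dict, List, Tuple

events: List[str] = []  # stands in for compose up/down side effects


class FakeBottle:
    """Hypothetical stand-in for DockerBottle."""

    def __init__(self, slug: str) -> None:
        self.slug = slug


class FakeLaunch:
    """Hypothetical stand-in for the context manager backend.launch returns."""

    def __init__(self, slug: str) -> None:
        self.slug = slug

    def __enter__(self) -> FakeBottle:
        events.append(f"up {self.slug}")
        return FakeBottle(self.slug)

    def __exit__(self, exc_type, exc, tb) -> None:
        events.append(f"down {self.slug}")


class BottleRegistry:
    """Holds (context manager, bottle) pairs entered outside a `with`."""

    def __init__(self) -> None:
        self.bottles: Dict[str, Tuple[FakeLaunch, FakeBottle]] = {}

    def start(self, slug: str) -> FakeBottle:
        cm = FakeLaunch(slug)
        bottle = cm.__enter__()  # enter, but do not bind to a `with`
        self.bottles[slug] = (cm, bottle)
        return bottle

    def stop(self, slug: str) -> str:
        """`x`: tear down one owned bottle, or hint for a non-owned one."""
        if slug not in self.bottles:
            return f"{slug}: not started here; run ./cli.py cleanup"
        cm, _ = self.bottles.pop(slug)
        cm.__exit__(None, None, None)
        return f"{slug}: stopped"


registry = BottleRegistry()
registry.start("researcher-a1")
registry.start("implementer-b2")
print(registry.stop("researcher-a1"))  # owned: torn down
print(registry.stop("external-c3"))    # not owned: hint only
print(events)
```

Quitting corresponds to returning while `registry.bottles` is still populated; nothing in the sketch calls `__exit__` for `implementer-b2`.
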
### Cross-dashboard re-attach

When the dashboard discovers a bottle in `discover_active_agents`
that it didn't itself start (a previous-dashboard or external
`./cli.py start` bottle), Enter still attaches via
`docker exec -it … claude`: the agent container is running
`sleep infinity` exactly the same way regardless of who started
it. The only thing the current dashboard lacks for those bottles
is the launch-context object needed to drive a clean teardown
via `x`.

For v1 we surface this honestly: pressing `x` on a non-owned
agent shows a status hint pointing at `./cli.py cleanup` (or
`./cli.py cleanup` targeted at the slug if we add that flag
later). The agent stays alive; the operator handles teardown
out-of-band. Enter (re-attach) works for both owned and
non-owned bottles.

### Agent picker

Pressing `n` opens a centered modal listing every agent name
from `spec.manifest.agents`. j/k navigates; Enter selects; Esc
aborts. Width is the longest name + bottle name + a column for
"already running?" so the operator can see at a glance whether
picking an agent would add another bottle (with a different
slug suffix) next to one that is already running.

```
┌─ start agent ───────────────────────────┐
│   implementer    dev       (running)    │
│ > researcher     dev                    │
│   triage-bot     sandbox                │
└─ Enter: start   Esc: cancel ────────────┘
```

Starting an agent that already has a running bottle is allowed
(each `start` mints a fresh slug), but the picker surfaces the
already-running state so the operator doesn't accidentally
double-launch.

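The picker's row model can be sketched as two pure functions, which keeps the curses drawing separate from the list logic. This is a rough sketch, not the planned implementation: `filter_agents` and `picker_rows` are hypothetical names, the `agents` mapping (agent name → bottle name) and the `running` list (one agent name per live bottle) are assumed shapes, and the substring filter and `(N running)` annotation follow the resolutions recorded under "Resolved questions".

```python
from collections import Counter
from typing import Dict, Iterable, List


def filter_agents(names: Iterable[str], query: str) -> List[str]:
    """Case-insensitive substring filter; an empty query keeps everything."""
    needle = query.lower()
    return [name for name in names if needle in name.lower()]


def picker_rows(agents: Dict[str, str], running: Iterable[str],
                query: str = "") -> List[str]:
    """Render one aligned text row per agent that matches the filter."""
    counts = Counter(running)
    visible = filter_agents(agents, query)
    if not visible:
        return []
    # Column widths come from the longest visible name and bottle name.
    name_w = max(len(name) for name in visible)
    bottle_w = max(len(agents[name]) for name in visible)
    rows = []
    for name in visible:
        note = f"({counts[name]} running)" if counts[name] else ""
        row = f"{name:<{name_w}}  {agents[name]:<{bottle_w}}  {note}"
        rows.append(row.rstrip())
    return rows


agents = {"implementer": "dev", "researcher": "dev", "triage-bot": "sandbox"}
for row in picker_rows(agents, running=["implementer"]):
    print(row)
```

How a live slug maps back to its agent name is left open here; the sketch takes the already-resolved agent names as input.
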
### Preflight Y/N

Two viable shapes:

**Modal:** render the preflight summary lines (`agent / env /
skills / bottle / git gate / egress`) in a centered curses
modal with `[y/N]` at the bottom. Capture the next keypress.

**Drop-and-resume:** `curses.endwin()`, print the preflight to
stderr, read y/N from stdin, restore curses. Matches the
editor-flow + handoff pattern; lower implementation cost.

Lean toward **modal** for the y/N because it doesn't flash the
terminal between dashboard frames. Drop-and-resume is acceptable
if modal proves fiddly.

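Most of the "fiddly" part of the modal is geometry, which can be computed and tested without a terminal. The sketch below is illustrative only: `modal_geometry` and `confirmed` are hypothetical helpers, and the preflight line values are made up. The returned tuple is ordered to match `curses.newwin(nlines, ncols, begin_y, begin_x)`.

```python
from typing import List, Tuple


def modal_geometry(lines: List[str], screen_h: int,
                   screen_w: int) -> Tuple[int, int, int, int]:
    """Return (height, width, top, left) for a centered, bordered modal.

    Adds two rows and two columns for the border, one padding column on
    each side, and one row for the `[y/N]` prompt, clamped to the screen.
    """
    prompt = "[y/N]"
    inner_w = max([len(prompt)] + [len(line) for line in lines])
    height = min(screen_h, len(lines) + 3)  # border (2) + prompt row (1)
    width = min(screen_w, inner_w + 4)      # border (2) + padding (2)
    top = (screen_h - height) // 2
    left = (screen_w - width) // 2
    return height, width, top, left


def confirmed(key: int) -> bool:
    """Only `y` / `Y` confirms; any other key (Enter, Esc, n) means No."""
    return key in (ord("y"), ord("Y"))


lines = ["agent:  researcher", "bottle: dev", "egress: restricted"]
print(modal_geometry(lines, screen_h=24, screen_w=80))
print(confirmed(ord("y")), confirmed(ord("\n")))
```

Treating every key other than `y`/`Y` as "No" matches the `[y/N]` default shown in the prompt.
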
### Re-attach (Enter on agent)

Same handoff pattern the new-agent flow uses. For an agent the
dashboard started this session, the dashboard holds the
`DockerBottle` handle in its `bottles` dict and calls
`bottle.exec_claude(...)`. For an agent it discovered via
`list_active_slugs` (previous-dashboard or external start),
the dashboard synthesizes a one-shot `DockerBottle` from the
slug (container name is `claude-bottle-<slug>`, no prompt
path because the agent's claude config already has
`--append-system-prompt-file` baked in from the original
launch) and runs the same exec. Either way, Enter drops to
full-screen claude; on exit the dashboard re-renders.

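The handoff itself reduces to "leave curses, block in `docker exec`, restore the screen". The following is a minimal sketch that bypasses `DockerBottle.exec_claude` and builds the argv directly from the command line quoted in the goals; `container_name`, `attach_argv`, and `handoff` are hypothetical names. The curses calls are injected so the sketch can run without a terminal: in the dashboard, `suspend` would be `curses.endwin` and `resume` would be `stdscr.refresh`.

```python
import subprocess
from typing import Callable, List, Sequence


def container_name(slug: str) -> str:
    """Container naming from this PRD: `claude-bottle-<slug>`."""
    return f"claude-bottle-{slug}"


def attach_argv(slug: str) -> List[str]:
    """argv for the full-screen claude session inside a running bottle."""
    return ["docker", "exec", "-it", container_name(slug),
            "claude", "--dangerously-skip-permissions"]


def handoff(slug: str,
            suspend: Callable[[], None],
            resume: Callable[[], None],
            run: Callable[[Sequence[str]], int] = subprocess.call) -> int:
    """Leave curses, block until the exec returns, then restore the screen."""
    suspend()
    try:
        return run(attach_argv(slug))
    finally:
        resume()  # restore the dashboard even if the exec raised


# Exercise the flow with fakes instead of a real terminal and docker.
calls: List[str] = []
code = handoff("researcher-a1",
               suspend=lambda: calls.append("endwin"),
               resume=lambda: calls.append("refresh"),
               run=lambda argv: calls.append(" ".join(argv)) or 0)
print(code, calls)
```

Because the argv depends only on the slug, the same function serves both owned bottles and ones discovered via `list_active_slugs`.
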
### Explicit per-bottle stop

`x` on a dashboard-owned agent: pop the `(cm, bottle)` from
the dict, call `cm.__exit__(None, None, None)` which drives
the existing compose-down + state-settle logic. Refresh the
agents pane.

`x` on a non-owned agent (discovered via `list_active_slugs`
but not in `bottles` dict): no-op with status hint pointing
at `./cli.py cleanup` (the existing path that tears down
ANY claude-bottle compose project plus reaps state dirs).

### Dashboard quit

`q` exits the dashboard process with status 0 without touching
any running bottles. The `bottles` dict goes out of scope but
because the context managers' `__exit__` is never invoked,
the `docker compose` project keeps running. The next dashboard
invocation discovers the bottles via `list_active_slugs` and
surfaces re-attach.

This is a real departure from today's `./cli.py start`
semantics (which couples bottle lifetime to the process via
ExitStack). It's intentional: the dashboard is a watching +
acting surface, not a lifetime owner.

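One caveat worth checking during implementation: whether dropping the `bottles` dict really skips teardown depends on how `backend.launch` is written, which this PRD does not specify. If it is a generator-based `@contextlib.contextmanager` with its teardown in a `finally` block (an assumption), CPython closes the suspended generator when the last reference to it goes away, and the `finally` block runs even though `__exit__` was never called. The sketch below demonstrates that behavior with a hypothetical `fake_launch`.

```python
import gc
from contextlib import contextmanager
from typing import Iterator, List

events: List[str] = []


@contextmanager
def fake_launch(slug: str) -> Iterator[str]:
    """Hypothetical generator-based launch with teardown in `finally`."""
    events.append(f"up {slug}")
    try:
        yield slug
    finally:
        events.append(f"down {slug}")  # stands in for compose down


bottles = {}
cm = fake_launch("researcher-a1")
bottles["researcher-a1"] = (cm, cm.__enter__())
del cm

bottles.clear()  # the dict's contents are dropped without calling __exit__
gc.collect()
print(events)    # "down researcher-a1" still appears
```

If `backend.launch` has this shape, the quit path would need a way to release a handle without finalizing it, for example a class-based context manager whose teardown lives only in `__exit__`. Which approach fits is a design decision outside this sketch.
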
## Implementation chunks

Sized for one PR each.

1. **Refactor `_launch_bottle` so the launch + exec_claude
   pieces are separable.** Today's `cli/start.py` runs both
   inside one function. Extract
   `prepare_with_preflight(spec, *, render_preflight, prompt_yes)`
   and `attach_claude(bottle, *, remote_control)`. The CLI's
   existing one-shot use binds them as before; the dashboard
   binds them with curses-aware render + prompt callables. No
   behavior change.
2. **Agent picker modal + new-agent flow.** New key `n` opens
   the picker; `prepare_with_preflight` runs against the
   selected agent; on Y, the dashboard enters
   `backend.launch(plan)` manually and stores the
   `(cm, bottle)` pair in its `bottles` dict; handoff invokes
   `attach_claude`.
3. **Re-attach via Enter on an agents-pane row.** Looks up
   the slug in the dashboard's `bottles` map; if present →
   handoff with the held handle; else → synthesize a one-shot
   `DockerBottle` from the slug and run the same handoff (see
   "Re-attach (Enter on agent)").
4. **Explicit per-bottle stop (`x` keybinding).** Pop the
   bottle's `(cm, bottle)` entry from `bottles`, call
   `cm.__exit__(None, None, None)`, refresh. For a non-owned
   agent, show the `./cli.py cleanup` status hint instead.
5. **Quit without teardown (`q`).** Make the normal return
   path leave `bottles` untouched. Document the "exiting the
   dashboard leaves every bottle running; teardown is
   explicit" contract in `dashboard.py`'s module docstring.

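The callable injection in chunk 1 can be illustrated as follows. This is only a sketch of the shape: the keyword-only signature comes from this PRD, but the body, the dict-based `spec` and plan, the summary lines, and the `-0001` slug suffix are all placeholders, since the real types are not described here.

```python
from typing import Callable, List, Optional


def prepare_with_preflight(spec: dict, *,
                           render_preflight: Callable[[List[str]], None],
                           prompt_yes: Callable[[], bool]) -> Optional[dict]:
    """Build a plan, show the preflight summary, return the plan on Y."""
    # Placeholder plan; the real prepare step is not specified in this PRD.
    plan = {"agent": spec["agent"], "slug": f"{spec['agent']}-0001"}
    render_preflight([f"agent: {plan['agent']}", f"slug:  {plan['slug']}"])
    return plan if prompt_yes() else None


# CLI-style binding would print the lines and read y/N from stdin.
# Dashboard-style binding: collect lines for a modal, answer from a keypress.
shown: List[str] = []
plan = prepare_with_preflight({"agent": "researcher"},
                              render_preflight=shown.extend,
                              prompt_yes=lambda: True)
print(plan)

declined = prepare_with_preflight({"agent": "researcher"},
                                  render_preflight=lambda lines: None,
                                  prompt_yes=lambda: False)
print(declined)
```

Keeping rendering and prompting behind callables is what lets the same function serve both the text-mode CLI and the curses dashboard without either knowing about the other.
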
## Resolved questions

1. **Modal vs. drop-and-resume for preflight Y/N.** Resolved:
   **modal.** Render the preflight lines centered in a curses
   sub-window with `[y/N]` at the bottom; capture the next
   keypress. If geometry proves fiddly during implementation
   we'll fall back to drop-and-resume, but modal is the target.

2. **Agent picker: text-filter typing.** Resolved: **yes,
   include filter typing.** As the operator types, the list
   filters to agents whose name matches (substring,
   case-insensitive). j/k still navigates within the filtered
   set; Esc clears the filter on first press, exits the picker
   on the second.

3. **Container-died-during-claude handling.** Keep the design
   as drafted: transcript snapshot (`snapshot_transcript`) +
   `mark_preserved` if exit code is non-zero + remove from
   the `bottles` dict + status line `"claude session for
   [slug] ended with exit N; preserved for resume"`. The
   bottle's `cm.__exit__` would normally run on stop; here it
   runs as part of the death-handling (the container is
   already gone, but compose-down + state-settle still
   sequence the network removal + state cleanup correctly).

4. **Double-start of the same agent.** Allowed. The picker
   surfaces a `(N running)` annotation next to any agent name
   that already has live bottles in this dashboard's `bottles`
   dict OR in `list_active_slugs()`, so the operator sees the
   running-count before picking. Selecting an already-running
   agent name mints a fresh slug for the new bottle as
   normal.

5. **Quit behavior.** Resolved: **`q` does NOT tear down any
   bottles.** Dashboard exit is purely a UI exit; the
   bottles dict goes out of scope without invoking `__exit__`,
   so the `docker compose` projects keep running. Bottle
   teardown is always explicit: per-bottle `x` (for
   dashboard-owned), or `./cli.py cleanup` (for everything).

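The death-handling sequence in item 3 can be sketched as one function. This is a hedged reading of the text, not a confirmed design: `handle_bottle_death` is a hypothetical name, `snapshot_transcript` and `mark_preserved` are assumed to take the slug (their real signatures are not given in this PRD), and the function assumes it is only called once the dashboard has determined that the container is gone.

```python
from typing import Callable, Dict, List, Tuple


class FakeLaunch:
    """Hypothetical stand-in for the held launch context manager."""

    def __init__(self) -> None:
        self.exited = False

    def __exit__(self, exc_type, exc, tb) -> None:
        self.exited = True  # stands in for compose down + state settle


def handle_bottle_death(slug: str, exit_code: int,
                        bottles: Dict[str, Tuple[FakeLaunch, object]], *,
                        snapshot_transcript: Callable[[str], None],
                        mark_preserved: Callable[[str], None]) -> str:
    """Snapshot, optionally preserve, drop the handle, and tear down."""
    snapshot_transcript(slug)
    if exit_code != 0:
        mark_preserved(slug)
    cm, _ = bottles.pop(slug)
    cm.__exit__(None, None, None)
    return (f"claude session for {slug} ended with exit {exit_code}; "
            "preserved for resume")


log: List[str] = []
cm = FakeLaunch()
bottles = {"researcher-a1": (cm, object())}
status = handle_bottle_death(
    "researcher-a1", 137, bottles,
    snapshot_transcript=lambda s: log.append(f"snapshot {s}"),
    mark_preserved=lambda s: log.append(f"preserve {s}"))
print(status)
print(log, cm.exited, bottles)
```

The status line is returned unconditionally because the text lists it as its own step; whether it should differ for a zero exit code is not settled by this PRD.
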
## Open questions

6. **Race between handoff and 1s refresh tick.** While the
   dashboard's `stdscr.timeout` is set, a key press fires the
   handoff and the dashboard sits in `docker exec` for minutes.
   `discover_active_agents` / `discover_pending` don't poll
   during that window. That's harmless on its own (the moment
   we `stdscr.refresh()` after exec returns, the next loop
   iter runs discovery and the panes reflect reality), but
   it does mean: (a) proposals queued during the claude
   session won't fire any operator notification until the
   handoff ends, and (b) a bottle that died mid-claude won't
   be detectable until the operator exits back to the
   dashboard. Not blocking v1; flagging as a known limitation
   to revisit alongside the option-2 embedded-emulator path
   from the research doc.

## References

- PRD 0018: compose-per-instance lifecycle (the
  `backend.launch` context-manager contract this PRD layers
  against)
- PRD 0019: active-agents pane + selection model (the
  agents-pane row the re-attach + stop verbs hook into)
- `docs/research/claude-code-pane-in-dashboard.md`: option 1
  (handoff) is what `attach_claude` implements here; options 2
  / 3 are out of scope for this PRD
- `claude_bottle/cli/start.py:_launch_bottle`: the function
  chunk 1 extracts the prepare + attach pieces out of