# PRD 0020: Start and attach to agents from inside the dashboard
- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-26
## Summary
Today the dashboard is read-only: it surfaces pending proposals
and active agents (PRD 0019) but can't *start* an agent or
*re-enter* one. The operator's path is split — they launch
agents from one terminal (`./cli.py start <name>`), and watch
them from another (`./cli.py dashboard`).
This PRD collapses that split. The dashboard becomes the
operator's single surface: pressing a key opens an agent picker,
selecting one runs the existing prepare → preflight → launch
flow inside a curses-friendly variant, and on yield drops to a
full-screen `docker exec -it … claude` session (the "handoff"
shape from `docs/research/claude-code-pane-in-dashboard.md`).
When the operator exits claude, the dashboard re-renders with
the now-running bottle visible in the agents pane.
Crucially, the bottle's lifetime is decoupled from both the
claude session AND the dashboard process. Exit claude → back to
dashboard, bottle still running. Start another agent → two
bottles up at once. Quit the dashboard → bottles continue
running. Teardown is **always explicit**: the operator presses
`x` on an agent, or runs `./cli.py cleanup` later.
## Problem
Two real frictions today:
1. **Two terminals for one workflow.** The dashboard is the
right shape to *watch* agents — proposals queue, status
updates, operator-edit verbs — but it's the wrong shape to
*start* them. Today you open a second terminal for that. In
parallel use (3-5 bottles), the operator has 5+ terminals
open and the dashboard's "active agents" pane is hopelessly
behind reality because they just spawned three in a row.
2. **`./cli.py start` ties the bottle to a single claude
session.** The start command's `ExitStack` brings the bottle
up, runs claude, and tears down on Ctrl-D. That is fine for a
one-shot session, wrong for "let me bounce in and out of this
bottle a few times while triaging proposals." Today the only
way to re-enter a bottle after exiting claude is to start a
fresh one and lose all in-bottle state.
The dashboard already discovers active bottles, scopes
operator-edit verbs to a selected agent (PRD 0019), and
captures full-merged logs per bottle (PRD 0018). It already
*wants* to be the primary surface. This PRD finishes that.
## Goals / Success Criteria
1. From inside `./cli.py dashboard`, pressing `n` (new) opens
an agent picker listing every agent defined in the manifest.
Selecting one runs `prepare → preflight → launch`.
2. The preflight Y/N summary renders cleanly — either as a
curses modal or via `curses.endwin() → text-mode prompt
→ restore`, matching the existing editor-flow pattern.
3. On launch success, the dashboard performs a handoff (option
1 from the research doc): `curses.endwin()` → `docker exec
-it claude-bottle-<slug> claude --dangerously-skip-permissions`
→ on exit, `stdscr.refresh()` and re-render with the new
bottle in the agents pane.
4. The bottle's lifetime is managed from the dashboard, NOT tied
to any single claude session. Exiting claude (Ctrl-D, `/exit`)
returns to the dashboard with the bottle still running. The
operator can start more agents and re-enter previous ones.
5. Pressing Enter on a selected row in the agents pane
re-attaches to that agent's bottle via the same handoff: drops
to full-screen claude, returns on exit.
6. Pressing `x` (or similar — keybinding decided in design)
on a selected agent stops just that bottle (compose down +
state cleanup) without quitting the dashboard.
7. Quitting the dashboard (`q`) leaves every running bottle
running. Bottle teardown is always explicit (per-bottle `x`
or `./cli.py cleanup`). The next `./cli.py dashboard`
invocation re-discovers them via `list_active_slugs()` and
surfaces re-attach for any it can reconstruct context for
(see "Cross-dashboard re-attach" below).
## Non-goals
- **A pane that hosts the claude TUI alongside proposals.** The
embedded-emulator option from the research doc is out of
scope. The handoff (option 1) is the v1; option 2 is a
separate PRD if and when handoff is observably insufficient.
- **Adopting bottles started by an out-of-dashboard `./cli.py
start` invocation.** Those have their own ExitStack owner and
the dashboard never takes over their teardown. It can still
watch them and re-attach (see the next bullet); `x` teardown
only applies to bottles the *current dashboard process*
started.
- **Resurrecting an out-of-process bottle into a new dashboard
with full re-attach.** A bottle started by `./cli.py start`
in another terminal — or by a previous dashboard run, now
exited — appears in the agents pane (already does, PRD 0019)
and can be re-attached via `docker exec -it claude` because
the agent container is still running `sleep infinity`. That's
in scope. What's *out* is anything that requires the
launch-context object to drive teardown, e.g., the
ExitStack-tracked CA + state cleanup `_settle_state` performs
today. Cross-dashboard re-attach uses the existing
`./cli.py cleanup` for teardown, not an `x` keypress (see
open questions).
- **Multi-window UI.** Single curses window, two existing
panes (proposals + agents); the agent picker is a modal, not
a third pane.
- **Removing `./cli.py start`.** Stays as the script-friendly /
legacy entry point. The dashboard is the new default.
## Scope
### In scope
- Manifest-driven agent picker (curses modal): list view with
j/k navigation + Enter to confirm, Esc to abort.
- Preflight rendering inside the dashboard's curses surface
(modal or drop-and-resume — picked in design).
- A new `_dashboard_start_flow` that wraps prepare + preflight
+ launch and returns a `DockerBottle` handle the dashboard
retains alongside its `pending` and `agents` lists.
- A `bottles: dict[slug, DockerBottle]` map on the main loop
that holds every dashboard-launched handle. Handles are
released only by an explicit stop; nothing is torn down on
dashboard exit.
- `Enter` on an agents-pane row → re-attach handoff (docker
exec -it claude into the existing container).
- `x` (or similar) on an agents-pane row → explicit per-bottle
stop without quitting.
- `q` (existing quit key) → exit the dashboard and leave every
running bottle up.
### Out of scope
- Changes to `./cli.py start` itself. It keeps its current
shape; the dashboard reuses its internal pieces
(`backend.prepare` / `backend.launch`) without reaching through
the CLI layer.
- Changes to `backend.launch`'s context-manager contract; the
dashboard's bottle map just holds the context-manager-yielded
Bottle and calls `__exit__` on explicit stop only.
- New manifest fields. The picker reads what's already there.
- Adopting non-dashboard bottles into the dashboard's owned set.
## Proposed design
### Bottle ownership
Today's flow:
```
./cli.py start agent
  └─ with backend.launch(plan) as bottle:      ← bottle alive while inside `with`
       bottle.exec_claude([...], tty=True)     ← blocks until claude exits
     # context exits → compose down → state cleanup
```
The proposed dashboard-driven flow:
```
./cli.py dashboard
  └─ bottles: dict[str, tuple[ContextManager, DockerBottle]] = {}
     # operator presses `n`, picks agent
       cm = backend.launch(plan)
       bottle = cm.__enter__()               ← enter but don't bind to a `with`
       bottles[plan.slug] = (cm, bottle)
     # operator interacts via:
       curses.endwin()
       bottle.exec_claude([...], tty=True)   ← blocks; returns on Ctrl-D
       stdscr.refresh()
       # bottle is STILL ALIVE: only the claude process exited
     # ... operator presses `x` on selected agent:
       cm, _ = bottles.pop(slug)
       cm.__exit__(None, None, None)         ← tears down just that one
     # ... operator presses `q`:
       return  # bottles dict still populated; no teardown
```
Two shifts:
1. Bottles outlive any single claude session — the dashboard
manages enter/exit per bottle, not per attach. Exit claude
→ still in the dashboard with the bottle running.
2. Bottles outlive the dashboard process itself. Quitting the
dashboard does NOT close the context managers; the docker
compose project keeps running with the agent container in
`sleep infinity`. A subsequent dashboard invocation
re-discovers it via `docker compose ls` (PRD 0019's
`list_active_slugs`) and surfaces re-attach.
The trade-off: state cleanup that today runs in
`_settle_state` (transcript snapshot, preserve-marker
evaluation, state-dir reap) doesn't fire on a
quit-while-running bottle. It DOES fire when the operator explicitly
stops via `x`, because that calls `cm.__exit__`. For
bottles a previous dashboard quit on, `./cli.py cleanup`
is the path — its compose-down + state-reap logic
already covers the case.
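
The ownership model above can be sketched in plain Python. This is a minimal illustration, not the project's code: `launch` is a hypothetical stand-in for `backend.launch(plan)`, and the `events` list only exists to make the ordering of "compose up" and "compose down" visible.

```python
from contextlib import contextmanager

events = []  # records lifecycle events so the ordering is visible


@contextmanager
def launch(slug):
    """Hypothetical stand-in for backend.launch(plan)."""
    events.append(f"up:{slug}")        # would be: compose up
    try:
        yield f"bottle:{slug}"         # the handle the dashboard keeps
    finally:
        events.append(f"down:{slug}")  # would be: compose down + state settle


bottles = {}  # slug -> (context manager, bottle handle)


def start(slug):
    cm = launch(slug)
    bottle = cm.__enter__()            # enter without binding to a `with` block
    bottles[slug] = (cm, bottle)
    return bottle


def stop(slug):
    cm, _ = bottles.pop(slug)
    cm.__exit__(None, None, None)      # tears down only this bottle


start("a")
start("b")
stop("a")
# "b" is never exited, mirroring a dashboard quit with a bottle still up.
```

One point worth verifying against the real `backend.launch`: if it is a generator-based context manager like the stand-in here, CPython closes an un-exited generator when it is garbage-collected, which runs its `finally` block. The sketch keeps the handle referenced in `bottles`, so that does not happen here.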
### Cross-dashboard re-attach
When the dashboard discovers a bottle in `discover_active_agents`
that it didn't itself start (a previous-dashboard or external
`./cli.py start` bottle), Enter still attaches via `docker exec
-it … claude` — the agent container is running `sleep infinity`
exactly the same way regardless of who started it. The only
thing the current dashboard lacks for those bottles is the
launch-context object needed to drive a clean teardown via
`x`.
For v1 we surface this honestly: pressing `x` on a non-owned
agent shows a status hint pointing at `./cli.py cleanup` (or
a slug-targeted variant of it, if we add that flag
later). The agent stays alive; the operator handles teardown
out-of-band. Enter (re-attach) works for both owned and
non-owned bottles.
### Agent picker
Pressing `n` opens a centered modal listing every agent name
from `spec.manifest.agents`. j/k navigates; Enter selects; Esc
aborts. Width is the longest name + bottle name + a column for
"already running?" so the operator can see at a glance whether
picking an agent starts a fresh one (different slug suffix) or
not.
```
┌─ start agent ───────────────────────────┐
│   implementer   dev       (running)     │
│ > researcher    dev                     │
│   triage-bot    sandbox                 │
└─ Enter: start  Esc: cancel ─────────────┘
Starting an agent that already has a running bottle is allowed
— each `start` mints a fresh slug — but the picker surfaces the
already-running state so the operator doesn't accidentally
double-launch.
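
As a rough sketch of the column layout described above, the rows can be computed before any curses drawing happens. The function name `format_picker_rows` and the shape of its inputs (name/bottle pairs plus a set of running agent names) are assumptions made for illustration; the manifest's real structure is not shown in this PRD.

```python
def format_picker_rows(agents, running, selected):
    """Build aligned picker rows.

    agents:   list of (agent_name, bottle_name) pairs (assumed shape)
    running:  set of agent names that already have a live bottle
    selected: index of the row under the cursor
    """
    if not agents:
        return []
    # Column widths follow the longest agent name and bottle name.
    name_w = max(len(name) for name, _ in agents)
    bottle_w = max(len(bottle) for _, bottle in agents)
    rows = []
    for i, (name, bottle) in enumerate(agents):
        cursor = ">" if i == selected else " "
        status = "(running)" if name in running else ""
        row = f"{cursor} {name:<{name_w}}  {bottle:<{bottle_w}}  {status}"
        rows.append(row.rstrip())  # drop padding when there is no status
    return rows


rows = format_picker_rows(
    [("implementer", "dev"), ("researcher", "dev"), ("triage-bot", "sandbox")],
    running={"implementer"},
    selected=1,
)
```

Keeping the formatting separate from the drawing makes the modal's width easy to derive: it is the length of the longest returned row plus the border.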
### Preflight Y/N
Two viable shapes:
**Modal** — render the preflight summary lines (`agent / env /
skills / bottle / git gate / egress`) in a centered curses
modal with `[y/N]` at the bottom. Capture the next keypress.
**Drop-and-resume** — `curses.endwin()`, print the preflight to
stderr, read y/N from stdin, restore curses. Matches the
editor-flow + handoff pattern; lower implementation cost.
Lean toward **modal** for the y/N because it doesn't flash the
terminal between dashboard frames. Drop-and-resume is acceptable
if modal proves fiddly.
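
For comparison, the drop-and-resume shape might look roughly like the following. It is a sketch under assumptions: the function name and the `suspend` / `read_line` parameters are hypothetical, added so the flow can be exercised without a real terminal. The `curses.endwin()` followed by a later `refresh()` pair is the standard way to leave and re-enter curses mode.

```python
import curses
import sys


def drop_and_resume_confirm(stdscr, summary_lines, *,
                            suspend=curses.endwin, read_line=input):
    """Leave curses, ask y/N in text mode, then restore the screen."""
    suspend()                          # hand the terminal back to text mode
    try:
        for line in summary_lines:
            print(line, file=sys.stderr)
        answer = read_line("proceed? [y/N] ")
    except EOFError:
        answer = ""                    # closed stdin counts as "no"
    finally:
        stdscr.refresh()               # re-enter curses mode and repaint
    # Anything other than an explicit yes is treated as the default, N.
    return answer.strip().lower() in ("y", "yes")
```

The `finally` clause is the part that matters: whatever happens at the prompt, the dashboard gets its screen back.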
### Re-attach (Enter on agent)
Same handoff pattern the new-agent flow uses. For an agent the
dashboard started this session, the dashboard holds the
`DockerBottle` handle in its `bottles` dict and calls
`bottle.exec_claude(...)`. For an agent it discovered via
`list_active_slugs` (previous-dashboard or external start),
the dashboard synthesizes a one-shot `DockerBottle` from the
slug (container name `claude-bottle-<slug>`, no prompt path
because the agent's claude config already has
`--append-system-prompt-file` baked in from the original
launch) and runs the same exec. Either way, Enter drops to
full-screen claude; on exit the dashboard re-renders.
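
For the non-owned case, the exec command can be derived from the slug alone. The sketch below bypasses `DockerBottle` (whose API is not shown here) and builds the `docker exec` argv directly from the container naming and claude flags quoted earlier in this PRD; the function names and the injectable `run` parameter are assumptions for illustration.

```python
import subprocess


def claude_exec_argv(slug):
    """argv for attaching claude to a running bottle's agent container."""
    return [
        "docker", "exec", "-it",
        f"claude-bottle-{slug}",       # container naming used in this PRD
        "claude", "--dangerously-skip-permissions",
    ]


def attach(slug, *, run=subprocess.call):
    """Block until the claude session ends; return its exit code."""
    return run(claude_exec_argv(slug))
```

In the dashboard this call would sit between `curses.endwin()` and `stdscr.refresh()`, exactly as in the handoff for owned bottles.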
### Explicit per-bottle stop
`x` on a dashboard-owned agent: pop the `(cm, bottle)` from
the dict, call `cm.__exit__(None, None, None)` which drives
the existing compose-down + state-settle logic. Refresh the
agents pane.
`x` on a non-owned agent (discovered via `list_active_slugs`
but not in `bottles` dict): no-op with status hint pointing
at `./cli.py cleanup` (the existing path that tears down
ANY claude-bottle compose project plus reaps state dirs).
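
The two `x` branches reduce to one dictionary lookup. A minimal sketch, assuming the `bottles` map holds `(cm, bottle)` pairs as described above; `stop_selected` and the wording of the status messages are hypothetical, and an `ExitStack` stands in for the real launch context manager.

```python
import contextlib


def stop_selected(bottles, slug):
    """Handle `x` on `slug`; return the status-line message to show."""
    entry = bottles.pop(slug, None)
    if entry is None:
        # Discovered but not owned: no launch context to tear down with.
        return f"{slug}: not started by this dashboard; run ./cli.py cleanup"
    cm, _bottle = entry
    cm.__exit__(None, None, None)      # compose down + state settle
    return f"{slug}: stopped"


# Demo with a stand-in context manager that records its teardown.
torn_down = []
fake_cm = contextlib.ExitStack()
fake_cm.callback(torn_down.append, "a")
bottles = {"a": (fake_cm, object())}

owned_msg = stop_selected(bottles, "a")
foreign_msg = stop_selected(bottles, "zzz")
```

Popping before calling `__exit__` means a teardown that raises still removes the entry, so a second `x` falls through to the cleanup hint rather than retrying a half-closed context.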
### Dashboard quit
`q` exits the dashboard process with status 0 without touching any
running bottles. The `bottles` dict goes out of scope but
because the context managers' `__exit__` is never invoked,
the `docker compose` project keeps running. The next dashboard
invocation discovers the bottles via `list_active_slugs` and
surfaces re-attach.
This is a real departure from today's `./cli.py start`
semantics (which couples bottle lifetime to the process via
ExitStack). It's intentional: the dashboard is a watching +
acting surface, not a lifetime owner.
## Implementation chunks
Sized for one PR each.
1. **Refactor `_launch_bottle` so the launch + exec_claude
pieces are separable.** Today's `cli/start.py` runs both
inside one function. Extract `prepare_with_preflight(spec,
*, render_preflight, prompt_yes)` and `attach_claude(bottle,
*, remote_control)`. The CLI's existing one-shot use binds
them as before; the dashboard binds them with curses-aware
render + prompt callables. No behavior change.
2. **Agent picker modal + new-agent flow.** New key `n` opens
the picker; `prepare_with_preflight` runs against the
selected agent; on Y, `backend.launch(plan)` is entered and
the `(cm, bottle)` pair is stored in the dashboard's
`bottles` map; handoff invokes `attach_claude`.
3. **Re-attach via Enter on an agents-pane row.** Looks up
the slug in the dashboard's `bottles` map; if present →
handoff with the held handle; else → handoff through a
one-shot `DockerBottle` synthesized from the slug.
4. **Explicit per-bottle stop (`x` keybinding).** Pop the
bottle's `(cm, bottle)` entry from the map, call
`cm.__exit__`, refresh. Non-owned bottles get the
`./cli.py cleanup` status hint instead.
5. **Quit (`q`).** Return without closing any context manager.
Document the "exiting the dashboard leaves every bottle
running" contract in `dashboard.py`'s module docstring.
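
Chunk 1's split can be pictured as a function that takes its rendering and prompting as callables. The signature follows the one named in chunk 1; everything else here is an assumption, including the `prepare` stub (a stand-in for `backend.prepare`), the dict-shaped plan, and the slug format.

```python
def prepare(spec):
    """Hypothetical stand-in for backend.prepare; returns a plan dict."""
    return {"agent": spec["agent"], "slug": f"{spec['agent']}-0001"}


def prepare_with_preflight(spec, *, render_preflight, prompt_yes):
    """Prepare a plan, show its summary, and return it only on a yes."""
    plan = prepare(spec)
    render_preflight([f"{key}: {value}" for key, value in plan.items()])
    return plan if prompt_yes() else None


# One-shot CLI binding: collect lines in a list, always answer yes.
shown = []
plan = prepare_with_preflight(
    {"agent": "researcher"},
    render_preflight=shown.extend,
    prompt_yes=lambda: True,
)

# A dashboard binding would pass curses-aware callables instead;
# here a "no" answer simply yields None.
declined = prepare_with_preflight(
    {"agent": "researcher"},
    render_preflight=lambda lines: None,
    prompt_yes=lambda: False,
)
```

Because the function never touches the terminal itself, the same code path serves both the CLI and the dashboard.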
## Resolved questions
1. **Modal vs. drop-and-resume for preflight Y/N.** Resolved:
**modal.** Render the preflight lines centered in a curses
sub-window with `[y/N]` at the bottom; capture the next
keypress. If geometry proves fiddly during implementation
we'll fall back to drop-and-resume, but modal is the target.
2. **Agent picker: text-filter typing.** Resolved: **yes,
include filter typing.** As the operator types, the list
filters to agents whose name matches (substring,
case-insensitive). j/k still navigates within the filtered
set; Esc clears the filter on first press, exits the picker
on the second.
3. **Container-died-during-claude handling.** Keep the design
as drafted: transcript snapshot (`snapshot_transcript`) +
`mark_preserved` if exit code is non-zero + remove from
the `bottles` dict + status line `"claude session for
[slug] ended with exit N; preserved for resume"`. The
bottle's `cm.__exit__` would normally run on stop; here it
runs as part of the death-handling (the container is
already gone, but compose-down + state-settle still
sequence the network removal + state cleanup correctly).
4. **Double-start of the same agent.** Allowed. The picker
surfaces a `(N running)` annotation next to any agent name
that already has live bottles in this dashboard's `bottles`
dict OR in `list_active_slugs()`, so the operator sees the
running-count before picking. Selecting an already-running
agent name mints a fresh slug for the new bottle as
normal.
5. **Quit behavior.** Resolved: **`q` does NOT tear down any
bottles.** Dashboard exit is purely a UI exit; the
bottles dict goes out of scope without invoking `__exit__`,
so the `docker compose` projects keep running. Bottle
teardown is always explicit: per-bottle `x` (for
dashboard-owned), or `./cli.py cleanup` (for everything).
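
Resolved questions 2 and 4 both come down to small pure functions over the list of agent names. The sketch below is illustrative only: the function names are hypothetical, and it assumes the caller can supply one agent name per live bottle (how slugs map back to agent names is not specified here).

```python
from collections import Counter


def filter_agents(names, query):
    """Case-insensitive substring filter; an empty query keeps everything."""
    needle = query.casefold()
    return [name for name in names if needle in name.casefold()]


def annotate_running(names, running_agent_names):
    """Append '(N running)' to names that already have live bottles.

    running_agent_names holds one entry per live bottle, so duplicates
    are expected when an agent has been started more than once.
    """
    counts = Counter(running_agent_names)
    return [
        f"{name} ({counts[name]} running)" if counts[name] else name
        for name in names
    ]


names = ["implementer", "researcher", "triage-bot"]
matches = filter_agents(names, "ER")
labels = annotate_running(names, ["researcher", "researcher"])
```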
## Open questions
6. **Race between handoff and 1s refresh tick.** While the
dashboard's `stdscr.timeout` is set, a key press fires the
handoff and the dashboard sits in `docker exec` for minutes.
`discover_active_agents` / `discover_pending` don't poll
during that window — that's harmless on its own (the moment
we `stdscr.refresh()` after exec returns, the next loop
iter runs discovery and the panes reflect reality), but
it does mean: (a) proposals queued during the claude
session won't fire any operator notification until the
handoff ends, and (b) a bottle that died mid-claude won't
be detectable until the operator exits back to the
dashboard. Not blocking v1 — flagging as a known limitation
to revisit alongside the option-2 embedded-emulator path
from the research doc.
## References
- PRD 0018: compose-per-instance lifecycle (the
`backend.launch` context-manager contract this PRD layers against)
- PRD 0019: active-agents pane + selection model (the
agents-pane row the re-attach + stop verbs hook into)
- `docs/research/claude-code-pane-in-dashboard.md`: option 1
(handoff) is what `attach_claude` implements here; options 2
/ 3 are out of scope for this PRD
- `claude_bottle/cli/start.py:_launch_bottle`: the function
chunk 1 extracts the prepare + attach pieces out of