PRD 0020: Start and attach to agents from inside the dashboard
- Status: Draft
- Author: didericis
- Created: 2026-05-26
Summary
Today the dashboard is read-only: it surfaces pending proposals
and active agents (PRD 0019) but can't start an agent or
re-enter one. The operator's path is split — they launch
agents from one terminal (./cli.py start <name>), and watch
them from another (./cli.py dashboard).
This PRD collapses that split. The dashboard becomes the
operator's single surface: pressing a key opens an agent picker,
selecting one runs the existing prepare → preflight → launch
flow inside a curses-friendly variant, and on yield drops to a
full-screen docker exec -it … claude session (the "handoff"
shape from docs/research/claude-code-pane-in-dashboard.md).
When the operator exits claude, the dashboard re-renders with
the now-running bottle visible in the agents pane.
Crucially, the bottle's lifetime is decoupled from both the
claude session AND the dashboard process. Exit claude → back to
dashboard, bottle still running. Start another agent → two
bottles up at once. Quit the dashboard → bottles continue
running. Teardown is always explicit: the operator presses
x on an agent, or runs ./cli.py cleanup later.
Problem
Two real frictions today:
- Two terminals for one workflow. The dashboard is the right shape to watch agents (proposals queue, status updates, operator-edit verbs) but it's the wrong shape to start them. Today you open a second terminal for that. In parallel use (3–5 bottles), the operator has 5+ terminals open and the dashboard's "active agents" pane is hopelessly behind reality because they just spawned three in a row.
- ./cli.py start ties the bottle to a single claude session. The start command's ExitStack brings the bottle up, runs claude, and tears down on Ctrl-D: fine for a one-shot session, wrong for "let me bounce in and out of this bottle a few times while triaging proposals." Today the only way to re-enter a bottle after exiting claude is to start a fresh one and lose all in-bottle state.
The dashboard already discovers active bottles, scopes operator-edit verbs to a selected agent (PRD 0019), and captures full-merged logs per bottle (PRD 0018). It already wants to be the primary surface. This PRD finishes that.
Goals / Success Criteria
- From inside ./cli.py dashboard, pressing n (new) opens an agent picker listing every agent defined in the manifest. Selecting one runs prepare → preflight → launch.
- The preflight Y/N summary renders cleanly, either as a curses modal or via curses.endwin() → text-mode prompt → restore, matching the existing editor-flow pattern.
- On launch success, the dashboard performs a handoff (option 1 from the research doc): curses.endwin() → docker exec -it bot-bottle-<slug> claude --dangerously-skip-permissions → on exit, stdscr.refresh() and re-render with the new bottle in the agents pane.
- The bottle's lifetime is owned by the dashboard process, NOT by any single claude session. Exiting claude (Ctrl-D, /exit) returns to the dashboard with the bottle still running. The operator can start more agents and re-enter previous ones.
- Pressing Enter on a selected row in the agents pane re-attaches to that agent's bottle via the same handoff: drops to full-screen claude, returns on exit.
- Pressing x (or similar; keybinding decided in design) on a selected agent stops just that bottle (compose down + state cleanup) without quitting the dashboard.
- Quitting the dashboard (q) leaves every running bottle running. Bottle teardown is always explicit (per-bottle x or ./cli.py cleanup). The next ./cli.py dashboard invocation re-discovers them via list_active_slugs() and surfaces re-attach for any it can reconstruct context for (see "Cross-dashboard re-attach" below).
Non-goals
- A pane that hosts the claude TUI alongside proposals. The embedded-emulator option from the research doc is out of scope. The handoff (option 1) is the v1; option 2 is a separate PRD if and when handoff is observably insufficient.
- Adopting bottles started by an out-of-dashboard ./cli.py start invocation. Those have their own ExitStack-owner and the dashboard treats them as read-only-watch (already does today). Re-attach only applies to bottles the current dashboard process started.
- Resurrecting an out-of-process bottle into a new dashboard with full re-attach. A bottle started by ./cli.py start in another terminal, or by a previous dashboard run that has since exited, appears in the agents pane (already does, PRD 0019) and can be re-attached via docker exec -it claude because the agent container is still running sleep infinity. That's in scope. What's out is anything that requires the launch-context object to drive teardown, e.g., the ExitStack-tracked CA + state cleanup _settle_state performs today. Cross-dashboard re-attach uses the existing ./cli.py cleanup for teardown, not an x keypress (see open questions).
- Multi-window UI. Single curses window, two existing panes (proposals + agents); the agent picker is a modal, not a third pane.
- Removing ./cli.py start. Stays as the script-friendly / legacy entry point. The dashboard is the new default.
Scope
In scope
- Manifest-driven agent picker (curses modal): list view with j/k navigation + Enter to confirm, Esc to abort.
- Preflight rendering inside the dashboard's curses surface (modal or drop-and-resume — picked in design).
- A new _dashboard_start_flow that wraps prepare + preflight + launch and returns a DockerBottle handle the dashboard retains alongside its pending and agents lists.
- A bottles: dict[slug, DockerBottle] map on the main loop that owns every dashboard-launched handle. ExitStack tears them all down on dashboard exit.
- Enter on an agents-pane row → re-attach handoff (docker exec -it claude into the existing container).
- x (or similar) on an agents-pane row → explicit per-bottle stop without quitting.
- q (existing quit key) → tear down all dashboard-launched bottles before returning.
Out of scope
- Changes to ./cli.py start itself. It keeps its current shape; the dashboard reuses its internal pieces (backend.prepare / backend.launch) without reaching through the CLI layer.
- Changes to backend.launch's context-manager contract; the dashboard's bottle map just holds the context-manager-yielded Bottle and calls __exit__ on quit / explicit stop.
- New manifest fields. The picker reads what's already there.
- Adopting non-dashboard bottles into the dashboard's owned set.
Proposed design
Bottle ownership
Today's flow:
./cli.py start agent
  └─ with backend.launch(plan) as bottle:       ← bottle alive while inside `with`
       bottle.exec_agent([...], tty=True)       ← blocks until claude exits
     # context exits → compose down → state cleanup
The proposed dashboard-driven flow:
./cli.py dashboard
  └─ bottles: dict[str, tuple[ContextManager, DockerBottle]] = {}

     # operator presses `n`, picks agent
     cm = backend.launch(plan)
     bottle = cm.__enter__()                    ← enter but don't bind to a `with`
     bottles[plan.slug] = (cm, bottle)

     # operator interacts via:
     curses.endwin()
     bottle.exec_agent([...], tty=True)         ← blocks; returns on Ctrl-D
     stdscr.refresh()
     # bottle is STILL ALIVE: only the claude process exited

     # ... operator presses `x` on selected agent:
     cm, _ = bottles.pop(slug)
     cm.__exit__(None, None, None)              ← tears down just that one

     # ... operator presses `q`:
     return  # bottles dict still populated; no teardown
Two shifts:
- Bottles outlive any single claude session: the dashboard manages enter/exit per bottle, not per attach. Exit claude → still in the dashboard with the bottle running.
- Bottles outlive the dashboard process itself. Quitting the dashboard does NOT close the context managers; the docker compose project keeps running with the agent container in sleep infinity. A subsequent dashboard invocation re-discovers it via docker compose ls (PRD 0019's list_active_slugs) and surfaces re-attach.

  The trade-off: state cleanup that today runs in _settle_state (transcript snapshot, preserve-marker evaluation, state-dir reap) doesn't fire on a quit-while-running bottle. It DOES fire when the operator explicitly stops via x, because that calls cm.__exit__. For bottles a previous dashboard quit on, ./cli.py cleanup is the path; its compose-down + state-reap logic already covers the case.
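The ownership pattern above can be sketched in a few self-contained lines. This is an illustration under stated assumptions, not the implementation: FakeLaunch is a hypothetical stand-in for whatever context manager backend.launch returns, and BottleRegistry with its start/stop method names is invented here for the example.

```python
class FakeLaunch:
    """Hypothetical stand-in for the context manager backend.launch returns."""

    def __init__(self, slug, events):
        self.slug = slug
        self.events = events

    def __enter__(self):
        self.events.append(f"up:{self.slug}")      # stands in for compose up
        return {"slug": self.slug}                 # stands in for DockerBottle

    def __exit__(self, exc_type, exc, tb):
        self.events.append(f"down:{self.slug}")    # compose down + state settle
        return False


class BottleRegistry:
    """Keeps (context manager, bottle) pairs alive outside any `with` block."""

    def __init__(self, events):
        self.events = events
        self.bottles = {}

    def start(self, slug):
        cm = FakeLaunch(slug, self.events)
        bottle = cm.__enter__()                    # enter, but no `with`
        self.bottles[slug] = (cm, bottle)
        return bottle

    def stop(self, slug):
        cm, _bottle = self.bottles.pop(slug)
        cm.__exit__(None, None, None)              # tears down just this one


events = []
registry = BottleRegistry(events)
registry.start("implementer-1")
registry.start("researcher-1")     # two bottles up at once
registry.stop("implementer-1")     # operator pressed `x`
# Operator presses `q`: stop() is simply never called for researcher-1.
print(events)
```

One thing worth verifying during implementation: if backend.launch is written with contextlib.contextmanager (an assumption; this PRD does not say), an un-exited context manager that gets garbage-collected can have its underlying generator closed by Python, which runs any finally: cleanup inside it. The class-based stand-in above does not have that behavior.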
Cross-dashboard re-attach
When the dashboard discovers a bottle in discover_active_agents
that it didn't itself start (a previous-dashboard or external
./cli.py start bottle), Enter still attaches via docker exec -it … claude — the agent container is running sleep infinity
exactly the same way regardless of who started it. The only
thing the current dashboard lacks for those bottles is the
launch-context object needed to drive a clean teardown via
x.
For v1 we surface this honestly: pressing x on a non-owned
agent shows a status hint pointing at ./cli.py cleanup (or
./cli.py cleanup targeted at the slug if we add that flag
later). The agent stays alive; the operator handles teardown
out-of-band. Enter (re-attach) works for both owned and
non-owned bottles.
Agent picker
Pressing n opens a centered modal listing every agent name
from spec.manifest.agents. j/k navigates; Enter selects; Esc
aborts. Width is the longest name + bottle name + a column for
"already running?" so the operator can see at a glance whether
picking an agent starts a fresh one (different slug suffix) or
not.
┌─ start agent ───────────────────────────┐
│   implementer   dev       (running)     │
│ > researcher    dev                     │
│   triage-bot    sandbox                 │
└─ Enter: start   Esc: cancel ────────────┘
Starting an agent that already has a running bottle is allowed
— each start mints a fresh slug — but the picker surfaces the
already-running state so the operator doesn't accidentally
double-launch.
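As a rough sketch of how the picker rows could be laid out before any curses drawing happens, the function below pads each column to its widest entry and marks the selected row. The name picker_rows and the shape of its inputs (name/bottle pairs plus a running-count dict) are assumptions made for this example.

```python
def picker_rows(agents, running_counts, selected):
    """Return the text rows for the picker modal.

    agents: list of (agent_name, bottle_name) pairs in manifest order.
    running_counts: dict mapping agent_name -> number of live bottles.
    selected: index of the highlighted row.
    """
    if not agents:
        return []
    name_w = max(len(name) for name, _ in agents)
    bottle_w = max(len(bottle) for _, bottle in agents)
    rows = []
    for i, (name, bottle) in enumerate(agents):
        marker = ">" if i == selected else " "
        status = "(running)" if running_counts.get(name, 0) else ""
        row = f"{marker} {name:<{name_w}}  {bottle:<{bottle_w}}  {status}"
        rows.append(row.rstrip())   # drop padding after the last column
    return rows


rows = picker_rows(
    [("implementer", "dev"), ("researcher", "dev"), ("triage-bot", "sandbox")],
    {"implementer": 1},
    selected=1,
)
for row in rows:
    print(row)
```

Keeping the layout in a pure function means the modal's width can be derived from the longest returned row, and the rows can be checked without a terminal.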
Preflight Y/N
Two viable shapes:
Modal — render the preflight summary lines (agent / env / skills / bottle / git gate / egress) in a centered curses
modal with [y/N] at the bottom. Capture the next keypress.
Drop-and-resume — curses.endwin(), print the preflight to
stderr, read y/N from stdin, restore curses. Matches the
editor-flow + handoff pattern; lower implementation cost.
Lean toward modal for the y/N because it doesn't flash the terminal between dashboard frames. Drop-and-resume is acceptable if modal proves fiddly.
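For comparison, the drop-and-resume shape is small enough to sketch in full. The function name and its injectable suspend/resume/read_line/out parameters are assumptions made so the example runs without a terminal; in the dashboard the natural bindings would be suspend=curses.endwin and resume=stdscr.refresh.

```python
import io
import sys


def confirm_drop_and_resume(lines, *, suspend, resume, read_line=input, out=None):
    """Leave curses, print the preflight summary, read y/N, restore curses.

    Returns True only for an explicit "y" or "yes"; anything else,
    including EOF, counts as "no" (matching the [y/N] default).
    """
    out = sys.stderr if out is None else out
    suspend()
    try:
        for line in lines:
            print(line, file=out)
        print("Proceed? [y/N] ", end="", file=out, flush=True)
        try:
            answer = read_line()
        except EOFError:
            answer = ""
    finally:
        resume()                    # always restore the dashboard
    return answer.strip().lower() in ("y", "yes")


# Demo with stand-ins so nothing touches the real terminal.
calls = []
sink = io.StringIO()
accepted = confirm_drop_and_resume(
    ["agent:  researcher", "bottle: dev"],
    suspend=lambda: calls.append("endwin"),
    resume=lambda: calls.append("refresh"),
    read_line=lambda: "y\n",
    out=sink,
)
```

The try/finally is the part that matters: the dashboard is restored even if reading the answer fails.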
Re-attach (Enter on agent)
Same handoff pattern the new-agent flow uses. For an agent the
dashboard started this session, the dashboard holds the
DockerBottle handle in its bottles dict and calls
bottle.exec_agent(...). For an agent it discovered via
list_active_slugs (previous-dashboard or external start),
the dashboard synthesizes a one-shot DockerBottle from the
slug (container name is bot-bottle-<slug>, no prompt
path because the agent's claude config already has
--append-system-prompt-file baked in from the original launch)
and runs the same exec. Either way, Enter drops to
full-screen claude; on exit the dashboard re-renders.
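A minimal sketch of the handoff itself, simplified to call docker exec directly rather than going through a DockerBottle handle. The helper names attach_argv and handoff, and the injectable suspend/resume/run parameters, are assumptions for illustration; the container name and claude flags are the ones this PRD states.

```python
import subprocess


def attach_argv(slug):
    """Build the docker exec command for a bottle's agent container."""
    return ["docker", "exec", "-it", f"bot-bottle-{slug}",
            "claude", "--dangerously-skip-permissions"]


def handoff(slug, *, suspend, resume, run=subprocess.call):
    """Suspend curses, run claude full-screen, then restore the dashboard.

    Returns whatever `run` returns (the exit code with subprocess.call).
    """
    suspend()
    try:
        return run(attach_argv(slug))
    finally:
        resume()                    # re-render even if the exec fails


# Demo with stand-ins so nothing touches docker or the terminal.
trace = []


def fake_run(argv):
    trace.append(("run", argv[3]))
    return 0


exit_code = handoff(
    "researcher-1",
    suspend=lambda: trace.append(("suspend", None)),
    resume=lambda: trace.append(("resume", None)),
    run=fake_run,
)
```

Because the command depends only on the slug, the same call works for owned and discovered bottles alike.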
Explicit per-bottle stop
x on a dashboard-owned agent: pop the (cm, bottle) from
the dict, call cm.__exit__(None, None, None) which drives
the existing compose-down + state-settle logic. Refresh the
agents pane.
x on a non-owned agent (discovered via list_active_slugs
but not in bottles dict): no-op with status hint pointing
at ./cli.py cleanup (the existing path that tears down
ANY bot-bottle compose project plus reaps state dirs).
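The owned / non-owned branch for x can be sketched as a single function that returns the status-line text. The function name and the exact hint wording are hypothetical; ExitStack is used here only as a convenient stand-in for the context manager backend.launch would return.

```python
from contextlib import ExitStack


def stop_selected(bottles, slug):
    """Handle `x` on the selected agent and return the status-line text."""
    entry = bottles.pop(slug, None)
    if entry is None:
        # Not started by this dashboard: leave it running, point at cleanup.
        return f"{slug}: not started by this dashboard; use ./cli.py cleanup"
    cm, _bottle = entry
    cm.__exit__(None, None, None)   # compose down + state settle
    return f"{slug}: stopped"


# Demo: one owned bottle whose teardown just records a log line.
log = []
owned = ExitStack()
owned.callback(log.append, "down:researcher-1")
bottles = {"researcher-1": (owned, object())}

first = stop_selected(bottles, "researcher-1")    # owned: torn down
second = stop_selected(bottles, "external-7")     # non-owned: hint only
```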
Dashboard quit
q exits the dashboard process with status 0 without touching any
running bottles. The bottles dict goes out of scope but
because the context managers' __exit__ is never invoked,
the docker compose project keeps running. The next dashboard
invocation discovers the bottles via list_active_slugs and
surfaces re-attach.
This is a real departure from today's ./cli.py start
semantics (which couples bottle lifetime to the process via
ExitStack). It's intentional: the dashboard is a watching +
acting surface, not a lifetime owner.
Implementation chunks
Sized for one PR each.
- Refactor _launch_bottle so the launch + exec_agent pieces are separable. Today's cli/start.py runs both inside one function. Extract prepare_with_preflight(spec, *, render_preflight, prompt_yes) and attach_agent(bottle, *, remote_control). The CLI's existing one-shot use binds them as before; the dashboard binds them with curses-aware render + prompt callables. No behavior change.
- Agent picker modal + new-agent flow. New key n opens the picker; prepare_with_preflight runs against the selected agent; on Y, backend.launch(plan) enters the dashboard's ExitStack; handoff invokes attach_agent.
- Re-attach via Enter on owned agents-pane row. Looks up the slug in the dashboard's bottles map; if present → handoff; else → status-line hint pointing at ./cli.py resume.
- Explicit per-bottle stop (x keybinding). Pop the bottle's close callback off the stack, call it, refresh.
- Quit-cleanup (q). Hook stack.close() into the normal return path. Document the "exiting dashboard tears down every bottle it started" contract in dashboard.py's module docstring.
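To illustrate the callable-injection idea behind the first chunk, here is a hedged sketch of prepare_with_preflight. The keyword names render_preflight and prompt_yes come from this PRD; everything else is assumed: the extra prepare parameter stands in for backend.prepare so the example is self-contained, render_preflight is assumed to receive the plan, and returning None on "no" is a guess at the contract.

```python
def prepare_with_preflight(spec, *, render_preflight, prompt_yes, prepare):
    """Build a plan, show the preflight summary, and ask for confirmation.

    Returns the plan when the operator confirms, otherwise None.
    `prepare` is injected only for this sketch (stand-in for backend.prepare).
    """
    plan = prepare(spec)
    render_preflight(plan)
    return plan if prompt_yes() else None


def fake_prepare(spec):
    """Hypothetical plan builder: derives a slug from the agent name."""
    return {"slug": spec["agent"] + "-1"}


# CLI-style binding would pass print/input-based callables; a dashboard
# binding would pass curses-aware ones. Plain stand-ins are used here.
shown = []
plan = prepare_with_preflight(
    {"agent": "researcher"},
    render_preflight=shown.append,
    prompt_yes=lambda: True,
    prepare=fake_prepare,
)
declined = prepare_with_preflight(
    {"agent": "researcher"},
    render_preflight=shown.append,
    prompt_yes=lambda: False,
    prepare=fake_prepare,
)
```

The function itself never learns whether it is talking to a terminal or a curses modal, which is what lets the CLI and the dashboard share it.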
Resolved questions
- Modal vs. drop-and-resume for preflight Y/N. Resolved: modal. Render the preflight lines centered in a curses sub-window with [y/N] at the bottom; capture the next keypress. If geometry proves fiddly during implementation we'll fall back to drop-and-resume, but modal is the target.
- Agent picker: text-filter typing. Resolved: yes, include filter typing. As the operator types, the list filters to agents whose name matches (substring, case-insensitive). j/k still navigates within the filtered set; Esc clears the filter on first press, exits the picker on the second.
- Container-died-during-claude handling. Keep the design as drafted: transcript snapshot (snapshot_transcript) + mark_preserved if exit code is non-zero + remove from the bottles dict + status line "claude session for [slug] ended with exit N; preserved for resume". The bottle's cm.__exit__ would normally run on stop; here it runs as part of the death-handling (the container is already gone, but compose-down + state-settle still sequence the network removal + state cleanup correctly).
- Double-start of the same agent. Allowed. The picker surfaces a (N running) annotation next to any agent name that already has live bottles in this dashboard's bottles dict OR in list_active_slugs(), so the operator sees the running-count before picking. Selecting an already-running agent name mints a fresh slug for the new bottle as normal.
- Quit behavior. Resolved: q does NOT tear down any bottles. Dashboard exit is purely a UI exit; the bottles dict goes out of scope without invoking __exit__, so the docker compose projects keep running. Bottle teardown is always explicit: per-bottle x (for dashboard-owned), or ./cli.py cleanup (for everything).
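The filter-typing behavior resolved above reduces to two small pure functions. Both names are hypothetical, and the two-stage Esc is modeled as a (new_query, close_picker) pair purely for illustration.

```python
def filter_agents(names, query):
    """Case-insensitive substring filter; an empty query keeps everything."""
    needle = query.lower()
    return [name for name in names if needle in name.lower()]


def on_escape(query):
    """First Esc clears a non-empty filter; Esc on an empty filter closes.

    Returns (new_query, close_picker).
    """
    if query:
        return "", False
    return query, True


names = ["implementer", "researcher", "triage-bot"]
matches = filter_agents(names, "TRI")
everything = filter_agents(names, "")
after_first_esc = on_escape("tri")
after_second_esc = on_escape(after_first_esc[0])
```

j/k navigation would then index into the filtered list rather than the full manifest order.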
Open questions
- Race between handoff and 1s refresh tick. While the dashboard's stdscr.timeout is set, a key press fires the handoff and the dashboard sits in docker exec for minutes. discover_active_agents / discover_pending don't poll during that window. That's harmless on its own (the moment we stdscr.refresh() after exec returns, the next loop iter runs discovery and the panes reflect reality), but it does mean: (a) proposals queued during the claude session won't fire any operator notification until the handoff ends, and (b) a bottle that died mid-claude won't be detectable until the operator exits back to the dashboard. Not blocking v1; flagging as a known limitation to revisit alongside the option-2 embedded-emulator path from the research doc.
References
- PRD 0018: compose-per-instance lifecycle (the backend.launch context-manager contract this PRD layers against)
- PRD 0019: active-agents pane + selection model (the agents-pane row the re-attach + stop verbs hook into)
- docs/research/claude-code-pane-in-dashboard.md: option 1 (handoff) is what attach_agent implements here; options 2 / 3 are out of scope for this PRD
- bot_bottle/cli/start.py:_launch_bottle, the function chunk 1 extracts the prepare + attach pieces out of