docs(prd-0020): record answers to open questions, switch to no-teardown-on-quit
@@ -21,11 +21,12 @@ shape from `docs/research/claude-code-pane-in-dashboard.md`).
|
|||||||
When the operator exits claude, the dashboard re-renders with
|
When the operator exits claude, the dashboard re-renders with
|
||||||
the now-running bottle visible in the agents pane.
|
the now-running bottle visible in the agents pane.
|
||||||
|
|
||||||
Crucially, the bottle's lifetime is owned by the *dashboard
|
Crucially, the bottle's lifetime is decoupled from both the
|
||||||
process*, not by the individual claude session. Exit claude →
|
claude session AND the dashboard process. Exit claude → back to
|
||||||
back to dashboard, bottle still running. Start another agent →
|
dashboard, bottle still running. Start another agent → two
|
||||||
two bottles up at once. Quit the dashboard → all dashboard-
|
bottles up at once. Quit the dashboard → bottles continue
|
||||||
launched bottles tear down.
|
running. Teardown is **always explicit**: the operator presses
|
||||||
|
`x` on an agent, or runs `./cli.py cleanup` later.
|
||||||
|
|
||||||
## Problem
|
## Problem
|
||||||
|
|
||||||
@@ -75,10 +76,12 @@ captures full-merged logs per bottle (PRD 0018). It already
|
|||||||
6. Pressing `x` (or similar — keybinding decided in design)
|
6. Pressing `x` (or similar — keybinding decided in design)
|
||||||
on a selected agent stops just that bottle (compose down +
|
on a selected agent stops just that bottle (compose down +
|
||||||
state cleanup) without quitting the dashboard.
|
state cleanup) without quitting the dashboard.
|
||||||
7. Quitting the dashboard (`q`) tears down every bottle the
|
7. Quitting the dashboard (`q`) leaves every running bottle
|
||||||
dashboard started, unless something has explicitly preserved
|
running. Bottle teardown is always explicit (per-bottle `x`
|
||||||
the state (capability-block, crash). Matches today's
|
or `./cli.py cleanup`). The next `./cli.py dashboard`
|
||||||
start.py teardown semantics.
|
invocation re-discovers them via `list_active_slugs()` and
|
||||||
|
surfaces re-attach for any it can reconstruct context for
|
||||||
|
(see "Cross-dashboard re-attach" below).
|
||||||
|
|
||||||
## Non-goals
|
## Non-goals
|
||||||
|
|
||||||
@@ -91,10 +94,18 @@ captures full-merged logs per bottle (PRD 0018). It already
|
|||||||
the dashboard treats them as read-only-watch (already does
|
the dashboard treats them as read-only-watch (already does
|
||||||
today). Re-attach only applies to bottles the *current
|
today). Re-attach only applies to bottles the *current
|
||||||
dashboard process* started.
|
dashboard process* started.
|
||||||
- **Persisting a "bottle pool" across dashboard runs.** When
|
- **Resurrecting an out-of-process bottle into a new dashboard
|
||||||
the dashboard quits, its bottles go. Resume across dashboard
|
with full re-attach.** A bottle started by `./cli.py start`
|
||||||
invocations is `./cli.py resume <identity>`, which is
|
in another terminal — or by a previous dashboard run, now
|
||||||
unchanged.
|
exited — appears in the agents pane (already does, PRD 0019)
|
||||||
|
and can be re-attached via `docker exec -it claude` because
|
||||||
|
the agent container is still running `sleep infinity`. That's
|
||||||
|
in scope. What's *out* is anything that requires the launch-
|
||||||
|
context object to drive teardown — e.g., the
|
||||||
|
ExitStack-tracked CA + state cleanup `_settle_state` performs
|
||||||
|
today. Cross-dashboard re-attach uses the existing
|
||||||
|
`./cli.py cleanup` for teardown, not an `x` keypress (see
|
||||||
|
open questions).
|
||||||
- **Multi-window UI.** Single curses window, two existing
|
- **Multi-window UI.** Single curses window, two existing
|
||||||
panes (proposals + agents); the agent picker is a modal, not
|
panes (proposals + agents); the agent picker is a modal, not
|
||||||
a third pane.
|
a third pane.
|
||||||
@@ -147,31 +158,69 @@ Today's flow:
|
|||||||
# context exits → compose down → state cleanup
|
# context exits → compose down → state cleanup
|
||||||
```
|
```
|
||||||
|
|
||||||
The proposed dashboard-owned flow:
|
The proposed dashboard-driven flow:
|
||||||
|
|
||||||
```
|
```
|
||||||
./cli.py dashboard
|
./cli.py dashboard
|
||||||
└─ stack = ExitStack()
|
└─ bottles: dict[str, tuple[ContextManager, DockerBottle]] = {}
|
||||||
bottles: dict[str, DockerBottle] = {}
|
|
||||||
|
|
||||||
# operator presses `n`, picks agent
|
# operator presses `n`, picks agent
|
||||||
ctx = backend.launch(plan)
|
cm = backend.launch(plan)
|
||||||
bottle = stack.enter_context(ctx) ← bottle stays alive
|
bottle = cm.__enter__() ← enter but don't bind to a `with`
|
||||||
bottles[plan.slug] = bottle
|
bottles[plan.slug] = (cm, bottle)
|
||||||
|
|
||||||
# operator interacts via:
|
# operator interacts via:
|
||||||
curses.endwin()
|
curses.endwin()
|
||||||
bottle.exec_claude([...], tty=True) ← blocks; returns on Ctrl-D
|
bottle.exec_claude([...], tty=True) ← blocks; returns on Ctrl-D
|
||||||
stdscr.refresh()
|
stdscr.refresh()
|
||||||
# bottle is STILL ALIVE here — only the claude process exited
|
# bottle is STILL ALIVE — only the claude process exited
|
||||||
|
|
||||||
# ... operator does other things, eventually `q`:
|
# ... operator presses `x` on selected agent:
|
||||||
stack.close() ← tears down every bottle
|
cm, _ = bottles.pop(slug)
|
||||||
|
cm.__exit__(None, None, None) ← tears down just that one
|
||||||
|
|
||||||
|
# ... operator presses `q`:
|
||||||
|
return # bottles dict still populated; no teardown
|
||||||
```
|
```
|
||||||
|
|
||||||
The shift is one line of code semantically but the change in
|
Two shifts:
|
||||||
operator experience is real: bottles outlive any single claude
|
|
||||||
session.
|
1. Bottles outlive any single claude session — the dashboard
|
||||||
|
manages enter/exit per bottle, not per attach. Exit claude
|
||||||
|
→ still in the dashboard with the bottle running.
|
||||||
|
2. Bottles outlive the dashboard process itself. Quitting the
|
||||||
|
dashboard does NOT close the context managers; the docker
|
||||||
|
compose project keeps running with the agent container in
|
||||||
|
`sleep infinity`. A subsequent dashboard invocation
|
||||||
|
re-discovers it via `docker compose ls` (PRD 0019's
|
||||||
|
`list_active_slugs`) and surfaces re-attach.
|
||||||
|
|
||||||
|
The trade-off: state cleanup that today runs in
|
||||||
|
`_settle_state` (transcript snapshot, preserve-marker
|
||||||
|
evaluation, state-dir reap) doesn't fire on a quit-while-
|
||||||
|
running bottle. It DOES fire when the operator explicitly
|
||||||
|
stops via `x`, because that calls `cm.__exit__`. For
|
||||||
|
bottles a previous dashboard quit on, `./cli.py cleanup`
|
||||||
|
is the path — its compose-down + state-reap logic
|
||||||
|
already covers the case.
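
A minimal sketch of the lifecycle described above, assuming the object returned by `backend.launch(plan)` is an ordinary context manager. `FakeLaunch` and `Dashboard` are hypothetical stand-ins for illustration, not names from the codebase; the point is only that entering by hand and skipping `__exit__` on quit leaves the bottle untouched.

```python
class FakeLaunch:
    """Hypothetical stand-in for whatever backend.launch(plan) returns."""

    def __init__(self, slug, events):
        self.slug = slug
        self.events = events

    def __enter__(self):
        # Stands in for compose up; returns the bottle handle.
        self.events.append(f"up:{self.slug}")
        return f"bottle:{self.slug}"

    def __exit__(self, exc_type, exc, tb):
        # Stands in for compose down + state settle.
        self.events.append(f"down:{self.slug}")
        return False


class Dashboard:
    """Hypothetical holder for the (cm, bottle) registry."""

    def __init__(self, launch):
        self.launch = launch
        self.bottles = {}

    def start(self, slug):
        cm = self.launch(slug)
        bottle = cm.__enter__()  # entered by hand, not bound to a `with`
        self.bottles[slug] = (cm, bottle)
        return bottle

    def stop(self, slug):
        cm, _ = self.bottles.pop(slug)
        cm.__exit__(None, None, None)  # tears down just this bottle

    def quit(self):
        return 0  # no __exit__ calls: running bottles are left alone


events = []
dash = Dashboard(lambda slug: FakeLaunch(slug, events))
dash.start("a")
dash.start("b")
dash.stop("a")
exit_code = dash.quit()
```

One thing that may be worth verifying during implementation: if the real launch context manager is generator-based (`@contextlib.contextmanager`), CPython closes a suspended generator when it is garbage collected, which runs its `finally` blocks. The sketch uses a class-based manager, where dropping the reference has no such side effect.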
|
||||||
|
|
||||||
|
### Cross-dashboard re-attach
|
||||||
|
|
||||||
|
When the dashboard discovers a bottle in `discover_active_agents`
|
||||||
|
that it didn't itself start (a previous-dashboard or external
|
||||||
|
`./cli.py start` bottle), Enter still attaches via `docker exec
|
||||||
|
-it … claude` — the agent container is running `sleep infinity`
|
||||||
|
exactly the same way regardless of who started it. The only
|
||||||
|
thing the current dashboard lacks for those bottles is the
|
||||||
|
launch-context object needed to drive a clean teardown via
|
||||||
|
`x`.
|
||||||
|
|
||||||
|
For v1 we surface this honestly: pressing `x` on a non-owned
|
||||||
|
agent shows a status hint pointing at `./cli.py cleanup` (or
|
||||||
|
`./cli.py cleanup` targeted at the slug if we add that flag
|
||||||
|
later). The agent stays alive; the operator handles teardown
|
||||||
|
out-of-band. Enter (re-attach) works for both owned and
|
||||||
|
non-owned bottles.
|
||||||
|
|
||||||
### Agent picker
|
### Agent picker
|
||||||
|
|
||||||
@@ -213,40 +262,43 @@ if modal proves fiddly.
|
|||||||
|
|
||||||
### Re-attach (Enter on agent)
|
### Re-attach (Enter on agent)
|
||||||
|
|
||||||
Same handoff pattern the new-agent flow uses. The dashboard
|
Same handoff pattern the new-agent flow uses. For an agent the
|
||||||
already holds the `DockerBottle` for any slug it started —
|
dashboard started this session, the dashboard holds the
|
||||||
`bottle.exec_claude([...], tty=True)` does the right `docker
|
`DockerBottle` handle in its `bottles` dict and calls
|
||||||
exec -it claude …` and returns on session exit. Re-attach is
|
`bottle.exec_claude(...)`. For an agent it discovered via
|
||||||
"already-running" + the same exec call; the agent picker isn't
|
`list_active_slugs` (previous-dashboard or external start),
|
||||||
involved.
|
the dashboard synthesizes a one-shot `DockerBottle` from the
|
||||||
|
slug — container name is `claude-bottle-<slug>`, no prompt
|
||||||
For agents the dashboard didn't start (read-only watch), Enter
|
path because the agent's claude config already has `--append-
|
||||||
is a no-op with a status hint ("dashboard didn't start this
|
system-prompt-file` baked in from the original launch —
|
||||||
bottle; resume with `./cli.py resume <identity>` outside the
|
and runs the same exec. Either way, Enter drops to
|
||||||
dashboard"). PRD-0019's selection model already differentiates
|
full-screen claude; on exit the dashboard re-renders.
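
As a rough illustration of the two re-attach paths, the sketch below resolves a slug either to an owned handle or to a synthesized exec command line. It assumes only the `claude-bottle-<slug>` container naming stated above; `resolve_attach` and `attach_argv` are hypothetical helper names, and the real exec call may carry extra arguments that this PRD elides.

```python
def container_name(slug: str) -> str:
    # Naming convention taken from the text: claude-bottle-<slug>.
    return f"claude-bottle-{slug}"


def attach_argv(slug: str) -> list[str]:
    # Assumed minimal form; the real call may pass further flags.
    return ["docker", "exec", "-it", container_name(slug), "claude"]


def resolve_attach(slug: str, bottles: dict) -> tuple[str, object]:
    """Return ("owned", handle) or ("discovered", argv) for a slug."""
    if slug in bottles:
        _cm, bottle = bottles[slug]
        return ("owned", bottle)
    return ("discovered", attach_argv(slug))


owned_kind, owned_handle = resolve_attach("abc", {"abc": (None, "handle")})
found_kind, found_argv = resolve_attach("xyz", {})
```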
|
||||||
focus; this layer just gates the action.
|
|
||||||
|
|
||||||
### Explicit per-bottle stop
|
### Explicit per-bottle stop
|
||||||
|
|
||||||
`x` on a selected dashboard-owned agent invokes
|
`x` on a dashboard-owned agent: pop the `(cm, bottle)` from
|
||||||
`stack.pop_callback`-style targeted teardown: take that bottle
|
the dict, call `cm.__exit__(None, None, None)` which drives
|
||||||
out of the map, call its `close()` to tear down compose + state,
|
the existing compose-down + state-settle logic. Refresh the
|
||||||
update the agents pane on the next refresh. Bottles the
|
agents pane.
|
||||||
dashboard didn't start (`x` on a read-only-watch row) → no-op
|
|
||||||
with a status hint.
|
`x` on a non-owned agent (discovered via `list_active_slugs`
|
||||||
|
but not in `bottles` dict): no-op with status hint pointing
|
||||||
|
at `./cli.py cleanup` (the existing path that tears down
|
||||||
|
ANY claude-bottle compose project plus reaps state dirs).
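
The ownership gate on `x` could look roughly like this. It is a sketch under stated assumptions: `stop_selected` and the status strings are hypothetical, `active_slugs` stands for whatever `list_active_slugs()` returns, and an `ExitStack` with a callback plays the role of the real launch context manager.

```python
from contextlib import ExitStack


def stop_selected(slug, bottles, active_slugs):
    """Tear down an owned bottle, or return a hint for a non-owned one."""
    if slug in bottles:
        cm, _bottle = bottles.pop(slug)
        cm.__exit__(None, None, None)  # compose down + state settle
        return f"stopped {slug}"
    if slug in active_slugs:
        # Non-owned: no launch context, so point at the out-of-band path.
        return f"{slug} was not started by this dashboard; run ./cli.py cleanup"
    return f"{slug} is not running"


torn_down = []
cm = ExitStack()
cm.callback(torn_down.append, "owned-1")  # records that teardown ran
cm.__enter__()
bottles = {"owned-1": (cm, "handle")}
active = {"owned-1", "external-1"}

owned_msg = stop_selected("owned-1", bottles, active)
external_msg = stop_selected("external-1", bottles, active)
```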
|
||||||
|
|
||||||
### Dashboard quit
|
### Dashboard quit
|
||||||
|
|
||||||
`q` (existing) calls `stack.close()` before exit; every
|
`q` exits the dashboard process with status 0 without touching any
|
||||||
dashboard-launched bottle goes through its normal teardown
|
running bottles. The `bottles` dict goes out of scope but
|
||||||
(`compose down` + state settle). Preserve markers (capability-
|
because the context managers' `__exit__` is never invoked,
|
||||||
block, crash) still keep state across teardown. The dashboard
|
the `docker compose` project keeps running. The next dashboard
|
||||||
process itself returns 0.
|
invocation discovers the bottles via `list_active_slugs` and
|
||||||
|
surfaces re-attach.
|
||||||
|
|
||||||
If the operator wants to keep bottles alive past dashboard
|
This is a real departure from today's `./cli.py start`
|
||||||
exit, the existing path is unchanged: launch them via
|
semantics (which couples bottle lifetime to the process via
|
||||||
`./cli.py start` in a separate terminal. That ownership stays
|
ExitStack). It's intentional: the dashboard is a watching +
|
||||||
out-of-band.
|
acting surface, not a lifetime owner.
|
||||||
|
|
||||||
## Implementation chunks
|
## Implementation chunks
|
||||||
|
|
||||||
@@ -274,54 +326,62 @@ Sized for one PR each.
|
|||||||
down every bottle it started" contract in `dashboard.py`'s
|
down every bottle it started" contract in `dashboard.py`'s
|
||||||
module docstring.
|
module docstring.
|
||||||
|
|
||||||
|
## Resolved questions
|
||||||
|
|
||||||
|
1. **Modal vs. drop-and-resume for preflight Y/N.** Resolved:
|
||||||
|
**modal.** Render the preflight lines centered in a curses
|
||||||
|
sub-window with `[y/N]` at the bottom; capture the next
|
||||||
|
keypress. If geometry proves fiddly during implementation
|
||||||
|
we'll fall back to drop-and-resume, but modal is the target.
|
||||||
|
|
||||||
|
2. **Agent picker: text-filter typing.** Resolved: **yes,
|
||||||
|
include filter typing.** As the operator types, the list
|
||||||
|
filters to agents whose name matches (substring,
|
||||||
|
case-insensitive). j/k still navigates within the filtered
|
||||||
|
set; Esc clears the filter on first press, exits the picker
|
||||||
|
on the second.
|
||||||
|
|
||||||
|
3. **Container-died-during-claude handling.** Keep the design
|
||||||
|
as drafted: transcript snapshot (`snapshot_transcript`) +
|
||||||
|
`mark_preserved` if exit code is non-zero + remove from
|
||||||
|
the `bottles` dict + status line `"claude session for
|
||||||
|
[slug] ended with exit N; preserved for resume"`. The
|
||||||
|
bottle's `cm.__exit__` would normally run on stop; here it
|
||||||
|
runs as part of the death-handling (the container is
|
||||||
|
already gone, but compose-down + state-settle still
|
||||||
|
sequence the network removal + state cleanup correctly).
|
||||||
|
|
||||||
|
4. **Double-start of the same agent.** Allowed. The picker
|
||||||
|
surfaces a `(N running)` annotation next to any agent name
|
||||||
|
that already has live bottles in this dashboard's `bottles`
|
||||||
|
dict OR in `list_active_slugs()`, so the operator sees the
|
||||||
|
running-count before picking. Selecting an already-running
|
||||||
|
agent name mints a fresh slug for the new bottle as
|
||||||
|
normal.
|
||||||
|
|
||||||
|
5. **Quit behavior.** Resolved: **`q` does NOT tear down any
|
||||||
|
bottles.** Dashboard exit is purely a UI exit; the
|
||||||
|
bottles dict goes out of scope without invoking `__exit__`,
|
||||||
|
so the `docker compose` projects keep running. Bottle
|
||||||
|
teardown is always explicit: per-bottle `x` (for
|
||||||
|
dashboard-owned), or `./cli.py cleanup` (for everything).
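
The picker behavior from items 2 and 4 can be sketched as two small pure functions. Both names are hypothetical, and the sketch assumes the caller can supply one agent name per live bottle (owned or discovered); how a slug maps back to an agent name is not specified here.

```python
from collections import Counter


def filter_agents(names, query):
    """Case-insensitive substring filter; an empty query keeps every name."""
    needle = query.casefold()
    return [name for name in names if needle in name.casefold()]


def picker_rows(names, running_agent_names):
    """Annotate each agent name with its live-bottle count, if any."""
    counts = Counter(running_agent_names)
    rows = []
    for name in names:
        count = counts[name]
        rows.append(f"{name} ({count} running)" if count else name)
    return rows


agents = ["Reviewer", "coder", "code-review"]
filtered = filter_agents(agents, "REV")
rows = picker_rows(["coder", "reviewer"], ["coder", "coder"])
```

Keeping the filter and the annotation separate means j/k navigation can operate on the filtered list while the running count is computed from discovery results on each refresh.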
|
||||||
|
|
||||||
## Open questions
|
## Open questions
|
||||||
|
|
||||||
1. **Modal vs. drop-and-resume for preflight Y/N.** Both work;
|
|
||||||
modal is nicer if the curses geometry handling is
|
|
||||||
straightforward. Pick during chunk 2 by prototyping the
|
|
||||||
modal in ~30 lines and seeing if it looks right.
|
|
||||||
|
|
||||||
2. **Agent picker: text-filter typing?** v1 is j/k navigation
|
|
||||||
only. If the manifest has 20+ agents the picker gets noisy;
|
|
||||||
add fzf-style filter input later if needed.
|
|
||||||
|
|
||||||
3. **What happens if `attach_claude` exits because the
|
|
||||||
container died** (not a clean claude exit — e.g., OOM,
|
|
||||||
panic)? Today's `_settle_state` marks the bottle preserved
|
|
||||||
for non-zero exit codes. The dashboard's re-render needs to
|
|
||||||
notice the bottle is gone (compose down or container-not-
|
|
||||||
running state) and surface a status line. Probably:
|
|
||||||
transcript snapshot + mark preserved + remove from
|
|
||||||
`bottles` map + status line "claude session for [slug]
|
|
||||||
ended with exit N; preserved for resume".
|
|
||||||
|
|
||||||
4. **Double-start of the same agent.** Allowed by design — slugs
|
|
||||||
are unique per launch — but the picker should make it clear
|
|
||||||
this is a "start a SECOND bottle" decision, not a "re-enter
|
|
||||||
the first." Probably handled by showing the running-count in
|
|
||||||
the picker row.
|
|
||||||
|
|
||||||
5. **Should `q` confirm before tearing down N running
|
|
||||||
bottles?** A 5-bottle dashboard with 5 in-flight sessions
|
|
||||||
loses non-trivial state on accidental `q`. Probably yes:
|
|
||||||
curses modal "quit and tear down N bottles? [y/N]". Skip
|
|
||||||
confirmation when there are zero owned bottles.
|
|
||||||
|
|
||||||
6. **Race between handoff and 1s refresh tick.** While the
|
6. **Race between handoff and 1s refresh tick.** While the
|
||||||
dashboard's `stdscr.timeout` is set, a key press fires the
|
dashboard's `stdscr.timeout` is set, a key press fires the
|
||||||
handoff and the dashboard sits in `docker exec` for minutes.
|
handoff and the dashboard sits in `docker exec` for minutes.
|
||||||
`discover_active_agents` / `discover_pending` don't poll
|
`discover_active_agents` / `discover_pending` don't poll
|
||||||
during that window, which is fine — the moment we
|
during that window — that's harmless on its own (the moment
|
||||||
`stdscr.refresh()` after exec returns, the next loop iter
|
we `stdscr.refresh()` after exec returns, the next loop
|
||||||
runs discovery and the panes reflect reality. Worth calling
|
iter runs discovery and the panes reflect reality), but
|
||||||
out in the design but no special handling needed.
|
it does mean: (a) proposals queued during the claude
|
||||||
|
session won't fire any operator notification until the
|
||||||
7. **Multi-bottle resource use.** Five bottles up means five
|
handoff ends, and (b) a bottle that died mid-claude won't
|
||||||
compose projects: 5×(agent + pipelock + egress optional +
|
be detectable until the operator exits back to the
|
||||||
git-gate optional + supervise optional) containers, plus 5×2
|
dashboard. Not blocking v1 — flagging as a known limitation
|
||||||
networks. On a 16-GiB host this is fine; on something
|
to revisit alongside the option-2 embedded-emulator path
|
||||||
smaller the operator might want a soft cap or a warning.
|
from the research doc.
|
||||||
Out of v1; flag for follow-up if it bites.
|
|
||||||
|
|
||||||
## References
|
## References
|
||||||
|
|
||||||
|
|||||||