didericis/bot-bottle


PRD 0019: Active agents in the dashboard, agent-scoped edit verbs

Status: Active
Author: didericis
Created: 2026-05-26

Summary

The dashboard today is proposal-centric: it lists every pending supervise tool call across every running bottle and lets the operator approve / modify / reject from one place. The operator-initiated routes edit (e) and pipelock edit (p) verbs are global: they discover every running sidecar of that kind and prompt for which bottle to edit if more than one is up.

This PRD adds a first-class "active agents" view to the dashboard and reshapes the edit verbs to be agent-scoped: the operator picks an agent, then e / p (and any future per-agent verbs) target that agent without a separate prompt.

After this PRD the dashboard answers two questions in one screen:

What's queued for me to act on? (existing proposals view)
What's currently running, and what would I act on if I wanted to push a config edit without an agent prompt?

Problem

Two rough edges in the current dashboard:

No visibility into what's actually running. The dashboard shows only pending proposals. If no agent has called a tool, the screen reads "no pending proposals", even when five bottles are quietly working. The operator has to run docker compose ls (or ./cli.py cleanup -n to see the y/N preview) to find out what's actually live.
e / p re-discover-and-disambiguate every invocation. Today each press of e runs discover_egress_slugs(), finds the running egress sidecars, and prompts if there's more than one. The prompt interrupts the keyboard flow — and once the operator picks a bottle, there's no carry-over to the next edit. Editing pipelock for the same bottle right after is another prompt.

The proposal-centric design is fine for the "agent triggered a remediation" case but flips the relationship the wrong way for the "operator wants to make an unprompted change" case.

Goals / Success Criteria

The dashboard's main screen shows two lists: pending proposals (above) and active agents (below) — both visible at once, no tab / mode switch.
Each active-agent row shows enough for the operator to recognize the bottle at a glance: identity (slug), agent_name (from metadata.json), started_at, and which sidecars are up.
The operator can select an agent row with j / k / arrow keys (the same nav keys already in use for proposals), with a clear keystroke that swaps the active list (e.g., Tab toggles which list j / k moves through).
Pressing e (routes edit) or p (pipelock edit) with an agent selected targets that agent. No disambiguation prompt; no global discover.
Pressing e / p with no agent selected is a no-op (the status line surfaces "no agent selected"). The global discover-and-prompt path comes out; selection in the agents pane is now the only way to scope an edit.
The active-agents list refreshes on the same ~1s tick as the proposals list so an agent starting / stopping is reflected without operator action.

Non-goals

Per-agent proposal filtering. The proposals list stays global across bottles. Filtering ("show me only this agent's proposals") might be a follow-up but isn't this PRD.
Agent lifecycle from the dashboard. Starting / stopping agents stays in ./cli.py start / ./cli.py cleanup. The dashboard reads state; it doesn't change it.
Preserved-but-not-running bottles. The active-agents list is strictly "what's running now" (cross-referenced from docker compose ls). Preserved state dirs without a live project don't appear — ./cli.py resume <identity> is the path for those.
A separate per-agent detail view. The agent rows are one-line summaries. Pressing Enter on a proposal still drops into proposal-detail; we don't add an analogous agent-detail screen in v1.
Replacing the existing --once mode. dashboard --once stays a proposal-only listing. No active-agents output there (different consumers — --once is for scripts; the agents view is for the interactive TUI).

Scope

In scope

A new "active agents" pane in the curses TUI, rendered below the proposals pane.
A discovery helper that returns (slug, agent_name, started_at, services_up) per active compose project. Reads agent_name + started_at from each project's metadata.json, cross-references docker compose ls for the live list.
Tab-toggle selection state: which pane the cursor is in. j / k / arrow keys move within that pane.
Rewire _operator_edit_routes_flow and _operator_edit_allowlist_flow to require a slug from the caller. The discover-and-prompt scaffolding (no-arg discover + single-bottle shortcut + multi-bottle prompt) comes out. The dashboard's key handlers pass the agents-pane selection in directly, or no-op if nothing is selected.
Status-line indicator showing which agent is selected (or "no agent selected" when in the proposals pane).
Tests for the new discovery helper.

Out of scope

Changes to proposal handling (a / m / r / Enter all unchanged).
Changes to the queue-dir / supervise sidecar protocol.
New CLI surface beyond what's in ./cli.py dashboard.
Touching the manifest, compose renderer, launch lifecycle.

Proposed design

Layout

bot-bottle dashboard  (3 pending, 2 active)
─────────────────────────────────────────────────────────
proposals:
  03:14:22  [implementer-cy7a6]  egress-block         abc123…
  03:13:55  [researcher-9xqs1]   pipelock-block       def456…
  03:13:10  [implementer-cy7a6]  capability-block     ghi789…

active agents:
> implementer-cy7a6  implementer   started 02:55:01  [pipelock,egress,git-gate,supervise]
  researcher-9xqs1   researcher    started 02:58:14  [pipelock,supervise]

[selected: implementer-cy7a6]  q quit  Tab switch  j/k nav  e routes  p pipelock  a/m/r/Enter

One screen, two lists. Header counts both totals.
A > cursor and reverse-video highlight mark the currently selected row in the active pane.
Status footer carries [selected: <slug>] (or [no agent selected]) so it's always clear what e / p will target.
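
The layout rules above can be sketched as pure string helpers, kept separate from curses so they are easy to test. This is a minimal sketch, not the dashboard's actual rendering code: AgentRow, header_line, agent_lines, and footer_tag are hypothetical names, and the column widths are assumed to be derived from the longest slug and agent name on screen.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class AgentRow:
    # Hypothetical row shape; the fields follow the columns in the mock-up.
    slug: str
    agent_name: str
    started_at: str  # assumed to be pre-formatted as HH:MM:SS
    services: Sequence[str]


def header_line(pending: int, active: int) -> str:
    """Header that counts both lists."""
    return f"bot-bottle dashboard  ({pending} pending, {active} active)"


def agent_lines(rows: Sequence[AgentRow], selected_index: Optional[int]) -> List[str]:
    """One line per agent; a '>' cursor marks the selected row."""
    if not rows:
        return []
    # Assumption: columns are padded to the widest value currently shown.
    slug_w = max(len(r.slug) for r in rows)
    name_w = max(len(r.agent_name) for r in rows)
    lines = []
    for i, r in enumerate(rows):
        cursor = ">" if i == selected_index else " "
        services = ",".join(r.services)
        lines.append(
            f"{cursor} {r.slug:<{slug_w}}  {r.agent_name:<{name_w}}  "
            f"started {r.started_at}  [{services}]"
        )
    return lines


def footer_tag(selected_slug: Optional[str]) -> str:
    """Footer prefix that states what e / p will target."""
    if selected_slug:
        return f"[selected: {selected_slug}]"
    return "[no agent selected]"
```

Reverse-video highlighting would be applied by the curses layer when it draws the selected line; these helpers only decide the text.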

Selection model

Tab (or Shift-Tab) toggles which pane j / k / arrow keys move through.
Each pane keeps its own selection index. Switching panes doesn't lose the position in the other.
e / p:
- An agent is selected (cursor in the agents pane on a row) → use that agent's slug.
- Otherwise → no-op with a status-line "no agent selected". The pre-PRD global discover-and-prompt code paths come out of _operator_edit_routes_flow and _operator_edit_allowlist_flow.
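
The selection model can be sketched as a small state object, independent of curses. This is an illustration under assumptions: the Selection class and edit_target helper are hypothetical, and clamping at the ends of a list (rather than wrapping) is an assumption the PRD does not specify. Only the which_pane name comes from this document.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

PROPOSALS, AGENTS = "proposals", "agents"


@dataclass
class Selection:
    # Which pane j / k / arrow keys currently move through.
    which_pane: str = PROPOSALS
    # Each pane keeps its own index, so switching panes loses nothing.
    index: Dict[str, int] = field(
        default_factory=lambda: {PROPOSALS: 0, AGENTS: 0}
    )

    def toggle(self) -> None:
        """Tab / Shift-Tab: swap the focused pane."""
        self.which_pane = AGENTS if self.which_pane == PROPOSALS else PROPOSALS

    def move(self, delta: int, pane_len: int) -> None:
        """j / k: move within the focused pane (assumed to clamp, not wrap)."""
        if pane_len <= 0:
            self.index[self.which_pane] = 0
            return
        new = self.index[self.which_pane] + delta
        self.index[self.which_pane] = max(0, min(pane_len - 1, new))

    def selected_agent(self, agent_slugs: List[str]) -> Optional[str]:
        """Slug under the cursor, or None unless the agents pane is focused."""
        if self.which_pane != AGENTS or not agent_slugs:
            return None
        return agent_slugs[min(self.index[AGENTS], len(agent_slugs) - 1)]


def edit_target(sel: Selection, agent_slugs: List[str]) -> Tuple[Optional[str], str]:
    """Resolve what e / p should act on, plus the status-line text."""
    slug = sel.selected_agent(agent_slugs)
    if slug is None:
        return None, "no agent selected"
    return slug, f"selected: {slug}"
```

A key handler for e or p would call edit_target and return early when the slug is None, which is the no-op case described above.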

Active-agent discovery

A new helper discover_active_agents() in dashboard.py returns a list of ActiveAgent(slug, agent_name, started_at, services):

1. list_active_slugs() (already in backend/docker/compose.py) → list of slugs.
2. For each slug: read state/<slug>/metadata.json → agent_name, started_at.
3. For each slug: docker compose -p <project> ps --format json → set of running service names.

Step 3 is the part that's per-bottle and could be slow on hosts with many bottles. Open question below.
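
A minimal sketch of the helper's shape, assuming both the slug lister and the per-bottle services query are injected as callables so the function can be unit-tested without Docker. The parameter names (running_services, state_root) and the read_metadata helper are hypothetical; the "?" fallback for unreadable metadata follows the open question below rather than a settled decision.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class ActiveAgent:
    slug: str
    agent_name: str
    started_at: str
    services: FrozenSet[str]


def read_metadata(state_root: Path, slug: str) -> dict:
    """Read state/<slug>/metadata.json; return {} if missing or malformed."""
    try:
        data = json.loads((state_root / slug / "metadata.json").read_text())
    except (OSError, ValueError):
        # ValueError covers json.JSONDecodeError and bad encodings.
        return {}
    return data if isinstance(data, dict) else {}


def discover_active_agents(
    list_active_slugs: Callable[[], Iterable[str]],
    running_services: Callable[[str], Iterable[str]],  # hypothetical: wraps compose ps
    state_root: Path,
) -> List[ActiveAgent]:
    agents = []
    for slug in list_active_slugs():
        meta = read_metadata(state_root, slug)
        agents.append(
            ActiveAgent(
                slug=slug,
                # Keep the row even when metadata is unreadable.
                agent_name=str(meta.get("agent_name", "?")),
                started_at=str(meta.get("started_at", "?")),
                services=frozenset(running_services(slug)),
            )
        )
    return agents
```

In production the running_services callable would be the part that shells out to Docker; keeping it injectable leaves room to swap the per-bottle call for a single batched query later.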

Implementation chunks

Sized small.

Discovery helper + dataclass. Pure-ish: takes list_active_slugs() as injected, reads metadata + queries compose ps. Unit-test with mocked subprocess. No UI yet.
Render the agents pane. Wire discover_active_agents into _main_loop's tick, render below proposals, no selection model yet (cursor stays in proposals).
Selection state + Tab toggle. Add the which_pane variable, route j/k/arrow based on it, status footer.
Agent-scoped e / p. Pass the selected slug into the edit flows when the agents pane is focused. When the proposals pane is focused, e / p are a no-op with a "no agent selected" status line, and the global discover-and-prompt scaffolding is removed.

Open questions

compose ps per bottle: too slow? On a host with 10+ active bottles, calling docker compose -p <X> ps per project on every 1s tick is 10+ subprocess calls per second. Options: (a) cache the services list and refresh on a slower cadence (e.g., every 5s); (b) skip the per-bottle services column and just show the slug + agent name; (c) one docker ps --filter label=... call that buckets containers by com.docker.compose.project label. Probably (c) — one call, no per-bottle fanout.
What if metadata.json is missing or stale? For a bottle started by pre-chunk-3 code (no compose_project field), or a state dir written by a tool we don't know about, the metadata read can fail. Render with agent_name = ? rather than dropping the row.
Selection persistence across refresh ticks. If the currently-selected agent is no longer running (it exited between ticks), the selection should fall back to the previous row, not jump to the top. Mirrors the existing proposals-list behavior.
Color / highlight for the selected agent. The proposals pane uses green for newly-arrived. Agents could use a different attribute (e.g., reverse video for selection, no color for the row itself). Aesthetic decision; pick something readable in the standard 8-color palette.
Selecting a proposal cross-selects its agent? Possible UX: highlighting a proposal in the proposals pane could auto-move the agents-pane cursor to that proposal's bottle. Cute, but probably confusing — the explicit Tab toggle is clearer. Out of v1.
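
Option (c) from the first open question could look roughly like the sketch below: one docker ps call whose output is bucketed by the com.docker.compose.project label. The function names are hypothetical, and the sketch assumes the compose project name is what the dashboard maps back to a slug, which this PRD does not spell out. The parsing is split from the subprocess call so it can be tested without Docker.

```python
import subprocess
from collections import defaultdict
from typing import Dict, Set

# Tab-separated "<project>\t<service>" per running container.
PS_FORMAT = (
    '{{.Label "com.docker.compose.project"}}'
    "\t"
    '{{.Label "com.docker.compose.service"}}'
)


def bucket_services(ps_output: str) -> Dict[str, Set[str]]:
    """Group service names by compose project from docker ps output."""
    buckets: Dict[str, Set[str]] = defaultdict(set)
    for line in ps_output.splitlines():
        project, sep, service = line.partition("\t")
        service = service.strip()
        # Skip blank lines and containers missing either label.
        if sep and project and service:
            buckets[project].add(service)
    return dict(buckets)


def running_services_by_project() -> Dict[str, Set[str]]:
    """One subprocess call per tick instead of one per bottle."""
    result = subprocess.run(
        [
            "docker", "ps",
            "--filter", "label=com.docker.compose.project",
            "--format", PS_FORMAT,
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return bucket_services(result.stdout)
```

Because docker ps lists only running containers by default, a project that has exited simply drops out of the returned mapping on the next tick.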

References

PRD 0013 — supervise sidecar (proposals + queue)
PRD 0014 / 0015 / 0016 — the apply flows the edit verbs drive
PRD 0018 — compose-per-instance; list_active_slugs + metadata.json source-of-truth
