didericis-codex c08b09dc9f refactor!: rename project to bot-bottle

Assisted-by: Codex

2026-05-28 17:56:14 -04:00

Claude-code pane in the dashboard

Question

The dashboard today shows pending proposals (top pane) and active agents (bottom pane, PRD 0019). Selecting an agent and pressing e / p invokes operator-scoped edits. The next move the user wants is a way to interact with the claude-code session inside the selected bottle without leaving the dashboard — type at it, read its output, return focus to the dashboard.

What's the cheapest path to that, and where does it bottom out?

What "interact" means

Today the flow is bimodal. ./cli.py start <agent> brings the bottle up and immediately drops you into an interactive docker exec -it bot-bottle-<slug> claude ... — claude-code owns the whole terminal until you Ctrl-D out, at which point the bottle tears down. The dashboard (./cli.py dashboard) is a separate invocation that watches across bottles but never exposes the claude TUI itself.

The user wants the dashboard to also be a claude-code session host: one of the dashboard's panes (or a press-key-to-focus mode) is a live claude-code terminal connected to the agent container the operator is sitting on in the agents pane.

That changes the dashboard's job from "screen of metadata" to "terminal multiplexer that also draws metadata." The interesting question is whether that change is small or unbounded.

The core problem

claude-code is a TUI in its own right. It runs as an interactive Node process, expects a real PTY, and drives its own cursor positioning, color, mouse events, and key bindings. The dashboard is also a TUI (curses), and curses owns the terminal's input and output streams while it's active.

Two TUIs sharing one terminal can't both be "running" without one of them giving up screen control to the other. The decision shape is which one yields, and where the boundary lives.

There are exactly three realistic ways to resolve this:

Handoff — the dashboard releases the terminal when the user wants to talk to claude-code, claude-code takes over full-screen, and the dashboard re-takes control when claude-code exits or is detached. Like how e (routes edit) already shells to $EDITOR today (curses.endwin() → run editor → stdscr.refresh()).
Embedded emulator — the dashboard runs claude-code in a PTY it owns, parses claude-code's ANSI escape stream ourselves, and paints the resulting cell grid into a curses pane. Keypresses inside the pane get routed to the PTY's stdin; the dashboard renders metadata in the other panes alongside.
External multiplexer — the dashboard doesn't render the claude-code session at all. It asks tmux / screen / a terminal emulator to open it in a real adjacent pane (split window, new tab), and treats the multiplexer as the coordinator instead of trying to be one.

Below are the actual costs.

Option 1: Handoff

The dashboard sees a key (say Enter on a selected agent in the agents pane). It calls curses.endwin(), then subprocess.run(["docker", "exec", "-it", "bot-bottle-<slug>", "claude", "--dangerously-skip-permissions"]). claude-code takes the terminal full-screen. When the operator exits claude-code (Ctrl-D, /exit), the subprocess returns; the dashboard calls stdscr.refresh() to redraw and resume.
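The flow above can be sketched as follows. This is a minimal sketch, not the project's code: the names claude_exec_argv and handoff_to_claude are hypothetical, and the run / endwin parameters exist only so the sequence can be exercised without a real terminal or a running container.

```python
import curses
import subprocess


def claude_exec_argv(slug):
    """Build the docker exec command for the bottle named by `slug`."""
    return [
        "docker", "exec", "-it", f"bot-bottle-{slug}",
        "claude", "--dangerously-skip-permissions",
    ]


def handoff_to_claude(stdscr, slug, run=subprocess.run, endwin=curses.endwin):
    """Release the terminal, run claude-code in the foreground, then redraw.

    `run` and `endwin` are injectable (hypothetical parameters) so the
    ordering can be checked without curses being initialised.
    """
    endwin()                      # leave curses mode; the terminal is restored
    try:
        result = run(claude_exec_argv(slug))
        return result.returncode
    finally:
        stdscr.refresh()          # re-enter curses mode and repaint
```

The try / finally is the one design choice worth keeping: if the exec fails or is interrupted, the dashboard still takes the screen back instead of leaving the operator at a half-restored terminal.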

What's good:

It's ~20 lines of code. The plumbing (curses.endwin / refresh + a docker exec) already exists for the editor flow.
Zero new dependencies. claude-code runs in its real PTY exactly the way it does today.
No "embedded TUI inside another TUI" weirdness. Keybinding collisions, terminal-resize stories, and scrollback are all whatever claude-code already does.

What's not good:

It's not really "a pane in the dashboard." It's "press Enter to leave the dashboard, talk to claude, come back." The user wanted side-by-side; this is side-by-time.
The dashboard can't auto-refresh while claude-code has the terminal. If a new proposal lands while you're in the claude session, you won't see it until you exit.
Notifications during the claude session need a separate channel (sound? OS notification?). Otherwise the operator's reason for using the dashboard ("watch everything in one place") partially evaporates.
Session reuse needs a decision. A bottle's agent container runs sleep infinity, and claude is started on demand with docker exec (PRD 0018 chunk 3). Re-entering with another exec would start a second claude process. We'd want either to attach to the first one (tricky, because docker exec has no "attach to an existing exec" verb) or to treat first-time entry as "start the session" and stash a marker so that re-entry is a resume rather than a fresh process.

This is the v1 the project's existing code-shape strongly prefers. It clears the bar of "let me talk to claude-code without quitting ./cli.py dashboard."

Option 2: Embedded emulator

The dashboard opens a PTY (stdlib pty module), spawns docker exec -it … claude attached to it, and runs a terminal emulator in-process that consumes claude-code's output stream and maintains a virtual screen buffer. The buffer's current state gets painted into a curses pane every refresh tick. Keypresses received inside the focused pane get written to the PTY's input fd.
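The spawn side of this can be sketched with the stdlib alone. The sketch below runs an arbitrary command on a fresh PTY and drains its output to completion; run_in_pty is a hypothetical helper, and a real pane would hand each chunk to the emulator as it arrives rather than collecting the whole stream.

```python
import os
import pty
import subprocess


def run_in_pty(argv, chunk_size=4096):
    """Run `argv` attached to a fresh PTY and return everything it wrote."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(
            argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd,
            close_fds=True,
        )
    except Exception:
        os.close(master_fd)
        raise
    finally:
        os.close(slave_fd)        # the parent keeps only the master side

    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, chunk_size)
            except OSError:       # Linux reports EIO once the child side is closed
                break
            if not data:          # other platforms signal EOF with b""
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
        proc.wait()
    return b"".join(chunks)
```

For the dashboard, argv would be the docker exec -it … claude command, and the bytes read from master_fd are the escape stream the emulator has to parse.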

This is what tmux does. It is also what every "terminal in a TUI" demo does. The challenge is everything between "run a PTY" and "render its output correctly."

What you need to implement (or take as a dep):

ANSI/VT escape parsing. claude-code uses xterm-class escape sequences for cursor positioning, color, scroll regions, alternate screen buffer (for the prompt UI), mouse reporting, and so on. The full xterm spec is dozens of pages. Sloppy parsing produces a corrupted display the user will hate.
A screen buffer model. Cells with attributes (foreground, background, bold, underline, italic, inverse). Cursor position. Saved cursor. Alternate screen. Scrollback.
Resize protocol. claude-code asks the PTY for its size via TIOCGWINSZ and re-lays out on SIGWINCH. The dashboard has to size the PTY to the pane it's rendering into and propagate SIGWINCH when curses says the terminal resized.
Input routing. When the pane has focus, keypresses are written to the PTY. When the dashboard has focus, keypresses are consumed by the dashboard. Define a prefix key (like tmux's Ctrl-B) that toggles focus, and document that claude-code's own use of that key is now intercepted.
Output throttling. claude-code can emit megabytes of output in a streaming response. The dashboard's 1s refresh tick is too slow to render character-at-a-time; you want the PTY reader to coalesce output and the renderer to repaint at a shorter interval than the main loop's getch timeout.
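Two of the items above, resize and input routing, are small enough to sketch. Assumptions: PREFIX_KEY, set_pty_size, get_pty_size, and FocusRouter are hypothetical names; Ctrl-B is borrowed from tmux as a placeholder, not a project decision; and whether the resize reaches claude-code inside the container depends on docker exec forwarding it, which should be verified.

```python
import fcntl
import struct
import termios

PREFIX_KEY = 0x02  # Ctrl-B, mirroring tmux (placeholder choice)


def set_pty_size(fd, rows, cols):
    """Set the PTY's window size; the kernel notifies the foreground process group."""
    winsize = struct.pack("HHHH", rows, cols, 0, 0)
    fcntl.ioctl(fd, termios.TIOCSWINSZ, winsize)


def get_pty_size(fd):
    """Read the PTY's window size back as (rows, cols)."""
    packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    rows, cols, _xpixel, _ypixel = struct.unpack("HHHH", packed)
    return rows, cols


class FocusRouter:
    """Decide whether a key belongs to the dashboard or to the embedded pane."""

    def __init__(self):
        self.pane_focused = False

    def route(self, key):
        """Return (destination, key); destination is 'toggle', 'pty', or 'dashboard'."""
        if key == PREFIX_KEY:
            self.pane_focused = not self.pane_focused
            return ("toggle", None)
        return ("pty" if self.pane_focused else "dashboard", key)
```

The router swallows the prefix key in both directions, which is exactly the interception the list item says has to be documented.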

The stdlib has pty (the spawn side) and you can read/write the master fd by hand. It does not have an ANSI parser; the established Python library for this is pyte (pyte.readthedocs.io): pure Python, LGPL-licensed, with wcwidth as its only dependency. ~3k lines. It would be the project's first runtime dependency beyond stdlib.

Even with pyte, the integration is non-trivial: you're re-rendering a 24x80-ish (or whatever fits) screen buffer into curses cells on every tick, dealing with attribute mapping (pyte's color enum → curses color pair), and handling mouse events through the pane. Plan on ~600–1200 lines, not 200.
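The attribute-mapping part can be sketched without the emulator itself. The sketch assumes the emulator reports colors as plain names such as "red" or "default"; that naming is an assumption to check against the library, and NAME_TO_CURSES and PairAllocator are hypothetical. curses only addresses colors through numbered pairs, so each (foreground, background) combination is allocated a pair lazily.

```python
import curses

# Assumed color names. Check them against what the emulator library
# actually reports before relying on this table.
NAME_TO_CURSES = {
    "black": curses.COLOR_BLACK,
    "red": curses.COLOR_RED,
    "green": curses.COLOR_GREEN,
    "yellow": curses.COLOR_YELLOW,
    "brown": curses.COLOR_YELLOW,   # some emulators call ANSI yellow "brown"
    "blue": curses.COLOR_BLUE,
    "magenta": curses.COLOR_MAGENTA,
    "cyan": curses.COLOR_CYAN,
    "white": curses.COLOR_WHITE,
    "default": -1,                  # requires curses.use_default_colors()
}


class PairAllocator:
    """Hand out curses color-pair numbers for (fg, bg) names on first use."""

    def __init__(self, init_pair=curses.init_pair, max_pairs=64):
        self._init_pair = init_pair   # injectable so this runs without a terminal
        self._max_pairs = max_pairs
        self._pairs = {}

    def pair_for(self, fg_name, bg_name):
        key = (NAME_TO_CURSES.get(fg_name, -1), NAME_TO_CURSES.get(bg_name, -1))
        if key == (-1, -1):
            return 0                  # pair 0 is the terminal's default
        if key not in self._pairs:
            number = len(self._pairs) + 1
            if number >= self._max_pairs:
                return 0              # out of pairs: fall back to the default
            self._init_pair(number, key[0], key[1])
            self._pairs[key] = number
        return self._pairs[key]
```

In the render loop the returned number would be passed through curses.color_pair(n) and OR-ed with attributes such as curses.A_BOLD when painting each cell.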

Open trap-doors:

Claude-code uses bracketed paste, alternate screen, and occasionally raw terminal control for its prompt input. Some of these features stress the emulator harder than vim does — alt-screen has to be supported or claude's command-prompt UI corrupts the line above it. pyte claims to handle alt-screen; verify before committing.
Scrollback in claude is /transcript-driven, not terminal scrollback. A small pane height means you only see the last 10–20 lines of output without leaving the dashboard, which is the wrong shape for a 200-line streaming response. You'd want to make the pane resizable or open a full-height "expand" mode (which is just option 1, the handoff, with extra steps).
Multiple agents = multiple PTYs running concurrently. If the user wants to monitor 3 bottles, the dashboard is now holding 3 PTYs open and parsing 3 ANSI streams in parallel. Memory + CPU costs are bounded but nonzero; design the PTY-per-agent state machine carefully.
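Holding several PTYs open does not need threads. A rough sketch using the stdlib selectors module follows; PtyPool is a hypothetical name, and the sketch handles only the read side. It reads at most one chunk from each ready descriptor per poll and buffers the bytes per agent, which also gives the renderer a natural place to coalesce output.

```python
import os
import selectors


class PtyPool:
    """Track one readable fd per agent and buffer whatever each one emits."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()
        self._pending = {}            # slug -> bytearray of unrendered output

    def add(self, slug, fd):
        self._selector.register(fd, selectors.EVENT_READ, data=slug)
        self._pending[slug] = bytearray()

    def remove(self, slug, fd):
        self._selector.unregister(fd)
        self._pending.pop(slug, None)

    def poll(self, timeout=0.05, chunk_size=65536):
        """Read once from every ready fd; return the slugs that produced output."""
        updated = []
        for key, _events in self._selector.select(timeout):
            try:
                data = os.read(key.fd, chunk_size)
            except OSError:           # PTY closed on the child side
                data = b""
            if data:
                self._pending[key.data].extend(data)
                updated.append(key.data)
        return updated

    def take(self, slug):
        """Return and clear the buffered output for one agent."""
        data = bytes(self._pending[slug])
        self._pending[slug].clear()
        return data
```

A closed descriptor stays "ready" forever, so the caller still has to notice an empty read and call remove; that bookkeeping is part of the per-agent state machine the text warns about.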

This is the option that delivers the "pane in the dashboard" literal request. It's the right answer if the user's day-to-day involves watching multiple bottles' output simultaneously without context-switching. It's the wrong answer if they mostly want one focused session at a time with proposals visible.

Option 3: External multiplexer

The dashboard binds a key (e.g. Enter on agent) to tmux split-window -h 'docker exec -it bot-bottle-<slug> claude' when run inside a tmux session, or to osascript-driven iTerm pane spawning on macOS, or to wezterm cli spawn if the user is on wezterm.
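For the tmux case the binding reduces to building one command. A minimal sketch, with tmux_split_argv as a hypothetical helper; it returns None outside tmux so the caller can fall back to something else (the handoff, for instance).

```python
import shlex


def tmux_split_argv(slug, environ):
    """Return the tmux command that opens claude in a side pane, or None outside tmux."""
    if not environ.get("TMUX"):
        return None
    # tmux runs the pane command through a shell, so pass it as one quoted string.
    inner = shlex.join(["docker", "exec", "-it", f"bot-bottle-{slug}", "claude"])
    return ["tmux", "split-window", "-h", inner]
```

The dashboard would pass os.environ and hand a non-None result to subprocess.run; it does not need to call curses.endwin() here, because tmux draws the new pane itself.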

What's good:

The "real terminal in a real pane" is solved by tools the user already trusts. tmux's terminal emulation is correct; iTerm's is correct; wezterm's is correct. We're not reimplementing any of them.
Multi-bottle parallelism is automatic — the user opens one pane per agent, the multiplexer renders them.
Implementation cost is tiny: ~50 lines of "if TMUX env is set, shell out to tmux split-window."

What's not good:

It requires the user to be running in a multiplexer. Outside one (plain Terminal.app, vscode integrated terminal, etc.) the verb either falls back to handoff or just fails.
It splits the operator's mental model. The dashboard is one window, claude-code panes are other windows; the dashboard's "agents pane" no longer matches the visible reality (some agents have an attached pane, others don't, and the dashboard doesn't know which).
We don't actually own the layout. tmux's pane sizing rules are tmux's; the dashboard can't enforce "agent pane on the right, dashboard on the left."

This is the right answer if the project decides "we shouldn't be a multiplexer; let tmux be a multiplexer." It's the wrong answer if the dashboard wants to be the operator's primary surface — which the user's question implies it does.

Recommendation

Build Option 1 first (handoff). It clears the bar of "let me talk to claude without quitting the dashboard," it's a ~20-line patch over the existing e / p infrastructure, and it carries no new dependencies. Critically, it lets the user confirm whether the workflow actually wants embedded panes: if Option 1 turns out to be enough in practice (Enter to drop in, Ctrl-D back out, watch proposals between sessions), the embedded-emulator complexity is unnecessary.

Treat Option 2 as a six-week project, to be planned only if Option 1 is observably insufficient. The pyte dep is acceptable, but the integration is real engineering: terminal emulator correctness, pane sizing, input routing, multi-PTY state management. Schedule it; don't bolt it on. If the user lives in 3-bottle days and needs simultaneous output, this is the option; otherwise it's premature optimization.

Treat Option 3 as a niche augmentation, not a primary path. Detect TMUX / WEZTERM_PANE / ITERM_SESSION_ID at startup; if present, add a split keybinding that delegates pane creation to the multiplexer. The dashboard remains the primary interface; the multiplexer is convenience for power users.
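The startup detection could look like the sketch below. detect_multiplexer and the returned labels are hypothetical; the environment variable names are the ones listed above. TMUX is checked first because a tmux session running inside iTerm or wezterm inherits their variables too, and in that case tmux is the layer that actually owns the panes.

```python
def detect_multiplexer(environ):
    """Return 'tmux', 'wezterm', 'iterm', or None based on the environment."""
    candidates = (
        ("tmux", "TMUX"),
        ("wezterm", "WEZTERM_PANE"),
        ("iterm", "ITERM_SESSION_ID"),
    )
    for name, variable in candidates:
        if environ.get(variable):
            return name
    return None
```

A None result simply means the split keybinding is not offered and the dashboard behaves as it does without Option 3.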

Followups worth checking before committing

What does claude-code's PTY-resize behavior look like at small heights? Drive it at 24×40, 10×80, 8×40 and see if it blows up. The dashboard's bottom pane is going to be small.
Is there a way to docker exec into an already-running exec rather than spawning a new claude process per attach? If not, the dashboard needs its own session-state model: "is there an exec running for this slug? attach to it. otherwise start one."
How does claude-code handle SIGWINCH at the moment? If it re-lays out cleanly, the embedded-emulator story gets easier; if it corrupts, Option 2 needs special-case handling.
Does docker exec -i (no -t) preserve enough of the TTY contract for claude-code to start at all? Some apps refuse to launch without a real TTY; the embedded-emulator option needs the PTY allocated on the host side and the exec attached to it, which is the harder of the two setups.

References

PRD 0019 — active agents pane + selection model (the selection-source for whichever option lands)
PRD 0018 chunk 3 — agent container runs sleep infinity; claude is invoked via docker exec -it (the attachment-point this doc is layering against)
bot_bottle/cli/dashboard.py:_operator_edit_flow — the existing curses.endwin → shell out → stdscr.refresh() pattern Option 1 would clone
pyte: https://pyte.readthedocs.io/ — the candidate terminal-emulator library if Option 2 is picked
