8cd867f3d2
Survey the three realistic ways to surface a claude-code session
inside the dashboard TUI:
1. Handoff — drop curses, foreground claude, restore on exit
(the existing `e`/`p` pattern, extended). Minimal code,
side-by-time rather than side-by-side.
2. Embedded emulator — own a PTY, parse claude-code's ANSI
stream via `pyte`, paint it into a curses pane. Real
"pane in the dashboard" but a six-week build with one new
dep and several integration trap-doors (alt-screen, resize,
input routing, multi-PTY state).
3. External multiplexer — delegate pane creation to tmux /
iTerm / wezterm when detected. Tiny code, but splits the
operator's mental model and gives up layout control.
Recommendation: ship Option 1 first; defer Option 2 to "only if
Option 1 is observably insufficient"; treat Option 3 as a
niche augmentation for power users.
Calls out four followups worth verifying before committing
(PTY behavior at small sizes, attach-to-existing-exec, SIGWINCH
handling, `-it` vs `-i` for the embedded path).
286 lines
13 KiB
Markdown
# Claude-code pane in the dashboard

## Question

The dashboard today shows pending proposals (top pane) and active
agents (bottom pane, PRD 0019). Selecting an agent and pressing
`e` / `p` invokes operator-scoped edits. The next move the user
wants is **a way to interact with the claude-code session inside
the selected bottle without leaving the dashboard**: type at it,
read its output, return focus to the dashboard.

What's the cheapest path to that, and where does it bottom out?

## What "interact" means

Today the flow is bimodal. `./cli.py start <agent>` brings the
bottle up and immediately drops you into an interactive
`docker exec -it claude-bottle-<slug> claude ...`; claude-code
owns the whole terminal until you Ctrl-D out, at which point the
bottle tears down. The dashboard (`./cli.py dashboard`) is a
*separate* invocation that watches across bottles but never
exposes the claude TUI itself.

The user wants the dashboard to *also* be a claude-code session
host: one of the dashboard's panes (or a press-key-to-focus
mode) is a live claude-code terminal connected to the agent
container the operator has selected in the agents pane.

That changes the dashboard's job from "screen of metadata" to
"terminal multiplexer that also draws metadata." The interesting
question is whether that change is small or unbounded.

## The core problem

claude-code is a TUI in its own right. It runs as an
interactive Node process, expects a real PTY, and drives its own
cursor positioning, color, mouse events, and key bindings. The
dashboard is *also* a TUI (curses), and curses owns the
terminal's input and output streams while it's active.

Two TUIs sharing one terminal can't both be "running" without
one of them giving up screen control to the other. The decision
is which one yields, and where the boundary lives.

There are exactly three realistic ways to resolve this:

1. **Handoff**: the dashboard releases the terminal when the
   user wants to talk to claude-code, claude-code takes over
   full-screen, and the dashboard retakes control when
   claude-code exits or is detached. This is how `e` (routes
   edit) already shells out to `$EDITOR` today
   (`curses.endwin()` → run editor → `stdscr.refresh()`).
2. **Embedded emulator**: the dashboard runs claude-code in a
   PTY it owns, parses claude-code's ANSI escape stream
   itself, and paints the resulting cell grid into a
   curses pane. Keypresses inside the pane get routed to the
   PTY's stdin; the dashboard renders metadata in the other
   panes alongside.
3. **External multiplexer**: the dashboard doesn't render the
   claude-code session at all. It asks tmux / screen / a
   terminal emulator to open it in a real adjacent pane (split
   window, new tab), and treats the multiplexer as the
   coordinator instead of trying to be one.

Below are the actual costs.

## Option 1: Handoff

The dashboard sees a key (say Enter on a selected agent in the
agents pane). It calls `curses.endwin()`, then
`subprocess.run(["docker", "exec", "-it", "claude-bottle-<slug>",
"claude", "--dangerously-skip-permissions"])`. claude-code takes
the terminal full-screen. When the operator exits claude-code
(Ctrl-D, `/exit`), the subprocess returns; the dashboard calls
`stdscr.refresh()` to redraw and resume.

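The handoff described above can be sketched in a few lines. This is a minimal illustration rather than the project's actual code: `claude_exec_argv` and `handoff` are hypothetical names, and the `suspend` / `run` parameters exist only so the flow can be exercised without a real terminal or container.

```python
import curses
import subprocess


def claude_exec_argv(slug: str) -> list[str]:
    """Build the docker exec command for one bottle (hypothetical helper)."""
    return [
        "docker", "exec", "-it", f"claude-bottle-{slug}",
        "claude", "--dangerously-skip-permissions",
    ]


def handoff(stdscr, slug: str, *, suspend=curses.endwin, run=subprocess.run) -> int:
    """Yield the terminal to claude-code, then take it back.

    `suspend` and `run` are injectable so the sequence can be tested
    with fakes; in the dashboard they would keep their defaults.
    """
    suspend()                       # leave curses mode, restore the tty
    try:
        result = run(claude_exec_argv(slug), check=False)
        return result.returncode    # claude-code's exit status
    finally:
        stdscr.refresh()            # first refresh after endwin() resumes curses
```

The `try` / `finally` is the one design choice worth keeping: if `docker exec` fails or is interrupted, the dashboard still gets its screen back.
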
What's good:

- It's ~20–30 lines of code. The plumbing (`curses.endwin` /
  refresh + a `docker exec`) already exists for the editor flow.
- Zero new dependencies. claude-code runs in its real PTY exactly
  the way it does today.
- No "embedded TUI inside another TUI" weirdness. Keybinding
  collisions, terminal-resize stories, scrollback are all
  whatever claude-code already does.
- Already-running session reuse is possible in principle: a
  bottle's agent container runs `sleep infinity` and
  `docker exec`s claude in on demand (PRD 0018 chunk 3). The
  catch is that re-entering with another `exec` would
  start a *second* claude process; we'd want to either attach
  to the first one (tricky, since `docker exec` doesn't have an
  "attach to existing exec" verb) or treat first-time entry as
  "start the session" and stash a marker so re-entry is a
  resume rather than a fresh process.

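The "stash a marker" idea from the last bullet could look roughly like the sketch below. Everything here is hypothetical: `SessionMarkers` is an invented name, the marker lives only in memory, and which arguments make claude-code resume a session is left to the caller because this document does not settle it.

```python
from dataclasses import dataclass, field


@dataclass
class SessionMarkers:
    """Remembers which bottles already had a claude session started."""

    started: set[str] = field(default_factory=set)

    def argv_for(self, slug: str, resume_args: tuple[str, ...]) -> list[str]:
        """Return the exec command; append resume_args on re-entry only."""
        base = [
            "docker", "exec", "-it", f"claude-bottle-{slug}",
            "claude", "--dangerously-skip-permissions",
        ]
        if slug in self.started:
            return base + list(resume_args)   # re-entry: ask for a resume
        self.started.add(slug)                # first entry: stash the marker
        return base

    def forget(self, slug: str) -> None:
        """Drop the marker, e.g. when the bottle is torn down."""
        self.started.discard(slug)
```

A caller might pass something like `("--continue",)` as `resume_args`, assuming that flag matches what "resume" means here; that assumption should be checked against the claude-code CLI before relying on it.
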
What's not good:

- It's not really "a pane in the dashboard." It's "press Enter
  to leave the dashboard, talk to claude, come back." The user
  wanted side-by-side; this is side-by-time.
- The dashboard can't auto-refresh while claude-code has the
  terminal. If a new proposal lands while you're in the claude
  session, you won't see it until you exit.
- Notifications during the claude session need a separate
  channel (sound? OS notification?). Otherwise the operator's
  reason for using the dashboard ("watch everything in one
  place") partially evaporates.

This is the v1 the project's existing code shape strongly
prefers. It clears the bar of "let me talk to claude-code
without quitting `./cli.py dashboard`."

## Option 2: Embedded emulator

The dashboard opens a PTY (stdlib `pty` module), spawns
`docker exec -it … claude` attached to it, and runs a terminal
emulator in-process that consumes claude-code's output stream
and maintains a virtual screen buffer. The buffer's current
state gets painted into a curses pane every refresh tick.
Keypresses received inside the focused pane get written to the
PTY's master fd.

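The "spawn side" of this option is small; a rough sketch using only the stdlib follows. The function names are invented for illustration, the child in the usage note is a stand-in for `docker exec -it … claude`, and job control (giving the child a controlling terminal) is deliberately left out.

```python
import os
import pty
import select
import subprocess


def spawn_in_pty(argv: list[str]) -> tuple[subprocess.Popen, int]:
    """Start argv with a fresh PTY on stdin/stdout/stderr.

    Returns the process and the master fd the dashboard would own.
    """
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd)
    os.close(slave_fd)              # the parent keeps only the master side
    return proc, master_fd


def read_available(master_fd: int, timeout: float = 0.05) -> bytes:
    """Coalesce whatever the child has written so far (may be empty)."""
    chunks = []
    while select.select([master_fd], [], [], timeout)[0]:
        try:
            data = os.read(master_fd, 65536)
        except OSError:             # Linux reports EIO once the child side closes
            break
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks)
```

Bytes returned by `read_available` would be fed to the emulator, and keypresses for the pane would go the other way with `os.write(master_fd, key_bytes)`. For a quick check, `spawn_in_pty([sys.executable, "-c", "print('hi')"])` followed by `read_available(fd, 1.0)` exercises the round trip without Docker.
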
This is what tmux does. It is also what every "terminal in
a TUI" demo does. The challenge is everything between "run a
PTY" and "render its output correctly."

What you need to implement (or take as a dep):

- **ANSI/VT escape parsing.** claude-code uses xterm-class
  escape sequences for cursor positioning, color, scroll
  regions, alternate screen buffer (for the prompt UI), mouse
  reporting, and so on. The full xterm spec is dozens of pages.
  Sloppy parsing produces a corrupted display the user will
  hate.
- **A screen buffer model.** Cells with attributes
  (foreground, background, bold, underline, italic, inverse).
  Cursor position. Saved cursor. Alternate screen. Scrollback.
- **Resize protocol.** claude-code asks the PTY its size via
  `TIOCGWINSZ` and redoes its layout on `SIGWINCH`. The dashboard
  has to size the PTY to the pane it's rendering into and
  propagate SIGWINCH when curses says the terminal resized.
- **Input routing.** When the pane has focus, keypresses are
  written to the PTY. When the dashboard has focus, keypresses
  are consumed by the dashboard. Define a prefix key (like
  tmux's `Ctrl-B`) that toggles focus, and document that
  claude-code's own use of that key is now intercepted.
- **Output throttling.** claude-code can emit megabytes of
  output in a streaming response. The dashboard's 1s refresh
  tick is too slow to render character-at-a-time; you want the
  PTY reader to coalesce and the renderer to render on a
  shorter cadence than the main loop's `getch` timeout.

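Of the items above, input routing is the easiest to pin down in isolation. The sketch below is one possible shape, not a settled design: the `FocusRouter` name, the choice of `Ctrl-B` as prefix, and `o` as the toggle key are all assumptions made for illustration.

```python
from dataclasses import dataclass

PREFIX = b"\x02"   # Ctrl-B, borrowed from tmux; any rarely typed key would do


@dataclass
class FocusRouter:
    """Decides whether a keypress goes to the PTY or to the dashboard."""

    pane_focused: bool = False
    armed: bool = False            # True right after the prefix key was seen

    def route(self, key: bytes) -> tuple[str, bytes]:
        """Return (target, payload); target is 'pty', 'dashboard' or 'none'."""
        owner = "pty" if self.pane_focused else "dashboard"
        if self.armed:
            self.armed = False
            if key == PREFIX:      # prefix twice: pass a literal Ctrl-B through
                return (owner, key)
            if key == b"o":        # prefix + o: toggle focus
                self.pane_focused = not self.pane_focused
            return ("none", b"")   # chords are consumed, known or not
        if key == PREFIX:
            self.armed = True      # wait for the second key of the chord
            return ("none", b"")
        return (owner, key)
```

Sending the prefix twice passes a literal `Ctrl-B` to whichever side has focus, which is the usual answer to the "claude-code's own use of that key is now intercepted" caveat.
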
The stdlib has `pty` (the spawn side) and you can read/write
the master fd by hand. It does **not** have an ANSI parser; the
established Python library for this is `pyte`
([pyte.readthedocs.io](https://pyte.readthedocs.io/)): pure
Python, LGPLv3-licensed, with one small dependency (`wcwidth`).
~3k lines. It would be the project's first runtime dependency
beyond stdlib.

Even with `pyte`, the integration is non-trivial: you're
re-rendering a 24x80-ish (or whatever fits) screen buffer into
curses cells on every tick, dealing with attribute mapping
(pyte's color names → curses color pairs), and handling mouse
events through the pane. Plan on ~600–1200 lines, not 200.

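To give a feel for the attribute-mapping chore, here is a rough sketch of a color-pair cache. It assumes pyte reports basic colors as the strings listed in the table (with "brown" standing in for yellow), which should be checked against the pyte version actually used; `PairCache` is an invented name, and bright or 256-color values simply fall back to the default here.

```python
import curses

# Assumed pyte color names for the 8 basic ANSI colors.
PYTE_TO_CURSES = {
    "black": curses.COLOR_BLACK, "red": curses.COLOR_RED,
    "green": curses.COLOR_GREEN, "brown": curses.COLOR_YELLOW,
    "blue": curses.COLOR_BLUE, "magenta": curses.COLOR_MAGENTA,
    "cyan": curses.COLOR_CYAN, "white": curses.COLOR_WHITE,
    "default": -1,   # -1 needs curses.use_default_colors() at startup
}


class PairCache:
    """Hands out curses color-pair numbers for (fg, bg) name pairs."""

    def __init__(self, init_pair=curses.init_pair):
        self._init_pair = init_pair               # injectable for testing
        self._pairs: dict[tuple[int, int], int] = {}

    def pair_for(self, fg: str, bg: str) -> int:
        # Unknown names fall back to the terminal default (-1).
        key = (PYTE_TO_CURSES.get(fg, -1), PYTE_TO_CURSES.get(bg, -1))
        if key == (-1, -1):
            return 0                              # pair 0 is the terminal default
        if key not in self._pairs:
            number = len(self._pairs) + 1         # pair numbers start at 1
            self._init_pair(number, *key)         # register the pair with curses
            self._pairs[key] = number
        return self._pairs[key]
```

In the render loop the returned number would be used as `curses.color_pair(n)` when writing each cell. Allocating pairs lazily keeps the count well under the terminal's pair limit for typical output.
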
Open trap-doors:

- **Claude-code uses bracketed paste, alternate screen, and
  occasionally raw terminal control for its prompt input.**
  Some of these features stress the emulator harder than `vim`
  does: alt-screen has to be supported or claude's
  command-prompt UI corrupts the line above it. `pyte` claims to
  handle alt-screen; verify before committing.
- **Scrollback in claude is `/transcript`-driven, not terminal
  scrollback.** A small pane height means you only see the last
  10–20 lines of output without leaving the dashboard, which is
  the wrong shape for a 200-line streaming response. You'd
  want to make the pane resizable or open a full-height
  "expand" mode (which is just Option 1, the handoff, with
  extra steps).
- **Multiple agents = multiple PTYs running concurrently.** If
  the user wants to monitor 3 bottles, the dashboard is now
  holding 3 PTYs open and parsing 3 ANSI streams in parallel.
  Memory + CPU costs are bounded but nonzero; design the
  PTY-per-agent state machine carefully.

This is the option that delivers the literal "pane in the
dashboard" request. It's the right answer if the user's
day-to-day involves watching multiple bottles' output
simultaneously without context-switching. It's the wrong answer
if they mostly want one focused session at a time with proposals
visible.

## Option 3: External multiplexer

The dashboard binds a key (e.g. `Enter` on a selected agent) to
`tmux split-window -h 'docker exec -it claude-bottle-<slug>
claude'` when run inside a tmux session, or to
`osascript`-driven iTerm pane spawning on macOS, or to
`wezterm cli spawn` if the user is on wezterm.

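For the tmux case, the delegation amounts to building one command. A minimal sketch, assuming the `TMUX` environment variable is a good enough signal that the dashboard is running inside tmux; `tmux_split_argv` is a hypothetical helper name.

```python
import os
import shlex
from typing import Mapping, Optional


def tmux_split_argv(slug: str, env: Mapping[str, str] = os.environ) -> Optional[list[str]]:
    """Return the tmux command that opens claude in a side pane, or None."""
    if not env.get("TMUX"):         # tmux exports TMUX inside its sessions
        return None
    # tmux takes the pane's command as one shell string, so quote it safely.
    inner = shlex.join(["docker", "exec", "-it", f"claude-bottle-{slug}", "claude"])
    return ["tmux", "split-window", "-h", inner]
```

A `None` result is the cue to fall back to the handoff path (or report that no multiplexer is available) instead of failing outright.
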
What's good:

- The "real terminal in a real pane" is solved by tools the
  user already trusts. tmux's terminal emulation is correct;
  iTerm's is correct; wezterm's is correct. We're not
  reimplementing any of them.
- Multi-bottle parallelism is automatic: the user opens one
  pane per agent, the multiplexer renders them.
- Implementation cost is tiny: ~50 lines of "if `TMUX` env is
  set, shell out to `tmux split-window`."

What's not good:

- It requires the user to be running in a multiplexer. Outside
  one (plain Terminal.app, vscode integrated terminal, etc.)
  the verb either falls back to handoff or just fails.
- It splits the operator's mental model. The dashboard is one
  window, claude-code panes are other windows; the dashboard's
  "agents pane" no longer matches the visible reality (some
  agents have an attached pane, others don't, and the dashboard
  doesn't know which).
- We don't actually own the layout. tmux's pane sizing rules
  are tmux's; the dashboard can't enforce "agent pane on the
  right, dashboard on the left."

This is the right answer if the project decides "we shouldn't
be a multiplexer; let tmux be a multiplexer." It's the wrong
answer if the dashboard wants to *be* the operator's primary
surface, which the user's question implies it does.

## Recommendation

**Build Option 1 first** (handoff). It clears the bar of "let me
talk to claude without quitting the dashboard," it's a
~20–30-line patch over the existing `e` / `p` infrastructure, and
it carries no new dependencies. Critically, it lets the user
*confirm whether the workflow actually wants embedded panes*:
if Option 1 turns out to be enough in practice (Enter to drop
in, Ctrl-D back out, watch proposals between sessions), the
embedded-emulator complexity is unnecessary.

**Treat Option 2 as a six-week project to plan only if Option 1
is observably insufficient.** The pyte dep is acceptable but
the integration is real engineering: terminal emulator
correctness, pane sizing, input routing, multi-PTY state
management. Schedule it; don't bolt it on. If the user lives in
3-bottle days and needs simultaneous output, this is the option;
otherwise it's premature optimization.

**Treat Option 3 as a niche augmentation, not a primary path.**
Detect `TMUX` / `WEZTERM_PANE` / `ITERM_SESSION_ID` at startup;
if present, add a `split` keybinding that delegates pane
creation to the multiplexer. The dashboard remains the primary
interface; the multiplexer is a convenience for power users.

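The startup detection for Option 3 is a lookup over those three environment variables. A small sketch follows; the function names, the check order, and the `s` key for `split` are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional

# Checked in order; tmux first, since tmux can run inside the other two.
MULTIPLEXER_VARS = (
    ("tmux", "TMUX"),
    ("wezterm", "WEZTERM_PANE"),
    ("iterm", "ITERM_SESSION_ID"),
)


def detect_multiplexer(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the first multiplexer whose variable is set."""
    for name, var in MULTIPLEXER_VARS:
        if env.get(var):
            return name
    return None


def keybindings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Handoff is always bound; split only appears when it can work."""
    keys = {"Enter": "handoff"}
    if detect_multiplexer(env) is not None:
        keys["s"] = "split"         # hypothetical key for the split verb
    return keys
```

Binding `split` only when a multiplexer is detected means the key never exists in a plain terminal, so there is no failure path to explain to the operator.
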
## Followups worth checking before committing

- **What does claude-code's PTY-resize behavior look like at
  small heights?** Drive it at 24×40, 10×80, 8×40 and see if it
  blows up. The dashboard's bottom pane is going to be small.
- **Is there a way to `docker exec` into an *already-running
  exec* rather than spawning a new claude process per attach?**
  If not, the dashboard needs its own session-state model: "is
  there an exec running for this slug? attach to it. otherwise
  start one."
- **How does claude-code handle SIGWINCH at the moment?** If it
  redoes its layout cleanly the embedded-emulator story gets
  easier; if it corrupts, Option 2 needs special-case handling.
- **Does `docker exec -i` (no `-t`) preserve enough of the TTY
  contract for claude-code to start at all?** Some apps refuse
  to launch without a real TTY; the embedded emulator option
  needs the PTY allocated on the host side and the exec
  re-attached to it, which is the harder of the two setups.

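The small-size check in the first followup can be scripted. The sketch below sizes a PTY with `TIOCSWINSZ` before starting a child and collects what the child prints; it reads the sizes above as rows×columns, which is an assumption. The child in the usage note is a stand-in that just reports the size it sees; pointing `probe` at claude-code itself would additionally need a timeout and some way to inspect the rendered screen.

```python
import fcntl
import os
import pty
import struct
import subprocess
import termios


def set_winsize(fd: int, rows: int, cols: int) -> None:
    """Set the PTY's window size, which the child reads via TIOCGWINSZ."""
    fcntl.ioctl(fd, termios.TIOCSWINSZ, struct.pack("HHHH", rows, cols, 0, 0))


def probe(argv: list[str], rows: int, cols: int) -> bytes:
    """Run argv on a rows x cols PTY and return everything it printed."""
    master_fd, slave_fd = pty.openpty()
    set_winsize(slave_fd, rows, cols)
    proc = subprocess.Popen(argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd)
    os.close(slave_fd)
    out = b""
    try:
        while True:
            chunk = os.read(master_fd, 4096)   # blocks until output or hangup
            if not chunk:
                break
            out += chunk
    except OSError:
        pass                                   # EIO on Linux: child side closed
    finally:
        os.close(master_fd)
        proc.wait()
    return out
```

For example, `probe([sys.executable, "-c", "import os; s = os.get_terminal_size(0); print(s.lines, s.columns)"], 8, 40)` lets the stand-in child confirm that the size reached it before any claude-code behavior is tested.
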
## References

- PRD 0019: active agents pane + selection model (the
  selection-source for whichever option lands)
- PRD 0018 chunk 3: agent container runs `sleep infinity`;
  claude is invoked via `docker exec -it` (the
  attachment-point this doc is layering against)
- `claude_bottle/cli/dashboard.py:_operator_edit_flow`: the
  existing `curses.endwin` → shell out → `stdscr.refresh()`
  pattern Option 1 would clone
- pyte: <https://pyte.readthedocs.io/> (the candidate
  terminal-emulator library if Option 2 is picked)