Commit Graph

3 Commits

Author SHA1 Message Date
didericis-claude b9853ae0c7 fix(smolmachines): give pty_resize side-channel DEVNULL stdin so it survives under tmux
test / unit (pull_request) Successful in 26s
test / integration (pull_request) Successful in 40s
Inside tmux the dashboard's smolmachines launch crashed within
~100ms of the wrapper Popen-ing the main smolvm exec child —
sometimes with rc=137 (SIGKILL), sometimes with smolvm
spitting a runc-style "load `config.json`: cannot parse the
data: parse error: trailing garbage" and exiting 1. The same
wrapper ran fine outside tmux. Diagnostic logs showed the
SIGKILL landed ~100ms after the wrapper kicked off its
initial `sync()` (which fires the side-channel smolvm exec).

Root cause: the side-channel `subprocess.run([smolvm, machine,
exec, --, sh, -c, ...])` did not specify `stdin=`, so it
inherited the wrapper's stdin — the tmux pane PTY. The main
smolvm child (the agent session) also had that PTY as stdin.
Two concurrent smolvm processes sharing the PTY's
foreground-process-group / input plumbing caused smolvm to
abort one of them. iTerm's PTY plumbing apparently tolerated
this; tmux's didn't.

Fix is one line in `_push_size`: `stdin=subprocess.DEVNULL`.
The side-channel never needs stdin — it runs a fire-and-forget
`stty` and exits. Verified end-to-end: pre-fix the wrapper
crashed under `tmux respawn-pane` against a live VM; post-fix
the same invocation completes cleanly.

Also drop the diagnostic log added in 37bd11b — we have the
fix.

Regression test:
`test_side_channel_uses_devnull_stdin` locks the
`stdin=DEVNULL` invariant so a future "let's simplify the
subprocess.run kwargs" refactor surfaces this immediately.

637 unit tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 20:43:59 -04:00
didericis-claude 37bd11b375 chore(smolmachines): instrument pty_resize wrapper for crash diagnosis
test / unit (pull_request) Successful in 28s
test / integration (pull_request) Successful in 41s
User reports launch crashing only inside tmux (works outside).
The wrapper itself runs fine in standalone tmux repros, so the
break is in some interaction we can't see — curses eats stderr,
default tmux remain-on-exit is off, and the pane closes before
the operator can read anything.

Add an always-on per-pid log at ~/.claude-bottle/pty_resize.log:

  - start record: argv, cwd, PATH, TMUX status
  - sync record: window size observed
  - child pid + exit rc
  - any KeyboardInterrupt forwarding
  - Popen failure traceback if it dies

Append-mode, small overhead, easy to grep + share.

Removable (along with the wrapper itself) once smolvm forwards
SIGWINCH natively.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 20:37:50 -04:00
didericis-claude 3fb305f654 fix(smolmachines): bridge host SIGWINCH into the VM PTY (issue #82)
test / unit (pull_request) Successful in 28s
test / integration (pull_request) Successful in 41s
`smolvm 0.8.0 machine exec -t` allocates an in-VM PTY but never
forwards the host terminal's window size — the PTY starts at
`0 0` and host resizes (tmux pane resize, terminal window
resize) go unnoticed, so the claude TUI inside a smolmachines
bottle renders for whatever tiny box it last saw and ignores
operator resizes. `docker exec -it` propagates window-size
changes automatically; smolvm doesn't.

Workaround: a small Python wrapper
(`backend/smolmachines/pty_resize.py`) that interposes between
the operator's terminal and `smolvm machine exec`. It spawns
smolvm as a child, traps host SIGWINCH, and on every resize
(plus once at startup) runs a side-channel
`smolvm machine exec --name <M> -- sh -c 'for f in /dev/pts/*;
do stty -F $f cols X rows Y; done'`. The kernel delivers
SIGWINCH to the in-VM foreground process group when the slave
PTY's size changes, so claude picks up the new dimensions
without extra signalling.

`SmolmachinesBottle.claude_argv` prepends
`[sys.executable, -m, claude_bottle.backend.smolmachines.
pty_resize, <machine>, --, ...]` to the existing smolvm argv
in TTY mode. Non-TTY mode (provisioning shell-outs) skips the
wrapper — no PTY to resize.

The wrapper survives the dashboard's
`_build_resume_argv_with_fallback` shell-wrap because the
split-at-`claude` token still finds the right position — the
wrapper's prefix wraps the entire smolvm-exec framing.

Tests:
- `test_smolmachines_pty_resize.py` (new): argv parsing, the
  side-channel command shape (cols/rows / for-loop over
  /dev/pts/*), and `_read_winsize`'s fallback across
  stdin/stdout/stderr including the smolvm-allocated-PTY-
  reports-`0 0` ironic case.
- `test_smolmachines_bottle.py`: updated TTY-mode assertions
  to unwrap the pty_resize prefix; added `TestClaudeArgvNoTTY`
  to lock the non-TTY skip.

636 unit tests pass.

Removable when smolvm grows native SIGWINCH forwarding.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 20:15:11 -04:00