On resume from a committed snapshot, smolvm's pack process remaps all
file uids to the host uid (501 on macOS). Files in /tmp that were
created during the session (e.g. /tmp/claude-1000 owned by node=uid
1000) get remapped to 501. Claude Code then refuses to use the temp
directory because it's owned by a different uid.
Two-part fix:
- Exclude ./tmp and ./var/tmp from the tar in _exec_tar_to_file.
Both directories are ephemeral; a resumed VM should start with clean
temp directories identical to a fresh VM.
- Add mkdir -p /tmp /var/tmp to _init_vm before chown/chmod, so the
directories are created if the committed snapshot omitted them.
Replace the Popen/stdout=PIPE approach with a write-then-copy
strategy that avoids binary-stdout piping through the smolvm exec
channel entirely:
1. Probe connectivity with `machine_exec(machine, ["true"])` first.
If this fails while an interactive session is running, the error
now says "concurrent exec not available" instead of the opaque
"<no stderr>".
2. Run `tar --create --gzip --file=/var/tmp/.bot-bottle-commit.tar.gz`
inside the VM via machine_exec (same mechanism used during
provisioning). tar writes to a file in the VM, not stdout, so
smolvm never has to transmit binary data over the exec channel.
3. Copy the compressed archive to the host with machine_cp.
4. Dockerfile switches to ADD rootfs.tar.gz / — Docker decompresses
gzip tarballs automatically.
smolvm machine exec requires stdout to be a pipe, not a regular
file descriptor. Passing stdout=file caused smolvm to return
non-zero with no stderr (the error was silently swallowed or went
to the regular-file fd instead of reaching us).
Switch _snapshot_running_vm to a new _exec_tar_to_file helper that
uses Popen with stdout=PIPE and streams the tar to disk via
shutil.copyfileobj. A background thread drains stderr concurrently
to prevent deadlock when the stderr pipe buffer fills while we are
writing stdout data.
The terminal-decoration wrapper script is invoked with sh -lc, which
sources login-shell init files (/etc/profile, ~/.profile) rather than
interactive-shell files (~/.zshrc). smolvm is typically installed via
homebrew whose PATH setup lands in ~/.zprofile or ~/.zshrc — not picked
up by sh -l — so pty_resize.py's Popen(["smolvm", ...]) raises
FileNotFoundError, pty_resize exits non-zero, and the trailing reset-
printf makes sh exit 0. The caller sees "session ended (exit 0)"
immediately with no agent output.
Use sh -c instead. The calling process (./cli.py) inherits the user's
interactive shell PATH where smolvm is present, confirmed by the
provision steps (machine_exec) succeeding before exec_agent is reached.
smolvm pack create --from-vm requires the VM to be stopped, and stopping
a smolmachines VM terminates any running interactive session.
Instead, mirror the macos-container approach: exec into the running VM as
root and stream the root filesystem via tar (smolvm machine exec -- tar),
build a Docker image from the archive, push to an ephemeral local registry,
and run smolvm pack create --image to produce the .smolmachine artifact.
The VM stays running throughout the commit.
Remove the stop-confirm prompt and machine_is_running check that were
added in the previous commit — neither is needed when we no longer stop.
smolvm pack create --from-vm requires the VM to be stopped. Add
machine_is_running() to smolvm.py (via machine ls --json state field),
and add the same confirm-stop flow to SmolmachinesFreezer that was
originally designed for macos-container: if running, prompt the user,
stop the VM, then pack. Already-stopped VMs are packed directly.
- Rename export test to reflect new exec-tar mechanism; update argv
assertions to match the new `container exec ... tar` command shape
- Change mock stderr from str to bytes (subprocess.PIPE without text=True)
- Add type annotation to capture_freeze closure to satisfy pyright
Apple Container removes containers when they stop, making the
stop-then-export flow impossible regardless of the --rm flag.
Replace `container export` (requires stopped container) with
`container exec --user root <name> tar --create ... --file=- --directory=/ .`
streamed to a temp file, then build the committed image from that archive
as before. The bottle stays running after commit, which is better UX.
Drop the stop-confirm prompt from MacosContainerFreezer since we no longer
need to stop the container at all.
container stop was removing the container immediately (due to --rm)
before container export could run. The force_remove_container teardown
callback on the ExitStack already handles cleanup on normal exit, so
--rm was redundant. Without it, the stopped container stays available
for container export to snapshot.
Freezer._freeze only ever used bottle.name, which is always
f"bot-bottle-{agent.slug}". Remove the Bottle parameter from
commit() and _freeze(), derive the container name from agent.slug
directly in each subclass, and delete the _NamedBottle stub that
existed solely to paper over this.
Adds a Freezer ABC (backend/freeze.py) that encapsulates the
stop-commit-mark-preserved flow for all backends, following the same
pattern as BottleBackend. Each backend gets its own Freezer subclass:
DockerFreezer — docker commit
MacosContainerFreezer — container export + image rebuild; prompts
to stop if the container is running
SmolmachinesFreezer — smolvm pack create --from-vm
The base class owns write_committed_image, mark_preserved, and the
resume hint. Subclasses implement _freeze() and optionally override
_export_hint() for migration instructions.
Freezer.commit(agent, bottle) is the primary entry point for use
within a live launch context. Freezer.commit_slug(slug) is a
convenience wrapper for cmd_commit, which no longer branches on
backend names itself.
get_freezer(backend_name) is the factory, analogous to
get_bottle_backend(). CommitCancelled is raised by MacosContainerFreezer
when the user declines the stop prompt; cmd_commit catches it and
returns 0.
`container export` requires the container to be stopped first. When a
running bottle is detected, prompt the user to confirm, stop the
container, then commit. Adds `container_is_running` and
`stop_container` helpers to the macos-container util.
Addresses #240 (comment)
- test_docker_launch_committed_image: replace Manifest.from_json_obj
(nonexistent) with ManifestIndex.from_json_obj; pass manifest= arg
to DockerBottlePlan constructor (required by BottlePlan base class)
- test_macos_container_launch: cast SimpleNamespace stubs to their
expected types (BottleSpec, GitGatePlan, EgressPlan) in _build_plan;
add str type annotations to fake_build parameter signatures
- test_macos_container_util: add str type annotations to fake_build_image
parameter signatures
Adds `./cli.py commit [<slug>]` which runs `docker commit` on the
active agent container and stores the resulting image tag in per-bottle
state. The next `./cli.py resume <slug>` automatically boots from the
committed snapshot instead of rebuilding from the Dockerfile, preserving
all in-container state across restarts and migrations.
- bottle_state: add write_committed_image / read_committed_image helpers
- docker/util: add commit_container wrapper around `docker commit`
- docker/launch: check for a committed image before the Dockerfile build
step; fall back to normal build if the image is absent from the daemon
- cli/commit: new command with interactive slug picker; errors clearly on
non-Docker backends
- 50 new unit tests covering all paths
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>