PRD 0060: Commit bottle state to an image
- Status: Active
- Author: Claude
- Created: 2026-06-20
- Issue: #194
Summary
Add a commit CLI command that freezes a running bottle's state to a
resumable local artifact. Docker bottles are stored as Docker images;
smolmachines bottles are stored as .smolmachine artifacts. Operators
can then resume the bottle from that exact filesystem snapshot, or
export the artifact to migrate work to a different host.
Problem
When a long-running agent session is interrupted — by a host reboot, a
network failure, or a planned infrastructure migration — the in-progress
container state is lost. cli.py resume rebuilds the agent image from
the Dockerfile and reprovisions the bottle, but that returns the guest
to its initial state, not to wherever the agent was mid-task.
There is no mechanism today to capture "what's installed / configured
inside the running container right now" and make it reproducible. The
capability-block flow writes a new Dockerfile and marks the bottle for
resume, but that only applies when the agent itself has requested a
capability change; it doesn't help the operator who wants to take a
snapshot before a planned host reboot or hardware migration.
Goals / Success Criteria
- ./cli.py commit [<slug>] takes a snapshot of the running agent and
  stores it as a local artifact.
- Without a slug argument the command shows the same interactive picker
  as start (the list of active slugs).
- The committed artifact reference is stored in per-bottle state so
  that the next ./cli.py resume <slug> automatically uses the snapshot
  instead of rebuilding from the Dockerfile.
- mark_preserved is called so the state dir survives the normal
  session-end cleanup.
- A backend-specific export hint is printed so operators know how to
  migrate the snapshot.
- The command errors clearly on unsupported backends.
Non-goals
- macOS-container backend support.
- Automatic commit on agent exit.
- Image push to a remote registry.
- Storing the image tag in the manifest or sharing it between operators.
Design
Docker image tag
bot-bottle-committed-<slug>:latest — namespaced under bot-bottle-
to match existing image naming conventions; committed distinguishes it
from the build-time image (bot-bottle-claude:latest) and the
capability-block rebuild image (bot-bottle-rebuilt-<identity>:latest).
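The naming rule can be captured in a one-line helper. This is a minimal sketch; the function name committed_image_tag is hypothetical and does not appear elsewhere in this PRD.

```python
def committed_image_tag(slug: str) -> str:
    """Return the Docker tag for a committed snapshot of the given bottle.

    Hypothetical helper: only the tag format comes from this PRD.
    """
    if not slug:
        raise ValueError("slug must be non-empty")
    return f"bot-bottle-committed-{slug}:latest"
```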
State storage
A new plain-text file committed-image is added to the per-bottle state
directory:
~/.bot-bottle/state/<identity>/
  metadata.json
  Dockerfile        (capability-block override; optional)
  committed-image   (committed artifact reference; optional)
  transcript/
bottle_state.committed_image_path(identity) returns the path.
write_committed_image / read_committed_image are the read/write
helpers, matching the existing per_bottle_dockerfile pattern. Docker
stores a Docker tag in this file; smolmachines stores the absolute path
to the committed .smolmachine artifact.
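A minimal sketch of what these helpers might look like follows. The root parameter, the trailing newline, and the choice to return None for a missing or empty file are assumptions made for illustration; the real helpers may follow the per_bottle_dockerfile pattern differently.

```python
from pathlib import Path
from typing import Optional

# Assumption: the state root matches the layout shown above.
STATE_ROOT = Path.home() / ".bot-bottle" / "state"


def committed_image_path(identity: str, root: Path = STATE_ROOT) -> Path:
    """Path of the plain-text file holding the committed artifact reference."""
    return root / identity / "committed-image"


def write_committed_image(identity: str, reference: str, root: Path = STATE_ROOT) -> Path:
    """Store a Docker tag or an absolute .smolmachine path for this bottle."""
    path = committed_image_path(identity, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(reference.strip() + "\n", encoding="utf-8")
    return path


def read_committed_image(identity: str, root: Path = STATE_ROOT) -> Optional[str]:
    """Return the stored reference, or None if nothing has been committed."""
    try:
        text = committed_image_path(identity, root).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
    return text.strip() or None
```

Returning None for both a missing and an empty file lets callers use a single truthiness check before deciding whether to fall back to a normal build.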
commit command
./cli.py commit [<slug>]
- Resolve slug (arg or interactive picker from
  enumerate_active_agents).
- Check metadata and branch by backend.
- For Docker, derive container name bot-bottle-<slug> and run
  docker commit <container> bot-bottle-committed-<slug>:latest.
- For smolmachines, derive machine name bot-bottle-<slug> and run
  smolvm pack create --from-vm <machine> -o ~/.bot-bottle/state/<slug>/committed-smolmachine.
- Write the Docker image tag or smolmachine artifact path to
  ~/.bot-bottle/state/<slug>/committed-image.
- Call mark_preserved(<slug>) so the state dir survives session-end.
- Print the resume hint and a backend-specific export example.
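The backend branch in the steps above could be sketched as a pure function that returns the command to run and the reference to record. The function name plan_commit and the backend identifiers "docker" and "smolmachines" are assumptions for illustration; the command lines themselves are the ones listed in the steps.

```python
from pathlib import Path
from typing import List, Tuple


def plan_commit(backend: str, slug: str, state_root: Path) -> Tuple[List[str], str]:
    """Return (argv, reference) for committing the bottle named by slug.

    argv is the command to execute; reference is what would be written to
    the committed-image file afterwards.
    """
    name = f"bot-bottle-{slug}"  # container or machine name
    if backend == "docker":
        tag = f"bot-bottle-committed-{slug}:latest"
        return ["docker", "commit", name, tag], tag
    if backend == "smolmachines":
        artifact = state_root / slug / "committed-smolmachine"
        argv = ["smolvm", "pack", "create", "--from-vm", name, "-o", str(artifact)]
        return argv, str(artifact)
    # Any other backend (for example macOS containers) is out of scope.
    raise ValueError(f"commit is not supported on backend {backend!r}")
```

Keeping the argv construction separate from subprocess execution makes the unsupported-backend error path and the exact command lines easy to assert on in unit tests.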
Resume from committed image
bot_bottle/backend/docker/launch.py already rebuilds the agent image
at the top of the launch context manager. The change is a check
immediately before that step:
committed = read_committed_image(plan.slug)
if committed and docker_mod.image_exists(committed):
    info(f"using committed image {committed!r}")
    plan = dataclasses.replace(
        plan,
        agent_provision=dataclasses.replace(plan.agent_provision, image=committed),
    )
else:
    docker_mod.build_image(plan.image, _REPO_DIR, dockerfile=plan.dockerfile_path)
Replacing agent_provision.image propagates to plan.image (a
property) and from there to the Compose spec renderer's _agent_service
→ image: field, so the container boots from the committed snapshot.
The build step is skipped entirely when a committed image is found and
exists locally.
If the committed image has been deleted from the local daemon (e.g.
after docker rmi or a docker system prune), the launch falls back
to a normal Dockerfile build, matching the pre-commit behavior.
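The propagation through the plan.image property and the fallback to a normal build can be illustrated with stand-in types. AgentProvision, Plan, and select_image below are simplified, hypothetical stand-ins for the real plan classes and launch code, and the image_exists and build_image callables are injected so the sketch runs without a Docker daemon.

```python
import dataclasses
from typing import Callable, Optional


@dataclasses.dataclass(frozen=True)
class AgentProvision:
    image: str


@dataclasses.dataclass(frozen=True)
class Plan:
    slug: str
    agent_provision: AgentProvision

    @property
    def image(self) -> str:
        # Derived from agent_provision, so replacing that field updates this.
        return self.agent_provision.image


def select_image(
    plan: Plan,
    committed: Optional[str],
    image_exists: Callable[[str], bool],
    build_image: Callable[[str], None],
) -> Plan:
    """Use the committed image if it is present locally, else build."""
    if committed and image_exists(committed):
        provision = dataclasses.replace(plan.agent_provision, image=committed)
        return dataclasses.replace(plan, agent_provision=provision)
    build_image(plan.image)  # missing or pruned snapshot: normal build path
    return plan
```

Because both dataclasses are frozen here, dataclasses.replace returns new objects rather than mutating the plan, so the original plan remains usable if the caller needs it.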
Resume from committed smolmachine
bot_bottle/backend/smolmachines/launch.py checks the committed
reference before the normal Docker build -> pack cache path:
committed = read_committed_image(plan.slug)
if committed and Path(committed).is_file():
    return Path(committed)
return _ensure_smolmachine(plan.agent_image, dockerfile=plan.agent_dockerfile_path)
The returned path is passed to smolvm machine create --from, so the
resumed VM boots from the committed snapshot. If the artifact has been
deleted, launch falls back to the normal build and pack flow.
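The fallback behavior can be sketched in isolation. The name resolve_smolmachine and the build_and_pack callback are hypothetical; the callback stands in for the _ensure_smolmachine call so the example does not depend on smolvm being installed.

```python
from pathlib import Path
from typing import Callable, Optional


def resolve_smolmachine(
    committed: Optional[str],
    build_and_pack: Callable[[], Path],
) -> Path:
    """Prefer the committed artifact; otherwise run the normal build and pack."""
    if committed and Path(committed).is_file():
        return Path(committed)
    # Reference missing, empty, or pointing at a deleted file.
    return build_and_pack()
```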
Testing strategy
- Unit tests for write_committed_image / read_committed_image in
  tests/unit/test_bottle_state.py, using the existing _FakeHomeMixin
  pattern.
- Unit tests for commit_container in
  tests/unit/test_docker_util_image.py, mocking subprocess.run and
  asserting on the docker commit argv.
- Unit tests for cmd_commit argument parsing, Docker commit,
  smolmachines pack, and the unsupported backend error path, mocking
  enumerate_active_agents, commit_container, and pack_create_from_vm.
- Unit tests for the launch-step committed-image branch: patch
  read_committed_image to return a tag, patch image_exists to return
  True, and assert that build_image is not called and plan.image is
  overridden.
- Unit tests for the smolmachines launch-step committed-artifact branch:
  patch read_committed_image to return an existing path and assert the
  normal _ensure_smolmachine path is skipped.
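As an illustration of the launch-step tests, the sketch below exercises a simplified stand-in for the Docker launch branch with unittest.mock. The prepare_image function is hypothetical and only mirrors the committed-image check; real tests would patch read_committed_image and docker_mod inside bot_bottle/backend/docker/launch.py instead.

```python
import unittest
from unittest import mock


def prepare_image(slug, default_image, read_committed_image, docker_mod):
    """Simplified stand-in for the committed-image check in the launch step."""
    committed = read_committed_image(slug)
    if committed and docker_mod.image_exists(committed):
        return committed
    docker_mod.build_image(default_image)
    return default_image


class CommittedImageBranchTest(unittest.TestCase):
    def test_skips_build_when_committed_image_exists(self):
        docker_mod = mock.Mock()
        docker_mod.image_exists.return_value = True
        read = mock.Mock(return_value="bot-bottle-committed-demo:latest")

        image = prepare_image("demo", "bot-bottle-claude:latest", read, docker_mod)

        self.assertEqual(image, "bot-bottle-committed-demo:latest")
        docker_mod.build_image.assert_not_called()

    def test_falls_back_to_build_when_image_is_missing(self):
        docker_mod = mock.Mock()
        docker_mod.image_exists.return_value = False
        read = mock.Mock(return_value="bot-bottle-committed-demo:latest")

        image = prepare_image("demo", "bot-bottle-claude:latest", read, docker_mod)

        self.assertEqual(image, "bot-bottle-claude:latest")
        docker_mod.build_image.assert_called_once_with("bot-bottle-claude:latest")
```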