didericis/bot-bottle

didericis-claude c6e976fa7d

docs(prd): mark commit-bottle-state PRD as Active

2026-06-22 22:29:33 -04:00

PRD prd-new: Commit bottle state to an image

Status: Active
Author: Claude
Created: 2026-06-20
Issue: #194

Summary

Add a commit CLI command that freezes a running Docker bottle's container state to a named Docker image. Operators can then resume the bottle from that exact filesystem snapshot, or export the image with docker save to migrate work to a different host.

Problem

When a long-running agent session is interrupted (by a host reboot, a network failure, or a planned infrastructure migration), the in-progress container state is lost. cli.py resume rebuilds the agent image from the Dockerfile and reprovisions the bottle, but that returns the guest to its initial state, not to wherever the agent was mid-task.

There is no mechanism today to capture "what's installed / configured inside the running container right now" and make it reproducible. The capability-block flow writes a new Dockerfile and marks the bottle for resume, but that only applies when the agent itself has requested a capability change; it doesn't help the operator who wants to take a snapshot before a planned host reboot or hardware migration.

Goals / Success Criteria

./cli.py commit [<slug>] takes a snapshot of the running Docker agent container and stores it as a local Docker image.
Without a slug argument the command shows the same interactive picker as start (the list of active slugs).
The committed image tag is stored in per-bottle state so that the next ./cli.py resume <slug> automatically uses the committed image instead of rebuilding from the Dockerfile.
mark_preserved is called so the state dir survives the normal session-end cleanup.
A docker save hint is printed so operators know how to export the image for migration.
The command errors clearly on non-Docker backends (smolmachines does not expose a container-level commit API in its current CLI surface).

Non-goals

Smolmachines or macOS-container backend support.
Automatic commit on agent exit.
Image push to a remote registry.
Storing the image tag in the manifest or sharing it between operators.

Design

Image tag

bot-bottle-committed-<slug>:latest — namespaced under bot-bottle- to match existing image naming conventions; committed distinguishes it from the build-time image (bot-bottle-claude:latest) and the capability-block rebuild image (bot-bottle-rebuilt-<identity>:latest).
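The naming rule above can be sketched as a small helper. This is only an illustration: the function name `committed_image_tag` and the slug validation are assumptions, not part of the PRD, and the validation is deliberately stricter than what Docker itself accepts for repository names.

```python
import re

# Prefix taken from the naming convention described above.
COMMITTED_PREFIX = "bot-bottle-committed-"

# Conservative check (an assumption of this sketch): lowercase alphanumerics
# separated by single '.', '_' or '-' characters.
_SLUG_RE = re.compile(r"[a-z0-9]+(?:[._-][a-z0-9]+)*")


def committed_image_tag(slug: str) -> str:
    """Return the committed-image tag for a bottle slug (hypothetical helper)."""
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"slug {slug!r} is not usable in a Docker image name")
    return f"{COMMITTED_PREFIX}{slug}:latest"
```

Keeping the tag derivation in one place would let the commit command and the resume path agree on the name without duplicating the format string.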

State storage

A new plain-text file committed-image is added to the per-bottle state directory:

~/.bot-bottle/state/<identity>/
    metadata.json
    Dockerfile            (capability-block override; optional)
    committed-image       (committed image tag; optional)
    transcript/

bottle_state.committed_image_path(identity) returns the path to this file. write_committed_image and read_committed_image are the write and read helpers, following the existing per_bottle_dockerfile pattern.
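A minimal sketch of what these helpers might look like is shown here. The function names come from the text above; the `state_root` parameter, the trailing newline, and the "missing or empty file means no committed image" behavior are assumptions made so the example is self-contained and testable, and may differ from the real bottle_state module.

```python
from pathlib import Path
from typing import Optional


def default_state_root() -> Path:
    # Matches the ~/.bot-bottle/state/ layout shown above.
    return Path.home() / ".bot-bottle" / "state"


def committed_image_path(identity: str, state_root: Optional[Path] = None) -> Path:
    # state_root is a hypothetical override so tests need not touch $HOME.
    root = state_root if state_root is not None else default_state_root()
    return root / identity / "committed-image"


def write_committed_image(identity: str, tag: str,
                          state_root: Optional[Path] = None) -> None:
    path = committed_image_path(identity, state_root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(tag + "\n", encoding="utf-8")


def read_committed_image(identity: str,
                         state_root: Optional[Path] = None) -> Optional[str]:
    # Returns None when no image has been committed for this bottle.
    try:
        tag = committed_image_path(identity, state_root).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
    return tag.strip() or None
```

Returning None for a missing file lets the launch step treat "never committed" as the ordinary case rather than an error.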

`commit` command

./cli.py commit [<slug>]

1. Resolve slug (arg or interactive picker from enumerate_active_agents).
2. Check metadata: if backend is set and is not docker, die with a clear "not supported" error.
3. Derive container name: bot-bottle-<slug> (matches the agent provision plan's instance_name convention).
4. Run docker commit <container> bot-bottle-committed-<slug>:latest.
5. Write the image tag to ~/.bot-bottle/state/<slug>/committed-image.
6. Call mark_preserved(<slug>) so the state dir survives session-end.
7. Print the resume hint and a docker save export example.
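The core of this flow (the backend check, the name derivation, the docker commit call, and the export hint) could look roughly like the sketch below. `commit_container` is named in the testing strategy; `check_backend`, `commit_bottle`, `export_hint`, the injectable `run` parameter, and the archive filename are hypothetical. Slug resolution, the state write, and mark_preserved are omitted because their real signatures are not shown in this document.

```python
import subprocess
from typing import Callable, Optional


def check_backend(backend: Optional[str]) -> None:
    # An unset backend is allowed through; only a non-docker value is rejected.
    if backend and backend != "docker":
        raise SystemExit(f"commit is not supported on the {backend!r} backend")


def commit_container(container: str, tag: str,
                     run: Callable[..., object] = subprocess.run) -> None:
    # `docker commit CONTAINER REPOSITORY:TAG` snapshots the container filesystem.
    run(["docker", "commit", container, tag], check=True)


def export_hint(tag: str) -> str:
    # The archive filename is illustrative only.
    archive = tag.split(":", 1)[0] + ".tar"
    return f"docker save -o {archive} {tag}"


def commit_bottle(slug: str, backend: Optional[str],
                  run: Callable[..., object] = subprocess.run) -> str:
    check_backend(backend)
    container = f"bot-bottle-{slug}"
    tag = f"bot-bottle-committed-{slug}:latest"
    commit_container(container, tag, run=run)
    return tag
```

Passing `run` as a parameter is a design choice for this sketch only: it lets the docker commit argv be asserted without a Docker daemon.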

Resume from committed image

bot_bottle/backend/docker/launch.py already rebuilds the agent image at the top of the launch context manager. The change is a check immediately before that step:

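# Prefer the committed snapshot when one is recorded and still present locally.
committed = read_committed_image(plan.slug)
if committed and docker_mod.image_exists(committed):
    info(f"using committed image {committed!r}")
    # Swap the image on a copy of the plan; the Dockerfile build is skipped.
    plan = dataclasses.replace(
        plan,
        agent_provision=dataclasses.replace(plan.agent_provision, image=committed),
    )
else:
    # No usable snapshot: build from the Dockerfile as before.
    docker_mod.build_image(plan.image, _REPO_DIR, dockerfile=plan.dockerfile_path)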

Replacing agent_provision.image propagates to plan.image (a property) and from there to the Compose spec renderer's _agent_service → image: field, so the container boots from the committed snapshot. The build step is skipped entirely when a committed image is found and exists locally.
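To illustrate why replacing the nested field is enough, here is a small stand-alone sketch of the propagation. The `AgentProvision` and `Plan` classes below are simplified stand-ins invented for this example; the real plan types, their fields, and whether they are frozen are not shown in this document.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class AgentProvision:  # hypothetical stand-in for the real provision type
    image: str


@dataclasses.dataclass(frozen=True)
class Plan:  # hypothetical stand-in for the real launch plan
    slug: str
    agent_provision: AgentProvision

    @property
    def image(self) -> str:
        # plan.image simply reads through to the nested provision.
        return self.agent_provision.image


plan = Plan("demo", AgentProvision("bot-bottle-claude:latest"))
committed = "bot-bottle-committed-demo:latest"

# dataclasses.replace returns new objects; the original plan is left untouched.
updated = dataclasses.replace(
    plan,
    agent_provision=dataclasses.replace(plan.agent_provision, image=committed),
)
```

Because `image` is a property rather than a stored field, anything that reads `updated.image` sees the committed tag without further changes.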

If the committed image has been deleted from the local daemon (e.g. after docker rmi or a docker system prune), the launch falls back to a normal Dockerfile build, matching the pre-commit behavior.
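The fallback decision can be isolated as a pure function, sketched below. `select_agent_image` is a hypothetical name, and the `image_exists` shown here is one plausible implementation using `docker image inspect` (which exits non-zero for a missing image); the document does not say how docker_mod.image_exists is actually implemented.

```python
import subprocess
from typing import Callable, Optional


def image_exists(tag: str, run: Callable[..., object] = subprocess.run) -> bool:
    # Assumed implementation: rely on the exit status of `docker image inspect`.
    result = run(
        ["docker", "image", "inspect", tag],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def select_agent_image(committed: Optional[str],
                       exists: Callable[[str], bool] = image_exists) -> Optional[str]:
    """Return the committed tag to boot from, or None to build from the Dockerfile."""
    if committed and exists(committed):
        return committed
    return None
```

A None result covers both "never committed" and "committed but since pruned", which is why the two cases share the same Dockerfile build path.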

Testing strategy

Unit tests for write_committed_image / read_committed_image in tests/unit/test_bottle_state.py, using the existing _FakeHomeMixin pattern.
Unit tests for commit_container in tests/unit/test_docker_util_image.py, mocking subprocess.run and asserting on the docker commit argv.
Unit tests for cmd_commit argument parsing and the "unsupported backend" error path, mocking enumerate_active_agents and commit_container.
Unit tests for the launch-step committed-image branch: patch read_committed_image to return a tag, patch image_exists to return True, and assert that build_image is not called and plan.image is overridden.
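As an illustration of the second item, a test that mocks subprocess.run and asserts on the docker commit argv might look like the sketch below. The `commit_container` defined here is a stand-in written for this example; the real function lives in the project's Docker utility module and may take different arguments.

```python
import subprocess
import unittest
from unittest import mock


def commit_container(container: str, tag: str) -> None:
    # Stand-in for the real helper (assumed signature).
    subprocess.run(["docker", "commit", container, tag], check=True)


class CommitContainerTest(unittest.TestCase):
    def test_runs_docker_commit_with_expected_argv(self) -> None:
        # Patch subprocess.run so no Docker daemon is needed.
        with mock.patch("subprocess.run") as run:
            commit_container("bot-bottle-demo", "bot-bottle-committed-demo:latest")
        run.assert_called_once_with(
            ["docker", "commit", "bot-bottle-demo",
             "bot-bottle-committed-demo:latest"],
            check=True,
        )
```

Saved in a test module, this would be picked up by `python -m unittest`; the launch-step test described in the last item could follow the same patching approach.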
