feat(state): clean up per-bottle state on session end (except capability-block) #26

Merged
didericis merged 3 commits from state-cleanup-on-close into main 2026-05-25 07:07:54 -04:00

Summary

Every bottle launch left ~/.claude-bottle/state/<identity>/ behind forever: metadata.json on every run, plus a per-bottle Dockerfile and transcript snapshot on capability-block rebuilds. The metadata piled up as debris across launches; the only state actually worth keeping was the capability-block rebuild bundle.

Make cleanup the default; preserve only on capability-block.

Mechanism: a .preserve marker file.

  • capability_apply.apply_capability_change writes the marker before teardown.
  • cli.py checks is_preserved(identity) after the launch context closes: if true, it prints the resume <identity> hint; if false, it removes the state dir (rm -rf).
  • prepare.py clears any leftover marker at launch so a stale marker from a prior capability-block doesn't keep state alive past the next normal session-end.

The resume hint also moves out of the launch with-block (where it was printed unconditionally), so the operator now sees it only when state was actually kept.
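
A minimal sketch of the marker helpers described above, not the actual bottle_state.py. The helper names come from this PR; the `root` parameter, the `PRESERVE_MARKER` constant, and the empty-identity guards are assumptions added so the sketch can run against a temporary directory.

```python
import shutil
from pathlib import Path

# Assumed default; the PR only states the ~/.claude-bottle/state/<identity>/ layout.
STATE_ROOT = Path.home() / ".claude-bottle" / "state"
PRESERVE_MARKER = ".preserve"


def preserve_marker_path(identity: str, root: Path = STATE_ROOT) -> Path:
    """Location of the marker inside the per-bottle state dir."""
    return root / identity / PRESERVE_MARKER


def mark_preserved(identity: str, root: Path = STATE_ROOT) -> None:
    """Create the marker so session-end cleanup keeps the state dir."""
    if not identity:
        return  # an empty identity would resolve to the state root itself
    marker = preserve_marker_path(identity, root)
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()


def is_preserved(identity: str, root: Path = STATE_ROOT) -> bool:
    return bool(identity) and preserve_marker_path(identity, root).is_file()


def clear_preserve_marker(identity: str, root: Path = STATE_ROOT) -> None:
    """Drop a stale marker at launch; a missing marker is not an error."""
    if identity:
        preserve_marker_path(identity, root).unlink(missing_ok=True)


def cleanup_state(identity: str, root: Path = STATE_ROOT) -> None:
    """Remove the whole per-bottle dir (the rm -rf step)."""
    if identity:
        shutil.rmtree(root / identity, ignore_errors=True)
```

The empty-identity guard matters most in `cleanup_state`: without it, `root / ""` is the state root and the call would remove every bottle's state.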

What's preserved when

| Exit cause | Marker set? | State kept? |
|---|---|---|
| Normal exit (Ctrl-D / exit) | no | no (cleaned up) |
| Agent crash | no | no |
| Operator Ctrl-Cs cli.py | no | no |
| capability-block approval | yes | yes (for cli.py resume) |
didericis added 1 commit 2026-05-25 06:51:29 -04:00
feat(state): clean up per-bottle state on session end (except capability-block)
test / unit (pull_request) Successful in 19s
test / integration (pull_request) Successful in 1m35s
9dbd20398e
Previously every bottle launch left ~/.claude-bottle/state/<identity>/
behind forever — metadata.json on every run, plus per-bottle
Dockerfile + transcript snapshot on capability-block rebuilds. The
metadata accumulated debris across launches; the only state worth
keeping was the capability-block rebuild bundle.

Make cleanup the default; preserve only on capability-block.

- bottle_state.py: .preserve marker helpers (mark_preserved,
  is_preserved, clear_preserve_marker, preserve_marker_path) +
  cleanup_state(identity) that rm -rf's the per-bottle dir.
- capability_apply.apply_capability_change writes mark_preserved
  before teardown so cli.py's session-end cleanup keeps the dir.
- prepare.py clears any leftover marker at launch (start or resume),
  so a marker from a prior capability-block doesn't keep state
  alive past a subsequent normal session-end.
- cli/start.py runs the cleanup decision AFTER the launch context
  closes: if is_preserved → print resume hint; else cleanup_state.
  The resume hint moves out of the launch with-block (was previously
  printed unconditionally — would have misled the operator about
  whether state was actually kept).
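
The ordering in the last bullet can be sketched as follows. This is an illustration under stated assumptions, not the code in cli/start.py: `launch`, `run_session`, and the `events` list are hypothetical stand-ins, and the marker is touched inline where the real flow would call mark_preserved.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def launch(identity: str, state_root: Path, events: list):
    """Hypothetical stand-in for the launch context: set up, then tear down."""
    (state_root / identity).mkdir(parents=True, exist_ok=True)
    events.append("launched")
    try:
        yield
    finally:
        events.append("torn down")


def settle_state(identity: str, state_root: Path, events: list) -> None:
    """Decide the fate of the state dir once teardown has finished."""
    state_dir = state_root / identity
    if (state_dir / ".preserve").is_file():
        events.append(f"hint: resume {identity}")
    else:
        shutil.rmtree(state_dir, ignore_errors=True)
        events.append("state removed")


def run_session(identity: str, state_root: Path, capability_block: bool = False) -> list:
    events = []
    with launch(identity, state_root, events):
        if capability_block:
            # Stands in for mark_preserved being called before teardown.
            (state_root / identity / ".preserve").touch()
    # Outside the with-block: the hint appears only when state was kept.
    settle_state(identity, state_root, events)
    return events
```

Because the decision runs after the with-block exits, the hint can never be printed for a state dir that is about to be removed.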

Future-proof: cli.py never persists state speculatively. If the
agent wants to be resumable, it has to go through capability-block.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis added 1 commit 2026-05-25 06:56:07 -04:00
feat(cleanup): prompt to remove per-bottle state, separately from containers
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m34s
fb2b5844c4
`cli.py cleanup` already enumerated orphan containers + networks
and asked for confirmation before nuking them. Per-bottle state
under ~/.claude-bottle/state/ wasn't touched — accumulated forever,
including orphans from old code paths.

Add state to the cleanup flow with its own prompt: the trade-off is
different from containers (which are pure debris) because a state
dir may carry a resumable bottle (capability-block rebuild +
transcript snapshot) the operator still wants.

Output shows the resumable / orphan / rebuilt-Dockerfile / transcript /
preserve-marker flags for each state dir so the operator sees what
they'd lose. Both sections are skippable independently — answering
"n" to containers doesn't skip the state prompt.
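
A rough sketch of the two independent prompts, assuming a hypothetical `run_cleanup` function and an injected `confirm` callable in place of the real interactive prompt. The per-dir flag labels are passed in precomputed, since this message does not say how each flag is derived.

```python
from typing import Callable


def run_cleanup(
    containers: list[str],
    state_dirs: dict[str, list[str]],
    confirm: Callable[[str], bool],
) -> dict[str, list[str]]:
    """Return what the operator agreed to remove, asking once per section."""
    to_remove: dict[str, list[str]] = {"containers": [], "state": []}

    # Section 1: orphan containers and networks are pure debris.
    if containers and confirm(f"Remove {len(containers)} orphan container(s)/network(s)?"):
        to_remove["containers"] = list(containers)

    # Section 2: asked regardless of the answer above, because a state
    # dir may still carry a resumable bottle.
    if state_dirs:
        for identity, flags in sorted(state_dirs.items()):
            print(f"  {identity}: {', '.join(flags) or 'no flags'}")
        if confirm(f"Remove {len(state_dirs)} state dir(s)?"):
            to_remove["state"] = sorted(state_dirs)

    return to_remove
```

Returning the selection instead of deleting in place keeps the prompt logic testable without touching Docker or the filesystem.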

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis added 1 commit 2026-05-25 07:05:25 -04:00
feat(state): preserve on crash + always snapshot transcript
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m31s
ef5d2f9a4d
Extends the preserve-on-capability-block design to also preserve
state on agent crash, and snapshots the transcript on every
teardown so any resume (crash or capability-block) gets a warm
claude session — not a cold start.

- capability_apply: rename _snapshot_transcript → snapshot_transcript
  (public; reused below). No behavior change in the capability path.
- cli/start.py: capture bottle.exec_claude's exit code; while the
  container is still alive (inside the launch context):
    * always snapshot_transcript(identity)
    * if exit_code != 0, mark_preserved(identity)
  Then the existing _settle_state runs after teardown.

Now the preservation matrix is:

  exit 0   (clean)          → snapshot + cleanup state
  exit ≠0  (crash, Ctrl-C)  → snapshot + preserve + show resume hint
  capability-block          → (already snapshotted/preserved by apply
                               before teardown; this path is a no-op
                               because the container is already gone
                               by the time exec_claude returns)

snapshot_transcript is best-effort — capability-block's earlier
snapshot is not clobbered when the container is already torn down,
and a missing /home/node/.claude is a warn + skip.
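
The matrix above can be sketched as one small function, assuming a hypothetical `handle_exit` wrapper that receives the snapshot and marker helpers as callables. Treating a failed snapshot as an `OSError` that is warned about and skipped is an assumption about how "best-effort" is implemented.

```python
from typing import Callable


def handle_exit(
    identity: str,
    exit_code: int,
    snapshot: Callable[[str], None],
    mark: Callable[[str], None],
) -> None:
    """Run while the container is still alive, before teardown."""
    if not identity:
        return  # empty identity: both steps are no-ops

    # Always snapshot so any later resume gets a warm session.
    try:
        snapshot(identity)
    except OSError as exc:
        # Best-effort: warn and carry on rather than fail the session.
        print(f"warn: transcript snapshot skipped: {exc}")

    # Any non-zero exit (crash, SIGINT/130, SIGKILL/137) preserves state.
    if exit_code != 0:
        mark(identity)
```

With recording callables, exit code 0 triggers only the snapshot, while 130 or 137 triggers the snapshot followed by the marker.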

Tested behavior: clean exit doesn't preserve, non-zero exit
(including SIGINT/130 and SIGKILL/137) preserves; empty identity
no-ops both helpers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
didericis merged commit 8e6ed278d0 into main 2026-05-25 07:07:54 -04:00