didericis/bot-bottle

PRD 0016: capability block remediation #22

Merged

didericis merged 7 commits from prd-0016-capability-block into main

2026-05-25 06:14:40 -04:00

Author	SHA1	Message	Date
didericis	4032e04a9c	feat(bottle): random-suffix identity + cli.py resume <identity>
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m30s

Replaces the cwd-hash identity with a random 5-char base36 suffix per launch, so two simultaneous `start <agent>` invocations against the same cwd no longer collide on container names. Each launch is its own bottle.

State carries metadata: every prepare step writes ~/.claude-bottle/state/<identity>/metadata.json with the (agent_name, cwd, copy_cwd, started_at) the bottle was launched with. The new `cli.py resume <identity>` reads this metadata and re-launches a bottle pinned to the same identity, picking up the per-bottle Dockerfile (from a prior capability-block apply) and the transcript snapshot under the same state dir.

- bottle_state.py: bottle_identity(agent_name) drops the cwd param and gains a random suffix; BottleMetadata dataclass + read/write/metadata_path helpers.
- BottleSpec gains an optional identity field: resume sets it to pin the identity; start leaves it empty so prepare mints a fresh one.
- prepare.py: writes metadata at launch time; uses spec.identity if provided (resume), else bottle_identity(agent_name) (fresh start).
- start.py: extracted _launch_bottle from cmd_start so resume can share the launch core; prints a `./cli.py resume <identity>` hint at session end.
- cli/resume.py (new): reads metadata, reconstructs BottleSpec with the recorded identity + cwd, delegates to _launch_bottle. Errors clearly when no state exists for the given identity.
- cli/__init__.py: registers `resume` in COMMANDS + usage.
- dashboard.py: capability-block approval status line now appends the `resume <identity>` hint so the operator can copy-paste the rebuild command without leaving the TUI.

Closes the rebuild loop in PRD 0016: agent calls capability-block → operator approves → bottle torn down with state preserved → status line shows resume command → operator runs it → replacement bottle boots with the new Dockerfile and prior transcript.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 06:09:45 -04:00
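The identity and metadata scheme described in this commit could look roughly like the sketch below. It is an illustration only, not the code from bottle_state.py: the `slugify` rule, the `-` separator between slug and suffix, the field types, and the explicit `state_root` parameter are all assumptions made for the example.

```python
import json
import re
import secrets
import string
from dataclasses import asdict, dataclass
from pathlib import Path

BASE36 = string.digits + string.ascii_lowercase  # 0-9 then a-z


def slugify(name: str) -> str:
    """Hypothetical slug rule: lowercase, runs of other characters become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def bottle_identity(agent_name: str, suffix_len: int = 5) -> str:
    """Mint a fresh identity: slug plus a random base36 suffix (separator assumed)."""
    suffix = "".join(secrets.choice(BASE36) for _ in range(suffix_len))
    return f"{slugify(agent_name)}-{suffix}"


@dataclass
class BottleMetadata:
    # Field names come from the commit message; the types are assumptions.
    agent_name: str
    cwd: str
    copy_cwd: bool
    started_at: str


def metadata_path(state_root: Path, identity: str) -> Path:
    return state_root / identity / "metadata.json"


def write_metadata(state_root: Path, identity: str, meta: BottleMetadata) -> Path:
    path = metadata_path(state_root, identity)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(asdict(meta), indent=2))
    return path


def read_metadata(state_root: Path, identity: str) -> BottleMetadata:
    path = metadata_path(state_root, identity)
    if not path.exists():
        # Mirrors the "errors clearly when no state exists" behaviour of resume.
        raise FileNotFoundError(f"no state for bottle identity {identity!r}")
    return BottleMetadata(**json.loads(path.read_text()))
```

With five base36 characters there are 36^5 (about 60 million) possible suffixes, so two simultaneous launches of the same agent are very unlikely to collide, though the sketch does not check for an existing state directory.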
didericis	e996f72532	fix(bottle): identity-key all per-bottle resources by (agent, cwd)
test / unit (pull_request) Successful in 16s
test / integration (pull_request) Successful in 1m30s

The single point that computed `slug = slugify(agent_name)` in prepare.py is now `slug = bottle_identity(agent_name, cwd)`. With --cwd the identity has a sha256(resolved-cwd)[:12] suffix, so the same agent against different projects gets distinct container names, network names, queue dir, audit log paths, and per-bottle state (Dockerfile + transcript). Without --cwd the identity is just slugify(agent_name), unchanged from before: no-cwd bottles look the same as today.

The downstream `slug` field on DockerBottlePlan keeps its name. Every module already threads it under "slug", and the value flowing through is now the bottle's full identity. A comment in prepare.py flags the change.

Fixes the bug surfaced in PR #22 review: running the same agent against project-A's cwd then project-B's would silently share project-A's per-bottle Dockerfile + transcript snapshot, container name (forcing serialized runs), and queue/audit history.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:46:26 -04:00
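A minimal sketch of the (agent, cwd) identity described in this commit follows. The `slugify` rule and the `-` separator are assumptions for illustration; only the sha256(resolved-cwd)[:12] suffix and the no-cwd fallback come from the commit message.

```python
import hashlib
import re
from pathlib import Path
from typing import Optional


def slugify(name: str) -> str:
    """Hypothetical slug rule: lowercase, runs of other characters become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def bottle_identity(agent_name: str, cwd: Optional[str] = None) -> str:
    """Identity is the agent slug, plus a 12-hex-char cwd hash when cwd is given."""
    slug = slugify(agent_name)
    if cwd is None:
        # No --cwd: identity is unchanged from the plain slug.
        return slug
    resolved = str(Path(cwd).resolve())
    digest = hashlib.sha256(resolved.encode("utf-8")).hexdigest()[:12]
    return f"{slug}-{digest}"
```

Because the hash is deterministic, the same agent launched twice against the same cwd maps to the same identity, while different projects get distinct ones.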
didericis	ac8f14ae6f	test(capability): integration test for apply_capability_change (PRD 0016)
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m30s

Phase 4 of PRD 0016. End-to-end test against real Docker:

- Stages a fake bottle: alpine:latest container named claude-bottle-<slug> with a marker file at /home/node/.claude/sessions.json, plus a fake supervise sidecar.
- Calls apply_capability_change with a new Dockerfile.
- Verifies: per-bottle Dockerfile written, agent + sidecars removed, networks removed, transcript snapshot dir on host contains the marker file (proving docker cp transferred bytes).
- Subsequent-apply test proves the per-bottle Dockerfile state persists across rebuilds (before-diff uses the prior override, not the repo Dockerfile).
- Teardown-idempotent test: apply against a never-started bottle doesn't raise.

docker exec / cp / rm / network rm work fine across the docker socket boundary, so this runs in DinD too; no act_runner skip needed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:30:04 -04:00
didericis	d9c47d0fbe	feat(dashboard): wire capability-block approval to real apply (PRD 0016)

Phase 3 of PRD 0016. dashboard.approve() now dispatches to apply_capability_change when the proposal is a capability-block:

cred-proxy-block → apply_routes_change
pipelock-block → apply_allowlist_change
capability-block → apply_capability_change (new in PRD 0016)

CapabilityApplyError joins the ApplyError tuple, so the TUI's key handlers catch it the same way and surface failures in the status line.

After a successful capability-block apply, dashboard archives the proposal+response itself, because the supervise sidecar was torn down by apply_capability_change and can't archive its own queue file. Without this, dashboard.discover_pending would keep surfacing the resolved proposal forever.

No audit log for capability-block per PRD 0013; its record lives in the per-bottle Dockerfile state + transcript snapshot.

Tests stub apply_capability_change at the dashboard module level, add TestCapabilityApplyWiring (call wiring, failure-keeps-pending, no-audit invariant, archive-after-apply), and update TestApproveReject to stub the capability path too so it stays docker-independent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:28:35 -04:00
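The dispatch described in this commit can be pictured as a lookup from proposal kind to apply function, with the dashboard archiving only after a successful capability-block apply. The sketch below is illustrative: the proposal shape (a dict with a `kind` key), the stub handlers, and the `archive` callback are assumptions, not the dashboard's actual code.

```python
from typing import Callable, Dict


class ApplyError(Exception):
    """Stand-in for the errors the TUI key handlers catch."""


# Stub handlers so the sketch runs without Docker; the real ones do the work.
def apply_routes_change(proposal: dict) -> str:
    return "routes"


def apply_allowlist_change(proposal: dict) -> str:
    return "allowlist"


def apply_capability_change(proposal: dict) -> str:
    return "capability"


APPLY_BY_KIND: Dict[str, Callable[[dict], str]] = {
    "cred-proxy-block": apply_routes_change,
    "pipelock-block": apply_allowlist_change,
    "capability-block": apply_capability_change,
}


def approve(proposal: dict, archive: Callable[[dict], None]) -> str:
    """Dispatch on proposal kind; archive from the dashboard for capability-block."""
    kind = proposal["kind"]  # assumed field name
    handler = APPLY_BY_KIND.get(kind)
    if handler is None:
        raise ApplyError(f"unknown proposal kind: {kind!r}")
    result = handler(proposal)  # if this raises, the proposal stays pending
    if kind == "capability-block":
        # The supervise sidecar is gone by now, so the dashboard archives itself.
        archive(proposal)
    return result
```

Archiving after the handler returns means a failed apply leaves the proposal in the pending queue, which matches the failure-keeps-pending test named in the commit.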
didericis	0899a898e0	feat(capability): host-side apply_capability_change orchestrator (PRD 0016)

Phase 2 of PRD 0016. New module claude_bottle/backend/docker/capability_apply.py:

- apply_capability_change(slug, new_dockerfile): snapshot transcript → push working tree → write per-bottle Dockerfile → teardown. Returns (before, after) for the dashboard's audit/diff render.
- fetch_current_dockerfile(slug): per-bottle Dockerfile if set, else the repo's Dockerfile.
- Internal helpers _snapshot_transcript, _push_working_tree are best-effort (log + return on failure); _teardown_bottle is idempotent (force-rm + network rm silently ignore missing names).

Fire-and-forget from the agent's perspective: by the time the dashboard writes the response file the supervise sidecar is already gone (it was torn down), so the agent's tool call connection drops without receiving the response. The replacement agent (next manual `cli.py start <agent>`) sees the new per-bottle Dockerfile and the transcript snapshot for resume. v1 does not auto-relaunch.

Tests cover sequencing (snapshot → push → teardown order), the per-bottle vs repo Dockerfile fallback chain, empty-input rejection, and the per-bottle-Dockerfile write. The docker exec / cp / rm plumbing is covered by the Phase 4 integration test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:26:38 -04:00
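The sequencing in this commit (snapshot → push → write → teardown, with a per-bottle vs repo Dockerfile fallback for the "before" side) can be sketched as follows. This is not the module's real signature: the keyword-only `state_root`, `repo_dockerfile`, `snapshot`, `push`, and `teardown` parameters are injected here purely so the example runs without Docker, and the `Dockerfile` filename under the state dir is an assumption.

```python
import logging
from pathlib import Path
from typing import Callable, Tuple

log = logging.getLogger("capability_apply")


class CapabilityApplyError(Exception):
    pass


def _best_effort(name: str, step: Callable[[str], None], slug: str) -> None:
    """Run a step; log and carry on if it fails."""
    try:
        step(slug)
    except Exception as exc:
        log.warning("%s failed for %s: %s", name, slug, exc)


def apply_capability_change(
    slug: str,
    new_dockerfile: str,
    *,
    state_root: Path,
    repo_dockerfile: Path,
    snapshot: Callable[[str], None],
    push: Callable[[str], None],
    teardown: Callable[[str], None],
) -> Tuple[str, str]:
    if not new_dockerfile.strip():
        raise CapabilityApplyError("new Dockerfile is empty")

    # Fallback chain: per-bottle override if present, else the repo Dockerfile.
    override = state_root / slug / "Dockerfile"
    source = override if override.exists() else repo_dockerfile
    before = source.read_text()

    _best_effort("snapshot", snapshot, slug)  # transcript snapshot
    _best_effort("push", push, slug)          # working tree push

    override.parent.mkdir(parents=True, exist_ok=True)
    override.write_text(new_dockerfile)

    teardown(slug)  # expected to be idempotent
    return before, new_dockerfile
```

Reading `before` ahead of the write is what lets a second apply diff against the prior override rather than the repo Dockerfile.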
didericis	02811e0417	feat(bottle): per-bottle Dockerfile state + image build hook (PRD 0016)

Phase 1 of PRD 0016. Lays the per-bottle state plumbing that capability-block remediation will write into:

- claude_bottle/backend/docker/bottle_state.py: bottle_state_dir, per_bottle_dockerfile (read), write_per_bottle_dockerfile, per_bottle_image_tag (unique per slug), transcript_snapshot_dir. Stores under ~/.claude-bottle/state/<slug>/.
- prepare.py: when a per-bottle Dockerfile exists, use per_bottle_image_tag(slug) as the base image and pass the per-bottle Dockerfile path through DockerBottlePlan.dockerfile_path. --cwd still layers a derived image on top.
- launch.py: passes plan.dockerfile_path to build_image so the per-bottle Dockerfile is what docker build reads.
- DockerBottlePlan gains dockerfile_path field; print() surfaces it in the preflight summary so the operator can see at a glance that this bottle is running on a rebuilt image.

Phase 2 will write to write_per_bottle_dockerfile (capability-block approval); Phase 3 wires it into the dashboard.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:23:31 -04:00
didericis	de87f21ff8	docs(prd-0016): capability block remediation
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m13s

Adds PRD 0016, the heaviest of the three remediation engines in the stuck-agent recovery flow (overview in PRD 0012, foundation in PRD 0013). Wires the capability block path: rebuild orchestrator, state-preservation helper, capability-block end-to-end. On approval the orchestrator tears down the bottle, builds from the new Dockerfile, and starts a replacement on the same branch via state-preservation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:15:32 -04:00