Commit Graph

55 Commits

Author SHA1 Message Date
didericis 4991d5b3ee feat(dashboard): new-agent flow spawns into right tmux pane
PRD 0021 chunk 3. The `n` flow (PRD 0020 chunk 2) now routes
the first claude session of a freshly-started bottle into the
right tmux pane when `\$TMUX` is set — same `_attach_in_tmux`
state machine the Enter re-attach uses, just with
`resume=False` so claude starts fresh.

Outside tmux the existing foreground handoff is unchanged.

The compose-up phase (`backend.launch.__enter__`) still drops
curses for its stderr output; we restore curses BEFORE
spawning into the right pane so the dashboard re-renders
alongside the new claude session instead of waiting for
attach to return.
2026-05-26 14:27:37 -04:00
didericis 9944878277 feat(dashboard): tmux split-pane helpers + Enter dispatch
PRD 0021 chunk 2. New tmux integration: when `\$TMUX` is set
and the operator presses Enter on a focused agent row, the
dashboard spawns / respawns the right pane with that bottle's
claude session instead of taking over the terminal via
curses.endwin.

Mechanics:

  - `_in_tmux()` — true when `\$TMUX` is set.
  - `_tmux_split_pane_create` — first attach: `tmux split-window
    -h -P -F '#{pane_id}'` opens a right pane and prints its id
    for tracking.
  - `_tmux_respawn_pane` — subsequent attaches: `tmux
    respawn-pane -k -t <id>` swaps the content without
    re-splitting.
  - `_tmux_pane_exists` — `tmux list-panes` check before
    respawn so a manually-closed pane gracefully falls back to
    a fresh split.
  - `_attach_in_tmux` — owns the create-or-respawn state
    machine, mutates `tmux_state` ({pane_id, slug}) so the
    main loop tracks the right-pane occupant.
  - `_attach_via_handoff` — the previous curses-endwin path,
    extracted as the fallback when tmux is missing or fails.
  - `_attach_to_bottle` dispatches: in tmux + state available →
    `_attach_in_tmux`; otherwise → handoff.

Main loop gets `tmux_state: dict = {"pane_id": None, "slug":
None}`. Chunks 3 + 4 wire it through the new-agent flow and
the stop hook.

`FileNotFoundError`-safe `subprocess.run` calls around every
tmux invocation — a missing tmux binary cleanly falls back to
the handoff for that keypress. 478 unit tests pass (10 new
for the pure argv builders + `_claude_runtime_args`).
2026-05-26 14:26:40 -04:00
didericis ae6d11f09d fix(dashboard): use os._exit on quit so bottles survive the dashboard
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m9s
The `bottles` dict held `@contextmanager`-wrapped launch contexts.
On normal Python interpreter shutdown those context managers'
generators got GC'd, which raised GeneratorExit at the yield
point and ran the `finally` block — invoking each bottle's
teardown and tearing down the compose project. Net effect: `q`
WAS implicitly stopping every dashboard-launched bottle even
though the keypress handler just `return`'d.

`os._exit(0)` skips all Python-level cleanup (GC, atexit, etc.),
so the docker compose projects survive the dashboard exit
untouched. Curses gets explicit `endwin()` first because the
brutal exit skips curses.wrapper's normal terminal restoration.

Matches PRD 0020's resolved-question answer (`q` does NOT tear
down bottles; teardown is always explicit via `x` or
`./cli.py cleanup`).
2026-05-26 13:48:16 -04:00
didericis 14d5c78370 fix(attach): use --continue (no picker) instead of --resume
`--resume` alone surfaces claude's session picker even when only
one session exists. `--continue` jumps to the most recent session
non-interactively, which is the actual behavior the dashboard's
Enter re-attach wants for typical bottle-with-one-session cases.
2026-05-26 13:48:16 -04:00
didericis 832e92c7a6 feat(attach): pass --resume on dashboard re-attach
Re-entering a running bottle from the dashboard (Enter on the
agents pane) now invokes claude with `--resume` so the session
picks up the prior conversation history rather than starting a
fresh transcript. The first-attach paths (`./cli.py start` and
the dashboard's new-agent `n` flow) leave it off — the
transcript doesn't exist yet there.

`attach_claude` gains a `resume: bool = False` kwarg;
`_attach_to_bottle` in the dashboard passes `True`.
2026-05-26 13:48:16 -04:00
didericis 3ed3745982 feat(dashboard): x stops a dashboard-owned bottle (PRD 0020 chunk 4)
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m7s
Final PRD 0020 chunk. `x` on a focused agents-pane row tears
down the selected bottle if the dashboard owns it (started via
the chunk-2 `n` flow): pops `(cm, bottle, identity)` from the
main loop's bottles map, snapshots the transcript best-effort,
calls `cm.__exit__(None, None, None)` to drive the existing
compose-down + network-remove sequence, then `settle_state` to
honor any pre-existing preserve marker.

On a non-owned slug (discovered via `list_active_slugs` but not
in the dashboard's bottles dict — i.e., previous-dashboard or
external `./cli.py start` bottle), `x` is a no-op with a status
hint pointing at `./cli.py cleanup`. Matches the PRD's
cross-dashboard re-attach model: the dashboard can re-attach
either kind, but can only tear down its own.

The PRD's chunk 5 ("quit-cleanup") is satisfied by the existing
no-op behavior of `q` — per the user's resolved-question
answer, quit leaves bottles running unchanged. No code change
needed for that.

Footer surfaces `[x] stop`. 465 unit tests pass (1 new for the
non-owned no-op path; the owned path is integration territory
because it drives a real compose-down).
2026-05-26 03:46:57 -04:00
didericis 572306ddb6 feat(dashboard): Enter on agents pane re-attaches to bottle
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m11s
PRD 0020 chunk 3. Enter on a focused agents-pane row drops to a
claude session inside the selected bottle. Works for both
dashboard-owned bottles (looks up the stored Bottle handle in
the main loop's `bottles` dict) and externally-discovered ones
(synthesizes a DockerBottle from the slug → `claude-bottle-<slug>`
container name).

For the synthesized path, the `--append-system-prompt-file`
target resolves via metadata.json + the manifest's agent prompt
if both can be read; otherwise the re-attach runs without the
flag (claude defaults to no system prompt, the bottle's other
state is untouched).

Shares the curses.endwin → attach → refresh handoff with the
chunk-2 new-agent flow via a new `_attach_to_bottle` helper.
Footer reshuffled to advertise `[Enter] view/attach`. 464 unit
tests pass (3 new for `_bottle_for_slug`).
2026-05-26 03:39:58 -04:00
didericis 309ffaa4ab feat(dashboard): agent picker modal + new-agent (n) flow
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m7s
PRD 0020 chunk 2. Pressing `n` opens a modal that lists every
agent from the manifest with `(N running)` suffixes for ones
that already have bottles up. Type to filter (substring,
case-insensitive); j/k or arrows to navigate; Enter to confirm;
Esc clears the filter on first press, exits the picker on the
second.

On confirmation, the dashboard runs:

  - `prepare_with_preflight` from chunk 1 with curses-modal
    render + prompt callables (the preflight modal centers the
    plan summary + captures [y/N]).
  - `backend.launch(plan).__enter__()` — enters but doesn't bind
    the context to a `with`. The (cm, bottle, identity) tuple
    lands in the main loop's `bottles` dict keyed by slug.
  - `curses.endwin()` → `attach_claude(bottle)` → `stdscr.refresh()`
    handoff. The agent's claude session takes over the terminal;
    on exit the dashboard re-renders with the bottle now visible
    in the agents pane.

Crucially the context manager is held alive in `bottles` — never
`__exit__`'d at quit. Chunk 4 will wire `x` to that exit; for
now bottles started from the dashboard stay running until
explicit cleanup. Matches the PRD's "q does not tear down"
decision.

Footer surfaces `[n] new agent`. 461 unit tests pass (8 new for
`_filter_agents` and `_running_counts`).
2026-05-26 03:22:44 -04:00
didericis a56be6beb5 refactor(start): extract prepare_with_preflight + attach_claude
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m7s
PRD 0020 chunk 1. `cli/start.py`'s `_launch_bottle` did three
things in one function: prepare + preflight, attach claude, and
settle state on teardown. Split them so the dashboard (PRD 0020
chunk 2+) can reuse the prepare + attach pieces piecewise
without going through the CLI's one-shot orchestrator:

  - `prepare_with_preflight(spec, *, stage_dir, render_preflight,
    prompt_yes, dry_run)` — injects render + prompt callables so
    the CLI binds them to stderr/stdin while the dashboard binds
    them to a curses modal. Returns `(plan, identity)`; identity
    is set after `backend.prepare` returns so callers can reap
    the prepare-time state dir on abort via `settle_state` in
    their finally — preserving today's preflight-N cleanup.
  - `attach_claude(bottle, *, remote_control)` — runs claude
    inside the bottle and returns its exit code. The dashboard
    calls this from inside a `curses.endwin` → … →
    `stdscr.refresh()` handoff.
  - `capture_session_state` / `settle_state` lose their leading
    underscore; the dashboard will call them on
    session-end + explicit-stop respectively.

`_launch_bottle` becomes a thin orchestrator over those helpers.
No behavior change; all 453 unit tests pass and `./cli.py start
implementer --dry-run` produces identical preflight output.
2026-05-26 03:12:29 -04:00
didericis 7b29c81f27 feat(dashboard): agent-scoped e/p, drop discover-and-prompt path
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m6s
PRD 0019 chunk 4 (final). The `e` (routes edit) and `p` (pipelock
edit) keys now require an agent selection in the agents pane.
Pressing them with the proposals pane focused, with no active
agents, or with an out-of-range selection is a no-op with a
status hint ("no agent selected; Tab into the agents pane first").

The discover-and-prompt scaffolding inside
`_operator_edit_routes_flow` / `_operator_edit_allowlist_flow` /
`_operator_edit_flow` is gone. The flows now take an `ActiveAgent`
+ required-service name; they refuse with a clear message when
the bottle lacks the requested sidecar (e.g., `routes edit`
against a bottle with no `bottle.egress.routes` declared). The
`discover_egress_slugs` + `discover_pipelock_slugs` +
`_discover_active_with_service` helpers come out — they had no
remaining callers.

Footer now reads `[e/p] edit selected agent`.
2026-05-26 01:50:28 -04:00
didericis 0abffc4d90 feat(dashboard): Tab toggle + per-pane selection state
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m4s
PRD 0019 chunk 3. The TUI now has two focusable panes — proposals
and agents — and `Tab` toggles which one the `j/k`/arrow keys
move through.

Each pane keeps its own selection index. Switching panes doesn't
lose the position in the other; the cursor (`>` + reverse-video
row) appears only in the focused pane. The label line on each
pane shows "(focused)" when active.

Footer reshuffled: `[Tab] switch pane  [j/k] move  [Enter] view
[a/m/r] proposal  [e/p] edit  [q] quit`. When the agents pane is
focused and there's no status message to display, the idle
status line surfaces the currently-selected agent (or "[no
active agents]" / "[no agent selected]" fallbacks) so the
operator knows what an agent-scoped edit verb will target after
chunk 4 wires them up.

Proposal action keys (a/m/r/Enter) are gated on the proposals
pane being focused — pressing them with the agents pane focused
is a no-op. e/p still use the global discover-and-prompt flow
for one more chunk; chunk 4 swaps them to read the agents-pane
selection.
2026-05-26 01:37:23 -04:00
didericis cfd8f269ba feat(dashboard): render active agents pane below proposals
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m4s
PRD 0019 chunk 2. The TUI's main render now draws two panes:
proposals on top (existing), active agents on the bottom (new).
Header counts both totals. The agents pane refreshes on the
same 1s tick — agents starting/stopping reflect without
operator action.

Each agent row shows slug, agent name, started-time (HH:MM:SS
of the metadata.json timestamp), and the bracketed list of
sidecars currently up. The `agent` service is filtered out of
the displayed list — it's always present so it'd be noise; the
sidecars are the differentiator. A bottle whose only running
service is `agent` (sidecars still warming up) renders as
`(starting)`.

No selection model yet — that's chunk 3. The cursor stays in
the proposals pane; `j/k`/arrow nav and the proposal action
keys are unchanged.
2026-05-26 01:23:59 -04:00
didericis 6e4a9f606f feat(dashboard): discover_active_agents helper + ActiveAgent dataclass
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m6s
PRD 0019 chunk 1. New `discover_active_agents()` in dashboard.py
returns one `ActiveAgent(slug, agent_name, started_at, services)`
per currently-running compose project:

  - Slugs come from `list_active_slugs()` (chunk-5 shared helper).
  - The service set per project comes from ONE label-filtered
    `docker ps` call (PRD open question #1: avoids N per-bottle
    `compose ps` invocations on each 1s refresh tick).
  - agent_name + started_at come from each bottle's
    metadata.json; "?" / "" fallbacks when the file is missing
    so the row renders rather than vanishes.

Not wired into the TUI yet — chunk 2 renders the agents pane.
The parser (`_parse_services_by_project`) is split out as a pure
function so the conditional-input shape can be unit-tested
without docker.
2026-05-26 01:11:54 -04:00
didericis 1fa3745832 refactor(dashboard): discover via docker compose ls
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m8s
PRD 0018 chunk 5. The dashboard's operator-edit verbs
(`routes edit`, `pipelock edit`) enumerated running sidecars
via `docker ps --filter name=...` prefix scans. Switch to
`docker compose ls`-based discovery so the dashboard, cleanup
CLI, and launch step all agree on what's running.

Mechanics:

  - `claude_bottle/backend/docker/compose.py` grows three shared
    helpers: `list_compose_projects` (the JSON parse moved out
    of cleanup), `slug_from_compose_project` (inverse of
    `compose_project_name`), and `list_active_slugs` (sugar over
    the first two for the common "what's running?" question).
  - cleanup.py drops its private `_list_compose_projects` +
    `_PROJECT_PREFIX` in favor of the shared ones; `list_active`
    simplifies (one compose-ls call, not two).
  - dashboard.py's `_discover_sidecar_slugs` becomes
    `_discover_active_with_service`: cross-references the active
    slug list with a label-filtered `docker ps` so only bottles
    whose given service container is actually up surface in the
    edit menu. Bottles without an egress sidecar (no
    bottle.egress.routes) no longer appear for `routes edit`.

3 new unit tests cover the slug ↔ compose-project naming
contract; manual probe with a fake compose project confirms
both `discover_egress_slugs` and `discover_pipelock_slugs`
return the expected slug.
2026-05-26 00:14:16 -04:00
didericis aee249f119 refactor(cleanup): compose-ls driven, plus orphan state-dir reaping
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m9s
PRD 0018 chunk 4. `claude-bottle cleanup` now derives its work
from `docker compose ls --all --format json`, filtered to projects
whose name starts with `claude-bottle-`. Per project: one `compose
down --volumes` removes the containers + the compose-managed
networks atomically.

The plan also enumerates three fallback buckets:

  - Stray containers — `claude-bottle-*` containers with no
    `com.docker.compose.project` label (left over from pre-compose
    code paths). Cleared via `docker rm -f`.
  - Stray networks — `claude-bottle-*` networks with no compose
    project label. Cleared via `docker network rm`.
  - Orphan state dirs — per-bottle `~/.claude-bottle/state/<id>/`
    dirs with no live project AND no `.preserve` marker. The
    `.preserve` marker (capability-block or auto-preserve-on-crash)
    explicitly opts-out of reaping; manual `rm -rf` is the only
    path for preserved state.

cli/cleanup.py collapses to a single y/N prompt — backend.prepare_cleanup
returns everything in one plan, backend.cleanup processes everything,
no more double-prompt for state. The CLI-side state-dir enumeration
+ `_state_summary` flags from PR #25 are gone; the backend's
orphan-detection rules subsume them.
2026-05-25 23:48:02 -04:00
didericis cd82a48399 refactor(state): write prepare-time scratch files under state/<slug>/
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m5s
PRD 0018 chunk 2. Each sidecar's prepare-time output (pipelock yaml +
CAs, egress routes.yaml + CAs, git-gate entrypoint + hooks, supervise
current-config, agent env + prompt) now lands in
~/.claude-bottle/state/<slug>/<service>/ instead of an ephemeral
mktemp dir. The state subdirs become the stable bind-mount sources
that chunk 3's docker compose project will reference.

The SDK launch path is unchanged — `docker cp` still copies from the
plan-held paths into containers, just from new locations. start.py's
session-end cleanup is now in `finally`, which also reaps state dirs
left behind by dry-run / preflight-N / prepare-exception paths
(previously only the post-launch path settled state).
2026-05-25 22:53:47 -04:00
didericis 1e5b0dcfca refactor: rename egress-proxy → egress everywhere
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m10s
The manifest key is `egress:` now; finish the rename so the rest of
the codebase matches. Files (Dockerfile.egress, claude_bottle/egress.py
etc.), classes (Egress, EgressConfig, EgressRoute, EgressPlan,
DockerEgress), constants (EGRESS_HOSTNAME, EGRESS_ROUTES, ...),
container name prefix (claude-bottle-egress-*), docker network alias
(egress), the introspection host (_egress.local), the MCP tool IDs
(egress-block, list-egress-routes), and the preflight label all drop
the `-proxy` suffix.
2026-05-25 21:59:47 -04:00
didericis 572106d98f refactor(cli): drop --format=json end-to-end
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m2s
Companion to the compact preflight in #31 — the JSON format was
the structured alternative to the verbose text summary. With the
new compact text already on screen, no consumer was using the
JSON shape, and the abstract `BottlePlan.to_dict` was the
biggest piece of API surface no one is implementing against.

Removed:
  - `--format` CLI flag from `start` and `resume`.
  - `output_format` kwarg from `_launch_bottle`.
  - `BottlePlan.to_dict` abstract method.
  - `DockerBottlePlan.to_dict` (60-line dict builder).
  - The `_PlanView` dataclass — `print` was the only remaining
    caller, so the env-name computation is inlined.
  - `tests/integration/test_dry_run_plan.py` (JSON-shape
    integration test).
  - `tests/unit/test_cli_start_format.py` (flag-conflict unit).

Plan-introspection is still possible by reading the
`DockerBottlePlan` dataclass directly — fields like `image`,
`container_name`, `stage_dir`, `use_runsc` are all there. Tooling
that needs a stable wire shape can JSON-serialize the dataclass
themselves.

411 unit + integration tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 20:54:51 -04:00
didericis 1542ee0b93 feat(egress-proxy-block): single-route input + merge-on-apply
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m14s
Instead of asking the agent to compose and submit a full routes
file, the tool now takes ONE proposed route — host + optional
path_allowlist + optional auth — and the supervisor merges it
into the live routes table at approval time. The agent no longer
needs to fetch / reproduce / extend the existing allowlist; it
just describes the host it wants reachable.

Tool input (new):
  - `host` (required)
  - `path_allowlist` (optional, array of absolute path prefixes)
  - `auth` (optional, {scheme, token_ref})
  - `justification` (required)

Merge semantics (in `egress_proxy_apply._merge_single_route`):
  - Host NOT in current routes → append the proposed route as a
    new entry. If `auth` is set, assign the next EGRESS_PROXY_TOKEN_N
    slot.
  - Host already present → union the proposed `path_allowlist`
    with the existing one (proposed entries appended after
    existing, deduped). Existing `auth_scheme` / `token_env`
    preserved; proposed `auth` ignored (operator-controlled, not
    agent-controlled).
  - Hostname comparison is case-insensitive.

Dashboard wiring: `approve()` on an egress-proxy-block proposal
now calls `add_route(slug, proposed_route_json)` instead of
`apply_routes_change(slug, full_file)`. add_route fetches the
current routes from the running egress-proxy, merges, and calls
apply_routes_change with the merged content — so the
pipelock-mirror + SIGHUP plumbing from chunk 3 still runs
end-to-end. Audit diff still captures the full-file before/after.

Tool description rewritten to make the new shape obvious and to
stop pointing the agent at the routes file. The
`list-egress-proxy-routes` tool stays available for agents that
want to see what's currently allowed.

Tests: 9 new `_merge_single_route` cases (host absent/present,
path-allowlist union+dedup, auth-slot indexing, case-insensitive
match, existing-auth preservation, missing-host rejection,
malformed-current rejection). 407 unit + integration pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 18:45:17 -04:00
didericis 9cd583fbbb feat(egress-proxy): retarget remediation at egress-proxy (PRD 0017 chunk 3)
test / unit (pull_request) Successful in 19s
test / integration (pull_request) Successful in 1m6s
Finishes PRD 0017. The `cred-proxy-block` MCP tool is renamed and
its remediation apply path is repointed at egress-proxy.

  - `claude_bottle/supervise.py` — `TOOL_CRED_PROXY_BLOCK` →
    `TOOL_EGRESS_PROXY_BLOCK`; `COMPONENT_FOR_TOOL` maps the new
    tool ID to `egress-proxy` for audit-log routing.

  - `claude_bottle/supervise_server.py` — tool definition renamed
    + description rewritten: "Call when egress-proxy refused your
    HTTPS request ... Read the current routes.yaml from /etc/
    claude-bottle/current-config/routes.yaml, compose a modified
    version, pass the full new file plus a justification." The
    syntactic validator dispatches on the new tool ID.

  - `claude_bottle/backend/docker/egress_proxy_apply.py` — renamed
    from `cred_proxy_apply.py`. Reads routes.yaml from
    /etc/egress-proxy/routes.yaml via `docker exec cat`; validates
    via `egress_proxy_addon_core.load_routes` (so both sides use
    the same parser); writes via `docker cp`; SIGHUPs egress-proxy
    with `docker kill --signal HUP`. `EgressProxyApplyError`
    replaces `CredProxyApplyError`.

  - `claude_bottle/cli/dashboard.py` — wires the new apply +
    `discover_egress_proxy_slugs` helper; the operator-initiated
    `routes edit <bottle>` verb now writes to egress-proxy with
    `.yaml` suffix. Stale follow-up comment about path-aware
    filtering removed — PRD 0017 settled that question.

  - `tests/integration/test_supervise_sidecar.py` — restores the
    approval round-trip test (chunk 2 had switched it to a reject
    path because no cred-proxy existed). Approval stubs
    `apply_routes_change` so the test focuses on the supervise
    queue/response plumbing rather than docker-exec into a real
    egress-proxy sidecar (that's covered separately).

  - `tests/unit/test_egress_proxy_apply.py` — rewritten against
    the new validator; covers JSON shape, missing routes key,
    partial-auth-pair rejection (the addon-core parser catches
    these before SIGHUP).

  - PRDs 0010 + 0014 — status headers updated to
    Superseded / Retargeted with a callout block pointing at PRD
    0017's migration section. Historical text preserved.

384 unit + integration tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 15:13:44 -04:00
didericis 6066bb4d4c fix(dashboard): show the literal new allowlist line in green, no prefix
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m37s
The "→ would allow host: api.github.com" framing added narration
where none was needed. Just render the host on its own line in
green — that's literally the text that gets appended to pipelock's
allowlist on approve, and the green color carries "what's about to
change". The URL (with path) is still right above for context.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 08:28:29 -04:00
didericis 97ff506783 feat(dashboard): highlight new hostname in green on pipelock-block detail
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m32s
When the operator opens a pipelock-block proposal in the detail
view (Enter / 'v'), append a green-coloured line:

    → would allow host: api.github.com

so what's actually about to change is obvious at a glance. The
full failed URL stays above the new line (the path is operator
context — pipelock can't enforce it, just records intent).

- _detail_lines now returns (text, attr) tuples; pipelock-block
  appends the host-extract line tagged with the green color pair.
- _detail_view threaded the green_attr through from the main loop
  (matches the new-proposal highlight pattern from earlier in this
  PR).
- Best-effort URL parsing; unparseable payloads skip the highlight
  line rather than render a misleading blank host.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 08:25:24 -04:00
didericis 82d6534e6b docs(pipelock-block): flag follow-up for path-aware filtering
test / unit (pull_request) Successful in 16s
test / integration (pull_request) Successful in 1m33s
PR #25's pipelock-block tool sends a full URL and the supervisor
extracts just the hostname for pipelock's allowlist — pipelock
2.3.0's api_allowlist is hostname-only (verified by inspecting the
binary's strict preset). The path component is operator context,
not enforced.

Document the follow-up shape inline at the apply site so a future
reader looking at why we're throwing away the path lands on the
plan: adding `auth_scheme: none` + `path_allowlist` to cred-proxy,
and rewiring pipelock-block to propose cred-proxy routes instead
of pipelock hostnames. Multi-touch change, its own PRD.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 08:15:38 -04:00
didericis f3f2e3e9ab feat(pipelock-block): tool sends failed URL, supervisor merges host
test / unit (pull_request) Successful in 16s
test / integration (pull_request) Successful in 1m32s
Reshape the pipelock-block MCP tool around what the agent actually
knows at the moment of failure (the URL pipelock just refused), not
what the operator needs (a full allowlist file).

Before: agent had to read /etc/claude-bottle/current-config/allowlist,
copy the whole file, append their host, send back. Lots of work,
easy to get wrong, and the operator's diff was noisy because the
proposal contained every host the agent saw — most of which weren't
the change.

After: agent calls
  pipelock-block(failed_url="https://api.github.com/repos/foo/bar",
                 justification="...")
supervisor extracts api.github.com, fetches the running allowlist,
adds the host if not already present, applies the merged content.

Path is captured as operator context (the detail view labels it
"failed URL" instead of "proposed file") but isn't enforced —
pipelock's api_allowlist is hostname-only, so the path can't
become an allow rule.

- supervise_server: pipelock-block input schema gains `failed_url`
  (replaces `allowlist`); validate_proposed_file checks for
  http/https + hostname.
- PROPOSED_FILE_FIELD updated; tool description rewritten.
- dashboard._apply_pipelock_url: extract host, fetch current,
  merge, apply.
- _proposed_payload_label: detail view renders "failed URL" for
  pipelock-block, "proposed file" otherwise.
- Tests updated end-to-end; new url-host-merge + idempotent-merge +
  invalid-url cases added.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 08:02:53 -04:00
didericis a9bb34cb77 feat(dashboard): highlight newly-arrived proposals in green for 5s
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m34s
When a new proposal lands in the dashboard's list, the operator
shouldn't have to compare the list to a mental snapshot to spot
what's new. Render newly-arrived proposals in green for the first
five seconds after they show up.

- _try_init_green: initialise a green color pair; returns 0 if the
  terminal lacks color so the highlight degrades to no-op.
- _main_loop tracks first_seen[proposal_id] across refresh ticks,
  pruning entries when a proposal leaves the queue.
- _render ORs green into the existing attr (composes with selection
  reverse-video — terminal handles the mix).

Applies to all tool types (cred-proxy-block, pipelock-block,
capability-block). If a tool-specific highlight is wanted later,
filter on qp.proposal.tool in _is_recent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 07:54:34 -04:00
didericis 4e4051f420 fix(dashboard): auto-refresh the TUI every 1s
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m34s
The main loop blocked on stdscr.getch() until the operator hit a
key — a tool call landing in the queue while the operator was just
watching wouldn't appear on the screen. The operator had to press
any key to trigger a re-render and see the new proposal.

Switch to stdscr.timeout(1000): getch returns -1 after 1s if no
key was pressed, and the loop re-renders with the latest
discover_pending() result. CPU cost is trivial; the loop body is
~one filesystem scan + curses draw per second.

Also restructure status_line lifecycle: was cleared right after
every render, which meant a timeout-driven re-render would wipe
the message ~1s after the operator's keystroke set it. Now
status_line is cleared only on actual key press, so messages
like "approved cred-proxy-block for [dev-xyz]" persist until the
operator does something else.

Detail view + prompt view are unchanged — they're modal, the
underlying proposal data doesn't move, and getstr can't tolerate
a re-render mid-input.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 07:48:24 -04:00
didericis ef5d2f9a4d feat(state): preserve on crash + always snapshot transcript
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m31s
Extends the preserve-on-capability-block design to also preserve
state on agent crash, and snapshots the transcript on every
teardown so any resume (crash or capability-block) gets a warm
claude session — not a cold start.

- capability_apply: rename _snapshot_transcript → snapshot_transcript
  (public; reused below). No behavior change in the capability path.
- cli/start.py: capture bottle.exec_claude's exit code; while the
  container is still alive (inside the launch context):
    * always snapshot_transcript(identity)
    * if exit_code != 0, mark_preserved(identity)
  Then the existing _settle_state runs after teardown.

Now the preservation matrix is:

  exit 0   (clean)          → snapshot + cleanup state
  exit ≠0  (crash, Ctrl-C)  → snapshot + preserve + show resume hint
  capability-block          → (already snapshotted/preserved by apply
                               before teardown; this path is a no-op
                               because the container is already gone
                               by the time exec_claude returns)

snapshot_transcript is best-effort — capability-block's earlier
snapshot is not clobbered when the container is already torn down,
and a missing /home/node/.claude is a warn + skip.

Tested behavior: clean exit doesn't preserve, non-zero exit
(including SIGINT/130 and SIGKILL/137) preserves; empty identity
no-ops both helpers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 07:05:23 -04:00
didericis fb2b5844c4 feat(cleanup): prompt to remove per-bottle state, separately from containers
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m34s
`cli.py cleanup` already enumerated orphan containers + networks
and asked for confirmation before nuking them. Per-bottle state
under ~/.claude-bottle/state/ wasn't touched — accumulated forever,
including orphans from old code paths.

Add state to the cleanup flow with its own prompt: the trade-off is
different from containers (which are pure debris) because a state
dir may carry a resumable bottle (capability-block rebuild +
transcript snapshot) the operator still wants.

Output shows the resumable / orphan / rebuilt-Dockerfile / transcript /
preserve-marker flags for each state dir so the operator sees what
they'd lose. Both sections are skippable independently — answering
"n" to containers doesn't skip the state prompt.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 06:56:04 -04:00
didericis 9dbd20398e feat(state): clean up per-bottle state on session end (except capability-block)
test / unit (pull_request) Successful in 19s
test / integration (pull_request) Successful in 1m35s
Previously every bottle launch left ~/.claude-bottle/state/<identity>/
behind forever — metadata.json on every run, plus per-bottle
Dockerfile + transcript snapshot on capability-block rebuilds. The
metadata accumulated debris across launches; the only state worth
keeping was the capability-block rebuild bundle.

Make cleanup the default; preserve only on capability-block.

- bottle_state.py: .preserve marker helpers (mark_preserved,
  is_preserved, clear_preserve_marker, preserve_marker_path) +
  cleanup_state(identity) that rm -rf's the per-bottle dir.
- capability_apply.apply_capability_change writes mark_preserved
  before teardown so cli.py's session-end cleanup keeps the dir.
- prepare.py clears any leftover marker at launch (start or resume),
  so a marker from a prior capability-block doesn't keep state
  alive past a subsequent normal session-end.
- cli/start.py runs the cleanup decision AFTER the launch context
  closes: if is_preserved → print resume hint; else cleanup_state.
  The resume hint moves out of the launch with-block (was previously
  printed unconditionally — would have misled the operator about
  whether state was actually kept).

Future-proof: cli.py never persists state speculatively. If the
agent wants to be resumable, it has to go through capability-block.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 06:51:13 -04:00
didericis 4032e04a9c feat(bottle): random-suffix identity + cli.py resume <identity>
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m30s
Replaces the cwd-hash identity with a random 5-char base36 suffix
per launch, so two simultaneous `start <agent>` invocations against
the same cwd no longer collide on container names. Each launch is
its own bottle.

State carries metadata: every prepare step writes
~/.claude-bottle/state/<identity>/metadata.json with the
(agent_name, cwd, copy_cwd, started_at) the bottle was launched
with. The new `cli.py resume <identity>` reads this metadata and
re-launches a bottle pinned to the same identity — picking up the
per-bottle Dockerfile (from a prior capability-block apply) and
the transcript snapshot under the same state dir.

- bottle_state.py: bottle_identity(agent_name) drops the cwd param
  and gains a random suffix; BottleMetadata dataclass +
  read/write/metadata_path helpers.
- BottleSpec gains an optional identity field — resume sets it to
  pin the identity; start leaves it empty so prepare mints fresh.
- prepare.py: writes metadata at launch time; uses spec.identity if
  provided (resume) else bottle_identity(agent_name) (fresh start).
- start.py: extracted _launch_bottle from cmd_start so resume can
  share the launch core; prints `./cli.py resume <identity>` hint
  at session end.
- cli/resume.py (new): reads metadata, reconstructs BottleSpec
  with the recorded identity + cwd, delegates to _launch_bottle.
  Errors clearly when no state exists for the given identity.
- cli/__init__.py: registers `resume` in COMMANDS + usage.
- dashboard.py: capability-block approval status line now appends
  the `resume <identity>` hint so the operator can copy-paste the
  rebuild command without leaving the TUI.

Closes the rebuild loop in PRD 0016: agent calls capability-block →
operator approves → bottle torn down with state preserved → status
line shows resume command → operator runs it → replacement bottle
boots with the new Dockerfile and prior transcript.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 06:09:45 -04:00
didericis d9c47d0fbe feat(dashboard): wire capability-block approval to real apply (PRD 0016)
Phase 3 of PRD 0016. dashboard.approve() now dispatches to
apply_capability_change when the proposal is a capability-block:

  cred-proxy-block → apply_routes_change
  pipelock-block   → apply_allowlist_change
  capability-block → apply_capability_change   (new in PRD 0016)

CapabilityApplyError joins the ApplyError tuple, so the TUI's key
handlers catch it the same way and surface failures in the status
line.

After a successful capability-block apply, dashboard archives the
proposal+response itself — the supervise sidecar was torn down by
apply_capability_change and can't archive its own queue file.
Without this, dashboard.discover_pending would keep surfacing the
resolved proposal forever.

No audit log for capability-block per PRD 0013 — its record lives
in the per-bottle Dockerfile state + transcript snapshot.

Tests stub apply_capability_change at the dashboard module level,
add TestCapabilityApplyWiring (call wiring, failure-keeps-pending,
no-audit invariant, archive-after-apply), and update TestApproveReject
to stub the capability path too so it stays docker-independent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:28:35 -04:00
didericis 1d58d62c47 feat(dashboard): pipelock edit TUI verb (PRD 0015)
Phase 3 of PRD 0015. Adds the proactive `pipelock edit` path,
mirroring routes edit from PRD 0014:

- discover_pipelock_slugs() lists running pipelock sidecars.
- operator_edit_allowlist(slug, new) wraps apply_allowlist_change
  and writes an audit entry tagged ACTION_OPERATOR_EDIT.
- New 'p' keybinding in the main TUI: discover slugs, prompt if
  multiple, fetch current allowlist, open in $EDITOR, apply on
  save.
- Extracts shared scaffolding into _operator_edit_flow used by
  both routes-edit and pipelock-edit — DRY without sacrificing
  the per-verb status-line copy.
- Footer updated.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:03:20 -04:00
didericis 5a6c4be342 feat(dashboard): wire pipelock-block approval to real apply (PRD 0015)
Phase 2 of PRD 0015. dashboard.approve() now dispatches on the
proposal's tool:

  cred-proxy-block → apply_routes_change   (from PRD 0014)
  pipelock-block   → apply_allowlist_change (new in PRD 0015)
  capability-block → no-op (lands in PRD 0016)

PipelockApplyError joins CredProxyApplyError under the ApplyError
tuple the TUI catches: failures keep the proposal pending and the
status line surfaces the message; no response is written and no
audit entry is appended.

Tests: existing TestApproveReject stubs both apply paths; new
TestPipelockApplyWiring covers the call wiring, failure-propagation,
and real-diff-in-audit invariants.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:01:18 -04:00
didericis 81277e9d81 feat(dashboard): routes edit TUI verb for operator-initiated changes (PRD 0014)
Phase 4 of PRD 0014. Adds the proactive routes-edit path that
doesn't require a pending proposal:

- discover_cred_proxy_slugs() lists running cred-proxy sidecars by
  parsing docker ps output. Returns [] when docker is unreachable
  or not installed (no exception escapes).
- operator_edit_routes(slug, new_content) wraps apply_routes_change
  and writes an audit entry tagged ACTION_OPERATOR_EDIT (so a
  future reader can distinguish operator-initiated changes from
  agent-proposal approvals in the log).
- New 'e' keybinding in the main TUI: discover slugs, prompt if
  multiple (or use the only one directly), fetch current routes,
  open in $EDITOR, apply on save. CredProxyApplyError lands in the
  status line; the operator can retry.

Tests cover audit-entry shape, failure path, and docker-missing
recovery for slug discovery.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:47:22 -04:00
didericis f3a1b4d667 feat(dashboard): wire cred-proxy-block approval to real apply (PRD 0014)
Phase 3 of PRD 0014. dashboard.approve() now does the real
remediation for cred-proxy-block proposals:

- Calls apply_routes_change(slug, file_to_apply) which fetches the
  current routes.json from the running sidecar, validates the new
  JSON, docker cp's it in, and SIGHUPs the sidecar.
- Audit entry's diff is now the real before→after from the apply
  return — not the empty-string placeholder 0013 wrote.
- On apply failure (CredProxyApplyError): no response file, no
  audit entry. Proposal stays pending so the operator can fix the
  input and retry. The TUI's key handlers catch the exception and
  surface the message in the status line.
- pipelock-block + capability-block remain no-op approvals; their
  remediation lands in PRDs 0015 + 0016 and the audit diff stays
  empty until then.
- reject path unchanged: no apply, audit entry with empty diff.

Tests stub apply_routes_change at the dashboard module level so the
unit suite doesn't need a running sidecar; integration test in
Phase 5 covers the real docker exec/cp/SIGHUP plumbing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:44:33 -04:00
didericis 0aecb41e33 feat(supervise): TUI dashboard for approve/modify/reject (PRD 0013)
Phase 4 of PRD 0013. Adds `claude-bottle dashboard` subcommand:

- discover_pending() walks ~/.claude-bottle/queue/* and gathers
  pending proposals across all bottles, sorted FIFO by arrival.
- approve / approve-with-final-file / reject helpers write the
  Response file the sidecar polls, and append an AuditEntry for
  cred-proxy and pipelock tools. capability-block proposals don't
  write to an audit log here (PRD 0016 captures via rebuild record).
- Stdlib-curses TUI: list view, detail view, $EDITOR shellout for
  modify-then-approve, inline prompt for reject reason.
- `dashboard --once` dumps pending proposals to stdout without
  bringing up curses — useful for scripted checks and tests.

For 0013 the audit entry's diff field is render_diff("", proposed)
because we don't yet have access to the live on-disk current file;
PRDs 0014 / 0015 fill in real before→after diffs once they own the
host-side config writes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:20:57 -04:00
didericis 32b62cbacc feat(cred_proxy)!: cred-proxy is the only Anthropic auth path
test / unit (pull_request) Successful in 13s
test / integration (pull_request) Successful in 23s
Removes the legacy `CLAUDE_BOTTLE_OAUTH_TOKEN` -> `CLAUDE_CODE_OAUTH_TOKEN`
forward in prepare.py. Bottles that need claude-code to authenticate
must declare a cred_proxy route with role: "anthropic-base-url" — there
is no fallback that hands the token to the agent directly.

Drops the now-dead BottleSpec.forward_oauth_token field, the CLI
setter that read CLAUDE_BOTTLE_OAUTH_TOKEN from the host env at
prepare time, and the forward_oauth_token=False arg in the six
pipelock integration tests.

PRD 0010 and README updated; the dev ~/claude-bottle.json gains an
anthropic-base-url route so the implementer/researcher agents keep
working.

BREAKING: bottles previously relying on the implicit OAuth forward
will now produce an agent environ without any Anthropic credential.
Verified with --dry-run: a bottle with no anthropic-base-url route
yields env_names: [] (no token at all); a bottle that declares the
route yields ANTHROPIC_BASE_URL plus a non-secret placeholder for
CLAUDE_CODE_OAUTH_TOKEN.
2026-05-24 12:56:09 -04:00
didericis 3d66ad2a86 feat(ssh-gate)!: remove ssh-gate sidecar and provisioner (PRD 0009)
Delete claude_bottle/ssh_gate.py, the DockerSSHGate sidecar,
and the provision_ssh provisioner (~/.ssh/config + ssh-agent
wiring). Unwire the gate from the abstract BottleBackend
(provision orchestration drops the ssh step,
_validate_ssh_entries goes away) and from the Docker backend
(prepare/launch lose the `gate` kwarg, bottle_plan drops the
gate_plan field, dry-run JSON drops the ssh_hosts / ssh_gate
keys, y/N preflight drops the ssh-hosts block). cli/info now
prints declared git remotes instead of ssh hosts. pipelock's
docstring picks up the git-gate framing now that there's no
PRD-0007 boundary to call out.

BREAKING (dry-run JSON): the `ssh_hosts` and `ssh_gate` keys
are gone from `start --dry-run --format=json`. Consumers should
read `git_remotes` / `git_gate` instead.
2026-05-12 23:49:58 -04:00
didericis 64a31a382b chore(types): add pyright strict config and fix resulting errors
test / unit (push) Successful in 11s
test / integration (push) Successful in 12s
Adds pyrightconfig.json (strict, Python 3.11) covering cli.py,
claude_bottle/, and tests/. Fixes the 49 strict-mode errors:

- Type DockerBottle.teardown as Callable[[], None].
- ResolvedEnv default_factory uses parameterized list[str] / dict[str, str].
- Erase BottleBackend generics at the registry boundary
  (BottleBackend[Any, Any]) since selection is runtime-driven and
  callers use the unparameterized interface.
- DockerBottleBackend.launch returns Generator[DockerBottle, None, None];
  @contextmanager now flags Iterator returns as deprecated.
- Sidestep cli.list submodule shadowing builtins.list in main()'s argv
  annotation via an aliased re-import in cli/__init__.py.
- Cast cfg[...] results in test_pipelock_yaml at the dict[str, object]
  boundary.
- Annotate write_fixture's fn parameter and _manifest_with_runtime's
  return type.
2026-05-12 10:03:48 -04:00
didericis beb0c9d58f feat(cli): add --format=json to start --dry-run for machine-readable plan
BottlePlan gains a to_dict method (abstract on the base, implemented
on DockerBottlePlan) returning a JSON-serializable view of the resolved
plan. `cli.py start --dry-run --format=json` prints it to stdout and
exits zero. --format=json without --dry-run is rejected — emitting JSON
during a real launch would race the y/N prompt.

The dry-run integration test now parses the JSON and asserts on
structured fields (agent, bottle, runtime, hosts sorted+deduped, etc.)
instead of regex-matching the human-readable preflight stdout. That
kills the magic-"8 hosts allowed" coupling — adding a new baked
default doesn't break the test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 16:23:24 -04:00
didericis 70a22fa210 refactor: rename platform abstraction to backend
test / run tests/run_tests.py (pull_request) Successful in 21s
Across the package:
  - claude_bottle/platform/         -> claude_bottle/backend/
  - platform/docker/platform.py     -> backend/docker/backend.py
  - class BottlePlatform            -> BottleBackend
  - class DockerBottlePlatform      -> DockerBottleBackend
  - get_bottle_platform()           -> get_bottle_backend()
  - env var CLAUDE_BOTTLE_PLATFORM  -> CLAUDE_BOTTLE_BACKEND
  - dict _PLATFORMS                 -> _BACKENDS

"Backend" is shorter and more established as the term for a
pluggable strategy-pattern implementation. "Platform" was vague
(could mean OS, hardware, cloud) and mildly redundant — Docker is
itself a platform.

The previous PRD section claiming "the Backend protocol was
rejected" referred to a low-level run/exec/cp/network_connect
protocol; the name was never the reason. The PRD is updated to
describe that rejected design by shape rather than by name.

The bottle/agent concepts and the manifest schema are unchanged.
2026-05-10 23:59:38 -04:00
didericis 1d2c18eaae refactor(platform): rename claude_bottle/bottles -> claude_bottle/platform
test / run tests/run_tests.py (pull_request) Successful in 13s
'bottles' was the package name when it held a single Bottle Protocol;
since we added BottlePlatform / BottlePlan / BottleCleanupPlan and
made it the home of platform dispatch, 'platform' describes the
package better. The 'bottle' concept (and the manifest field) stays.

CLI imports update from ..bottles to ..platform; internal relative
imports inside the package survive the rename unchanged. Git
detected all 7 file renames.
2026-05-10 23:37:28 -04:00
didericis 47b882f634 refactor(bottles): move 'list active' onto DockerBottlePlatform
test / run tests/run_tests.py (pull_request) Successful in 14s
Add list_active() abstract method on BottlePlatform; DockerBottlePlatform
implements it with the existing docker ps logic. cli/list.py no longer
calls docker directly — it just dispatches the active branch to the
platform.
2026-05-10 23:19:22 -04:00
didericis 18d29fc23f refactor(bottles): two-phase cleanup parallel to prepare/launch
test / run tests/run_tests.py (pull_request) Successful in 13s
cmd_cleanup used to only sweep running containers via `docker ps`,
missing stopped pipelock sidecars and orphaned networks entirely. On
my host the new version surfaced ~10 stranded networks left behind by
SIGKILLed sessions — the kind of thing the old command implied it was
handling.

New shape, symmetric with start:
- BottleCleanupPlan (abstract, in bottles/__init__.py) with `print` +
  `empty` abstract members.
- DockerBottleCleanupPlan (concrete, in bottles/docker.py) carrying
  the resolved tuples of containers and networks.
- BottlePlatform gains abstract prepare_cleanup() + cleanup(plan).
  DockerBottlePlatform implements both:
    - prepare_cleanup: docker ps -a + docker network ls, both
      filtered to ^claude-bottle-, sorted for stable output.
    - cleanup: docker rm -f containers first (they hold the network
      attachment), then docker network rm.
- cmd_cleanup is now ~25 lines: prepare → print → y/N → cleanup.
2026-05-10 23:14:54 -04:00
didericis 4a45c267f3 refactor(cli): remove redundant 'build' command
test / run tests/run_tests.py (pull_request) Successful in 13s
DockerBottlePlatform.launch already runs 'docker build' on every
start, and the layer cache makes no-change rebuilds nearly free.
Anything 'cli.py build' did, 'cli.py start' does for you.
2026-05-10 23:05:24 -04:00
didericis 2827d9b899 refactor(bottles): introduce BottlePlan base + move print onto plan
test / run tests/run_tests.py (pull_request) Successful in 19s
- Add BottlePlan (frozen dataclass + ABC) with spec, stage_dir, and an
  abstract `print(*, remote_control)` method.
- DockerBottlePlan now inherits from BottlePlan; spec/stage_dir come
  from the base, Docker-specific fields stay on the subclass.
- Move BottleSpec from bottles/docker.py to bottles/__init__.py so the
  cross-platform types live together. docker.py pulls them via
  `from . import ...`.
- Move show_plan from cli/start.py to `DockerBottlePlan.print`. Caller
  becomes `plan.print(remote_control=...)`. The CLI no longer reads
  any Docker-specific fields.
- BottlePlatform.prepare is now typed `Callable[..., BottlePlan]`.

cmd_start drops ~46 more lines.
2026-05-10 22:49:57 -04:00
didericis 236c4fa50c refactor(bottles): rename DockerBottleSpec to BottleSpec
test / run tests/run_tests.py (pull_request) Successful in 13s
The spec is intent-only and platform-agnostic — only the plan carries
Docker-specific fields. Drop the 'Docker' prefix and re-export from
claude_bottle.bottles so callers see it as cross-platform.
2026-05-10 22:40:19 -04:00
didericis 4f16b3a9e1 refactor(bottles): split factory into prepare + launch phases
test / run tests/run_tests.py (pull_request) Successful in 15s
The Docker factory had absorbed live container ops but left the
host-side prep (image-name resolution, container-name collision
retry, pipelock yaml generation, env_resolve writes, host
validation) in cli/start.py. That kept ~half the Docker-specific
logic outside the abstraction.

Split the factory into two phases:

  prepare_docker_bottle(spec, stage_dir=...) -> DockerBottlePlan
      Resolves names, validates skills/SSH, writes scratch files.
      No Docker resources created yet.

  launch_docker_bottle(plan) -> ContextManager[Bottle]
      Builds image, creates networks, boots pipelock, runs the
      agent container, provisions files. Teardown on exit.

DockerBottleSpec shrinks to intent-only inputs (manifest, agent
name, --cwd flag, user_cwd, forward_oauth_token). The CLI no longer
references docker_mod, pipelock, skills, ssh, or env_resolve.

get_bottle_factory becomes get_bottle_platform returning a
BottlePlatform with .prepare and .launch — one selectable thing per
platform.

The Bottle handle now remembers the in-container prompt path and
adds --append-system-prompt-file to claude's argv when present, so
the CLI no longer needs to know the path.

cmd_start: ~148 lines down from 229. Tests pass; dry-run output
byte-identical.
2026-05-10 22:36:26 -04:00
didericis a284d85296 refactor(start): show_plan now takes DockerBottleSpec
test / run tests/run_tests.py (pull_request) Successful in 15s
2026-05-10 22:23:40 -04:00
didericis 7500ba230c refactor(start): extract show_plan from cmd_start
test / run tests/run_tests.py (pull_request) Successful in 15s
2026-05-10 22:20:33 -04:00