Commit Graph

26 Commits

Author SHA1 Message Date
didericis 4032e04a9c feat(bottle): random-suffix identity + cli.py resume <identity>
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m30s
Replaces the cwd-hash identity with a random 5-char base36 suffix
per launch, so two simultaneous `start <agent>` invocations against
the same cwd no longer collide on container names. Each launch is
its own bottle.

State carries metadata: every prepare step writes
~/.claude-bottle/state/<identity>/metadata.json with the
(agent_name, cwd, copy_cwd, started_at) the bottle was launched
with. The new `cli.py resume <identity>` reads this metadata and
re-launches a bottle pinned to the same identity — picking up the
per-bottle Dockerfile (from a prior capability-block apply) and
the transcript snapshot under the same state dir.

- bottle_state.py: bottle_identity(agent_name) drops the cwd param
  and gains a random suffix; BottleMetadata dataclass +
  read/write/metadata_path helpers.
- BottleSpec gains an optional identity field — resume sets it to
  pin the identity; start leaves it empty so prepare mints fresh.
- prepare.py: writes metadata at launch time; uses spec.identity if
  provided (resume) else bottle_identity(agent_name) (fresh start).
- start.py: extracted _launch_bottle from cmd_start so resume can
  share the launch core; prints `./cli.py resume <identity>` hint
  at session end.
- cli/resume.py (new): reads metadata, reconstructs BottleSpec
  with the recorded identity + cwd, delegates to _launch_bottle.
  Errors clearly when no state exists for the given identity.
- cli/__init__.py: registers `resume` in COMMANDS + usage.
- dashboard.py: capability-block approval status line now appends
  the `resume <identity>` hint so the operator can copy-paste the
  rebuild command without leaving the TUI.

Closes the rebuild loop in PRD 0016: agent calls capability-block →
operator approves → bottle torn down with state preserved → status
line shows resume command → operator runs it → replacement bottle
boots with the new Dockerfile and prior transcript.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 06:09:45 -04:00
didericis d9c47d0fbe feat(dashboard): wire capability-block approval to real apply (PRD 0016)
Phase 3 of PRD 0016. dashboard.approve() now dispatches to
apply_capability_change when the proposal is a capability-block:

  cred-proxy-block → apply_routes_change
  pipelock-block   → apply_allowlist_change
  capability-block → apply_capability_change   (new in PRD 0016)

CapabilityApplyError joins the ApplyError tuple, so the TUI's key
handlers catch it the same way and surface failures in the status
line.

After a successful capability-block apply, dashboard archives the
proposal+response itself — the supervise sidecar was torn down by
apply_capability_change and can't archive its own queue file.
Without this, dashboard.discover_pending would keep surfacing the
resolved proposal forever.

No audit log for capability-block per PRD 0013 — its record lives
in the per-bottle Dockerfile state + transcript snapshot.

Tests stub apply_capability_change at the dashboard module level,
add TestCapabilityApplyWiring (call wiring, failure-keeps-pending,
no-audit invariant, archive-after-apply), and update TestApproveReject
to stub the capability path too so it stays docker-independent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:28:35 -04:00
didericis 1d58d62c47 feat(dashboard): pipelock edit TUI verb (PRD 0015)
Phase 3 of PRD 0015. Adds the proactive `pipelock edit` path,
mirroring routes edit from PRD 0014:

- discover_pipelock_slugs() lists running pipelock sidecars.
- operator_edit_allowlist(slug, new) wraps apply_allowlist_change
  and writes an audit entry tagged ACTION_OPERATOR_EDIT.
- New 'p' keybinding in the main TUI: discover slugs, prompt if
  multiple, fetch current allowlist, open in $EDITOR, apply on
  save.
- Extracts shared scaffolding into _operator_edit_flow used by
  both routes-edit and pipelock-edit — DRY without sacrificing
  the per-verb status-line copy.
- Footer updated.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:03:20 -04:00
didericis 5a6c4be342 feat(dashboard): wire pipelock-block approval to real apply (PRD 0015)
Phase 2 of PRD 0015. dashboard.approve() now dispatches on the
proposal's tool:

  cred-proxy-block → apply_routes_change   (from PRD 0014)
  pipelock-block   → apply_allowlist_change (new in PRD 0015)
  capability-block → no-op (lands in PRD 0016)

PipelockApplyError joins CredProxyApplyError under the ApplyError
tuple the TUI catches: failures keep the proposal pending and the
status line surfaces the message; no response is written and no
audit entry is appended.

Tests: existing TestApproveReject stubs both apply paths; new
TestPipelockApplyWiring covers the call wiring, failure-propagation,
and real-diff-in-audit invariants.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:01:18 -04:00
didericis 81277e9d81 feat(dashboard): routes edit TUI verb for operator-initiated changes (PRD 0014)
Phase 4 of PRD 0014. Adds the proactive routes-edit path that
doesn't require a pending proposal:

- discover_cred_proxy_slugs() lists running cred-proxy sidecars by
  parsing docker ps output. Returns [] when docker is unreachable
  or not installed (no exception escapes).
- operator_edit_routes(slug, new_content) wraps apply_routes_change
  and writes an audit entry tagged ACTION_OPERATOR_EDIT (so a
  future reader can distinguish operator-initiated changes from
  agent-proposal approvals in the log).
- New 'e' keybinding in the main TUI: discover slugs, prompt if
  multiple (or use the only one directly), fetch current routes,
  open in $EDITOR, apply on save. CredProxyApplyError lands in the
  status line; the operator can retry.

Tests cover audit-entry shape, failure path, and docker-missing
recovery for slug discovery.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:47:22 -04:00
didericis f3a1b4d667 feat(dashboard): wire cred-proxy-block approval to real apply (PRD 0014)
Phase 3 of PRD 0014. dashboard.approve() now does the real
remediation for cred-proxy-block proposals:

- Calls apply_routes_change(slug, file_to_apply) which fetches the
  current routes.json from the running sidecar, validates the new
  JSON, docker cp's it in, and SIGHUPs the sidecar.
- Audit entry's diff is now the real before→after from the apply
  return — not the empty-string placeholder 0013 wrote.
- On apply failure (CredProxyApplyError): no response file, no
  audit entry. Proposal stays pending so the operator can fix the
  input and retry. The TUI's key handlers catch the exception and
  surface the message in the status line.
- pipelock-block + capability-block remain no-op approvals; their
  remediation lands in PRDs 0015 + 0016 and the audit diff stays
  empty until then.
- reject path unchanged: no apply, audit entry with empty diff.

Tests stub apply_routes_change at the dashboard module level so the
unit suite doesn't need a running sidecar; integration test in
Phase 5 covers the real docker exec/cp/SIGHUP plumbing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:44:33 -04:00
didericis 0aecb41e33 feat(supervise): TUI dashboard for approve/modify/reject (PRD 0013)
Phase 4 of PRD 0013. Adds `claude-bottle dashboard` subcommand:

- discover_pending() walks ~/.claude-bottle/queue/* and gathers
  pending proposals across all bottles, sorted FIFO by arrival.
- approve / approve-with-final-file / reject helpers write the
  Response file the sidecar polls, and append an AuditEntry for
  cred-proxy and pipelock tools. capability-block proposals don't
  write to an audit log here (PRD 0016 captures via rebuild record).
- Stdlib-curses TUI: list view, detail view, $EDITOR shellout for
  modify-then-approve, inline prompt for reject reason.
- `dashboard --once` dumps pending proposals to stdout without
  bringing up curses — useful for scripted checks and tests.

For 0013 the audit entry's diff field is render_diff("", proposed)
because we don't yet have access to the live on-disk current file;
PRDs 0014 / 0015 fill in real before→after diffs once they own the
host-side config writes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:20:57 -04:00
didericis 32b62cbacc feat(cred_proxy)!: cred-proxy is the only Anthropic auth path
test / unit (pull_request) Successful in 13s
test / integration (pull_request) Successful in 23s
Removes the legacy `CLAUDE_BOTTLE_OAUTH_TOKEN` -> `CLAUDE_CODE_OAUTH_TOKEN`
forward in prepare.py. Bottles that need claude-code to authenticate
must declare a cred_proxy route with role: "anthropic-base-url" — there
is no fallback that hands the token to the agent directly.

Drops the now-dead BottleSpec.forward_oauth_token field, the CLI
setter that read CLAUDE_BOTTLE_OAUTH_TOKEN from the host env at
prepare time, and the forward_oauth_token=False arg in the six
pipelock integration tests.

PRD 0010 and README updated; the dev ~/claude-bottle.json gains an
anthropic-base-url route so the implementer/researcher agents keep
working.

BREAKING: bottles previously relying on the implicit OAuth forward
will now produce an agent environ without any Anthropic credential.
Verified with --dry-run: a bottle with no anthropic-base-url route
yields env_names: [] (no token at all); a bottle that declares the
route yields ANTHROPIC_BASE_URL plus a non-secret placeholder for
CLAUDE_CODE_OAUTH_TOKEN.
2026-05-24 12:56:09 -04:00
didericis 3d66ad2a86 feat(ssh-gate)!: remove ssh-gate sidecar and provisioner (PRD 0009)
Delete claude_bottle/ssh_gate.py, the DockerSSHGate sidecar,
and the provision_ssh provisioner (~/.ssh/config + ssh-agent
wiring). Unwire the gate from the abstract BottleBackend
(provision orchestration drops the ssh step,
_validate_ssh_entries goes away) and from the Docker backend
(prepare/launch lose the `gate` kwarg, bottle_plan drops the
gate_plan field, dry-run JSON drops the ssh_hosts / ssh_gate
keys, y/N preflight drops the ssh-hosts block). cli/info now
prints declared git remotes instead of ssh hosts. pipelock's
docstring picks up the git-gate framing now that there's no
PRD-0007 boundary to call out.

BREAKING (dry-run JSON): the `ssh_hosts` and `ssh_gate` keys
are gone from `start --dry-run --format=json`. Consumers should
read `git_remotes` / `git_gate` instead.
2026-05-12 23:49:58 -04:00
didericis 64a31a382b chore(types): add pyright strict config and fix resulting errors
test / unit (push) Successful in 11s
test / integration (push) Successful in 12s
Adds pyrightconfig.json (strict, Python 3.11) covering cli.py,
claude_bottle/, and tests/. Fixes the 49 strict-mode errors:

- Type DockerBottle.teardown as Callable[[], None].
- ResolvedEnv default_factory uses parameterized list[str] / dict[str, str].
- Erase BottleBackend generics at the registry boundary
  (BottleBackend[Any, Any]) since selection is runtime-driven and
  callers use the unparameterized interface.
- DockerBottleBackend.launch returns Generator[DockerBottle, None, None];
  @contextmanager now flags Iterator returns as deprecated.
- Sidestep cli.list submodule shadowing builtins.list in main()'s argv
  annotation via an aliased re-import in cli/__init__.py.
- Cast cfg[...] results in test_pipelock_yaml at the dict[str, object]
  boundary.
- Annotate write_fixture's fn parameter and _manifest_with_runtime's
  return type.
2026-05-12 10:03:48 -04:00
didericis beb0c9d58f feat(cli): add --format=json to start --dry-run for machine-readable plan
BottlePlan gains a to_dict method (abstract on the base, implemented
on DockerBottlePlan) returning a JSON-serializable view of the resolved
plan. `cli.py start --dry-run --format=json` prints it to stdout and
exits zero. --format=json without --dry-run is rejected — emitting JSON
during a real launch would race the y/N prompt.

The dry-run integration test now parses the JSON and asserts on
structured fields (agent, bottle, runtime, hosts sorted+deduped, etc.)
instead of regex-matching the human-readable preflight stdout. That
kills the magic-"8 hosts allowed" coupling — adding a new baked
default doesn't break the test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 16:23:24 -04:00
didericis 70a22fa210 refactor: rename platform abstraction to backend
test / run tests/run_tests.py (pull_request) Successful in 21s
Across the package:
  - claude_bottle/platform/         -> claude_bottle/backend/
  - platform/docker/platform.py     -> backend/docker/backend.py
  - class BottlePlatform            -> BottleBackend
  - class DockerBottlePlatform      -> DockerBottleBackend
  - get_bottle_platform()           -> get_bottle_backend()
  - env var CLAUDE_BOTTLE_PLATFORM  -> CLAUDE_BOTTLE_BACKEND
  - dict _PLATFORMS                 -> _BACKENDS

"Backend" is shorter and more established as the term for a
pluggable strategy-pattern implementation. "Platform" was vague
(could mean OS, hardware, cloud) and mildly redundant — Docker is
itself a platform.

The previous PRD section claiming "the Backend protocol was
rejected" referred to a low-level run/exec/cp/network_connect
protocol; the name was never the reason. The PRD is updated to
describe that rejected design by shape rather than by name.

The bottle/agent concepts and the manifest schema are unchanged.
2026-05-10 23:59:38 -04:00
didericis 1d2c18eaae refactor(platform): rename claude_bottle/bottles -> claude_bottle/platform
test / run tests/run_tests.py (pull_request) Successful in 13s
'bottles' was the package name when it held a single Bottle Protocol;
since we added BottlePlatform / BottlePlan / BottleCleanupPlan and
made it the home of platform dispatch, 'platform' describes the
package better. The 'bottle' concept (and the manifest field) stays.

CLI imports update from ..bottles to ..platform; internal relative
imports inside the package survive the rename unchanged. Git
detected all 7 file renames.
2026-05-10 23:37:28 -04:00
didericis 47b882f634 refactor(bottles): move 'list active' onto DockerBottlePlatform
test / run tests/run_tests.py (pull_request) Successful in 14s
Add list_active() abstract method on BottlePlatform; DockerBottlePlatform
implements it with the existing docker ps logic. cli/list.py no longer
calls docker directly — it just dispatches the active branch to the
platform.
2026-05-10 23:19:22 -04:00
didericis 18d29fc23f refactor(bottles): two-phase cleanup parallel to prepare/launch
test / run tests/run_tests.py (pull_request) Successful in 13s
cmd_cleanup used to only sweep running containers via `docker ps`,
missing stopped pipelock sidecars and orphaned networks entirely. On
my host the new version surfaced ~10 stranded networks left behind by
SIGKILLed sessions — the kind of thing the old command implied it was
handling.

New shape, symmetric with start:
- BottleCleanupPlan (abstract, in bottles/__init__.py) with `print` +
  `empty` abstract members.
- DockerBottleCleanupPlan (concrete, in bottles/docker.py) carrying
  the resolved tuples of containers and networks.
- BottlePlatform gains abstract prepare_cleanup() + cleanup(plan).
  DockerBottlePlatform implements both:
    - prepare_cleanup: docker ps -a + docker network ls, both
      filtered to ^claude-bottle-, sorted for stable output.
    - cleanup: docker rm -f containers first (they hold the network
      attachment), then docker network rm.
- cmd_cleanup is now ~25 lines: prepare → print → y/N → cleanup.
2026-05-10 23:14:54 -04:00
didericis 4a45c267f3 refactor(cli): remove redundant 'build' command
test / run tests/run_tests.py (pull_request) Successful in 13s
DockerBottlePlatform.launch already runs 'docker build' on every
start, and the layer cache makes no-change rebuilds nearly free.
Anything 'cli.py build' did, 'cli.py start' does for you.
2026-05-10 23:05:24 -04:00
didericis 2827d9b899 refactor(bottles): introduce BottlePlan base + move print onto plan
test / run tests/run_tests.py (pull_request) Successful in 19s
- Add BottlePlan (frozen dataclass + ABC) with spec, stage_dir, and an
  abstract `print(*, remote_control)` method.
- DockerBottlePlan now inherits from BottlePlan; spec/stage_dir come
  from the base, Docker-specific fields stay on the subclass.
- Move BottleSpec from bottles/docker.py to bottles/__init__.py so the
  cross-platform types live together. docker.py pulls them via
  `from . import ...`.
- Move show_plan from cli/start.py to `DockerBottlePlan.print`. Caller
  becomes `plan.print(remote_control=...)`. The CLI no longer reads
  any Docker-specific fields.
- BottlePlatform.prepare is now typed `Callable[..., BottlePlan]`.

cmd_start drops ~46 more lines.
2026-05-10 22:49:57 -04:00
didericis 236c4fa50c refactor(bottles): rename DockerBottleSpec to BottleSpec
test / run tests/run_tests.py (pull_request) Successful in 13s
The spec is intent-only and platform-agnostic — only the plan carries
Docker-specific fields. Drop the 'Docker' prefix and re-export from
claude_bottle.bottles so callers see it as cross-platform.
2026-05-10 22:40:19 -04:00
didericis 4f16b3a9e1 refactor(bottles): split factory into prepare + launch phases
test / run tests/run_tests.py (pull_request) Successful in 15s
The Docker factory had absorbed live container ops but left the
host-side prep (image-name resolution, container-name collision
retry, pipelock yaml generation, env_resolve writes, host
validation) in cli/start.py. That kept ~half the Docker-specific
logic outside the abstraction.

Split the factory into two phases:

  prepare_docker_bottle(spec, stage_dir=...) -> DockerBottlePlan
      Resolves names, validates skills/SSH, writes scratch files.
      No Docker resources created yet.

  launch_docker_bottle(plan) -> ContextManager[Bottle]
      Builds image, creates networks, boots pipelock, runs the
      agent container, provisions files. Teardown on exit.

DockerBottleSpec shrinks to intent-only inputs (manifest, agent
name, --cwd flag, user_cwd, forward_oauth_token). The CLI no longer
references docker_mod, pipelock, skills, ssh, or env_resolve.

get_bottle_factory becomes get_bottle_platform returning a
BottlePlatform with .prepare and .launch — one selectable thing per
platform.

The Bottle handle now remembers the in-container prompt path and
adds --append-system-prompt-file to claude's argv when present, so
the CLI no longer needs to know the path.

cmd_start: ~148 lines down from 229. Tests pass; dry-run output
byte-identical.
2026-05-10 22:36:26 -04:00
didericis a284d85296 refactor(start): show_plan now takes DockerBottleSpec
test / run tests/run_tests.py (pull_request) Successful in 15s
2026-05-10 22:23:40 -04:00
didericis 7500ba230c refactor(start): extract show_plan from cmd_start
test / run tests/run_tests.py (pull_request) Successful in 15s
2026-05-10 22:20:33 -04:00
didericis d75cc9325f feat(bottles): implement bottle factory abstraction per PRD 0003
test / run tests/run_tests.py (pull_request) Successful in 16s
Introduce claude_bottle/bottles/ with a Bottle Protocol and a
get_bottle_factory() that dispatches on CLAUDE_BOTTLE_PLATFORM
(default "docker"). Move every Docker-specific subprocess.run call
from cli/start.py, plus the orchestration of build, networks, the
pipelock sidecar, container launch, and per-container provisioning
(prompt, skills, ssh, .git), into create_docker_bottle.

Drop bottles[].runtime from the manifest schema. Auto-detect whether
gVisor is registered with the daemon and pass --runtime=runsc when it
is; the preflight shows the resolved runtime so the choice is visible.
Manifests still carrying 'runtime' get a clear error pointing at the
auto-detect behavior, rather than silent ignore.

Out of scope: cli/cleanup.py and cli/list.py still call docker
directly. They enumerate active bottles across the host, which is a
separate concern from "create a bottle" and is left for a follow-up
that introduces a list_active/cleanup primitive on the factory.
2026-05-10 22:15:05 -04:00
didericis 1f36d53f7b refactor(manifest): convert TypedDict to frozen dataclasses
test / run tests/run_tests.py (pull_request) Successful in 14s
Replace the TypedDict + 14 manifest_* free functions with frozen
dataclasses (SshEntry, BottleEgress, Bottle, Agent, Manifest) carrying
their own validators and constructors. Call sites import Manifest and
chain attribute access; the manifest_* helpers and manifest_validate
are gone.

Behavior changes worth flagging:
- Agent.bottle is now required (was optional with a "(none)" fallback).
  Manifest.from_json_obj dies if any agent lacks a 'bottle' field or
  references an undefined bottle, where previously start.py raised the
  error lazily for the specific agent being launched.
- ssh.py now takes SshEntry instances; Host/IdentityFile shape checks
  moved upstream into Manifest construction, leaving only the IdentityFile
  filesystem-existence check in ssh_validate_entries.
- pipelock_bottle_allowlist's per-element string check is dropped — the
  Manifest validator enforces it at load.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 21:20:15 -04:00
didericis e3f5a5907a feat(bottle): opt-in gVisor runtime per bottle
test / run tests/run_tests.py (push) Successful in 19s
Bottles can now set "runtime": "runsc" to launch the agent container
under gVisor instead of runc, adding a userspace syscall barrier
between the agent and the host kernel. Default is runc (Docker
default). Pipelock stays on the default runtime per the research doc's
minimum-diff prescription.

The launcher verifies runsc is registered with the daemon before
launch, surfaces the runtime in the preflight plan, and dies with an
install pointer (and a macOS-not-supported note) when runsc is
requested but unavailable.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 00:48:11 -04:00
didericis 4ebfcec2f7 fix(cli): make 'build --help' actually print help
test / run tests/run_tests.py (push) Successful in 15s
cmd_build was ignoring its argv, so 'cli.py build --help' fell through
and started a docker build instead of printing the subcommand's
argparse help. Wire up an empty parser so --help and unknown args are
handled the same way the other subcommands handle them.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 00:16:17 -04:00
didericis f817847dff refactor(cli): split claude_bottle/cli.py into a package
test / run tests/run_tests.py (push) Successful in 20s
One file per subcommand under claude_bottle/cli/, with shared constants
and the tty helper in _common.py and dispatch in __init__.py. The
public import (from claude_bottle.cli import main) is unchanged, so
the root cli.py entrypoint and the test suite see no surface change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 00:15:16 -04:00