bot-bottle

Author	SHA1	Message	Date
didericis	f807ed1149	fix(egress-proxy): force traffic through pipelock + block unallowlisted hosts Two issues stopping the bottle's egress allowlist from being enforced: 1. mitmproxy was bypassing pipelock. We set HTTPS_PROXY=pipelock in the egress-proxy container's env, but mitmproxy is a proxy server — it does NOT honor HTTP(S)_PROXY env vars on its outbound side the way HTTP-client libraries do. All post-MITM traffic was going direct to the upstream, never touching pipelock's hostname allowlist or DLP scanner. Fix: use mitmproxy's `--mode upstream:URL` flag. The Dockerfile entrypoint now reads a new `EGRESS_PROXY_UPSTREAM_PROXY` env (set by `DockerEgressProxy.start` to the pipelock URL when pipelock is in the topology) and switches mitmdump to upstream-proxy mode. Standalone runs of the image without the env still get `--mode regular@9099` direct-to-upstream — useful for unit-test boots. Confirmed in the boot log: "HTTP(S) proxy (upstream mode) listening at *:9099." 2. egress-proxy was forwarding unrecognized hosts. The addon's `decide()` returned `Decision(action="forward")` whenever no route matched the request host, deferring to pipelock to gate. With #1 broken pipelock wasn't gating either; even with #1 fixed, defense-in-depth wants both layers enforcing. Fix: no-route-match → 403 with a "host not in allowlist" reason. The egress allowlist is now strictly the set of hosts declared in `bottle.egress_proxy.routes`; bare-pass routes (host with no auth, no path_allowlist) cover the passthrough case for hosts that just need reach. path_allowlist enforcement on matched routes is unchanged. Test updated: `test_no_matching_route_forwards` → `test_no_matching_route_blocks`. 364 unit tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 16:38:18 -04:00
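The no-route-match behaviour described in this commit can be sketched as follows. This is a minimal illustration, not the addon's actual code: the `Route` shape, the `status` and `reason` fields on `Decision`, and the prefix-matching semantics of `path_allowlist` are assumptions; only `decide()`, `Decision(action="forward")`, the 403, and the "host not in allowlist" reason come from the message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    host: str
    # Assumption: an empty tuple means "any path" (a bare-pass route).
    path_allowlist: tuple[str, ...] = ()


@dataclass(frozen=True)
class Decision:
    action: str  # "forward" or "block"
    status: int = 0
    reason: str = ""


def decide(routes: list[Route], host: str, path: str) -> Decision:
    """Block unless the host matches a declared route (and its path allowlist)."""
    route = next((r for r in routes if r.host == host.lower()), None)
    if route is None:
        # Previously this case returned Decision(action="forward").
        return Decision("block", 403, "host not in allowlist")
    # Assumption: path_allowlist entries are treated as path prefixes.
    if route.path_allowlist and not any(path.startswith(p) for p in route.path_allowlist):
        return Decision("block", 403, "path not in allowlist")
    return Decision("forward")
```

With this shape, a host that only needs reach is declared as `Route("api.github.com")` and passes through untouched, while anything undeclared is refused before pipelock ever sees it.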
didericis	f04fbb68a9	feat(egress-proxy): drive claude-code OAuth placeholder off a role marker The chunk 2 detection keyed on `token_ref == "CLAUDE_CODE_OAUTH_TOKEN"`, which broke any bottle whose host env var has a different name (e.g. `CLAUDE_BOTTLE_OAUTH_TOKEN`). The token_ref is the user's choice — the placeholder-env trigger shouldn't be locked to one specific string. Restoring a minimal `role` marker on `EgressProxyRoute`: - `EGRESS_PROXY_ROLES = frozenset({"claude_code_oauth"})` — one marker for now; the field is back so we can grow it. - `EGRESS_PROXY_SINGLETON_ROLES` — claude_code_oauth is a singleton (only one route per bottle can carry it). - `Role: tuple[str, ...]` field on `EgressProxyRoute` (manifest + runtime), parsed as string or list-of-strings; unknown roles are rejected so typos can't become silent no-ops. `prepare.py:has_anthropic_auth` now checks for `"claude_code_oauth" in r.roles` instead of matching a literal token_ref string. Bottles can name their host OAuth env var anything; the role marker is what flips on `CLAUDE_CODE_OAUTH_TOKEN=<placeholder>` and the telemetry-off env vars on the agent. Test coverage: 7 new manifest tests (omitted / string / list / unknown role rejected / non-string rejected / list-item non-string rejected / singleton enforced). 364 tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 15:28:11 -04:00
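The role-parsing rules listed above (omitted / string / list, unknown roles rejected, singleton enforced) could look roughly like this. The two constants are named in the commit; `parse_roles` and `check_singletons` are hypothetical helper names, and the error wording is made up for illustration.

```python
EGRESS_PROXY_ROLES = frozenset({"claude_code_oauth"})
EGRESS_PROXY_SINGLETON_ROLES = frozenset({"claude_code_oauth"})


def parse_roles(raw: object) -> tuple[str, ...]:
    """Accept an omitted value, a single string, or a list of strings."""
    if raw is None:
        return ()
    items = [raw] if isinstance(raw, str) else raw
    if not isinstance(items, list):
        raise ValueError("role must be a string or a list of strings")
    for item in items:
        if not isinstance(item, str):
            raise ValueError(f"role entries must be strings, got {item!r}")
        if item not in EGRESS_PROXY_ROLES:
            # Rejecting unknown roles keeps a typo from becoming a silent no-op.
            raise ValueError(f"unknown role {item!r}")
    return tuple(items)


def check_singletons(roles_per_route: list[tuple[str, ...]]) -> None:
    """Raise if a singleton role is carried by more than one route in a bottle."""
    for role in EGRESS_PROXY_SINGLETON_ROLES:
        count = sum(role in roles for roles in roles_per_route)
        if count > 1:
            raise ValueError(f"role {role!r} may appear on only one route, found {count}")
```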
didericis	9cd583fbbb	feat(egress-proxy): retarget remediation at egress-proxy (PRD 0017 chunk 3) Finishes PRD 0017. The `cred-proxy-block` MCP tool is renamed and its remediation apply path is repointed at egress-proxy. - `claude_bottle/supervise.py` — `TOOL_CRED_PROXY_BLOCK` → `TOOL_EGRESS_PROXY_BLOCK`; `COMPONENT_FOR_TOOL` maps the new tool ID to `egress-proxy` for audit-log routing. - `claude_bottle/supervise_server.py` — tool definition renamed + description rewritten: "Call when egress-proxy refused your HTTPS request ... Read the current routes.yaml from /etc/claude-bottle/current-config/routes.yaml, compose a modified version, pass the full new file plus a justification." The syntactic validator dispatches on the new tool ID. - `claude_bottle/backend/docker/egress_proxy_apply.py` — renamed from `cred_proxy_apply.py`. Reads routes.yaml from /etc/egress-proxy/routes.yaml via `docker exec cat`; validates via `egress_proxy_addon_core.load_routes` (so both sides use the same parser); writes via `docker cp`; SIGHUPs egress-proxy with `docker kill --signal HUP`. `EgressProxyApplyError` replaces `CredProxyApplyError`. - `claude_bottle/cli/dashboard.py` — wires the new apply + `discover_egress_proxy_slugs` helper; the operator-initiated `routes edit <bottle>` verb now writes to egress-proxy with `.yaml` suffix. Stale follow-up comment about path-aware filtering removed — PRD 0017 settled that question. - `tests/integration/test_supervise_sidecar.py` — restores the approval round-trip test (chunk 2 had switched it to a reject path because no cred-proxy existed). Approval stubs `apply_routes_change` so the test focuses on the supervise queue/response plumbing rather than docker-exec into a real egress-proxy sidecar (that's covered separately). 
- `tests/unit/test_egress_proxy_apply.py` — rewritten against the new validator; covers JSON shape, missing routes key, partial-auth-pair rejection (the addon-core parser catches these before SIGHUP). - PRDs 0010 + 0014 — status headers updated to Superseded / Retargeted with a callout block pointing at PRD 0017's migration section. Historical text preserved. 384 unit + integration tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 15:13:44 -04:00
didericis	4abea282e0	revert(egress-proxy): drop Role + agent provisioner (keep git-push block) Partial revert of `fa06a3a`. The role + agent-side provisioner felt overengineered: anthropic-base-url + npm-registry's only realistic host values match the tool defaults, so the role tags drove no-op dotfile writes most of the time. If non-default npm registry / tea config is needed in a future bottle, we can ship it through a more direct mechanism then. What stays from `fa06a3a`: - Universal HTTPS git-push block in the egress-proxy addon (`is_git_push_request` in egress_proxy_addon_core, called from the request hook before route matching; 403s git-receive-pack regardless of route). This is the security backstop so git-gate remains the only outbound write path; PR #29 keeps it. What gets reverted: - `Role` field on EgressProxyRoute (manifest + runtime). - `EGRESS_PROXY_ROLES` + `EGRESS_PROXY_SINGLETON_ROLES` constants and singleton-role validation. - `backend/docker/provision/egress_proxy.py` (npmrc + tea config). - `provision_egress_proxy` slot in `BottleBackend.provision`. - `prepare.py`'s role-based ANTHROPIC_BASE_URL detection (back to the token_ref="CLAUDE_CODE_OAUTH_TOKEN" auto-detect). - Manifest + provisioner tests for the above. 355 unit + 24 integration tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 15:02:15 -04:00
didericis	fa06a3a0ab	feat(egress-proxy): block HTTPS git push + restore role provisioner Two related fixes on top of PR #29's chunk-2 cutover: 1. Universal HTTPS git-push block in the egress-proxy addon (`is_git_push_request` in egress_proxy_addon_core, called from the mitmproxy request hook before route matching). 403s any `/git-receive-pack` or `info/refs?service=git-receive-pack` — defense in depth so git-gate (PRD 0008) remains the only outbound path for writes, gitleaks-scanned by its pre-receive. Replicates cred-proxy's `is_git_push_request` behavior. 2. Restored agent-side role provisioner. Brings back `Role` on EgressProxyRoute (manifest + runtime) with three roles — `anthropic-base-url`, `npm-registry`, `tea-login`. Singleton constraint on the first two carries over from cred-proxy. `git-insteadof` is intentionally absent (option 1 above handles the push-bypass concern, and the canonical-URL rewrite has no function when egress-proxy is on HTTPS_PROXY). The provisioner (`backend/docker/provision/egress_proxy.py`): - `~/.npmrc` registry= the canonical upstream URL. - `~/.config/tea/config.yml` logins[] entry per tea-login route. - `ANTHROPIC_BASE_URL` env set in prepare.py based on the anthropic-base-url role (was a token_ref="CLAUDE_CODE_OAUTH_TOKEN" check in this PR's earlier draft — the role marker is cleaner and matches the cred-proxy precedent the user wants kept). All three dotfile values point at canonical upstream URLs; the agent's HTTPS_PROXY=egress-proxy routes them through the proxy automatically. Tests: 11 new role-validation tests, 11 new provisioner-render tests, the chunk-1 manifest fixture exercise role=anthropic-base-url. 400 tests pass (was 376). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 14:48:13 -04:00
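A push check of the kind described in item 1 might be sketched like this. The function name `is_git_push_request` is from the commit; its real signature is not shown, so taking a single path-plus-query string is an assumption. The two patterns are the ones the message names: the `/git-receive-pack` endpoint and the `info/refs?service=git-receive-pack` discovery request.

```python
from urllib.parse import parse_qs, urlsplit


def is_git_push_request(path_and_query: str) -> bool:
    """Return True for the two smart-HTTP requests a `git push` makes."""
    parts = urlsplit(path_and_query)
    path = parts.path.rstrip("/")
    if path.endswith("/git-receive-pack"):
        return True
    if path.endswith("/info/refs"):
        # Fetches use service=git-upload-pack on the same path and stay allowed.
        return "git-receive-pack" in parse_qs(parts.query).get("service", [])
    return False
```

Because the check runs before route matching, a route that otherwise allows the host still cannot be used to push.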
didericis	70f773ac61	feat(egress-proxy): cutover from cred-proxy (PRD 0017 chunk 2) Hard cutover. cred-proxy is deleted; egress-proxy is now the agent's HTTP_PROXY (when routes are declared) with pipelock on its outbound leg. Two per-bottle CAs are minted: egress-proxy's (agent trust store) and pipelock's (egress-proxy's outbound trust store). Manifest: - `bottle.cred_proxy` → hard error with a migration recipe. - `bottle.egress_proxy` is the new shape (PRD 0017 chunk 1). - CredProxy* types + role validators removed. Wiring: - launch.py: `egress_proxy_tls_init` mints the egress-proxy CA (cert+key concat for mitmproxy + cert-only for agent trust); `DockerEgressProxy.start` docker-cps both CAs in, sets `HTTPS_PROXY=pipelock` + `EGRESS_PROXY_UPSTREAM_CA` so mitmdump trusts pipelock's MITM. Agent's HTTP_PROXY points at egress-proxy when routes exist, else falls back to pipelock (no-routes bottles unchanged). - prepare.py / backend.py: `cred_proxy` arg → `egress_proxy`; sidecar-orphan probe + plan field + dashboard view all renamed. - provision_ca: selects the egress-proxy CA when present, else pipelock's (filename renamed to claude-bottle-mitm-ca.crt). - bottle.provision: cred-proxy dotfile rewrites (~/.npmrc, ~/.gitconfig insteadOf, tea config) are gone — HTTP_PROXY catches everything respecting it. Pipelock helpers: - `pipelock_token_hosts` → `pipelock_route_hosts` (now reading egress_proxy.routes). - cred-proxy hostname auto-allow → egress-proxy hostname auto-allow. - Anthropic seed-phrase workaround now triggers when an egress_proxy route targets api.anthropic.com (was based on the cred-proxy `anthropic-base-url` role). Dockerfile.egress-proxy: - Entrypoint conditionally passes `--set ssl_verify_upstream_trusted_ca=$EGRESS_PROXY_UPSTREAM_CA` (via the `${VAR:+...}` shell expansion) so standalone runs without a mounted pipelock CA still boot. 
- mkdirs `/home/mitmproxy/.mitmproxy` ahead of `docker cp`. Deleted: claude_bottle/{cred_proxy,cred_proxy_server}.py, backend/docker/{cred_proxy,provision/cred_proxy}.py, Dockerfile.cred-proxy, plus the corresponding unit + integration tests. backend/docker/cred_proxy_apply.py stays as a stub for chunk 3 to rewrite (its container-name + routes-path constants are inlined so it survives without the deleted module). Test changes: - test_pipelock_allowlist rewritten against egress-proxy routes + the new `pipelock_route_hosts`. - test_manifest_md_load + test_pipelock_yaml + test_yaml_subset fixtures migrated to the `egress_proxy: { routes: [...] }` shape. - test_supervise_sidecar's round-trip test switched from `dashboard.approve` to `dashboard.reject`: the approval-apply path on cred-proxy-block proposals hits a deleted sidecar in chunk 2's transitional state. Chunk 3 restores the approval test once the remediation flow is retargeted at egress-proxy. 376 tests pass (was 427; net delta is removed cred-proxy tests). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 14:30:39 -04:00
didericis	3df54573d4	feat(egress-proxy): add mitmproxy-based sidecar core (PRD 0017 chunk 1) Lands the new egress-proxy artifact alongside cred-proxy. Chunk 2 wires the agent's HTTP_PROXY to it and removes cred-proxy. - `Dockerfile.egress-proxy` — mitmproxy 11.1.3 base, COPY addon files flat to /app, mkdir routes dir at /etc/egress-proxy/. Digest pin deferred to chunk 2. - `egress_proxy_addon_core.py` — pure-logic parse + decide (host-importable; 21 unit tests). - `egress_proxy_addon.py` — mitmproxy hook wrapper, container-only (boot + SIGHUP reload, strip-Authorization + decide + 403/inject). - `egress_proxy.py` — host helpers: manifest lift, routes.yaml render (JSON content), token-env-map, Plan + abstract class. - `backend/docker/egress_proxy.py` — `DockerEgressProxy` start/stop mirroring `DockerCredProxy`; not yet called from launch.py. - `manifest.py` — new `EgressProxyRoute` + `EgressProxyConfig` types with the nested `auth: { scheme, token_ref }` block per PRD; `bottle.egress_proxy` added to the bottle key set alongside `cred_proxy` (chunk 2 hard-fails on the latter). All 427 unit tests pass. Image builds; `docker run` boots mitmdump and the addon loads routes from a mounted routes.yaml. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:58:24 -04:00
didericis	6066bb4d4c	fix(dashboard): show the literal new allowlist line in green, no prefix The "→ would allow host: api.github.com" framing added narration where none was needed. Just render the host on its own line in green — that's literally the text that gets appended to pipelock's allowlist on approve, and the green color carries "what's about to change". The URL (with path) is still right above for context. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 08:28:29 -04:00
didericis	97ff506783	feat(dashboard): highlight new hostname in green on pipelock-block detail When the operator opens a pipelock-block proposal in the detail view (Enter / 'v'), append a green-coloured line, "→ would allow host: api.github.com", so what's actually about to change is obvious at a glance. The full failed URL stays above the new line (the path is operator context — pipelock can't enforce it, just records intent). - _detail_lines now returns (text, attr) tuples; pipelock-block appends the host-extract line tagged with the green color pair. - _detail_view threaded the green_attr through from the main loop (matches the new-proposal highlight pattern from earlier in this PR). - Best-effort URL parsing; unparseable payloads skip the highlight line rather than render a misleading blank host. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 08:25:24 -04:00
didericis	f3f2e3e9ab	feat(pipelock-block): tool sends failed URL, supervisor merges host Reshape the pipelock-block MCP tool around what the agent actually knows at the moment of failure (the URL pipelock just refused), not what the operator needs (a full allowlist file). Before: agent had to read /etc/claude-bottle/current-config/allowlist, copy the whole file, append their host, send back. Lots of work, easy to get wrong, and the operator's diff was noisy because the proposal contained every host the agent saw — most of which weren't the change. After: agent calls `pipelock-block(failed_url="https://api.github.com/repos/foo/bar", justification="...")`; supervisor extracts api.github.com, fetches the running allowlist, adds the host if not already present, applies the merged content. Path is captured as operator context (the detail view labels it "failed URL" instead of "proposed file") but isn't enforced — pipelock's api_allowlist is hostname-only, so the path can't become an allow rule. - supervise_server: pipelock-block input schema gains `failed_url` (replaces `allowlist`); validate_proposed_file checks for http/https + hostname. - PROPOSED_FILE_FIELD updated; tool description rewritten. - dashboard._apply_pipelock_url: extract host, fetch current, merge, apply. - _proposed_payload_label: detail view renders "failed URL" for pipelock-block, "proposed file" otherwise. - Tests updated end-to-end; new url-host-merge + idempotent-merge + invalid-url cases added. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 08:02:53 -04:00
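The extract-host-then-merge step can be illustrated with a short sketch. Both function names below are hypothetical (the commit names `dashboard._apply_pipelock_url` but does not show its body), and the one-host-per-line allowlist format is an assumption.

```python
from urllib.parse import urlsplit


def host_from_failed_url(failed_url: str) -> str:
    """Validate the URL the way the tool schema describes: http/https plus a hostname."""
    parts = urlsplit(failed_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an http(s) URL with a hostname: {failed_url!r}")
    # The path is deliberately dropped: the allowlist is hostname-only.
    return parts.hostname


def merge_host(allowlist_text: str, host: str) -> str:
    """Append host to the allowlist unless it is already present (idempotent)."""
    lines = allowlist_text.splitlines()
    if host in (line.strip() for line in lines):
        return allowlist_text
    return "\n".join(lines + [host]) + "\n"
```

Running the merge twice with the same URL leaves the allowlist unchanged, so the operator's diff shows one new host line at most.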
didericis	a9bb34cb77	feat(dashboard): highlight newly-arrived proposals in green for 5s When a new proposal lands in the dashboard's list, the operator shouldn't have to compare the list to a mental snapshot to spot what's new. Render newly-arrived proposals in green for the first five seconds after they show up. - _try_init_green: initialise a green color pair; returns 0 if the terminal lacks color so the highlight degrades to no-op. - _main_loop tracks first_seen[proposal_id] across refresh ticks, pruning entries when a proposal leaves the queue. - _render ORs green into the existing attr (composes with selection reverse-video — terminal handles the mix). Applies to all tool types (cred-proxy-block, pipelock-block, capability-block). If a tool-specific highlight is wanted later, filter on qp.proposal.tool in _is_recent. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 07:54:34 -04:00
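The `first_seen` bookkeeping described above can be sketched without any curses code. The function names and the explicit `now` parameter are assumptions made so the logic is testable; the five-second window and the prune-on-leave rule are from the commit.

```python
HIGHLIGHT_SECONDS = 5.0


def update_first_seen(first_seen: dict[str, float], current_ids: set[str], now: float) -> None:
    """Record arrival times for new proposals and prune ones that left the queue."""
    for proposal_id in current_ids:
        first_seen.setdefault(proposal_id, now)
    for proposal_id in list(first_seen):
        if proposal_id not in current_ids:
            del first_seen[proposal_id]


def is_recent(first_seen: dict[str, float], proposal_id: str, now: float) -> bool:
    """True while the proposal is inside its highlight window."""
    seen = first_seen.get(proposal_id)
    return seen is not None and now - seen < HIGHLIGHT_SECONDS
```

A render loop would call `update_first_seen` once per refresh tick and OR the green attribute into a row only when `is_recent` returns True.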
didericis	307400f08a	fix(supervise): bypass pipelock for agent → supervise MCP traffic `/mcp` showed the supervise server as ✔ connected (initialize is fast), but any actual tool call failed because the supervise MCP design is long-poll — the sidecar holds the HTTP request open until the operator approves in the dashboard (potentially minutes) and only then returns the response. Pipelock is a forward proxy with idle timeouts; it cut the long-polled HTTPS-style request well before the operator could act, and claude-code reported the tool as ✘ failed. Fix: add `supervise` to the agent's NO_PROXY when bottle.supervise is true. The supervise sidecar is on the bottle's internal network with the `supervise` network-alias, so the agent can dial it directly via docker DNS — no proxy, no idle timeout. Body-scanning supervise traffic isn't critical because the operator reviews every proposal in the TUI before approving. The earlier pipelock allowlist auto-add for `supervise` stays as belt-and-braces (handles any proxy-respecting client other than claude-code that might dial supervise). Existing bottles need a restart to pick up the new NO_PROXY value (env can't be changed on a running container). The dashboard's pipelock-edit workaround from PR #25 unblocks short-running tool calls in the meantime but won't survive the pipelock idle timeout on a long-polled call. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 07:36:27 -04:00
didericis	d2e047fa66	fix(pipelock): auto-allow `supervise` hostname like `cred-proxy` When PR #19 added the supervise sidecar (PRD 0013), I forgot to mirror the cred-proxy auto-allow in pipelock_effective_allowlist. The agent's HTTP_PROXY points at pipelock, so a request for http://supervise:9100/ (the MCP endpoint claude-code dials) arrives at pipelock as hostname `supervise` — and pipelock 403s it because the host isn't in api_allowlist. End-user symptom: even after `claude mcp add` registers the supervise server, `/mcp` shows it as ✘ failed and the supervise sidecar's docker logs are silent (request never gets through). Mirror what cred-proxy already does: when bottle.supervise is True, add SUPERVISE_HOSTNAME to the rendered pipelock allowlist. New tests cover both the auto-add and the no-add-when-disabled invariants. Existing bottles: the dashboard `pipelock edit <bottle>` verb (or backend.docker.pipelock_apply.apply_allowlist_change) can apply this fix to a running bottle without a relaunch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 07:27:30 -04:00
didericis	0e2fc97aa8	fix(supervise): provision MCP via `claude mcp add`, not raw settings.json The previous provisioner wrote ~/.claude/settings.json with an mcpServers entry — but claude-code doesn't read its mcpServers from that path. Inside a bottle, /mcp showed "No MCP servers configured" even though the sidecar was running. Switch to the official `claude mcp add` command run via docker exec: `docker exec -u node <agent> claude mcp add --scope user --transport http supervise <url>`. claude-code owns its config file format (~/.claude.json shape, key names, scope semantics) and has changed it between versions. The official command writes to the right place in the right shape for whatever version is installed. Failure is logged but not fatal — the bottle still works; you just have to register the server manually with the command surfaced in the warning. Worst case is a bad agent claude-code version, not a bad bottle. To fix an already-running bottle without restarting, the user can run the same `docker exec` command directly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 07:19:51 -04:00
didericis	ef5d2f9a4d	feat(state): preserve on crash + always snapshot transcript Extends the preserve-on-capability-block design to also preserve state on agent crash, and snapshots the transcript on every teardown so any resume (crash or capability-block) gets a warm claude session — not a cold start. - capability_apply: rename _snapshot_transcript → snapshot_transcript (public; reused below). No behavior change in the capability path. - cli/start.py: capture bottle.exec_claude's exit code; while the container is still alive (inside the launch context): always snapshot_transcript(identity); if exit_code != 0, mark_preserved(identity). Then the existing _settle_state runs after teardown. Now the preservation matrix is: exit 0 (clean) → snapshot + cleanup state; exit ≠0 (crash, Ctrl-C) → snapshot + preserve + show resume hint; capability-block → (already snapshotted/preserved by apply before teardown; this path is a no-op because the container is already gone by the time exec_claude returns). snapshot_transcript is best-effort — capability-block's earlier snapshot is not clobbered when the container is already torn down, and a missing /home/node/.claude is a warn + skip. Tested behavior: clean exit doesn't preserve, non-zero exit (including SIGINT/130 and SIGKILL/137) preserves; empty identity no-ops both helpers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 07:05:23 -04:00
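The preservation matrix in this commit reduces to a small decision on the exit code. The sketch below only names the steps in order; `session_end_actions` is a hypothetical helper, and the returned strings stand in for the real calls (`snapshot_transcript`, `mark_preserved`, `cleanup_state`).

```python
def session_end_actions(exit_code: int) -> list[str]:
    """List the session-end steps for a given exit code of the agent process."""
    # The transcript is snapshotted on every teardown, clean or not.
    actions = ["snapshot_transcript"]
    if exit_code != 0:
        # Crash, Ctrl-C (130), SIGKILL (137): keep state and tell the operator.
        actions += ["mark_preserved", "show_resume_hint"]
    else:
        actions.append("cleanup_state")
    return actions
```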
didericis	fb2b5844c4	feat(cleanup): prompt to remove per-bottle state, separately from containers `cli.py cleanup` already enumerated orphan containers + networks and asked for confirmation before nuking them. Per-bottle state under ~/.claude-bottle/state/ wasn't touched — accumulated forever, including orphans from old code paths. Add state to the cleanup flow with its own prompt: the trade-off is different from containers (which are pure debris) because a state dir may carry a resumable bottle (capability-block rebuild + transcript snapshot) the operator still wants. Output shows the resumable / orphan / rebuilt-Dockerfile / transcript / preserve-marker flags for each state dir so the operator sees what they'd lose. Both sections are skippable independently — answering "n" to containers doesn't skip the state prompt. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 06:56:04 -04:00
didericis	9dbd20398e	feat(state): clean up per-bottle state on session end (except capability-block) Previously every bottle launch left ~/.claude-bottle/state/<identity>/ behind forever — metadata.json on every run, plus per-bottle Dockerfile + transcript snapshot on capability-block rebuilds. The metadata accumulated debris across launches; the only state worth keeping was the capability-block rebuild bundle. Make cleanup the default; preserve only on capability-block. - bottle_state.py: .preserve marker helpers (mark_preserved, is_preserved, clear_preserve_marker, preserve_marker_path) + cleanup_state(identity) that rm -rf's the per-bottle dir. - capability_apply.apply_capability_change writes mark_preserved before teardown so cli.py's session-end cleanup keeps the dir. - prepare.py clears any leftover marker at launch (start or resume), so a marker from a prior capability-block doesn't keep state alive past a subsequent normal session-end. - cli/start.py runs the cleanup decision AFTER the launch context closes: if is_preserved → print resume hint; else cleanup_state. The resume hint moves out of the launch with-block (was previously printed unconditionally — would have misled the operator about whether state was actually kept). Future-proof: cli.py never persists state speculatively. If the agent wants to be resumable, it has to go through capability-block. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 06:51:13 -04:00
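A minimal sketch of the `.preserve` marker helpers and the session-end decision, under stated assumptions: the helper names and the `~/.claude-bottle/state/<identity>/` layout are from the commit, while the `root` parameter, the `settle_state` wrapper, and the empty-identity guards are illustrative additions.

```python
import shutil
from pathlib import Path

STATE_ROOT = Path.home() / ".claude-bottle" / "state"


def preserve_marker_path(identity: str, root: Path = STATE_ROOT) -> Path:
    return root / identity / ".preserve"


def mark_preserved(identity: str, root: Path = STATE_ROOT) -> None:
    if not identity:  # guard: an empty identity would resolve to the state root
        return
    marker = preserve_marker_path(identity, root)
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()


def is_preserved(identity: str, root: Path = STATE_ROOT) -> bool:
    return bool(identity) and preserve_marker_path(identity, root).exists()


def clear_preserve_marker(identity: str, root: Path = STATE_ROOT) -> None:
    if identity:
        preserve_marker_path(identity, root).unlink(missing_ok=True)


def cleanup_state(identity: str, root: Path = STATE_ROOT) -> None:
    if identity:  # never rm -rf the whole state root
        shutil.rmtree(root / identity, ignore_errors=True)


def settle_state(identity: str, root: Path = STATE_ROOT) -> str:
    """Return a resume hint if state was preserved, else clean up and return ''."""
    if is_preserved(identity, root):
        return f"./cli.py resume {identity}"
    cleanup_state(identity, root)
    return ""
```

Making the decision in one place after teardown is what keeps the resume hint honest: it is only returned when the marker, and therefore the state dir, still exists.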
didericis	6e46ca4478	feat(supervise): provision agent-side MCP config so Claude sees the sidecar The supervise sidecar (PRD 0013) has been serving MCP at http://supervise:9100/ since it landed, but the in-bottle Claude Code had no `.mcp.json` or settings pointing there — so the agent couldn't actually call cred-proxy-block / pipelock-block / capability-block as tools. To exercise the flow you had to curl the sidecar from a sibling container. This closes that last mile. - claude_bottle/backend/docker/provision/supervise.py (new): provision_supervise(plan, target) writes ~/.claude/settings.json into the running agent container with an mcpServers.supervise entry of type http pointing at the per-bottle sidecar. No-op when bottle.supervise is False. - BottleBackend.provision orchestrator gains provision_supervise as the last step (after CA, prompt, skills, git, cred-proxy). Default impl is a no-op so non-Docker backends aren't forced to implement it. - DockerBottleBackend wires it through to the new module. - Test covers the rendered settings shape so a future regression in the MCP entry format would surface in unit-level CI. To test the full flow end-to-end now: `./cli.py start <agent> --cwd` (agent's claude sees supervise; agent calls cred-proxy-block via MCP), then `./cli.py dashboard` (approve), then `./cli.py resume <identity>` (restart with new capabilities). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 06:22:25 -04:00
didericis	4032e04a9c	feat(bottle): random-suffix identity + cli.py resume <identity> Replaces the cwd-hash identity with a random 5-char base36 suffix per launch, so two simultaneous `start <agent>` invocations against the same cwd no longer collide on container names. Each launch is its own bottle. State carries metadata: every prepare step writes ~/.claude-bottle/state/<identity>/metadata.json with the (agent_name, cwd, copy_cwd, started_at) the bottle was launched with. The new `cli.py resume <identity>` reads this metadata and re-launches a bottle pinned to the same identity — picking up the per-bottle Dockerfile (from a prior capability-block apply) and the transcript snapshot under the same state dir. - bottle_state.py: bottle_identity(agent_name) drops the cwd param and gains a random suffix; BottleMetadata dataclass + read/write/metadata_path helpers. - BottleSpec gains an optional identity field — resume sets it to pin the identity; start leaves it empty so prepare mints fresh. - prepare.py: writes metadata at launch time; uses spec.identity if provided (resume) else bottle_identity(agent_name) (fresh start). - start.py: extracted _launch_bottle from cmd_start so resume can share the launch core; prints `./cli.py resume <identity>` hint at session end. - cli/resume.py (new): reads metadata, reconstructs BottleSpec with the recorded identity + cwd, delegates to _launch_bottle. Errors clearly when no state exists for the given identity. - cli/__init__.py: registers `resume` in COMMANDS + usage. - dashboard.py: capability-block approval status line now appends the `resume <identity>` hint so the operator can copy-paste the rebuild command without leaving the TUI. 
Closes the rebuild loop in PRD 0016: agent calls capability-block → operator approves → bottle torn down with state preserved → status line shows resume command → operator runs it → replacement bottle boots with the new Dockerfile and prior transcript. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 06:09:45 -04:00
didericis	e996f72532	fix(bottle): identity-key all per-bottle resources by (agent, cwd) The single point that computed `slug = slugify(agent_name)` in prepare.py is now `slug = bottle_identity(agent_name, cwd)`. With --cwd the identity has a sha256(resolved-cwd)[:12] suffix, so the same agent against different projects gets distinct container names, network names, queue dir, audit log paths, and per-bottle state (Dockerfile + transcript). Without --cwd the identity is just slugify(agent_name), unchanged from before — no-cwd bottles look the same as today. The downstream `slug` field on DockerBottlePlan keeps its name — every module already threads it under "slug" and the value flowing through is now the bottle's full identity. A comment in prepare.py flags the change. Fixes the bug surfaced in PR #22 review: running the same agent against project-A's cwd then project-B's would silently share project-A's per-bottle Dockerfile + transcript snapshot, container name (forcing serialized runs), and queue/audit history. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:46:26 -04:00
didericis	ac8f14ae6f	test(capability): integration test for apply_capability_change (PRD 0016) Phase 4 of PRD 0016. End-to-end test against real Docker: - Stages a fake bottle: alpine:latest container named claude-bottle-<slug> with a marker file at /home/node/.claude/sessions.json, plus a fake supervise sidecar. - Calls apply_capability_change with a new Dockerfile. - Verifies: per-bottle Dockerfile written, agent + sidecars removed, networks removed, transcript snapshot dir on host contains the marker file (proving docker cp transferred bytes). - Subsequent-apply test proves the per-bottle Dockerfile state persists across rebuilds (before-diff uses the prior override, not the repo Dockerfile). - Teardown-idempotent test: apply against a never-started bottle doesn't raise. docker exec / cp / rm / network rm work fine across the docker socket boundary, so this runs in DinD too — no act_runner skip needed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:30:04 -04:00
didericis	d9c47d0fbe	feat(dashboard): wire capability-block approval to real apply (PRD 0016) Phase 3 of PRD 0016. dashboard.approve() now dispatches to apply_capability_change when the proposal is a capability-block: cred-proxy-block → apply_routes_change pipelock-block → apply_allowlist_change capability-block → apply_capability_change (new in PRD 0016) CapabilityApplyError joins the ApplyError tuple, so the TUI's key handlers catch it the same way and surface failures in the status line. After a successful capability-block apply, dashboard archives the proposal+response itself — the supervise sidecar was torn down by apply_capability_change and can't archive its own queue file. Without this, dashboard.discover_pending would keep surfacing the resolved proposal forever. No audit log for capability-block per PRD 0013 — its record lives in the per-bottle Dockerfile state + transcript snapshot. Tests stub apply_capability_change at the dashboard module level, add TestCapabilityApplyWiring (call wiring, failure-keeps-pending, no-audit invariant, archive-after-apply), and update TestApproveReject to stub the capability path too so it stays docker-independent. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:28:35 -04:00
didericis	0899a898e0	feat(capability): host-side apply_capability_change orchestrator (PRD 0016) Phase 2 of PRD 0016. New module claude_bottle/backend/docker/capability_apply.py: - apply_capability_change(slug, new_dockerfile): snapshot transcript → push working tree → write per-bottle Dockerfile → teardown. Returns (before, after) for the dashboard's audit/diff render. - fetch_current_dockerfile(slug): per-bottle Dockerfile if set, else the repo's Dockerfile. - Internal helpers _snapshot_transcript, _push_working_tree are best-effort (log + return on failure); _teardown_bottle is idempotent (force-rm + network rm silently ignore missing names). Fire-and-forget from the agent's perspective: by the time the dashboard writes the response file the supervise sidecar is already gone (it was torn down), so the agent's tool call connection drops without receiving the response. The replacement agent (next manual `cli.py start <agent>`) sees the new per-bottle Dockerfile and the transcript snapshot for resume. v1 does not auto-relaunch. Tests cover sequencing (snapshot → push → teardown order), the per-bottle vs repo Dockerfile fallback chain, empty-input rejection, and the per-bottle-Dockerfile write. The docker exec / cp / rm plumbing is covered by the Phase 4 integration test. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:26:38 -04:00
didericis	02811e0417	feat(bottle): per-bottle Dockerfile state + image build hook (PRD 0016) Phase 1 of PRD 0016. Lays the per-bottle state plumbing that capability-block remediation will write into: - claude_bottle/backend/docker/bottle_state.py: bottle_state_dir, per_bottle_dockerfile (read), write_per_bottle_dockerfile, per_bottle_image_tag (unique per slug), transcript_snapshot_dir. Stores under ~/.claude-bottle/state/<slug>/. - prepare.py: when a per-bottle Dockerfile exists, use per_bottle_image_tag(slug) as the base image and pass the per-bottle Dockerfile path through DockerBottlePlan.dockerfile_path. --cwd still layers a derived image on top. - launch.py: passes plan.dockerfile_path to build_image so the per-bottle Dockerfile is what docker build reads. - DockerBottlePlan gains dockerfile_path field; print() surfaces it in the preflight summary so the operator can see at-a-glance that this bottle is running on a rebuilt image. Phase 2 will write to write_per_bottle_dockerfile (capability-block approval); Phase 3 wires it into the dashboard. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:23:31 -04:00
didericis	4fada1651b	test(pipelock): integration test for apply_allowlist_change (PRD 0015) test / unit (pull_request) Successful in 16s Details test / integration (pull_request) Successful in 1m8s Details Phase 4 of PRD 0015. End-to-end test against real Docker: - Brings up a real pipelock sidecar via the production DockerPipelockProxy bring-up + pipelock_tls_init. - Calls apply_allowlist_change to add a new host. - Polls the live /etc/pipelock.yaml until the new host shows up (bridging the docker-restart window). - Verifies api_allowlist contains both old + new hosts and tls_interception block is preserved. - Smaller cases: invalid hostname raises, missing sidecar raises, fetch_current_allowlist returns one-per-line format. Skipped under GITEA_ACTIONS because pipelock_tls_init bind-mounts a host path that doesn't share fs in the runner, matching the existing pipelock smoke test's skip pattern. Drive-by fix: fetch_current_yaml now uses `docker cp` (daemon-API tarball copy) instead of `docker exec cat` because the pipelock image is distroless and has no shell utilities. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:07:26 -04:00
didericis	1d58d62c47	feat(dashboard): pipelock edit TUI verb (PRD 0015) Phase 3 of PRD 0015. Adds the proactive `pipelock edit` path, mirroring routes edit from PRD 0014: - discover_pipelock_slugs() lists running pipelock sidecars. - operator_edit_allowlist(slug, new) wraps apply_allowlist_change and writes an audit entry tagged ACTION_OPERATOR_EDIT. - New 'p' keybinding in the main TUI: discover slugs, prompt if multiple, fetch current allowlist, open in $EDITOR, apply on save. - Extracts shared scaffolding into _operator_edit_flow used by both routes-edit and pipelock-edit — DRY without sacrificing the per-verb status-line copy. - Footer updated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:03:20 -04:00
didericis	5a6c4be342	feat(dashboard): wire pipelock-block approval to real apply (PRD 0015) Phase 2 of PRD 0015. dashboard.approve() now dispatches on the proposal's tool: cred-proxy-block → apply_routes_change (from PRD 0014) pipelock-block → apply_allowlist_change (new in PRD 0015) capability-block → no-op (lands in PRD 0016) PipelockApplyError joins CredProxyApplyError under the ApplyError tuple the TUI catches: failures keep the proposal pending and the status line surfaces the message; no response is written and no audit entry is appended. Tests: existing TestApproveReject stubs both apply paths; new TestPipelockApplyWiring covers the call wiring, failure-propagation, and real-diff-in-audit invariants. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 05:01:18 -04:00
didericis	c05457fbef	feat(pipelock): host-side apply_allowlist_change helper (PRD 0015) Phase 1 of PRD 0015. New module claude_bottle/backend/docker/pipelock_apply.py: - fetch_current_yaml(slug): docker exec cat of the live /etc/pipelock.yaml. - fetch_current_allowlist(slug): parses the yaml, extracts api_allowlist, renders as one-per-line for the operator/agent. - parse_allowlist_content / render_allowlist_content: one-per-line with `#` comments + blank-line tolerance, conservative hostname validation. - apply_allowlist_change(slug, new): parses new hosts, fetches + parses current yaml, swaps api_allowlist, re-renders via pipelock_render_yaml, docker cp into sidecar, docker restart. Returns (before, after) as one-per-line strings for the audit diff. - PipelockApplyError: caller surfaces to operator without crashing the dashboard. v1 uses restart, not SIGHUP — pipelock has no in-process reload hook; adding one is the PRD's open question. Restart drops in-flight outbound calls and the agent retries pick up the restarted proxy. Yaml roundtrip is covered by tests: parse(render(cfg)) preserves all fields pipelock_render_yaml emits, including tls_interception + passthrough_domains. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:59:13 -04:00
didericis	70f43d8c4f	test(cred-proxy): integration test for SIGHUP + apply round-trip (PRD 0014) test / unit (pull_request) Successful in 17s Details test / integration (pull_request) Successful in 1m12s Details Phase 5 of PRD 0014. End-to-end test against real Docker: - Brings up a cred-proxy sidecar with route /a/ → unreachable upstream (so 502 = route matched, 404 = no route). - Calls apply_routes_change to swap to /b/ only. - Polls until the route table flips: /a/ now 404s, /b/ now 502s. - Separately verifies fetch_current_routes returns the live file, apply with invalid JSON raises, and apply against a non-existent sidecar raises. No fake-upstream container needed: unreachable hostnames give the 502 signal directly. apply_routes_change uses docker exec / cp / kill (not bind mounts), so this should work in docker-in-docker too — no DinD skip needed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:50:29 -04:00
didericis	81277e9d81	feat(dashboard): routes edit TUI verb for operator-initiated changes (PRD 0014) Phase 4 of PRD 0014. Adds the proactive routes-edit path that doesn't require a pending proposal: - discover_cred_proxy_slugs() lists running cred-proxy sidecars by parsing docker ps output. Returns [] when docker is unreachable or not installed (no exception escapes). - operator_edit_routes(slug, new_content) wraps apply_routes_change and writes an audit entry tagged ACTION_OPERATOR_EDIT (so a future reader can distinguish operator-initiated changes from agent-proposal approvals in the log). - New 'e' keybinding in the main TUI: discover slugs, prompt if multiple (or use the only one directly), fetch current routes, open in $EDITOR, apply on save. CredProxyApplyError lands in the status line; the operator can retry. Tests cover audit-entry shape, failure path, and docker-missing recovery for slug discovery. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:47:22 -04:00
didericis	f3a1b4d667	feat(dashboard): wire cred-proxy-block approval to real apply (PRD 0014) Phase 3 of PRD 0014. dashboard.approve() now does the real remediation for cred-proxy-block proposals: - Calls apply_routes_change(slug, file_to_apply) which fetches the current routes.json from the running sidecar, validates the new JSON, docker cp's it in, and SIGHUPs the sidecar. - Audit entry's diff is now the real before→after from the apply return — not the empty-string placeholder 0013 wrote. - On apply failure (CredProxyApplyError): no response file, no audit entry. Proposal stays pending so the operator can fix the input and retry. The TUI's key handlers catch the exception and surface the message in the status line. - pipelock-block + capability-block remain no-op approvals; their remediation lands in PRDs 0015 + 0016 and the audit diff stays empty until then. - reject path unchanged: no apply, audit entry with empty diff. Tests stub apply_routes_change at the dashboard module level so the unit suite doesn't need a running sidecar; integration test in Phase 5 covers the real docker exec/cp/SIGHUP plumbing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:44:33 -04:00
didericis	f7f1a7d5da	feat(cred-proxy): host-side apply_routes_change helper (PRD 0014) Phase 2 of PRD 0014. New module claude_bottle/backend/docker/cred_proxy_apply.py: - fetch_current_routes(slug): docker exec cat of the live routes.json from the running cred-proxy sidecar. - validate_routes_json(content): syntactic check before SIGHUP so failures keep the old routes live and surface a clearer error than 'reload failed' in the sidecar logs. - apply_routes_change(slug, new): fetch current → validate new → write to temp → docker cp into sidecar → docker kill --signal HUP. Returns (before, after) so the caller can render a real audit diff. - CredProxyApplyError: caller surfaces to operator without crashing the dashboard. docker exec / cp / kill paths are covered by the integration test in Phase 5; unit tests here cover the validator. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:41:18 -04:00
didericis	ee60b09816	feat(cred-proxy): SIGHUP reload of routes.json (PRD 0014) Phase 1 of PRD 0014. Adds the in-sidecar SIGHUP signal handler that re-reads routes.json + re-resolves tokens from env without dropping in-flight connections: - reload_routes(server, path, environ=...) does the atomic swap. Returns (ok, message) so the caller can log/surface failures. On failure (bad JSON, missing file) the server keeps serving the old routes rather than dying — typos shouldn't crash the sidecar. - install_sighup_handler wires SIGHUP → reload_routes. No-op on platforms without SIGHUP (Windows). - serve() now installs the handler at startup. Atomicity: Python attribute reassignment is atomic, and the request handler reads server.routes/tokens once at the top of _proxy() so an in-flight request keeps the version it captured. Tests cover successful reload, JSON-parse failure, and missing-file failure (both verify the old routes survive). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:39:54 -04:00
didericis	92fee89e20	test(supervise): skip queue round-trip test in docker-in-docker (PRD 0013) test / unit (pull_request) Successful in 17s Details test / integration (pull_request) Successful in 41s Details The integration test test_tools_call_round_trips_through_queue relies on a host bind-mount to share the queue dir between the sidecar (writing proposals) and the test process (approving via dashboard helpers). In the Gitea Actions runner the docker socket forwards to the outer host's daemon, so bind-mount paths are resolved against the outer host's fs — not the runner container's. The sidecar writes its proposal where the test can't see it; the test times out. Add a one-shot probe that does docker run -v <tmp>:<container> and checks both directions of fs visibility. Skip the round-trip test when the probe fails. tools_list and the orphan-name test are unaffected — they don't touch the queue. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:26:06 -04:00
didericis	9f445d61be	test(supervise): docker integration test for the sidecar (PRD 0013) test / unit (pull_request) Successful in 16s Details test / integration (pull_request) Failing after 1m25s Details Phase 5 of PRD 0013. End-to-end integration test against real Docker: - Brings up the supervise sidecar on a per-bottle internal network. - A curl-image "agent" on the same network does tools/list and gets back the three PRD 0013 tool names over real MCP wire format. - A tools/call round-trips through the queue: agent blocks on the call, host watches the queue, dashboard.approve writes a Response, agent receives the approval payload (status, notes) in MCP content. - Documents the orphan-sidecar name-collision behavior so a future auto-cleanup change can flip the assertion. Skips if docker is unreachable, matching the existing integration pattern. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:20:57 -04:00
didericis	0aecb41e33	feat(supervise): TUI dashboard for approve/modify/reject (PRD 0013) Phase 4 of PRD 0013. Adds `claude-bottle dashboard` subcommand: - discover_pending() walks ~/.claude-bottle/queue/* and gathers pending proposals across all bottles, sorted FIFO by arrival. - approve / approve-with-final-file / reject helpers write the Response file the sidecar polls, and append an AuditEntry for cred-proxy and pipelock tools. capability-block proposals don't write to an audit log here (PRD 0016 captures via rebuild record). - Stdlib-curses TUI: list view, detail view, $EDITOR shellout for modify-then-approve, inline prompt for reject reason. - `dashboard --once` dumps pending proposals to stdout without bringing up curses — useful for scripted checks and tests. For 0013 the audit entry's diff field is render_diff("", proposed) because we don't yet have access to the live on-disk current file; PRDs 0014 / 0015 fill in real before→after diffs once they own the host-side config writes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:20:57 -04:00
didericis	4b2dbcdefd	feat(supervise): Docker lifecycle + bottle integration (PRD 0013) Phase 3 of PRD 0013. Wires the supervise sidecar into bottle launch: - Manifest: bottle.supervise (bool, default False). Opt-in for v1 so existing bottles are unchanged. - supervise.py: adds SupervisePlan + abstract Supervise(ABC) with a prepare template that stages the per-bottle queue dir on the host and the current-config dir under stage_dir (routes.json + allowlist + Dockerfile). Stdlib-only so it still runs as the in-container shared helper. - backend/docker/supervise.py: DockerSupervise concrete start/stop. No egress network (the sidecar doesn't make outbound calls); just the bottle's internal network with network-alias "supervise" and a bind-mount of the host queue dir at /run/supervise/queue. - Prepare wires supervise.prepare into the DockerBottlePlan, derives routes_content from cred_proxy_plan, allowlist_content from pipelock_effective_allowlist, and dockerfile_content from the repo's Dockerfile. supervise sidecar added to the orphan probe. - Launch starts the supervise sidecar after pipelock + cred-proxy but before the agent (so DNS resolution for `supervise` is up on the agent's first tool call). - Agent container gets a read-only bind-mount of the current-config dir at /etc/claude-bottle/current-config when supervise is enabled. - bottle_plan print + to_dict surface the supervise state. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:20:57 -04:00
didericis	d5ba253878	feat(supervise): MCP sidecar HTTP server + Dockerfile (PRD 0013) Phase 2 of PRD 0013. Adds the in-container MCP server: - claude_bottle/supervise_server.py: minimal JSON-RPC over HTTP MCP server. Handles initialize / notifications/initialized / tools/list / tools/call. Each tools/call validates the proposed file syntactically, writes a Proposal to the host-mounted queue, blocks waiting for a Response, archives both files, returns the operator's {status, notes} wrapped in MCP content. - Three tool definitions with JSON Schema inputs: cred-proxy-block (routes.json), pipelock-block (allowlist), capability-block (Dockerfile). - Dockerfile.supervise mirroring the cred-proxy pattern: same pinned python:3.13-alpine, copies supervise.py + supervise_server.py into /app, exposes port 9100. Stdlib-only. Tests cover JSON-RPC parsing, per-tool validation, all three handlers, the queue round-trip via a background responder thread, and an end-to-end HTTP sanity check on a random port. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:20:57 -04:00
didericis	2e06090464	feat(supervise): host-side queue + audit log primitives (PRD 0013) Phase 1 of PRD 0013. Adds claude_bottle/supervise.py with: - Proposal / Response / AuditEntry dataclasses - Per-bottle queue dir under ~/.claude-bottle/queue/<slug>/ - write/read/list/archive proposal helpers + wait_for_response - Audit log writer (JSON-Lines under ~/.claude-bottle/audit/) - Unified-diff rendering + sha256 helper for stale-proposal detection Stdlib-only; in-container code (Phase 2) and Docker lifecycle (Phase 3) follow. Tests cover queue, audit, and diff/hash helpers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:20:57 -04:00
didericis	6ba5f9a9d3	feat(manifest): per-file MD directory loader (PRD 0011) test / unit (pull_request) Successful in 13s Details test / integration (pull_request) Successful in 22s Details Manifest.resolve walks $HOME/.claude-bottle/{bottles,agents}/ and $CWD/.claude-bottle/agents/ instead of reading claude-bottle.json. A bottles/ subdir under $CWD is logged as a warn and ignored — the filesystem layout IS the trust boundary, no resolver check needed. If claude-bottle.json exists alongside no .claude-bottle/ dir at either location, dies with a clear pointer at the README — the manifest format changed and we don't silently fall back. Manifest.from_md_dirs(home, cwd) is the programmatic entry point tests use to build a Manifest from fixture directories without touching os.environ. Manifest.from_json_obj is preserved for tests that still want to build manifests in-memory. Bottle / agent frontmatter goes through Bottle.from_dict / Agent.from_dict — same validators as today's JSON path. Unknown top-level frontmatter keys die with a "did you mean" pointer listing accepted keys. Filenames that don't match [a-z][a-z0-9-]* are skipped with a warn. Agent files accept the Claude Code subagent passthrough fields (name, description, model, color, memory) so the same file can drop into ~/.claude/agents/ — claude-bottle ignores them at launch but doesn't reject. The dry-run integration test ships a real MD fixture tree now; all 200 unit + 17 integration tests stay green.	2026-05-24 22:15:02 -04:00
didericis	8c1e4d0220	feat(yaml_subset): hand-rolled YAML-subset + frontmatter parser test / unit (pull_request) Successful in 12s Details test / integration (pull_request) Successful in 25s Details claude_bottle/yaml_subset.py — stdlib-only, ~450 lines. Parses the bounded shape claude-bottle's manifest files use: - Block mappings (top-level + nested via indentation) - Block lists (under a key, items can be scalars or block-style mappings whose keys align with the rest after the dash) - Inline lists `[a, b]` and inline dicts `{a: 1}` for one-level leaves - Quoted (single + double) and bare strings - Scalars: string, int, true/false, null/~ Rejects, each with a clear pointer at the line number: - `yes`/`no`/`on`/`off`/`Y`/`N`/`TRUE`/`FALSE` — only literal `true` / `false` are bools (the Norway problem stays solved by "quote your strings if they look like bools") - Bare strings that look like dates / octals / hex / floats - Anchors (`&`/`*`), aliases, YAML tags (`!!str`) - Multi-line block scalars (`|`, `>`) - Tabs in indentation - Nested flow style (only one level allowed) Public API: parse_yaml_subset(text) -> dict[str, object] Top level must be a mapping. parse_frontmatter(text) -> (dict, body_text) Strips `---` delimiters, parses content as YAML subset, returns the verbatim body text after the closing fence. 46 unit tests covering every construct the real manifest files use (the cred_proxy.routes structure, role-as-inline-list, nested ExtraHosts dicts) plus every rejection case listed in PRD 0011.	2026-05-24 21:59:34 -04:00
didericis	77a51702fc	fix(cred_proxy): force identity encoding on upstream requests test / unit (pull_request) Successful in 13s Details test / integration (pull_request) Successful in 25s Details claude-code sends Accept-Encoding: gzip, deflate, br on every request. api.anthropic.com honors it and returns gzip-compressed SSE responses. Pipelock 2.3.0 has no decompression path; its response scanner fails closed with "blocked: compressed sse_stream response cannot be scanned" — and that gate fires even with response_scanning.enabled=false and sse_streaming disabled. Verified empirically against the real pipelock image. Cleanest fix that preserves DLP coverage end-to-end: have cred-proxy ask upstream for uncompressed bytes. Strip the agent's Accept-Encoding when building the upstream headers and inject `Accept-Encoding: identity`. Upstream returns plaintext; pipelock can scan; no 403. Bandwidth cost is the gzip ratio one-way (cred-proxy ↔ upstream through pipelock). For LLM SSE streams that's a few KB extra per turn — trivial compared to the alternative of leaving pipelock's response scanner blind.	2026-05-24 14:08:35 -04:00
didericis	4662087b32	fix(pipelock): disable seed_phrase_detection for anthropic bottles test / unit (pull_request) Successful in 13s Details test / integration (pull_request) Successful in 22s Details The previous attempt added a `suppress: [{rule, path}]` entry. The yaml validated and the entry showed up in the live pipelock's config, but the BIP-39 detector kept firing — `suppress` only silences alerts, not enforcement. Reproduced the failure in isolation, probed three knobs against a real pipelock with a canonical BIP-39 body (`abandon abandon ... about`): suppress: [{rule: "BIP-39 Seed Phrase", path: "/anthropic/*"}] -> still 403 rules.disabled: ["dlp:BIP-39 Seed Phrase"] -> still 403 seed_phrase_detection: { enabled: false } -> 200 (forwarded) Only the global toggle actually stops the block. Pipelock 2.3.0 has no per-path / per-host knob for this detector, so the trade-off is: when the bottle declares an `anthropic-base-url` route, BIP-39 detection comes off globally for that bottle. Every other DLP pattern (gh_, sk-ant-, AKIA, etc.) keeps firing — the ones that actually map to claude-bottle's threat model. Drops the `suppress:` emitter from pipelock_build_config / pipelock_render_yaml; replaces with a `seed_phrase_detection: { enabled: false }` block driven by `pipelock_seed_phrase_detection_enabled(bottle)`. Tests flip from suppress-shape to seed_phrase shape. End-to-end probe through the real pipelock image confirms BIP-39 bodies forward.	2026-05-24 13:59:05 -04:00
didericis	c5d729e25d	fix(pipelock): suppress BIP-39 detector on cred-proxy anthropic path test / unit (pull_request) Successful in 14s Details test / integration (pull_request) Successful in 22s Details claude-code's chat bodies legitimately trip pipelock's BIP-39 seed- phrase detector — any 12+ English words that pass the BIP-39 checksum match. The direct path to api.anthropic.com already sits on tls_interception.passthrough_domains so no body scan runs there, but the cred-proxy hop is plain HTTP through pipelock and the body scanner fires. Add an anthropic-route-specific suppress entry: suppress: - rule: "BIP-39 Seed Phrase" path: "/anthropic/*" Just this one detector, only on this one path. Every other DLP pattern (AKIA, gh_, sk-ant-, etc.) keeps firing — those are unambiguous credential shapes with no legitimate reason to appear in a chat completion. Other detectors that fire on natural language can be added to the suppress list when/if they surface. Wiring: pipelock_effective_suppress(bottle) computes the entries from bottle.cred_proxy.routes; pipelock_build_config accepts them and emits a `suppress:` block; pipelock_render_yaml renders it. Probed schema with `pipelock check --config` to confirm the {rule, path} shape; full yaml validates clean.	2026-05-24 13:49:31 -04:00
didericis	51b20340a9	fix(pipelock): allow agent->sidecar traffic via SSRF exception test / unit (pull_request) Successful in 12s Details test / integration (pull_request) Successful in 21s Details The agent's HTTP_PROXY points at pipelock, so a request to http://cred-proxy:9099/... arrives at pipelock; pipelock resolves the host, sees an RFC1918 address (the bottle's internal Docker network sits in 172.x), and 403's "SSRF blocked: cred-proxy resolves to internal IP 172.20.0.4". Bypassing pipelock entirely would also remove its body scanner from the agent->cred-proxy leg — we want to keep that DLP coverage. Pipelock has `ssrf.ip_allowlist` for exactly this: CIDRs that override the built-in internal-IP block while api_allowlist + body scanning + tls_interception keep firing. Wiring: - `pipelock_build_config` accepts `ssrf_ip_allowlist`; when non-empty, emits an `ssrf: { ip_allowlist: [...] }` block. - `pipelock_render_yaml` renders that block. - `PipelockProxyPlan` gains `internal_network_cidr`. - New `network_inspect_cidr(name)` helper reads the Docker-assigned subnet via `docker network inspect`. - launch.py: after `network_create_internal`, inspect the CIDR, re-render the yaml with `ssrf_ip_allowlist=(cidr,)`, overwrite the file in place; `DockerPipelockProxy.start` then docker-cp's the updated content. Prepare's initial render stays unchanged (CIDR isn't known yet at prepare time). The exception scope is the bottle's own internal network only — agent ↔ pipelock / git-gate / cred-proxy. Body scanning still applies to the bytes flowing through pipelock; pipelock just no longer treats those internal IPs as exfil targets.	2026-05-24 13:39:27 -04:00
didericis	f4452b391d	fix(pipelock): auto-allow cred-proxy hostname when routes are declared test / unit (pull_request) Successful in 13s Details test / integration (pull_request) Successful in 22s Details The agent's HTTP_PROXY env points at pipelock, so an ANTHROPIC_BASE_URL like http://cred-proxy:9099/anthropic doesn't short-circuit through Docker's embedded DNS — it gets forwarded through pipelock, which then checks its api_allowlist for the hostname `cred-proxy` and 403's because the name isn't there. The agent surfaces the failure as "API Error: 403 blocked: domain not in allowlist: cred-proxy" on Claude's first call. Fix: pipelock_effective_allowlist auto-adds CRED_PROXY_HOSTNAME when bottle.cred_proxy.routes is non-empty (i.e., when the sidecar will actually be running and reachable). Move CRED_PROXY_HOSTNAME from backend/docker/cred_proxy.py to the backend-agnostic claude_bottle/cred_proxy.py so pipelock can reference it without a layering violation; the docker concrete imports it from the same place.	2026-05-24 13:25:21 -04:00
didericis	32b62cbacc	feat(cred_proxy)!: cred-proxy is the only Anthropic auth path test / unit (pull_request) Successful in 13s Details test / integration (pull_request) Successful in 23s Details Removes the legacy `CLAUDE_BOTTLE_OAUTH_TOKEN` -> `CLAUDE_CODE_OAUTH_TOKEN` forward in prepare.py. Bottles that need claude-code to authenticate must declare a cred_proxy route with role: "anthropic-base-url" — there is no fallback that hands the token to the agent directly. Drops the now-dead BottleSpec.forward_oauth_token field, the CLI setter that read CLAUDE_BOTTLE_OAUTH_TOKEN from the host env at prepare time, and the forward_oauth_token=False arg in the six pipelock integration tests. PRD 0010 and README updated; the dev ~/claude-bottle.json gains an anthropic-base-url route so the implementer/researcher agents keep working. BREAKING: bottles previously relying on the implicit OAuth forward will now produce an agent environ without any Anthropic credential. Verified with --dry-run: a bottle with no anthropic-base-url route yields env_names: [] (no token at all); a bottle that declares the route yields ANTHROPIC_BASE_URL plus a non-secret placeholder for CLAUDE_CODE_OAUTH_TOKEN.	2026-05-24 12:56:09 -04:00
didericis	2990c3c903	refactor(cred_proxy): rename Upstream -> Route, fix tea-login AttributeError test / unit (pull_request) Successful in 16s Details test / integration (pull_request) Successful in 25s Details Three leftovers from the manifest refactor: 1. provision/cred_proxy.py:223 referenced u.kind == 'gitea' for the tea login count — kind was removed from the runtime class, so any bottle with a tea-login route raised AttributeError at provision time. Switch to `'tea-login' in r.roles`. 2. The runtime class CredProxyUpstream is renamed to CredProxyRoute (its data is a route on the proxy, not an "upstream"; the field route.upstream is the upstream URL). Module's own naming now aligns with manifest.CredProxyRoute and routes.json. 3. cred_proxy_upstreams_for_bottle -> cred_proxy_routes_for_bottle; CredProxyPlan.upstreams -> CredProxyPlan.routes; local `upstreams` collections become `routes`. Callers in backend.py, launch.py, prepare.py, bottle_plan.py, provision/cred_proxy.py, and tests updated. Also strips lingering `bottle.tokens` references from docstrings (pipelock.py, cred_proxy.py prepare(), manifest._parse_https_host, test_pipelock_allowlist.py module doc) and removes dead helpers from the integration test (the _bottle helper used a tokens field that no longer parses).	2026-05-15 02:39:10 -04:00
didericis	fcbbc4484d	refactor(cred_proxy): flat routes, role-driven provisioning (PRD 0010) test / unit (pull_request) Successful in 14s Details test / integration (pull_request) Successful in 22s Details Replace bottle.tokens (with Kind enum and hardcoded per-kind route/auth tables) with bottle.cred_proxy.routes — each route declares its own path, upstream, auth_scheme, token_ref, and optional role[]. The manifest is now the source of truth for the proxy's runtime route table; adding an upstream is a manifest edit, not a code change. Agent-side rewrites move from per-kind dispatch to per-role tags on routes: anthropic-base-url -> set ANTHROPIC_BASE_URL=<proxy><path> npm-registry -> write ~/.npmrc registry= git-insteadof -> write ~/.gitconfig [url] insteadOf, keyed off route.upstream (suppressed when bottle.git brokers the same host) tea-login -> add a ~/.config/tea/config.yml login Roles are a list (string accepted as sugar). A gitea route typically carries ["git-insteadof", "tea-login"]. Singleton roles (anthropic-base-url, npm-registry) appear on at most one route. token_env slots are assigned per distinct TokenRef in declaration order — two routes sharing a token_ref (e.g. github API + git endpoints) share a slot. Drops: TOKEN_KINDS, _KIND_ROUTES, _KIND_AUTH_SCHEME, _TOKEN_DEFAULT_HOST, cred_proxy_route_path_for_gitea, the kind field on CredProxyUpstream, and the kind-based hardcoding in pipelock_token_hosts (now derives from route.UpstreamHost). Legacy bottle.tokens manifests now die with a hint pointing at bottle.cred_proxy.routes + this PRD. Tests rewritten end-to-end. Docs + example.json + the dev ~/claude-bottle.json updated to match.	2026-05-13 21:49:55 -04:00
didericis	27b2d78b11	fix(cred_proxy): close git-push bypass + route through pipelock (PRD 0010) Three coupled fixes that close a documented bypass of git-gate's gitleaks pre-receive hook: 1. cred-proxy refuses git smart-HTTP push at runtime. Any path ending in /git-receive-pack or /info/refs?service=git-receive-pack returns 403 with a pointer at the bottle.git SSH path. Fetch (upload-pack) is still allowed — the bypass we're closing is push, where gitleaks is the load-bearing scanner. Hard guarantee. 2. The provisioner suppresses the cred-proxy `~/.gitconfig` insteadOf rewrite for any host already declared in bottle.git. git-gate is the canonical git path there; we don't write a competing rule that would let `git clone https://<host>/...` succeed in ways that confuse on push. Defense in depth — (1) is the hard guarantee. 3. cred-proxy routes its outbound HTTPS through pipelock. The sidecar's environ now sets HTTPS_PROXY=<pipelock-url>, and the image's entrypoint runs `update-ca-certificates` over the per-bottle pipelock CA (docker cp'd into /usr/local/share/ca-certificates/pipelock.crt before start) so the proxy's HTTPS client trusts pipelock's bumped certs. Consequence: pipelock's allowlist + body scanner now sit in the cred-proxy egress path the same way they sit in front of direct agent traffic. The cred-proxy upstream hosts (api.github.com, github.com, gitea hosts, registry.npmjs.org) come OFF pipelock's passthrough_domains. Only api.anthropic.com remains on passthrough (LLM body content legitimately trips DLP). PRD 0010 updated to reflect all three. Tests adjusted: the "cred-proxy hosts go on passthrough" assertion in test_pipelock_allowlist flips to "they don't", a new TestIsGitPushRequest exercises the smart-HTTP refusal predicate, and the gitconfig renderer tests cover the per-host suppression matrix.	2026-05-13 21:09:33 -04:00
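
The smart-HTTP push refusal described in 27b2d78b11 could look roughly like the sketch below. This is an illustration only, not the repository's code: the function name `is_git_push_request` is a guess based on the `TestIsGitPushRequest` test name, and the input is assumed to be the request target (path plus optional query string).

```python
from urllib.parse import parse_qs, urlsplit


def is_git_push_request(request_target: str) -> bool:
    """Return True when the request is a git smart-HTTP push.

    Hypothetical helper: the name and signature are assumptions,
    not taken from the repository.
    """
    parts = urlsplit(request_target)

    # The push itself: POST <repo>/git-receive-pack
    if parts.path.endswith("/git-receive-pack"):
        return True

    # Ref advertisement for push: GET <repo>/info/refs?service=git-receive-pack
    if parts.path.endswith("/info/refs"):
        services = parse_qs(parts.query).get("service", [])
        return "git-receive-pack" in services

    # Everything else, including fetch (git-upload-pack), is not a push.
    return False
```

Under these assumptions the proxy would answer 403 whenever the predicate is true and continue with normal route handling otherwise, so fetches through `git-upload-pack` keep working.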

261 Commits