Commit Graph

102 Commits

Author SHA1 Message Date
didericis f3f2e3e9ab feat(pipelock-block): tool sends failed URL, supervisor merges host
test / unit (pull_request) Successful in 16s
test / integration (pull_request) Successful in 1m32s
Reshape the pipelock-block MCP tool around what the agent actually
knows at the moment of failure (the URL pipelock just refused), not
what the operator needs (a full allowlist file).

Before: agent had to read /etc/claude-bottle/current-config/allowlist,
copy the whole file, append their host, send back. Lots of work,
easy to get wrong, and the operator's diff was noisy because the
proposal contained every host the agent saw — most of which weren't
the change.

After: agent calls
  pipelock-block(failed_url="https://api.github.com/repos/foo/bar",
                 justification="...")
supervisor extracts api.github.com, fetches the running allowlist,
adds the host if not already present, applies the merged content.

Path is captured as operator context (the detail view labels it
"failed URL" instead of "proposed file") but isn't enforced —
pipelock's api_allowlist is hostname-only, so the path can't
become an allow rule.

- supervise_server: pipelock-block input schema gains `failed_url`
  (replaces `allowlist`); validate_proposed_file checks for
  http/https + hostname.
- PROPOSED_FILE_FIELD updated; tool description rewritten.
- dashboard._apply_pipelock_url: extract host, fetch current,
  merge, apply.
- _proposed_payload_label: detail view renders "failed URL" for
  pipelock-block, "proposed file" otherwise.
- Tests updated end-to-end; new url-host-merge + idempotent-merge +
  invalid-url cases added.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 08:02:53 -04:00
didericis a9bb34cb77 feat(dashboard): highlight newly-arrived proposals in green for 5s
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m34s
When a new proposal lands in the dashboard's list, the operator
shouldn't have to compare the list to a mental snapshot to spot
what's new. Render newly-arrived proposals in green for the first
five seconds after they show up.

- _try_init_green: initialise a green color pair; returns 0 if the
  terminal lacks color so the highlight degrades to no-op.
- _main_loop tracks first_seen[proposal_id] across refresh ticks,
  pruning entries when a proposal leaves the queue.
- _render ORs green into the existing attr (composes with selection
  reverse-video — terminal handles the mix).

Applies to all tool types (cred-proxy-block, pipelock-block,
capability-block). If a tool-specific highlight is wanted later,
filter on qp.proposal.tool in _is_recent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 07:54:34 -04:00
didericis 307400f08a fix(supervise): bypass pipelock for agent → supervise MCP traffic
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m36s
`/mcp` showed the supervise server as ✔ connected (initialize is
fast), but any actual tool call failed because the supervise
MCP design is long-poll — the sidecar holds the HTTP request open
until the operator approves in the dashboard (potentially minutes)
and only then returns the response.

Pipelock is a forward proxy with idle timeouts; it cut the long-
polled HTTPS-style request well before the operator could act, and
claude-code reported the tool as ✘ failed.

Fix: add `supervise` to the agent's NO_PROXY when bottle.supervise
is true. The supervise sidecar is on the bottle's internal network
with the `supervise` network-alias, so the agent can dial it
directly via docker DNS — no proxy, no idle timeout.

Body-scanning supervise traffic isn't critical because the operator
reviews every proposal in the TUI before approving. The earlier
pipelock allowlist auto-add for `supervise` stays as belt-and-
braces (handles any proxy-respecting client other than claude-code
that might dial supervise).

Existing bottles need a restart to pick up the new NO_PROXY value
(env can't be changed on a running container). The dashboard's
pipelock-edit workaround from PR #25 unblocks short-running tool
calls in the meantime but won't survive the pipelock idle timeout
on a long-polled call.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 07:36:27 -04:00
didericis d2e047fa66 fix(pipelock): auto-allow supervise hostname like cred-proxy
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m35s
When PR #19 added the supervise sidecar (PRD 0013), I forgot to
mirror the cred-proxy auto-allow in pipelock_effective_allowlist.
The agent's HTTP_PROXY points at pipelock, so a request for
http://supervise:9100/ (the MCP endpoint claude-code dials) arrives
at pipelock as hostname `supervise` — and pipelock 403s it because
the host isn't in api_allowlist.

End-user symptom: even after `claude mcp add` registers the
supervise server, `/mcp` shows it as ✘ failed and the supervise
sidecar's docker logs are silent (request never gets through).

Mirror what cred-proxy already does: when bottle.supervise is True,
add SUPERVISE_HOSTNAME to the rendered pipelock allowlist. New tests
cover both the auto-add and the no-add-when-disabled invariants.

Existing bottles: the dashboard `pipelock edit <bottle>` verb (or
backend.docker.pipelock_apply.apply_allowlist_change) can apply
this fix to a running bottle without a relaunch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 07:27:30 -04:00
didericis 0e2fc97aa8 fix(supervise): provision MCP via claude mcp add, not raw settings.json
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m34s
The previous provisioner wrote ~/.claude/settings.json with an
mcpServers entry — but claude-code doesn't read its mcpServers from
that path. Inside a bottle, /mcp showed "No MCP servers configured"
even though the sidecar was running.

Switch to the official `claude mcp add` command run via docker exec:

  docker exec -u node <agent> \
    claude mcp add --scope user --transport http supervise <url>

claude-code owns its config file format (~/.claude.json shape, key
names, scope semantics) and has changed it between versions. The
official command writes to the right place in the right shape for
whatever version is installed.

Failure is logged but not fatal — the bottle still works; you just
have to register the server manually with the command surfaced in
the warning. Worst case is a bad agent claude-code version, not a
bad bottle.

To fix an already-running bottle without restarting, the user can
run the same `docker exec` command directly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 07:19:51 -04:00
didericis ef5d2f9a4d feat(state): preserve on crash + always snapshot transcript
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m31s
Extends the preserve-on-capability-block design to also preserve
state on agent crash, and snapshots the transcript on every
teardown so any resume (crash or capability-block) gets a warm
claude session — not a cold start.

- capability_apply: rename _snapshot_transcript → snapshot_transcript
  (public; reused below). No behavior change in the capability path.
- cli/start.py: capture bottle.exec_claude's exit code; while the
  container is still alive (inside the launch context):
    * always snapshot_transcript(identity)
    * if exit_code != 0, mark_preserved(identity)
  Then the existing _settle_state runs after teardown.

Now the preservation matrix is:

  exit 0   (clean)          → snapshot + cleanup state
  exit ≠0  (crash, Ctrl-C)  → snapshot + preserve + show resume hint
  capability-block          → (already snapshotted/preserved by apply
                               before teardown; this path is a no-op
                               because the container is already gone
                               by the time exec_claude returns)

snapshot_transcript is best-effort — capability-block's earlier
snapshot is not clobbered when the container is already torn down,
and a missing /home/node/.claude is a warn + skip.

Tested behavior: clean exit doesn't preserve, non-zero exit
(including SIGINT/130 and SIGKILL/137) preserves; empty identity
no-ops both helpers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 07:05:23 -04:00
didericis fb2b5844c4 feat(cleanup): prompt to remove per-bottle state, separately from containers
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m34s
`cli.py cleanup` already enumerated orphan containers + networks
and asked for confirmation before nuking them. Per-bottle state
under ~/.claude-bottle/state/ wasn't touched — accumulated forever,
including orphans from old code paths.

Add state to the cleanup flow with its own prompt: the trade-off is
different from containers (which are pure debris) because a state
dir may carry a resumable bottle (capability-block rebuild +
transcript snapshot) the operator still wants.

Output shows the resumable / orphan / rebuilt-Dockerfile / transcript /
preserve-marker flags for each state dir so the operator sees what
they'd lose. Both sections are skippable independently — answering
"n" to containers doesn't skip the state prompt.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 06:56:04 -04:00
didericis 9dbd20398e feat(state): clean up per-bottle state on session end (except capability-block)
test / unit (pull_request) Successful in 19s
test / integration (pull_request) Successful in 1m35s
Previously every bottle launch left ~/.claude-bottle/state/<identity>/
behind forever — metadata.json on every run, plus per-bottle
Dockerfile + transcript snapshot on capability-block rebuilds. The
metadata accumulated debris across launches; the only state worth
keeping was the capability-block rebuild bundle.

Make cleanup the default; preserve only on capability-block.

- bottle_state.py: .preserve marker helpers (mark_preserved,
  is_preserved, clear_preserve_marker, preserve_marker_path) +
  cleanup_state(identity) that rm -rf's the per-bottle dir.
- capability_apply.apply_capability_change writes mark_preserved
  before teardown so cli.py's session-end cleanup keeps the dir.
- prepare.py clears any leftover marker at launch (start or resume),
  so a marker from a prior capability-block doesn't keep state
  alive past a subsequent normal session-end.
- cli/start.py runs the cleanup decision AFTER the launch context
  closes: if is_preserved → print resume hint; else cleanup_state.
  The resume hint moves out of the launch with-block (was previously
  printed unconditionally — would have misled the operator about
  whether state was actually kept).

Future-proof: cli.py never persists state speculatively. If the
agent wants to be resumable, it has to go through capability-block.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 06:51:13 -04:00
didericis 6e46ca4478 feat(supervise): provision agent-side MCP config so Claude sees the sidecar
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m30s
The supervise sidecar (PRD 0013) has been serving MCP at
http://supervise:9100/ since it landed, but the in-bottle Claude
Code had no `.mcp.json` or settings pointing there — so the agent
couldn't actually call cred-proxy-block / pipelock-block /
capability-block as tools. To exercise the flow you had to curl
the sidecar from a sibling container.

This closes that last mile.

- claude_bottle/backend/docker/provision/supervise.py (new):
  provision_supervise(plan, target) writes
  ~/.claude/settings.json into the running agent container with an
  mcpServers.supervise entry of type http pointing at the
  per-bottle sidecar. No-op when bottle.supervise is False.
- BottleBackend.provision orchestrator gains provision_supervise as
  the last step (after CA, prompt, skills, git, cred-proxy). Default
  impl is a no-op so non-Docker backends aren't forced to implement it.
- DockerBottleBackend wires it through to the new module.
- Test covers the rendered settings shape so a future regression in
  the MCP entry format would surface in unit-level CI.

To test the full flow end-to-end now:
  ./cli.py start <agent> --cwd       # agent's claude sees supervise
  # agent calls cred-proxy-block via MCP
  ./cli.py dashboard                  # approve
  ./cli.py resume <identity>          # restart with new capabilities

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 06:22:25 -04:00
didericis 4032e04a9c feat(bottle): random-suffix identity + cli.py resume <identity>
test / unit (pull_request) Successful in 18s
test / integration (pull_request) Successful in 1m30s
Replaces the cwd-hash identity with a random 5-char base36 suffix
per launch, so two simultaneous `start <agent>` invocations against
the same cwd no longer collide on container names. Each launch is
its own bottle.

State carries metadata: every prepare step writes
~/.claude-bottle/state/<identity>/metadata.json with the
(agent_name, cwd, copy_cwd, started_at) the bottle was launched
with. The new `cli.py resume <identity>` reads this metadata and
re-launches a bottle pinned to the same identity — picking up the
per-bottle Dockerfile (from a prior capability-block apply) and
the transcript snapshot under the same state dir.

- bottle_state.py: bottle_identity(agent_name) drops the cwd param
  and gains a random suffix; BottleMetadata dataclass +
  read/write/metadata_path helpers.
- BottleSpec gains an optional identity field — resume sets it to
  pin the identity; start leaves it empty so prepare mints fresh.
- prepare.py: writes metadata at launch time; uses spec.identity if
  provided (resume) else bottle_identity(agent_name) (fresh start).
- start.py: extracted _launch_bottle from cmd_start so resume can
  share the launch core; prints `./cli.py resume <identity>` hint
  at session end.
- cli/resume.py (new): reads metadata, reconstructs BottleSpec
  with the recorded identity + cwd, delegates to _launch_bottle.
  Errors clearly when no state exists for the given identity.
- cli/__init__.py: registers `resume` in COMMANDS + usage.
- dashboard.py: capability-block approval status line now appends
  the `resume <identity>` hint so the operator can copy-paste the
  rebuild command without leaving the TUI.

Closes the rebuild loop in PRD 0016: agent calls capability-block →
operator approves → bottle torn down with state preserved → status
line shows resume command → operator runs it → replacement bottle
boots with the new Dockerfile and prior transcript.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 06:09:45 -04:00
didericis e996f72532 fix(bottle): identity-key all per-bottle resources by (agent, cwd)
test / unit (pull_request) Successful in 16s
test / integration (pull_request) Successful in 1m30s
The single point that computed `slug = slugify(agent_name)` in
prepare.py is now `slug = bottle_identity(agent_name, cwd)`. With
--cwd the identity has a sha256(resolved-cwd)[:12] suffix, so the
same agent against different projects gets distinct container
names, network names, queue dir, audit log paths, and per-bottle
state (Dockerfile + transcript). Without --cwd the identity is
just slugify(agent_name), unchanged from before — no-cwd bottles
look the same as today.

The downstream `slug` field on DockerBottlePlan keeps its name —
every module already threads it under "slug" and the value flowing
through is now the bottle's full identity. A comment in prepare.py
flags the change.

Fixes the bug surfaced in PR #22 review: running the same agent
against project-A's cwd then project-B's would silently share
project-A's per-bottle Dockerfile + transcript snapshot, container
name (forcing serialized runs), and queue/audit history.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:46:26 -04:00
didericis ac8f14ae6f test(capability): integration test for apply_capability_change (PRD 0016)
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m30s
Phase 4 of PRD 0016. End-to-end test against real Docker:

- Stages a fake bottle: alpine:latest container named
  claude-bottle-<slug> with a marker file at
  /home/node/.claude/sessions.json, plus a fake supervise sidecar.
- Calls apply_capability_change with a new Dockerfile.
- Verifies: per-bottle Dockerfile written, agent + sidecars
  removed, networks removed, transcript snapshot dir on host
  contains the marker file (proving docker cp transferred bytes).
- Subsequent-apply test proves the per-bottle Dockerfile state
  persists across rebuilds (before-diff uses the prior override,
  not the repo Dockerfile).
- Teardown-idempotent test: apply against a never-started bottle
  doesn't raise.

docker exec / cp / rm / network rm work fine across the docker
socket boundary, so this runs in DinD too — no act_runner skip
needed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:30:04 -04:00
didericis d9c47d0fbe feat(dashboard): wire capability-block approval to real apply (PRD 0016)
Phase 3 of PRD 0016. dashboard.approve() now dispatches to
apply_capability_change when the proposal is a capability-block:

  cred-proxy-block → apply_routes_change
  pipelock-block   → apply_allowlist_change
  capability-block → apply_capability_change   (new in PRD 0016)

CapabilityApplyError joins the ApplyError tuple, so the TUI's key
handlers catch it the same way and surface failures in the status
line.

After a successful capability-block apply, dashboard archives the
proposal+response itself — the supervise sidecar was torn down by
apply_capability_change and can't archive its own queue file.
Without this, dashboard.discover_pending would keep surfacing the
resolved proposal forever.

No audit log for capability-block per PRD 0013 — its record lives
in the per-bottle Dockerfile state + transcript snapshot.

Tests stub apply_capability_change at the dashboard module level,
add TestCapabilityApplyWiring (call wiring, failure-keeps-pending,
no-audit invariant, archive-after-apply), and update TestApproveReject
to stub the capability path too so it stays docker-independent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:28:35 -04:00
didericis 0899a898e0 feat(capability): host-side apply_capability_change orchestrator (PRD 0016)
Phase 2 of PRD 0016. New module
claude_bottle/backend/docker/capability_apply.py:

- apply_capability_change(slug, new_dockerfile): snapshot transcript
  → push working tree → write per-bottle Dockerfile → teardown.
  Returns (before, after) for the dashboard's audit/diff render.
- fetch_current_dockerfile(slug): per-bottle Dockerfile if set,
  else the repo's Dockerfile.
- Internal helpers _snapshot_transcript, _push_working_tree are
  best-effort (log + return on failure); _teardown_bottle is
  idempotent (force-rm + network rm silently ignore missing names).

Fire-and-forget from the agent's perspective: by the time the
dashboard writes the response file the supervise sidecar is already
gone (it was torn down), so the agent's tool call connection drops
without receiving the response. The replacement agent (next manual
`cli.py start <agent>`) sees the new per-bottle Dockerfile and the
transcript snapshot for resume. v1 does not auto-relaunch.

Tests cover sequencing (snapshot → push → teardown order), the
per-bottle vs repo Dockerfile fallback chain, empty-input rejection,
and the per-bottle-Dockerfile write. The docker exec / cp / rm
plumbing is covered by the Phase 4 integration test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:26:38 -04:00
didericis 02811e0417 feat(bottle): per-bottle Dockerfile state + image build hook (PRD 0016)
Phase 1 of PRD 0016. Lays the per-bottle state plumbing that
capability-block remediation will write into:

- claude_bottle/backend/docker/bottle_state.py: bottle_state_dir,
  per_bottle_dockerfile (read), write_per_bottle_dockerfile,
  per_bottle_image_tag (unique per slug), transcript_snapshot_dir.
  Stores under ~/.claude-bottle/state/<slug>/.
- prepare.py: when a per-bottle Dockerfile exists, use
  per_bottle_image_tag(slug) as the base image and pass the
  per-bottle Dockerfile path through DockerBottlePlan.dockerfile_path.
  --cwd still layers a derived image on top.
- launch.py: passes plan.dockerfile_path to build_image so the
  per-bottle Dockerfile is what docker build reads.
- DockerBottlePlan gains dockerfile_path field; print() surfaces it
  in the preflight summary so the operator can see at-a-glance that
  this bottle is running on a rebuilt image.

Phase 2 will write to write_per_bottle_dockerfile (capability-block
approval); Phase 3 wires it into the dashboard.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:23:31 -04:00
didericis 4fada1651b test(pipelock): integration test for apply_allowlist_change (PRD 0015)
test / unit (pull_request) Successful in 16s
test / integration (pull_request) Successful in 1m8s
Phase 4 of PRD 0015. End-to-end test against real Docker:

- Brings up a real pipelock sidecar via the production
  DockerPipelockProxy bring-up + pipelock_tls_init.
- Calls apply_allowlist_change to add a new host.
- Polls the live /etc/pipelock.yaml until the new host shows up
  (bridging the docker-restart window).
- Verifies api_allowlist contains both old + new hosts and
  tls_interception block is preserved.
- Smaller cases: invalid hostname raises, missing sidecar raises,
  fetch_current_allowlist returns one-per-line format.

Skipped under GITEA_ACTIONS because pipelock_tls_init bind-mounts a
host path that doesn't share fs in the runner, matching the
existing pipelock smoke test's skip pattern.

Drive-by fix: fetch_current_yaml now uses `docker cp` (daemon-API
tarball copy) instead of `docker exec cat` because the pipelock
image is distroless and has no shell utilities.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:07:26 -04:00
didericis 1d58d62c47 feat(dashboard): pipelock edit TUI verb (PRD 0015)
Phase 3 of PRD 0015. Adds the proactive `pipelock edit` path,
mirroring routes edit from PRD 0014:

- discover_pipelock_slugs() lists running pipelock sidecars.
- operator_edit_allowlist(slug, new) wraps apply_allowlist_change
  and writes an audit entry tagged ACTION_OPERATOR_EDIT.
- New 'p' keybinding in the main TUI: discover slugs, prompt if
  multiple, fetch current allowlist, open in $EDITOR, apply on
  save.
- Extracts shared scaffolding into _operator_edit_flow used by
  both routes-edit and pipelock-edit — DRY without sacrificing
  the per-verb status-line copy.
- Footer updated.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:03:20 -04:00
didericis 5a6c4be342 feat(dashboard): wire pipelock-block approval to real apply (PRD 0015)
Phase 2 of PRD 0015. dashboard.approve() now dispatches on the
proposal's tool:

  cred-proxy-block → apply_routes_change   (from PRD 0014)
  pipelock-block   → apply_allowlist_change (new in PRD 0015)
  capability-block → no-op (lands in PRD 0016)

PipelockApplyError joins CredProxyApplyError under the ApplyError
tuple the TUI catches: failures keep the proposal pending and the
status line surfaces the message; no response is written and no
audit entry is appended.

Tests: existing TestApproveReject stubs both apply paths; new
TestPipelockApplyWiring covers the call wiring, failure-propagation,
and real-diff-in-audit invariants.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 05:01:18 -04:00
didericis c05457fbef feat(pipelock): host-side apply_allowlist_change helper (PRD 0015)
Phase 1 of PRD 0015. New module
claude_bottle/backend/docker/pipelock_apply.py:

- fetch_current_yaml(slug): docker exec cat of the live
  /etc/pipelock.yaml.
- fetch_current_allowlist(slug): parses the yaml, extracts
  api_allowlist, renders as one-per-line for the operator/agent.
- parse_allowlist_content / render_allowlist_content: one-per-line
  with `#` comments + blank-line tolerance, conservative hostname
  validation.
- apply_allowlist_change(slug, new): parses new hosts, fetches +
  parses current yaml, swaps api_allowlist, re-renders via
  pipelock_render_yaml, docker cp into sidecar, docker restart.
  Returns (before, after) as one-per-line strings for the audit diff.
- PipelockApplyError: caller surfaces to operator without crashing
  the dashboard.

v1 uses restart, not SIGHUP — pipelock has no in-process reload
hook; adding one is the PRD's open question. Restart drops in-flight
outbound calls and the agent retries pick up the restarted proxy.

Yaml roundtrip is covered by tests: parse(render(cfg)) preserves
all fields pipelock_render_yaml emits, including tls_interception
+ passthrough_domains.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:59:13 -04:00
didericis 70f43d8c4f test(cred-proxy): integration test for SIGHUP + apply round-trip (PRD 0014)
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 1m12s
Phase 5 of PRD 0014. End-to-end test against real Docker:

- Brings up a cred-proxy sidecar with route /a/ → unreachable
  upstream (so 502 = route matched, 404 = no route).
- Calls apply_routes_change to swap to /b/ only.
- Polls until the route table flips: /a/ now 404s, /b/ now 502s.
- Separately verifies fetch_current_routes returns the live file,
  apply with invalid JSON raises, and apply against a non-existent
  sidecar raises.

No fake-upstream container needed: unreachable hostnames give the
502 signal directly. apply_routes_change uses docker exec / cp / kill
(not bind mounts), so this should work in docker-in-docker too —
no DinD skip needed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:50:29 -04:00
didericis 81277e9d81 feat(dashboard): routes edit TUI verb for operator-initiated changes (PRD 0014)
Phase 4 of PRD 0014. Adds the proactive routes-edit path that
doesn't require a pending proposal:

- discover_cred_proxy_slugs() lists running cred-proxy sidecars by
  parsing docker ps output. Returns [] when docker is unreachable
  or not installed (no exception escapes).
- operator_edit_routes(slug, new_content) wraps apply_routes_change
  and writes an audit entry tagged ACTION_OPERATOR_EDIT (so a
  future reader can distinguish operator-initiated changes from
  agent-proposal approvals in the log).
- New 'e' keybinding in the main TUI: discover slugs, prompt if
  multiple (or use the only one directly), fetch current routes,
  open in $EDITOR, apply on save. CredProxyApplyError lands in the
  status line; the operator can retry.

Tests cover audit-entry shape, failure path, and docker-missing
recovery for slug discovery.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:47:22 -04:00
didericis f3a1b4d667 feat(dashboard): wire cred-proxy-block approval to real apply (PRD 0014)
Phase 3 of PRD 0014. dashboard.approve() now does the real
remediation for cred-proxy-block proposals:

- Calls apply_routes_change(slug, file_to_apply) which fetches the
  current routes.json from the running sidecar, validates the new
  JSON, docker cp's it in, and SIGHUPs the sidecar.
- Audit entry's diff is now the real before→after from the apply
  return — not the empty-string placeholder 0013 wrote.
- On apply failure (CredProxyApplyError): no response file, no
  audit entry. Proposal stays pending so the operator can fix the
  input and retry. The TUI's key handlers catch the exception and
  surface the message in the status line.
- pipelock-block + capability-block remain no-op approvals; their
  remediation lands in PRDs 0015 + 0016 and the audit diff stays
  empty until then.
- reject path unchanged: no apply, audit entry with empty diff.

Tests stub apply_routes_change at the dashboard module level so the
unit suite doesn't need a running sidecar; integration test in
Phase 5 covers the real docker exec/cp/SIGHUP plumbing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:44:33 -04:00
didericis f7f1a7d5da feat(cred-proxy): host-side apply_routes_change helper (PRD 0014)
Phase 2 of PRD 0014. New module
claude_bottle/backend/docker/cred_proxy_apply.py:

- fetch_current_routes(slug): docker exec cat of the live
  routes.json from the running cred-proxy sidecar.
- validate_routes_json(content): syntactic check before SIGHUP so
  failures keep the old routes live and surface a clearer error
  than 'reload failed' in the sidecar logs.
- apply_routes_change(slug, new): fetch current → validate new →
  write to temp → docker cp into sidecar → docker kill --signal HUP.
  Returns (before, after) so the caller can render a real audit diff.
- CredProxyApplyError: caller surfaces to operator without crashing
  the dashboard.

docker exec / cp / kill paths are covered by the integration test
in Phase 5; unit tests here cover the validator.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:41:18 -04:00
didericis ee60b09816 feat(cred-proxy): SIGHUP reload of routes.json (PRD 0014)
Phase 1 of PRD 0014. Adds the in-sidecar SIGHUP signal handler that
re-reads routes.json + re-resolves tokens from env without dropping
in-flight connections:

- reload_routes(server, path, environ=...) does the atomic swap.
  Returns (ok, message) so the caller can log/surface failures.
  On failure (bad JSON, missing file) the server keeps serving the
  old routes rather than dying — typos shouldn't crash the sidecar.
- install_sighup_handler wires SIGHUP → reload_routes. No-op on
  platforms without SIGHUP (Windows).
- serve() now installs the handler at startup.

Atomicity: Python attribute reassignment is atomic, and the request
handler reads server.routes/tokens once at the top of _proxy() so
an in-flight request keeps the version it captured.

Tests cover successful reload, JSON-parse failure, and missing-file
failure (both verify the old routes survive).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:39:54 -04:00
didericis 92fee89e20 test(supervise): skip queue round-trip test in docker-in-docker (PRD 0013)
test / unit (pull_request) Successful in 17s
test / integration (pull_request) Successful in 41s
The integration test test_tools_call_round_trips_through_queue
relies on a host bind-mount to share the queue dir between the
sidecar (writing proposals) and the test process (approving via
dashboard helpers). In the Gitea Actions runner the docker socket
forwards to the outer host's daemon, so bind-mount paths are
resolved against the outer host's fs — not the runner container's.
The sidecar writes its proposal where the test can't see it; the
test times out.

Add a one-shot probe that does docker run -v <tmp>:<container> and
checks both directions of fs visibility. Skip the round-trip test
when the probe fails. tools_list and the orphan-name test are
unaffected — they don't touch the queue.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:26:06 -04:00
didericis 9f445d61be test(supervise): docker integration test for the sidecar (PRD 0013)
test / unit (pull_request) Successful in 16s
test / integration (pull_request) Failing after 1m25s
Phase 5 of PRD 0013. End-to-end integration test against real Docker:

- Brings up the supervise sidecar on a per-bottle internal network.
- A curl-image "agent" on the same network does tools/list and gets
  back the three PRD 0013 tool names over real MCP wire format.
- A tools/call round-trips through the queue: agent blocks on the
  call, host watches the queue, dashboard.approve writes a Response,
  agent receives the approval payload (status, notes) in MCP content.
- Documents the orphan-sidecar name-collision behavior so a future
  auto-cleanup change can flip the assertion.

Skips if docker is unreachable, matching the existing integration
pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:20:57 -04:00
didericis 0aecb41e33 feat(supervise): TUI dashboard for approve/modify/reject (PRD 0013)
Phase 4 of PRD 0013. Adds `claude-bottle dashboard` subcommand:

- discover_pending() walks ~/.claude-bottle/queue/* and gathers
  pending proposals across all bottles, sorted FIFO by arrival.
- approve / approve-with-final-file / reject helpers write the
  Response file the sidecar polls, and append an AuditEntry for
  cred-proxy and pipelock tools. capability-block proposals don't
  write to an audit log here (PRD 0016 captures via rebuild record).
- Stdlib-curses TUI: list view, detail view, $EDITOR shellout for
  modify-then-approve, inline prompt for reject reason.
- `dashboard --once` dumps pending proposals to stdout without
  bringing up curses — useful for scripted checks and tests.

For 0013 the audit entry's diff field is render_diff("", proposed)
because we don't yet have access to the live on-disk current file;
PRDs 0014 / 0015 fill in real before→after diffs once they own the
host-side config writes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:20:57 -04:00
didericis 4b2dbcdefd feat(supervise): Docker lifecycle + bottle integration (PRD 0013)
Phase 3 of PRD 0013. Wires the supervise sidecar into bottle launch:

- Manifest: bottle.supervise (bool, default False). Opt-in for v1 so
  existing bottles are unchanged.
- supervise.py: adds SupervisePlan + abstract Supervise(ABC) with a
  prepare template that stages the per-bottle queue dir on the host
  and the current-config dir under stage_dir (routes.json + allowlist
  + Dockerfile). Stdlib-only so it still runs as the in-container
  shared helper.
- backend/docker/supervise.py: DockerSupervise concrete start/stop.
  No egress network (the sidecar doesn't make outbound calls); just
  the bottle's internal network with network-alias "supervise" and a
  bind-mount of the host queue dir at /run/supervise/queue.
- Prepare wires supervise.prepare into the DockerBottlePlan, derives
  routes_content from cred_proxy_plan, allowlist_content from
  pipelock_effective_allowlist, and dockerfile_content from the
  repo's Dockerfile. supervise sidecar added to the orphan probe.
- Launch starts the supervise sidecar after pipelock + cred-proxy
  but before the agent (so DNS resolution for `supervise` is up on
  the agent's first tool call).
- Agent container gets a read-only bind-mount of the current-config
  dir at /etc/claude-bottle/current-config when supervise is enabled.
- bottle_plan print + to_dict surface the supervise state.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:20:57 -04:00
didericis d5ba253878 feat(supervise): MCP sidecar HTTP server + Dockerfile (PRD 0013)
Phase 2 of PRD 0013. Adds the in-container MCP server:

- claude_bottle/supervise_server.py: minimal JSON-RPC over HTTP MCP
  server. Handles initialize / notifications/initialized / tools/list /
  tools/call. Each tools/call validates the proposed file syntactically,
  writes a Proposal to the host-mounted queue, blocks waiting for a
  Response, archives both files, returns the operator's {status, notes}
  wrapped in MCP content.
- Three tool definitions with JSON Schema inputs: cred-proxy-block
  (routes.json), pipelock-block (allowlist), capability-block
  (Dockerfile).
- Dockerfile.supervise mirroring the cred-proxy pattern: same pinned
  python:3.13-alpine, copies supervise.py + supervise_server.py into
  /app, exposes port 9100.

Stdlib-only. Tests cover JSON-RPC parsing, per-tool validation, all
three handlers, the queue round-trip via a background responder
thread, and an end-to-end HTTP sanity check on a random port.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:20:57 -04:00
didericis 2e06090464 feat(supervise): host-side queue + audit log primitives (PRD 0013)
Phase 1 of PRD 0013. Adds claude_bottle/supervise.py with:

- Proposal / Response / AuditEntry dataclasses
- Per-bottle queue dir under ~/.claude-bottle/queue/<slug>/
- write/read/list/archive proposal helpers + wait_for_response
- Audit log writer (JSON-Lines under ~/.claude-bottle/audit/)
- Unified-diff rendering + sha256 helper for stale-proposal detection

Stdlib-only; in-container code (Phase 2) and Docker lifecycle
(Phase 3) follow. Tests cover queue, audit, and diff/hash helpers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:20:57 -04:00
didericis 6ba5f9a9d3 feat(manifest): per-file MD directory loader (PRD 0011)
test / unit (pull_request) Successful in 13s
test / integration (pull_request) Successful in 22s
Manifest.resolve walks $HOME/.claude-bottle/{bottles,agents}/ and
$CWD/.claude-bottle/agents/ instead of reading claude-bottle.json.
A bottles/ subdir under $CWD is logged as a warn and ignored —
the filesystem layout IS the trust boundary, no resolver check
needed.

If claude-bottle.json exists alongside no .claude-bottle/ dir at
either location, dies with a clear pointer at the README — the
manifest format changed and we don't silently fall back.

Manifest.from_md_dirs(home, cwd) is the programmatic entry point
tests use to build a Manifest from fixture directories without
touching os.environ. Manifest.from_json_obj is preserved for
tests that still want to build manifests in-memory.

Bottle / agent frontmatter goes through Bottle.from_dict /
Agent.from_dict — same validators as today's JSON path. Unknown
top-level frontmatter keys die with a "did you mean" pointer
listing accepted keys. Filenames that don't match [a-z][a-z0-9-]*
are skipped with a warn.

Agent files accept the Claude Code subagent passthrough fields
(name, description, model, color, memory) so the same file can
drop into ~/.claude/agents/ — claude-bottle ignores them at
launch but doesn't reject.

The dry-run integration test ships a real MD fixture tree now;
all 200 unit + 17 integration tests stay green.
2026-05-24 22:15:02 -04:00
didericis 8c1e4d0220 feat(yaml_subset): hand-rolled YAML-subset + frontmatter parser
test / unit (pull_request) Successful in 12s
test / integration (pull_request) Successful in 25s
claude_bottle/yaml_subset.py — stdlib-only, ~450 lines. Parses the
bounded shape claude-bottle's manifest files use:

  - Block mappings (top-level + nested via indentation)
  - Block lists (under a key, items can be scalars or block-style
    mappings whose keys align with the rest after the dash)
  - Inline lists `[a, b]` and inline dicts `{a: 1}` for one-level
    leaves
  - Quoted (single + double) and bare strings
  - Scalars: string, int, true/false, null/~

Rejects, each with a clear pointer at the line number:

  - `yes`/`no`/`on`/`off`/`Y`/`N`/`TRUE`/`FALSE` — only literal
    `true` / `false` are bools (the Norway problem stays solved by
    "quote your strings if they look like bools")
  - Bare strings that look like dates / octals / hex / floats
  - Anchors (`&`/`*`), aliases, YAML tags (`!!str`)
  - Multi-line block scalars (`|`, `>`)
  - Tabs in indentation
  - Nested flow style (only one level allowed)

Public API:

  parse_yaml_subset(text) -> dict[str, object]
    Top level must be a mapping.

  parse_frontmatter(text) -> (dict, body_text)
    Strips `---` delimiters, parses content as YAML subset, returns
    the verbatim body text after the closing fence.

46 unit tests covering every construct the real manifest files use
(the cred_proxy.routes structure, role-as-inline-list, nested
ExtraHosts dicts) plus every rejection case listed in PRD 0011.
2026-05-24 21:59:34 -04:00
didericis 77a51702fc fix(cred_proxy): force identity encoding on upstream requests
test / unit (pull_request) Successful in 13s
test / integration (pull_request) Successful in 25s
claude-code sends Accept-Encoding: gzip, deflate, br on every
request. api.anthropic.com honors it and returns gzip-compressed
SSE responses. Pipelock 2.3.0 has no decompression path; its
response scanner fails closed with "blocked: compressed
sse_stream response cannot be scanned" — and that gate fires
even with response_scanning.enabled=false and sse_streaming
disabled. Verified empirically against the real pipelock image.

Cleanest fix that preserves DLP coverage end-to-end: have
cred-proxy ask upstream for uncompressed bytes. Strip the
agent's Accept-Encoding when building the upstream headers and
inject `Accept-Encoding: identity`. Upstream returns plaintext;
pipelock can scan; no 403.

Bandwidth cost is the gzip ratio one-way (cred-proxy ↔ upstream
through pipelock). For LLM SSE streams that's a few KB extra per
turn — trivial compared to the alternative of leaving
pipelock's response scanner blind.
2026-05-24 14:08:35 -04:00
didericis 4662087b32 fix(pipelock): disable seed_phrase_detection for anthropic bottles
test / unit (pull_request) Successful in 13s
test / integration (pull_request) Successful in 22s
The previous attempt added a `suppress: [{rule, path}]` entry. The
yaml validated and the entry showed up in the live pipelock's
config, but the BIP-39 detector kept firing — `suppress` only
silences alerts, not enforcement.

Reproduced the failure in isolation, probed three knobs against a
real pipelock with a canonical BIP-39 body
(`abandon abandon ... about`):

  suppress: [{rule: "BIP-39 Seed Phrase", path: "/anthropic/**"}]
    -> still 403
  rules.disabled: ["dlp:BIP-39 Seed Phrase"]
    -> still 403
  seed_phrase_detection: { enabled: false }
    -> 200 (forwarded)

Only the global toggle actually stops the block. Pipelock 2.3.0
has no per-path / per-host knob for this detector, so the
trade-off is: when the bottle declares an `anthropic-base-url`
route, BIP-39 detection comes off globally for that bottle. Every
other DLP pattern (gh*_, sk-ant-, AKIA, etc.) keeps firing — the
ones that actually map to claude-bottle's threat model.

Drops the `suppress:` emitter from pipelock_build_config /
pipelock_render_yaml; replaces with a `seed_phrase_detection:
{ enabled: false }` block driven by
`pipelock_seed_phrase_detection_enabled(bottle)`. Tests flip from
suppress-shape to seed_phrase shape. End-to-end probe through the
real pipelock image confirms BIP-39 bodies forward.
2026-05-24 13:59:05 -04:00
didericis c5d729e25d fix(pipelock): suppress BIP-39 detector on cred-proxy anthropic path
test / unit (pull_request) Successful in 14s
test / integration (pull_request) Successful in 22s
claude-code's chat bodies legitimately trip pipelock's BIP-39 seed-
phrase detector — any 12+ English words that pass the BIP-39
checksum match. The direct path to api.anthropic.com already sits
on tls_interception.passthrough_domains so no body scan runs
there, but the cred-proxy hop is plain HTTP through pipelock and
the body scanner fires.

Add an anthropic-route-specific suppress entry:
  suppress:
    - rule: "BIP-39 Seed Phrase"
      path: "/anthropic/**"

Just this one detector, only on this one path. Every other DLP
pattern (AKIA, gh*_, sk-ant-, etc.) keeps firing — those are
unambiguous credential shapes with no legitimate reason to appear
in a chat completion. Other detectors that fire on natural
language can be added to the suppress list when/if they surface.

Wiring: pipelock_effective_suppress(bottle) computes the entries
from bottle.cred_proxy.routes; pipelock_build_config accepts them
and emits a `suppress:` block; pipelock_render_yaml renders it.
Probed schema with `pipelock check --config` to confirm the
{rule, path} shape; full yaml validates clean.
2026-05-24 13:49:31 -04:00
didericis 51b20340a9 fix(pipelock): allow agent->sidecar traffic via SSRF exception
test / unit (pull_request) Successful in 12s
test / integration (pull_request) Successful in 21s
The agent's HTTP_PROXY points at pipelock, so a request to
http://cred-proxy:9099/... arrives at pipelock; pipelock resolves
the host, sees an RFC1918 address (the bottle's internal Docker
network sits in 172.x), and 403's "SSRF blocked: cred-proxy
resolves to internal IP 172.20.0.4". Bypassing pipelock entirely
would also remove its body scanner from the agent->cred-proxy leg
— we want to keep that DLP coverage.

Pipelock has `ssrf.ip_allowlist` for exactly this: CIDRs that
override the built-in internal-IP block while api_allowlist + body
scanning + tls_interception keep firing.

Wiring:

- `pipelock_build_config` accepts `ssrf_ip_allowlist`; when
  non-empty, emits an `ssrf: { ip_allowlist: [...] }` block.
- `pipelock_render_yaml` renders that block.
- `PipelockProxyPlan` gains `internal_network_cidr`.
- New `network_inspect_cidr(name)` helper reads the Docker-assigned
  subnet via `docker network inspect`.
- launch.py: after `network_create_internal`, inspect the CIDR,
  re-render the yaml with `ssrf_ip_allowlist=(cidr,)`, overwrite
  the file in place; `DockerPipelockProxy.start` then docker-cp's
  the updated content. Prepare's initial render stays unchanged
  (CIDR isn't known yet at prepare time).

The exception scope is the bottle's own internal network only —
agent ↔ pipelock / git-gate / cred-proxy. Body scanning still
applies to the bytes flowing through pipelock; pipelock just no
longer treats those internal IPs as exfil targets.
2026-05-24 13:39:27 -04:00
didericis f4452b391d fix(pipelock): auto-allow cred-proxy hostname when routes are declared
test / unit (pull_request) Successful in 13s
test / integration (pull_request) Successful in 22s
The agent's HTTP_PROXY env points at pipelock, so an
ANTHROPIC_BASE_URL like http://cred-proxy:9099/anthropic doesn't
short-circuit through Docker's embedded DNS — it gets forwarded
through pipelock, which then checks its api_allowlist for the
hostname `cred-proxy` and 403's because the name isn't there. The
agent surfaces the failure as "API Error: 403 blocked: domain not
in allowlist: cred-proxy" on Claude's first call.

Fix: pipelock_effective_allowlist auto-adds CRED_PROXY_HOSTNAME
when bottle.cred_proxy.routes is non-empty (i.e., when the
sidecar will actually be running and reachable).

Move CRED_PROXY_HOSTNAME from backend/docker/cred_proxy.py to the
backend-agnostic claude_bottle/cred_proxy.py so pipelock can
reference it without a layering violation; the docker concrete
imports it from the same place.
2026-05-24 13:25:21 -04:00
didericis 32b62cbacc feat(cred_proxy)!: cred-proxy is the only Anthropic auth path
test / unit (pull_request) Successful in 13s
test / integration (pull_request) Successful in 23s
Removes the legacy `CLAUDE_BOTTLE_OAUTH_TOKEN` -> `CLAUDE_CODE_OAUTH_TOKEN`
forward in prepare.py. Bottles that need claude-code to authenticate
must declare a cred_proxy route with role: "anthropic-base-url" — there
is no fallback that hands the token to the agent directly.

Drops the now-dead BottleSpec.forward_oauth_token field, the CLI
setter that read CLAUDE_BOTTLE_OAUTH_TOKEN from the host env at
prepare time, and the forward_oauth_token=False arg in the six
pipelock integration tests.

PRD 0010 and README updated; the dev ~/claude-bottle.json gains an
anthropic-base-url route so the implementer/researcher agents keep
working.

BREAKING: bottles previously relying on the implicit OAuth forward
will now produce an agent environ without any Anthropic credential.
Verified with --dry-run: a bottle with no anthropic-base-url route
yields env_names: [] (no token at all); a bottle that declares the
route yields ANTHROPIC_BASE_URL plus a non-secret placeholder for
CLAUDE_CODE_OAUTH_TOKEN.
2026-05-24 12:56:09 -04:00
didericis 2990c3c903 refactor(cred_proxy): rename Upstream -> Route, fix tea-login AttributeError
test / unit (pull_request) Successful in 16s
test / integration (pull_request) Successful in 25s
Three leftovers from the manifest refactor:

1. provision/cred_proxy.py:223 referenced u.kind == 'gitea' for the
   tea login count — kind was removed from the runtime class, so any
   bottle with a tea-login route raised AttributeError at provision
   time. Switch to `'tea-login' in r.roles`.

2. The runtime class CredProxyUpstream is renamed to CredProxyRoute
   (its data is a route on the proxy, not an "upstream"; the field
   route.upstream is the upstream URL). Module's own naming now
   aligns with manifest.CredProxyRoute and routes.json.

3. cred_proxy_upstreams_for_bottle -> cred_proxy_routes_for_bottle;
   CredProxyPlan.upstreams -> CredProxyPlan.routes; local
   `upstreams` collections become `routes`. Callers in
   backend.py, launch.py, prepare.py, bottle_plan.py,
   provision/cred_proxy.py, and tests updated.

Also strips lingering `bottle.tokens` references from docstrings
(pipelock.py, cred_proxy.py prepare(), manifest._parse_https_host,
test_pipelock_allowlist.py module doc) and removes dead helpers
from the integration test (the _bottle helper used a tokens field
that no longer parses).
2026-05-15 02:39:10 -04:00
didericis fcbbc4484d refactor(cred_proxy): flat routes, role-driven provisioning (PRD 0010)
test / unit (pull_request) Successful in 14s
test / integration (pull_request) Successful in 22s
Replace bottle.tokens (with Kind enum and hardcoded per-kind
route/auth tables) with bottle.cred_proxy.routes — each route
declares its own path, upstream, auth_scheme, token_ref, and
optional role[]. The manifest is now the source of truth for the
proxy's runtime route table; adding an upstream is a manifest edit,
not a code change.

Agent-side rewrites move from per-kind dispatch to per-role tags
on routes:
  anthropic-base-url -> set ANTHROPIC_BASE_URL=<proxy><path>
  npm-registry       -> write ~/.npmrc registry=
  git-insteadof      -> write ~/.gitconfig [url] insteadOf, keyed
                        off route.upstream (suppressed when
                        bottle.git brokers the same host)
  tea-login          -> add a ~/.config/tea/config.yml login

Roles are a list (string accepted as sugar). A gitea route
typically carries ["git-insteadof", "tea-login"]. Singleton roles
(anthropic-base-url, npm-registry) appear on at most one route.

token_env slots are assigned per distinct TokenRef in declaration
order — two routes sharing a token_ref (e.g. github API + git
endpoints) share a slot.

Drops: TOKEN_KINDS, _KIND_ROUTES, _KIND_AUTH_SCHEME, _TOKEN_DEFAULT_HOST,
cred_proxy_route_path_for_gitea, the kind field on CredProxyUpstream,
and the kind-based hardcoding in pipelock_token_hosts (now derives
from route.UpstreamHost).

Legacy bottle.tokens manifests now die with a hint pointing at
bottle.cred_proxy.routes + this PRD. Tests rewritten end-to-end.
Docs + example.json + the dev ~/claude-bottle.json updated to match.
2026-05-13 21:49:55 -04:00
didericis 27b2d78b11 fix(cred_proxy): close git-push bypass + route through pipelock (PRD 0010)
test / unit (pull_request) Successful in 15s
test / integration (pull_request) Successful in 29s
Three coupled fixes that close a documented bypass of git-gate's
gitleaks pre-receive hook:

1. cred-proxy refuses git smart-HTTP push at runtime. Any path
   ending in /git-receive-pack or /info/refs?service=git-receive-pack
   returns 403 with a pointer at the bottle.git SSH path. Fetch
   (upload-pack) is still allowed — the bypass we're closing is
   push, where gitleaks is the load-bearing scanner. Hard guarantee.

2. The provisioner suppresses the cred-proxy `~/.gitconfig` insteadOf
   rewrite for any host already declared in bottle.git. git-gate is
   the canonical git path there; we don't write a competing rule
   that would let `git clone https://<host>/...` succeed in ways
   that confuse on push. Defense in depth — (1) is the hard guarantee.

3. cred-proxy routes its outbound HTTPS through pipelock. The
   sidecar's environ now sets HTTPS_PROXY=<pipelock-url>, and the
   image's entrypoint runs `update-ca-certificates` over the
   per-bottle pipelock CA (docker cp'd into
   /usr/local/share/ca-certificates/pipelock.crt before start) so
   the proxy's HTTPS client trusts pipelock's bumped certs.

   Consequence: pipelock's allowlist + body scanner now sit in the
   cred-proxy egress path the same way they sit in front of direct
   agent traffic. The cred-proxy upstream hosts (api.github.com,
   github.com, gitea hosts, registry.npmjs.org) come OFF
   pipelock's passthrough_domains. Only api.anthropic.com remains
   on passthrough (LLM body content legitimately trips DLP).

PRD 0010 updated to reflect all three. Tests adjusted: the
"cred-proxy hosts go on passthrough" assertion in
test_pipelock_allowlist flips to "they don't", a new
TestIsGitPushRequest exercises the smart-HTTP refusal predicate,
and the gitconfig renderer tests cover the per-host suppression
matrix.
2026-05-13 21:09:33 -04:00
didericis c8ab90d01d fix(manifest): allow token + git on the same host (PRD 0010)
test / unit (pull_request) Successful in 13s
test / integration (pull_request) Successful in 22s
git-gate holds an SSH IdentityFile for push/fetch; cred-proxy holds
a PAT for HTTPS REST API calls. The two brokers are orthogonal —
the common dev setup names both on the same host (e.g. gitea.dideric.is
SSH for push, gitea.dideric.is PAT for `tea pr create`).

The original PRD 0010 wording called this a "configuration smell"
and rejected it at parse time. That was wrong; this drops the
overlap rejection from the validator and updates the PRD prose to
match. Tests flip from "rejection" to "coexistence" assertions.
2026-05-13 16:38:36 -04:00
didericis 07da4366ad test(cred_proxy): integration tests for header inject + strip (PRD 0010)
Drives DockerCredProxy.start through the production code path against
a fake upstream container running on the same egress network. The
"agent" is a curl container on the bottle's internal network — same
access topology the agent uses in production.

Covers PRD 0010 success criteria:
- SC3 (the request reaches upstream, header round-trip works)
- SC6 (inbound Authorization stripped; the proxy injects the
  configured token even when the agent tries to smuggle one in)
- partial SC2 (cred-proxy reachable by the alias from the internal
  network)
- 404 for unconfigured routes

Live-network tests against real Anthropic / GitHub / Gitea / npm
upstreams (SC4 and SC5 specifically) are deferred — the fake-upstream
shape covers the routing + header layer that's actually under test
here.
2026-05-13 16:29:10 -04:00
didericis 051896ba4c feat(pipelock): auto-allowlist cred-proxy upstream hosts (PRD 0010)
bottle.tokens declarations contribute their upstream hosts to both
pipelock's allowlist (so cred-proxy can reach them) and
passthrough_domains (so pipelock doesn't MITM the connection —
cred-proxy validates real upstream certs with the system CA bundle).

Mapping: anthropic -> api.anthropic.com (already on defaults);
github -> api.github.com + github.com; gitea -> the entry's host;
npm -> registry.npmjs.org.
2026-05-13 16:22:44 -04:00
didericis b3529b27a5 feat(cred_proxy): add agent-side provisioner (PRD 0010)
provision_cred_proxy(plan, target) drops:
- ~/.npmrc with registry= pointing at /npm/ on the proxy
- ~/.gitconfig insteadOf rules for github (https://github.com/) and
  per-gitea hosts, appended after provision_git's git-gate rules
- ~/.config/tea/config.yml with a logins: entry per declared gitea
  URL, pointing at /gitea/<host>/ on the proxy

Renderers are pure and unit-tested. The dispatcher reads
plan.cred_proxy_plan.upstreams, which the backend wiring (next
commit) populates on DockerBottlePlan.

ANTHROPIC_BASE_URL is deliberately *not* a dotfile — it goes into
the agent's docker run -e env so claude sees it from process start.
2026-05-13 16:11:04 -04:00
didericis 61e334c1b8 feat(cred_proxy): add DockerCredProxy concrete lifecycle (PRD 0010)
Mirrors DockerGitGate: build the image, docker create on the internal
network with --network-alias cred-proxy, docker cp the routes.json
into /run/cred-proxy/, attach the egress network, docker start. stop()
is idempotent.

Token values flow host env -> subprocess env -> sidecar env via
docker create -e NAME (no =VALUE on argv). The resolver fails early
with a clear pointer at the missing host env var name if any TokenRef
is unset.

Helpers (cred_proxy_container_name, cred_proxy_url) are agent-side
stable: the URL uses the network alias, not the slugged container
name, so the provisioner can write a fixed http://cred-proxy:9099/
URL regardless of which bottle is running.
2026-05-13 16:07:52 -04:00
didericis 3436d8a68a feat(cred_proxy): add HTTP server + sidecar image (PRD 0010)
Stdlib-only Python proxy: reads /run/cred-proxy/routes.json on boot,
listens on 0.0.0.0:9099, strips inbound Authorization, injects the
configured header (Bearer or token) using the route's token_env env
var, forwards over HTTPS to the upstream, and streams the response
back chunk-by-chunk (SSE-safe).

Hop-by-hop headers are stripped per RFC 7230, including anything
listed in `Connection:`. Content-Length is dropped so http.client
recomputes it on the upstream leg. Tokens never reach routes.json —
they arrive via the container's environ.

Dockerfile.cred-proxy builds on python:3.13-alpine pinned by digest;
mkdir /run/cred-proxy is baked in so docker cp can drop the route
table at start time. No pip install layer.

Smoke-tested: container boots, logs listen line, returns 404 for
unmatched paths. Full request/response cycle covered by the
integration tests in a follow-up commit.
2026-05-13 16:05:56 -04:00
didericis 3165fbeafe feat(cred_proxy): add abstract CredProxy + plan (PRD 0010)
Lifts bottle.tokens into a per-route CredProxyUpstream table, renders a
mode-600 routes.json that carries no token values or host env-var
names, and derives the {token_env: TokenRef} map the launch step will
use to forward host env values into the sidecar's environ.

Shape mirrors GitGate/PipelockProxy: abstract base does the host-side
prepare; start/stop is backend-specific. No backend wiring yet.
2026-05-13 16:01:18 -04:00
didericis 930997d0a7 feat(manifest): add bottle.tokens with TokenEntry (PRD 0010)
TokenEntry carries Kind (anthropic / github / gitea / npm), TokenRef
(name of host env var the CLI resolves at launch), and an optional Url
(required for gitea, fixed for the other kinds). Validation rejects
unknown kinds, duplicate non-gitea entries, duplicate gitea Urls, and
overlap with bottle.git hosts (where git-gate is already brokering).

No wiring yet — the field exists on Bottle but cred-proxy is the next
step. Adds tests/unit/test_manifest_tokens.py.
2026-05-13 15:59:00 -04:00
didericis 249e8cc15e test: drop ssh-gate suites and shadow-route assertions (PRD 0009)
- Delete tests/unit/test_ssh_gate.py and the fixture_with_ssh helpers.
- test_pipelock_yaml: drop the ssh-leak guard (structurally
  impossible now); the remaining tests switch to fixture_minimal.
- test_pipelock_allowlist: rewrite the union/dedup test to
  exercise an egress.allowlist that duplicates a baked default
  (the property the ssh-leak assertion was hitching onto).
- test_manifest_git: shadow-route assertion becomes a legacy-ssh-
  dies-with-hint assertion, since bottle.ssh is now parse-fail.
- test_orphan_cleanup: drop the SSHGate.stop idempotency check;
  pipelock equivalent stays.
- test_dry_run_plan: drop assertions on the removed ssh_hosts /
  ssh_gate keys.

52 unit tests pass.
2026-05-12 23:54:22 -04:00