bot-bottle

Author	SHA1	Message	Date
didericis-codex	cea832b21d	fix(codex): stop injecting api key placeholder	2026-05-29 02:39:37 -04:00
didericis-codex	50baf63669	docs(prd): mark PRD 0028 active	2026-05-29 02:27:42 -04:00
didericis-claude	9dc0dfd5ee	docs(prd): PRD 0028 — git-gate new-branch push scan scope git-gate's pre-receive scans the full ancestry of a new branch, so the repo's historical test-fixture findings block every new-branch push (issue #106). Scope the new-ref scan to incoming commits (`$new --not --all`) with no loss of coverage, and harden the forward ssh against hangs. Refs #106 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-29 01:52:07 -04:00
didericis	2ea73e40a8	docs(decisions): ADR 0003 — system prompts stay user-directed Record that we considered auto-generating an agent's system prompt from its bottle's egress/git config (so it would know its access up front) but opted to keep prompts operator-authored: we may want to withhold that information from the agent directly, and the agent can infer its access on its own regardless. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-29 00:40:19 -04:00
didericis	ae1531835d	docs: drop "forge" jargon for concrete Gitea wording We use Gitea, not an abstract forge. Reword the docs added in this branch: "forge thread" -> "Gitea thread", and the research note's generic "forge" -> "Gitea" / "hosting provider" as context demands, keeping its portability argument coherent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	5c5f576df0	docs(research): add README describing research notes Document what research notes are (opinionated investigations of a question/design space), their unnumbered kebab-case naming, and their loose verdict-first shape — explicitly freeform, not a template. Point the AGENTS.md research line at it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	d329e511fd	docs: drop docs/INDEX.md, add PRD README with format Remove the one-line docs/INDEX.md (its directory pointers are covered by docs/README.md's "when to write which document" table). Add docs/prds/README.md documenting the PRD naming, Status lifecycle, and section format. Repoint the AGENTS.md repository-layout list at the new READMEs and add the decisions/ dir. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	1308e61c7e	docs: hoist "when to write which document" to docs/README.md Move the document-type comparison out of docs/decisions/README.md (where it only surfaced if you were already in the decisions dir) up to a new docs/README.md, renamed "When to write which document". Leave a pointer from the decisions README. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	2141a85884	docs(decisions): drop hand-maintained index from README Per review on PR #97: an index that lists every ADR is a sync burden. The files in docs/decisions/ are the index. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	ccbed97776	docs(prd): inline #88 rationale into PRD 0025 Add an "Alternatives considered" section enumerating the design options from issue #88 (duplicate bottles / agent-side bottle_config / bottle-side extends) and why extends won, so the PRD stands without the forge thread. Repoint the two phrases that depended on the #88 comment thread at the new section. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	1df78ee77f	docs(decisions): add ADR-lite decision log Add docs/decisions/ with a convention README and back-fill two decisions that previously had no in-repo home: merging PRs with rebase (ADR 0001) and the agent-identity claimed-not-vouched trust posture from PRD 0027 (ADR 0002). Point docs/INDEX.md at it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	c840182d12	docs(research): issue tracking vs in-repo decision history Analyze tracking feature requests in Gitea against the project's in-repo PRDs/research notes, given the goal of keeping decision history portable and not provider-locked. Recommends demoting issues to an ephemeral inbox and reifying durable rationale into the repo. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	7b4c1cd091	docs: drop "forge" jargon for concrete wording We use Gitea, not an abstract forge. Reword the pre-existing research and PRD docs: the generic "Forge-API gate"/"forge tokens" become "Git-host-API gate"/"Git-host tokens" (the gate still spans Gitea / GitHub / GitLab), "Git/forge history" -> "Git/Gitea history", and the KNOWN_FORGE_HOSTS / forge: manifest-field examples -> KNOWN_GIT_HOSTS / git_host:. Meaning preserved; only the word "forge" is dropped. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 22:57:20 -04:00
didericis	47c3ba63f8	docs(prd): mark merged PRDs as Active Flip Status: Draft -> Active for the 23 PRDs whose work has shipped to main (including 0027, now that PR #95 has merged). Leaves the terminal-status PRDs unchanged: 0007 and 0010 (Superseded) and 0014 (Retargeted) were replaced, not shipped as-is. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 22:12:03 -04:00
didericis	f9e3b6adda	docs(prd): add PRD 0027 agent-level git user identity Lift git.user (name/email) to the agent layer with a per-field overlay onto the referenced bottle, mirroring the extends: merge. git.remotes stays bottle-only. Includes identity provenance in preflight/info and an example collapse. Refs #94 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 20:58:00 -04:00
didericis-codex	18e3b62b72	docs: rename CLAUDE.md to AGENTS.md and rebrand provider-agnostic Delete CLAUDE.md in favor of AGENTS.md as the orientation doc, rebrand the project from Codex-bottle to provider-agnostic bot-bottle, and repoint every CLAUDE.md reference across PRDs, research notes, the implementer agent example, and the yaml_subset comment. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 20:36:47 -04:00
didericis-codex	cdb1870b1c	docs(agent): clarify claude oauth env	2026-05-28 18:20:09 -04:00
didericis-codex	cacba087c9	docs(agent): document provider base bottles Assisted-by: Codex	2026-05-28 18:00:38 -04:00
didericis-codex	1cbedc91c0	refactor(agent): use agent-neutral runtime names Assisted-by: Codex	2026-05-28 17:59:24 -04:00
didericis-codex	c08b09dc9f	refactor!: rename project to bot-bottle Assisted-by: Codex	2026-05-28 17:56:14 -04:00
didericis-codex	500fd910c4	feat(agent): add provider templates Assisted-by: Codex	2026-05-28 02:18:53 -04:00
didericis-codex	e03d90962d	docs(prd): scaffold PRD 0026 — Agent Provider Templates Assisted-by: Codex	2026-05-28 02:05:09 -04:00
didericis-codex	59ee32cc8d	refactor(manifest): key git config by host	2026-05-28 00:49:34 -04:00
didericis-claude	4f7a506a9e	docs(prd): 0025 — bottle composition via `extends:` (issue #88)	2026-05-27 23:27:04 -04:00
didericis-claude	7eda2a66ec	feat(smolmachines): patch smolvm state DB to actually enforce per-bottle allowlist Earlier commit framed this PR as "infrastructure landed, TSI enforcement blocked on upstream smolvm 0.8.0." Found a clean workaround that lets us enforce now. Smolvm persists each machine's config (including `allowed_cidrs`) as a JSON BLOB in `~/Library/Application Support/smolvm/server/smolvm.db`, `vms.data`. `machine create --allow-cidr X/32` silently writes `allowed_cidrs: null` to that row when combined with `--from`, but smolvm reads the row at `machine start` — so patching the row between create and start sets the allowlist for real. New `loopback_alias.force_allowlist(machine_name, cidrs)` opens the SQLite DB, JSON-decodes the row, sets `allowed_cidrs`, and writes back as BLOB (Text type silently corrupts smolvm's later reads). launch.py calls it immediately after `machine_create` and before `machine_start`. Verified end-to-end on macOS / Docker Desktop: VM allowlist after start: ["127.0.0.16/32"] VM → 127.0.0.1:3000 → BLOCKED (Permission denied) VM → 8.8.8.8:53 → BLOCKED (Permission denied) VM → 127.0.0.16:<bundle> → CONNECTED The DB-patch hack is correct only because smolvm reads `allowed_cidrs` from the row at start time (not derived in-process). When upstream honors `--allow-cidr` with `--from`, the call becomes redundant — drop the call and the workaround is gone. Tests: 4 new for `force_allowlist` (BLOB round-trip; Linux no-op; missing DB; missing row). Total 593 unit tests pass. README + PRD updated to reflect the fix landed (no longer "infrastructure pending upstream"). gitea#75 can close. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 16:55:03 -04:00
didericis-claude	a919268d5e	docs: honest framing of upstream smolvm 0.8.0 allowlist bug PR #76 originally claimed the per-bottle alias scoping closed gitea#75 ("agent can reach host loopback"). Verified empirically that's not actually true: `smolvm 0.8.0 machine create --from <smolmachine> --net --allow-cidr X/32` silently drops the allowlist (`agent.config.json` shows `allowed_cidrs: null`, and the running VM reaches all of `127.0.0.0/8` regardless). So the alias-allocation + alias-bind infrastructure is correct pre-work, but the actual TSI enforcement is blocked on an upstream smolvm bug. README + PRD 0023 + the module docstring get reworded to say so plainly. gitea#75 stays open. Workarounds tried (all dead-ends): - `machine update --allow-cidr` doesn't exist - stop-edit-`agent.config.json`-restart fails (smolvm removes the file on stop) - `--smolfile` is mutually exclusive with `--from` - `--image localhost:<port>/...` fails because smolvm's agent process can't reach host loopback during pull When upstream lands a fix, our existing code (alias allocation, port-bind, --allow-cidr in launch) will scope correctly without further changes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 16:37:56 -04:00
didericis-claude	2edc1abb9a	feat(smolmachines): per-bottle loopback alias scopes TSI to single /32 PR #74's Docker-Desktop fix routed the agent through `127.0.0.1:<random>` loopback forwards, but TSI filters by IP only — so the allowlist `127.0.0.1/32` let the agent VM reach any host service on macOS loopback (postgres, dev servers, other bottles' published ports, mDNSResponder, ...). Real downgrade vs the docker backend's `--internal` network. Resolution: per-bottle loopback alias. - New `loopback_alias` module manages a pool of `127.0.0.16` .. `127.0.0.31` on `lo0`. macOS only routes `127.0.0.1` by default; the extras need `sudo ifconfig lo0 alias`. `ensure_pool()` lazily adds the missing entries via one sudo prompt on first launch per reboot — aliases persist on `lo0` until reboot, so subsequent launches skip the prompt entirely. - `allocate(slug)` picks the lowest-numbered unused alias by inspecting running bundle containers' port-binding HostIps. No on-disk reservation — docker is the source of truth. - Bundle bringup binds published ports to the allocated alias (`docker run -p <alias>::<port>`) instead of `127.0.0.1`. - TSI allowlist becomes the alias's /32 — narrows reachability to this bottle's bundle only. - Linux native daemons share the host's network namespace; `127.0.0.0/8` works without aliases, so the module no-ops on non-Darwin and returns `127.0.0.1` from `allocate`. Tracking issue closed: gitea/issues/75. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 16:23:17 -04:00
didericis-claude	45c821a8f3	docs(smolmachines): note loopback-scope limitation + tracking issue PR #74's Docker-Desktop pivot widened the smolmachines TSI allowlist from `<bundle-ip>/32` to `127.0.0.1/32` (TSI can't filter by port, and docker bridge IPs aren't reachable from macOS networking). The agent VM can therefore reach any service on macOS's loopback while the bottle is running — not just the bundle's published ports. README gets a "Smolmachines backend" subsection under Quickstart spelling this out as a known v1 limitation. PRD 0023 grows a new open question #8 with the proposed v2 fix (per-bottle loopback alias + TSI allowlist scoped to that /32, via sudo `ifconfig lo0 alias`). Tracking issue: gitea.dideric.is/didericis/claude-bottle/issues/75. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 15:58:30 -04:00
didericis-claude	1fa17d1822	feat(smolmachines): build agent image from repo Dockerfile (PRD 0023 chunk 4c) Replaces the alpine:latest placeholder with a real claude-bottle agent image, converted into a .smolmachine artifact via an ephemeral local OCI registry. Why the registry hop: smolvm pack create only accepts OCI registry refs. Empirically it rejects docker-daemon://, oci-layout://, docker-archive: tarballs, and every other transport tested — the crane backend treats anything with a scheme prefix as a registry hostname. To convert a locally-built docker image into a .smolmachine we have to push it somewhere smolvm can pull from. Smallest path: bring up registry:2.8.3 bound to 127.0.0.1:<random>, docker tag + docker push into it, smolvm pack create --image localhost:<port>/claude-bottle:<id>, tear down the registry. The .smolmachine is cached under ~/.cache/claude-bottle/smolmachines/ keyed by the docker image ID (first 16 hex chars of the sha256), so a Dockerfile change picks up a new image ID and invalidates the cache. Unchanged rebuilds skip the whole build → registry → pack pipeline. This puts `docker build` in smolmachines prepare (the docker backend defers it to launch). Necessary because pack_create needs the image ID to derive the cache key, and prepare is the only hook ahead of launch that runs once per slug. Adds: - claude_bottle/backend/docker/util.py: image_id / tag / push helpers (thin docker CLI wrappers). - claude_bottle/backend/smolmachines/local_registry.py: ephemeral_registry() context manager; pins registry:2.8.3 by digest, binds 127.0.0.1::5000 (loopback-only), force-removes on exit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 13:51:02 -04:00
didericis-claude	5929caa219	docs(prd-0023): pivot to smolvm + TSI single-IP allowlist Chunk-1's empirical spike against smolvm 0.8.0 contradicted the research note that motivated the gvproxy network design: smolvm exposes no virtio-net-over-unixgram attachment. The first draft's "why gvproxy, not TSI" argument turns out to apply only to `--outbound-localhost-only`, not to TSI generally. New design: - Bundle (PRD 0024) runs on a dedicated per-bottle docker bridge with a pinned IP. Smolfile sets `[network] allow_cidrs = ["<bundle-ip>/32"]` and nothing else. Agent can reach the bundle and nothing else — host loopback, LAN, public internet directly are all refused at the VMM (TSI) layer. - Bind-address mitigation: egress binds 127.0.0.1:9099 inside the bundle (pipelock-internal); pipelock / git-gate / supervise bind 0.0.0.0 so the agent (across the TSI allowlist) can reach them. This is the port-granularity TSI's IP-only allowlist doesn't provide. - Smolfile renderer rewritten in chunk 2 to smolvm 0.8.0's actual schema (image / entrypoint / cmd / env / [network] allow_cidrs). The chunk-1 renderer (name= / [[net]]= under the gvproxy design) emits the wrong shape and will be replaced. - Drop gvproxy + VZFileHandleNetworkDeviceAttachment + the PyObjC fallback. Backend layout loses gvproxy_config.py, gvproxy.py, vfkit_attach.py. - Acceptance plan adds an egress-port-bypass probe in addition to the localhost-reach probe. - Chunks reshape: chunk 1 stays (renderer rewrite is part of chunk 2's cost); chunk 2 covers VM lifecycle + bundle + new Smolfile renderer; chunk 3 is the bundle bind-address change; chunks 4-5 unchanged in spirit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 03:47:03 -04:00
didericis	bce1ea21db	Merge pull request 'docs(prd-0023): smolmachines bottle backend' (#53) from prd-0023-smolmachines-backend into main	2026-05-27 02:16:11 -04:00
didericis	539234f29e	refactor(sidecars): drop vestigial start/stop methods (PRD 0024 chunk 3) Compose-up has owned per-container lifecycle since PRD 0018 ch3; the .start() / .stop() methods on DockerPipelockProxy / DockerEgress / DockerGitGate / DockerSupervise (and their abstractmethod declarations in the four base ABCs) were already documented as vestigial. With the bundle path in flight (PRD 0024 ch2), they are truly dead — collapse to nothing. Changes: - Removed start/stop methods from the four DockerSidecar classes. Plan dataclasses, image/path constants, container-name helpers, and the .prepare() methods all stay (the renderer + apply path still need them). - Removed the matching @abstractmethod declarations in the base ABCs so concrete subclasses don't have to stub them. - launch.launch() and prepare.resolve_plan() no longer take proxy/git_gate/egress/supervise instance parameters. backend.py loses the four instance attributes it threaded through. prepare.resolve_plan() instantiates the four classes itself to call their .prepare() methods. - Deleted four integration tests that only exercised the removed lifecycle: test_pipelock_sidecar_smoke, test_supervise_sidecar, test_git_gate_sidecar, test_git_gate_mirror. - Dropped the .stop-idempotency case in test_orphan_cleanup; the network-cleanup cases stay (those test real production code). - Marked test_pipelock_apply @skip pending chunk 4 — its bringup helper used .start; chunk 4 rewrites it with direct `docker run`. Dockerfile deletion deferred to chunk 5 (when the bundle flag default flips) — the legacy compose path still needs Dockerfile.{egress,git-gate,supervise} until then. Net: 708 lines removed, 80 added. 533 unit tests + 27 integration tests passing (5 skipped: the chunk-4-pending case + existing GITEA_ACTIONS guards). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 01:01:10 -04:00
didericis	62109a1caf	fix(sidecars): child death no longer tears down the bundle Reverses chunk 1's "any unexpected child death tears down the rest" policy. New behavior: a daemon dying is logged but does NOT initiate shutdown — the surviving daemons keep running and whatever the dead one served starts failing visibly on the agent side. The supervisor exits only when (a) it receives SIGTERM/SIGINT, or (b) every child has died on its own. Eventual design is restart-the-dead-daemon plus a notification to the supervise sidecar so the operator sees the event explicitly; this commit ships only the "log and leave alone" half. PRD 0024 open question 1 updated to reflect the new intent. Tests updated: replaced "crash propagates exit code via auto-teardown" with three cases that exercise the new policy (crash without shutdown leaves survivors up, crash-then-signal surfaces the nonzero code, all-children-die-unattended still converges the loop). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-27 00:19:50 -04:00
didericis	1894f621dd	docs(prd-0024): consolidate per-bottle sidecars into a single bundle Replace pipelock + egress + git-gate + supervise as four separate containers with one bundle image (claude-bottle-sidecars) running all four daemons under a small stdlib Python init supervisor. Compose file collapses from five services to two; same daemons, same ports, same protocols, one container. Sized: bundle image + init → renderer collapse (feature-flagged) → backend Python trim → integration sweep → flag removal. Prerequisite for PRD 0023 chunk 3 (smolmachines backend reuses the same bundle as its sole host-side sidecar container). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-26 23:54:29 -04:00
didericis	4e00430c6e	docs(prd-0023): consume PRD 0024's bundle as the single sidecar Replace the four host-side sidecar processes (pipelock + egress + git-gate + supervise) with a single bundled container per bottle, defined in PRD 0024 and consumed here. egress is internal to the bundle as pipelock's upstream; only pipelock, git-gate, and supervise are externally addressable, and only when the bottle uses them. gvproxy port_forwards collapse from one-per-process to one-per-external-port, all pointing into the one bundle container. Sizing: chunk 3 becomes "sidecar bundle lifecycle" and depends on PRD 0024 having landed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-26 23:51:57 -04:00
didericis	041da1d7af	docs(prd-0023): make gvproxy the network primitive; reject TSI TSI's --outbound-localhost-only is permissive on all of 127.0.0.0/8 with no destination-port filter, so any host loopback service (local Postgres, IDE plugins, another bottle's sidecar) is reachable from the guest. That's the wrong default for the malicious-agent threat model. Reworked the network design around gvproxy + VFKT unixgram attachment: the guest gets a virtio-net device, gvproxy is the userspace TCP/IP stack on the host side, and the only thing reachable from the guest is the explicit port-forward list (typically just pipelock). Host LAN, host loopback, and the public internet directly are gone by construction. VMM choice (smolmachines vs PyObjC + Virtualization.framework) is an open question contingent on whether libkrun's virtio-net mode lets us point at a custom unixgram socket. Backend name stays "smolmachines" either way per the original spec. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-26 23:41:32 -04:00
didericis	a2ac124d5c	docs(prd-0023): smolmachines bottle backend Specs a second concrete BottleBackend selectable via CLAUDE_BOTTLE_BACKEND=smolmachines: per-agent libkrun microVM on macOS, sidecars relocated to host-side loopback ports plumbed via Smolfile env, PRD 0022's sandbox-escape suite as the acceptance gate (the env-var flip is the only change required). Docker backend ships unchanged and remains default. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-26 23:19:08 -04:00
didericis	1111ced04d	docs(prd-0022): resolve remaining open Qs All seven open questions now have decisions baked in: - Q1 (HTTP-exfil scope): authoritative. Every shape MUST block; chunk 3 expands into remediation sub-PRDs if any of path/query/header leak today. - Q3 (fake secret): multiple shapes, parameterized. Three env vars (TEST_SECRET_ANTHROPIC, _AWS, _GENERIC); test 5 loops via subTest. Resilient to gitleaks rule renames. - Q6 (missing backend): die. `get_bottle_backend()`'s current behavior surfaces clearly; surprise-skips are worse than loud failures for new-backend branches. - Q7 (tool deps): preflight check. setUpClass runs `which curl && which git && which dig`; SkipTest with the missing list catches future backends shipping thinner base images. Updated implementation chunks + test-5 sketch to match. No remaining open questions.	2026-05-26 22:11:32 -04:00
didericis	73939861f9	docs(prd-0022): resolve open Qs 2, 4, 5 (DNS, gitleaks order, CI) User feedback: - Q2 (direct DNS resolver test): yes — test 4 grows a second sub-assertion verifying `dig @8.8.8.8` from the agent has no path out, alongside the existing crafted-subdomain check. - Q4 (gitleaks ordering): test 5 grows an ordering check — asserts the rejection mentions `gitleaks` AND does NOT mention upstream-network-phase phrases (resolve / refused / unreachable / upstream). Confirms gitleaks rejects BEFORE git-gate tries any upstream push. - Q5 (CI): try it, accept fallback. New chunk 6 adds a Gitea Actions job marked `continue-on-error: true` — runs the suite if the runner can host compose, doesn't block the workflow if docker-in-docker prevents it. Three open questions remain (1: pipelock's actual DLP coverage for non-body shapes; 3: realistic fake secret shape vs. gitleaks regex; 6+7: backend-agnostic invocation + required tools — for the smolmachines work).	2026-05-26 22:04:46 -04:00
didericis	62f6716e8d	docs(prd-0022): end-to-end sandbox-escape integration test Draft a PRD for a composite integration test that brings up a real bottle with a known allowlist + planted secret and runs five attacks from inside the agent container: 1. Request to non-allowlisted hostname 2. Request to non-allowlisted IP (incl. host-header spoof) 3. Secret exfil via HTTP — path / query / body / headers 4. Secret exfil via crafted DNS subdomain 5. Secret exfil via README link pushed through git-gate Each attack passes only when blocked with a permissions error. The suite is backend-agnostic — runs against whatever CLAUDE_BOTTLE_BACKEND selects — so it becomes the gate the upcoming smolmachines spike has to pass before that backend can substitute for Docker. Sized into 5 chunks (fixture → attacks 1+2 → attack 3 → attack 4 → attack 5). Seven open questions called out, biggest being: today's pipelock probably leaks via header / path / query because DLP only scans bodies — the test will expose this as a real gap (chunk 3 lands with `expectedFailure` markers if so).	2026-05-26 21:52:24 -04:00
didericis	e5316be454	docs(prd-0021): rewrite as standalone — no references to closed PR #48 PR #48 closed; treat the implementation as starting from main, where no tmux integration exists yet. The PRD now describes the full design (including the `_in_tmux` detection + helper scaffolding) as fresh work. Sized into 4 chunks: `claude_docker_argv` refactor → tmux helpers + pane state + `_attach_to_bottle` dispatch → new-agent flow → stop + indicator. Same design as before — opt-in by `\$TMUX`, split-window-then-respawn, falls back to handoff on tmux failure or missing binary. No external references to PR #48.	2026-05-26 14:18:24 -04:00
didericis	8b8d668602	docs(prd-0021): dashboard as left tmux pane, selected agent as right pane Draft a PRD that tightens PR #48's tmux integration from "one new window per attach" to "one persistent right pane that the dashboard's selection drives." Inside tmux (`\$TMUX` set): dashboard in the left pane; pressing Enter or `n` spawns claude in the right pane via `tmux split-window` on first attach, then `tmux respawn-pane` on subsequent attaches so the operator-focused agent is always the visible one. Outside tmux: falls back to today's handoff. Opt-in by environment; no flag. Sized into 4 chunks (pane state + create → respawn → stop integration → supersede PR #48's new-window). Seven open questions called out, the biggest being whether the dashboard should auto-exec into a fresh tmux session when launched outside one (v1 says no — operators start tmux themselves).	2026-05-26 14:14:02 -04:00
didericis	26322bdfd5	docs(prd-0020): record answers to open questions, switch to no-teardown-on-quit	2026-05-26 03:10:26 -04:00
didericis	ec20293c0a	docs(prd-0020): start + attach to agents from the dashboard Draft a PRD that turns the dashboard into the operator's single surface — collapses today's two-terminal workflow (one for `./cli.py start`, one for `./cli.py dashboard`) into a single dashboard invocation that can spin up new agents, re-attach to ones it already spun up, and explicitly stop them. Picks the "handoff" mechanism from `docs/research/claude-code-pane-in-dashboard.md` (curses.endwin → docker exec -it claude → stdscr.refresh) and crucially decouples the bottle's lifetime from any single claude session: exit claude → back to dashboard with the bottle still running; quit dashboard → tear down every bottle the dashboard owns. Sized into 5 chunks (refactor → picker + new-agent → re-attach → explicit stop → quit-cleanup). Seven open questions called out, the biggest being modal-vs-drop-and-resume for the preflight Y/N inside curses.	2026-05-26 02:59:42 -04:00
didericis	8cd867f3d2	docs(research): claude-code pane in the dashboard. Survey the three realistic ways to surface a claude-code session inside the dashboard TUI: 1. Handoff — drop curses, foreground claude, restore on exit (the existing `e`/`p` pattern, extended). Minimal code, side-by-time rather than side-by-side. 2. Embedded emulator — own a PTY, parse claude-code's ANSI stream via `pyte`, paint it into a curses pane. Real "pane in the dashboard" but a six-week build with one new dep and several integration trap-doors (alt-screen, resize, input routing, multi-PTY state). 3. External multiplexer — delegate pane creation to tmux / iTerm / wezterm when detected. Tiny code, but splits the operator's mental model and gives up layout control. Recommendation: ship Option 1 first; defer Option 2 to "only if Option 1 is observably insufficient"; treat Option 3 as a niche augmentation for power users. Calls out four followups worth verifying before committing (PTY behavior at small sizes, attach-to-existing-exec, SIGWINCH handling, `-it` vs `-i` for the embedded path).	2026-05-26 02:51:08 -04:00
didericis	9c9c32a941	docs(prd-0019): drop e/p fallback — selection-only, no-op otherwise. When no agent is selected, `e` / `p` do nothing (status line shows "no agent selected") rather than falling back to today's global discover-and-prompt. The discover-and-prompt scaffolding in `_operator_edit_routes_flow` / `_operator_edit_allowlist_flow` comes out entirely — selection in the agents pane is now the only way to scope an edit. Old open-question #4 (single-bottle shortcut behavior in proposals-pane mode) is moot and removed.	2026-05-26 01:03:23 -04:00
didericis	9539982d3f	docs(prd-0019): active agents in dashboard + agent-scoped edit verbs. Draft a PRD that adds an "active agents" pane to the dashboard TUI (below the existing proposals pane) and reshapes the operator `routes edit` (e) / `pipelock edit` (p) verbs to be agent-scoped when the cursor is in the agents pane — no more global discover + disambiguation prompt on every press. Tab toggles which pane nav keys move through. Sized into 4 chunks (discovery helper → render pane → selection state → agent-scoped verbs). Six open questions called out, the biggest being whether per-bottle `compose ps` on every 1s tick scales for hosts with many bottles (answer leans toward one label-filtered `docker ps`).	2026-05-26 00:58:34 -04:00
didericis	3386cabe62	docs(prd-0018): resolve TTY open question — keep exec -it	2026-05-25 22:34:26 -04:00
didericis	3251ee1394	docs(prd-0018): one compose project per bottle instance. Draft a PRD that replaces the chain of per-sidecar docker SDK calls in `claude-bottle start` with a single `docker compose` project per instance. Each `state/<slug>/` dir gets a self-describing set of artifacts: metadata.json, docker-compose.yml, compose.log, and the existing transcript/ + live-config/.	2026-05-25 22:15:32 -04:00
didericis	9cd583fbbb	feat(egress-proxy): retarget remediation at egress-proxy (PRD 0017 chunk 3). Finishes PRD 0017. The `cred-proxy-block` MCP tool is renamed and its remediation apply path is repointed at egress-proxy. - `claude_bottle/supervise.py` — `TOOL_CRED_PROXY_BLOCK` → `TOOL_EGRESS_PROXY_BLOCK`; `COMPONENT_FOR_TOOL` maps the new tool ID to `egress-proxy` for audit-log routing. - `claude_bottle/supervise_server.py` — tool definition renamed + description rewritten: "Call when egress-proxy refused your HTTPS request ... Read the current routes.yaml from /etc/claude-bottle/current-config/routes.yaml, compose a modified version, pass the full new file plus a justification." The syntactic validator dispatches on the new tool ID. - `claude_bottle/backend/docker/egress_proxy_apply.py` — renamed from `cred_proxy_apply.py`. Reads routes.yaml from /etc/egress-proxy/routes.yaml via `docker exec cat`; validates via `egress_proxy_addon_core.load_routes` (so both sides use the same parser); writes via `docker cp`; SIGHUPs egress-proxy with `docker kill --signal HUP`. `EgressProxyApplyError` replaces `CredProxyApplyError`. - `claude_bottle/cli/dashboard.py` — wires the new apply + `discover_egress_proxy_slugs` helper; the operator-initiated `routes edit <bottle>` verb now writes to egress-proxy with `.yaml` suffix. Stale follow-up comment about path-aware filtering removed — PRD 0017 settled that question. - `tests/integration/test_supervise_sidecar.py` — restores the approval round-trip test (chunk 2 had switched it to a reject path because no cred-proxy existed). Approval stubs `apply_routes_change` so the test focuses on the supervise queue/response plumbing rather than docker-exec into a real egress-proxy sidecar (that's covered separately). - `tests/unit/test_egress_proxy_apply.py` — rewritten against the new validator; covers JSON shape, missing routes key, partial-auth-pair rejection (the addon-core parser catches these before SIGHUP). - PRDs 0010 + 0014 — status headers updated to Superseded / Retargeted with a callout block pointing at PRD 0017's migration section. Historical text preserved. 384 unit + integration tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 15:13:44 -04:00

122 Commits