bot-bottle

Author	SHA1	Message	Date
didericis-claude	8f05226a4a	docs(research): local ollama deployment, harness selection, and model sizing	2026-06-04 01:26:11 +00:00
didericis	ae1531835d	docs: drop "forge" jargon for concrete Gitea wording We use Gitea, not an abstract forge. Reword the docs added in this branch: "forge thread" -> "Gitea thread", and the research note's generic "forge" -> "Gitea" / "hosting provider" as context demands, keeping its portability argument coherent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	5c5f576df0	docs(research): add README describing research notes Document what research notes are (opinionated investigations of a question/design space), their unnumbered kebab-case naming, and their loose verdict-first shape — explicitly freeform, not a template. Point the AGENTS.md research line at it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	c840182d12	docs(research): issue tracking vs in-repo decision history Analyze tracking feature requests in Gitea against the project's in-repo PRDs/research notes, given the goal of keeping decision history portable and not provider-locked. Recommends demoting issues to an ephemeral inbox and reifying durable rationale into the repo. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 23:05:02 -04:00
didericis	7b4c1cd091	docs: drop "forge" jargon for concrete wording We use Gitea, not an abstract forge. Reword the pre-existing research and PRD docs: the generic "Forge-API gate"/"forge tokens" become "Git-host-API gate"/"Git-host tokens" (the gate still spans Gitea / GitHub / GitLab), "Git/forge history" -> "Git/Gitea history", and the KNOWN_FORGE_HOSTS / forge: manifest-field examples -> KNOWN_GIT_HOSTS / git_host:. Meaning preserved; only the word "forge" is dropped. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 22:57:20 -04:00
didericis-codex	18e3b62b72	docs: rename CLAUDE.md to AGENTS.md and rebrand provider-agnostic Delete CLAUDE.md in favor of AGENTS.md as the orientation doc, rebrand the project from Codex-bottle to provider-agnostic bot-bottle, and repoint every CLAUDE.md reference across PRDs, research notes, the implementer agent example, and the yaml_subset comment. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-28 20:36:47 -04:00
didericis-codex	cdb1870b1c	docs(agent): clarify claude oauth env	2026-05-28 18:20:09 -04:00
didericis-codex	c08b09dc9f	refactor!: rename project to bot-bottle Assisted-by: Codex	2026-05-28 17:56:14 -04:00
didericis	8cd867f3d2	docs(research): claude-code pane in the dashboard Survey the three realistic ways to surface a claude-code session inside the dashboard TUI: 1. Handoff — drop curses, foreground claude, restore on exit (the existing `e`/`p` pattern, extended). Minimal code, side-by-time rather than side-by-side. 2. Embedded emulator — own a PTY, parse claude-code's ANSI stream via `pyte`, paint it into a curses pane. Real "pane in the dashboard" but a six-week build with one new dep and several integration trap-doors (alt-screen, resize, input routing, multi-PTY state). 3. External multiplexer — delegate pane creation to tmux / iTerm / wezterm when detected. Tiny code, but splits the operator's mental model and gives up layout control. Recommendation: ship Option 1 first; defer Option 2 to "only if Option 1 is observably insufficient"; treat Option 3 as a niche augmentation for power users. Calls out four followups worth verifying before committing (PTY behavior at small sizes, attach-to-existing-exec, SIGWINCH handling, `-it` vs `-i` for the embedded path).	2026-05-26 02:51:08 -04:00
didericis	5e8ca21669	docs: replace stale bash-first framing with Python-stdlib-first The project started life as bash scripts and got rewritten to Python (documented in docs/research/bash-vs-python-vs-go.md). Several docs still carried the old "bash-first" framing — misleading for anyone reading them now (8.7k lines of Python vs. ~130 lines of bash, all in scripts/demo*.sh). - CLAUDE.md "What this is" + "Conventions": orchestrator is Python, posture is stdlib-first. - docs/prds/0010-cred-proxy.md, docs/research/manifest-format-and-grouping.md: quoted CLAUDE.md's old wording — re-quote. - docs/research/built-in-supervisor-design.md, landscape-containerized-claude.md, agent-sandbox-landscape.md, pipelock-assessment.md, network-egress-guard.md: drop "bash-first" claims about the project, keep accurate descriptions of external tools' bash usage. Leaves untouched: bash code-fence syntax in examples, README's literal `bash scripts/demo.sh` invocation (the demo IS bash), Claude Code's "Bash tool" references, IVIJL/devbox bash description (that project actually is bash), and the bash-vs-python-vs-go research note that records the rewrite decision. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 06:32:42 -04:00
didericis	4cce535008	docs(research): drop auto-respawn from the supervisor design The autonomous "review comment → respawn bottle with comment as next prompt" loop is the one feature that opens a prompt-injection vector the bottle wall can't close (a public commenter would get to issue instructions inside the agent's perimeter on every launch). The available mitigations — commenter allowlists, prompt-injection regex screens, private-repo defaults — are all soft. The durable defense is to keep the human between the review comment and any next agent prompt. So `supervise` is now strictly notify-only. The `auto_respawn` manifest field, the "with auto_respawn: true" behavior paragraph, and the matching trust-model edge case all go. The reasoning stays in the "Where to be conservative" bullet so the decision isn't re-litigated later.	2026-05-25 04:19:50 -04:00
didericis	afbb77b040	docs(research): built-in supervisor design (TUI + PR feedback)	2026-05-25 04:19:50 -04:00
didericis	1f9722ae27	docs(research): add Betterleaks switching analysis	2026-05-24 23:59:42 -04:00
didericis	c33930290f	docs(research): survey gitleaks dashboards + add baseline-file primitive	2026-05-24 23:54:46 -04:00
didericis	a74dd2b97f	docs: research on git-gate commit approval; link from PRD 0012	2026-05-24 23:39:17 -04:00
didericis	da969a503d	docs(research): manifest format + grouping options Captures the two open questions surfaced by PRD 0011: should bottles and agents stay grouped in one file or split per file, and should the format stay JSON or move to YAML / MD-with-frontmatter. Recommends per-file MD-with-frontmatter (with agents shaped close to Claude Code's subagent spec so they can drop into ~/.claude/agents/ as a side effect), explicitly flags the PyYAML runtime dependency as a user-decision crossing the project's "low deps by default" line, and leaves several other choices (hidden dotdir vs visible, migration tooling) as open questions. Companion to docs/prds/0011-cwd-manifest-trust-boundary.md (which solves the trust problem at the resolver layer); this doc explores a structural alternative that would make the boundary self-documenting on disk.	2026-05-24 21:12:43 -04:00
didericis	00649d27e9	docs(research): add credential-proxy landscape and DLP-minimization framing Consolidates oauth-token-exposure-to-claude.md and tea-token-isolation-via-proxy.md into agent-credential-proxy-landscape.md, adding a May-2026 survey of existing tools (Docker AI Sandboxes, Cloudflare Sandbox Auth, Infisical Agent Vault, nono, Aembit, LiteLLM CVE-2026-42208, Portkey, Helicone, etc.) and a build-vs-adopt verdict. Adds secret-minimization-over-dlp.md explaining why pipelock's body DLP and gitleaks's pre-receive scan cannot stop encoding/splitting exfil, and why moving credentials out of the bottle (the git-gate pattern, generalized) is the only robust answer. Updates git-secret-scanning-hardening.md's reference to point at the new consolidated landscape doc. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 23:25:12 -04:00
didericis	96d2c7b7a1	docs(research): add note on git secret-scanning as defense-in-depth Threat-models the case where a credential ends up in a tracked file and is git-pushed to a public remote — the secret is compromised the instant the push lands (events API, scrapers), not at merge time. Recommends gitleaks as the smallest-blast-radius layer to add: Go binary, MIT, offline, scans full history, hookable from the existing .githooks/. No code or workflow change; just the research note.	2026-05-12 16:24:06 -04:00
didericis	6716f091c1	docs(prd): add 0006, enable pipelock's native TLS interception Supersedes the abandoned PR #8 (`mitmproxy-tls-interception`), which built a mitmproxy + addon chain on the (falsified) premise that pipelock could not MITM. Empirical proof from the impl-time spike: with `tls_interception: { enabled: true, ca_cert, ca_key }` in pipelock's config, pipelock answered a credential POST over HTTPS with `STATUS=403 / body: blocked: request body contains secret: GitHub Token` and emitted both `scanner:"tls_intercept"` and `scanner:"body_dlp"` events. Standalone, no second proxy. Net change vs PR #8: one sidecar instead of two, no vendored addon, no addon-verdict pattern matching, no HTTPS-trust / DNS / lookup workarounds. Same end-state behavior — pipelock's DLP fires on plaintext for HTTPS hosts in the allowlist. Also cleaning up the now-stale TLS-research notes: - `docs/research/tls-mitm-for-pipelock.md` is removed. Its entire premise (mitmproxy in front of pipelock) is moot now that pipelock does the work natively. The mechanics of CONNECT bumping and the CA-lifecycle considerations it documented are the same as what pipelock implements; the PRD restates the parts that matter for the integration. - `docs/research/pipelock-assessment.md` had two stale claims corrected: the "Pipelock does not perform TLS inspection (no CA trust injection)" line in §Scope gaps and the "no TLS termination" cell in the comparison table. Both now point at the `tls_interception` config and `pipelock tls` CLI instead. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 14:15:44 -04:00
didericis	8e261563dc	docs(research): TLS interception topologies for pipelock content scanning Survey of TLS-MITM tools (mitmproxy, Squid+ssl_bump, Go libraries) and five candidate topologies for adding TLS termination to the egress path so pipelock's DLP, subdomain-entropy, and MCP scanners can fire on plaintext bodies. Recommends mitmproxy in front of pipelock for v1 with a per-bottle ephemeral CA. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 11:41:34 -04:00
didericis	b97807ac71	docs(research): evaluate smolmachines as VM backend Compares smolmachines against the six subsystems in agent-vm-isolation.md. smolmachines replaces the microVM runtime, network attachment (libkrun TSI with built-in DNS-over-vsock filter), vsock control plane, and Python lifecycle wrapper. Pipelock stays; disk-image story shifts to OCI + writable overlay. Recommends adopting smolmachines as the macOS VM backend after smoke-testing TSI passthrough to a host-side pipelock.	2026-05-11 16:32:04 -04:00
didericis	aba9a823ba	docs(research): document macOS agent VM isolation approach Transcript-style notes on running an agent in a hardware-isolated microVM on macOS. Covers Virtualization.framework / vfkit / libkrun choices, hardware-isolation guarantees, driving VMs from Python (subprocess or PyObjC), pipelock as the egress proxy, vsock for the control channel, and egress enforcement via VZFileHandleNetworkDeviceAttachment + gvisor-tap-vsock.	2026-05-11 16:31:40 -04:00
didericis	08159e1031	docs(research): survey AI-agent sandbox tools Compares claude-bottle to endo-familiar, litterbox, agent-safehouse, matchlock, tilde.run, boxlite, microsandbox, and smolmachines. Covers isolation primitive, locality, agent integration, network policy, and maturity, and notes three borrowable ideas (per-use SSH confirmation, in-flight secret injection, microVM backend) that fit the current bash-first / local-Docker stance.	2026-05-11 15:56:23 -04:00
didericis	7e0e256370	docs: add research note on polish priorities to close the maturity gap Captures the ranked list of changes that would move the project from "works for me" toward the perceived maturity of comparable tools — onboarding friction, error messages, distribution, versioning, schema validation, starter library, docs site, cross-platform CI. Includes effort estimates and an explicit "what polish is not" section so the roadmap doesn't drift into feature work. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 20:38:44 -04:00
didericis	e1efc64862	docs: add research note on Apple container as an alternative backend Captures the surface area of the current Docker integration, how it maps to Apple's `container` framework, the dominant networking risk (pipelock multi-network attach), and the cost difference between a faithful port and a simplified VM-firewall variant. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 02:36:11 -04:00
didericis	1e6f254db5	docs: add research note comparing bash, Python, and Go for the CLI Captures the reasoning for staying on Python, the conditions under which a Go rewrite would pay for itself, and why bash isn't viable at the project's current size. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 02:34:40 -04:00
didericis	ec6261cd77	docs: add Fly Machines case study to remote-docker-vm-isolation note Concrete worked example covering image strategy (with the bake-the-claude-bottle-image-in optimization that elides 30-90s of in-VM build), cold/warm/hot boot-to-prompt timing, standby vs ephemeral cost breakdown, three workflow patterns, and Fly-specific gotchas (DinD kernel requirements, the y/N preflight blocking automated launch, pricing-may-have-moved hedge). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 01:18:08 -04:00
didericis	43453c66ea	docs: add research note on remote Docker VM as an isolation upgrade Argues that running claude-bottle unchanged on a remote Linux VM with dockerd is the cheapest practical path to stronger isolation than local Docker — preserves the v1 pipelock topology, requires zero code changes, and shrinks the agent's blast radius from the developer laptop to a disposable VM. Cross-references the existing stronger-isolation-alternatives and local-vs-remote-agent-execution notes so the research set composes cleanly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 01:07:17 -04:00
didericis	7986f2bd23	docs: add research note on stronger isolation alternatives Surveys gVisor, Kata, Firecracker, and Apple Container as replacements or complements to Docker+runc, with concrete file-level migration notes for this codebase and a recommended rung-by-rung path. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 00:38:46 -04:00
didericis	cc5e772519	docs: replace stale .sh paths with claude_bottle/.py equivalents Cleans up references to the pre-refactor bash layout (cli.sh, lib/.sh, scripts/*.sh) across README, Dockerfile, the pipelock PRD, and research notes. Refreshes line numbers in the oauth-token note against the current cli/start.py. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 00:27:25 -04:00
didericis	08597ebcf8	docs: add redundancy analysis to pipelock assessment Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-08 00:25:01 -04:00
didericis	b36e6da0b3	docs: add research note assessing pipelock for egress/exfil control Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-08 00:15:11 -04:00
didericis	c74bd5cf26	docs: add research note on multi-encoding secret exfil tripwires Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-08 00:00:51 -04:00
didericis	bc7f506311	docs: add research note on isolating tea token via proxy Investigates whether the Gitea `tea` CLI can be authenticated via a header-injecting proxy so the token never enters the container — even as an env var. Parallels the OAuth-token research note. Recommends an in-container root-owned reverse proxy as the lowest-friction shape, and flags the unavoidable tradeoff that the agent retains the token's full API scope (no exfil ≠ no harm). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 23:30:06 -04:00
didericis	edf79b3880	docs: add research note on container network egress guards Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 23:27:18 -04:00
didericis	7a38b8da23	docs: add research note on OAuth token exposure to claude Walks the current `docker run -e CLAUDE_CODE_OAUTH_TOKEN` flow, why claude can read the token trivially via its Bash tool, why no Linux primitive hides an env var from its own process, and why a root-owned localhost auth-injecting reverse proxy (paired with an egress allowlist) is the realistic mitigation. Documents `ANTHROPIC_BASE_URL` caveats (SSE, header passthrough, issue #36998, out-of-band traffic).	2026-05-07 23:24:39 -04:00
didericis	9b4ff29f49	docs: add research note on revoking Claude Code OAuth tokens Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 23:13:42 -04:00
didericis	c45f384fb8	Initial commit	2026-05-07 22:45:36 -04:00

38 Commits