refactor!: rename project to bot-bottle

Assisted-by: Codex
2026-05-28 17:56:14 -04:00
parent 8875d8cc17
commit c08b09dc9f
200 changed files with 1271 additions and 1271 deletions
@@ -3,7 +3,7 @@
Consolidated research on running an auth-header-injecting proxy in
front of an AI agent so API tokens stay out of the agent's process
space. Folds in the per-service mechanics for the Anthropic OAuth
token and the Gitea PAT — the two cases claude-bottle hits first —
token and the Gitea PAT — the two cases bot-bottle hits first —
and surveys existing tools as of May 2026.
Companion to
@@ -15,7 +15,7 @@ the biggest credential risk).
## Summary
Today every claude-bottle agent gets `CLAUDE_CODE_OAUTH_TOKEN` (and
Today every bot-bottle agent gets `CLAUDE_CODE_OAUTH_TOKEN` (and
any `bottle.env` secrets like a Gitea PAT) injected as env vars,
which means the agent process can read them with `printenv` or
`/proc/self/environ`. A prompt-injected or hijacked agent can ship
@@ -28,11 +28,11 @@ level via `ptrace_may_access`; a future smolmachines backend
enforces it harder, at the VM line.
Several existing tools implement this pattern, but none of them are
a clean drop-in for claude-bottle today: the most architecturally
a clean drop-in for bot-bottle today: the most architecturally
aligned (nono) is alpha; the most mature open-source
(Infisical Agent Vault) requires TLS MITM and would double up on
pipelock's TLS-interception stack. For the Anthropic-token slice, a
small claude-bottle-specific reverse proxy modeled on the
small bot-bottle-specific reverse proxy modeled on the
phantom-token shape is probably the right call. For Gitea / GitHub /
GitLab, the same proxy generalizes by config.
@@ -49,7 +49,7 @@ the caller's UID/GID don't match the target's and the caller lacks
`CAP_SYS_PTRACE` or `CAP_PERFMON`. A `node`-uid claude attempting to
read a root-owned proxy's environ gets `EACCES`. Escape hatches
(`--cap-add=SYS_PTRACE`, `--cap-add=PERFMON`, `--privileged`) are
not used by claude-bottle. Yama `ptrace_scope` is irrelevant — it
not used by bot-bottle. Yama `ptrace_scope` is irrelevant — it
only relaxes the *same-UID* relationship check; the cross-UID
match requirement still blocks the read. On a smolmachines backend
the boundary becomes the VM line; same property, harder.
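
A minimal sketch of how the cross-UID property above could be probed from inside the agent container, assuming a Linux-style `/proc` layout. The helper name `environ_readable` and the `proc_root` parameter are hypothetical and exist only for illustration; they are not part of bot-bottle.

```python
import os


def environ_readable(pid: int, proc_root: str = "/proc") -> bool:
    """Return True if this process can read <proc_root>/<pid>/environ.

    Hypothetical helper: a False result for a live process is the
    outcome the text describes for a cross-UID read without
    CAP_SYS_PTRACE (the kernel refuses with EACCES).
    """
    path = os.path.join(proc_root, str(pid), "environ")
    try:
        with open(path, "rb") as handle:
            handle.read(1)
        return True
    except PermissionError:
        # EACCES: the ptrace access check denied the read.
        return False
    except (FileNotFoundError, ProcessLookupError):
        # The process exited, or proc_root is not a procfs mount.
        return False
```

Run as the `node` user with the PID of a root-owned proxy, a check like this would be expected to return `False`; a `True` result would suggest one of the escape hatches is in effect.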
@@ -77,8 +77,8 @@ The remaining credible designs reduce to three:
### Anthropic / Claude Code
**Today's wiring** (`claude_bottle/cli/start.py`): the host's
`CLAUDE_BOTTLE_OAUTH_TOKEN` is forwarded into the bottle as
**Today's wiring** (`bot_bottle/cli/start.py`): the host's
`BOT_BOTTLE_OAUTH_TOKEN` is forwarded into the bottle as
`CLAUDE_CODE_OAUTH_TOKEN` via `docker run -e CLAUDE_CODE_OAUTH_TOKEN`
(no `=value`, so the value never lands on argv — good). Inside the
bottle, claude runs as `node` (UID 1000) with
@@ -128,7 +128,7 @@ never the token.
A hijacked claude could exfil the captured token (or any other
data) through any of these even with the proxy in place. Pair
the proxy with an explicit egress allowlist for the full benefit
(claude-bottle does this via pipelock).
(bot-bottle does this via pipelock).
- **Token refresh**: `claude setup-token` issues a ~1-year OAuth
token with no client-side refresh, so a static proxy value is
fine. The flip side is a one-year blast radius if the token leaks
@@ -138,7 +138,7 @@ never the token.
rewriting is safe.
- **`--bare` mode** reads only `ANTHROPIC_API_KEY`, not
`CLAUDE_CODE_OAUTH_TOKEN`. Not relevant to the interactive flow
claude-bottle ships, but worth noting if `--bare` is ever wired in.
bot-bottle ships, but worth noting if `--bare` is ever wired in.
### Gitea (`tea` + git HTTPS)
@@ -191,7 +191,7 @@ mitigation. Either composes cleanly with the same proxy.
## Proxy architectures
Four shapes worth comparing. The first is the lowest-friction
match for claude-bottle today.
match for bot-bottle today.
| Shape | Pros | Cons |
|---|---|---|
@@ -200,7 +200,7 @@ match for claude-bottle today.
| **Host-side proxy** | Token stays entirely outside the Linux VM. This is the Docker AI Sandbox shape. | A host daemon to maintain; the published port is reachable by any container on the host unless firewalled. UDS-across-VM doesn't work on Docker Desktop on macOS (no AF_UNIX `connect()` over the VM), but `host.docker.internal:<port>` over TCP works fine. |
| **Sidecar container** | Clean isolation; portable across hosts. Matches the existing pipelock / ssh-gate / git-gate topology. | Another container to orchestrate per agent; the token is in another container's env, which is a lateral move unless the sidecar runs with stricter isolation than the agent container does. |
For claude-bottle today — local Docker, per-agent containers, the
For bot-bottle today — local Docker, per-agent containers, the
root-owned-helper pattern already established by the SSH agent —
the **in-container reverse proxy** is the lowest-friction option
that gives the desired property. The sidecar-container shape is
@@ -214,7 +214,7 @@ Two categories:
- **A. Generic LLM / API gateways** that happen to support credential
injection as a side feature.
- **B. Purpose-built agent credential brokers** — newer, closer to
what claude-bottle wants.
what bot-bottle wants.
| Tool | Category | License | Topology | Injection mechanism | `ANTHROPIC_BASE_URL` compatible | Per-route allowlist | Maturity |
|---|---|---|---|---|---|---|---|
@@ -235,7 +235,7 @@ Two categories:
### Cluster commentary
- **The phantom-token pattern** (nono) is the cleanest architectural
fit for claude-bottle. The agent receives a per-session
fit for bot-bottle. The agent receives a per-session
cryptographically random token scoped to the localhost proxy;
the proxy validates and swaps for the real upstream credential.
No TLS interception, no CA trust setup, works directly with
@@ -275,7 +275,7 @@ is a bet on the project rather than a buy-vs-build win.
**Most mature OSS purpose-built:** Infisical Agent Vault. MIT,
v0.19.0 active, v0.17.0 added a containerized agent mode that
maps directly to claude-bottle. Friction is the TLS-MITM topology
maps directly to bot-bottle. Friction is the TLS-MITM topology
— another container-local CA, the Go-loopback workaround,
duplication with pipelock's existing TLS interception layer.
+8 -8
@@ -4,13 +4,13 @@ A broader survey than [`landscape-containerized-claude.md`](landscape-containeri
which focused on Claude-Code-specific containerizers. This one covers
general AI-agent sandbox / containment projects — some Claude-specific,
some agent-agnostic, some hosted SaaS — and contrasts them with
claude-bottle's design.
bot-bottle's design.
Research conducted 2026-05-11.
## Summary
Eight projects surveyed. None duplicate claude-bottle's combination of
Eight projects surveyed. None duplicate bot-bottle's combination of
local Docker, declarative JSON manifest, per-agent egress allowlist via
pipelock, and bottle/agent split. Two clusters stand out:
@@ -157,7 +157,7 @@ plausible without a heavy stack.
## Comparison table
| Axis | claude-bottle | endo-familiar | litterbox | agent-safehouse | matchlock | tilde.run | boxlite | microsandbox | smolmachines |
| Axis | bot-bottle | endo-familiar | litterbox | agent-safehouse | matchlock | tilde.run | boxlite | microsandbox | smolmachines |
|---|---|---|---|---|---|---|---|---|---|
| Isolation | Docker + internal net + pipelock; gVisor if present | Object-capability (no OS isolation) | Podman + opt. Landlock | macOS `sandbox-exec` | MicroVM (Firecracker / Virt.fw) | Hosted container (unverified) | MicroVM (KVM / Hypervisor.fw) | MicroVM (libkrun) | MicroVM (libkrun / KVM) |
| Local vs hosted | Local | Local | Local (Linux) | Local (macOS) | Local | Hosted SaaS | Local | Local | Local |
@@ -171,9 +171,9 @@ plausible without a heavy stack.
## What's closest, what's different
**Closest in design and scope.** agent-safehouse and litterbox sit
nearest claude-bottle: local, single-user, thin wrappers over an
nearest bot-bottle: local, single-user, thin wrappers over an
existing OS primitive, low-dep. The split is the isolation primitive —
claude-bottle uses Docker + pipelock egress (plus gVisor where
bot-bottle uses Docker + pipelock egress (plus gVisor where
available); agent-safehouse uses `sandbox-exec`; litterbox uses Podman +
Landlock. matchlock and smolmachines are spiritually close on the
*policy* side (default-deny net, per-host allowlist) but use microVMs
@@ -181,16 +181,16 @@ instead of containers.
**Solving a different problem.** tilde.run is hosted SaaS for team /
production agent pipelines with data-versioned rollback — explicitly
opposite to claude-bottle's "infrastructure I control" goal. boxlite and
opposite to bot-bottle's "infrastructure I control" goal. boxlite and
microsandbox are infrastructure libraries aimed at platform builders
embedding sandboxes into agent frameworks; they would be a *backend*
claude-bottle could call, not a competitor to its manifest layer.
bot-bottle could call, not a competitor to its manifest layer.
endo-familiar is in a different paradigm entirely: capability passing
rather than kernel boundaries.
## Borrowable ideas
What claude-bottle already has that the survey suggested as
What bot-bottle already has that the survey suggested as
differentiators:
- Default-deny egress with a per-agent allowlist (pipelock).
- DLP scanning of outbound traffic.
+14 -14
@@ -24,26 +24,26 @@ which version you want before starting.
## Current Docker surface area
The places claude-bottle shells out to `docker` today:
The places bot-bottle shells out to `docker` today:
- `build` — base image plus a per-cwd derived image
(`claude_bottle/docker.py:67-103`).
(`bot_bottle/docker.py:67-103`).
- `run` — with `--runtime`, `--env-file`, `-e`, `--name`, `--network`,
and volume mounts (`claude_bottle/cli/start.py:217-261`).
and volume mounts (`bot_bottle/cli/start.py:217-261`).
- `exec -it` / `exec -u 0` — for `claude` itself, file-ownership fixups,
and SSH provisioning (`claude_bottle/ssh.py`, `claude_bottle/skills.py`,
`claude_bottle/cli/start.py`).
and SSH provisioning (`bot_bottle/ssh.py`, `bot_bottle/skills.py`,
`bot_bottle/cli/start.py`).
- `cp` — skills, SSH keys, the prompt file, the workspace `.git`,
and the pipelock config
(`claude_bottle/skills.py:73`, `claude_bottle/ssh.py:106`,
`claude_bottle/cli/start.py:279`, `claude_bottle/pipelock.py:218`).
(`bot_bottle/skills.py:73`, `bot_bottle/ssh.py:106`,
`bot_bottle/cli/start.py:279`, `bot_bottle/pipelock.py:218`).
- `network create` / `connect` / `inspect` / `rm` — bottle network plus
multi-network attach for the pipelock sidecar
(`claude_bottle/network.py`, `claude_bottle/pipelock.py:227`).
(`bot_bottle/network.py`, `bot_bottle/pipelock.py:227`).
- `create` / `start` / `rm -f` — pipelock sidecar lifecycle
(`claude_bottle/pipelock.py:207-258`).
(`bot_bottle/pipelock.py:207-258`).
- Misc preflight: `image inspect`, `ps -a -f name=^...$`, `info` for
registered runtimes (`claude_bottle/docker.py`).
registered runtimes (`bot_bottle/docker.py`).
## Mapping to Apple's `container`
@@ -60,10 +60,10 @@ The places claude-bottle shells out to `docker` today:
Roughly two weeks for one person, split as:
1. **Backend abstraction (12 days).** `claude_bottle/docker.py` is
already a partial seam, but `claude_bottle/network.py`,
`claude_bottle/pipelock.py`, `claude_bottle/ssh.py`,
`claude_bottle/skills.py`, and `claude_bottle/cli/start.py` all call
1. **Backend abstraction (12 days).** `bot_bottle/docker.py` is
already a partial seam, but `bot_bottle/network.py`,
`bot_bottle/pipelock.py`, `bot_bottle/ssh.py`,
`bot_bottle/skills.py`, and `bot_bottle/cli/start.py` all call
`subprocess.run(["docker", ...])` directly. Define a `Backend`
protocol — `run`, `exec`, `cp`, `build`, `network_create`,
`network_connect`, `inspect`, `rm` — route every call through it,
+4 -4
@@ -1,6 +1,6 @@
# Implementation language: bash vs. Python vs. Go
Research into which runtime claude-bottle should be implemented in, given
Research into which runtime bot-bottle should be implemented in, given
where the project is today (~1250 lines, Python, mostly orchestration of
`docker` / `flyctl` / `ssh`). The project started in bash and was rewritten
to Python; this note evaluates whether either of the other two options
@@ -10,7 +10,7 @@ would be a better fit going forward.
Stay on Python. Switch to Go if and when distribution friction becomes the
dominant pain — i.e., when bug reports about Python interpreter / venv
behavior start outweighing bug reports about claude-bottle itself. Bash is
behavior start outweighing bug reports about bot-bottle itself. Bash is
not the right tool at the project's current size; reverting would be a
regression.
@@ -54,7 +54,7 @@ The relevant criteria, in roughly the order they bite:
## Bash
Right tool *if the project stays under ~500 lines*. claude-bottle has
Right tool *if the project stays under ~500 lines*. bot-bottle has
already crossed that threshold (~1250 lines), and the orchestration is no
longer "stitch CLIs together" — it has manifest validation, env-var
resolution, network and sidecar lifecycle, and SSH provisioning. Bash
@@ -119,7 +119,7 @@ Costs:
Stay on Python. The signal to watch for, before reconsidering, is bug
reports about Python interpreter or venv behavior outnumbering bug reports
about claude-bottle's actual logic. Until that pattern shows up, the Go
about bot-bottle's actual logic. Until that pattern shows up, the Go
rewrite isn't paying for itself.
Independent of language: invest in the backend abstraction now. A clean
+5 -5
@@ -2,13 +2,13 @@
## Question
Can claude-bottle grow a built-in supervisor — TUI inventory plus PR-feedback routing — without breaking the per-bottle isolation model, and without departing from the Python-stdlib-first, low-dependency posture?
Can bot-bottle grow a built-in supervisor — TUI inventory plus PR-feedback routing — without breaking the per-bottle isolation model, and without departing from the Python-stdlib-first, low-dependency posture?
## Context
claude-bottle today is a fleet *executor*: `./cli.py start <agent>` brings up one bottle (agent container + pipelock + optional git-gate + optional cred-proxy on a per-bottle internal network), and `cli.py` tears it down when the session ends. There is no inventory view, no idle-detection, no automated reaction to PR or CI events. In parallel use, a human is the supervisor — opening one terminal per bottle, switching between them, and watching upstream PR state by hand.
bot-bottle today is a fleet *executor*: `./cli.py start <agent>` brings up one bottle (agent container + pipelock + optional git-gate + optional cred-proxy on a per-bottle internal network), and `cli.py` tears it down when the session ends. There is no inventory view, no idle-detection, no automated reaction to PR or CI events. In parallel use, a human is the supervisor — opening one terminal per bottle, switching between them, and watching upstream PR state by hand.
A separate survey of the broader ecosystem ([agent control dashboards research, mid-2026](https://gitea.dideric.is/didericis/consilium-research/src/branch/main/developer-workflow/agent-control-dashboards-2026-05-24.md)) sorts dashboards into five tiers (session managers, parallel runners, Kanban boards, mission-control SPAs, observability backends). The earlier first-pass conclusion was that a full SPA tier conflicts with claude-bottle's isolation model. This doc reconsiders the smaller question: a TUI supervisor in the existing Python CLI.
A separate survey of the broader ecosystem ([agent control dashboards research, mid-2026](https://gitea.dideric.is/didericis/consilium-research/src/branch/main/developer-workflow/agent-control-dashboards-2026-05-24.md)) sorts dashboards into five tiers (session managers, parallel runners, Kanban boards, mission-control SPAs, observability backends). The earlier first-pass conclusion was that a full SPA tier conflicts with bot-bottle's isolation model. This doc reconsiders the smaller question: a TUI supervisor in the existing Python CLI.
## What I got wrong the first time
@@ -75,7 +75,7 @@ A few design defaults worth holding:
- **No auto-respawn.** The supervisor surfaces PR feedback to a human, never to the bottle's next prompt. The autonomous flow (review-comment → tear down → relaunch with the comment prepended) was considered and rejected: in a public-ish repo, any commenter could inject content that the next launch would treat as system instructions, with the agent's full bottle privileges. Available mitigations — commenter allowlists, prompt-injection regex screens, private-repo defaults — are all soft. The load-bearing defense is to keep the human between the review comment and any agent prompt. Notify-only is the only mode.
- **Idle detection is harder than it looks.** Last-log-line-age works ~80% of the time. Codeman's Ralph Loop tracker (watching for `<promise>` tags) is more accurate but adds complexity and tooling-coupling. Start with the dumb version; add heuristics only when actual confusion arises.
- **No web UI.** A browser UI reintroduces the privileged-channel problem — the browser talks to a server that talks to all bottles. TUI sidesteps it because the supervisor runs in the user's own shell context, not as a long-running daemon serving multiple consumers.
- **State file in `~/.claude-bottle/`, not inside any bottle.** The mapping of bottle → PR → status lives next to the manifest. Nothing about the supervisor's bookkeeping enters a bottle.
- **State file in `~/.bot-bottle/`, not inside any bottle.** The mapping of bottle → PR → status lives next to the manifest. Nothing about the supervisor's bookkeeping enters a bottle.
- **No new credentials on bottles.** PR-watch is a host-side concern. A bottle's manifest *names* the upstream/branch to watch; it does not grant the bottle the ability to read PR state itself.
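
The "dumb version" of idle detection in the list above can be sketched as a pure function over timestamps. This is an illustration only: the five-minute threshold and the names `IDLE_AFTER` and `is_idle` are assumptions, and how the supervisor obtains the timestamp of the last log line is left to the caller.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold; the document does not specify one.
IDLE_AFTER = timedelta(minutes=5)


def is_idle(
    last_log_line_at: datetime,
    now: datetime,
    idle_after: timedelta = IDLE_AFTER,
) -> bool:
    """Return True when the newest log line is at least idle_after old."""
    return now - last_log_line_at >= idle_after


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_idle(now - timedelta(minutes=10), now))
```

Keeping the check free of I/O makes it cheap to swap in a smarter heuristic later without touching the code that reads host-side signals.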
## Trust-model edge cases worth flagging
@@ -91,6 +91,6 @@ Phased: `status` first (purely additive, no design decisions), then `watch` (the
## Conclusion
A supervisor that respects the bottle wall is a small natural extension of what claude-bottle already is, not a category shift toward Mission Control / Codeman / Composio AO. The mistake in earlier framing was treating "supervisor" as synonymous with "dashboard SPA." The trust-model question that disqualifies the SPA tier (privileged channel into every bottle) does not apply to a TUI that reads host-side signals and shells out to the existing CLI.
A supervisor that respects the bottle wall is a small natural extension of what bot-bottle already is, not a category shift toward Mission Control / Codeman / Composio AO. The mistake in earlier framing was treating "supervisor" as synonymous with "dashboard SPA." The trust-model question that disqualifies the SPA tier (privileged channel into every bottle) does not apply to a TUI that reads host-side signals and shells out to the existing CLI.
Recommendation: build `status` and `watch` opportunistically when the pain is felt; treat `supervise` as a separate PRD before implementation, scoped to notify-only (no autonomous loop from review comment to next agent prompt — see "Where to be conservative").
@@ -15,7 +15,7 @@ What's the cheapest path to that, and where does it bottom out?
Today the flow is bimodal. `./cli.py start <agent>` brings the
bottle up and immediately drops you into an interactive
`docker exec -it claude-bottle-<slug> claude ...` — claude-code
`docker exec -it bot-bottle-<slug> claude ...` — claude-code
owns the whole terminal until you Ctrl-D out, at which point the
bottle tears down. The dashboard (`./cli.py dashboard`) is a
*separate* invocation that watches across bottles but never
@@ -68,7 +68,7 @@ Below are the actual costs.
The dashboard sees a key (say Enter on a selected agent in the
agents pane). It calls `curses.endwin()`, then `subprocess.run(
["docker", "exec", "-it", "claude-bottle-<slug>", "claude",
["docker", "exec", "-it", "bot-bottle-<slug>", "claude",
"--dangerously-skip-permissions"])`. claude-code takes the
terminal full-screen. When the operator exits claude-code
(Ctrl-D, `/exit`), the subprocess returns; the dashboard calls
@@ -192,7 +192,7 @@ want one focused session at a time with proposals visible.
## Option 3: External multiplexer
The dashboard binds a key (e.g. `Enter` on agent) to
`tmux split-window -h 'docker exec -it claude-bottle-<slug>
`tmux split-window -h 'docker exec -it bot-bottle-<slug>
claude'` when run inside a tmux session, or to `osascript`-
driven iTerm pane spawning on macOS, or to `wezterm cli
spawn` if the user is on wezterm.
@@ -278,7 +278,7 @@ interface; the multiplexer is convenience for power users.
- PRD 0018 chunk 3 — agent container runs `sleep infinity`;
claude is invoked via `docker exec -it` (the
attachment-point this doc is layering against)
- `claude_bottle/cli/dashboard.py:_operator_edit_flow` — the
- `bot_bottle/cli/dashboard.py:_operator_edit_flow` — the
existing `curses.endwin` → shell out → `stdscr.refresh()`
pattern Option 1 would clone
- pyte: <https://pyte.readthedocs.io/> — the candidate
@@ -2,7 +2,7 @@
Research into how to revoke a long-lived `CLAUDE_CODE_OAUTH_TOKEN` (the kind
`claude setup-token` mints), prompted by needing to rotate a token baked into a
claude-bottle container.
bot-bottle container.
## Summary
@@ -63,7 +63,7 @@ For a known-leaked or suspected-leaked token:
1. Revoke the entry at `claude.ai/settings/claude-code`.
2. Run "Log out all sessions" under Settings → Account → Active Sessions.
3. Run `claude setup-token` to mint a replacement, and rotate it into
`CLAUDE_BOTTLE_OAUTH_TOKEN` immediately.
`BOT_BOTTLE_OAUTH_TOKEN` immediately.
4. Email Anthropic support at `support.anthropic.com`. Security issues
sometimes get attention that GitHub issues do not.
+9 -9
@@ -13,7 +13,7 @@ wrong in the user-intent sense, and there is no way to say so.
## Summary
No off-the-shelf dashboard fits the shape claude-bottle needs
No off-the-shelf dashboard fits the shape bot-bottle needs
(per-bottle, host-local, integrated into a pre-receive rejection
with approval feeding back into the gate's own decision). Gitleaks
itself is a CLI with no UI and was declared **feature-complete** in
@@ -49,9 +49,9 @@ baseline), and recommends a direction.
## Question 1: Existing dashboards and control surfaces
### Inside claude-bottle today
### Inside bot-bottle today
`claude_bottle/cli/` has `_common, cleanup, edit, info, init, list,
`bot_bottle/cli/` has `_common, cleanup, edit, info, init, list,
start` — nothing gate-specific. The gate appears only as a sidecar
in `bottle_plan.py`'s preflight rendering. Rejections are written
to the pre-receive hook's stderr (`echo "git-gate: gitleaks
@@ -76,14 +76,14 @@ TOML allowlist, and a roadmap that includes LLM-assisted
classification and automatic secret revocation via provider APIs.
Still CLI-shaped — no dashboard either.
Relevant to claude-bottle in two ways:
Relevant to bot-bottle in two ways:
- The upstream direction of travel is *toward* agent-driven
scanners, which makes "the bottle invokes a scanner and reports
findings up" a supported pattern rather than a hack.
- CEL is a richer expression language for filter entries than
gitleaks's selector struct, which loosens the design space for
Option B (below). If claude-bottle ever swaps gitleaks for
Option B (below). If bot-bottle ever swaps gitleaks for
Betterleaks, the approval-flow design should be expressible in
both.
@@ -107,7 +107,7 @@ false-positive in its UI, and tracks remediation state. Designed
for org-scale: one DefectDojo instance covers many repos and
scanners.
Shape mismatch for claude-bottle:
Shape mismatch for bot-bottle:
- DefectDojo's review state is *informational* — marking a finding
as accepted in DefectDojo does not write to gitleaks's allowlist
@@ -137,7 +137,7 @@ premise is sandbox isolation.
### Bottom line
No off-the-shelf dashboard fits claude-bottle's shape: per-bottle,
No off-the-shelf dashboard fits bot-bottle's shape: per-bottle,
host-local, integrated into a pre-receive rejection with the
approval feeding back into the gate's own decision-making. The
nearest open-source analogue (DefectDojo) is post-hoc and
@@ -334,7 +334,7 @@ project, and the vendor-side benchmark numbers (98.6% recall vs
gitleaks's 70.4% on CredData) have not been independently
reproduced in published sources.
### What Betterleaks would add for claude-bottle
### What Betterleaks would add for bot-bottle
- **Detection coverage on encoded secrets.** Native handling of
doubly- and triply-encoded matches. This matters in the
@@ -434,6 +434,6 @@ redesign.
- [AWS example access key (`AKIAIOSFODNN7EXAMPLE`)](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_iam-quotas.html)
— documented placeholder safe to use in examples without
triggering most secret scanners.
- `claude_bottle/git_gate.py` — pre-receive hook implementation.
- `bot_bottle/git_gate.py` — pre-receive hook implementation.
Today: `gitleaks git --log-opts="$log_opts" --no-banner
--redact`; no `--config`, no `--baseline-path`.
@@ -1,6 +1,6 @@
# Git secret scanning as further hardening
Research into whether claude-bottle should add a secret-scanning step to
Research into whether bot-bottle should add a secret-scanning step to
its git workflow — both on the host repo and (potentially) inside
bottles — and what tools exist for it. Motivated by the threat model
below: a secret accidentally `git push`ed to a public remote is
@@ -14,7 +14,7 @@ of defense-in-depth that doesn't replace any existing control
(`.gitignore`, environment-variable hygiene, network egress guards) but
catches the one case where everything else fails: a credential ending
up in a tracked file or commit message and being pushed to a public
remote. For claude-bottle specifically, `gitleaks` is the clearest fit
remote. For bot-bottle specifically, `gitleaks` is the clearest fit
— Go binary, MIT, scans full history including commit messages, runs
fully offline, and integrates with the existing `.githooks/` directory
without adding a new runtime.
@@ -83,12 +83,12 @@ suspicious, let me close without merging," the bytes that mattered are
already on the attacker's box. Detection has to be at *commit* time
(or *push* time at the latest), not at review time.
### Why this matters for claude-bottle
### Why this matters for bot-bottle
Two surfaces are exposed:
1. **The claude-bottle repo itself.** Development happens on a host
with `CLAUDE_BOTTLE_OAUTH_TOKEN`, Gitea tokens, and other
1. **The bot-bottle repo itself.** Development happens on a host
with `BOT_BOTTLE_OAUTH_TOKEN`, Gitea tokens, and other
credentials in the environment. A fixture, test snapshot, log
capture, or pasted-in debug output could carry one of them into a
tracked file. The repo's Gitea remote is private, but mirrors or
@@ -209,7 +209,7 @@ it with a separate message-scanning step.
## Recommended path forward
In priority order, for the host claude-bottle repo:
In priority order, for the host bot-bottle repo:
1. **One-time retro scan** with gitleaks:
`gitleaks detect --source . --log-opts="--all" --redact`.
@@ -2,7 +2,7 @@
## Question
Can host Claude decide which claude-bottle container to spin up for a task, while guaranteeing the work executes in the container and not on the host?
Can host Claude decide which bot-bottle container to spin up for a task, while guaranteeing the work executes in the container and not on the host?
## Claude Code Agent Mechanisms
@@ -16,7 +16,7 @@ Claude Code provides two mechanisms for defining reusable agent behavior:
## The Reliability Problem
The previous approach used an MCP server to bridge host Claude and claude-bottle containers. It failed because host Claude had both work-capable tools (Edit, Write, Bash) and MCP dispatch tools. Claude could choose to do the work itself rather than dispatch, with no enforcement mechanism to prevent it.
The previous approach used an MCP server to bridge host Claude and bot-bottle containers. It failed because host Claude had both work-capable tools (Edit, Write, Bash) and MCP dispatch tools. Claude could choose to do the work itself rather than dispatch, with no enforcement mechanism to prevent it.
## Why Tool Restriction Solves It
@@ -26,7 +26,7 @@ Claude Code's subagent `tools:` allowlist is architecturally enforced — not a
Three pieces in combination give a 100% guarantee:
1. **Restricted host subagent** — a `.claude/agents/claude-bottle-dispatch.md` with `tools:` limited to MCP container tools and git-read operations. No Edit, Write, or arbitrary Bash.
1. **Restricted host subagent** — a `.claude/agents/bot-bottle-dispatch.md` with `tools:` limited to MCP container tools and git-read operations. No Edit, Write, or arbitrary Bash.
2. **MCP server** — exposes tools the restricted host can call:
- `list_agents()` — available agents from the manifest (host Claude decides which to use)
@@ -40,11 +40,11 @@ Three pieces in combination give a 100% guarantee:
Build host-dispatch-to-container in two deliverables:
**Deliverable 1: Non-interactive run mode for claude-bottle**
**Deliverable 1: Non-interactive run mode for bot-bottle**
Extend `cli.py` with a `run <agent> <task>` subcommand. Starts the container, writes the task prompt to a file inside it (same `docker cp` pattern used for `--append-system-prompt-file`), invokes `claude --print` with the prompt, streams stdout back to the host, and exits when Claude finishes. Results committed and pushed from inside the container as usual.
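
A rough sketch of the two `docker` invocations such a `run` subcommand might issue, under stated assumptions: the container is named `bot-bottle-<slug>`, the in-container prompt path `/tmp/task-prompt.txt` is made up, and the prompt is handed to `claude --print` through a shell substitution. The function name `plan_run` is hypothetical; the real subcommand may pass the prompt differently.

```python
import shlex
from typing import List


def plan_run(slug: str, prompt_path: str) -> List[List[str]]:
    """Build the argv lists for copying a task prompt in and running it.

    Hypothetical helper: nothing is executed here, so the plan can be
    inspected or logged before being handed to subprocess.run.
    """
    container = f"bot-bottle-{slug}"
    remote = "/tmp/task-prompt.txt"  # assumed path inside the container
    shell_cmd = f'claude --print "$(cat {shlex.quote(remote)})"'
    return [
        # Copy the prompt file into the container (the docker cp pattern).
        ["docker", "cp", prompt_path, f"{container}:{remote}"],
        # No -it: a non-interactive run streams stdout back to the host.
        ["docker", "exec", container, "sh", "-c", shell_cmd],
    ]


if __name__ == "__main__":
    for argv in plan_run("demo", "task.txt"):
        print(argv)
```

Returning argv lists rather than shell strings keeps the task text off the host's command line, in the same spirit as forwarding secrets without placing them on argv.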
**Deliverable 2: MCP server wrapping claude-bottle**
**Deliverable 2: MCP server wrapping bot-bottle**
A minimal MCP server (bash or node) exposing `list_agents`, `run_agent`, `get_status`, `get_output`. Registered in the host Claude Code settings so a restricted dispatch subagent can call it.
@@ -52,7 +52,7 @@ The combination enforces the container boundary at the tool layer, not the promp
**Critical:** the tool restriction only applies within the dispatch agent's context. A normal Claude session has its full toolset and may never invoke the dispatch agent regardless of its description. The dispatch agent must be the *entry point* for the session, not an optional subagent a full-tool host might call. Two ways to enforce this:
- Launch with `claude --agent claude-bottle-dispatch` — makes the dispatch agent the primary agent for the session.
- Set `agent: claude-bottle-dispatch` in the project `.claude/settings.json` — same effect automatically for any `claude` invocation in that directory.
- Launch with `claude --agent bot-bottle-dispatch` — makes the dispatch agent the primary agent for the session.
- Set `agent: bot-bottle-dispatch` in the project `.claude/settings.json` — same effect automatically for any `claude` invocation in that directory.
Without one of these, the guarantee does not hold.
@@ -1,11 +1,11 @@
# Landscape: containerized Claude Code agent tools
Research into whether claude-bottle is redundant with existing projects, and
Research into whether bot-bottle is redundant with existing projects, and
whether it's worth publishing.
## Summary
The "Claude Code in Docker" space is active but not saturated. claude-bottle
The "Claude Code in Docker" space is active but not saturated. bot-bottle
occupies a distinct position: no surveyed project combines all five of its
defining features. Publishing is likely worthwhile, with the main risk being
claudebox expanding to absorb the same niche.
@@ -38,7 +38,7 @@ manifest merge.
## Adjacent (different model)
- **dagger/container-use** (mid-2025) — exposes an MCP server so the *agent*
spins up its own containers with Git worktrees. Inverted model vs. claude-bottle
spins up its own containers with Git worktrees. Inverted model vs. bot-bottle
(agent controls container rather than being launched into one by a manifest).
Still marked early-development.
- **E2B, Northflank, Cloudflare Sandbox SDK** — cloud-hosted SaaS sandbox
@@ -2,7 +2,7 @@
Research notes on when to run containerized Claude Code agents on a remote machine
outside the local network versus inside it, focusing on security and privacy concerns.
Relevant to a potential claude-bottle extension for remote agent execution.
Relevant to a potential bot-bottle extension for remote agent execution.
---
@@ -16,7 +16,7 @@ escapes**, and **whether credentials are short-lived and scoped**.
## Threat landscape by topology
### Local (current claude-bottle model)
### Local (current bot-bottle model)
- Container escape → developer laptop → `~/.ssh`, `~/.aws`, browser cookies, Keychain, everything
- Outbound: Docker containers have full internet access by default; no egress monitoring on most home networks
@@ -99,7 +99,7 @@ Key insight: once a container is compromised via prompt injection, the blast rad
## Credentials and secrets
### Local topology (current claude-bottle)
### Local topology (current bot-bottle)
- Secrets live in the host environment or are prompted from `/dev/tty`
- Forwarded to containers via `-e NAME` (not `=value`), never on argv, never in env-files for secrets
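
The `-e NAME` forwarding described in this list can be sketched in Python. This is a minimal illustration under stated assumptions, not the project's launcher code: `build_docker_argv` and `launch_env` are hypothetical names, and `EXAMPLE_TOKEN` is a placeholder. It relies on Docker's behavior that `-e NAME` with no `=value` copies the variable from the `docker` client's own environment.

```python
import os


def build_docker_argv(image, secret_names):
    """Build a `docker run` argv that forwards secrets by name only.

    `-e NAME` (no `=value`) makes the docker client copy NAME from its own
    environment, so the secret value never appears on the command line.
    """
    argv = ["docker", "run", "--rm"]
    for name in secret_names:
        if "=" in name:
            raise ValueError(f"secret name must not contain '=': {name!r}")
        argv += ["-e", name]
    argv.append(image)
    return argv


def launch_env(secrets, base_env=None):
    """Return the environment for the docker client process.

    Secret values travel here (the child's environment), not in argv.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update(secrets)
    return env


if __name__ == "__main__":
    secrets = {"EXAMPLE_TOKEN": "not-a-real-token"}  # hypothetical secret
    argv = build_docker_argv("bot-bottle-claude:latest", secrets)
    env = launch_env(secrets, base_env={"PATH": os.environ.get("PATH", "")})
    # A real launcher would now call subprocess.run(argv, env=env).
    print(argv)
```

Keeping the value out of argv matters because process arguments are visible to other users on the host through `ps`, while a process's environment is not.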
@@ -125,10 +125,10 @@ An 8,640x reduction in abuse window comes from switching from 90-day keys to 15-
### Local topology
- Monitoring: whatever the home/office router supports — usually minimal
- Containment: `--network none` + a proxy socket provides the strongest containment; claude-bottle does not currently do this
- Containment: `--network none` + a proxy socket provides the strongest containment; bot-bottle does not currently do this
- DLP: essentially none unless specifically deployed on the LAN
- Domain fronting risk: even allowlisted-domain proxies can be bypassed via domain fronting — an agent that can reach `api.anthropic.com` could relay data to an attacker-controlled backend through that domain
- **claude-bottle today: containers have full outbound internet access. No egress restrictions.**
- **bot-bottle today: containers have full outbound internet access. No egress restrictions.**
### Remote topology (cloud VM)
@@ -177,7 +177,7 @@ Strongest exfiltration controls for either topology:
---
## Concrete recommendations if extending claude-bottle for remote
## Concrete recommendations if extending bot-bottle for remote
1. **Never build the VPN-pivot pattern.** A remote agent connected back to the LAN via VPN is the worst of both worlds. If a remote agent needs LAN resources, expose those through a narrow API, not a VPN.
@@ -199,7 +199,7 @@ Strongest exfiltration controls for either topology:
## Bottom line
For the current claude-bottle use case (developer feature implementation, no regulated data,
For the current bot-bottle use case (developer feature implementation, no regulated data,
single developer), local execution is the right default. The biggest unaddressed risk
right now isn't topology — it's that containers have unrestricted outbound internet access.
Adding `--network none` + a proxy socket would be higher-leverage than any topology change.
+36 -36
@@ -1,6 +1,6 @@
# Manifest format and grouping
Two open questions for claude-bottle's manifest layer after PRD 0011:
Two open questions for bot-bottle's manifest layer after PRD 0011:
1. **Grouping.** Keep bottles and agents in the same manifest file
(today's shape), or split them — one file per bottle and one
@@ -8,7 +8,7 @@ Two open questions for claude-bottle's manifest layer after PRD 0011:
2. **Format.** Stay on JSON, switch to YAML, or move to a Markdown
spec with YAML frontmatter. The Markdown option splits into two
sub-flavors: reuse Claude Code's existing subagent format with
bottle-specific extensions, or invent a claude-bottle-owned
bottle-specific extensions, or invent a bot-bottle-owned
Markdown spec used for both agents and bottles.
The trust boundary from PRD 0011 — bottle infrastructure lives in
@@ -19,8 +19,8 @@ will be once a user has 5+ bottles and 10+ agents.
## Why this matters
Current shape: one JSON file at `$HOME/claude-bottle.json` (and
optionally `$CWD/claude-bottle.json` for cwd-defined agents). After
Current shape: one JSON file at `$HOME/bot-bottle.json` (and
optionally `$CWD/bot-bottle.json` for cwd-defined agents). After
PRD 0011, the home file owns bottles + home agents; the cwd file is
agents-only.
@@ -48,7 +48,7 @@ the inflection point has been reached.
### Option A: one file for both (current)
`$HOME/claude-bottle.json` contains `bottles:` and `agents:`. Cwd
`$HOME/bot-bottle.json` contains `bottles:` and `agents:`. Cwd
file (optional) contains `agents:` only.
**Pros**
@@ -74,16 +74,16 @@ file (optional) contains `agents:` only.
### Option B: file per thing
Bottles live as `$HOME/.claude-bottle/bottles/<name>.<ext>`. Agents
live as `$HOME/.claude-bottle/agents/<name>.<ext>` (home agents)
and `$CWD/.claude-bottle/agents/<name>.<ext>` (cwd agents). The
Bottles live as `$HOME/.bot-bottle/bottles/<name>.<ext>`. Agents
live as `$HOME/.bot-bottle/agents/<name>.<ext>` (home agents)
and `$CWD/.bot-bottle/agents/<name>.<ext>` (cwd agents). The
resolver globs each directory.
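
A resolver for this layout could look roughly like the sketch below. It is an assumption-laden illustration, not the project's resolver: the function names are hypothetical, the `json` default extension is arbitrary, and the rule that a cwd agent shadows a home agent of the same name is a guess that PRD 0011 may settle differently.

```python
from pathlib import Path


def _by_name(directory, ext):
    """Map file stem -> path for every `*.ext` file directly in `directory`."""
    if not directory.is_dir():
        return {}
    return {p.stem: p for p in sorted(directory.glob(f"*.{ext}"))}


def resolve(home, cwd, ext="json"):
    """Glob the per-thing directories.

    Bottles are only ever read from under `home`; `cwd` can contribute
    agents and nothing else, so the layout itself acts as the trust boundary.
    """
    home_root = Path(home) / ".bot-bottle"
    cwd_root = Path(cwd) / ".bot-bottle"

    bottles = _by_name(home_root / "bottles", ext)
    agents = _by_name(home_root / "agents", ext)
    # Assumption: a cwd agent shadows a home agent with the same name.
    agents.update(_by_name(cwd_root / "agents", ext))
    # Note: cwd_root / "bottles" is deliberately never globbed.
    return bottles, agents
```

Because the function never looks at `$CWD/.bot-bottle/bottles/`, a bottle file dropped into a project checkout is ignored without any explicit validation step.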
**Pros**
- Scales to N bottles + N agents without any single file growing.
- Trust boundary expresses on disk: `$HOME/.claude-bottle/bottles/`
is the only place bottles can come from. `$CWD/.claude-bottle/`
- Trust boundary expresses on disk: `$HOME/.bot-bottle/bottles/`
is the only place bottles can come from. `$CWD/.bot-bottle/`
can only contribute agents. No resolver logic needed to enforce
it — the file paths are the enforcement.
- Aligns with Claude Code's existing model: each subagent already
@@ -147,7 +147,7 @@ preserves this format.
### Option 2: full YAML
`$HOME/claude-bottle.yaml` (or `.yml`). Parser pulls in PyYAML (or
`$HOME/bot-bottle.yaml` (or `.yml`). Parser pulls in PyYAML (or
ruamel.yaml).
**Pros**
@@ -172,14 +172,14 @@ ruamel.yaml).
users will reach for one (Jinja, Helm-style) and then we're in
yaml-as-template-language territory.
### Option 3: reuse Claude Code's subagent spec (Markdown + YAML frontmatter), with claude-bottle extensions
### Option 3: reuse Claude Code's subagent spec (Markdown + YAML frontmatter), with bot-bottle extensions
Claude Code already stores subagents at `~/.claude/agents/<name>.md`
with YAML frontmatter and a Markdown body. Frontmatter today
carries fields like `name`, `description`, `model`, `color`,
`memory`; the body is the system prompt. Adding fields like
`bottle: dev` and a `claude_bottle:` sub-block to the same
frontmatter would make each claude-bottle agent a drop-in addition
`bottle: dev` and a `bot_bottle:` sub-block to the same
frontmatter would make each bot-bottle agent a drop-in addition
to Claude Code's agent directory.
```markdown
@@ -188,12 +188,12 @@ name: implementer
description: Implements features against PRDs in this repo
model: opus
bottle: dev
claude_bottle:
bot_bottle:
skills: [init-prd]
---
You are a feature-implementation agent running inside an
ephemeral claude-bottle sandbox. The host has copied the user's
ephemeral bot-bottle sandbox. The host has copied the user's
project into /home/node/workspace...
```
@@ -202,7 +202,7 @@ infrastructure, not behavior. Either:
- (3a) Bottles stay JSON / YAML; only agents adopt the
MD+frontmatter format. Mixed-format manifest.
- (3b) Bottles adopt MD+frontmatter too, using a claude-bottle-only
- (3b) Bottles adopt MD+frontmatter too, using a bot-bottle-only
schema. Then we're really doing option 4 for bottles + option 3
for agents. Two formats but one parser.
@@ -214,7 +214,7 @@ infrastructure, not behavior. Either:
- Each agent's prompt lives naturally as Markdown body — long
prompts read well, can use headings/lists/code blocks.
- File-per-thing falls out automatically (one MD per agent).
- Claude Code may eventually consume claude-bottle's agent files
- Claude Code may eventually consume bot-bottle's agent files
directly, doubling their utility.
**Cons**
@@ -222,7 +222,7 @@ infrastructure, not behavior. Either:
- **Coupling to Claude Code's spec.** Anthropic owns that schema;
field names and semantics can change. Today's `model` /
`description` / `memory` are stable, but tomorrow's may not be.
Our `bottle:` / `claude_bottle:` extensions could collide with
Our `bottle:` / `bot_bottle:` extensions could collide with
future official fields.
- The agent file's frontmatter starts to carry two unrelated
schemas: Claude Code's (model, description) and ours (bottle,
@@ -233,28 +233,28 @@ infrastructure, not behavior. Either:
frontmatter library (python-frontmatter) or hand-parse the
`---` block and feed it to PyYAML. Either way, a new dep.
### Option 4: invent a claude-bottle MD spec, used for both agents and bottles
### Option 4: invent a bot-bottle MD spec, used for both agents and bottles
```markdown
---
# $HOME/.claude-bottle/agents/implementer.md
# $HOME/.bot-bottle/agents/implementer.md
bottle: dev
skills: [init-prd]
---
You are a feature-implementation agent running inside an
ephemeral claude-bottle sandbox...
ephemeral bot-bottle sandbox...
```
```markdown
---
# $HOME/.claude-bottle/bottles/dev.md
# $HOME/.bot-bottle/bottles/dev.md
cred_proxy:
routes:
- path: /anthropic/
upstream: https://api.anthropic.com
auth_scheme: Bearer
token_ref: CLAUDE_BOTTLE_OAUTH_TOKEN
token_ref: BOT_BOTTLE_OAUTH_TOKEN
role: anthropic-base-url
egress:
allowlist: [example.com]
@@ -273,8 +273,8 @@ for publishing scoped packages.
documentation (why does this bottle exist? what tokens does it
hold? who owns the keys?).
- Not coupled to Claude Code's schema; we own the spec.
- Trust boundary on disk: `$HOME/.claude-bottle/bottles/` is the
only place bottles can come from; `$CWD/.claude-bottle/agents/`
- Trust boundary on disk: `$HOME/.bot-bottle/bottles/` is the
only place bottles can come from; `$CWD/.bot-bottle/agents/`
is the only thing cwd contributes.
- Agent files in this spec are *almost* compatible with Claude
Code's subagent format. If we keep the `name` / `description`
@@ -308,14 +308,14 @@ grouping fits how users iterate on agents (write a prompt, save,
launch).
Between option 3 (reuse CC spec) and option 4 (new spec): the
appealing middle ground is "claude-bottle agents follow the CC
appealing middle ground is "bot-bottle agents follow the CC
subagent shape closely (name / description / model + bottle and
skills extensions) so they drop into `~/.claude/agents/` as a
side effect, while bottles use the same MD+frontmatter shape but
with claude-bottle's own schema and live in a dedicated directory."
with bot-bottle's own schema and live in a dedicated directory."
This:
- gives agents both a claude-bottle launch story AND a Claude Code
- gives agents both a bot-bottle launch story AND a Claude Code
invocation story from the same file;
- keeps bottles entirely under our schema (no Anthropic dependency
for the security-load-bearing config);
@@ -343,18 +343,18 @@ per-file grouping (which is the bigger UX win), and the per-file
shape is what makes the trust boundary self-documenting on disk.
The dependency cost (PyYAML) is the main thing that needs an
explicit yes from the user — claude-bottle today has zero
explicit yes from the user — bot-bottle today has zero
third-party Python deps for production code, and adopting one
crosses a clean architectural line. If "low deps" stays a hard
constraint, the alternative is to hand-parse the frontmatter block
and feed it to a minimal YAML subset parser (the keys
claude-bottle uses are all flat string/list/dict — no anchors, no
bot-bottle uses are all flat string/list/dict — no anchors, no
multi-line block scalars, no implicit type coercion).
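
To gauge how small the hand-parsed alternative might be, here is a rough sketch of a frontmatter splitter plus a YAML-subset parser. It is an illustration only, with hypothetical function names: it handles `key: value` scalars, inline lists, and one level of indented sub-keys, keeps every scalar as a string, and rejects block lists outright.

```python
def split_frontmatter(text):
    """Split '---'-delimited frontmatter from the Markdown body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    raise ValueError("unterminated frontmatter block")


def parse_value(raw):
    """Parse a scalar or an inline list; every scalar stays a string."""
    raw = raw.strip()
    if raw.startswith("[") and raw.endswith("]"):
        inner = raw[1:-1].strip()
        return [item.strip() for item in inner.split(",")] if inner else []
    return raw


def parse_subset(block):
    """Parse top-level keys plus one level of indented sub-keys."""
    result = {}
    current = None  # name of the mapping currently being filled, if any
    for line in block.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments
        if stripped.startswith("- "):
            raise ValueError(f"block lists are not supported: {line!r}")
        key, sep, raw = stripped.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        key = key.strip()
        if line[0] in " \t":
            if current is None:
                raise ValueError(f"indented key without a parent: {line!r}")
            result[current][key] = parse_value(raw)
        elif raw.strip():
            result[key] = parse_value(raw)
            current = None
        else:
            result[key] = {}  # a bare 'key:' opens a nested mapping
            current = key
    return result
```

This covers the agent example from option 3, but not a block list of mappings such as the `routes:` entries in the option 4 bottle example, so bottle files would need either a richer subset or a different shape.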
If we don't want to commit to the move yet, the next-cheapest
option is keeping JSON but splitting into per-file (option B ×
option 1): `$HOME/.claude-bottle/bottles/<name>.json` +
`$HOME/.claude-bottle/agents/<name>.json`. Most of the scaling
option 1): `$HOME/.bot-bottle/bottles/<name>.json` +
`$HOME/.bot-bottle/agents/<name>.json`. Most of the scaling
wins; none of the body-prose or dependency story.
## Open questions
@@ -362,7 +362,7 @@ wins; none of the body-prose or dependency story.
- **Does Claude Code object to extra frontmatter fields?** Test:
drop a file with `bottle:` in `~/.claude/agents/` and see if CC
warns / ignores / breaks. If it warns, we'd want a different
field name (e.g. `claude-bottle-bottle`) or a namespaced block.
field name (e.g. `bot-bottle-bottle`) or a namespaced block.
- **Migration story.** Is the project willing to ship a one-shot
`./cli.py migrate-manifest` command that does the JSON → MD
conversion? Or do users just rewrite by hand from the new docs?
@@ -370,8 +370,8 @@ wins; none of the body-prose or dependency story.
empty body, is the MD-with-frontmatter format still warranted?
An alternative is YAML for bottles only (no body, but with
comments) and MD+frontmatter for agents.
- **Dotfiles vs not.** `$HOME/.claude-bottle/` or
`$HOME/claude-bottle/`? The hidden dotfile shape matches dev
- **Dotfiles vs not.** `$HOME/.bot-bottle/` or
`$HOME/bot-bottle/`? The hidden dotfile shape matches dev
conventions (`.config/`, `.ssh/`); the visible shape signals
"this is a real thing you own."
- **PyYAML hard dep, or minimal subset parser?** Trade-off between
+4 -4
@@ -1,4 +1,4 @@
# Network egress guard for claude-bottle containers
# Network egress guard for bot-bottle containers
Research into preventing data exfiltration from Docker containers running
Claude Code (`--dangerously-skip-permissions`), with a focus on approaches
@@ -358,7 +358,7 @@ services:
- agent-net
claude-agent:
image: claude-bottle:latest
image: bot-bottle-claude:latest
environment:
HTTPS_PROXY: "http://proxy:4750"
HTTP_PROXY: "http://proxy:4750"
@@ -387,7 +387,7 @@ docker run -d --name "$container_name" \
--network agent-net-"$slug" \
-e HTTPS_PROXY=http://"$proxy_name":4750 \
-e HTTP_PROXY=http://"$proxy_name":4750 \
claude-bottle:latest
bot-bottle-claude:latest
```
The `--internal` flag on the network prevents containers from reaching
@@ -639,7 +639,7 @@ this is not relevant — the binary uses the Linux certificate store.
Justified only if the threat model includes sophisticated actors deliberately
crafting domain-fronting payloads. The extra complexity and CA-trust-management
overhead is not worth it for v1. Keep in view for v2 if the claude-bottle use
overhead is not worth it for v1. Keep in view for v2 if the bot-bottle use
case expands to high-value agent deployments.
---
+31 -31
View File
@@ -1,4 +1,4 @@
# Pipelock assessment for claude-bottle egress control
# Pipelock assessment for bot-bottle egress control
Research into whether pipelock — an open-source AI agent firewall —
is a suitable replacement for, or complement to, the egress-control
@@ -10,7 +10,7 @@ tripwire approach sketched in `secret-exfil-tripwire-encodings.md`.
- Pipelock (`luckyPipewrench/pipelock`) is an open-source AI agent
firewall implemented as a single Go binary. It sits inline as an HTTP
forward proxy and, optionally, applies OS-level process containment. It
is the most directly relevant tool found for claude-bottle's egress /
is the most directly relevant tool found for bot-bottle's egress /
data-exfiltration concerns.
- Its strongest differentiator over the v1 iptables recommendation is
**content-aware DLP**: it matches 48 credential patterns across
@@ -238,12 +238,12 @@ The following threat-model items from `network-egress-guard.md` are
---
## Fit for claude-bottle
## Fit for bot-bottle
### Deployment topology
Pipelock explicitly supports two deployment shapes relevant to
claude-bottle:
bot-bottle:
**Sidecar proxy.** A separate container running pipelock on an
`--internal` Docker network, with the agent container's only route to the
@@ -269,12 +269,12 @@ its own scanner. This avoids a second container but introduces the
`--best-effort` degradation problem described below and means the agent and
the proxy run in the same failure domain.
The sidecar topology is recommended for claude-bottle because it matches
The sidecar topology is recommended for bot-bottle because it matches
the existing Python-orchestrated multi-container model (the SSH key agent
already uses a separate process), keeps pipelock outside the agent's kill
reach, and avoids the `--best-effort` issue on macOS Docker Desktop.
The claude-bottle manifest model would need one new piece of plumbing: a
The bot-bottle manifest model would need one new piece of plumbing: a
per-agent pipelock ACL YAML generated from the manifest's `allowlist`
and `ssh` entries, analogous to what the smokescreen section of
`network-egress-guard.md` already sketches. The `cli.py` changes required
@@ -341,7 +341,7 @@ generated with `pipelock generate config --preset balanced > pipelock.yaml`.
The config watcher picks up changes at runtime (100ms debounce on SIGHUP or
file events), so per-agent ACL updates do not require a container restart.
For claude-bottle, the relevant per-agent configuration is the domain
For bot-bottle, the relevant per-agent configuration is the domain
allowlist. The manifest already captures the necessary inputs: the `ssh`
array has target hostnames, and an `allowlist` key is planned for the v2
egress work (per `network-egress-guard.md`). Generating a per-agent pipelock
@@ -353,14 +353,14 @@ The YAML format is more expressive than smokescreen's YAML ACL: it also
carries DLP sensitivity settings, per-domain data budgets, and rate limits.
For a first integration pass, only the `api_allowlist` section needs
per-agent population; the rest of the defaults are appropriate for the
claude-bottle threat model.
bot-bottle threat model.
### Runtime footprint
A single Go binary, ~12-20 MB (sources report slightly different figures; the
GitHub description says "~20 MB" and the randomcpu.com writeup says "~12 MB").
Zero runtime dependencies; the Go standard library is statically linked. This
is consistent with claude-bottle's low-dependency principle. Adding Go as a
is consistent with bot-bottle's low-dependency principle. Adding Go as a
host build dependency is not required — the binary is fetched from a Docker
image or Homebrew.
@@ -410,7 +410,7 @@ The reasoning:
everything smokescreen covers (CONNECT-based hostname allowlisting,
RFC 1918 blocking, Docker `--internal` network isolation) and adds DLP,
subdomain-entropy DNS exfil detection, MCP scanning, and request
redaction. The integration shape for claude-bottle is identical: a
redaction. The integration shape for bot-bottle is identical: a
separate container on an internal Docker network, with the agent's
`HTTPS_PROXY` pointing at it. The `cli.py` changes are the same pattern.
@@ -454,7 +454,7 @@ The reasoning:
hostname-based allowlisting, content DLP, subdomain entropy analysis, and
MCP scanning on top of the v1 IP layer. Implementation effort is comparable
to the smokescreen plan; capabilities are a strict superset for the
claude-bottle threat model.
bot-bottle threat model.
- **DIY tripwire script (deferred):** the `secret-exfil-tripwire-encodings.md`
DIY sketch can be deferred entirely if pipelock's DLP patterns cover the
secrets in use. Custom patterns (for secrets not matching pipelock's 48
@@ -463,17 +463,17 @@ The reasoning:
---
## Does pipelock make claude-bottle redundant?
## Does pipelock make bot-bottle redundant?
Pipelock is itself an AI-agent firewall with an in-process sandbox mode,
which raises a fair question: if pipelock can already wrap an agent process
with Landlock + seccomp + namespaces (or `sandbox-exec` on macOS), is the
Docker-container layer that claude-bottle provides still doing useful work?
Docker-container layer that bot-bottle provides still doing useful work?
The short answer: **no, pipelock does not make claude-bottle redundant**.
The short answer: **no, pipelock does not make bot-bottle redundant**.
The two operate at different layers and the overlap is narrow.
### Where pipelock substitutes for parts of claude-bottle
### Where pipelock substitutes for parts of bot-bottle
For a single-agent use case on Linux with full unprivileged-userns support,
`pipelock sandbox -- claude` could replace the Docker container with a
@@ -483,7 +483,7 @@ a real isolation primitive, not a fig leaf. A user whose only concern is
"don't let one agent's bug touch my home directory or exfil my keys" could
plausibly run pipelock on the host and skip Docker entirely.
### Where claude-bottle does work pipelock does not
### Where bot-bottle does work pipelock does not
The redundancy argument breaks down once the actual goals from
`CLAUDE.md` are enumerated:
@@ -499,7 +499,7 @@ The redundancy argument breaks down once the actual goals from
filesystem.
2. **Parallel agents.** A primary stated goal is "Allow me to easily spin
up agent tasks in parallel". claude-bottle launches one container per
up agent tasks in parallel". bot-bottle launches one container per
agent invocation with a slug-derived name and a numeric suffix on
conflict. Pipelock has no equivalent fleet-management concept; it is a
per-process wrapper. Running `pipelock sandbox -- claude` four times in
@@ -508,7 +508,7 @@ The redundancy argument breaks down once the actual goals from
keychain. That is not the same property as four containers each with
its own ephemeral filesystem.
3. **The manifest model.** claude-bottle's `claude-bottle.json` carries
3. **The manifest model.** bot-bottle's `bot-bottle.json` carries
per-agent `env`, `skills`, `prompt`, and `ssh` configuration with
precise resolution semantics (prompt-at-launch secrets, host-env
forwarding, literal env-file values, host-key fingerprint pinning).
@@ -530,43 +530,43 @@ The redundancy argument breaks down once the actual goals from
(no UDS forwarding from host agent into the VM) and gives the property
that the `node` user can sign with the key but cannot read its bytes.
Pipelock does not address SSH at all, which is one of its documented
gaps. claude-bottle's solution remains relevant under either deployment.
gaps. bot-bottle's solution remains relevant under either deployment.
6. **Skill-directory injection per agent.** The `skills` array copies named
directories from `~/.claude/skills/` into the container at launch. There
is no analogous concept in pipelock; the skill set claude-bottle exposes
is no analogous concept in pipelock; the skill set bot-bottle exposes
is part of the per-agent isolation model, not just a configuration.
7. **Shareability of agent definitions.** A `claude-bottle.json` file can
7. **Shareability of agent definitions.** A `bot-bottle.json` file can
be checked into a project repo, and a third party can run the same
agent with the same env-resolution rules. Pipelock configurations are
per-installation; they do not encode "this is an agent named X".
### The opposite question
Does claude-bottle make pipelock redundant? Equally no. Docker container
Does bot-bottle make pipelock redundant? Equally no. Docker container
isolation does nothing about content-level exfil over an allowed channel.
A misbehaving agent inside a claude-bottle container with HTTPS access to
A misbehaving agent inside a bot-bottle container with HTTPS access to
`api.anthropic.com` can still attempt to exfiltrate via DNS subdomain
encoding, prompt-injection responses from MCP servers, or covert HTTP
parameters. Those are exactly the threats pipelock is designed to detect.
The containment argument for claude-bottle and the content-inspection
The containment argument for bot-bottle and the content-inspection
argument for pipelock do not overlap.
### Net conclusion
Pipelock and claude-bottle are layered defenses, not alternatives.
claude-bottle provides filesystem isolation, per-agent state ephemerality,
Pipelock and bot-bottle are layered defenses, not alternatives.
bot-bottle provides filesystem isolation, per-agent state ephemerality,
fleet management, manifest-driven configuration, and the SSH-agent-without-
key-leak property. Pipelock provides hostname allowlisting, content-aware
DLP, MCP scanning, and subdomain-entropy DNS exfil detection at the network
boundary. The strongest deployment is both: pipelock as a sidecar on the
container's only egress route, claude-bottle as the per-agent container
container's only egress route, bot-bottle as the per-agent container
orchestrator. Removing either layer leaves a real and named threat
uncovered.
The one scenario in which adopting pipelock could justify retiring
claude-bottle is a single-user, single-agent, host-resident deployment
bot-bottle is a single-user, single-agent, host-resident deployment
where the user is willing to give up the parallel-agent goal, accept
Landlock-level filesystem restriction in place of mount-namespace
isolation, and re-implement env / skill / SSH-key / prompt management
@@ -587,7 +587,7 @@ some other way. That is not the use case the project was built for.
from an unvetted source. Pinning by digest (as the CLAUDE.md recommends
for supply-chain hygiene) and building from source are both options.
3. **What is the actual DLP false-positive rate for the secrets claude-bottle
3. **What is the actual DLP false-positive rate for the secrets bot-bottle
agents use?** The 48 patterns cover well-known credential formats. Custom
patterns can be added but the mechanism (signed rule bundles) is not
documented in detail in public search results. Before v2, testing against
@@ -606,12 +606,12 @@ some other way. That is not the use case the project was built for.
integration work, audit which capabilities cited above are in the
Apache 2.0 core and which require accepting ELv2 terms (which permit
internal use and modification but prohibit offering pipelock as a
managed service). For claude-bottle's local-Docker single-user use case,
managed service). For bot-bottle's local-Docker single-user use case,
ELv2 is likely acceptable, but the determination should be explicit.
6. **Can pipelock's YAML config be generated per-agent from the manifest in
a way that handles the `ssh` array correctly?** The `ssh` array in
`claude-bottle.json` contains hostnames, ports, and `KnownHostKey` entries.
`bot-bottle.json` contains hostnames, ports, and `KnownHostKey` entries.
These need to be mapped to pipelock's `api_allowlist` (for HTTP) and
potentially to a separate bypass for the SSH socket. SSH is opaque to the
HTTP proxy and does not go through `HTTPS_PROXY`; the allowlist entry is
+7 -7
View File
@@ -1,7 +1,7 @@
# Closing the maturity gap: polish priorities
Research into what would close the perceived maturity gap between
claude-bottle and a "polished" comparable like claudebox. Motivated
bot-bottle and a "polished" comparable like claudebox. Motivated
by adopter feedback citing first-run friction, manifest authoring,
and distribution as the dominant obstacles to recommending the tool
to others.
@@ -33,11 +33,11 @@ on top of working onboarding.
### Onboarding friction
A first-time user today goes through five steps: install Docker,
install `uv`, set `CLAUDE_BOTTLE_OAUTH_TOKEN`, write
`claude-bottle.json`, run `./cli.py start`. One of those is
install `uv`, set `BOT_BOTTLE_OAUTH_TOKEN`, write
`bot-bottle.json`, run `./cli.py start`. One of those is
"author a JSON manifest." Polished tools in this category let
users skip that step on day one. The fix is an `init` subcommand
that drops a working `claude-bottle.json` with a default `coder`
that drops a working `bot-bottle.json` with a default `coder`
bottle/agent into the user's home directory and prints the next
command to run.
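
One possible shape for such an `init` step is sketched below. Everything beyond the top-level `bottles` / `agents` keys is an assumption: the fields inside each entry, the `init_manifest` name, and the refuse-to-overwrite behavior are illustrative and not taken from the project's schema.

```python
import json
from pathlib import Path

# Hypothetical default manifest: the top-level `bottles` / `agents` keys come
# from these notes; the fields inside each entry are illustrative only.
DEFAULT_MANIFEST = {
    "bottles": {"coder": {}},
    "agents": {"coder": {"bottle": "coder", "skills": [], "prompt": ""}},
}


def init_manifest(home):
    """Write a starter bot-bottle.json into `home` unless one already exists.

    Returns (path, created) so the caller can decide what to print.
    """
    path = Path(home) / "bot-bottle.json"
    if path.exists():
        return path, False  # never clobber a user's manifest
    path.write_text(json.dumps(DEFAULT_MANIFEST, indent=2) + "\n")
    return path, True


if __name__ == "__main__":
    manifest_path, created = init_manifest(Path.home())
    if created:
        print(f"wrote {manifest_path}")
        print("next: ./cli.py start coder")
    else:
        print(f"{manifest_path} already exists; leaving it untouched")
```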
@@ -45,14 +45,14 @@ command to run.
Missing Docker, missing OAuth token, manifest typo, image build
failure — each should print a one-line fix rather than a stack
trace. claudebox handles this well; claude-bottle currently exits
trace. claudebox handles this well; bot-bottle currently exits
on `die()` calls that vary in helpfulness. A focused pass over
every `die()` site, ensuring each message says what failed *and*
what to do, is cheap and compounds across every user interaction.
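
As a sketch of what that pass could converge on, the helper below forces every call site to supply both halves of the message. The two-argument signature and the `format_failure` / `require_docker` names are assumptions for illustration; the existing `die()` in `cli.py` may look different.

```python
import shutil
import sys


def format_failure(what_failed, how_to_fix):
    """Render one line that says what failed and what to do about it."""
    return f"error: {what_failed} (fix: {how_to_fix})"


def die(what_failed, how_to_fix, exit_code=1):
    """Print a one-line failure with its fix to stderr and exit, no traceback."""
    print(format_failure(what_failed, how_to_fix), file=sys.stderr)
    raise SystemExit(exit_code)


def require_docker():
    """Example call site: fail early when the docker CLI is missing."""
    if shutil.which("docker") is None:
        die(
            "docker was not found on PATH",
            "install Docker, then re-run ./cli.py start",
        )
```

Making the fix hint a required positional argument means a call site cannot be written without one, which turns the audit into something the interpreter helps enforce.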
### Distribution
`brew install claude-bottle` or `curl | sh`, not "clone the repo,
`brew install bot-bottle` or `curl | sh`, not "clone the repo,
install Python deps, `chmod cli.py`." The single highest-leverage
polish item, and the one that interacts with the language choice
covered in `bash-vs-python-vs-go.md`. Staying on Python means a
@@ -69,7 +69,7 @@ small; the signal value is large.
### Schema
A JSON schema for `claude-bottle.json` published with a `$schema`
A JSON schema for `bot-bottle.json` published with a `$schema`
URL gives VSCode and Cursor users autocomplete and inline
validation. ~½ day to author the schema, plus a few hours to
publish it where editors can fetch it.
+15 -15
View File
@@ -1,7 +1,7 @@
# Remote Docker VM as an isolation upgrade for claude-bottle
# Remote Docker VM as an isolation upgrade for bot-bottle
Note on the cheapest practical path to stronger isolation than local
Docker: run claude-bottle unchanged on a remote Linux VM that has
Docker: run bot-bottle unchanged on a remote Linux VM that has
dockerd. Complements `stronger-isolation-alternatives.md` (which
surveys runtime swaps like gVisor, Kata, Firecracker, Apple Container)
and `local-vs-remote-agent-execution.md` (which surveys the
@@ -10,7 +10,7 @@ local-vs-remote decision broadly).
## Summary
If the goal is "stronger isolation than Docker-on-my-laptop without
rewriting the runtime," the cleanest answer is to keep claude-bottle
rewriting the runtime," the cleanest answer is to keep bot-bottle
exactly as it is and run it on a remote Linux VM where you can install
dockerd. The v1 design — pipelock as a separate container on a
`--internal` network, ephemeral agent containers, OAuth-token
@@ -91,7 +91,7 @@ work:
may not allow installing dockerd depending on tier; Fly Machines,
EC2, GCE, Hetzner, Linode, and self-hosted hypervisors give you full
control.
- Enough disk + RAM to host the claude-bottle image, the agent
- Enough disk + RAM to host the bot-bottle image, the agent
  container, and the pipelock sidecar. Headroom of ~2-4 GB RAM and
~5 GB disk is comfortable; less works for short sessions.
- An interactive reach path. SSH is fine. The launcher uses
@@ -102,7 +102,7 @@ work:
- **Typing latency.** Interactive Claude sessions over SSH have visible
per-keystroke latency; usually fine on wired/fiber, less fine on
Wi-Fi-to-cloud. Mosh helps if it's bothersome.
- **Token shipping.** `CLAUDE_BOTTLE_OAUTH_TOKEN` has to live on the
- **Token shipping.** `BOT_BOTTLE_OAUTH_TOKEN` has to live on the
remote box for the launcher to forward it into containers. Use the
provider's secret-injection path (cloud-init user-data,
`flyctl secrets`, Tailscale-served local file, etc.). Never echo the
@@ -122,15 +122,15 @@ work:
## Operational shape
The minimum-viable workflow, no claude-bottle code changes:
The minimum-viable workflow, no bot-bottle code changes:
1. `terraform apply` / `flyctl machine run` / `gcloud compute
instances create` — provision a fresh Linux VM.
2. Install dockerd via the provider's image or a one-liner
(`curl -fsSL https://get.docker.com | sh`).
3. SSH in.
4. `git clone` claude-bottle on the VM, drop a manifest in place,
inject `CLAUDE_BOTTLE_OAUTH_TOKEN` via the provider's secrets path.
4. `git clone` bot-bottle on the VM, drop a manifest in place,
inject `BOT_BOTTLE_OAUTH_TOKEN` via the provider's secrets path.
5. `./cli.py start <agent>` — the existing launcher handles the rest.
6. On exit: destroy the VM. No host artifacts persist.
@@ -150,14 +150,14 @@ gotchas the abstract pattern leaves implicit.
Build a custom OCI image `FROM docker:dind` that bakes in:
- The claude-bottle repository checkout.
- A pre-built `claude-bottle:latest` image, saved via `docker save` on
- The bot-bottle repository checkout.
- A pre-built `bot-bottle-claude:latest` image, saved via `docker save` on
your laptop and loaded in at image-build time
(`RUN docker load < claude-bottle.tar`) or pushed as a layer into
(`RUN docker load < bot-bottle.tar`) or pushed as a layer into
the dind storage. Without this step, the first in-VM `docker build`
runs `apt-get` and a global `npm install -g
  @anthropic-ai/claude-code`, which adds 30-90 s to every cold start.
- A `flyctl secrets`-injected `CLAUDE_BOTTLE_OAUTH_TOKEN`, exposed to
- A `flyctl secrets`-injected `BOT_BOTTLE_OAUTH_TOKEN`, exposed to
the VM's PID 1 as an env var.
- An entrypoint that starts dockerd, waits for it to be healthy, then
either drops into a shell or directly runs `cli.py start <agent>`.
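
The entrypoint's "wait for dockerd to be healthy" step could be sketched roughly as follows. This assumes that `docker info` exiting 0 is an acceptable health signal. The helper names (`dockerd_is_up`, `wait_for`), the 30-second timeout, and the polling interval are illustrative assumptions, not part of the existing launcher.

```python
import subprocess
import time
from typing import Callable


def dockerd_is_up() -> bool:
    """Return True when `docker info` exits 0, i.e. the daemon is answering."""
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # The docker CLI is not on PATH, so the daemon cannot be reached this way.
        return False
    return result.returncode == 0


def wait_for(
    probe: Callable[[], bool],
    timeout_s: float = 30.0,      # assumed budget, tune per image
    interval_s: float = 0.5,      # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical entrypoint usage: give up with a non-zero exit if dockerd never answers.
    raise SystemExit(0 if wait_for(dockerd_is_up) else 1)
```

Taking `sleep` and `clock` as parameters keeps the polling loop testable without a real daemon or real delays.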
@@ -166,7 +166,7 @@ Deploy with `flyctl deploy` or `flyctl machine run --image …`.
### Boot-to-first-prompt timing
Three scenarios, all assuming the custom image above (claude-bottle
Three scenarios, all assuming the custom image above (bot-bottle
image baked in, token injected, no in-VM rebuild):
| Phase | Cold (image not cached on Fly host) | Warm (image cached, `machine run` fresh) | Hot (`machine stop`ped, `machine start`) |
@@ -186,7 +186,7 @@ is mostly about cost, not speed.
### Cost of standby vs. create-per-session
Stopped Fly Machines stop billing CPU/RAM but continue to bill for
storage and any allocated IPv4. A reasonable claude-bottle Machine
storage and any allocated IPv4. A reasonable bot-bottle Machine
size (2 vCPU / 2 GB / ~3 GB rootfs) costs roughly:
| Item | While stopped | Monthly |
@@ -275,7 +275,7 @@ overhead in any given setting.
If we wanted to land this as a real project direction:
1. Add a short "Running claude-bottle on a remote Docker VM" section
1. Add a short "Running bot-bottle on a remote Docker VM" section
to the README pointing at this doc.
2. Optionally: prototype the `--remote=user@host` launcher subcommand.
3. Update `stronger-isolation-alternatives.md` to mention the remote
@@ -35,7 +35,7 @@ which naive encoding a misbehaving agent uses.
explicitly does not decode base64 or ROT13 in all inspection paths due
to processing overhead concerns.
- Rolling this in ~100 lines of Python is feasible and is probably the right
path for claude-bottle v1. The limiting factor is not the encoding logic
path for bot-bottle v1. The limiting factor is not the encoding logic
— that is straightforward — but the false-positive rate from common
base64 alphabet collisions and the zero coverage against any re-encoding
that involves a key (encryption) or destroys byte boundaries (packet
@@ -385,7 +385,7 @@ involves:
- Steganography (hiding data in image pixels, audio samples, whitespace),
...will produce output with no detectable relationship to the input plaintext.
For claude-bottle's threat model — an untrusted but not adversarially
For bot-bottle's threat model — an untrusted but not adversarially
sophisticated agent that might accidentally or naively encode secrets before
logging or posting them — naive-encoding detection covers the realistic concern.
A deliberately adversarial agent with knowledge of its own monitoring is a
@@ -393,9 +393,9 @@ different threat model entirely.
---
## Integration point in claude-bottle
## Integration point in bot-bottle
The natural choke point for encoding-aware content inspection in claude-bottle
The natural choke point for encoding-aware content inspection in bot-bottle
is the body of outbound HTTP/HTTPS requests, visible at the egress proxy layer.
The egress-guard note documents smokescreen (Stripe,
[github.com/stripe/smokescreen](https://github.com/stripe/smokescreen))
@@ -450,7 +450,7 @@ and pre-computable. The following techniques defeat it entirely:
intercepted by an HTTP proxy at all. The egress-guard note covers DNS
separately via a controlled resolver.
In the claude-bottle context, the primary realistic concern is an agent
In the bot-bottle context, the primary realistic concern is an agent
that naively embeds a secret in a log line, a curl argument, a JSON body,
or a shell heredoc without specifically intending to obfuscate. All of the
above bypass techniques require deliberate, adversarially-motivated engineering
@@ -13,7 +13,7 @@ existing tools in that space).
## Summary
claude-bottle's v1 egress story is: pipelock allowlists hostnames,
bot-bottle's v1 egress story is: pipelock allowlists hostnames,
intercepts TLS, body-scans every request against 48 builtin DLP
patterns, and blocks on hit. Gitleaks does the analog on `git push`.
Both are signature-based. Against a *determined* compromised or
@@ -79,7 +79,7 @@ The agent's conversation channel is therefore wide open as an exfil
path. A prompt-injected agent that has been told a secret can ship
it to Anthropic as conversation text, formatted however it likes,
and pipelock sees only `CONNECT api.anthropic.com:443`. The
`CLAUDE_BOTTLE_OAUTH_TOKEN` itself rides this exact path.
`BOT_BOTTLE_OAUTH_TOKEN` itself rides this exact path.
### 3. Out-of-band channels exist regardless
@@ -134,7 +134,7 @@ per-bottle gate that:
Two concrete instances worth implementing:
**Anthropic-API gate.** Holds `CLAUDE_BOTTLE_OAUTH_TOKEN`. Agent's
**Anthropic-API gate.** Holds `BOT_BOTTLE_OAUTH_TOKEN`. Agent's
`ANTHROPIC_BASE_URL` points at the gate; gate injects
`Authorization: Bearer …` and forwards to api.anthropic.com. The
token is no longer in the bottle's env. Once the token is out,
+1 -1
@@ -1,4 +1,4 @@
# smolmachines as a VM backend for claude-bottle
# smolmachines as a VM backend for bot-bottle
Evaluation of whether [smolmachines](https://smolmachines.com/) would
simplify the macOS agent-VM-isolation work spelled out in
@@ -1,7 +1,7 @@
# Stronger isolation alternatives: gVisor, Kata, Firecracker, Apple Container
Research into what it would take to replace or augment Docker (with `runc`)
as the agent runtime in claude-bottle, and what each option would actually
as the agent runtime in bot-bottle, and what each option would actually
buy in security terms vs. cost in launcher rewrite.
## Summary
@@ -17,7 +17,7 @@ There is a ladder, not a menu. Three realistic rungs, ordered by effort:
A fourth option, **Apple Container**, is the right macOS-native answer to
"I want Kata's isolation model without giving up MacBooks as the dev
target." Probably the right v2 if claude-bottle keeps macOS in scope.
target." Probably the right v2 if bot-bottle keeps macOS in scope.
The pipelock egress design is portable across all four: every option can
provide a network primitive that means "no default route except through
@@ -54,11 +54,11 @@ forwarded to the host kernel.
### What changes in this codebase
- `claude_bottle/cli/start.py` (where `docker run` is assembled): add
- `bot_bottle/cli/start.py` (where `docker run` is assembled): add
`--runtime=runsc` to the container args when the bottle requests it.
Make it configurable: `bottles.<name>.runtime: "runsc" | "runc"`,
default `runc`.
- `claude_bottle/docker.py`: add a `require_runsc()` check that runs
- `bot_bottle/docker.py`: add a `require_runsc()` check that runs
`docker info --format '{{.Runtimes}}'` once and dies with an install
pointer if `runsc` isn't registered.
- `network.py`, `pipelock.py`, `skills.py`, `ssh.py`: **no changes**.
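
A minimal sketch of the two gVisor changes listed above, under stated assumptions. It queries `docker info --format '{{json .Runtimes}}'` rather than the plain `{{.Runtimes}}` form named in the list, because the JSON form is easier to parse. The helper names `runtime_args` and `has_runtime`, and the exact error wording, are hypothetical. Only `require_runsc()` and the `runc`/`runsc` setting come from the text.

```python
import json
import subprocess
import sys


def runtime_args(runtime: str = "runc") -> list[str]:
    """Map the per-bottle `runtime` setting to extra `docker run` arguments."""
    if runtime not in ("runc", "runsc"):
        raise ValueError(f"unsupported runtime: {runtime!r}")
    # runc is Docker's default, so it needs no flag at all.
    return ["--runtime=runsc"] if runtime == "runsc" else []


def has_runtime(runtimes_json: str, name: str) -> bool:
    """Check a JSON object of registered runtimes for `name`."""
    try:
        runtimes = json.loads(runtimes_json)
    except json.JSONDecodeError:
        return False
    return isinstance(runtimes, dict) and name in runtimes


def require_runsc() -> None:
    """Exit with an install pointer if Docker has no `runsc` runtime registered."""
    try:
        result = subprocess.run(
            ["docker", "info", "--format", "{{json .Runtimes}}"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        sys.exit("docker CLI not found on PATH")
    if result.returncode != 0 or not has_runtime(result.stdout, "runsc"):
        sys.exit(
            "runsc is not registered with Docker; "
            "install gVisor and register the runsc runtime first"
        )
```

In this sketch the launcher would call `require_runsc()` once, and only when a bottle actually requests `runsc`, so hosts that stay on `runc` never pay for the check.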
@@ -111,7 +111,7 @@ Docker network.
- Slower cold start (hundreds of ms vs. tens). For interactive Claude
this is fine; for ephemeral batch agents you would notice.
- Not natively supported on macOS at all — needs a Linux host or a Linux
VM you control. **This is the moment claude-bottle stops being "works
VM you control. **This is the moment bot-bottle stops being "works
on a Mac dev laptop with Docker Desktop."**
### When this is the right rung
@@ -138,18 +138,18 @@ replacing Docker, not adding to it.
### Files in this repo that would change
- `claude_bottle/docker.py` → replaced by a new `claude_bottle/firecracker.py`
- `bot_bottle/docker.py` → replaced by a new `bot_bottle/firecracker.py`
that POSTs to the Firecracker API socket per microVM (`/boot-source`,
`/drives`, `/network-interfaces`, `/actions`).
- `claude_bottle/network.py` → a host-side networking module that creates
- `bot_bottle/network.py` → a host-side networking module that creates
a Linux bridge per agent, two TAPs (agent-side, pipelock-side), and
either iptables rules or no host route at all so the agent VM
literally cannot reach anything except pipelock.
- `claude_bottle/pipelock.py` → instead of a sidecar container, run
- `bot_bottle/pipelock.py` → instead of a sidecar container, run
pipelock as its own microVM (or on the host pinned to the bridge).
The hostname-allowlist semantics carry over; the implementation is
different.
- `claude_bottle/skills.py`, `ssh.py` → can no longer use `docker cp`.
- `bot_bottle/skills.py`, `ssh.py` → can no longer use `docker cp`.
Bake skills into the rootfs at build time, or mount a virtiofs share
read-only.
- `Dockerfile` → replaced by a rootfs builder. Realistically this means
@@ -221,7 +221,7 @@ scope and the manifest example carries `/Users/didericis` paths:
and look at Apple Container. Smaller launcher rewrite than
Firecracker; Linux stays on the gVisor / Kata path. Probably the
right v2.
3. **Firecracker only if** claude-bottle's deployment target settles on
3. **Firecracker only if** bot-bottle's deployment target settles on
self-hosted Linux, not laptops — at which point the "non-goal:
self-hosted VMs" line in `CLAUDE.md` flips and the project's
identity changes.