0693107dd6a67178e752668ed4f18bb4bdcff184
60 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
bcadc07d09 | feat(pipelock): allow route tls passthrough policy | ||
|
|
1cbedc91c0 |
refactor(agent): use agent-neutral runtime names
Assisted-by: Codex |
||
|
|
c08b09dc9f |
refactor!: rename project to bot-bottle
Assisted-by: Codex |
||
|
|
59ee32cc8d | refactor(manifest): key git config by host | ||
|
|
d7cef27584 |
feat(smolmachines): PRD 0022 sandbox-escape suite green under smolmachines (PRD 0023 chunk 5)
Final PRD 0023 chunk. The PRD 0022 attack suite was already backend-agnostic — it goes through get_bottle_backend(), so the right dispatch happens based on CLAUDE_BOTTLE_BACKEND. Two cleanups to make it actually run cleanly under CLAUDE_BOTTLE_BACKEND=smolmachines: - setUpClass raises unittest.SkipTest with a useful message when CLAUDE_BOTTLE_BACKEND=smolmachines but smolvm isn't on PATH, or when the host isn't macOS (libkrun + TSI single-IP allowlist is macOS-only in v1). Without this, the test would die deep inside backend.prepare's smolmachines_preflight rather than skipping. - test_5_readme_push_blocked switches from a hardcoded `git://git-gate/...` remote URL (only resolvable on docker via the bundle's short alias) to the bottle's declared upstream URL (`ssh://git@unreachable.invalid:22/throwaway.git`). The agent's ~/.gitconfig insteadOf rewrite — set up by provision_git on both backends — transparently redirects to the gate, so the same test exercises docker's `git://git-gate/...` and smolmachines's `git://<bundle_ip>:9418/...` URLs without branching on backend. README gets a "Backend selection" subsection under Quickstart documenting CLAUDE_BOTTLE_BACKEND, the macOS-only v1 scope for smolmachines, and the `curl -sSL .../install.sh | sh` install prerequisite — per PRD 0023's acceptance criteria. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
519a71f2e7 |
refactor(docker): drop legacy names from capability_apply teardown
Last of the per-sidecar legacy names. `_per_bottle_container_names` used to list the four pre-bundle sidecars (cred-proxy, pipelock, git-gate, supervise) so capability-apply's teardown would force-rm them on remediation. None of those containers exist anymore — the four daemons run in the sidecar bundle (PRD 0024), so the list collapses to the agent + the bundle. Integration test follows: the fake supervise-sidecar setup, which existed to give teardown an extra container to clean up, switches to a fake sidecar bundle with the current name. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
727f30d422 |
refactor(docker): drop legacy per-sidecar container_name functions
Same line of cleanup as the supervise rename: the per-sidecar container names (`claude-bottle-pipelock-<slug>`, `claude-bottle-egress-<slug>`, `claude-bottle-git-gate-<slug>`) were docker-network aliases pointing at the bundle, kept so legacy URLs would keep resolving. Replaces them with short hostnames (`pipelock`, `egress`, `git-gate`) matching the existing `EGRESS_HOSTNAME` pattern, and inlines the bundle-loopback URL (`http://127.0.0.1:8888`) for the in-bundle egress→pipelock hop — matching what smolmachines already does. Drops the three `*_container_name` functions, `pipelock_proxy_url`, and `git_gate_host`. Their callers move to the new constants: - `PIPELOCK_HOSTNAME = "pipelock"` (claude_bottle/pipelock.py) - `GIT_GATE_HOSTNAME = "git-gate"` (claude_bottle/git_gate.py) - `BUNDLE_LOCAL_PIPELOCK_URL` (backend/docker/pipelock.py) The agent's HTTP_PROXY now reads `http://pipelock:8888` (vs the old `http://claude-bottle-pipelock-<slug>:8888`); the gitconfig insteadOf rewrites become `git://git-gate/<repo>.git`. The prepare- time orphan probe is collapsed onto the bundle container name (`claude-bottle-sidecars-<slug>`) instead of the four legacy per-sidecar names that no backend creates anymore. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
73dc0d4a40 |
refactor(sidecars): instantiate sidecar ABCs directly from any backend
The four sidecar prepare-time helpers (PipelockProxy, Egress, GitGate, Supervise) had docker-flavored subclasses that existed only as instantiation shims for ABCs that already had no abstract methods. PipelockProxy.prepare() reached for class-level CA path constants that were only defined on the docker subclass — so smolmachines had to import DockerPipelockProxy to render pipelock yaml, reaching across the backend boundary for what's actually a platform-neutral operation. This moves the universal in-container CA paths (PIPELOCK_CA_CERT_IN_CONTAINER / PIPELOCK_CA_KEY_IN_CONTAINER) to claude_bottle/pipelock.py, drops the class-attr indirection on the ABC, and deletes the four empty docker subclasses. Both backends now instantiate the ABCs directly; the docker-side modules keep the docker-flavored helpers (image pin, container naming, host CA mint) and re-export the moved pipelock constants for compat. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
1dfc359141 |
feat(smolmachines): thread inner Plans + bundle daemons run (PRD 0023 chunk 4b)
Bundle daemons (pipelock, egress, optionally git-gate + supervise)
now actually start with their config files bind-mounted from the
inner Plans the docker backend already produces. Chunks 2d + 3
ran with daemons_csv="" so the bundle's init supervisor idled;
chunk 4b wires up the real path: agent → pipelock → egress →
internet (when routes declared) is now functional, modulo agent-
image gaps (claude-code / TLS-trust-store / git in the guest)
that chunk 4c addresses.
bottle_plan.py — added the four inner Plan fields:
proxy_plan: PipelockProxyPlan
git_gate_plan: GitGatePlan
egress_plan: EgressPlan
supervise_plan: SupervisePlan | None
Same shape the docker backend's plan uses. Docker-network-only
fields (internal_network, egress_network) stay at dataclass
defaults — the smolmachines bundle is on a per-bottle bridge
with a pinned IP, not docker's --internal + egress topology.
prepare.py — instantiates DockerPipelockProxy / DockerEgress /
DockerGitGate / DockerSupervise and calls their .prepare()
methods to write the per-bottle config files (pipelock.yaml,
routes.yaml, git-gate entrypoint/hooks, supervise queue dir)
under the per-bottle state dir. (The "Docker" prefix on the
class names is a misnomer here — .prepare() is platform-neutral,
inherited from each sidecar's ABC. A future cleanup could factor
the prepare logic out of the docker subpackage.)
launch.py — major rewrite:
- pipelock_tls_init at launch (always); egress_tls_init only
when the bottle declares routes (otherwise the CA files
aren't bind-mounted and openssl runs would be wasted).
- Inner Plans updated in place with launch-time CA paths +
EGRESS_UPSTREAM_PROXY = http://127.0.0.1:8888 (egress's
upstream is pipelock on the bundle's own loopback; same
container's network namespace).
- BundleLaunchSpec env + volumes built from the inner Plans:
pipelock.yaml + CA + key (always); egress routes + CAs +
upstream env + token-slot bare names (when routes); git-gate
entrypoint + hooks + per-upstream identity files (when
upstreams); supervise queue dir + env (when enabled).
- daemons_csv = ["egress", "pipelock"] + ["git-gate"] (if
upstreams) + ["supervise"] (if enabled).
- Token env values resolved from host env via
`egress_resolve_token_values` and threaded into the
docker-run subprocess env (bare-name -e entries in spec
inherit from there — values never land on argv).
Tests:
- 552 unit passing (no new unit cases; fixture updated to
populate the new plan fields).
- 5 integration cases passing locally (Darwin + smolvm + docker
+ not GITEA_ACTIONS):
* test_smoke_exec_echo — still works.
* test_localhost_reach_probe — host loopback still refused.
* test_egress_port_bypass_probe — <bundle-ip>:9099 still
refused, NOW WITH EGRESS ACTUALLY RUNNING (chunk 3's
127.0.0.1 bind-address is doing its job).
* test_prompt_file_lands_in_guest — still works.
* test_pipelock_answers_on_bundle_ip — NEW. From inside the
guest, wget to <bundle-ip>:8888 gets an HTTP response
(not "connection refused") — proves pipelock is actually
listening and the bind-mount + CA generation path works.
What's left in chunk 4:
- 4c: agent-image-conversion (claude-code + git + curl +
ca-certificates in the guest). Chunk 2d's alpine placeholder
stays for now.
- 4d: provision_ca + provision_git + provision_supervise once
the agent image has the required tools.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
9e3b7e441e |
feat(smolmachines): provision_prompt + provision_skills (PRD 0023 chunk 4a)
First slice of chunk 4: implement the two provisioning methods
that don't depend on agent-image tooling beyond `cp` and
`mkdir`. provision_ca / provision_git / provision_supervise
land once the agent-image gap is solved (chunk 4b+) — they need
update-ca-certificates, git, and the claude binary respectively,
none of which the chunk-2d alpine placeholder provides.
What this PR ships:
- `claude_bottle/backend/smolmachines/provision/` subpackage
with `prompt.py` + `skills.py`. Each routes through
`smolvm.machine_cp` / `machine_exec`. provision_prompt mirrors
the docker contract (file always copied; return value drives
--append-system-prompt-file iff the agent has a non-empty
prompt). provision_skills mkdir + cp per skill, matching
the docker backend's loop.
- prepare.py now writes the prompt file under
agent_state_dir(slug) with the agent's `prompt` body, mode
0o600. The in-guest path is `/root/.claude-bottle-prompt.txt`
(alpine has no `node` user; will become `/home/node/...` once
the real claude-bottle image lands).
- launch.py calls `provision(plan, machine_name)` after
machine_start. The returned prompt path threads to
SmolmachinesBottle so exec_claude can add
--append-system-prompt-file when the agent has a prompt.
- backend.py: provision_prompt / provision_skills now real;
provision_git is a deliberate stub (waiting on the git-gate
inner Plan + git in the agent image). provision_supervise
stays the chunk-2d stub.
Tests:
- 7 new unit cases (test_smolmachines_provision.py): argv
shape (mocked smolvm.machine_cp / .machine_exec),
prompt return-value contract, no-op-with-no-skills,
CLAUDE_BOTTLE_GUEST_SKILLS_DIR override, fail-on-missing-skill.
- 1 new integration case in test_smolmachines_launch.py:
end-to-end verification that the prompt file lands in the
alpine guest at /root/.claude-bottle-prompt.txt with the
expected content (via `bottle.exec("cat ...")`). The smoke +
the two TSI probes stay green.
552 unit + 4 integration (Darwin+smolvm+docker gated) passing.
What's left in chunk 4:
- 4b: thread the inner Plans (PipelockProxyPlan / EgressPlan /
GitGatePlan / SupervisePlan) through prepare + launch so the
bundle daemons actually run (currently daemons_csv="").
- 4c: the agent-image-conversion gap — get claude-code + git +
curl + ca-certificates into the guest image (build a
.smolmachine via `pack create --from-vm` after manual setup,
or push the docker image to a registry smolvm can pull).
- 4d: provision_ca + provision_git + provision_supervise once
4b + 4c land.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
9f65b137b9 |
feat(smolmachines): end-to-end launch + Bottle.exec + smoke + probes (PRD 0023 chunk 2d)
End-to-end launch flow for the smolmachines backend. Brings up
the per-bottle docker bridge + sidecar bundle, creates and
starts the smolvm guest pointed at the bundle's pinned IP via
TSI's `--allow-cidr <bundle-ip>/32`, yields a SmolmachinesBottle
handle that routes exec/cp through `smolvm machine exec / cp`,
tears everything down on context exit.
launch.py:
- ExitStack-managed: create_bundle_network → start_bundle →
machine_create → machine_start (each registered for reverse
teardown).
- daemons_csv="" for chunk 2d — bundle init logs "no daemons
selected" and idles. Real daemon bringup with inner-Plan-driven
env + volumes lands in chunk 4.
bottle.py:
- SmolmachinesBottle.exec → smolvm.machine_exec (captured).
- SmolmachinesBottle.exec_claude → direct subprocess.run with
inherited TTY for interactive sessions.
- SmolmachinesBottle.cp_in → smolvm.machine_cp.
Architecture pivots forced by smolvm 0.8.0's CLI shape:
1. `--from <smolmachine>` and `--smolfile <toml>` are MUTUALLY
EXCLUSIVE in smolvm 0.8.0. We need --from to avoid the
registry-pull race that bit us on machine_start (libkrun
agent's network attempt got refused by macOS with
"connect: permission denied" on IPv6). So Smolfile is dropped
entirely; per-bottle env + allow_cidrs flow as CLI flags
(`--allow-cidr CIDR`, `-e K=V`) directly to machine_create.
2. `smolvm pack create --image` doesn't pull from the local
docker daemon — only OCI registries via crane. The real
claude-bottle:latest image lives in the local docker daemon
and isn't reachable that way. Chunk 2d ships with an alpine
placeholder; the agent-image-conversion gap belongs to
chunk 4 (push the image to a registry, or smolvm grows a
docker-daemon transport).
Other changes:
- machine_create grew `image=` / `from_path=` / `allow_cidrs=`
/ `env=` kwargs; smolfile= dropped.
- bottle_plan: smolfile_path → agent_from_path + guest_env.
- prepare: pack_create against `alpine:latest`, cached under
~/.cache/claude-bottle/smolmachines/ keyed by image ref.
- Deleted smolfile.py + test_smolfile.py (dead code now).
Tests:
- Unit: 540 passing (smolvm wrapper grew 4 new flag forms; one
test renamed to reflect --from + --allow-cidr + -e combo).
- Integration: 3 new cases in tests/integration/
test_smolmachines_launch.py, gated on Darwin + smolvm on PATH
+ docker + not GITEA_ACTIONS:
* smoke: bottle.exec("echo hello-from-vm") round-trips with
the correct stdout + returncode.
* localhost-reach probe: agent dials 127.0.0.1:9 → connect
refused (TSI's <bundle-ip>/32 allowlist doesn't include
loopback). The regression test for the gap the PRD design
pivot was about.
* egress-port-bypass probe: agent dials <bundle-ip>:9099
(egress's port) → connect refused. Chunk 2d has no
daemons running so nothing's listening anyway; chunk 3
will preserve this property once egress is up but bound
to 127.0.0.1 inside the bundle.
End-to-end smoke + both probes green locally on macOS with
smolvm 0.8.0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
495be7f9c0 |
feat(smolmachines): bundle bringup on per-bottle docker bridge (PRD 0023 chunk 2c)
claude_bottle/backend/smolmachines/sidecar_bundle.py — primitives for the per-bottle bridge + bundle container with pinned IP: - bundle_network_name(slug) / bundle_container_name(slug) - create_bundle_network(name, subnet, gateway) - remove_bundle_network(name) - start_bundle(BundleLaunchSpec, env=) - stop_bundle(slug) `BundleLaunchSpec` carries the launch-time fields (network + subnet + gateway + bundle_ip + daemons_csv + environment + volumes). Wiring it up from the inner Plans (PipelockProxyPlan, EgressPlan, GitGatePlan, SupervisePlan) is chunk 2d's job; this module is the docker-argv surface only. Pinning the bundle IP via `docker run --ip <bundle-ip>` is what makes smolvm's TSI allowlist (`<bundle-ip>/32`) safe to compute at prepare time — without pinning, we'd have to inspect the assigned IP after start and feed it back into the Smolfile. Idempotent semantics where it matters: `create_bundle_network` treats "already exists" as success, `remove_bundle_network` + `stop_bundle` treat "no such ..." as success. Other failures die / warn depending on whether the launch flow can recover. Tests: - 15 unit cases (mocked subprocess.run): argv shape for create / remove / start / stop, idempotent paths, host-env inheritance to docker run subprocess. - 1 integration case (real docker daemon, gated on docker available + not GITEA_ACTIONS): end-to-end bringup of an empty-daemons bundle on a 192.168.211.0/24 bridge, confirms the container lands at the pinned IP. Skipped if the claude-bottle-sidecars:latest image isn't built (operator hasn't run a docker bottle yet). 546 unit tests passing. Real-docker bundle bringup green locally. Launch wiring + provisioning + PRD 0022 acceptance probes land in chunk 2d. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
9c333bc130 |
feat(smolmachines): smolvm subprocess wrapper (PRD 0023 chunk 2b)
claude_bottle/backend/smolmachines/smolvm.py — one thin Python
function per smolvm CLI subcommand the launch flow needs:
- pack_create(image, output) → smolvm pack create
- machine_create(name, from_path,
smolfile) → smolvm machine create
- machine_start(name) → smolvm machine start
- machine_stop(name) → smolvm machine stop
- machine_delete(name) → smolvm machine delete -f
- machine_exec(name, argv, env,
workdir, timeout) → smolvm machine exec
- machine_cp(src, dst) → smolvm machine cp
- is_available() → shutil.which check
The wrapper hides the CLI's inconsistent name-flag style
(positional NAME on create/delete, --name on start/stop/exec/
status) behind a uniform `name=` kwarg.
Two return shapes:
- SmolvmRunResult (returncode + stdout + stderr) from
machine_exec, because callers care about the in-VM
command's exit code.
- Raises SmolvmError on non-zero for all other commands;
failure to create/start/stop a VM is fatal to the launch
flow, not branched on.
Tests:
- 15 unit cases mocking subprocess.run, covering argv shape
per subcommand (the --name vs positional inconsistency
locked down), SmolvmError on non-zero for non-exec paths,
SmolvmRunResult passthrough on exec, empty-path cp no-op.
- 2 integration cases against the real smolvm binary
(gated on Darwin + smolvm on PATH + not GITEA_ACTIONS):
smolvm --help responds, machine ls --json parses as a
list (the contract chunk 4's list_active will consume).
531 unit tests passing. Real-smolvm smoke green locally.
Bundle bringup + launch wiring + the localhost-reach /
egress-port-bypass probes land in chunks 2c + 2d.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
5b9ceaaaee |
fix(sidecars): per-daemon pipelock restart keeps supervise socket alive
`apply_allowlist_change` used `docker restart <bundle>` to make
pipelock reload, which bounced ALL four daemons — including
supervise, whose MCP socket the agent's claude-code client had
open. That dropped the connection. A second apply works because
supervise has come back up by then.
Fix: per-daemon restart via SIGUSR1.
- New `_Supervisor.restart_daemon(name)` terminates one named
child and spawns a replacement in place. Other daemons keep
running.
- main() wires SIGUSR1 → `restart_daemon("pipelock")`. Pipelock
has no in-process reload, so this is its analog of egress's
SIGHUP-reload-addon path. Pipelock is the only daemon that
currently needs hot-config reload via restart; if others
acquire the need, add a new signal.
- `apply_allowlist_change` now `docker kill --signal USR1
<bundle>` instead of `docker restart`. Supervise / egress /
git-gate keep running across the apply.
Tests:
- New `_Supervisor.restart_daemon` cases: replaces in place
(different pid post-restart, sibling daemon unchanged),
unknown name is a no-op, restart-during-shutdown is a no-op.
- `test_pipelock_apply` rewritten to bring up the bundle image
with `CLAUDE_BOTTLE_SIDECAR_DAEMONS=pipelock` so the
supervisor is PID 1 and handles SIGUSR1. The previous
standalone-pipelock setup wouldn't survive SIGUSR1 (pipelock
default disposition is terminate). Test builds the bundle
image in setUpClass (cached layers make repeat runs fast).
531 tests passing locally (unit + integration).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
62f6f8db34 |
refactor(sidecars): bundle is the only shape (PRD 0024 chunk 5)
The CLAUDE_BOTTLE_SIDECAR_BUNDLE feature flag is gone. Every
bottle ships with the agent + bundle pair — no opt-in, no legacy
four-sidecar fallback.
Changes:
- Renderer (compose.py): bottle_plan_to_compose unconditionally
emits {agent, sidecars}. Deleted _pipelock_service,
_git_gate_service, _egress_service, _supervise_service helpers.
_agent_service.depends_on collapses to ["sidecars"].
- sidecar_bundle.py: deleted sidecar_bundle_enabled (the flag
parser). SIDECAR_BUNDLE_IMAGE + container-name helper stay.
- pipelock_apply.py: docker cp + docker restart now target
sidecar_bundle_container_name(slug). Bundle restart bounces
all four daemons together (per-daemon reload is the eventual
feature, not v1).
- Per-sidecar modules trimmed:
- egress.py: dropped EGRESS_IMAGE, EGRESS_DOCKERFILE,
build_egress_image, egress_url. Kept EGRESS_PORT, CA paths,
egress_container_name (still used by the renderer's network
aliases).
- git_gate.py: dropped GIT_GATE_IMAGE, GIT_GATE_DOCKERFILE,
build_git_gate_image. Kept git_gate_host + GIT_GATE_PORT.
- supervise.py: dropped SUPERVISE_IMAGE, SUPERVISE_DOCKERFILE,
build_supervise_image, supervise_url.
- Deleted Dockerfile.{egress,git-gate,supervise}. The bundle's
Dockerfile.sidecars is the only sidecar image now.
- test_compose.py: deleted TestPipelockAlwaysPresent,
TestConditionalGitGate, TestConditionalEgress,
TestConditionalSupervise, TestFullMatrix (legacy-shape only),
TestSidecarBundleFlag (flag is gone). TestSidecarBundleShape
drops its patch.dict wrapper. TestAgentAlwaysPresent's
depends_on cases collapse to one.
- test_pipelock_apply.py: bringup container name uses
sidecar_bundle_container_name(slug) to match the production
target.
- README.md Architecture section rewritten to describe the
agent + bundle pair.
Net: -626 lines.
Test status: 498 unit + 27 integration + 1 skipped (chunk-4
pending — superseded by this chunk's rewrite). Locally verified
end-to-end bottle launch produces exactly 2 containers
(claude-bottle-<slug> + claude-bottle-sidecars-<slug>).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
2287b0dd08 |
test(sidecars): integration sweep for the bundle path (PRD 0024 chunk 4)
Three deliverables:
1. Rewrite test_pipelock_apply bringup with a direct `docker run`.
Replaces the .start-based bringup deleted in chunk 3. Stages
the yaml + CAs to the real pipelock_state_dir so the bind-
mount target matches what apply_allowlist_change writes to —
the legacy .start path did this implicitly because it lived
inside the production flow; the new bringup needs to be
explicit about the path. All 4 cases pass.
2. New tests/integration/test_sidecar_bundle_compose.py: end-
to-end smoke with CLAUDE_BOTTLE_SIDECAR_BUNDLE=1. Brings up
a real bottle via the compose path and verifies the agent
can reach pipelock + supervise through the bundle's legacy
aliases (no agent-side config changes between flag positions).
Skipped under act_runner — multi-stage build + bind mounts.
3. Two bundle-path bugs surfaced and fixed while running PRD
0022 with the flag on:
- egress_entrypoint.sh: add `--set confdir=/home/mitmproxy/
.mitmproxy` so mitmdump finds the bind-mounted CA. The
legacy Dockerfile.egress runs as user mitmproxy (~mitmproxy
resolves correctly); the bundle runs as root and otherwise
would look in /root/.mitmproxy/ and mint a NEW CA the agent
doesn't trust. Symptom: PRD 0022 attack-3 curl failed with
"unable to get local issuer certificate".
- sidecar_init.py: add `--listen 0.0.0.0:8888` to pipelock's
argv. Without it pipelock defaults to 127.0.0.1, so the
in-bundle egress's upstream connect to the
`claude-bottle-pipelock-<slug>` alias arrives over the
docker network and gets refused. The legacy renderer
passed this flag verbatim; the bundle dropped it. Symptom:
egress returned HTTP 502 with "Connect call failed
('172.x.x.x', 8888)".
PRD 0022's 5-attack sandbox-escape suite now passes with the
bundle flag on AND off.
Test status:
- Unit: 533 passing.
- Integration: 9 passing locally with flag off, 5 passing with
flag on. Bundle compose smoke + PRD 0022 sandbox-escape both
green under CLAUDE_BOTTLE_SIDECAR_BUNDLE=1.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
539234f29e |
refactor(sidecars): drop vestigial start/stop methods (PRD 0024 chunk 3)
Compose-up has owned per-container lifecycle since PRD 0018 ch3;
the .start() / .stop() methods on DockerPipelockProxy /
DockerEgress / DockerGitGate / DockerSupervise (and their
abstractmethod declarations in the four base ABCs) were already
documented as vestigial. With the bundle path in flight
(PRD 0024 ch2), they are truly dead — collapse to nothing.
Changes:
- Removed start/stop methods from the four DockerSidecar
classes. Plan dataclasses, image/path constants,
container-name helpers, and the .prepare() methods all stay
(the renderer + apply path still need them).
- Removed the matching @abstractmethod declarations in the
base ABCs so concrete subclasses don't have to stub them.
- launch.launch() and prepare.resolve_plan() no longer take
proxy/git_gate/egress/supervise instance parameters. backend.py
loses the four instance attributes it threaded through.
prepare.resolve_plan() instantiates the four classes itself
to call their .prepare() methods.
- Deleted four integration tests that only exercised the
removed lifecycle: test_pipelock_sidecar_smoke,
test_supervise_sidecar, test_git_gate_sidecar,
test_git_gate_mirror.
- Dropped the .stop-idempotency case in test_orphan_cleanup;
the network-cleanup cases stay (those test real production
code).
- Marked test_pipelock_apply @skip pending chunk 4 — its
bringup helper used .start; chunk 4 rewrites it with direct
`docker run`.
Dockerfile deletion deferred to chunk 5 (when the bundle flag
default flips) — the legacy compose path still needs
Dockerfile.{egress,git-gate,supervise} until then.
Net: 708 lines removed, 80 added.
533 unit tests + 27 integration tests passing (5 skipped: the
chunk-4-pending case + existing GITEA_ACTIONS guards).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
61f63684ac |
feat(sidecars): bundle image + Python init supervisor (PRD 0024 chunk 1)
New Dockerfile.sidecars multi-stage build: pulls the pinned
pipelock and gitleaks binaries into a mitmproxy-base final image,
installs git + openssh-client, and ships the project's egress
addon + supervise server alongside a stdlib-Python init at
/app/sidecar_init.py.
The init supervisor (claude_bottle/sidecar_init.py) is PID 1 in
the bundle. It spawns the daemons named in
CLAUDE_BOTTLE_SIDECAR_DAEMONS (or all four by default),
propagates SIGTERM/SIGINT to children with an 8s grace before
SIGKILL, and exits with the first-unexpected-child exit code so
a daemon crash tears down the bundle (per PRD 0024 open
question 1's default).
claude_bottle/egress_entrypoint.sh extracted verbatim from
Dockerfile.egress's prior inline sh -c so the supervisor can
call it as a normal child.
Tests:
- unit: _selected_daemons env-var subset behavior (7 cases),
_Supervisor signal/exit-code semantics including SIGKILL
escalation, and end-to-end main() via subprocess.
- integration: builds the image and probes that pipelock,
gitleaks, mitmdump, and the supervise Python module are
present + executable, plus a no-daemons-selected smoke test
of the entrypoint wiring. Skipped under act_runner (200+MB
base pulls + multi-stage build).
Renderer collapse and the deletion of Dockerfile.{egress,git-gate,
supervise} land in chunk 2 + 3.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
5c17fcdf90 |
test(integration): skip sandbox-escape suite under act_runner
The Gitea CI runner shares the host docker socket but not its filesystem, so pipelock_tls_init's host bind-mount path for CA files is invisible to the runner container — the same constraint that already gates the other bottle-bringup integration tests. PRD 0022's test suite was missing this guard; it failed on the post-merge main build with "pipelock tls init did not produce ca files". Mirror the existing skipIf pattern at the class level. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
23f50f7720 |
fix(pipelock): scan all request headers + fix attack-3 destination
Two related changes the PRD 0022 sandbox-escape test surfaced:
1. `pipelock_build_config` now emits
`request_body_scanning.scan_headers: true` and
`header_mode: all`. Pipelock's default `header_mode:
sensitive` only checks Authorization / Cookie / X-Api-Key
/ X-Token / Proxy-Authorization / X-Goog-Api-Key — an
agent attempting exfil could trivially pick a
non-sensitive header (`X-Custom: $SECRET`) and slip
through. `all` closes the gap; pipelock caps it by the
same max_body_bytes the body scan uses.
2. Test 3 (HTTP exfil shapes) now targets
raw.githubusercontent.com instead of api.anthropic.com.
api.anthropic.com is in `DEFAULT_TLS_PASSTHROUGH` —
pipelock can't MITM it because real LLM conversation
bodies false-positive on DLP scanners (BIP-39 etc.). The
trade-off is documented in `pipelock.DEFAULT_TLS_PASSTHROUGH`;
the test now exercises a host where the sandbox is
actually supposed to block.
All 5 sandbox-escape attacks now produce HTTP 403 with the
expected sandbox marker (`egress:`, `pipelock`, or `blocked:`):
- Attack 1 (non-allowlisted host) ✓ egress
- Attack 2 (non-allowlisted IP + spoof) ✓ egress
- Attack 3a (URL path) ✓ pipelock DLP
- Attack 3b (URL query) ✓ pipelock DLP
- Attack 3c (request body) ✓ pipelock DLP
- Attack 3d (request header) ✓ pipelock DLP (scan_headers)
- Attack 4a (crafted subdomain) ✓ egress
- Attack 4b (direct dig @8.8.8.8) ✓ network isolation
- Attack 5 (README push, 3 secret shapes) ✓ gitleaks (pre-upstream)
489 unit tests pass (1 updated for the new request_body_scanning
shape). Full integration suite passes in ~6s.
|
||
|
|
e2231f46a3 |
test(integration): PRD 0022 sandbox-escape suite (chunks 1-5)
End-to-end test that brings up a real bottle with allowlisted
egress + git-gate + three planted secrets, then runs five
attacks from inside the agent container.
Chunks 1-5 implemented in one pass against the Docker backend:
Attack 1 — non-allowlisted hostname (curl evil.example.com)
✓ blocked by egress
Attack 2 — non-allowlisted IP literal (198.51.100.1) + host-
header spoof via curl --resolve
✓ both blocked by egress
Attack 3 — HTTP exfil to allowlisted destination via path /
query / body / header
✗ ALL FOUR LEAK — request reaches api.anthropic.com
with the secret embedded. Pipelock's DLP doesn't
catch the anthropic-key shape in the body, and
nothing scans path / query / headers.
Attack 4 — DNS exfil via crafted subdomain + direct
dig @8.8.8.8 query
✓ both blocked (egress rejects subdomain, internal
network has no path to 8.8.8.8)
Attack 5 — README push through git-gate with secret-bearing
attacker URL (parameterized over anthropic / AWS /
generic shapes); ordering check that gitleaks fires
BEFORE any upstream attempt
✓ all three secret shapes blocked by gitleaks
Per PRD 0022 Q1 the assertion in attack 3 is authoritative —
HTTP 403 with an egress/pipelock marker in the body is the only
acceptable outcome. Any 4xx from upstream means the secret
reached the network. The four failing sub-tests are real
sandbox gaps that need their own remediation PRDs before this
test merges green.
Also adds `dnsutils` (dig) to the base agent image so attack 4's
direct-DNS check has a tool to run.
CI: no changes needed — `.gitea/workflows/test.yml` already runs
`tests/integration/` and the suite skip_unless_dockers cleanly
when the runner has no Docker socket.
|
||
|
|
1e5b0dcfca |
refactor: rename egress-proxy → egress everywhere
The manifest key is `egress:` now; finish the rename so the rest of the codebase matches. Files (Dockerfile.egress, claude_bottle/egress.py etc.), classes (Egress, EgressConfig, EgressRoute, EgressPlan, DockerEgress), constants (EGRESS_HOSTNAME, EGRESS_ROUTES, ...), container name prefix (claude-bottle-egress-*), docker network alias (egress), the introspection host (_egress.local), the MCP tool IDs (egress-block, list-egress-routes), and the preflight label all drop the `-proxy` suffix. |
||
|
|
572106d98f |
refactor(cli): drop --format=json end-to-end
Companion to the compact preflight in #31 — the JSON format was the structured alternative to the verbose text summary. With the new compact text already on screen, no consumer was using the JSON shape, and the abstract `BottlePlan.to_dict` was the biggest piece of API surface no one is implementing against. Removed: - `--format` CLI flag from `start` and `resume`. - `output_format` kwarg from `_launch_bottle`. - `BottlePlan.to_dict` abstract method. - `DockerBottlePlan.to_dict` (60-line dict builder). - The `_PlanView` dataclass — `print` was the only remaining caller, so the env-name computation is inlined. - `tests/integration/test_dry_run_plan.py` (JSON-shape integration test). - `tests/unit/test_cli_start_format.py` (flag-conflict unit). Plan-introspection is still possible by reading the `DockerBottlePlan` dataclass directly — fields like `image`, `container_name`, `stage_dir`, `use_runsc` are all there. Tooling that needs a stable wire shape can JSON-serialize the dataclass themselves. 411 unit + integration tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
1542ee0b93 |
feat(egress-proxy-block): single-route input + merge-on-apply
Instead of asking the agent to compose and submit a full routes
file, the tool now takes ONE proposed route — host + optional
path_allowlist + optional auth — and the supervisor merges it
into the live routes table at approval time. The agent no longer
needs to fetch / reproduce / extend the existing allowlist; it
just describes the host it wants reachable.
Tool input (new):
- `host` (required)
- `path_allowlist` (optional, array of absolute path prefixes)
- `auth` (optional, {scheme, token_ref})
- `justification` (required)
Merge semantics (in `egress_proxy_apply._merge_single_route`):
- Host NOT in current routes → append the proposed route as a
new entry. If `auth` is set, assign the next EGRESS_PROXY_TOKEN_N
slot.
- Host already present → union the proposed `path_allowlist`
with the existing one (proposed entries appended after
existing, deduped). Existing `auth_scheme` / `token_env`
preserved; proposed `auth` ignored (operator-controlled, not
agent-controlled).
- Hostname comparison is case-insensitive.
Dashboard wiring: `approve()` on an egress-proxy-block proposal
now calls `add_route(slug, proposed_route_json)` instead of
`apply_routes_change(slug, full_file)`. add_route fetches the
current routes from the running egress-proxy, merges, and calls
apply_routes_change with the merged content — so the
pipelock-mirror + SIGHUP plumbing from chunk 3 still runs
end-to-end. Audit diff still captures the full-file before/after.
Tool description rewritten to make the new shape obvious and to
stop pointing the agent at the routes file. The
`list-egress-proxy-routes` tool stays available for agents that
want to see what's currently allowed.
Tests: 9 new `_merge_single_route` cases (host absent/present,
path-allowlist union+dedup, auth-slot indexing, case-insensitive
match, existing-auth preservation, missing-host rejection,
malformed-current rejection). 407 unit + integration pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
3be70eb07a |
feat(supervise): list-egress-proxy-routes MCP tool, defaults on egress-proxy
Reshape the allowlist topology so the egress-proxy is the bottle's
single allowlist surface, and replace the agent-side
routes/allowlist file mounts with a live MCP tool.
Policy change (move defaults to egress-proxy):
- `egress_proxy_routes_for_bottle(bottle)` now folds in
DEFAULT_ALLOWLIST (the claude-code defaults) and
`bottle.egress.allowlist` (user adds) as bare-pass routes (no
auth, no path filter), on top of the bottle's
`egress_proxy.routes`. Manifest routes win on host collision.
- `pipelock_effective_allowlist(bottle)` mirrors egress-proxy's
effective host set when egress-proxy is in use. Pipelock is
no longer the bottle's primary allowlist authority; it
enforces a downstream copy as defense-in-depth + does DLP body
scanning.
- Split out `egress_proxy_manifest_routes(bottle)` for callers
that want just the manifest entries (tests, internal use).
- DEFAULT_ALLOWLIST moves from `pipelock.py` to `egress_proxy.py`
(pipelock re-imports for the no-egress-proxy fallback path).
- Dropped the `egress-proxy` auto-allow on pipelock's allowlist
— the agent never dials egress-proxy via the proxy mechanism;
pipelock only sees upstream hostnames from egress-proxy's
CONNECTs.
Introspection endpoint (existing mitmproxy feature):
- Egress-proxy addon recognises requests to the magic host
`_egress-proxy.local` and synthesizes responses via
`flow.response = http.Response.make(...)` — no upstream
connection, no allowlist enforcement on the magic host.
- `GET /allowlist` returns the in-memory route table as JSON
(host + path_allowlist + auth_scheme + token_env per route;
no token VALUES).
- Smoke-tested end-to-end against a real egress-proxy container.
MCP tool (existing supervise plumbing):
- New `list-egress-proxy-routes` tool (no inputs, no operator
approval). Handler fetches via egress-proxy's introspection
endpoint using urllib's ProxyHandler against
`EGRESS_PROXY_FORWARD_PROXY`. Returns the JSON payload as the
tool's text content; `isError: true` if the proxy is
unreachable.
- `egress-proxy-block` description now points the agent at
`list-egress-proxy-routes` instead of a staged file path.
- `pipelock-block` description acknowledges the mirror — agents
should prefer `egress-proxy-block` to add hosts; pipelock-block
stays for the rare divergence case.
Drop agent-side file mounts:
- Supervise's `current-config` dir staging no longer writes
routes.yaml / allowlist. Only `Dockerfile` remains
(capability-block still reads it from
`/etc/claude-bottle/current-config/Dockerfile`).
- `prepare.py` stops passing `routes_content` /
`allowlist_content` to `supervise.prepare`.
- `Supervise.prepare` signature simplified to one
`dockerfile_content` kwarg.
Tests: 400 unit + integration pass. Added coverage for
defaults-folding (`TestRoutesForBottleFoldsDefaults`), the new
tool definition + handler, and the updated supervise.prepare
shape.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
9cd583fbbb |
feat(egress-proxy): retarget remediation at egress-proxy (PRD 0017 chunk 3)
Finishes PRD 0017. The `cred-proxy-block` MCP tool is renamed and
its remediation apply path is repointed at egress-proxy.
- `claude_bottle/supervise.py` — `TOOL_CRED_PROXY_BLOCK` →
`TOOL_EGRESS_PROXY_BLOCK`; `COMPONENT_FOR_TOOL` maps the new
tool ID to `egress-proxy` for audit-log routing.
- `claude_bottle/supervise_server.py` — tool definition renamed
+ description rewritten: "Call when egress-proxy refused your
HTTPS request ... Read the current routes.yaml from /etc/
claude-bottle/current-config/routes.yaml, compose a modified
version, pass the full new file plus a justification." The
syntactic validator dispatches on the new tool ID.
- `claude_bottle/backend/docker/egress_proxy_apply.py` — renamed
from `cred_proxy_apply.py`. Reads routes.yaml from
/etc/egress-proxy/routes.yaml via `docker exec cat`; validates
via `egress_proxy_addon_core.load_routes` (so both sides use
the same parser); writes via `docker cp`; SIGHUPs egress-proxy
with `docker kill --signal HUP`. `EgressProxyApplyError`
replaces `CredProxyApplyError`.
- `claude_bottle/cli/dashboard.py` — wires the new apply +
`discover_egress_proxy_slugs` helper; the operator-initiated
`routes edit <bottle>` verb now writes to egress-proxy with
`.yaml` suffix. Stale follow-up comment about path-aware
filtering removed — PRD 0017 settled that question.
- `tests/integration/test_supervise_sidecar.py` — restores the
approval round-trip test (chunk 2 had switched it to a reject
path because no cred-proxy existed). Approval stubs
`apply_routes_change` so the test focuses on the supervise
queue/response plumbing rather than docker-exec into a real
egress-proxy sidecar (that's covered separately).
- `tests/unit/test_egress_proxy_apply.py` — rewritten against
the new validator; covers JSON shape, missing routes key,
partial-auth-pair rejection (the addon-core parser catches
these before SIGHUP).
- PRDs 0010 + 0014 — status headers updated to
Superseded / Retargeted with a callout block pointing at PRD
0017's migration section. Historical text preserved.
384 unit + integration tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
70f773ac61 |
feat(egress-proxy): cutover from cred-proxy (PRD 0017 chunk 2)
Hard cutover. cred-proxy is deleted; egress-proxy is now the agent's
HTTP_PROXY (when routes are declared) with pipelock on its outbound
leg. Two per-bottle CAs are minted: egress-proxy's (agent trust
store) and pipelock's (egress-proxy's outbound trust store).
Manifest:
- `bottle.cred_proxy` → hard error with a migration recipe.
- `bottle.egress_proxy` is the new shape (PRD 0017 chunk 1).
- CredProxy* types + role validators removed.
Wiring:
- launch.py: `egress_proxy_tls_init` mints the egress-proxy CA
(cert+key concat for mitmproxy + cert-only for agent trust);
`DockerEgressProxy.start` docker-cps both CAs in, sets
`HTTPS_PROXY=pipelock` + `EGRESS_PROXY_UPSTREAM_CA` so mitmdump
trusts pipelock's MITM. Agent's HTTP_PROXY points at
egress-proxy when routes exist, else falls back to pipelock
(no-routes bottles unchanged).
- prepare.py / backend.py: `cred_proxy` arg → `egress_proxy`;
sidecar-orphan probe + plan field + dashboard view all
renamed.
- provision_ca: selects the egress-proxy CA when present, else
pipelock's (filename renamed to claude-bottle-mitm-ca.crt).
- bottle.provision: cred-proxy dotfile rewrites (~/.npmrc,
~/.gitconfig insteadOf, tea config) are gone — HTTP_PROXY
catches everything respecting it.
Pipelock helpers:
- `pipelock_token_hosts` → `pipelock_route_hosts` (now reading
egress_proxy.routes).
- cred-proxy hostname auto-allow → egress-proxy hostname
auto-allow.
- Anthropic seed-phrase workaround now triggers when an
egress_proxy route targets api.anthropic.com (was based on the
cred-proxy `anthropic-base-url` role).
Dockerfile.egress-proxy:
- Entrypoint conditionally passes
`--set ssl_verify_upstream_trusted_ca=$EGRESS_PROXY_UPSTREAM_CA`
(via the `${VAR:+...}` shell expansion) so standalone runs without
a mounted pipelock CA still boot.
- mkdirs `/home/mitmproxy/.mitmproxy` ahead of `docker cp`.
Deleted: claude_bottle/{cred_proxy,cred_proxy_server}.py,
backend/docker/{cred_proxy,provision/cred_proxy}.py,
Dockerfile.cred-proxy, plus the corresponding unit + integration
tests. backend/docker/cred_proxy_apply.py stays as a stub for
chunk 3 to rewrite (its container-name + routes-path constants
are inlined so it survives without the deleted module).
Test changes:
- test_pipelock_allowlist rewritten against egress-proxy routes
+ the new `pipelock_route_hosts`.
- test_manifest_md_load + test_pipelock_yaml + test_yaml_subset
fixtures migrated to the `egress_proxy: { routes: [...] }`
shape.
- test_supervise_sidecar's round-trip test switched from
`dashboard.approve` to `dashboard.reject`: the approval-apply
path on cred-proxy-block proposals hits a deleted sidecar in
chunk 2's transitional state. Chunk 3 restores the approval
test once the remediation flow is retargeted at egress-proxy.
376 tests pass (was 427; net delta is removed cred-proxy tests).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
ac8f14ae6f |
test(capability): integration test for apply_capability_change (PRD 0016)
Phase 4 of PRD 0016. End-to-end test against real Docker: - Stages a fake bottle: alpine:latest container named claude-bottle-<slug> with a marker file at /home/node/.claude/sessions.json, plus a fake supervise sidecar. - Calls apply_capability_change with a new Dockerfile. - Verifies: per-bottle Dockerfile written, agent + sidecars removed, networks removed, transcript snapshot dir on host contains the marker file (proving docker cp transferred bytes). - Subsequent-apply test proves the per-bottle Dockerfile state persists across rebuilds (before-diff uses the prior override, not the repo Dockerfile). - Teardown-idempotent test: apply against a never-started bottle doesn't raise. docker exec / cp / rm / network rm work fine across the docker socket boundary, so this runs in DinD too — no act_runner skip needed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
4fada1651b |
test(pipelock): integration test for apply_allowlist_change (PRD 0015)
Phase 4 of PRD 0015. End-to-end test against real Docker: - Brings up a real pipelock sidecar via the production DockerPipelockProxy bring-up + pipelock_tls_init. - Calls apply_allowlist_change to add a new host. - Polls the live /etc/pipelock.yaml until the new host shows up (bridging the docker-restart window). - Verifies api_allowlist contains both old + new hosts and tls_interception block is preserved. - Smaller cases: invalid hostname raises, missing sidecar raises, fetch_current_allowlist returns one-per-line format. Skipped under GITEA_ACTIONS because pipelock_tls_init bind-mounts a host path that doesn't share fs in the runner, matching the existing pipelock smoke test's skip pattern. Drive-by fix: fetch_current_yaml now uses `docker cp` (daemon-API tarball copy) instead of `docker exec cat` because the pipelock image is distroless and has no shell utilities. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
70f43d8c4f |
test(cred-proxy): integration test for SIGHUP + apply round-trip (PRD 0014)
Phase 5 of PRD 0014. End-to-end test against real Docker: - Brings up a cred-proxy sidecar with route /a/ → unreachable upstream (so 502 = route matched, 404 = no route). - Calls apply_routes_change to swap to /b/ only. - Polls until the route table flips: /a/ now 404s, /b/ now 502s. - Separately verifies fetch_current_routes returns the live file, apply with invalid JSON raises, and apply against a non-existent sidecar raises. No fake-upstream container needed: unreachable hostnames give the 502 signal directly. apply_routes_change uses docker exec / cp / kill (not bind mounts), so this should work in docker-in-docker too — no DinD skip needed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
92fee89e20 |
test(supervise): skip queue round-trip test in docker-in-docker (PRD 0013)
The integration test test_tools_call_round_trips_through_queue relies on a host bind-mount to share the queue dir between the sidecar (writing proposals) and the test process (approving via dashboard helpers). In the Gitea Actions runner the docker socket forwards to the outer host's daemon, so bind-mount paths are resolved against the outer host's fs — not the runner container's. The sidecar writes its proposal where the test can't see it; the test times out. Add a one-shot probe that does docker run -v <tmp>:<container> and checks both directions of fs visibility. Skip the round-trip test when the probe fails. tools_list and the orphan-name test are unaffected — they don't touch the queue. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
9f445d61be |
test(supervise): docker integration test for the sidecar (PRD 0013)
Phase 5 of PRD 0013. End-to-end integration test against real Docker: - Brings up the supervise sidecar on a per-bottle internal network. - A curl-image "agent" on the same network does tools/list and gets back the three PRD 0013 tool names over real MCP wire format. - A tools/call round-trips through the queue: agent blocks on the call, host watches the queue, dashboard.approve writes a Response, agent receives the approval payload (status, notes) in MCP content. - Documents the orphan-sidecar name-collision behavior so a future auto-cleanup change can flip the assertion. Skips if docker is unreachable, matching the existing integration pattern. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
6ba5f9a9d3 |
feat(manifest): per-file MD directory loader (PRD 0011)
Manifest.resolve walks $HOME/.claude-bottle/{bottles,agents}/ and
$CWD/.claude-bottle/agents/ instead of reading claude-bottle.json.
A bottles/ subdir under $CWD is logged as a warn and ignored —
the filesystem layout IS the trust boundary, no resolver check
needed.
If claude-bottle.json exists alongside no .claude-bottle/ dir at
either location, dies with a clear pointer at the README — the
manifest format changed and we don't silently fall back.
Manifest.from_md_dirs(home, cwd) is the programmatic entry point
tests use to build a Manifest from fixture directories without
touching os.environ. Manifest.from_json_obj is preserved for
tests that still want to build manifests in-memory.
Bottle / agent frontmatter goes through Bottle.from_dict /
Agent.from_dict — same validators as today's JSON path. Unknown
top-level frontmatter keys die with a "did you mean" pointer
listing accepted keys. Filenames that don't match [a-z][a-z0-9-]*
are skipped with a warn.
Agent files accept the Claude Code subagent passthrough fields
(name, description, model, color, memory) so the same file can
drop into ~/.claude/agents/ — claude-bottle ignores them at
launch but doesn't reject.
The dry-run integration test ships a real MD fixture tree now;
all 200 unit + 17 integration tests stay green.
|
||
|
|
32b62cbacc |
feat(cred_proxy)!: cred-proxy is the only Anthropic auth path
Removes the legacy `CLAUDE_BOTTLE_OAUTH_TOKEN` -> `CLAUDE_CODE_OAUTH_TOKEN` forward in prepare.py. Bottles that need claude-code to authenticate must declare a cred_proxy route with role: "anthropic-base-url" — there is no fallback that hands the token to the agent directly. Drops the now-dead BottleSpec.forward_oauth_token field, the CLI setter that read CLAUDE_BOTTLE_OAUTH_TOKEN from the host env at prepare time, and the forward_oauth_token=False arg in the six pipelock integration tests. PRD 0010 and README updated; the dev ~/claude-bottle.json gains an anthropic-base-url route so the implementer/researcher agents keep working. BREAKING: bottles previously relying on the implicit OAuth forward will now produce an agent environ without any Anthropic credential. Verified with --dry-run: a bottle with no anthropic-base-url route yields env_names: [] (no token at all); a bottle that declares the route yields ANTHROPIC_BASE_URL plus a non-secret placeholder for CLAUDE_CODE_OAUTH_TOKEN. |
||
|
|
2990c3c903 |
refactor(cred_proxy): rename Upstream -> Route, fix tea-login AttributeError
Three leftovers from the manifest refactor: 1. provision/cred_proxy.py:223 referenced u.kind == 'gitea' for the tea login count — kind was removed from the runtime class, so any bottle with a tea-login route raised AttributeError at provision time. Switch to `'tea-login' in r.roles`. 2. The runtime class CredProxyUpstream is renamed to CredProxyRoute (its data is a route on the proxy, not an "upstream"; the field route.upstream is the upstream URL). Module's own naming now aligns with manifest.CredProxyRoute and routes.json. 3. cred_proxy_upstreams_for_bottle -> cred_proxy_routes_for_bottle; CredProxyPlan.upstreams -> CredProxyPlan.routes; local `upstreams` collections become `routes`. Callers in backend.py, launch.py, prepare.py, bottle_plan.py, provision/cred_proxy.py, and tests updated. Also strips lingering `bottle.tokens` references from docstrings (pipelock.py, cred_proxy.py prepare(), manifest._parse_https_host, test_pipelock_allowlist.py module doc) and removes dead helpers from the integration test (the _bottle helper used a tokens field that no longer parses). |
||
|
|
fcbbc4484d |
refactor(cred_proxy): flat routes, role-driven provisioning (PRD 0010)
Replace bottle.tokens (with Kind enum and hardcoded per-kind
route/auth tables) with bottle.cred_proxy.routes — each route
declares its own path, upstream, auth_scheme, token_ref, and
optional role[]. The manifest is now the source of truth for the
proxy's runtime route table; adding an upstream is a manifest edit,
not a code change.
Agent-side rewrites move from per-kind dispatch to per-role tags
on routes:
anthropic-base-url -> set ANTHROPIC_BASE_URL=<proxy><path>
npm-registry -> write ~/.npmrc registry=
git-insteadof -> write ~/.gitconfig [url] insteadOf, keyed
off route.upstream (suppressed when
bottle.git brokers the same host)
tea-login -> add a ~/.config/tea/config.yml login
Roles are a list (string accepted as sugar). A gitea route
typically carries ["git-insteadof", "tea-login"]. Singleton roles
(anthropic-base-url, npm-registry) appear on at most one route.
token_env slots are assigned per distinct TokenRef in declaration
order — two routes sharing a token_ref (e.g. github API + git
endpoints) share a slot.
Drops: TOKEN_KINDS, _KIND_ROUTES, _KIND_AUTH_SCHEME, _TOKEN_DEFAULT_HOST,
cred_proxy_route_path_for_gitea, the kind field on CredProxyUpstream,
and the kind-based hardcoding in pipelock_token_hosts (now derives
from route.UpstreamHost).
Legacy bottle.tokens manifests now die with a hint pointing at
bottle.cred_proxy.routes + this PRD. Tests rewritten end-to-end.
Docs + example.json + the dev ~/claude-bottle.json updated to match.
|
||
|
|
07da4366ad |
test(cred_proxy): integration tests for header inject + strip (PRD 0010)
Drives DockerCredProxy.start through the production code path against a fake upstream container running on the same egress network. The "agent" is a curl container on the bottle's internal network — same access topology the agent uses in production. Covers PRD 0010 success criteria: - SC3 (the request reaches upstream, header round-trip works) - SC6 (inbound Authorization stripped; the proxy injects the configured token even when the agent tries to smuggle one in) - partial SC2 (cred-proxy reachable by the alias from the internal network) - 404 for unconfigured routes Live-network tests against real Anthropic / GitHub / Gitea / npm upstreams (SC4 and SC5 specifically) are deferred — the fake-upstream shape covers the routing + header layer that's actually under test here. |
||
|
|
249e8cc15e |
test: drop ssh-gate suites and shadow-route assertions (PRD 0009)
- Delete tests/unit/test_ssh_gate.py and the fixture_with_ssh helpers. - test_pipelock_yaml: drop the ssh-leak guard (structurally impossible now); the remaining tests switch to fixture_minimal. - test_pipelock_allowlist: rewrite the union/dedup test to exercise an egress.allowlist that duplicates a baked default (the property the ssh-leak assertion was hitching onto). - test_manifest_git: shadow-route assertion becomes a legacy-ssh- dies-with-hint assertion, since bottle.ssh is now parse-fail. - test_orphan_cleanup: drop the SSHGate.stop idempotency check; pipelock equivalent stays. - test_dry_run_plan: drop assertions on the removed ssh_hosts / ssh_gate keys. 52 unit tests pass. |
||
|
|
f9d9e9cf33 |
test(git-gate): bidirectional mirror round-trip
A pair of integration tests against a real sshd-based "upstream"
sibling container that prove every operation through the gate is
observably equivalent to the same operation against the upstream:
- test_clone_and_refetch_reflect_upstream: clone via gate
returns the upstream's current commit; an out-of-band commit
on the upstream shows up via the gate on the next ls-remote.
- test_push_through_gate_lands_on_upstream: a clean push routed
through the gate lands on the upstream's bare repo.
The upstream container is a tiny inline-built alpine image with
openssh-server, a `git` user (passwd -u so sshd doesn't reject
the locked account), and a baked bare repo seeded with one
commit. Host keys are baked in at build so the test can pin
KnownHostKey on the manifest entry before the container starts.
While wiring this up the access-hook gained a one-shot HEAD
sync: `git init --bare` defaults HEAD to refs/heads/master, and
upstreams that use main would leave the bare repo's HEAD
unresolvable — clones came through but the working tree was
empty. The hook now does a `rev-parse --verify HEAD` check
after the first fetch and runs `ls-remote --symref` to repoint
HEAD if it doesn't resolve. One extra round-trip on first
fetch only.
|
||
|
|
fdd06c54d2 |
feat(git-gate): mirror fetch through access-hook (bidirectional)
The gate is now a transparent mirror, not push-only. Per-repo init now runs `git remote add --mirror=fetch origin <url>` so a later `git fetch origin` mirrors the upstream's full ref graph at canonical paths. The pre-receive hook forwards accepted refs via `git push origin` (renamed from upstream). New: an access-hook script wired via `git daemon --access-hook` runs `git fetch origin --prune` against the real upstream before every upload-pack request (clone, fetch, pull, ls-remote). On upstream error the hook exits non-zero — the agent's fetch fails rather than the gate serving stale data. The pre-existing smoke test (ls-remote against unreachable upstream returns refs) had to invert: under the bidirectional design any ls-remote success is necessarily a success against the upstream, so the unreachable-upstream case now correctly fails closed. |
||
|
|
89981f9048 |
test(git-gate): integration smoke + secret-blocking push
Two integration tests against a real Docker daemon:
- test_ls_remote_succeeds_against_fresh_gate: a freshly-started
gate has its empty bare repo exported via git daemon; ls-remote
from a sibling container on the internal network returns no
refs and exits 0.
- test_push_with_secret_is_rejected: the PRD 0008 success
criterion — a push containing an AKIA-shaped synthetic that
trips gitleaks's aws-access-token rule is rejected by the
pre-receive hook with a non-zero exit on the client and a
gitleaks rejection in the response.
Dockerfile.git-gate switches base to zricethezav/gitleaks (alpine
3.22 + gitleaks v8.30.1, pinned by digest) since gitleaks isn't
packaged for alpine, and adds git-daemon (the sub-package the
listener needs; the core git binary in the base doesn't include
the daemon).
|
||
|
|
f787edb861 |
feat(git-gate): wire DockerGitGate through prepare/launch/plan
DockerBottleBackend now instantiates a DockerGitGate alongside DockerPipelockProxy and DockerSSHGate; the prepare step lifts bottle.git into a GitGatePlan stored on DockerBottlePlan, and launch starts/stops the sidecar in the same ExitStack as the other two (only when bottle.git is non-empty). bottle_plan.print now surfaces git remotes and per-upstream gate forwards in the y/N preflight; to_dict adds git_remotes and git_gate keys to the dry-run JSON payload for CLI consumers. PRD: docs/prds/0008-git-gate.md |
||
|
|
4f0cd0f782 |
fix(pipelock): passthrough api.anthropic.com so Claude auth/chat works
Pipelock's BIP-39 seed-phrase scanner fires on Anthropic Messages API bodies because user-authored conversation text can hit 12 consecutive BIP-39 dictionary words that pass the checksum, returning a 403 `blocked: request body contains secret: BIP-39 Seed Phrase` that the Claude CLI surfaces as `Please run /login`. Pipelock's `suppress` section only covers git/file findings, not the inline body scanner, so the recommended treatment for LLM endpoints is `tls_interception.passthrough_domains`: CONNECT is still allowlist- gated, but the body is not MITM'd. The existing body-scan integration test moves to `raw.githubusercontent.com` so it still pins TLS body DLP on non-passthrough'd hosts. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
a7633977de |
test(ssh-gate): assert SSHGate.stop is no-op on missing sidecar
PRD 0007: the launch ExitStack calls gate.stop on every failure path, so an early bring-up error (where the gate container was never created) must not raise from teardown. Mirrors the existing DockerPipelockProxy.stop assertion. The orphan-container enumeration in cleanup.py already covers ssh-gate containers via its `claude-bottle-` name prefix filter — no code change there. |
||
|
|
2533f8a00b |
feat(ssh-gate): wire gate into DockerBottlePlan, prepare, launch
PRD 0007: thread the DockerSSHGate through the bottle lifecycle. - DockerBottlePlan gains gate_plan: SSHGatePlan. - prepare.resolve_plan accepts a gate and renders its entrypoint script next to the pipelock yaml. - launch.launch starts the gate sidecar after pipelock (so it's on the same internal + egress networks) and registers its stop in the ExitStack. Skipped when the bottle has no ssh entries. - DockerBottleBackend instantiates DockerSSHGate alongside the pipelock proxy. - bottle_plan.print + to_dict surface the upstream table so --dry-run shows the per-host listen-port mapping. ssh_config provisioning still points at pipelock; that swap lands in the next commit so this one stays a pure wiring change. |
||
|
|
d3115ae5fd |
test(pipelock): HTTPS integration tests for the bumped path
Fourth and final step of PRD 0006. Two new end-to-end tests pin the two paths through pipelock's tls_interception layer. - test_pipelock_blocks_secret_https_post: posts a GitHub-PAT-shaped body to api.anthropic.com over HTTPS through the bottle. With pipelock now bumping the CONNECT and seeing the decrypted body, it returns 403 with the documented `blocked: request body contains secret: GitHub Token` body. The probe is a single curl invocation — curl natively does CONNECT through HTTPS_PROXY, the agent's trust store now contains pipelock's CA, no hand-rolled TLS in the test. - test_pipelock_allows_normal_https: GETs git's README from raw.githubusercontent.com (a baked-in allowlist host). 200 + non-zero body length proves the full chain works: pipelock_tls_init → docker cp of CA into sidecar → bumped CONNECT → provision_ca installed CA in agent → curl trusts pipelock's bumped leaf → body forwarded back through the tunnel. - test_pipelock_sidecar_smoke: pre-existing direct-start smoke test updated to call pipelock_tls_init and populate the CA paths on the plan. (The full launch flow does this in launch.py; this test exercises the proxy class in isolation.) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
fb10c8dd8a |
feat(bottle-plan): render TLS interception in the dry-run preflight
Third step of PRD 0006. The preflight now surfaces the TLS-
intercept layer so the operator sees it before agreeing to launch.
- Text output: one new line under the egress summary
("tls intercept : pipelock (per-bottle ephemeral CA, generated
at launch)").
- JSON output (--format=json contract): new
egress.tls_interception: { enabled: true, ca_fingerprint: null }
block. Fingerprint is always null at dry-run because the CA
only exists after launch; real launches print it as a stderr
log line from provision_ca.
- Pin the new shape in the dry-run integration test.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
e45cd2fb07 |
test(dry-run): skip docker-state guard under act_runner
The no-side-effects assertion calls `docker network ls` and `docker ps -a` to verify the dry run created nothing. Inside the Gitea Actions job container, those exit non-zero against the host-mounted docker socket — the same act_runner topology issue that already excludes other integration tests from CI (see docs/ci.md). The failure was silently swallowed under the default check=False; the recent style sweep that added check=True surfaced it. Gate the docker-enumerating check on GITEA_ACTIONS so the JSON contract — the more useful part of the test — keeps running on CI. Consolidate the two count helpers into one that surfaces stderr in the failure message instead of raising a context-free CalledProcessError, so the next docker surprise is debuggable. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
427ef96e3f |
feat(pipelock): enforce DLP body-scan hits by default
Adds bottle.egress.dlp_action ("block" | "warn", default block) and
wires it into pipelock as request_body_scanning.action. Pipelock's
own default is "warn", which previously meant claude-bottle detected
credential patterns in outbound bodies but forwarded the request
anyway.
The matching integration test posts a manifest env var shaped like
a GitHub PAT to api.anthropic.com via plain HTTP forward proxy so
pipelock can see the body. Pipelock answers 403 from its body-scan
layer instead of forwarding to the upstream.
Behavior change: bottles without an explicit egress.dlp_action now
block on body-scan hits. Set egress.dlp_action: "warn" to restore
the prior detect-only behavior.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
4864516b33 |
feat(bottle): add exec method to the bottle abstraction
Bottle.exec(script) -> ExecResult runs a POSIX shell script inside a running bottle and returns captured stdout/stderr/returncode. The Docker impl pipes the script via stdin to `docker exec -i ... sh -s` so the source never crosses argv. Two integration tests exercise it end-to-end through the pipelock sidecar: a Node request to a non-allowlisted host (example.com) returns 403 from pipelock; a Node CONNECT to an allowlisted host (raw.githubusercontent.com) is tunneled with 200 Connection Established. The 200/403 split on each verb is decided by pipelock itself, isolating the allowlist decision from whatever the remote might return. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |