PipelockProxy.prepare now accepts (bottle, slug, stage_dir) and derives
the yaml_path itself, so callers don't need to know the filename.
DockerBottleBackend.prepare_proxy becomes a one-line wrapper whose only
caller already has bottle and slug in scope, so it's inlined and
deleted.
The four lower-level helpers (pipelock_bottle_allowlist,
pipelock_bottle_ssh_hostnames, pipelock_bottle_ssh_ip_cidrs,
pipelock_bottle_ssh_trusted_domains) are one-line filters; testing
each in isolation duplicates coverage that pipelock_effective_allowlist
already provides end-to-end. The /32 CIDR suffix is the only behavior
beyond filtering, so it keeps a tiny dedicated test.
Drops the misplaced test_rejects_non_string_entry — that's manifest
validation, not allowlist resolution. Belongs in a manifest-validation
test file (which doesn't exist yet); leaving for a separate PR rather
than adding a one-branch sample here.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Move the --format=json-requires-dry-run check out of the integration
suite (it doesn't need Docker — argparse fails before any backend
runs) and tighten the assertion: previously asserted only that exit
code was nonzero, so any unrelated breakage (manifest resolution
failure, bad agent name, etc.) silently passed. Now asserts stderr
contains the actual flag-conflict message.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The old smoke test hand-rolled the docker create/cp/start sequence in
parallel with what DockerPipelockProxy.start already does, so any
divergence in production code wouldn't trip it. Rewritten to call
.prepare and .start directly and probe /health from a sibling curl
container on the same internal network — same access topology the
agent container uses in production.
In-network probing means the test no longer depends on a published
port, so it can run under act_runner (where host-loopback port
publishing isn't reachable from the job container).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
BottlePlan gains a to_dict method (abstract on the base, implemented
on DockerBottlePlan) returning a JSON-serializable view of the resolved
plan. `cli.py start --dry-run --format=json` prints it to stdout and
exits zero. --format=json without --dry-run is rejected — emitting JSON
during a real launch would race the y/N prompt.
The dry-run integration test now parses the JSON and asserts on
structured fields (agent, bottle, runtime, hosts sorted+deduped, etc.)
instead of regex-matching the human-readable preflight stdout. That
kills the magic-"8 hosts allowed" coupling — adding a new baked
default doesn't break the test.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Split pipelock config building from YAML rendering: pipelock_build_config
returns a dict, pipelock_render_yaml serializes it, and _build_pipelock_yaml
chains the two onto disk. Unchanged behavior — pipelock loads the same YAML.
The yaml test now asserts on the structured config dict, which is
robust to cosmetic YAML changes (key order, quoting). The two checks
that only make sense on the rendered output — file mode 0600 and
no-secret-leakage — stay against the on-disk content.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace the hand-maintained INTEGRATION_NAMES classifier (and the
bespoke run_tests.py around it) with a directory-driven split:
tests/unit/ unit tests, always run
tests/integration/ Docker-dependent, skip cleanly without Docker
tests/canaries/ upstream-regression checks, opt-in via
CLAUDE_BOTTLE_RUN_CANARIES=1
The pinned-pipelock-image check moves to the canary suite — it tests
upstream packaging, not our code, so it shouldn't gate every dev push.
A scheduled canaries.yml workflow runs it weekly.
The manifest-runtime tests collapse the four assertRaises cases for
distinct 'runtime' values into one subTest loop and drop the
error-message-wording assertions; the contract is "any value is
rejected", not "the error literally contains 'auto-detect'".
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Matches the allowlist-resolution helpers' shape: the caller resolves
the bottle once and passes it in. Signature drops from
(manifest, bottle_name, slug, yaml_path) to (bottle, slug, yaml_path).
DockerBottleBackend.prepare_proxy uses manifest.bottle_for(agent_name)
to get the bottle directly. Tests pass fixture.bottles[name].
prepare's docstring also explains what `slug` is: the lowercased,
hyphen-normalized agent identifier used as the suffix in every
per-agent resource name (agent container, pipelock container, the
internal/egress networks). It's stored on the plan so start can
derive the sidecar's container name.
Top-level pipelock.py drops the Manifest import — no longer used.
Both constants were already only used by Docker-specific code (the
sidecar boot, the proxy_url/host_port naming helpers, the image
contract test). Move them next to DockerPipelockProxy.
Top-level pipelock.py drops the 'os' import along with the constants;
the two test files that pulled PIPELOCK_IMAGE retarget at the new
location.
The three slug-based naming helpers were nominally on pipelock.py but
each assumed a Docker container topology (the container name is
'claude-bottle-pipelock-<slug>', the proxy URL uses that container
name). Move them next to DockerPipelockProxy:
pipelock_container_name -> claude-bottle-pipelock-<slug>
pipelock_proxy_url -> http://<container>:<port>
pipelock_proxy_host_port -> <container>:<port>
backend.py imports them directly from .pipelock; the orphan-cleanup
test imports container_name from the same place.
PipelockProxy becomes an ABC with the platform-agnostic
prepare/_build_pipelock_yaml as concrete methods and start/stop as
abstract. Docker-specific sidecar lifecycle moves to a new sibling
file:
claude_bottle/backend/docker/pipelock.py
DockerPipelockProxy(PipelockProxy) — implements start (docker
create/cp/network connect/start) and stop (docker inspect/rm -f).
DockerBottleBackend._proxy is now a DockerPipelockProxy instance.
Tests that previously instantiated PipelockProxy() directly switch to
DockerPipelockProxy() (the base is no longer constructable).
Every function in the 'Allowlist resolution' section was doing
`manifest.bottles[bottle_name].X` as its first move. Push the lookup
to the caller and have each helper take a resolved Bottle:
pipelock_bottle_allowlist
pipelock_bottle_ssh_hostnames
pipelock_bottle_ssh_trusted_domains
pipelock_bottle_ssh_ip_cidrs
pipelock_effective_allowlist
pipelock_allowlist_summary
PipelockProxy._build_pipelock_yaml resolves bottle once at the top
and passes it through; DockerBottleBackend.prepare already had the
bottle in scope and now uses it directly. Tests pass the resolved
bottle from each fixture.
The classifier is a pure dotted-quad regex check — nothing
pipelock-specific about it. Pipelock now imports it from util.
test_pipelock_classify.py retargets at the new location.
Two manifest-accessor functions in pipelock.py
(pipelock_bottle_allowlist, pipelock_bottle_ssh_hostnames) look
generic but are 1-line wrappers used only internally; they stay
for now.
ProxyPlan -> PipelockProxyPlan, with two additional fields populated
in launch: internal_network, egress_network (default ""). prepare
fills yaml_path + slug; launch uses dataclasses.replace to populate
the networks before calling start.
pipelock_start -> PipelockProxy.start(plan). Reads yaml_path, slug,
internal_network, egress_network off the plan. Returns the resolved
container name.
pipelock_stop -> PipelockProxy.stop(proxy_target). Takes the resolved
container name directly (the value that start returned); no longer
needs to know about slugs or naming conventions.
Backend launch passes the running container name (state["pipelock"])
to stop. Test for stop's idempotency uses pipelock_container_name to
construct the proxy_target.
The YAML generation now lives on PipelockProxy.prepare(manifest,
bottle_name, yaml_path) in claude_bottle/pipelock.py. The class is the
natural home for any future proxy-level state.
DockerBottleBackend keeps an instance as a class attribute
(_proxy = PipelockProxy()) and its prepare_proxy becomes a thin
delegation. A future backend that wants a different egress proxy
(or none) plugs in its own strategy.
Tests retarget at the new home — PipelockProxy.prepare gets the
content-shape assertions; the sidecar smoke test uses the class
directly too. Same coverage.
Drops 6 tests with no real coverage loss:
- tests/test_pipelock_naming.py — 4 tests asserting that f-string
format helpers return their f-string. Shape locks, not behavior
gates.
- tests/test_pipelock_image.py:test_entrypoint_contains_pipelock and
:test_cmd_contains_run — Docker image metadata inspection. The
remaining test_binary_runs already covers 'does the pinned image
actually work,' which is the only scenario these were really
guarding against.
31 tests -> 25.
The yaml generation logic moves wholesale onto DockerBottleBackend
where it's used. pipelock_write_yaml is deleted; pipelock.py keeps
the allowlist resolution helpers (still called by prepare_proxy and
by pipelock_allowlist_summary).
The pipelock_start error message that referenced "pipelock_write_yaml
must run first" now says "backend.prepare_proxy must run first."
tests/test_pipelock_yaml.py rewritten to drive DockerBottleBackend().
prepare_proxy(spec, yaml_path); test_pipelock_sidecar_smoke.py call
site updated similarly. Same coverage at the new location.
Across the package:
- claude_bottle/platform/ -> claude_bottle/backend/
- platform/docker/platform.py -> backend/docker/backend.py
- class BottlePlatform -> BottleBackend
- class DockerBottlePlatform -> DockerBottleBackend
- get_bottle_platform() -> get_bottle_backend()
- env var CLAUDE_BOTTLE_PLATFORM -> CLAUDE_BOTTLE_BACKEND
- dict _PLATFORMS -> _BACKENDS
"Backend" is shorter and more established as the term for a
pluggable strategy-pattern implementation. "Platform" was vague
(could mean OS, hardware, cloud) and mildly redundant — Docker is
itself a platform.
The previous PRD section claiming "the Backend protocol was
rejected" referred to a low-level run/exec/cp/network_connect
protocol; the name was never the reason. The PRD is updated to
describe that rejected design by shape rather than by name.
The bottle/agent concepts and the manifest schema are unchanged.
Introduce claude_bottle/bottles/ with a Bottle Protocol and a
get_bottle_factory() that dispatches on CLAUDE_BOTTLE_PLATFORM
(default "docker"). Move every Docker-specific subprocess.run call
from cli/start.py, plus the orchestration of build, networks, the
pipelock sidecar, container launch, and per-container provisioning
(prompt, skills, ssh, .git), into create_docker_bottle.
Drop bottles[].runtime from the manifest schema. Auto-detect whether
gVisor is registered with the daemon and pass --runtime=runsc when it
is; the preflight shows the resolved runtime so the choice is visible.
Manifests still carrying 'runtime' get a clear error pointing at the
auto-detect behavior, rather than silent ignore.
Out of scope: cli/cleanup.py and cli/list.py still call docker
directly. They enumerate active bottles across the host, which is a
separate concern from "create a bottle" and is left for a follow-up
that introduces a list_active/cleanup primitive on the factory.
Replace the TypedDict + 14 manifest_* free functions with frozen
dataclasses (SshEntry, BottleEgress, Bottle, Agent, Manifest) carrying
their own validators and constructors. Call sites import Manifest and
chain attribute access; the manifest_* helpers and manifest_validate
are gone.
Behavior changes worth flagging:
- Agent.bottle is now required (was optional with a "(none)" fallback).
Manifest.from_json_obj dies if any agent lacks a 'bottle' field or
references an undefined bottle, where previously start.py raised the
error lazily for the specific agent being launched.
- ssh.py now takes SshEntry instances; Host/IdentityFile shape checks
moved upstream into Manifest construction, leaving only the IdentityFile
filesystem-existence check in ssh_validate_entries.
- pipelock_bottle_allowlist's per-element string check is dropped — the
Manifest validator enforces it at load.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Move schema checks out of per-access getters into a single
manifest_validate pass invoked by manifest_resolve. Getters can now
assume bottles/agents are well-typed dicts and every agent has a
defined bottle, so the .get(...) or {} chains collapse. Behavior
change: a bad runtime / shape error anywhere in the manifest now
fails at load instead of on the N-th read.
Intermediate step toward replacing TypedDict with a dataclass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bottles can now set "runtime": "runsc" to launch the agent container
under gVisor instead of runc, adding a userspace syscall barrier
between the agent and the host kernel. Default is runc (Docker
default). Pipelock stays on the default runtime per the research doc's
minimum-diff prescription.
The launcher verifies runsc is registered with the daemon before
launch, surfaces the runtime in the preflight plan, and dies with an
install pointer (and a macOS-not-supported note) when runsc is
requested but unavailable.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces cli.sh + lib/*.sh with a claude_bottle/ Python package and a
cli.py entry point. No external dependencies — uses only Python's
stdlib (json, subprocess, getpass, tempfile, argparse, re, etc.).
- claude_bottle/{log,docker,manifest,env_resolve,network,pipelock,
skills,ssh,cli}.py mirror the previous lib/*.sh modules.
- Tests converted to unittest under tests/test_*.py with a stdlib
runner at tests/run_tests.py (unit | integration | path).
- .githooks/commit-msg ported to Python; same Conventional Commits rules.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>