BottlePlan gains a to_dict method (abstract on the base, implemented
on DockerBottlePlan) returning a JSON-serializable view of the resolved
plan. `cli.py start --dry-run --format=json` prints it to stdout and
exits zero. --format=json without --dry-run is rejected — emitting JSON
during a real launch would race the y/N prompt.
The dry-run integration test now parses the JSON and asserts on
structured fields (agent, bottle, runtime, hosts sorted+deduped, etc.)
instead of regex-matching the human-readable preflight stdout. That
kills the magic-"8 hosts allowed" coupling — adding a new baked
default doesn't break the test.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Split pipelock config building from YAML rendering: pipelock_build_config
returns a dict, pipelock_render_yaml serializes it, and _build_pipelock_yaml
chains the two onto disk. Unchanged behavior — pipelock loads the same YAML.
The yaml test now asserts on the structured config dict, which is
robust to cosmetic YAML changes (key order, quoting). The two checks
that only make sense on the rendered output — file mode 0600 and
no-secret-leakage — stay against the on-disk content.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace the hand-maintained INTEGRATION_NAMES classifier (and the
bespoke run_tests.py around it) with a directory-driven split:
tests/unit/ unit tests, always run
tests/integration/ Docker-dependent, skip cleanly without Docker
tests/canaries/ upstream-regression checks, opt-in via
CLAUDE_BOTTLE_RUN_CANARIES=1
The pinned-pipelock-image check moves to the canary suite — it tests
upstream packaging, not our code, so it shouldn't gate every dev push.
A scheduled canaries.yml workflow runs it weekly.
The manifest-runtime tests collapse the four assertRaises cases for
distinct 'runtime' values into one subTest loop and drop the
error-message-wording assertions; the contract is "any value is
rejected", not "the error literally contains 'auto-detect'".
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Empty commit to seed the branch so a PR can be opened against main.
Actual test refactor work will follow.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Compares claude-bottle to endo-familiar, litterbox, agent-safehouse,
matchlock, tilde.run, boxlite, microsandbox, and smolmachines. Covers
isolation primitive, locality, agent integration, network policy, and
maturity, and notes three borrowable ideas (per-use SSH confirmation,
in-flight secret injection, microVM backend) that fit the current
bash-first / local-Docker stance.
Renames the file and rewrites the body around what actually shipped:
class-based BottleBackend ABC (not a free create_docker_bottle
function), the two-phase prepare/launch split, the backend/docker/
subpackage layout, env.py reshaped into a backend-neutral ResolvedEnv,
and PipelockProxy split between top-level and backend/docker/.
resolve_env_into(...) becomes resolve_env(manifest, agent) -> ResolvedEnv
(forwarded names + literals). The docker backend now owns env-file /
argv serialization and the --env-file newline check. Also drops stray
Docker references from manifest.py, pipelock.py, util.py, and trims
the duplicated command list from cli.py's docstring (usage() in
claude_bottle/cli/__init__.py is now the only listing).
Module name aligns with the others (manifest, pipelock, network,
log) — nouns/noun-y, not verb phrases. The function name now reads
naturally at the call site: resolve_env_into(manifest, agent,
env_file, args_file).
New file claude_bottle/backend/util.py for cross-backend host-side
helpers:
host_skill_dir(name) — resolves $HOME/.claude/skills/<name>
docker/util.py gains:
docker_exec_root(container, argv) — `docker exec -u 0` wrapper used
by SSH provisioning
DockerBottleBackend drops the two methods that wrapped these
(`_host_skill_dir`, `_docker_exec_root`) — they had no instance state
and just lived on the class for organizational reasons. Call sites
now use the imported functions directly.
Matches the allowlist-resolution helpers' shape: the caller resolves
the bottle once and passes it in. Signature drops from
(manifest, bottle_name, slug, yaml_path) to (bottle, slug, yaml_path).
DockerBottleBackend.prepare_proxy uses manifest.bottle_for(agent_name)
to get the bottle directly. Tests pass fixture.bottles[name].
prepare's docstring also explains what `slug` is: the lowercased,
hyphen-normalized agent identifier used as the suffix in every
per-agent resource name (agent container, pipelock container, the
internal/egress networks). It's stored on the plan so start can
derive the sidecar's container name.
Top-level pipelock.py drops the Manifest import — no longer used.
Both constants were already only used by Docker-specific code (the
sidecar boot, the proxy_url/host_port naming helpers, the image
contract test). Move them next to DockerPipelockProxy.
Top-level pipelock.py drops the 'os' import along with the constants;
the two test files that pulled PIPELOCK_IMAGE retarget at the new
location.
The three slug-based naming helpers were nominally on pipelock.py but
each assumed a Docker container topology (the container name is
'claude-bottle-pipelock-<slug>', the proxy URL uses that container
name). Move them next to DockerPipelockProxy:
pipelock_container_name -> claude-bottle-pipelock-<slug>
pipelock_proxy_url -> http://<container>:<port>
pipelock_proxy_host_port -> <container>:<port>
backend.py imports them directly from .pipelock; the orphan-cleanup
test imports container_name from the same place.
PipelockProxy becomes an ABC with the platform-agnostic
prepare/_build_pipelock_yaml as concrete methods and start/stop as
abstract. Docker-specific sidecar lifecycle moves to a new sibling
file:
claude_bottle/backend/docker/pipelock.py
DockerPipelockProxy(PipelockProxy) — implements start (docker
create/cp/network connect/start) and stop (docker inspect/rm -f).
DockerBottleBackend._proxy is now a DockerPipelockProxy instance.
Tests that previously instantiated PipelockProxy() directly switch to
DockerPipelockProxy() (the base is no longer constructable).
Every function in the 'Allowlist resolution' section was doing
`manifest.bottles[bottle_name].X` as its first move. Push the lookup
to the caller and have each helper take a resolved Bottle:
pipelock_bottle_allowlist
pipelock_bottle_ssh_hostnames
pipelock_bottle_ssh_trusted_domains
pipelock_bottle_ssh_ip_cidrs
pipelock_effective_allowlist
pipelock_allowlist_summary
PipelockProxy._build_pipelock_yaml resolves bottle once at the top
and passes it through; DockerBottleBackend.prepare already had the
bottle in scope and now uses it directly. Tests pass the resolved
bottle from each fixture.
The classifier is a pure dotted-quad regex check — nothing
pipelock-specific about it. Pipelock now imports it from util.
test_pipelock_classify.py retargets at the new location.
Two manifest-accessor functions in pipelock.py
(pipelock_bottle_allowlist, pipelock_bottle_ssh_hostnames) look
generic but are 1-line wrappers used only internally; they stay
for now.
ProxyPlan -> PipelockProxyPlan, with two additional fields populated
in launch: internal_network, egress_network (default ""). prepare
fills yaml_path + slug; launch uses dataclasses.replace to populate
the networks before calling start.
pipelock_start -> PipelockProxy.start(plan). Reads yaml_path, slug,
internal_network, egress_network off the plan. Returns the resolved
container name.
pipelock_stop -> PipelockProxy.stop(proxy_target). Takes the resolved
container name directly (the value that start returned); no longer
needs to know about slugs or naming conventions.
Backend launch passes the running container name (state["pipelock"])
to stop. Test for stop's idempotency uses pipelock_container_name to
construct the proxy_target.
Add a frozen ProxyPlan dataclass to pipelock.py (currently one field:
yaml_path; kept as a class so future proxy-level state has a home).
- prepare_proxy(spec, stage_dir) now returns pipelock.ProxyPlan
instead of a raw Path.
- DockerBottlePlan replaces pipelock_yaml_path + pipelock_yaml_filename
with a single proxy: ProxyPlan field.
- launch reads plan.proxy.yaml_path.parent / .name when calling
pipelock_start. Eventually pipelock_start should just take a Path
but that's a separate change.
prepare_proxy(spec, stage_dir) -> Path now decides where the
pipelock yaml lives inside stage_dir (currently 'pipelock.yaml'),
writes it via PipelockProxy.prepare, and returns the resolved path.
The caller (prepare) drops its 'pipelock_yaml_filename' /
'pipelock_yaml = stage_dir / filename' setup and just consumes the
returned Path; the plan's pipelock_yaml_filename is derived from
.name on the path.
The YAML generation now lives on PipelockProxy.prepare(manifest,
bottle_name, yaml_path) in claude_bottle/pipelock.py. The class is the
natural home for any future proxy-level state.
DockerBottleBackend keeps an instance as a class attribute
(_proxy = PipelockProxy()) and its prepare_proxy becomes a thin
delegation. A future backend that wants a different egress proxy
(or none) plugs in its own strategy.
Tests retarget at the new home — PipelockProxy.prepare gets the
content-shape assertions; the sidecar smoke test uses the class
directly too. Same coverage.
Drops 6 tests with no real coverage loss:
- tests/test_pipelock_naming.py — 4 tests asserting that f-string
format helpers return their f-string. Shape locks, not behavior
gates.
- tests/test_pipelock_image.py:test_entrypoint_contains_pipelock and
:test_cmd_contains_run — Docker image metadata inspection. The
remaining test_binary_runs already covers 'does the pinned image
actually work,' which is the only scenario these were really
guarding against.
31 tests -> 25.
The yaml generation logic moves wholesale onto DockerBottleBackend
where it's used. pipelock_write_yaml is deleted; pipelock.py keeps
the allowlist resolution helpers (still called by prepare_proxy and
by pipelock_allowlist_summary).
The pipelock_start error message that referenced "pipelock_write_yaml
must run first" now says "backend.prepare_proxy must run first."
tests/test_pipelock_yaml.py rewritten to drive DockerBottleBackend().
prepare_proxy(spec, yaml_path); test_pipelock_sidecar_smoke.py call
site updated similarly. Same coverage at the new location.
_expand_tilde was a path-string helper on DockerBottleBackend but
nothing about it was Docker-specific — any future backend reading
host paths from manifest entries would want it. Lift to
claude_bottle/util.py (sibling of log.py / manifest.py) as a public
expand_tilde() function. Docker-specific primitives stay in
claude_bottle/backend/docker/util.py.
Mirrors the skills.py absorb. ssh_validate_entries -> validate_ssh_entries
(called from prepare); ssh_setup body -> provision_ssh; the
_docker_exec_root and _expand_tilde helpers become private methods on
the backend.
The detailed isolation-model docstring from the old module moves to
provision_ssh, where the code lives now. ssh.py deleted.
The whole module folds into two methods on the backend:
validate_skills(skills) — called from prepare; fails loudly when
a named skill is missing on the host so
the user doesn't get a y/N for a plan
that's already known to break.
_host_skill_dir(name) — private helper used by both
validate_skills and provision_skills.
skills.py is deleted; the four prior functions (host_skill_dir,
host_skill_exists, require_host_skill, skills_validate_all) collapse
into the two above without losing the pre-y/N validation.
The copy logic was Docker-specific (docker exec mkdir / rm -rf,
docker cp); it had no reason to live in a top-level skills module.
Pull the body into DockerBottleBackend.provision_skills.
skills.py keeps the host-side discovery + validation
(host_skill_dir, host_skill_exists, require_host_skill,
skills_validate_all). The orphaned CONTAINER_HOME /
CONTAINER_SKILLS_DIR constants and the now-unused subprocess + info
imports are removed.
Template Method pattern. BottleBackend.provision is now concrete and
orchestrates four abstract sub-methods:
provision_prompt -> str | None (only one with a meaningful return)
provision_skills -> None
provision_ssh -> None
provision_git -> None
Each is self-gating: skills/ssh/git short-circuit on empty inputs;
prompt always copies (the path must exist) and returns None when the
agent has no prompt content.
DockerBottleBackend drops its own `provision` (inherited from the
base) and now just implements the four sub-methods. Each sub-method
takes `plan: BottlePlan` (matching the abstract) and asserts
isinstance to narrow to DockerBottlePlan internally, same pattern as
`launch`.
A future fly.io backend implements the four sub-methods; provision
works for it unchanged.
Four symmetric provision sub-methods now: provision_prompt,
provision_skills, provision_ssh, provision_git. Each self-gates with
an early return; provision is pure orchestration.
provision now orchestrates three focused sub-methods. Each sub-method
self-gates: provision_ssh is a no-op when the bottle has no SSH
entries; provision_git is a no-op when --cwd was not set. The prompt
copy + chown always runs (so the path always exists in-container);
the return is gated on whether the agent has a non-empty prompt.
BottleProvisioner had no independent identity — no state, only one
caller, never selected, never crossed a method boundary as data. It
was a method dressed up as a class. Reverting that turn:
- BottleBackend gains an abstract provision(plan, target).
- DockerBottleBackend.provision absorbs the body that lived on
DockerBottleProvisioner.
- backend/docker/provisioner.py deleted.
- BottleProvisioner ABC removed from backend/__init__.py.
- launch now calls self.provision(plan, container) directly.
Net: -1 file, -1 class, -1 ABC. Same behavior; tests pass.
Lift the file-copying-into-the-running-container step out of
DockerBottleBackend._provision_container into its own class. The
backend now holds a DockerBottleProvisioner instance and delegates
the post-launch provisioning to it.
- BottleProvisioner (abstract) in backend/__init__.py with a
`provision(plan, target) -> str | None` method.
- DockerBottleProvisioner (concrete) in backend/docker/provisioner.py
inheriting from the base, narrowing plan to DockerBottlePlan via
isinstance, and carrying the prompt/skills/SSH/.git copy logic
unchanged.
- DockerBottleBackend keeps a class-level DockerBottleProvisioner()
and calls self._provisioner.provision(plan, container) from launch.
_provision_container method removed.
No behavior change.
Across the package:
- claude_bottle/platform/ -> claude_bottle/backend/
- platform/docker/platform.py -> backend/docker/backend.py
- class BottlePlatform -> BottleBackend
- class DockerBottlePlatform -> DockerBottleBackend
- get_bottle_platform() -> get_bottle_backend()
- env var CLAUDE_BOTTLE_PLATFORM -> CLAUDE_BOTTLE_BACKEND
- dict _PLATFORMS -> _BACKENDS
"Backend" is shorter and more established as the term for a
pluggable strategy-pattern implementation. "Platform" was vague
(could mean OS, hardware, cloud) and mildly redundant — Docker is
itself a platform.
The previous PRD section claiming "the Backend protocol was
rejected" referred to a low-level run/exec/cp/network_connect
protocol; the name was never the reason. The PRD is updated to
describe that rejected design by shape rather than by name.
The bottle/agent concepts and the manifest schema are unchanged.
'bottles' was the package name when it held a single Bottle Protocol;
since we added BottlePlatform / BottlePlan / BottleCleanupPlan and
made it the home of platform dispatch, 'platform' describes the
package better. The 'bottle' concept (and the manifest field) stays.
CLI imports update from ..bottles to ..platform; internal relative
imports inside the package survive the rename unchanged. Git
detected all 7 file renames.
Bottle was the only Protocol in an otherwise-ABC family
(BottlePlan, BottleCleanupPlan, BottlePlatform are all ABCs).
Convert to an ABC with abstract exec_claude / cp_in / close,
matching the rest of the hierarchy.
Rename _DockerBottle -> DockerBottle: the underscore was a
default-Python-private instinct that doesn't match the sibling
plan classes (DockerBottlePlan, DockerBottleCleanupPlan), all of
which are equally "only constructed by the platform" and yet
public-by-name.
Re-export DockerBottle from claude_bottle.bottles.docker.
The single docker/__init__.py grew to ~555 lines holding the platform,
its plan classes, the bottle handle, and the runsc probe. Split into:
- util.py : Docker subprocess primitives + runsc_available
- bottle_plan.py : DockerBottlePlan (+ its print method)
- bottle_cleanup_plan.py : DockerBottleCleanupPlan
- bottle.py : _DockerBottle handle class
- platform.py : DockerBottlePlatform (the bulk)
docker/__init__.py becomes a thin re-export shim so existing imports
(claude_bottle.bottles.docker.DockerBottlePlatform, etc.) keep working.
- claude_bottle/bottles/docker.py -> claude_bottle/bottles/docker/__init__.py
- claude_bottle/docker.py -> claude_bottle/bottles/docker/util.py
The host-side Docker primitives (require_docker, slugify, image_exists,
container_exists, build_image, build_image_with_cwd) lived at the top
of the package as if cross-cutting, but they are Docker-specific.
Moving them into bottles/docker/util.py colocates them with the
platform that uses them and clears the top-level namespace.
Imports in bottles/docker/__init__.py shift to `from . import util as
docker_mod` so the existing call sites (`docker_mod.require_docker()`,
etc.) keep working unchanged.
Add list_active() abstract method on BottlePlatform; DockerBottlePlatform
implements it with the existing docker ps logic. cli/list.py no longer
calls docker directly — it just dispatches the active branch to the
platform.
cmd_cleanup used to only sweep running containers via `docker ps`,
missing stopped pipelock sidecars and orphaned networks entirely. On
my host the new version surfaced ~10 stranded networks left behind by
SIGKILLed sessions — the kind of thing the old command implied it was
handling.
New shape, symmetric with start:
- BottleCleanupPlan (abstract, in bottles/__init__.py) with `print` +
`empty` abstract members.
- DockerBottleCleanupPlan (concrete, in bottles/docker.py) carrying
the resolved tuples of containers and networks.
- BottlePlatform gains abstract prepare_cleanup() + cleanup(plan).
DockerBottlePlatform implements both:
- prepare_cleanup: docker ps -a + docker network ls, both
filtered to ^claude-bottle-, sorted for stable output.
- cleanup: docker rm -f containers first (they hold the network
attachment), then docker network rm.
- cmd_cleanup is now ~25 lines: prepare → print → y/N → cleanup.
DockerBottlePlatform.launch already runs 'docker build' on every
start, and the layer cache makes no-change rebuilds nearly free.
Anything 'cli.py build' did, 'cli.py start' does for you.
They were the only remaining module-level helpers in docker.py besides
runsc_available(); promote them to private methods on DockerBottlePlatform
so all the Docker orchestration logic lives in one place.
The module-level prepare_docker_bottle and launch_docker_bottle were
trivial delegations from the platform class. Move their bodies onto
DockerBottlePlatform.prepare and .launch directly. The launch method
keeps the @contextmanager decorator; isinstance narrows BottlePlan
to DockerBottlePlan at the entry point.
Internal helpers (_run_agent_container, _provision_container,
runsc_available) stay at module scope — they're stateless and don't
benefit from self.
No behavior change. Tests pass; dry-run output unchanged.
Mirror the BottlePlan -> DockerBottlePlan hierarchy at the platform
layer. BottlePlatform is now an abstract base with abstract `prepare`
and `launch` methods; DockerBottlePlatform lives alongside the rest of
the Docker code in bottles/docker.py and supplies the concrete impls.
The registry in bottles/__init__.py now holds an instance of each
concrete platform class. Future per-platform state (region, api
token, cleanup primitives) has a natural home on the subclass rather
than being stitched onto a dataclass struct.
No behavior change. Tests pass; dry-run output unchanged.
- Add BottlePlan (frozen dataclass + ABC) with spec, stage_dir, and an
abstract `print(*, remote_control)` method.
- DockerBottlePlan now inherits from BottlePlan; spec/stage_dir come
from the base, Docker-specific fields stay on the subclass.
- Move BottleSpec from bottles/docker.py to bottles/__init__.py so the
cross-platform types live together. docker.py pulls them via
`from . import ...`.
- Move show_plan from cli/start.py to `DockerBottlePlan.print`. Caller
becomes `plan.print(remote_control=...)`. The CLI no longer reads
any Docker-specific fields.
- BottlePlatform.prepare is now typed `Callable[..., BottlePlan]`.
cmd_start drops ~46 more lines.
The spec is intent-only and platform-agnostic — only the plan carries
Docker-specific fields. Drop the 'Docker' prefix and re-export from
claude_bottle.bottles so callers see it as cross-platform.
The Docker factory had absorbed live container ops but left the
host-side prep (image-name resolution, container-name collision
retry, pipelock yaml generation, env_resolve writes, host
validation) in cli/start.py. That kept ~half the Docker-specific
logic outside the abstraction.
Split the factory into two phases:
prepare_docker_bottle(spec, stage_dir=...) -> DockerBottlePlan
Resolves names, validates skills/SSH, writes scratch files.
No Docker resources created yet.
launch_docker_bottle(plan) -> ContextManager[Bottle]
Builds image, creates networks, boots pipelock, runs the
agent container, provisions files. Teardown on exit.
DockerBottleSpec shrinks to intent-only inputs (manifest, agent
name, --cwd flag, user_cwd, forward_oauth_token). The CLI no longer
references docker_mod, pipelock, skills, ssh, or env_resolve.
get_bottle_factory becomes get_bottle_platform returning a
BottlePlatform with .prepare and .launch — one selectable thing per
platform.
The Bottle handle now remembers the in-container prompt path and
adds --append-system-prompt-file to claude's argv when present, so
the CLI no longer needs to know the path.
cmd_start: ~148 lines down from 229. Tests pass; dry-run output
byte-identical.
Introduce claude_bottle/bottles/ with a Bottle Protocol and a
get_bottle_factory() that dispatches on CLAUDE_BOTTLE_PLATFORM
(default "docker"). Move every Docker-specific subprocess.run call
from cli/start.py, plus the orchestration of build, networks, the
pipelock sidecar, container launch, and per-container provisioning
(prompt, skills, ssh, .git), into create_docker_bottle.
Drop bottles[].runtime from the manifest schema. Auto-detect whether
gVisor is registered with the daemon and pass --runtime=runsc when it
is; the preflight shows the resolved runtime so the choice is visible.
Manifests still carrying 'runtime' get a clear error pointing at the
auto-detect behavior, rather than silent ignore.
Out of scope: cli/cleanup.py and cli/list.py still call docker
directly. They enumerate active bottles across the host, which is a
separate concern from "create a bottle" and is left for a follow-up
that introduces a list_active/cleanup primitive on the factory.