Avoids cross-instance state via class attribute; the proxy is now
constructed in __init__ alongside its owning backend.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The test overrides HOME to isolate the manifest under test from the
dev's real ~/claude-bottle.json. On Docker Desktop that override
also breaks docker CLI endpoint resolution, since the active context
is read from $HOME/.docker/config.json and the per-user socket lives
under $HOME/.docker/run/docker.sock. Forward the parent's resolved
endpoint via DOCKER_HOST so the subprocess reaches the same daemon
regardless of $HOME.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ResolvedEnv.forwarded now carries name->value pairs instead of names
whose values had been side-loaded into os.environ. The Docker backend
collects the dict (plus the renamed OAuth token) and passes it via
subprocess.run(env=...) so docker run -e NAME forwards by-name from
the child's environment, not the parent's.
Values are excluded from the dataclass repr (forwarded on ResolvedEnv,
forwarded_env on DockerBottlePlan) so accidental logging cannot leak
secret or interpolated values.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Silences pylint W1510 / ruff PLW1510 across the codebase. The choice
at each site reflects existing intent:
- check=True where the caller implicitly trusts success (docker ps /
network ls returning stdout, docker build, exec chown/chmod inside
provisioners).
- check=False where the caller inspects .returncode (race-retry on
docker run, pipelock sidecar lifecycle, network plumbing, exec_claude
propagating the session's exit code, best-effort cleanup paths).
No behavior change; check= defaults to False so the False sites are
semantically identical.
Adds pyrightconfig.json (strict, Python 3.11) covering cli.py,
claude_bottle/, and tests/. Fixes the 49 strict-mode errors:
- Type DockerBottle.teardown as Callable[[], None].
- ResolvedEnv default_factory uses parameterized list[str] / dict[str, str].
- Erase BottleBackend generics at the registry boundary
(BottleBackend[Any, Any]) since selection is runtime-driven and
callers use the unparameterized interface.
- DockerBottleBackend.launch returns Generator[DockerBottle, None, None];
@contextmanager now flags Iterator returns as deprecated.
- Sidestep cli.list submodule shadowing builtins.list in main()'s argv
annotation via an aliased re-import in cli/__init__.py.
- Cast cfg[...] results in test_pipelock_yaml at the dict[str, object]
boundary.
- Annotate write_fixture's fn parameter and _manifest_with_runtime's
return type.
DockerBottlePlan.print and .to_dict each pulled the same agent /
bottle / env_names / ssh_hosts / prompt-first-line out of the spec
before formatting. Extract a private _view() helper that returns a
small frozen _PlanView dataclass with those derived fields; both
methods consume it. Removes the duplicated derivation and the risk
that one renderer drifts from the other (the OAuth-name append in
particular existed twice).
Previously prepare wrote two on-disk artifacts that launch consumed:
agent.env (NAME=VALUE) and docker-args (paired -e\nNAME\n lines), with
launch parsing the second back into argv. Docker requires the literals
file on disk for --env-file, but the args-file round-trip was a pure
serialize/deserialize trip with hand-rolled line pairing logic.
Drop docker-args entirely. Pass forwarded names as a structured
tuple[str, ...] field on DockerBottlePlan; launch iterates it directly
to extend docker_args. _write_env_files becomes _write_env_file (only
the literals file remains).
Both prepare-time probing and launch-time race-retry generated the
same `<base>, <base>-2, ..., <base>-N` sequence with their own copies
of the suffix arithmetic and the 99-cap. Extract the candidate stream
into docker/util.container_name_candidates and have both call sites
walk it; each keeps its own predicate (probe vs. retry).
Also bumps the cap into a named constant (MAX_CONTAINER_SUFFIX) so
the two error messages can't drift.
Previously _run_agent_container set os.environ["CLAUDE_CODE_OAUTH_TOKEN"]
deep inside the launch path and added a one-off `-e` pair to docker_args,
which was the only env var to bypass the resolved.forwarded flow used
for everything else.
Move the os.environ mutation + name registration into prepare, right
after resolve_env, so the OAuth token rides the same forwarded-by-name
mechanism as secrets and interpolated entries. _run_agent_container
loses the special case entirely.
Parameterize BottleBackend over PlanT (bound to BottlePlan) and
CleanupT (bound to BottleCleanupPlan). DockerBottleBackend declares
itself BottleBackend[DockerBottlePlan, DockerBottleCleanupPlan], which
narrows every method's plan parameter to the concrete type and lets
the six `assert isinstance(plan, DockerBottlePlan)` lines on
launch/cleanup/provision_* go away.
The dict in get_bottle_backend keeps its unparameterized
BottleBackend element type so it can hold heterogeneous backend
specializations.
Replace the manual state-dict + per-resource branching teardown in
DockerBottleBackend.launch with an ExitStack: each resource registers
its own cleanup callback at the moment it's created, and stack.close()
unwinds in LIFO order. The previous form had to hand-coordinate four
nullable slots and re-check existence for the container; ExitStack
encodes the same semantics declaratively.
The smoke test now drives the production prepare/start path, which
calls network_create_internal. Under Gitea act_runner the docker
socket mount topology makes `docker network create --internal` fail
(or be invisible across the host/job-container boundary) — the same
limitation that test_orphan_cleanup.test_create_and_remove already
skips for. Match that skip here so CI goes green; the test still
runs in environments with a direct docker daemon.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PipelockProxy.prepare now accepts (bottle, slug, stage_dir) and derives
the yaml_path itself, so callers don't need to know the filename.
DockerBottleBackend.prepare_proxy becomes a one-line wrapper whose only
caller already has bottle and slug in scope, so it's inlined and
deleted.
The four lower-level helpers (pipelock_bottle_allowlist,
pipelock_bottle_ssh_hostnames, pipelock_bottle_ssh_ip_cidrs,
pipelock_bottle_ssh_trusted_domains) are one-line filters; testing
each in isolation duplicates coverage that pipelock_effective_allowlist
already provides end-to-end. The /32 CIDR suffix is the only behavior
beyond filtering, so it keeps a tiny dedicated test.
Drops the misplaced test_rejects_non_string_entry — that's manifest
validation, not allowlist resolution. Belongs in a manifest-validation
test file (which doesn't exist yet); leaving for a separate PR rather
than adding a one-branch sample here.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Move the --format=json-requires-dry-run check out of the integration
suite (it doesn't need Docker — argparse fails before any backend
runs) and tighten the assertion: previously asserted only that exit
code was nonzero, so any unrelated breakage (manifest resolution
failure, bad agent name, etc.) silently passed. Now asserts stderr
contains the actual flag-conflict message.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Compares smolmachines against the six subsystems in
agent-vm-isolation.md. smolmachines replaces the microVM runtime,
network attachment (libkrun TSI with built-in DNS-over-vsock filter),
vsock control plane, and Python lifecycle wrapper. Pipelock stays;
disk-image story shifts to OCI + writable overlay. Recommends adopting
smolmachines as the macOS VM backend after smoke-testing TSI
passthrough to a host-side pipelock.
Transcript-style notes on running an agent in a hardware-isolated
microVM on macOS. Covers Virtualization.framework / vfkit / libkrun
choices, hardware-isolation guarantees, driving VMs from Python
(subprocess or PyObjC), pipelock as the egress proxy, vsock for the
control channel, and egress enforcement via
VZFileHandleNetworkDeviceAttachment + gvisor-tap-vsock.
The old smoke test hand-rolled the docker create/cp/start sequence in
parallel with what DockerPipelockProxy.start already does, so any
divergence in production code wouldn't trip it. Rewritten to call
.prepare and .start directly and probe /health from a sibling curl
container on the same internal network — same access topology the
agent container uses in production.
In-network probing means the test no longer depends on a published
port, so it can run under act_runner (where host-loopback port
publishing isn't reachable from the job container).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
BottlePlan gains a to_dict method (abstract on the base, implemented
on DockerBottlePlan) returning a JSON-serializable view of the resolved
plan. `cli.py start --dry-run --format=json` prints it to stdout and
exits zero. --format=json without --dry-run is rejected — emitting JSON
during a real launch would race the y/N prompt.
The dry-run integration test now parses the JSON and asserts on
structured fields (agent, bottle, runtime, hosts sorted+deduped, etc.)
instead of regex-matching the human-readable preflight stdout. That
kills the magic-"8 hosts allowed" coupling — adding a new baked
default doesn't break the test.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Split pipelock config building from YAML rendering: pipelock_build_config
returns a dict, pipelock_render_yaml serializes it, and _build_pipelock_yaml
chains the two onto disk. Unchanged behavior — pipelock loads the same YAML.
The yaml test now asserts on the structured config dict, which is
robust to cosmetic YAML changes (key order, quoting). The two checks
that only make sense on the rendered output — file mode 0600 and
no-secret-leakage — stay against the on-disk content.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace the hand-maintained INTEGRATION_NAMES classifier (and the
bespoke run_tests.py around it) with a directory-driven split:
tests/unit/ unit tests, always run
tests/integration/ Docker-dependent, skip cleanly without Docker
tests/canaries/ upstream-regression checks, opt-in via
CLAUDE_BOTTLE_RUN_CANARIES=1
The pinned-pipelock-image check moves to the canary suite — it tests
upstream packaging, not our code, so it shouldn't gate every dev push.
A scheduled canaries.yml workflow runs it weekly.
The manifest-runtime tests collapse the four assertRaises cases for
distinct 'runtime' values into one subTest loop and drop the
error-message-wording assertions; the contract is "any value is
rejected", not "the error literally contains 'auto-detect'".
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Empty commit to seed the branch so a PR can be opened against main.
Actual test refactor work will follow.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Compares claude-bottle to endo-familiar, litterbox, agent-safehouse,
matchlock, tilde.run, boxlite, microsandbox, and smolmachines. Covers
isolation primitive, locality, agent integration, network policy, and
maturity, and notes three borrowable ideas (per-use SSH confirmation,
in-flight secret injection, microVM backend) that fit the current
bash-first / local-Docker stance.
Renames the file and rewrites the body around what actually shipped:
class-based BottleBackend ABC (not a free create_docker_bottle
function), the two-phase prepare/launch split, the backend/docker/
subpackage layout, env.py reshaped into a backend-neutral ResolvedEnv,
and PipelockProxy split between top-level and backend/docker/.
resolve_env_into(...) becomes resolve_env(manifest, agent) -> ResolvedEnv
(forwarded names + literals). The docker backend now owns env-file /
argv serialization and the --env-file newline check. Also drops stray
Docker references from manifest.py, pipelock.py, util.py, and trims
the duplicated command list from cli.py's docstring (usage() in
claude_bottle/cli/__init__.py is now the only listing).
Module name aligns with the others (manifest, pipelock, network,
log) — nouns/noun-y, not verb phrases. The function name now reads
naturally at the call site: resolve_env_into(manifest, agent,
env_file, args_file).
New file claude_bottle/backend/util.py for cross-backend host-side
helpers:
host_skill_dir(name) — resolves $HOME/.claude/skills/<name>
docker/util.py gains:
docker_exec_root(container, argv) — `docker exec -u 0` wrapper used
by SSH provisioning
DockerBottleBackend drops the two methods that wrapped these
(`_host_skill_dir`, `_docker_exec_root`) — they had no instance state
and just lived on the class for organizational reasons. Call sites
now use the imported functions directly.
Matches the allowlist-resolution helpers' shape: the caller resolves
the bottle once and passes it in. Signature drops from
(manifest, bottle_name, slug, yaml_path) to (bottle, slug, yaml_path).
DockerBottleBackend.prepare_proxy uses manifest.bottle_for(agent_name)
to get the bottle directly. Tests pass fixture.bottles[name].
prepare's docstring also explains what `slug` is: the lowercased,
hyphen-normalized agent identifier used as the suffix in every
per-agent resource name (agent container, pipelock container, the
internal/egress networks). It's stored on the plan so start can
derive the sidecar's container name.
Top-level pipelock.py drops the Manifest import — no longer used.
Both constants were already only used by Docker-specific code (the
sidecar boot, the proxy_url/host_port naming helpers, the image
contract test). Move them next to DockerPipelockProxy.
Top-level pipelock.py drops the 'os' import along with the constants;
the two test files that pulled PIPELOCK_IMAGE retarget at the new
location.
The three slug-based naming helpers were nominally on pipelock.py but
each assumed a Docker container topology (the container name is
'claude-bottle-pipelock-<slug>', the proxy URL uses that container
name). Move them next to DockerPipelockProxy:
pipelock_container_name -> claude-bottle-pipelock-<slug>
pipelock_proxy_url -> http://<container>:<port>
pipelock_proxy_host_port -> <container>:<port>
backend.py imports them directly from .pipelock; the orphan-cleanup
test imports container_name from the same place.
PipelockProxy becomes an ABC with the platform-agnostic
prepare/_build_pipelock_yaml as concrete methods and start/stop as
abstract. Docker-specific sidecar lifecycle moves to a new sibling
file:
claude_bottle/backend/docker/pipelock.py
DockerPipelockProxy(PipelockProxy) — implements start (docker
create/cp/network connect/start) and stop (docker inspect/rm -f).
DockerBottleBackend._proxy is now a DockerPipelockProxy instance.
Tests that previously instantiated PipelockProxy() directly switch to
DockerPipelockProxy() (the base is no longer constructable).
Every function in the 'Allowlist resolution' section was doing
`manifest.bottles[bottle_name].X` as its first move. Push the lookup
to the caller and have each helper take a resolved Bottle:
pipelock_bottle_allowlist
pipelock_bottle_ssh_hostnames
pipelock_bottle_ssh_trusted_domains
pipelock_bottle_ssh_ip_cidrs
pipelock_effective_allowlist
pipelock_allowlist_summary
PipelockProxy._build_pipelock_yaml resolves bottle once at the top
and passes it through; DockerBottleBackend.prepare already had the
bottle in scope and now uses it directly. Tests pass the resolved
bottle from each fixture.
The classifier is a pure dotted-quad regex check — nothing
pipelock-specific about it. Pipelock now imports it from util.
test_pipelock_classify.py retargets at the new location.
Two manifest-accessor functions in pipelock.py
(pipelock_bottle_allowlist, pipelock_bottle_ssh_hostnames) look
generic but are 1-line wrappers used only internally; they stay
for now.
ProxyPlan -> PipelockProxyPlan, with two additional fields populated
in launch: internal_network, egress_network (default ""). prepare
fills yaml_path + slug; launch uses dataclasses.replace to populate
the networks before calling start.
pipelock_start -> PipelockProxy.start(plan). Reads yaml_path, slug,
internal_network, egress_network off the plan. Returns the resolved
container name.
pipelock_stop -> PipelockProxy.stop(proxy_target). Takes the resolved
container name directly (the value that start returned); no longer
needs to know about slugs or naming conventions.
Backend launch passes the running container name (state["pipelock"])
to stop. Test for stop's idempotency uses pipelock_container_name to
construct the proxy_target.
Add a frozen ProxyPlan dataclass to pipelock.py (currently one field:
yaml_path; kept as a class so future proxy-level state has a home).
- prepare_proxy(spec, stage_dir) now returns pipelock.ProxyPlan
instead of a raw Path.
- DockerBottlePlan replaces pipelock_yaml_path + pipelock_yaml_filename
with a single proxy: ProxyPlan field.
- launch reads plan.proxy.yaml_path.parent / .name when calling
pipelock_start. Eventually pipelock_start should just take a Path
but that's a separate change.
prepare_proxy(spec, stage_dir) -> Path now decides where the
pipelock yaml lives inside stage_dir (currently 'pipelock.yaml'),
writes it via PipelockProxy.prepare, and returns the resolved path.
The caller (prepare) drops its 'pipelock_yaml_filename' /
'pipelock_yaml = stage_dir / filename' setup and just consumes the
returned Path; the plan's pipelock_yaml_filename is derived from
.name on the path.
The YAML generation now lives on PipelockProxy.prepare(manifest,
bottle_name, yaml_path) in claude_bottle/pipelock.py. The class is the
natural home for any future proxy-level state.
DockerBottleBackend keeps an instance as a class attribute
(_proxy = PipelockProxy()) and its prepare_proxy becomes a thin
delegation. A future backend that wants a different egress proxy
(or none) plugs in its own strategy.
Tests retarget at the new home — PipelockProxy.prepare gets the
content-shape assertions; the sidecar smoke test uses the class
directly too. Same coverage.
Drops 6 tests with no real coverage loss:
- tests/test_pipelock_naming.py — 4 tests asserting that f-string
format helpers return their f-string. Shape locks, not behavior
gates.
- tests/test_pipelock_image.py:test_entrypoint_contains_pipelock and
:test_cmd_contains_run — Docker image metadata inspection. The
remaining test_binary_runs already covers 'does the pinned image
actually work,' which is the only scenario these were really
guarding against.
31 tests -> 25.
The yaml generation logic moves wholesale onto DockerBottleBackend
where it's used. pipelock_write_yaml is deleted; pipelock.py keeps
the allowlist resolution helpers (still called by prepare_proxy and
by pipelock_allowlist_summary).
The pipelock_start error message that referenced "pipelock_write_yaml
must run first" now says "backend.prepare_proxy must run first."
tests/test_pipelock_yaml.py rewritten to drive DockerBottleBackend().
prepare_proxy(spec, yaml_path); test_pipelock_sidecar_smoke.py call
site updated similarly. Same coverage at the new location.
_expand_tilde was a path-string helper on DockerBottleBackend but
nothing about it was Docker-specific — any future backend reading
host paths from manifest entries would want it. Lift to
claude_bottle/util.py (sibling of log.py / manifest.py) as a public
expand_tilde() function. Docker-specific primitives stay in
claude_bottle/backend/docker/util.py.
Mirrors the skills.py absorb. ssh_validate_entries -> validate_ssh_entries
(called from prepare); ssh_setup body -> provision_ssh; the
_docker_exec_root and _expand_tilde helpers become private methods on
the backend.
The detailed isolation-model docstring from the old module moves to
provision_ssh, where the code lives now. ssh.py deleted.