# PRD 0003: Bottle factory abstraction - **Status:** Draft - **Author:** didericis - **Created:** 2026-05-10 ## Summary Introduce a per-backend factory function that owns the end-to-end lifecycle of a "bottle" (a running, isolated environment with claude inside). The first and only implementation lands as `create_docker_bottle`. No second backend ships in this PRD. ## Problem Today, "how to launch a bottle" is spread across roughly six modules (`claude_bottle/cli/start.py`, `pipelock.py`, `network.py`, `ssh.py`, `skills.py`, `docker.py`), each shelling out to `docker` directly via `subprocess.run(["docker", ...])`. That coupling means: - Adding a second backend (Apple's `container`, fly.io, a remote SSH host, etc.) requires editing every one of those call sites. The research note `docs/research/apple-container-backend.md` already flags this as a prerequisite for that work. - The pipelock sidecar topology — two networks, multi-attach, sidecar lifecycle — is a Docker implementation detail that has leaked into the top-level CLI orchestration. It reads as a core concept of the project, but a fly.io bottle would not need any of it. - The manifest carries a Docker-specific `runtime: "runsc"` field (`bottles[].runtime`). Anyone setting it has to know about gVisor, whether Docker has it registered, and what to do on macOS where it isn't available natively. The field has one valid non-default value and exists only because the current code can't decide on its own. The shape that fits the project's actual goals (isolated agent runs across multiple backends) is "one factory per backend," not "one container-runtime SDK with N drivers." A previous draft of this PRD considered a low-level runtime-primitive protocol (`run`, `exec`, `cp`, `network_connect`, ...) and rejected it as the wrong layer — it would have forced fly.io to pretend it's Docker. ## Goals / Success Criteria The feature works when all of the following are observable: - `cli.py start` works identically for an existing manifest with no user-visible changes other than (a) a startup log line naming the Docker runtime in use, and (b) `bottles[].runtime` no longer being a valid manifest field. - On a Linux host with gVisor registered, the agent container runs under `runsc` without anything in the manifest requesting it. - On a host without gVisor (including macOS), the agent container runs under the default `runc` runtime; nothing fails, no warning is printed beyond the runtime-name log line. - The existing test suite passes with no behavior changes other than the manifest-schema removal of `runtime`. The feature is **done** when all of the following ship: - A new `claude_bottle/backend/` package exists with `__init__.py` (factory selection) and `docker.py` (`create_docker_bottle`). - `create_docker_bottle` returns a context manager yielding a `Bottle` handle exposing `exec_claude(argv, *, tty=True)`, `cp_in(host, ctr)`, and teardown on context exit. - Every existing `subprocess.run(["docker", ...])` call in `cli/start.py`, `pipelock.py`, `network.py`, `ssh.py`, and `skills.py` either moves into `backend/docker.py` or is called from it. No top-level CLI code references `docker` directly. - `bottles[].runtime` is removed from the manifest schema, the dataclass in `manifest.py`, the example manifest, and any README / docs references. `require_runsc()` in `claude_bottle/docker.py` is deleted. - A single env var, `CLAUDE_BOTTLE_BACKEND` (default `"docker"`), selects the factory. Unknown values die at startup with a list of known backends. - The y/N preflight in `cli.py` includes the resolved Docker runtime alongside the allowlist summary. ## Non-goals - No second backend implementation. `create_container_bottle` and `create_flyio_bottle` are not in this PRD. The factory dict in `backend/__init__.py` ships with one entry. - No retries, async, or streaming exec. The current code is synchronous `subprocess.run`; the `Bottle` handle matches. - No behavior change beyond the runsc auto-detect. Pipelock topology, network naming, container naming, image build flow, and SSH provisioning all stay byte-identical. - No `--require-runsc` CLI escape hatch. If a user later wants "fail rather than silently downgrade," that's a follow-up. - No `bottles[].backend` manifest field. Backend is a property of the host environment, not the bottle definition (at least for now). ## Scope ### In scope - New `claude_bottle/backend/` package containing `__init__.py` and `docker.py`. - The `Bottle` Protocol definition and `create_docker_bottle` factory. - Moving Docker-specific subprocess calls into the factory. - Removing `bottles[].runtime` from the manifest schema and the dataclass. - Auto-detection of `runsc` registration via `docker info`. - Preflight integration: the existing y/N output names the resolved Docker runtime. - Test updates: any manifest fixtures referencing `runtime` are updated; tests that assert on `--runtime=runsc` instead seed the detection by mocking `docker info`. ### Out of scope - Apple `container` and fly.io factories (separate PRDs, deferred until the Docker factory is the only thing shipping). - Generalizing the pipelock sidecar to other backends. Pipelock topology is, after this PRD, an implementation detail private to `backend/docker.py`. - Rewriting `pipelock.py`'s YAML generation. The allowlist→YAML translation stays where it is and is called by the Docker factory. - Changes to `env_resolve.py`, `manifest.py` (beyond the `runtime` removal), or the agent schema. - CLI flags for runtime selection / override. ## Proposed Design ### New services / components A new package, `claude_bottle/backend/`: - **`claude_bottle/backend/__init__.py`** — Defines the `Bottle` Protocol and `get_bottle_factory()`. The factory registry is a module-level dict mapping backend name → factory function. Selection reads `CLAUDE_BOTTLE_BACKEND` (default `"docker"`). Unknown values call `die()` with the list of known backends. ```python class Bottle(Protocol): name: str def exec_claude(self, argv: list[str], *, tty: bool = True) -> int: ... def cp_in(self, host_path: str, ctr_path: str) -> None: ... def close(self) -> None: ... def get_bottle_factory() -> Callable[..., AbstractContextManager[Bottle]]: ... ``` - **`claude_bottle/backend/docker.py`** — `create_docker_bottle(...)`, the only factory implementation in this PRD. Owns: - probing for `runsc` availability (`docker info --format '{{json .Runtimes}}'`), - building the base image and the per-cwd derived image, - creating the per-agent internal and egress networks, - launching the pipelock sidecar (calls `pipelock.py` for YAML generation, but the sidecar's `docker create / cp / network connect / start` sequence moves into this module), - running the agent container with `--runtime=runsc` iff available, - copying skills / SSH keys / prompt / `.git` into the running container, - tearing everything down (container, sidecar, two networks) on context exit. ### Existing code touched - **`claude_bottle/cli/start.py`** — replace the inline docker orchestration with `with get_bottle_factory()(manifest, ...) as bottle:` and call `bottle.exec_claude(...)`. The preflight stays here but is extended to render the resolved Docker runtime alongside the allowlist summary. - **`claude_bottle/manifest.py`** — drop the `runtime` field from the Bottle dataclass and its validation. Existing manifests with `runtime: "runsc"` should produce a clear "unknown field" error so users know to remove it. - **`claude_bottle/docker.py`** — `require_runsc()` deleted. `require_docker()`, `slugify()`, `image_exists()`, `container_exists()`, and the `build_image` / `build_image_with_cwd` helpers stay; they're host-side utilities that the Docker factory consumes. - **`claude_bottle/pipelock.py`** — keep all the allowlist resolution and YAML generation. Remove `pipelock_start` / `pipelock_stop` (or inline them into `backend/docker.py` — decide during implementation). Pipelock-the-sidecar becomes a Docker-factory internal concept. - **`claude_bottle/network.py`** — same call-sites moved into `backend/docker.py`. The module either becomes a thin set of pure name-derivation helpers (`network_name_for_slug`, etc.) or folds entirely into `backend/docker.py`. Decide during implementation. - **`claude_bottle/ssh.py`** and **`claude_bottle/skills.py`** — the `docker cp` and `docker exec` calls move into / are called from `backend/docker.py`. The host-side file-tree generation stays put. - **`claude-bottle.example.json`** — remove the `runtime` field from any example bottle. - **`README.md`** — note `CLAUDE_BOTTLE_BACKEND` and the runsc auto-detect; remove any mention of `runtime: "runsc"` as a manifest field. ### Data model changes The bottle schema loses one field: ```diff { "bottles": { "default": { - "runtime": "runsc", "env": { "...": "..." }, "ssh": [], "egress": { "allowlist": [...] } } } } ``` Any manifest carrying `runtime` produces a validation error on load ("unknown bottle field 'runtime' — gVisor is now auto-detected; remove this field"). The agent schema is unchanged. ### External dependencies None new. This PRD reorganizes existing code; it does not pull in any new images, binaries, or libraries. ### Behavior the runsc auto-detect introduces The Docker factory runs `docker info --format '{{json .Runtimes}}'` exactly once per `create_docker_bottle` call. If `runsc` is in the output, the subsequent `docker run` adds `--runtime=runsc`. Otherwise it runs without that flag. The choice is logged via the existing `info()` helper as part of the preflight: ``` docker runtime: runsc (gVisor) # or: runc (default) ``` The y/N preflight shows the same line, so users can confirm what they're about to run under before approving. ## Open questions - **Where the pipelock sidecar lifecycle lives.** Two reasonable splits: (a) `pipelock.py` keeps `pipelock_start` / `pipelock_stop` and `backend/docker.py` calls them; (b) the sidecar `docker create/cp/network connect/start` sequence moves entirely into `backend/docker.py` and `pipelock.py` shrinks to the YAML + allowlist helpers. (a) keeps git blame intact and is the smaller diff; (b) makes pipelock-as-an-implementation-detail more obvious. Decide during implementation. - **Whether `bottles/__init__.py` re-exports `create_docker_bottle`.** Importing `from claude_bottle.bottles import create_docker_bottle` vs. `from claude_bottle.bottles.docker import create_docker_bottle`. Doesn't matter for v1 (only the registry consumes it), but worth picking a convention before a second factory lands. - **Manifest-error wording when `runtime` is seen.** "Unknown field" is technically correct but unhelpful. A targeted error message ("runtime: was removed; gVisor is now auto-detected when the Docker daemon has it registered") is more useful and worth the extra few lines. - **Test fixtures.** Some tests mock `docker info` or seed `--runtime=runsc` expectations. Audit and update as part of the implementation; not expected to be a large change. - **Future `--require-runsc` flag.** Not in this PRD; flagged here so it's findable when the question comes up. ## References - `docs/research/apple-container-backend.md` — original motivation; prior draft considered a low-level `Backend` protocol and rejected it as the wrong layer. - `docs/research/bash-vs-python-vs-go.md` §Recommendation — argues that the factory abstraction matters independent of language choice. - PRD 0001 (`docs/prds/0001-per-agent-egress-proxy-via-pipelock.md`) — defines the pipelock topology that becomes a private implementation detail of the Docker factory after this PRD ships.