docs(prd): add 0003 bottle factory abstraction
test / run tests/run_tests.py (pull_request) Successful in 17s

This commit is contained in:
2026-05-10 21:56:10 -04:00
parent a39c7b1b7b
commit d5c056f36e
@@ -0,0 +1,279 @@
# PRD 0003: Bottle factory abstraction
- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-10
## Summary
Introduce a per-platform factory function that owns the end-to-end
lifecycle of a "bottle" (a running, isolated environment with claude
inside). The first and only implementation lands as
`create_docker_bottle`. No second platform ships in this PRD.
## Problem
Today, "how to launch a bottle" is spread across roughly six modules
(`claude_bottle/cli/start.py`, `pipelock.py`, `network.py`, `ssh.py`,
`skills.py`, `docker.py`), each shelling out to `docker` directly via
`subprocess.run(["docker", ...])`. That coupling means:
- Adding a second backend (Apple's `container`, fly.io, a remote SSH
host, etc.) requires editing every one of those call sites. The
research note `docs/research/apple-container-backend.md` already
flags this as a prerequisite for that work.
- The pipelock sidecar topology — two networks, multi-attach, sidecar
lifecycle — is a Docker implementation detail that has leaked into
the top-level CLI orchestration. It reads as a core concept of the
project, but a fly.io bottle would not need any of it.
- The manifest carries a Docker-specific `runtime: "runsc"` field
(`bottles[].runtime`). Anyone setting it has to know about gVisor,
whether Docker has it registered, and what to do on macOS where it
isn't available natively. The field has one valid non-default value
and exists only because the current code can't decide on its own.
The shape that fits the project's actual goals (isolated agent runs
across multiple platforms) is "one factory per platform," not "one
container-runtime SDK with N drivers." A previous draft of this PRD
considered a low-level `Backend` protocol (`run`, `exec`, `cp`,
`network_connect`, ...) and rejected it as the wrong layer — it would
have forced fly.io to pretend it's Docker.
## Goals / Success Criteria
The feature works when all of the following are observable:
- `cli.py start` works identically for an existing manifest with no
user-visible changes other than (a) a startup log line naming the
Docker runtime in use, and (b) `bottles[].runtime` no longer being a
valid manifest field.
- On a Linux host with gVisor registered, the agent container runs
under `runsc` without anything in the manifest requesting it.
- On a host without gVisor (including macOS), the agent container runs
under the default `runc` runtime; nothing fails, no warning is
printed beyond the runtime-name log line.
- The existing test suite passes with no behavior changes other than
the manifest-schema removal of `runtime`.
The feature is **done** when all of the following ship:
- A new `claude_bottle/bottles/` package exists with
`__init__.py` (factory selection) and `docker.py`
(`create_docker_bottle`).
- `create_docker_bottle` returns a context manager yielding a `Bottle`
handle exposing `exec_claude(argv, *, tty=True)`, `cp_in(host, ctr)`,
and teardown on context exit.
- Every existing `subprocess.run(["docker", ...])` call in
`cli/start.py`, `pipelock.py`, `network.py`, `ssh.py`, and
`skills.py` either moves into `bottles/docker.py` or is called from
it. No top-level CLI code references `docker` directly.
- `bottles[].runtime` is removed from the manifest schema, the
dataclass in `manifest.py`, the example manifest, and any README /
docs references. `require_runsc()` in `claude_bottle/docker.py` is
deleted.
- A single env var, `CLAUDE_BOTTLE_PLATFORM` (default `"docker"`),
selects the factory. Unknown values die at startup with a list of
known platforms.
- The y/N preflight in `cli.py` includes the resolved Docker runtime
alongside the allowlist summary.
## Non-goals
- No second platform implementation. `create_container_bottle` and
`create_flyio_bottle` are not in this PRD. The factory dict in
`bottles/__init__.py` ships with one entry.
- No retries, async, or streaming exec. The current code is
synchronous `subprocess.run`; the `Bottle` handle matches.
- No behavior change beyond the runsc auto-detect. Pipelock topology,
network naming, container naming, image build flow, and SSH
provisioning all stay byte-identical.
- No `--require-runsc` CLI escape hatch. If a user later wants "fail
rather than silently downgrade," that's a follow-up.
- No `bottles[].platform` manifest field. Platform is a property of
the host environment, not the bottle definition (at least for now).
## Scope
### In scope
- New `claude_bottle/bottles/` package containing `__init__.py` and
`docker.py`.
- The `Bottle` Protocol definition and `create_docker_bottle` factory.
- Moving Docker-specific subprocess calls into the factory.
- Removing `bottles[].runtime` from the manifest schema and the
dataclass.
- Auto-detection of `runsc` registration via `docker info`.
- Preflight integration: the existing y/N output names the resolved
Docker runtime.
- Test updates: any manifest fixtures referencing `runtime` are
updated; tests that assert on `--runtime=runsc` instead seed the
detection by mocking `docker info`.
### Out of scope
- Apple `container` and fly.io factories (separate PRDs, deferred
until the Docker factory is the only thing shipping).
- Generalizing the pipelock sidecar to other platforms. Pipelock
topology is, after this PRD, an implementation detail private to
`bottles/docker.py`.
- Rewriting `pipelock.py`'s YAML generation. The allowlist→YAML
translation stays where it is and is called by the Docker factory.
- Changes to `env_resolve.py`, `manifest.py` (beyond the `runtime`
removal), or the agent schema.
- CLI flags for runtime selection / override.
## Proposed Design
### New services / components
A new package, `claude_bottle/bottles/`:
- **`claude_bottle/bottles/__init__.py`** — Defines the `Bottle`
Protocol and `get_bottle_factory()`. The factory registry is a
module-level dict mapping platform name → factory function.
Selection reads `CLAUDE_BOTTLE_PLATFORM` (default `"docker"`).
Unknown values call `die()` with the list of known platforms.
```python
class Bottle(Protocol):
name: str
def exec_claude(self, argv: list[str], *, tty: bool = True) -> int: ...
def cp_in(self, host_path: str, ctr_path: str) -> None: ...
def close(self) -> None: ...
def get_bottle_factory() -> Callable[..., AbstractContextManager[Bottle]]:
...
```
- **`claude_bottle/bottles/docker.py`** — `create_docker_bottle(...)`,
the only factory implementation in this PRD. Owns:
- probing for `runsc` availability (`docker info --format
'{{json .Runtimes}}'`),
- building the base image and the per-cwd derived image,
- creating the per-agent internal and egress networks,
- launching the pipelock sidecar (calls `pipelock.py` for YAML
generation, but the sidecar's `docker create / cp / network
connect / start` sequence moves into this module),
- running the agent container with `--runtime=runsc` iff available,
- copying skills / SSH keys / prompt / `.git` into the running
container,
- tearing everything down (container, sidecar, two networks) on
context exit.
### Existing code touched
- **`claude_bottle/cli/start.py`** — replace the inline docker
orchestration with `with get_bottle_factory()(manifest, ...) as
bottle:` and call `bottle.exec_claude(...)`. The preflight stays
here but is extended to render the resolved Docker runtime alongside
the allowlist summary.
- **`claude_bottle/manifest.py`** — drop the `runtime` field from the
Bottle dataclass and its validation. Existing manifests with
`runtime: "runsc"` should produce a clear "unknown field" error so
users know to remove it.
- **`claude_bottle/docker.py`** — `require_runsc()` deleted.
`require_docker()`, `slugify()`, `image_exists()`,
`container_exists()`, and the `build_image` / `build_image_with_cwd`
helpers stay; they're host-side utilities that the Docker factory
consumes.
- **`claude_bottle/pipelock.py`** — keep all the allowlist resolution
and YAML generation. Remove `pipelock_start` / `pipelock_stop` (or
inline them into `bottles/docker.py` — decide during
implementation). Pipelock-the-sidecar becomes a Docker-factory
internal concept.
- **`claude_bottle/network.py`** — same call-sites moved into
`bottles/docker.py`. The module either becomes a thin set of pure
name-derivation helpers (`network_name_for_slug`, etc.) or folds
entirely into `bottles/docker.py`. Decide during implementation.
- **`claude_bottle/ssh.py`** and **`claude_bottle/skills.py`** — the
`docker cp` and `docker exec` calls move into / are called from
`bottles/docker.py`. The host-side file-tree generation stays put.
- **`claude-bottle.example.json`** — remove the `runtime` field from
any example bottle.
- **`README.md`** — note `CLAUDE_BOTTLE_PLATFORM` and the runsc
auto-detect; remove any mention of `runtime: "runsc"` as a manifest
field.
### Data model changes
The bottle schema loses one field:
```diff
{
"bottles": {
"default": {
- "runtime": "runsc",
"env": { "...": "..." },
"ssh": [],
"egress": { "allowlist": [...] }
}
}
}
```
Any manifest carrying `runtime` produces a validation error on load
("unknown bottle field 'runtime' — gVisor is now auto-detected;
remove this field").
The agent schema is unchanged.
### External dependencies
None new. This PRD reorganizes existing code; it does not pull in any
new images, binaries, or libraries.
### Behavior the runsc auto-detect introduces
The Docker factory runs `docker info --format '{{json .Runtimes}}'`
exactly once per `create_docker_bottle` call. If `runsc` is in the
output, the subsequent `docker run` adds `--runtime=runsc`. Otherwise
it runs without that flag. The choice is logged via the existing
`info()` helper as part of the preflight:
```
docker runtime: runsc (gVisor) # or: runc (default)
```
The y/N preflight shows the same line, so users can confirm what
they're about to run under before approving.
## Open questions
- **Where the pipelock sidecar lifecycle lives.** Two reasonable
splits: (a) `pipelock.py` keeps `pipelock_start` / `pipelock_stop`
and `bottles/docker.py` calls them; (b) the sidecar
`docker create/cp/network connect/start` sequence moves entirely
into `bottles/docker.py` and `pipelock.py` shrinks to the YAML +
allowlist helpers. (a) keeps git blame intact and is the smaller
diff; (b) makes pipelock-as-an-implementation-detail more obvious.
Decide during implementation.
- **Whether `bottles/__init__.py` re-exports `create_docker_bottle`.**
Importing `from claude_bottle.bottles import create_docker_bottle`
vs. `from claude_bottle.bottles.docker import create_docker_bottle`.
Doesn't matter for v1 (only the registry consumes it), but worth
picking a convention before a second factory lands.
- **Manifest-error wording when `runtime` is seen.** "Unknown field"
is technically correct but unhelpful. A targeted error message
("runtime: was removed; gVisor is now auto-detected when the Docker
daemon has it registered") is more useful and worth the extra few
lines.
- **Test fixtures.** Some tests mock `docker info` or seed
`--runtime=runsc` expectations. Audit and update as part of the
implementation; not expected to be a large change.
- **Future `--require-runsc` flag.** Not in this PRD; flagged here so
it's findable when the question comes up.
## References
- `docs/research/apple-container-backend.md` — original motivation;
prior draft considered a low-level `Backend` protocol and rejected
it as the wrong layer.
- `docs/research/bash-vs-python-vs-go.md` §Recommendation — argues
that the factory abstraction matters independent of language choice.
- PRD 0001 (`docs/prds/0001-per-agent-egress-proxy-via-pipelock.md`)
— defines the pipelock topology that becomes a private
implementation detail of the Docker factory after this PRD ships.