Files
bot-bottle/docs/prds/0003-bottle-factory-abstraction.md
T
didericis d5c056f36e
test / run tests/run_tests.py (pull_request) Successful in 17s
docs(prd): add 0003 bottle factory abstraction
2026-05-10 21:56:10 -04:00

12 KiB

PRD 0003: Bottle factory abstraction

  • Status: Draft
  • Author: didericis
  • Created: 2026-05-10

Summary

Introduce a per-platform factory function that owns the end-to-end lifecycle of a "bottle" (a running, isolated environment with claude inside). The first and only implementation lands as create_docker_bottle. No second platform ships in this PRD.

Problem

Today, "how to launch a bottle" is spread across roughly six modules (claude_bottle/cli/start.py, pipelock.py, network.py, ssh.py, skills.py, docker.py), each shelling out to docker directly via subprocess.run(["docker", ...]). That coupling means:

  • Adding a second backend (Apple's container, fly.io, a remote SSH host, etc.) requires editing every one of those call sites. The research note docs/research/apple-container-backend.md already flags this as a prerequisite for that work.
  • The pipelock sidecar topology — two networks, multi-attach, sidecar lifecycle — is a Docker implementation detail that has leaked into the top-level CLI orchestration. It reads as a core concept of the project, but a fly.io bottle would not need any of it.
  • The manifest carries a Docker-specific runtime: "runsc" field (bottles[].runtime). Anyone setting it has to know about gVisor, whether Docker has it registered, and what to do on macOS where it isn't available natively. The field has one valid non-default value and exists only because the current code can't decide on its own.

The shape that fits the project's actual goals (isolated agent runs across multiple platforms) is "one factory per platform," not "one container-runtime SDK with N drivers." A previous draft of this PRD considered a low-level Backend protocol (run, exec, cp, network_connect, ...) and rejected it as the wrong layer — it would have forced fly.io to pretend it's Docker.

Goals / Success Criteria

The feature works when all of the following are observable:

  • cli.py start works identically for an existing manifest with no user-visible changes other than (a) a startup log line naming the Docker runtime in use, and (b) bottles[].runtime no longer being a valid manifest field.
  • On a Linux host with gVisor registered, the agent container runs under runsc without anything in the manifest requesting it.
  • On a host without gVisor (including macOS), the agent container runs under the default runc runtime; nothing fails, no warning is printed beyond the runtime-name log line.
  • The existing test suite passes with no behavior changes other than the manifest-schema removal of runtime.

The feature is done when all of the following ship:

  • A new claude_bottle/bottles/ package exists with __init__.py (factory selection) and docker.py (create_docker_bottle).
  • create_docker_bottle returns a context manager yielding a Bottle handle exposing exec_claude(argv, *, tty=True), cp_in(host, ctr), and teardown on context exit.
  • Every existing subprocess.run(["docker", ...]) call in cli/start.py, pipelock.py, network.py, ssh.py, and skills.py either moves into bottles/docker.py or is called from it. No top-level CLI code references docker directly.
  • bottles[].runtime is removed from the manifest schema, the dataclass in manifest.py, the example manifest, and any README / docs references. require_runsc() in claude_bottle/docker.py is deleted.
  • A single env var, CLAUDE_BOTTLE_PLATFORM (default "docker"), selects the factory. Unknown values die at startup with a list of known platforms.
  • The y/N preflight in cli.py includes the resolved Docker runtime alongside the allowlist summary.

Non-goals

  • No second platform implementation. create_container_bottle and create_flyio_bottle are not in this PRD. The factory dict in bottles/__init__.py ships with one entry.
  • No retries, async, or streaming exec. The current code is synchronous subprocess.run; the Bottle handle matches.
  • No behavior change beyond the runsc auto-detect. Pipelock topology, network naming, container naming, image build flow, and SSH provisioning all stay byte-identical.
  • No --require-runsc CLI escape hatch. If a user later wants "fail rather than silently downgrade," that's a follow-up.
  • No bottles[].platform manifest field. Platform is a property of the host environment, not the bottle definition (at least for now).

Scope

In scope

  • New claude_bottle/bottles/ package containing __init__.py and docker.py.
  • The Bottle Protocol definition and create_docker_bottle factory.
  • Moving Docker-specific subprocess calls into the factory.
  • Removing bottles[].runtime from the manifest schema and the dataclass.
  • Auto-detection of runsc registration via docker info.
  • Preflight integration: the existing y/N output names the resolved Docker runtime.
  • Test updates: any manifest fixtures referencing runtime are updated; tests that assert on --runtime=runsc instead seed the detection by mocking docker info.

Out of scope

  • Apple container and fly.io factories (separate PRDs, deferred until the Docker factory is the only thing shipping).
  • Generalizing the pipelock sidecar to other platforms. Pipelock topology is, after this PRD, an implementation detail private to bottles/docker.py.
  • Rewriting pipelock.py's YAML generation. The allowlist→YAML translation stays where it is and is called by the Docker factory.
  • Changes to env_resolve.py, manifest.py (beyond the runtime removal), or the agent schema.
  • CLI flags for runtime selection / override.

Proposed Design

New services / components

A new package, claude_bottle/bottles/:

  • claude_bottle/bottles/__init__.py — Defines the Bottle Protocol and get_bottle_factory(). The factory registry is a module-level dict mapping platform name → factory function. Selection reads CLAUDE_BOTTLE_PLATFORM (default "docker"). Unknown values call die() with the list of known platforms.

    class Bottle(Protocol):
        name: str
        def exec_claude(self, argv: list[str], *, tty: bool = True) -> int: ...
        def cp_in(self, host_path: str, ctr_path: str) -> None: ...
        def close(self) -> None: ...
    
    def get_bottle_factory() -> Callable[..., AbstractContextManager[Bottle]]:
        ...
    
  • claude_bottle/bottles/docker.pycreate_docker_bottle(...), the only factory implementation in this PRD. Owns:

    • probing for runsc availability (docker info --format '{{json .Runtimes}}'),
    • building the base image and the per-cwd derived image,
    • creating the per-agent internal and egress networks,
    • launching the pipelock sidecar (calls pipelock.py for YAML generation, but the sidecar's docker create / cp / network connect / start sequence moves into this module),
    • running the agent container with --runtime=runsc iff available,
    • copying skills / SSH keys / prompt / .git into the running container,
    • tearing everything down (container, sidecar, two networks) on context exit.

Existing code touched

  • claude_bottle/cli/start.py — replace the inline docker orchestration with with get_bottle_factory()(manifest, ...) as bottle: and call bottle.exec_claude(...). The preflight stays here but is extended to render the resolved Docker runtime alongside the allowlist summary.
  • claude_bottle/manifest.py — drop the runtime field from the Bottle dataclass and its validation. Existing manifests with runtime: "runsc" should produce a clear "unknown field" error so users know to remove it.
  • claude_bottle/docker.pyrequire_runsc() deleted. require_docker(), slugify(), image_exists(), container_exists(), and the build_image / build_image_with_cwd helpers stay; they're host-side utilities that the Docker factory consumes.
  • claude_bottle/pipelock.py — keep all the allowlist resolution and YAML generation. Remove pipelock_start / pipelock_stop (or inline them into bottles/docker.py — decide during implementation). Pipelock-the-sidecar becomes a Docker-factory internal concept.
  • claude_bottle/network.py — same call-sites moved into bottles/docker.py. The module either becomes a thin set of pure name-derivation helpers (network_name_for_slug, etc.) or folds entirely into bottles/docker.py. Decide during implementation.
  • claude_bottle/ssh.py and claude_bottle/skills.py — the docker cp and docker exec calls move into / are called from bottles/docker.py. The host-side file-tree generation stays put.
  • claude-bottle.example.json — remove the runtime field from any example bottle.
  • README.md — note CLAUDE_BOTTLE_PLATFORM and the runsc auto-detect; remove any mention of runtime: "runsc" as a manifest field.

Data model changes

The bottle schema loses one field:

 {
   "bottles": {
     "default": {
-      "runtime": "runsc",
       "env":  { "...": "..." },
       "ssh":  [],
       "egress": { "allowlist": [...] }
     }
   }
 }

Any manifest carrying runtime produces a validation error on load ("unknown bottle field 'runtime' — gVisor is now auto-detected; remove this field").

The agent schema is unchanged.

External dependencies

None new. This PRD reorganizes existing code; it does not pull in any new images, binaries, or libraries.

Behavior the runsc auto-detect introduces

The Docker factory runs docker info --format '{{json .Runtimes}}' exactly once per create_docker_bottle call. If runsc is in the output, the subsequent docker run adds --runtime=runsc. Otherwise it runs without that flag. The choice is logged via the existing info() helper as part of the preflight:

docker runtime: runsc (gVisor)        # or: runc (default)

The y/N preflight shows the same line, so users can confirm what they're about to run under before approving.

Open questions

  • Where the pipelock sidecar lifecycle lives. Two reasonable splits: (a) pipelock.py keeps pipelock_start / pipelock_stop and bottles/docker.py calls them; (b) the sidecar docker create/cp/network connect/start sequence moves entirely into bottles/docker.py and pipelock.py shrinks to the YAML + allowlist helpers. (a) keeps git blame intact and is the smaller diff; (b) makes pipelock-as-an-implementation-detail more obvious. Decide during implementation.

  • Whether bottles/__init__.py re-exports create_docker_bottle. Importing from claude_bottle.bottles import create_docker_bottle vs. from claude_bottle.bottles.docker import create_docker_bottle. Doesn't matter for v1 (only the registry consumes it), but worth picking a convention before a second factory lands.

  • Manifest-error wording when runtime is seen. "Unknown field" is technically correct but unhelpful. A targeted error message ("runtime: was removed; gVisor is now auto-detected when the Docker daemon has it registered") is more useful and worth the extra few lines.

  • Test fixtures. Some tests mock docker info or seed --runtime=runsc expectations. Audit and update as part of the implementation; not expected to be a large change.

  • Future --require-runsc flag. Not in this PRD; flagged here so it's findable when the question comes up.

References

  • docs/research/apple-container-backend.md — original motivation; prior draft considered a low-level Backend protocol and rejected it as the wrong layer.
  • docs/research/bash-vs-python-vs-go.md §Recommendation — argues that the factory abstraction matters independent of language choice.
  • PRD 0001 (docs/prds/0001-per-agent-egress-proxy-via-pipelock.md) — defines the pipelock topology that becomes a private implementation detail of the Docker factory after this PRD ships.