PRD 0003: Bottle Backend abstraction

  • Status: Active
  • Author: didericis
  • Created: 2026-05-10

Summary

Introduce a per-backend abstraction that owns the end-to-end lifecycle of a "bottle" (a running, isolated environment with claude inside). The first and only implementation lands as DockerBottleBackend. No second backend ships in this PRD.

Problem

Today, "how to launch a bottle" is spread across roughly six modules (bot_bottle/cli/start.py, pipelock.py, network.py, ssh.py, skills.py, docker.py), each shelling out to docker directly via subprocess.run(["docker", ...]). That coupling means:

  • Adding a second backend (Apple's container, fly.io, a remote SSH host, etc.) requires editing every one of those call sites. The research note docs/research/apple-container-backend.md already flags this as a prerequisite for that work.
  • The pipelock sidecar topology — two networks, multi-attach, sidecar lifecycle — is a Docker implementation detail that has leaked into the top-level CLI orchestration. It reads as a core concept of the project, but a fly.io bottle would not need any of it.
  • The manifest carries a Docker-specific runtime: "runsc" field (bottles[].runtime). Anyone setting it has to know about gVisor, whether Docker has it registered, and what to do on macOS where it isn't available natively. The field has one valid non-default value and exists only because the current code can't decide on its own.

The shape that fits the project's actual goals (isolated agent runs across multiple backends) is "one backend per platform," not "one container-runtime SDK with N drivers." A previous draft of this PRD considered a low-level runtime-primitive protocol (run, exec, cp, network_connect, ...) and rejected it as the wrong layer — it would have forced fly.io to pretend it's Docker.

Goals / Success Criteria

The feature works when all of the following are observable:

  • cli.py start works identically for an existing manifest with no user-visible changes other than (a) a startup log line naming the Docker runtime in use, and (b) bottles[].runtime no longer being a valid manifest field.
  • On a Linux host with gVisor registered, the agent container runs under runsc without anything in the manifest requesting it.
  • On a host without gVisor (including macOS), the agent container runs under the default runc runtime. Nothing fails, and no warning is printed beyond the runtime-name log line.
  • The existing test suite passes with no behavior changes other than the manifest-schema removal of runtime.

The feature is done when all of the following ship:

  • A new bot_bottle/backend/ package exists with abstract base classes (BottleBackend, BottlePlan, BottleCleanupPlan, Bottle) plus a bot_bottle/backend/docker/ subpackage containing the DockerBottleBackend implementation.
  • DockerBottleBackend.launch(plan) returns a context manager that yields a Bottle handle exposing exec_agent(argv, *, tty=True) and cp_in(host, ctr); teardown runs on context exit.
  • Every existing subprocess.run(["docker", ...]) call in cli/start.py, pipelock.py, network.py, ssh.py, and skills.py either moves into bot_bottle/backend/docker/ or is called from it. No top-level CLI code references docker directly.
  • bottles[].runtime is removed from the manifest schema, the dataclass in manifest.py, the example manifest, and any README / docs references. require_runsc() in the old top-level bot_bottle/docker.py is deleted.
  • A single env var, BOT_BOTTLE_BACKEND (default "docker"), selects the backend. Unknown values die at startup with a list of known backends.
  • The y/N preflight in cli.py includes the resolved Docker runtime alongside the allowlist summary.
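
As a rough illustration of the launch contract listed above, the sketch below uses a stand-in backend that records calls instead of running Docker. FakeBottle, FakeBackend, and the events list are hypothetical names for illustration; only launch, exec_agent, and cp_in come from this PRD, and returning an exit code from exec_agent is an assumption.

```python
from contextlib import contextmanager
from typing import Iterator


class FakeBottle:
    """Hypothetical stand-in for the Bottle handle; records calls only."""

    def __init__(self, events: list[str]) -> None:
        self.events = events

    def exec_agent(self, argv: list[str], *, tty: bool = True) -> int:
        # Assumption: exec_agent returns the agent's exit code.
        self.events.append(f"exec {' '.join(argv)} tty={tty}")
        return 0

    def cp_in(self, host: str, ctr: str) -> None:
        self.events.append(f"cp {host} -> {ctr}")


class FakeBackend:
    """Hypothetical backend showing launch() as a context manager."""

    def __init__(self) -> None:
        self.events: list[str] = []

    @contextmanager
    def launch(self, plan: object) -> Iterator[FakeBottle]:
        self.events.append("up")
        try:
            yield FakeBottle(self.events)
        finally:
            # Teardown runs on context exit, even if the body raised.
            self.events.append("teardown")


backend = FakeBackend()
with backend.launch(plan=None) as bottle:
    bottle.cp_in("./prompt.md", "/work/prompt.md")
    bottle.exec_agent(["claude"], tty=False)
```

Putting teardown in a finally clause means a crashing agent still releases whatever the backend created, which is the property the context-manager shape is meant to provide.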

Non-goals

  • No second backend implementation. There is no AppleContainerBottleBackend / FlyioBottleBackend in this PRD. The registry in backend/__init__.py ships with one entry.
  • No retries, async, or streaming exec. The current code is synchronous subprocess.run; the Bottle handle matches.
  • No behavior change beyond the runsc auto-detect. Pipelock topology, network naming, container naming, image build flow, and SSH provisioning all stay byte-identical.
  • No --require-runsc CLI escape hatch. If a user later wants "fail rather than silently downgrade," that's a follow-up.
  • No bottles[].backend manifest field. Backend is a property of the host environment, not the bottle definition (at least for now).

Scope

In scope

  • New bot_bottle/backend/ package containing the abstract types and the registry, plus a bot_bottle/backend/docker/ subpackage containing the Docker implementation.
  • The Bottle, BottleBackend, BottlePlan, and BottleCleanupPlan abstract base classes; BottleSpec data carrier; and DockerBottleBackend implementation.
  • Moving Docker-specific subprocess calls into the Docker subpackage.
  • Removing bottles[].runtime from the manifest schema and the dataclass.
  • Auto-detection of runsc registration via docker info.
  • Preflight integration: the existing y/N output names the resolved Docker runtime.
  • Reshaping env.py (formerly env_resolve.py) to return a backend-neutral ResolvedEnv (forwarded names + literals map) rather than writing docker-shaped files directly. The Docker backend now owns the --env-file / -e NAME serialization and the newline-rejection check.
  • Splitting pipelock.py into a backend-neutral PipelockProxy ABC (yaml + allowlist resolution) and a DockerPipelockProxy subclass (sidecar start/stop) under the Docker subpackage.
  • Test updates: any manifest fixtures referencing runtime are updated; tests that assert on --runtime=runsc instead seed the detection by mocking docker info.
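
The ResolvedEnv hand-off in the list above could look roughly like the following. This is a sketch under stated assumptions: the dataclass shape, the to_docker_env helper, and the choice to put literals in the env file and forwarded names on the command line are illustrative, not taken from the code. Only the field names forwarded and literals, the --env-file / -e NAME forms, and the newline rejection come from this PRD.

```python
from dataclasses import dataclass, field


@dataclass
class ResolvedEnv:
    # Field names follow this PRD; the dataclass shape is an assumption.
    forwarded: list[str] = field(default_factory=list)
    literals: dict[str, str] = field(default_factory=dict)


def to_docker_env(env: ResolvedEnv, env_file_path: str) -> tuple[str, list[str]]:
    """Hypothetical helper: return (env-file content, docker run argv fragments)."""
    lines = []
    for name, value in env.literals.items():
        # An env file is line-oriented, so a newline in a value would corrupt it.
        if "\n" in value or "\r" in value:
            raise ValueError(f"env var {name!r} contains a newline")
        lines.append(f"{name}={value}")
    argv = ["--env-file", env_file_path] if lines else []
    for name in env.forwarded:
        # `-e NAME` without a value makes docker inherit NAME from its own environment.
        argv += ["-e", name]
    content = "".join(line + "\n" for line in lines)
    return content, argv


# Example variable names below are made up for illustration.
content, argv = to_docker_env(
    ResolvedEnv(forwarded=["ANTHROPIC_API_KEY"], literals={"LOG_LEVEL": "debug"}),
    "/tmp/bottle.env",
)
```

Keeping this translation in the Docker backend leaves ResolvedEnv free of any Docker-specific file format, so another backend could consume the same two fields differently.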

Out of scope

  • Apple container and fly.io backends (separate PRDs, deferred until this abstraction has shipped with the Docker backend as its only implementation).
  • Generalizing the pipelock sidecar to other backends. Pipelock topology is, after this PRD, an implementation detail private to the Docker backend.
  • Rewriting pipelock.py's YAML generation. The allowlist→YAML translation stays where it is and is called by the Docker backend.
  • CLI flags for runtime selection / override.

Proposed Design

New services / components

A new package, bot_bottle/backend/, with an abstract base layer and a Docker subpackage:

  • bot_bottle/backend/__init__.py — Defines the abstract base classes and the backend registry. BottleSpec carries the CLI-supplied intent; the abstract BottlePlan and BottleCleanupPlan are the prepared-but-not-launched outputs of the two prepare* phases; Bottle is the running-instance handle; BottleBackend is the dispatcher with five methods:

    class BottleBackend(ABC):
        name: str
        def prepare(self, spec: BottleSpec, *, stage_dir: Path) -> BottlePlan: ...
        def launch(self, plan: BottlePlan) -> ContextManager[Bottle]: ...
        def prepare_cleanup(self) -> BottleCleanupPlan: ...
        def cleanup(self, plan: BottleCleanupPlan) -> None: ...
        def list_active(self) -> None: ...
    

    The prepare / launch split lets the CLI render the y/N preflight off the BottlePlan before any container or network is created. The same split applies to cleanup. BottleBackend.provision(plan, target) orchestrates copying skills / SSH / prompt / .git into a running instance via four abstract sub-methods (provision_prompt, provision_skills, provision_ssh, provision_git); subclasses implement those four rather than overriding provision itself.

    Selection reads BOT_BOTTLE_BACKEND (default "docker"). An unknown value makes get_bottle_backend() call die() with the list of known backends:

    def get_bottle_backend() -> BottleBackend: ...
    
  • bot_bottle/backend/docker/ — Subpackage with the Docker implementation, split into:

    • backend.py — DockerBottleBackend, owning all five abstract methods (prepare, launch, prepare_cleanup, cleanup, list_active) plus the four provision_* sub-methods. Probes for runsc availability (docker info --format '{{json .Runtimes}}'), builds the base image and per-cwd derived image, creates the per-agent internal and egress networks, brings up the pipelock sidecar, runs the agent container with --runtime=runsc iff available, copies skills / SSH keys / prompt / .git into the running container, and tears everything down on context exit.
    • bottle.py — DockerBottle, the running-instance handle yielded by launch.
    • bottle_plan.py — DockerBottlePlan, the prepared-but-not-launched output of prepare. Carries resolved container/network/image names, scratch paths, and use_runsc. Implements print for the y/N preflight.
    • bottle_cleanup_plan.py — DockerBottleCleanupPlan, the analog for orphan cleanup.
    • network.py — Docker network helpers (create/destroy, naming).
    • pipelock.py — DockerPipelockProxy (the sidecar start/stop lifecycle) and Docker-specific naming helpers. The backend-neutral yaml + allowlist resolution stays in the top-level bot_bottle/pipelock.py.
    • util.py — Docker-specific helpers (slugify, image/container existence checks, runsc_available).
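
Backend selection as described for backend/__init__.py can be sketched as below. The _BACKENDS dict, the placeholder DockerBottleBackend class, and the die() stand-in are assumptions made so the example runs on its own; the real registry and die() helper may be shaped differently.

```python
import os
from typing import NoReturn


class DockerBottleBackend:
    """Placeholder for the real backend class; only the name matters here."""

    name = "docker"


# Assumption: the registry is a plain dict from backend name to class.
_BACKENDS = {"docker": DockerBottleBackend}


def die(message: str) -> NoReturn:
    # Stand-in for the project's die() helper; assumed to exit non-zero.
    raise SystemExit(f"error: {message}")


def get_bottle_backend() -> DockerBottleBackend:
    name = os.environ.get("BOT_BOTTLE_BACKEND", "docker")
    backend_cls = _BACKENDS.get(name)
    if backend_cls is None:
        known = ", ".join(sorted(_BACKENDS))
        die(f"unknown BOT_BOTTLE_BACKEND {name!r}; known backends: {known}")
    return backend_cls()
```

With one registry entry the lookup is trivial, but routing every caller through it now means a later backend only needs a new dict entry rather than new call-site branches.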

Existing code touched

  • bot_bottle/cli/start.py — replace the inline docker orchestration with backend = get_bottle_backend(); plan = backend.prepare(spec, stage_dir=...); with backend.launch(plan) as bottle: bottle.exec_agent(...). The y/N preflight is rendered by plan.print(...).
  • bot_bottle/manifest.py — drop the runtime field from the Bottle dataclass and its validation. Existing manifests with runtime: "runsc" produce a clear "no longer supported; gVisor is now auto-detected by the backend; remove the 'runtime' field" error.
  • bot_bottle/docker.py — module deleted. require_runsc() is deleted outright, since the runsc_available auto-detection replaces it. slugify(), image_exists(), container_exists(), the build_image / build_image_with_cwd helpers, and require_docker migrate into bot_bottle/backend/docker/util.py (or backend.py).
  • bot_bottle/pipelock.py — keeps the allowlist resolution and YAML generation. Becomes a thin abstract class (PipelockProxy) exposing prepare (writes the yaml) plus abstract start / stop methods. The Docker-specific subclass DockerPipelockProxy lives under backend/docker/pipelock.py.
  • bot_bottle/network.py — folds entirely into backend/docker/network.py. No top-level network module remains.
  • bot_bottle/ssh.py and bot_bottle/skills.py — absorbed into DockerBottleBackend as provision_ssh and provision_skills. The host-side file-tree generation stays as private helpers on the backend class.
  • bot_bottle/env.py (renamed from env_resolve.py) — resolve_env(manifest, agent) -> ResolvedEnv returns forwarded: list[str] (names whose values were exported into os.environ for inheritance) and literals: dict[str, str] (name → verbatim value). The Docker backend translates the result into --env-file content + -e NAME argv fragments.
  • bot_bottle/util.py — top-level cross-backend helpers (expand_tilde, is_ipv4_literal). Backend-specific helpers live in their backend's util.py.
  • bot-bottle.example.json — remove the runtime field from any example bottle.
  • README.md — note BOT_BOTTLE_BACKEND and the runsc auto-detect; remove any mention of runtime: "runsc" as a manifest field.
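
The provision flow that absorbs ssh.py and skills.py follows a template-method shape: the base class fixes the orchestration and subclasses fill in the four steps. A minimal sketch, assuming each sub-method takes the same (plan, target) arguments as provision and that the steps run in the order shown; neither detail is specified in this PRD. ProvisioningBackend and RecordingBackend are hypothetical names.

```python
from abc import ABC, abstractmethod


class ProvisioningBackend(ABC):
    """Hypothetical slice of BottleBackend showing only the provision flow."""

    def provision(self, plan: object, target: object) -> None:
        # Template method: subclasses do not override this.
        # The step order here is an assumption.
        self.provision_prompt(plan, target)
        self.provision_skills(plan, target)
        self.provision_ssh(plan, target)
        self.provision_git(plan, target)

    @abstractmethod
    def provision_prompt(self, plan: object, target: object) -> None: ...

    @abstractmethod
    def provision_skills(self, plan: object, target: object) -> None: ...

    @abstractmethod
    def provision_ssh(self, plan: object, target: object) -> None: ...

    @abstractmethod
    def provision_git(self, plan: object, target: object) -> None: ...


class RecordingBackend(ProvisioningBackend):
    """Test double that records which steps ran instead of copying files."""

    def __init__(self) -> None:
        self.steps: list[str] = []

    def provision_prompt(self, plan: object, target: object) -> None:
        self.steps.append("prompt")

    def provision_skills(self, plan: object, target: object) -> None:
        self.steps.append("skills")

    def provision_ssh(self, plan: object, target: object) -> None:
        self.steps.append("ssh")

    def provision_git(self, plan: object, target: object) -> None:
        self.steps.append("git")


backend = RecordingBackend()
backend.provision(plan=None, target=None)
```

Because the four sub-methods are abstract, a backend that forgets one cannot be instantiated, which surfaces the gap at startup rather than midway through provisioning.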

Data model changes

The bottle schema loses one field:

 {
   "bottles": {
     "default": {
-      "runtime": "runsc",
       "env":  { "...": "..." },
       "ssh":  [],
       "egress": { "allowlist": [...] }
     }
   }
 }

Any manifest carrying runtime produces a validation error on load ("bottle '<name>' has a 'runtime' field, which is no longer supported. gVisor (runsc) is now auto-detected by the backend; remove the 'runtime' field from the bottle definition.").

The agent schema is unchanged.
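
The load-time rejection of runtime described above could be checked along these lines. validate_bottle is a hypothetical helper, and raising ValueError is an assumption; the real check in manifest.py may report the error through a different mechanism. The message text is the one quoted in this section.

```python
def validate_bottle(name: str, bottle: dict) -> None:
    """Hypothetical loader check for the removed 'runtime' field."""
    if "runtime" in bottle:
        raise ValueError(
            f"bottle '{name}' has a 'runtime' field, which is no longer supported. "
            "gVisor (runsc) is now auto-detected by the backend; "
            "remove the 'runtime' field from the bottle definition."
        )


# Illustrative manifests; the keys mirror the schema diff in this section.
old_manifest = {"bottles": {"default": {"runtime": "runsc", "env": {}, "ssh": []}}}
new_manifest = {"bottles": {"default": {"env": {}, "ssh": []}}}

errors = []
for manifest in (old_manifest, new_manifest):
    for bottle_name, bottle in manifest["bottles"].items():
        try:
            validate_bottle(bottle_name, bottle)
        except ValueError as exc:
            errors.append(str(exc))
```

Rejecting the field outright, rather than ignoring it, keeps an old manifest from silently implying a sandbox guarantee the backend no longer takes from the manifest.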

External dependencies

None new. This PRD reorganizes existing code; it does not pull in any new images, binaries, or libraries.

Behavior the runsc auto-detect introduces

DockerBottleBackend.prepare runs docker info --format '{{json .Runtimes}}' exactly once per call. If runsc is in the output, use_runsc is set on the DockerBottlePlan and the subsequent docker run adds --runtime=runsc. Otherwise it runs without that flag. The choice is logged via the existing info() helper as part of the preflight:

docker runtime: runsc (gVisor)        # or: runc (default)

The y/N preflight (rendered by DockerBottlePlan.print) shows the same line, so users can confirm what they're about to run under before approving.
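
The detection step can be sketched as pure functions over the command output, which also makes it easy to seed in tests. This is a sketch under assumptions: the PRD names a runsc_available helper but not its signature, so taking the raw docker info output as a string is illustrative, as are runtime_args and runtime_label. The sample JSON is hand-written for the example, not captured from a real host.

```python
import json


def runsc_available(docker_info_runtimes: str) -> bool:
    """Parse the output of `docker info --format '{{json .Runtimes}}'`."""
    try:
        runtimes = json.loads(docker_info_runtimes)
    except json.JSONDecodeError:
        # Assumption: unparseable output means "no runsc", not a hard error.
        return False
    # The runtimes are reported as a JSON object keyed by runtime name.
    return isinstance(runtimes, dict) and "runsc" in runtimes


def runtime_args(use_runsc: bool) -> list[str]:
    """Extra docker run arguments for the chosen runtime."""
    return ["--runtime=runsc"] if use_runsc else []


def runtime_label(use_runsc: bool) -> str:
    """The line shown in the log and in the y/N preflight."""
    return "docker runtime: " + ("runsc (gVisor)" if use_runsc else "runc (default)")


# Hand-written sample output for illustration.
sample = '{"runc": {"path": "runc"}, "runsc": {"path": "/usr/local/bin/runsc"}}'
use_runsc = runsc_available(sample)
print(runtime_label(use_runsc))
```

Deriving both the argv fragment and the log line from the single use_runsc value keeps what the preflight shows in step with what docker run is actually given.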

References

  • docs/research/apple-container-backend.md — original motivation; prior draft considered a low-level Backend protocol and rejected it as the wrong layer.
  • docs/research/bash-vs-python-vs-go.md §Recommendation — argues that the backend abstraction matters independent of language choice.
  • PRD 0001 (docs/prds/0001-per-agent-egress-proxy-via-pipelock.md) — defines the pipelock topology that becomes a private implementation detail of the Docker backend after this PRD ships.