didericis-codex 1cbedc91c0 refactor(agent): use agent-neutral runtime names

Assisted-by: Codex

2026-05-28 17:59:24 -04:00


PRD 0018: One Compose project per bottle instance

Status: Draft
Author: didericis
Created: 2026-05-25

Summary

Replace the current pattern of orchestrating each sidecar with its own docker SDK calls with one docker compose project per bottle instance. The compose project is generated at start time, written to disk under the instance's state dir, and brought up with docker compose up. Tearing the instance down is docker compose down. Logs come from docker compose logs and land in a single file per instance, so reading what happened in a session is a single `less` invocation away.

State for each instance (~/.bot-bottle/state/<slug>/) becomes a self-describing folder:

metadata.json           # agent_name, cwd, started_at, compose project name, ...
docker-compose.yml      # the exact compose spec used to start this instance
compose.log             # full dump of `docker compose logs --no-color`
transcript/             # snapshotted agent conversation (existing)
live-config/            # routes.yaml, allowlist — bind-mounted into sidecars (existing)

Anything that needs to look at "what did instance X actually run?" can read those five artifacts. The compose file plus the metadata together fully describe the container topology.

Problem

Today start builds each sidecar (pipelock, egress, git-gate, supervise) and the agent container with a chain of individual SDK calls in bot_bottle/backend/docker/launch.py:

A per-sidecar Docker{Sidecar}.start() method does docker create → docker cp (stage files) → docker network connect → docker start.
Two networks are created up front (network_create calls).
The agent container starts last via its own docker run.

This is fine, but it has three rough edges:

No single artifact describes the topology. To understand what ran for instance <slug>, you have to read the Python that built the SDK calls. Nothing is on disk you can cat.
Logs are scattered. Each container's logs sit in Docker's per-container journal. To debug a session post-mortem you have to remember to run docker logs bot-bottle-pipelock-<slug> etc. before the containers age out, and there's no merged view.
Teardown is bespoke. Each sidecar's stop() is its own method, ordered carefully in start.py's ExitStack. A leftover container or network from a crash takes the cleanup CLI to find.

Compose is purpose-built for this shape: declarative spec, one project name per environment, merged logs, atomic up/down.

Goals / Success Criteria

bot-bottle start <agent> writes ~/.bot-bottle/state/<slug>/docker-compose.yml and brings the project up with docker compose -p <project> up.
The compose file is the source of truth for the container topology — every sidecar that runs is declared as a services: entry, every network is a networks: entry, every bind mount is a volumes: entry.
~/.bot-bottle/state/<slug>/compose.log contains the full merged stdout/stderr of every service for the session, in docker compose logs --no-color format.
metadata.json records the compose project name alongside the existing fields (agent_name, cwd, started_at), so other tools can derive docker compose -p <project> ... invocations without re-deriving the slug.
Session teardown is docker compose -p <project> down. The existing per-sidecar stop() lifecycle methods come out.
The cleanup CLI uses docker compose ls (filtered to bot-bottle-* projects) instead of name-prefix scans across docker ps -a and docker network ls.
The existing remediation flows (pipelock-block, egress-block, capability-block) keep working without protocol changes — they write to host paths under state/<slug>/live-config/, sidecars SIGHUP-reload from the bind mount, no compose-side restart needed.

Non-goals

Multi-host compose. No swarm, no remote contexts. Each instance is one local Docker daemon.
Replacing the manifest format. Manifests stay; compose is an implementation detail of the Docker backend.
Replacing the backend abstraction (PRD 0003). Backend stays abstract; only the Docker implementation changes.
A long-lived "bot-bottle daemon." Each start invocation still owns a single compose project for the lifetime of the session. No persistent service.
Image pre-building. Compose's build: directive triggers builds on first up, same as today; no separate build step.
Backwards compatibility with running instances at upgrade. If an instance was started by the pre-compose code, the user kills it and starts a new one. There's no migration path for live containers.

Scope

In scope

New module bot_bottle/backend/docker/compose.py that renders a compose dict from a BottlePlan and writes it to state/<slug>/docker-compose.yml.
DockerBackend.start rewritten to:
1. Build the plan (existing prepare).
2. Stage bind-mount inputs (CAs, routes.yaml, env file, hooks) into host paths under state/<slug>/.
3. Render + write the compose file.
4. Exec docker compose -p <project> up -d.
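5. docker exec -it bot-bottle-<slug> claude ... for the agent's TTY (the agent service itself runs sleep infinity under compose; see the TTY decision under Open questions).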
6. On exit: docker compose -p <project> logs --no-color → state/<slug>/compose.log, then docker compose -p <project> down --volumes.
Sidecar stage files move from docker cp-into-container to bind-mounts from state/<slug>/. This deletes a lot of code in pipelock.py, git_gate.py, egress.py, supervise.py.
metadata.json gains a compose_project field.
cleanup CLI rewritten to use docker compose ls for discovery.
The per-sidecar Docker{Sidecar}.start/stop lifecycle methods collapse into Docker{Sidecar}.compose_service() returning a service-dict fragment. Their apply / introspection helpers (egress_apply.py, supervise.py's handlers) are unchanged.
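The compose-facing part of the rewritten start flow can be sketched as a few pure helpers. This is a minimal illustration, not the planned implementation: the function names are hypothetical, the -f flag is an assumption (the PRD only specifies -p), and JSON output is used only to keep the sketch dependency-free (Compose should accept it, since JSON is for practical purposes a subset of YAML).

```python
import json
from pathlib import Path


def write_compose_file(state_dir: Path, spec: dict) -> Path:
    """Write the rendered spec to <state_dir>/docker-compose.yml.

    JSON output is an assumption made to avoid a YAML dependency here;
    a YAML library would produce a more readable file.
    """
    state_dir.mkdir(parents=True, exist_ok=True)
    path = state_dir / "docker-compose.yml"
    path.write_text(json.dumps(spec, indent=2) + "\n")
    return path


def compose_argv(project: str, compose_file: Path, *args: str) -> list[str]:
    """Build a `docker compose` command line pinned to one project and file."""
    return ["docker", "compose", "-p", project, "-f", str(compose_file), *args]


def session_commands(project: str, compose_file: Path) -> dict[str, list[str]]:
    """The three compose invocations a session needs, keyed by phase."""
    return {
        "up": compose_argv(project, compose_file, "up", "-d"),
        "logs": compose_argv(project, compose_file, "logs", "--no-color"),
        "down": compose_argv(project, compose_file, "down", "--volumes"),
    }
```

Returning argv lists rather than running subprocess directly keeps the helpers testable without a Docker daemon; the caller decides how to execute them.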

Out of scope

Changing the manifest layer (bot_bottle/manifest.py, egress.py's plan dataclasses, pipelock.py's plan dataclasses).
Changing the agent's runtime contract (proxy env vars, CA bundle paths, current-config mount path).
Changing audit-log shape or location (~/.bot-bottle/audit/<component>-<slug>.log stays).
Changing the MCP server's tool list or wire format.
Dropping the --rm semantics for the agent: the agent container is still ephemeral; compose's down --volumes handles cleanup.

Proposed design

Project name

compose_project = f"bot-bottle-{slug}". The slug stays the existing slugify(agent_name)-<5-char-random-base36> from bottle_state.py. Compose adds its own prefix to networks (<project>_<network>) and to default container names — which is why each service gets an explicit container_name: (below).

Service / container naming

Service names inside the compose file are short (agent, pipelock, egress, git-gate, supervise). Each service sets an explicit container_name: matching today's pattern:

services:
  pipelock:
    container_name: bot-bottle-pipelock-<slug>
  egress:
    container_name: bot-bottle-egress-<slug>
  # ...

This keeps the dashboard's container-discovery output stable for operators who've memorized the names. The compose project name (bot-bottle-<slug>) is the only new identifier.
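The naming rules above could be captured in a few helpers. This is a sketch under assumptions: slugify here is a hypothetical stand-in for whatever bottle_state.py actually does, and the validation regex reflects the Compose v2 project-name rule as I understand it (lowercase letters, digits, hyphens and underscores, starting with a letter or digit).

```python
import re
import secrets
import string

BASE36 = string.digits + string.ascii_lowercase
# Assumed Compose v2 project-name rule; verify against the Compose docs.
PROJECT_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]*$")


def slugify(name: str) -> str:
    """Hypothetical stand-in for the slugify used by bottle_state.py."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def new_slug(agent_name: str) -> str:
    """slugify(agent_name) plus a 5-character random base36 suffix."""
    suffix = "".join(secrets.choice(BASE36) for _ in range(5))
    return f"{slugify(agent_name)}-{suffix}"


def compose_project(slug: str) -> str:
    """Project name for an instance; rejects names Compose would refuse."""
    project = f"bot-bottle-{slug}"
    if not PROJECT_NAME_RE.match(project):
        raise ValueError(f"invalid compose project name: {project!r}")
    return project


def container_name(service: str, slug: str) -> str:
    """Agent is bot-bottle-<slug>; sidecars are bot-bottle-<service>-<slug>."""
    if service == "agent":
        return f"bot-bottle-{slug}"
    return f"bot-bottle-{service}-{slug}"
```

Validating the project name at render time means a bad slug fails before anything is written to disk or handed to docker compose.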

Networks

The two existing networks (bot-bottle-net-<slug> internal + bot-bottle-egress-<slug> upstream-bridge) become compose networks:

networks:
  internal:
    name: bot-bottle-net-<slug>
    internal: true
  egress:
    name: bot-bottle-egress-<slug>

Each service's networks: list mirrors today's wiring.

Bind mounts replace `docker cp`

The current pattern of docker create → docker cp file container:/path → docker start (used by every sidecar to land routes.yaml, CAs, hooks) becomes host bind-mounts. The host paths live under state/<slug>/:

state/<slug>/
  live-config/
    routes.yaml
    allowlist
  pipelock-ca/
    ca.pem
    ca-key.pem
  egress-ca/
    ca.pem
    ca-key.pem
  git-gate/
    entrypoint.sh
    hooks/
    ...
  env/
    agent.env

Each sidecar service mounts the relevant sub-tree read-only at the in-container path it expects. Permissions on the host paths are locked to 0600/0700 at write time (existing mode=0o600 discipline in prepare.py extends naturally).
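A minimal sketch of staging one file with the permissions described above and producing its read-only mount entry. The helper names are hypothetical, and the in-container path in the usage note is an assumption for illustration only.

```python
import os
import stat
from pathlib import Path


def stage_file(path: Path, data: bytes) -> Path:
    """Write data to path with mode 0600 inside a 0700 directory."""
    path.parent.mkdir(parents=True, exist_ok=True)
    os.chmod(path.parent, 0o700)  # explicit: mkdir's mode is subject to umask
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.chmod(path, 0o600)  # also tightens a pre-existing, looser file
    return path


def read_only_mount(host_path: Path, container_path: str) -> str:
    """Compose short-syntax volume entry: HOST:CONTAINER:ro."""
    return f"{host_path}:{container_path}:ro"


def mode_of(path: Path) -> int:
    """Permission bits of path, e.g. 0o600."""
    return stat.S_IMODE(path.stat().st_mode)
```

For example, stage_file(state_dir / "pipelock-ca" / "ca-key.pem", key_bytes) followed by read_only_mount(state_dir / "pipelock-ca", "/etc/pipelock/ca") would yield a volumes: entry for the pipelock service. Opening with os.open and an explicit mode avoids a window in which the key is readable under a permissive umask.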

Conditional services

The compose renderer takes the same BottlePlan the SDK calls read today and only emits services for sidecars that apply:

pipelock — always.
egress — only if bottle.egress.routes is non-empty.
git-gate — only if bottle.git is non-empty.
supervise — only if bottle.supervise is true.
agent — always.

Conditional depends_on: edges keep the agent waiting on sidecars that exist.
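The conditional rules above can be sketched as a pure function. Plan is a hypothetical stand-in for BottlePlan with only the fields the conditions need; the real renderer would also fill in images, networks and volumes.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Hypothetical stand-in for BottlePlan; field names are assumptions."""
    slug: str
    egress_routes: list = field(default_factory=list)
    git: dict = field(default_factory=dict)
    supervise: bool = False


def enabled_sidecars(plan: Plan) -> list[str]:
    """Sidecar service names that apply to this plan, in a stable order."""
    names = ["pipelock"]  # always present
    if plan.egress_routes:
        names.append("egress")
    if plan.git:
        names.append("git-gate")
    if plan.supervise:
        names.append("supervise")
    return names


def render_services(plan: Plan) -> dict:
    """Emit one service per enabled sidecar, plus the agent."""
    sidecars = enabled_sidecars(plan)
    services = {
        name: {"container_name": f"bot-bottle-{name}-{plan.slug}"}
        for name in sidecars
    }
    # The agent only depends on sidecars that were actually emitted.
    services["agent"] = {
        "container_name": f"bot-bottle-{plan.slug}",
        "depends_on": sidecars,
    }
    return services
```

Deriving depends_on from the same list that drives service emission means the agent can never reference a service that is absent from the file.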

Logging

docker compose up -d starts everything detached. The user's TTY reaches the agent via docker exec -it bot-bottle-<slug> (see the TTY decision under Open questions). Sidecars stream into Docker's per-container journals during the session, exactly as today, and docker compose logs -f gives a merged tail if the user wants it (the dashboard can shell to this).

At session end (success or crash), start.py's ExitStack runs:

snapshot_transcript(slug) (unchanged).
docker compose -p <project> logs --no-color --timestamps → state/<slug>/compose.log.
docker compose -p <project> down --volumes.
cleanup_state(slug) (unchanged — still removes the state dir unless .preserve was written).

The log dump is best-effort; a failure there shouldn't block teardown.

metadata.json shape

Add one field; everything else is unchanged.

{
  "agent_name": "implementer",
  "cwd": "/Users/.../some-project",
  "started_at": "2026-05-25T20:13:04Z",
  "compose_project": "bot-bottle-implementer-a7k3f"
}
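A sketch of producing this shape and of how another tool might derive a compose invocation from it without re-deriving the slug. Function names are hypothetical; the timestamp format is inferred from the example above.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_metadata(
    agent_name: str,
    cwd: str,
    slug: str,
    started_at: Optional[datetime] = None,
) -> dict:
    """Assemble the metadata.json payload, including compose_project."""
    moment = (started_at or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "agent_name": agent_name,
        "cwd": cwd,
        "started_at": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "compose_project": f"bot-bottle-{slug}",
    }


def compose_command(metadata_json: str, *args: str) -> list[str]:
    """Build `docker compose -p <project> ...` from a metadata.json string."""
    project = json.loads(metadata_json)["compose_project"]
    return ["docker", "compose", "-p", project, *args]
```

A dashboard or cleanup tool can then read metadata.json and call compose_command(text, "logs", "--no-color") directly.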

Per-sidecar class shape

Today's DockerPipelock, DockerGitGate, DockerEgress, DockerSupervise each carry start() + stop() lifecycle plus helper logic (image building, route validation, apply handlers).

After this PRD:

The start()/stop() methods come out.
A new method per class, compose_service(plan) -> dict, returns the service-stanza fragment (image / build / container_name / networks / volumes / env / depends_on).
The image-build flow becomes build: in the compose file, so the per-sidecar docker build calls go away too.
The apply/introspection helpers (egress_apply.add_route, supervise.py's capability handlers, etc.) are untouched — they read/write host paths under state/<slug>/live-config/ and the bind-mounted sidecars SIGHUP-reload.
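To make the new method shape concrete, here is a rough sketch for one sidecar. Only the image tag and container-name pattern come from this PRD; the Plan fields, build context, in-container mount path and network wiring are all assumptions for illustration.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Plan:
    """Hypothetical stand-in for BottlePlan."""
    slug: str
    state_dir: Path


class DockerPipelock:
    IMAGE = "bot-bottle-pipelock"          # image tag named in this PRD
    BUILD_CONTEXT = "./sidecars/pipelock"  # hypothetical build context
    CA_MOUNT = "/etc/pipelock/ca"          # hypothetical in-container path

    def compose_service(self, plan: Plan) -> dict:
        """Return the service-stanza fragment for this sidecar."""
        ca_dir = plan.state_dir / "pipelock-ca"
        return {
            "image": self.IMAGE,
            "build": {"context": self.BUILD_CONTEXT},
            "container_name": f"bot-bottle-pipelock-{plan.slug}",
            "networks": ["internal", "egress"],  # assumed wiring
            "volumes": [f"{ca_dir}:{self.CA_MOUNT}:ro"],
        }
```

The renderer would then assemble services: by calling compose_service(plan) on each sidecar that applies, with no lifecycle state left on the class.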

Cleanup CLI

./cli.py cleanup switches from "list every container with prefix bot-bottle- and every network with prefix bot-bottle-net- or bot-bottle-egress-" to:

docker compose ls --all --format json → filter to projects whose name starts with bot-bottle-.
For each: docker compose -p <project> down --volumes.
Reap any state dirs under ~/.bot-bottle/state/ whose compose_project no longer appears in compose ls.

Strays from pre-compose code-paths can be mopped up by keeping the existing prefix scan as a fallback for one release.
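The discovery and reaping logic might look roughly like this. It assumes docker compose ls --format json prints a JSON array of objects with a "Name" key, which matches the Compose v2 output I am aware of but should be confirmed; the helper names are hypothetical.

```python
import json

PREFIX = "bot-bottle-"


def bottle_projects(compose_ls_json: str) -> list[str]:
    """Project names from `docker compose ls --all --format json` output."""
    entries = json.loads(compose_ls_json)
    return sorted(
        entry["Name"]
        for entry in entries
        if entry.get("Name", "").startswith(PREFIX)
    )


def down_commands(projects: list[str]) -> list[list[str]]:
    """One `docker compose -p <project> down --volumes` argv per project."""
    return [["docker", "compose", "-p", p, "down", "--volumes"] for p in projects]


def stale_slugs(state_projects: dict[str, str], live_projects: list[str]) -> list[str]:
    """Slugs whose recorded compose_project is absent from compose ls.

    state_projects maps slug -> compose_project as read from metadata.json.
    """
    live = set(live_projects)
    return sorted(slug for slug, project in state_projects.items() if project not in live)
```

Keeping the parsing separate from the subprocess calls lets the filter and reap rules be unit-tested with canned JSON.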

Open questions

1. docker compose vs docker-compose v1. Compose v2 ships with Docker Desktop as docker compose (subcommand) and is what most users will already have. Assume v2; if v1 is detected, die with a pointer to upgrade.
2. How does claude reach the agent's TTY? Decided: keep today's docker exec -it model. Agent runs sleep infinity under compose; DockerBottle.exec_agent runs docker exec -it bot-bottle-<slug> claude ... exactly like today. Compose owns the lifecycle (so compose logs includes the agent's stdout, compose down tears it down), but the user-facing exec model is unchanged. Rejected docker attach because its default Ctrl-P-Ctrl-Q detach intercept buffers keypresses Claude Code uses; rejected "agent outside compose" because it gives up the unified compose logs view that motivated the PRD.
3. ~~TTY allocation under compose.~~ Resolved by #2: no tty: / stdin_open: on the agent service; interactivity is per-exec.
4. docker compose logs ordering. The dumped log file interleaves services by timestamp. Confirm --timestamps is enough to keep it readable; otherwise consider per-service subfiles (compose.log.pipelock, etc.).
5. Image build caching. build: in compose rebuilds on first up unless the image is already tagged. The per-sidecar images (bot-bottle-pipelock, bot-bottle-egress, bot-bottle-git-gate, bot-bottle-supervise) should stay tagged on the daemon between runs so we don't rebuild on every start. Verify compose's behavior matches.
6. docker compose down --volumes and bind-mount data. down --volumes removes named volumes but leaves bind-mount source paths alone (they're host paths under our state dir, which we manage explicitly). Confirm this; if there's a footgun, drop --volumes and rely on the state-dir cleanup step.
7. Dashboard discovery. cli/dashboard.py enumerates instances by scanning containers. Should it switch to docker compose ls too, or read metadata.json files under state/? Reading state dirs is faster and survives docker daemon restarts; compose ls is the truth about what's actually running. Probably both: list from state dirs, mark "running" by cross-referencing compose ls.
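For the v1-versus-v2 question, a small version gate could look like the following. It is a sketch that parses the text printed by docker compose version (or its --short form); the function names and the error message are hypothetical.

```python
import re
from typing import Optional


def compose_major_version(version_output: str) -> Optional[int]:
    """Extract the major version from compose's version output, if any."""
    match = re.search(r"v?(\d+)\.\d+", version_output)
    return int(match.group(1)) if match else None


def require_compose_v2(version_output: str) -> None:
    """Exit with an upgrade pointer unless Compose v2 or newer is present."""
    major = compose_major_version(version_output)
    if major is None or major < 2:
        raise SystemExit(
            "bot-bottle needs Docker Compose v2 (the `docker compose` "
            "subcommand); please upgrade Docker or install the compose plugin."
        )
```

An empty or unparseable string is treated the same as v1, which also covers the case where the docker compose subcommand is missing and prints only an error.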

Implementation chunks

Sized for one PR each, in order.

1. Compose renderer. Pure function: bottle_plan_to_compose(plan) -> dict. No I/O. Full unit-test coverage for the conditional-service matrix (every combination of git on/off, egress on/off, supervise on/off). No start.py changes yet.
2. Stage-file move to host paths. Refactor each sidecar's stage-file production (today: write to host stage dir → docker cp after create) to write directly into state/<slug>/ sub-trees with bind-mount-ready perms. SDK path still does docker cp; this is a no-op rearrangement that sets up chunk 3.
3. Switch start.py to compose. Wire up the renderer + docker compose up -d + docker exec for the agent + teardown. Per-sidecar start()/stop() lifecycle methods deleted in the same chunk. Compose-log dump on teardown added.
4. Cleanup CLI on compose. Switch ./cli.py cleanup to docker compose ls-based discovery; keep prefix-scan as fallback for one release.
5. Dashboard. Decide on the discovery question (open question #7), implement.

References

PRD 0003 — bottle backend abstraction (what stays / what changes underneath it)
PRD 0010 / 0017 — cred-proxy → egress; the sidecar lifecycle this PRD collapses into compose
PRD 0014 / 0015 / 0016 — apply flows that bind-mount-+-SIGHUP has to keep working without protocol change
