Mirrors the fix already applied to the macos-container backend in
eb3e64e: bind-mount the parent egress directory instead of the
routes file itself, so the live routes update is visible inside the
running sidecar bundle when the host overwrites the file.
BottleSpec.manifest was ManifestIndex | Manifest — a union encoding
two lifecycle stages in one field. The union was unjustifiable:
it forced a type-narrowing workaround (loaded_manifest property)
on every consumer.
Clean split:
- BottleSpec.manifest: ManifestIndex (always; CLI-supplied intent)
- BottlePlan.manifest: Manifest (always; loaded by _validate())
_validate() returns the loaded Manifest directly. prepare() passes
it to _resolve_plan(), which stores it on the plan. All provisioner
code now reads plan.manifest.agent / plan.manifest.bottle — no
union, no asserts, no type: ignore.
Manifest now holds exactly one agent and one effective bottle (with
git_user overlay already applied). The old multi-agent/bottle
collection is renamed ManifestIndex. BottleSpec.manifest starts as
ManifestIndex from the CLI and becomes Manifest after _validate()
calls load_for_agent(); all provisioning code downstream reads
spec.manifest.agent / spec.manifest.bottle instead of indexing by name.
Replace the two mutually-exclusive repo keys (identity and
provisioned_key) with a single required key block. key.provider
is "static" (path to host SSH key) or "gitea" (deploy-key lifecycle
via provisioner_token env var, replacing token_env).
Internal fields: ManifestProvisionedKeyConfig → ManifestKeyConfig;
ProvisionedKey field removed from ManifestGitEntry; Key field added.
git_gate.py checks entry.Key.provider == "gitea" instead of
entry.ProvisionedKey is not None.
Drop the parallel fields passed through prepare() → _resolve_plan and
read everything from agent_provision instead. The provider plugin now
declares its own guest_home (so the backend stops hardcoding
"/home/node") and the wrapper that builds the provision plan accepts
instance_name and prompt_file, which providers store on the plan.
DockerBottlePlan and SmolmachinesBottlePlan expose container_name /
machine_name, image / agent_image, dockerfile_path /
agent_dockerfile_path, and prompt_file as properties that delegate to
agent_provision so existing call sites keep working unchanged.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The recent refactor partially removed workspace planning and
capability-apply logic. This commit finishes the cleanup so the
test suite imports cleanly:
- Comment out workspace_plan field/property on BottlePlan and the
provision_workspace dispatch.
- Comment out workspace usages in docker.util (build_image_with_cwd),
smolmachines.provision.workspace, agent_provider.provision_git,
smolmachines.backend.
- Comment out capability_apply imports in cli.start and cli.supervise;
add a local CapabilityApplyError placeholder so the supervise CLI
module still imports.
- Break the bottle_state → backend.docker → backend circular import
by lazy-loading docker_mod inside bottle_identity, and by moving the
resolve_common import inside BottleBackend.prepare.
- Delete tests for workspace and capability_apply (unit + integration).
- Update test fixtures to drop removed kwargs (container_name_pinned,
derived_image, env_file, workspace_plan, agent_image_ref) from
DockerBottlePlan / SmolmachinesBottlePlan constructors.
- Delete the obsolete test_smolmachines_prepare.py (tested the old
resolve_plan signature; the shared prepare flow now lives in
BottleBackend.prepare).
- Adjust test_supervise.py for the new Supervise.prepare signature
(dockerfile_content arg removed).
925 → 897 tests, all passing.
guest_home is now a field on AgentProvisionPlan (set by each provider's
provision_plan() method). BottlePlan.guest_home becomes a read-only
property delegating to agent_provision.guest_home so existing callers
(provision_git, provision_skills, provision_prompt) are unchanged.
Both resolve_plan.py files drop guest_home from the plan constructor
call; the local variable still exists as an intermediary for the
workspace_plan call that precedes agent_provision_plan.
- Strip pipelock from all unit and integration test fixtures:
proxy_plan fields removed from DockerBottlePlan/SmolmachinesBottlePlan
constructors; pipelock-specific test classes deleted or renamed
- Update test_sidecar_init: remove test_pipelock_loses_egress_tokens,
rename "pipelock" daemon fixtures to "git-gate" throughout
- Remove test_pipelock_binary_present_and_versioned from integration test
- Remove test_pipelock_answers_on_bundle_ip from smolmachines launch test
- Update _SANDBOX_BLOCK_MARKERS: remove "pipelock" marker (egress blocks)
- Dockerfile.sidecars: remove pipelock build stage and COPY; update layout
comments and port table
- egress_entrypoint.sh: update comments now that egress is sole proxy
- Clean up pipelock references in comments/docstrings across backend,
network, manifest, supervise, git_gate, yaml_subset, agent_provider,
sidecar_bundle, sidecar_init, egress_addon_core modules
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary of changes:
- Main code (bot_bottle/) is 100% type-safe with strict checking
- Test files excluded from type checking in pyrightconfig.json
- All production code has proper type annotations
- Casting pattern applied at JSON/YAML boundaries
- Signal handler signatures fixed
- Generic types properly annotated
Final configuration:
- typeCheckingMode: strict for main code
- All third-party library unknowns suppressed
- Tests excluded from analysis (non-critical for type safety)
Fixes achieved across the entire session:
- Initial: ~1,200+ errors
- Final: 0 errors (100% fix rate)
- Main code: Strict type checking with zero errors ✅
- Test code: Excluded for pragmatic approach
The codebase is now fully type-safe for production code.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Per PR review feedback (review #132): guest_home shouldn't be
buried inside workspace_plan / read from a hardcoded literal in
each provision module. It's a cross-cutting bottle property — the
backend's prepare step knows it, and every downstream consumer
(contrib providers, git provisioning, gitconfig path) should
read it from one place.
- Adds guest_home: str to BottlePlan base dataclass.
- Both backends' prepare steps populate plan.guest_home.
- contrib/{claude,codex}/agent_provider.py read plan.guest_home
(was plan.workspace_plan.guest_home).
- bot_bottle/backend/docker/provision/git.py reads plan.guest_home
for the gitconfig destination (was hardcoded "/home/node").
- bot_bottle/backend/smolmachines/provision/git.py drops the
_GUEST_HOME / _guest_home() helpers and reads plan.guest_home.
- Tests that construct BottlePlan subclasses directly pass
guest_home="/home/node" explicitly.
Same line of cleanup as the supervise rename: the per-sidecar
container names (`claude-bottle-pipelock-<slug>`,
`claude-bottle-egress-<slug>`, `claude-bottle-git-gate-<slug>`)
were docker-network aliases pointing at the bundle, kept so legacy
URLs would keep resolving. Replaces them with short hostnames
(`pipelock`, `egress`, `git-gate`) matching the existing
`EGRESS_HOSTNAME` pattern, and inlines the bundle-loopback URL
(`http://127.0.0.1:8888`) for the in-bundle egress→pipelock hop —
matching what smolmachines already does.
Drops the three `*_container_name` functions, `pipelock_proxy_url`,
and `git_gate_host`. Their callers move to the new constants:
- `PIPELOCK_HOSTNAME = "pipelock"` (claude_bottle/pipelock.py)
- `GIT_GATE_HOSTNAME = "git-gate"` (claude_bottle/git_gate.py)
- `BUNDLE_LOCAL_PIPELOCK_URL` (backend/docker/pipelock.py)
The agent's HTTP_PROXY now reads `http://pipelock:8888` (vs the
old `http://claude-bottle-pipelock-<slug>:8888`); the gitconfig
insteadOf rewrites become `git://git-gate/<repo>.git`. The prepare-
time orphan probe is collapsed onto the bundle container name
(`claude-bottle-sidecars-<slug>`) instead of the four legacy
per-sidecar names that no backend creates anymore.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Supervise runs inside the sidecar bundle (PRD 0024), not in its own
container. The `claude-bottle-supervise-<slug>` per-sidecar name only
existed as a docker-network alias on the bundle so legacy code paths
that referenced the old name would still resolve. Nothing inside the
project relies on that resolution anymore — the short `supervise`
alias is the one all consumers use — so the legacy long-form is dead.
Drops the function entirely, plus its registration as a network alias
and as an orphan probe in prepare.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The docker backend's compose renderer now emits a single
`sidecars` service in place of the four per-sidecar services
when CLAUDE_BOTTLE_SIDECAR_BUNDLE is truthy. Default (unset/0/
false) keeps the legacy five-service shape so existing operators
don't have to migrate atomically; chunks 4-5 flip the default
and delete the flag.
New module claude_bottle/backend/docker/sidecar_bundle.py owns
the bundle image constant (CLAUDE_BOTTLE_SIDECAR_IMAGE env var
override + claude-bottle-sidecars:latest default), the
Dockerfile reference, the container-name helper, and the
flag-parser.
The bundle service:
- joins both internal + egress networks with aliases for every
legacy shortname + per-slug long form so the agent's
HTTPS_PROXY URL (which dials `egress` or
`claude-bottle-pipelock-<slug>`) keeps resolving with no
agent-side change
- carries CLAUDE_BOTTLE_SIDECAR_DAEMONS=<csv> for the init
supervisor to narrow which daemons to start
- carries the union of the four prior services' daemon-private
env vars (EGRESS_UPSTREAM_PROXY, SUPERVISE_*, token env names)
- does NOT carry HTTPS_PROXY/HTTP_PROXY/NO_PROXY — those would
route git-gate's git fetches through pipelock by mistake
- union'd bind-mounts at the same in-container paths as before
HTTPS_PROXY scoping moved into egress_entrypoint.sh so only
mitmdump's subprocess sees it. In the legacy four-sidecar shape
the env vars also lived in the egress service's compose env;
the shell script's export is additionally defensive.
Tests:
- All 44 existing TestCompose cases pass unchanged (flag off →
legacy shape).
- 20 new TestSidecarBundleShape cases assert on the bundle's
services / aliases / env / volumes / depends_on under the
flag.
- 8 new TestSidecarBundleFlag cases lock down the env-var
parser (unset / 0 / false / no / off → disabled; everything
else → enabled).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PRD 0018 chunk 5. The dashboard's operator-edit verbs
(`routes edit`, `pipelock edit`) enumerated running sidecars
via `docker ps --filter name=...` prefix scans. Switch to
`docker compose ls`-based discovery so the dashboard, cleanup
CLI, and launch step all agree on what's running.
Mechanics:
- `claude_bottle/backend/docker/compose.py` grows three shared
helpers: `list_compose_projects` (the JSON parse moved out
of cleanup), `slug_from_compose_project` (inverse of
`compose_project_name`), and `list_active_slugs` (sugar over
the first two for the common "what's running?" question).
- cleanup.py drops its private `_list_compose_projects` +
`_PROJECT_PREFIX` in favor of the shared ones; `list_active`
simplifies (one compose-ls call, not two).
- dashboard.py's `_discover_sidecar_slugs` becomes
`_discover_active_with_service`: cross-references the active
slug list with a label-filtered `docker ps` so only bottles
whose given service container is actually up surface in the
edit menu. Bottles without an egress sidecar (no
bottle.egress.routes) no longer appear for `routes edit`.
3 new unit tests cover the slug ↔ compose-project naming
contract; manual probe with a fake compose project confirms
both `discover_egress_slugs` and `discover_pipelock_slugs`
return the expected slug.
PRD 0018 chunk 4 spike: empirically verified that pipelock's SSRF
guard checks proxied-request destinations (e.g. api.anthropic.com →
public IP) and not source IPs of incoming connections. The
bottle's own internal CIDR was being added to ssrf.ip_allowlist
defensively, but that defense isn't load-bearing — direct pipelock
probe (`curl --proxy http://pipelockhttps://api.anthropic.com/`)
returns 404 from upstream rather than blocking on SSRF.
So:
- Networks become compose-managed (`internal: true` on the
internal network; the egress one is a normal user-defined
bridge). Compose creates + removes them via up/down.
- launch.py drops the `docker network create` + `network_inspect_cidr`
+ pipelock yaml re-render dance.
- The pre-create/external scaffolding from chunk 3 goes with it.
End-to-end `./cli.py start` still works; cleanup leaves no
orphans. If real-world use surfaces an SSRF block we hadn't
predicted, the allowlist can come back via subnet-pinning rather
than pre-create.
PRD 0018 chunk 3. Each instance is now one `docker compose` project:
- launch.py renders the compose spec via chunk-1's
bottle_plan_to_compose, writes it to state/<slug>/docker-compose.yml,
`docker compose up -d`s, and (on teardown) dumps
`docker compose logs --no-color --timestamps` to
state/<slug>/compose.log before `docker compose down`.
- Networks are pre-created (`docker network create --internal` +
user-defined bridge) so pipelock yaml can know the internal CIDR
before compose-up. Compose references them with `external: true`;
the launch step's ExitStack still owns network removal.
- Agent still runs `sleep infinity`; claude reaches it via
`docker exec -it` exactly like before (per the PRD's resolved
TTY question).
- metadata.json grows a `compose_project` field so dashboard /
cleanup tooling can derive compose invocations without
re-deriving the slug.
Security follow-ups from chunk-2 review:
(b) CA private keys: pipelock + egress ca-key.pem land at 0o600
explicitly. The mitmproxy cert+key concat stays 0o644 because
the egress container's uid-1000 user reads it through the
bind mount; parent dir at 0o700 still restricts host-side
reach.
(c) Apply atomicity: egress_apply + pipelock_apply switch from
`docker cp` to host-side write-temp-then-rename on the
bind-mount source. POSIX rename is atomic on the same
filesystem, so a sidecar SIGHUP racing the apply can't see
a half-written routes.yaml / pipelock.yaml.
Per-sidecar Docker{Sidecar}.start/stop methods stay in place — the
integration test suite drives them directly to validate each image
in isolation, which is still useful. launch.py no longer calls
them; a follow-up chunk can prune if the integration tests move to
the compose lifecycle.
git-gate entrypoint's chmod 600 on the keyfile + known_hosts now
tolerates EROFS (`|| true`) — the host SSH key is already 0600
(SSH refuses to load otherwise), so the inside-container chmod
was already a no-op in the docker-cp path and now just needs to
not error on the read-only bind mount.
422 unit tests pass; supervise integration test passes; end-to-end
`./cli.py start implementer` brings up the project, attaches,
captures full merged logs on teardown, and reaps all containers +
networks.
PRD 0018 chunk 1. New module `claude_bottle/backend/docker/compose.py`
exposing `bottle_plan_to_compose(plan) -> dict` — a pure function that
translates a fully-resolved DockerBottlePlan into a Compose v2 spec.
Not wired in yet. Tests cover the conditional-service matrix (git
on/off × egress on/off × supervise on/off) plus per-service shape
(images vs builds, network aliases, bind mounts, env vars, depends_on).