fix(smolmachines): use containerized crane to push, bypassing docker daemon's HTTPS preference
test / unit (pull_request) Successful in 27s
test / integration (pull_request) Successful in 42s

The previous fix (`host.docker.internal:<port>` for daemon-side
push) still failed:

  Get "https://host.docker.internal:53958/v2/":
    http: server gave HTTP response to HTTPS client

`host.docker.internal` is reachable from Docker Desktop's daemon
VM but isn't in the daemon's default insecure-registries CIDRs
(only `::1/128` and `127.0.0.0/8` are), so docker push tries
HTTPS, hits a plain-HTTP registry, and refuses. The daemon.json
fix (`"insecure-registries": ["host.docker.internal"]`) works
but is a one-time manual step in Docker Desktop's UI — not
something we can do for the user.

Sidestep the daemon push entirely:

  1. docker build (as before) — local layer cache makes
     no-change rebuilds cheap.
  2. docker save the image to a per-digest tarball alongside the
     cached `.smolmachine`.
  3. Start an ephemeral registry container on a per-session
     docker network, with `-p :5000` so the host can also reach
     it for the pack step.
  4. docker run a one-shot crane container on the SAME network,
     mount the tarball, `crane push --insecure /img.tar
     <registry-container>:5000/...`. Container DNS resolves the
     registry on the network; `--insecure` forces plain HTTP.
  5. `smolvm pack create --image localhost:<host port>/...` from
     the host. smolvm's bundled crane auto-falls-back to HTTP
     for localhost addresses, so no insecure-registries config
     is needed on that side.
  6. Tear down everything; reap the tarball (registries hold the
     same bytes, no need to keep both around).

Net effect: the docker daemon never does an HTTP/HTTPS-policy
decision on our behalf. `docker push` is gone from the prepare
path; `docker save`, `docker network create`, `docker run` (for
registry + crane) replace it.

Tested end-to-end on Docker Desktop / macOS: `_ensure_smolmachine
("claude-bottle:latest")` produces a 204MB
`.smolmachine.smolmachine` artifact.

Adds:
- backend/docker/util.py:save() — thin docker save wrapper.
- local_registry.crane_push_tarball() — one-shot crane run on
  the registry's network.
- CRANE_IMAGE constant pinned by digest
  (gcr.io/go-containerregistry/crane@sha256:0ae17ecb...).

Removes:
- backend/docker/util.py:tag() / push() — unused without daemon
  push.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
2026-05-27 14:52:40 -04:00
parent f4026ea3ae
commit 47eb56bd10
6 changed files with 347 additions and 258 deletions
+22 -16
View File
@@ -34,7 +34,7 @@ from ...pipelock import PipelockProxy
from ...supervise import Supervise
from . import smolvm as _smolvm
from .bottle_plan import SmolmachinesBottlePlan
from .local_registry import ephemeral_registry
from .local_registry import crane_push_tarball, ephemeral_registry
from .util import smolmachines_bundle_subnet, smolmachines_preflight
@@ -185,15 +185,14 @@ def _ensure_smolmachine(image_ref: str) -> Path:
it; the sidecar is the actual artifact).
Conversion path: `docker build` (the existing layer cache
makes no-change rebuilds cheap) → `docker tag` + `docker push`
using the daemon-side endpoint (`host.docker.internal:<port>`
on Docker Desktop, `localhost:<port>` on native Linux) →
`smolvm pack create --image <host endpoint>` using the
host-side endpoint (always `localhost:<port>` — smolvm is a
host process) → tear down the registry. The two endpoints
route to the same registry container; only the hostname
differs because the docker daemon (on Docker Desktop) doesn't
share the host's loopback.
makes no-change rebuilds cheap) → `docker save` to a tarball
→ spin up an ephemeral registry on a private docker network →
`crane push --insecure` from a one-shot container on the same
network → `smolvm pack create --image localhost:<host port>/...`
→ tear down the registry + network. The crane push detour
sidesteps the Docker-Desktop daemon's HTTPS preference for
non-loopback registries — see the `local_registry` module
docstring for the gory details.
Each pack-create costs several seconds even on a hot cache,
so we skip the whole pipeline when the cached sidecar is
@@ -208,10 +207,17 @@ def _ensure_smolmachine(image_ref: str) -> Path:
sidecar = _SMOLMACHINE_CACHE_DIR / f"{digest}.smolmachine.smolmachine"
if sidecar.is_file():
return sidecar
with ephemeral_registry() as endpoints:
push_ref = f"{endpoints.daemon_endpoint}/claude-bottle:{digest}"
pack_ref = f"{endpoints.host_endpoint}/claude-bottle:{digest}"
docker_mod.tag(image_ref, push_ref)
docker_mod.push(push_ref)
_smolvm.pack_create(pack_ref, binary)
tarball = _SMOLMACHINE_CACHE_DIR / f"{digest}.image.tar"
docker_mod.save(image_ref, str(tarball))
try:
with ephemeral_registry() as handle:
push_ref = f"{handle.push_endpoint}/claude-bottle:{digest}"
pack_ref = f"{handle.pull_endpoint}/claude-bottle:{digest}"
crane_push_tarball(handle, str(tarball), push_ref)
_smolvm.pack_create(pack_ref, binary)
finally:
# Tarball is ~500MB-1GB for the agent image; reclaim once
# the smolmachine artifact exists. The artifact itself is
# the long-lived cache entry.
tarball.unlink(missing_ok=True)
return sidecar