fix(smolmachines): docker push fails on Docker Desktop — daemon-side route differs from host loopback #74
Summary
`./cli.py start <agent>` under `CLAUDE_BOTTLE_BACKEND=smolmachines` died at `docker push localhost:<port>/...` with `Get "http://localhost:<port>/v2/": context deadline exceeded`.

Cause: chunk 4c bound the ephemeral registry to `127.0.0.1::5000` and used `localhost:<port>` as the only image-ref hostname. On Docker Desktop the daemon runs inside its own Linux VM. Its `localhost` is the VM's loopback, not the host's, so the daemon can't reach a host-loopback-only binding.

Fix: bind the registry to all interfaces (`-p :5000`) so it's reachable from both sides, and yield a `RegistryEndpoints(daemon_endpoint, host_endpoint)` from the context manager:

- `daemon_endpoint`: `host.docker.internal:<port>` on Docker Desktop (the special hostname the daemon resolves to the host VM gateway); `localhost:<port>` on a native Linux daemon that shares the host's network namespace. Used for `docker tag` + `docker push`.
- `host_endpoint`: always `localhost:<port>`. Used for `smolvm pack create`, which runs as a host process.

The registry stores images by repo+tag; a push to `host.docker.internal:<port>/cb:<id>` and a pull from `localhost:<port>/cb:<id>` resolve to the same blob. The hostname is just routing.

Detection uses `docker info --format '{{.OperatingSystem}}'` (returns `"Docker Desktop"` on macOS/Windows desktop, and the host's OS string on a native daemon).

Trade-off: all-interface binding briefly publishes the registry on every interface (~5-10s during prepare). The pushed image is built from the public repo Dockerfile (no secrets), the port is random, and the window is short, which is acceptable for v1.

577 unit tests pass (the existing chunk 4c tests get updated to assert the new endpoint shapes; two new tests cover the Docker-Desktop vs native-Linux routing branches).
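
The routing branch described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `RegistryEndpoints` is named in the description, but its field layout here is assumed, and `pick_endpoints` and `docker_operating_system` are hypothetical helper names.

```python
import subprocess
from typing import NamedTuple


class RegistryEndpoints(NamedTuple):
    # Field layout is an assumption; the PR only names the two endpoints.
    daemon_endpoint: str  # hostname:port as seen by the docker daemon
    host_endpoint: str    # hostname:port as seen by host processes


def docker_operating_system() -> str:
    """Hypothetical helper: ask the daemon which OS string it reports."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{.OperatingSystem}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def pick_endpoints(operating_system: str, port: int) -> RegistryEndpoints:
    """Hypothetical helper: choose the image-ref hostnames for each side."""
    host = f"localhost:{port}"
    if operating_system.strip() == "Docker Desktop":
        # The daemon lives in a VM, so it needs the special hostname.
        return RegistryEndpoints(f"host.docker.internal:{port}", host)
    # A native Linux daemon shares the host's network namespace.
    return RegistryEndpoints(host, host)
```

Keeping the decision in a pure function that takes the OS string as an argument makes both branches testable without a running daemon.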
Second commit pushed (`47eb56b`). The previous approach (`host.docker.internal:<port>` for daemon-side push) still failed because Docker Desktop's daemon doesn't have that hostname in its default insecure-registries CIDRs: the push tried HTTPS, hit the plain-HTTP registry, and was refused.

The daemon.json fix (`"insecure-registries": ["host.docker.internal"]`) works but is a one-time manual UI step.

New approach sidesteps `docker push` entirely:

1. `docker build` (unchanged).
2. `docker save` to a per-digest tarball.
3. Registry with `-p :5000` host publish.
4. `crane push --insecure` on the same docker network: joins the registry's network, resolves it by container DNS, forces plain HTTP via `--insecure`.
5. `smolvm pack create --image localhost:<host port>/...`: smolvm's bundled crane auto-falls-back to HTTP for localhost.

The docker daemon never makes an HTTP/HTTPS policy decision on our behalf.

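
As a rough sketch of the save and push steps, the two commands could be assembled like this. The function names, the `/image.tar` mount point, and the network and registry names in the usage note are all hypothetical; the crane image reference is passed in as a parameter because the pinned digest is not reproduced here.

```python
from pathlib import Path


def save_argv(image: str, tarball: Path) -> list[str]:
    """Build the `docker save` command that writes the image to a tarball."""
    return ["docker", "save", "-o", str(tarball), image]


def crane_push_argv(
    crane_image: str, network: str, tarball: Path, dest_ref: str
) -> list[str]:
    """Build a one-shot crane container that pushes the tarball over plain HTTP.

    The container joins the registry's docker network, so dest_ref can use
    the registry container's DNS name instead of a host-published port.
    """
    if not tarball.is_absolute():
        # docker bind mounts need an absolute host path.
        raise ValueError(f"tarball path must be absolute: {tarball}")
    return [
        "docker", "run", "--rm",
        "--network", network,
        "-v", f"{tarball}:/image.tar:ro",
        crane_image,
        "push", "--insecure", "/image.tar", dest_ref,
    ]
```

Each list can be handed to `subprocess.run(argv, check=True)`. With a hypothetical network `cb-net` and registry container `cb-registry`, the destination would look like `cb-registry:5000/cb:<id>`, while `smolvm pack create` keeps pulling from `localhost:<host port>/cb:<id>`.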
`docker push` is gone from the prepare path.

Tested end-to-end on macOS Docker Desktop: `_ensure_smolmachine('claude-bottle:latest')` produces a 204MB `.smolmachine` artifact.

Adds:

- `backend/docker/util.py:save()`: thin `docker save` wrapper
- `local_registry.crane_push_tarball()`: one-shot crane on the registry's network
- `CRANE_IMAGE` pinned by digest (`gcr.io/go-containerregistry/crane@sha256:0ae17ecb...`)

Removes the now-unused `tag()`/`push()` helpers.

`SmolmachinesBottlePlan.print` iterated over `bottle.egress.routes` (the manifest's capitalized-attribute form on `manifest.EgressRoute`) but accessed `r.host` (lowercase). Worked when no egress routes were declared; AttributeError ("EgressRoute has no attribute 'host'") on the first bottle with a route.

Switch to `self.egress_plan.routes`: the resolved plan-level EgressRoute (lowercase `host`), same source the docker backend's print uses.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

`smolvm machine exec` runs commands as root in the VM, but the agent image's USER is `node`. claude-code refuses `--dangerously-skip-permissions` when invoked as root, killing the interactive session right after `attaching interactive claude session...`:

    --dangerously-skip-permissions cannot be used with root/sudo privileges for security reasons

Wrap both `exec_claude` and `exec(script)` in `runuser -l node -c ...` so commands run as the node user with node's $HOME / $USER (login shell). The docker backend gets this behavior for free via the image's USER directive; this restores parity.

shlex-quote each claude argv element when stitching the `runuser -c` shell command so paths / flags with shell-special chars survive the parse.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

`claude mcp add` as node so config lands in node's home

didericis-claude referenced this pull request 2026-05-27 16:23:35 -04:00
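
The `runuser` wrapping described in the commit message above might look roughly like this. It is a sketch under assumptions: `as_node` is a hypothetical name, and how the resulting argv is handed to `smolvm machine exec` is not shown because the source does not specify it.

```python
import shlex


def as_node(argv: list[str]) -> list[str]:
    """Wrap a command so it runs in a login shell as the `node` user.

    Each element is quoted individually so that spaces and shell
    metacharacters survive the re-parse done by `runuser -c`.
    """
    inner = " ".join(shlex.quote(arg) for arg in argv)
    return ["runuser", "-l", "node", "-c", inner]
```

Quoting per element rather than joining raw strings means an argument such as a path containing a space reaches the wrapped command as a single argument.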