fix(smolmachines): agent dials bundle via host loopback ports, not docker bridge IP

Claude hung on outbound network calls under
CLAUDE_BOTTLE_BACKEND=smolmachines:

  Unable to connect to API (FailedToOpenSocket)

Root cause: the PRD-0023 design pinned the bundle at a docker
bridge IP (192.168.X.2) and set the smolvm guest's TSI allowlist
to `<bundle-ip>/32`. On native Linux this works: the host shares
the docker bridge's network namespace, so TSI's syscall
impersonation reaches the bridge IP directly. On Docker Desktop
(macOS), the daemon runs in its own Linux VM and docker bridge
IPs aren't reachable from macOS networking, so the smolvm
guest's TSI requests fail with "Network is unreachable" before
they reach pipelock.

Fix: publish each agent-facing bundle daemon's port on host
loopback (-p 127.0.0.1::PORT), discover the random host-side
ports after start, and route the agent through
`127.0.0.1:<host port>` instead of the bridge IP. macOS loopback
is the surface Docker Desktop's gvproxy forwards into the
daemon's VM, so the chain (guest TSI -> macOS loopback ->
daemon VM port-forward -> bundle container) works on both
Docker Desktop and native Linux.
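
A minimal sketch of how the publish flags described above could be
built. The helper name `loopback_publish_args` and the port numbers
in the usage note are hypothetical, not taken from this commit; only
the `-p 127.0.0.1::PORT` form comes from the change itself.

```python
def loopback_publish_args(container_ports):
    """Return `docker run` flags that publish each container port on a
    docker-chosen host port bound to 127.0.0.1 only."""
    args = []
    for port in container_ports:
        # "127.0.0.1::PORT": bind on host loopback, leave the host port
        # empty so docker picks a free one, target container port PORT.
        args += ["-p", f"127.0.0.1::{port}"]
    return args
```

For example, `loopback_publish_args([8888, 9418])` yields
`["-p", "127.0.0.1::8888", "-p", "127.0.0.1::9418"]`. Because the host
port is left empty, it has to be looked up after the container starts.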

Concrete changes:
- BundleLaunchSpec: add `ports_to_publish` so start_bundle adds
  `-p 127.0.0.1::PORT` for the agent-facing ports (pipelock
  always; git-gate when upstreams declared; supervise when
  enabled). Egress's port stays bundle-internal.
- sidecar_bundle.bundle_host_port(): wrap `docker port <bundle>
  <container_port>/tcp` so launch can look up the random
  host-side mapping after start.
- launch.py: discover the host ports, build URLs of the form
  `http://127.0.0.1:<host port>` / `git://127.0.0.1:<host port>`,
  stamp onto guest_env + new agent_*_url fields on the plan.
- launch.py: TSI allow_cidrs flips to `["127.0.0.1/32"]`. The
  bundle IP is no longer the agent's target.
- prepare.py: stop synthesizing HTTPS_PROXY / GIT_GATE_URL /
  MCP_SUPERVISE_URL at prepare time — launch owns those now
  (the values depend on a port docker hasn't assigned yet).
- provision_git: read gate_host from plan.agent_git_gate_host.
- provision_supervise: read the URL from plan.agent_supervise_url.
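
The host-port lookup in the list above could look roughly like the
sketch below. It is an illustration under assumptions, not the code
in sidecar_bundle: the names `parse_docker_port_output` and
`bundle_host_port_sketch` are hypothetical, and the sketch takes a
container name directly where the real helper takes the plan slug.

```python
import subprocess


def parse_docker_port_output(output: str) -> int:
    """Pick the 127.0.0.1 mapping out of `docker port` output.

    `docker port <container> <port>/tcp` prints one `host:port`
    line per binding, e.g. `127.0.0.1:49153`.
    """
    for line in output.splitlines():
        host, _, port = line.strip().rpartition(":")
        if host == "127.0.0.1" and port.isdigit():
            return int(port)
    raise ValueError(f"no loopback mapping in: {output!r}")


def bundle_host_port_sketch(container: str, container_port: int) -> int:
    """Ask docker which host port it assigned (assumes docker on PATH)."""
    result = subprocess.run(
        ["docker", "port", container, f"{container_port}/tcp"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_docker_port_output(result.stdout)
```

Keeping the parsing separate from the subprocess call lets the parsing
be tested without a running docker daemon.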

End-to-end verified on Docker Desktop / macOS: guest dials
pipelock through TSI, pipelock forwards to api.anthropic.com,
the API responds with 401 (i.e. it received the request).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 15:31:44 -04:00
parent da1e5e1ba8
commit 4f136a9932
7 changed files with 170 additions and 44 deletions
+80 -9
@@ -41,14 +41,28 @@ from ..docker.git_gate import (
     GIT_GATE_CREDS_DIR_IN_CONTAINER,
     GIT_GATE_ENTRYPOINT_IN_CONTAINER,
     GIT_GATE_HOOK_IN_CONTAINER,
+    GIT_GATE_PORT as _GIT_GATE_PORT,
+)
+from ..docker.pipelock import (
+    BUNDLE_LOCAL_PIPELOCK_URL,
+    PIPELOCK_PORT as _PIPELOCK_PORT_STR,
+    pipelock_tls_init,
 )
-from ..docker.pipelock import BUNDLE_LOCAL_PIPELOCK_URL, pipelock_tls_init
 from . import sidecar_bundle as _bundle
 from . import smolvm as _smolvm
 from .bottle import SmolmachinesBottle
 from .bottle_plan import SmolmachinesBottlePlan
+# Container-internal listening ports for each bundle daemon. The
+# bundle publishes each one on a random host loopback port (see
+# `_bundle.start_bundle`), and `_bundle.bundle_host_port` looks
+# them up post-start. Pipelock's port is an env-overridable string
+# in docker.pipelock; coerce to int here.
+_PIPELOCK_PORT = int(_PIPELOCK_PORT_STR)
+_SUPERVISE_PORT = SUPERVISE_PORT
 @contextmanager
 def launch(
     plan: SmolmachinesBottlePlan,
@@ -96,28 +110,74 @@ def launch(
         )
         # 3. Build the BundleLaunchSpec from the (now-resolved)
-        #    inner Plans: daemon subset, env, bind-mounts.
+        #    inner Plans: daemon subset, env, bind-mounts. The spec's
+        #    ports_to_publish list expands depending on which daemons
+        #    the agent needs to reach from the smolvm guest.
         bundle_spec = _bundle_launch_spec(plan, network)
         token_env = _resolve_token_env(plan, os.environ)
         _bundle.start_bundle(bundle_spec, env={**os.environ, **token_env})
         stack.callback(_bundle.stop_bundle, plan.slug)
-        # 4. smolvm VM. --from carries the pre-packed .smolmachine
+        # 4. Discover the host-side ports docker assigned for the
+        #    bundle's published container ports, and bind the
+        #    agent's URLs to `127.0.0.1:<host port>`. Docker container
+        #    IPs (192.168.x.x in the daemon's bridge) aren't
+        #    reachable from the smolvm guest on macOS — TSI uses
+        #    macOS networking, and macOS sees the daemon's bridge
+        #    via the published-port loopback forward only.
+        pipelock_host_port = _bundle.bundle_host_port(plan.slug, _PIPELOCK_PORT)
+        agent_proxy_url = f"http://127.0.0.1:{pipelock_host_port}"
+        agent_git_gate_host = ""
+        if plan.git_gate_plan.upstreams:
+            git_gate_host_port = _bundle.bundle_host_port(
+                plan.slug, _GIT_GATE_PORT,
+            )
+            agent_git_gate_host = f"127.0.0.1:{git_gate_host_port}"
+        agent_supervise_url = ""
+        if plan.supervise_plan is not None:
+            supervise_host_port = _bundle.bundle_host_port(
+                plan.slug, _SUPERVISE_PORT,
+            )
+            agent_supervise_url = f"http://127.0.0.1:{supervise_host_port}/"
+        # Stamp the URLs onto the plan + guest_env. provision_git
+        # and provision_supervise read the plan fields; the agent
+        # reads guest_env on every exec_claude.
+        guest_env = {
+            **plan.guest_env,
+            "HTTPS_PROXY": agent_proxy_url,
+            "HTTP_PROXY": agent_proxy_url,
+        }
+        if agent_git_gate_host:
+            guest_env["GIT_GATE_URL"] = f"git://{agent_git_gate_host}"
+        if agent_supervise_url:
+            guest_env["MCP_SUPERVISE_URL"] = agent_supervise_url
+        plan = dataclasses.replace(
+            plan,
+            guest_env=guest_env,
+            agent_proxy_url=agent_proxy_url,
+            agent_git_gate_host=agent_git_gate_host,
+            agent_supervise_url=agent_supervise_url,
+        )
+        # 5. smolvm VM. --from carries the pre-packed .smolmachine
         #    artifact (built by prepare); --allow-cidr + -e carry the
-        #    per-bottle TSI allowlist + env. Smolfile isn't usable
-        #    here — smolvm 0.8.0 makes `--from` and `--smolfile`
-        #    mutually exclusive.
+        #    per-bottle TSI allowlist + env. The allowlist is
+        #    `127.0.0.1/32` because every bundle daemon the agent
+        #    reaches is fronted by a host loopback port-forward.
+        #    Smolfile isn't usable here — smolvm 0.8.0 makes `--from`
+        #    and `--smolfile` mutually exclusive.
         _smolvm.machine_create(
             plan.machine_name,
             from_path=plan.agent_from_path,
-            allow_cidrs=[f"{plan.bundle_ip}/32"],
+            allow_cidrs=["127.0.0.1/32"],
             env=plan.guest_env,
         )
         stack.callback(_smolvm.machine_delete, plan.machine_name)
         _smolvm.machine_start(plan.machine_name)
         stack.callback(_smolvm.machine_stop, plan.machine_name)
-        # 5. Reclaim /home/node for the node user. smolvm's pack
+        # 6. Reclaim /home/node for the node user. smolvm's pack
         #    process remaps OCI-layer ownership to the host invoker's
         #    uid (501 on macOS) rather than preserving the image's
         #    uid 1000 — so without this chown, node can't write its
@@ -129,7 +189,7 @@ def launch(
             ["chown", "-R", "node:node", "/home/node"],
         )
-        # 6. Provision (CA / prompt / skills / git / supervise).
+        # 7. Provision (CA / prompt / skills / git / supervise).
         prompt_path = provision(plan, plan.machine_name)
         yield SmolmachinesBottle(
@@ -217,6 +277,16 @@ def _bundle_launch_spec(
         ]
         volumes.append((str(sp.queue_dir), QUEUE_DIR_IN_CONTAINER, False))
+    # Container ports the agent reaches from the smolvm guest —
+    # published on host loopback so the guest can dial via TSI +
+    # macOS networking. Egress is bundle-internal and never
+    # published.
+    ports_to_publish: list[int] = [_PIPELOCK_PORT]
+    if gp.upstreams:
+        ports_to_publish.append(_GIT_GATE_PORT)
+    if sp is not None:
+        ports_to_publish.append(_SUPERVISE_PORT)
     return _bundle.BundleLaunchSpec(
         slug=plan.slug,
         network_name=network,
@@ -226,6 +296,7 @@ def _bundle_launch_spec(
         daemons_csv=",".join(daemons),
         environment=tuple(env),
         volumes=tuple(volumes),
+        ports_to_publish=tuple(ports_to_publish),
     )