6ea19a8d53a262fd1d43f75966c3028a6774e3e9
11 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
6ea19a8d53 | fix(git-gate): use smart http for smolmachines pushes | ||
|
|
c08b09dc9f |
refactor!: rename project to bot-bottle
Assisted-by: Codex |
||
|
|
59ee32cc8d | refactor(manifest): key git config by host | ||
|
|
c9cdd41110 |
feat(smolmachines): apply git_user via git config --global on provision (issue #86)
Mirror the docker backend's third provisioning subcase in `backend/smolmachines/provision/git.py`:
_provision_git_user(plan, target)
Runs `smolvm machine exec --name <M> -e HOME=/home/node -e USER=node -- runuser -u node -- git config --global user.<X> <value>` for each git_user field. No-op when `git_user.is_empty()`.
`runuser -u node --` switches the UID without invoking a login shell (matching the existing `Bottle.exec_claude` pattern). HOME / USER are forced via `smolvm -e` because bare runuser inherits root's HOME=/root, which would put --global in /root/.gitconfig instead of /home/node/.gitconfig (where the existing `_provision_git_gate_config` writes).
4 unit tests in test_smolmachines_provision.TestProvisionGitUser: no-op, both-set (asserts runuser prefix + HOME/USER env), name-only, email-only.
661 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
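As a rough illustration of the command shape described in c9cdd41110, the argv for each `git config --global` call could be assembled as below. This is a sketch, not the project's code: `build_git_user_argv` and `git_user_commands` are hypothetical names, and only the `smolvm machine exec` flags quoted in the commit message come from the source.

```python
from typing import List, Optional


def build_git_user_argv(machine: str, field: str, value: str) -> List[str]:
    """Hypothetical helper: argv that sets one git user.<field> as the node user."""
    return [
        "smolvm", "machine", "exec", "--name", machine,
        # Force HOME/USER so --global writes /home/node/.gitconfig, not /root/.gitconfig.
        "-e", "HOME=/home/node",
        "-e", "USER=node",
        "--",
        # Switch UID without starting a login shell.
        "runuser", "-u", "node", "--",
        "git", "config", "--global", f"user.{field}", value,
    ]


def git_user_commands(
    machine: str, name: Optional[str], email: Optional[str]
) -> List[List[str]]:
    """One argv per non-empty field; an empty list is the no-op case."""
    fields = {"name": name, "email": email}
    return [build_git_user_argv(machine, k, v) for k, v in fields.items() if v]
```

Building the argv as a list (rather than a shell string) avoids quoting problems when a user name contains spaces.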
|
|
5e0130b56f |
fix(smolmachines): build agent image in launch, not prepare
When starting a smolmachines agent from the dashboard the docker-build output rendered on top of the curses preflight modal — the build was kicked off before the operator had confirmed launch. The docker backend's `prepare` is pure resolution (no docker calls); smolmachines was inconsistent because `prepare` called `_ensure_smolmachine` which ran `docker build` → `docker save` → `crane push` → `smolvm pack create`, several seconds of stderr noise rendered before the y/N prompt.
Move the pipeline:
- `_ensure_smolmachine` (+ `_SMOLMACHINE_CACHE_DIR` + `_REPO_DIR` + the local-registry / smolvm imports) moves from `backend/smolmachines/prepare.py` to `backend/smolmachines/launch.py`. Called right before `_smolvm.machine_create` so the resulting `.smolmachine` sidecar path lands as a local in `launch`, not on the plan.
- `SmolmachinesBottlePlan.agent_from_path: Path` becomes `agent_image_ref: str`. `prepare` stashes only the docker tag (`$CLAUDE_BOTTLE_IMAGE` || `claude-bottle:latest`); `launch` resolves it into the artifact at bringup.
This puts smolmachines on the same prepare-vs-launch boundary the docker backend uses: the preflight summary in the dashboard prints, the operator confirms, then `launch` runs — and its stderr is routed via `_route_op_to_right_pane` (in tmux) or via `curses.endwin` (foreground handoff) so the build output lands cleanly.
Tests:
- `tests/unit/test_smolmachines_prepare_image.py` → `tests/unit/test_smolmachines_launch_image.py`, updated to import `_ensure_smolmachine` from `launch` rather than `prepare`.
- `test_smolmachines_provision.py`: plan fixture switches `agent_from_path` → `agent_image_ref`.
593 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
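The "`$CLAUDE_BOTTLE_IMAGE` || `claude-bottle:latest`" fallback from 5e0130b56f could look roughly like the following. It is a minimal sketch under the assumption that an empty variable counts as unset; `resolve_agent_image_ref` is a hypothetical name, not the function used in `prepare.py`.

```python
import os
from typing import Mapping

DEFAULT_IMAGE = "claude-bottle:latest"


def resolve_agent_image_ref(env: Mapping[str, str] = os.environ) -> str:
    """Return the docker tag to stash on the plan; no docker calls happen here."""
    # `or` mirrors the shell-style `||`: unset and empty both fall back to the default.
    return env.get("CLAUDE_BOTTLE_IMAGE") or DEFAULT_IMAGE
```

Keeping this step to a pure string lookup is what lets `prepare` stay side-effect free, with the build deferred to `launch`.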
|
|
d02fe50193 |
fix(smolmachines): run claude mcp add as node so config lands in node's home
provision_supervise dispatched `claude mcp add --scope user` through `smolvm machine_exec`, which runs as root by default. The MCP entry got written to root's ~/.claude.json — but the agent's claude reads /home/node/.claude.json, so `/mcp` showed "No MCP servers configured" inside the bottle.
Wrap the exec in `runuser -u node -- env HOME=/home/node ...` so the config writes to the right home. Same pattern as the interactive exec_claude / Bottle.exec wrappers — `smolvm machine_exec` is always root, so any command that touches user state has to switch UID + set HOME explicitly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
5486170be1 |
fix(smolmachines): route agent through egress when routes declared, wait for VM warm-up
Two related bugs:
1. Auth chain bypassed egress. After the Docker-Desktop port pivot, the agent always dialed pipelock directly — meaning egress (which holds the real OAuth token and rewrites the Authorization header) wasn't in the request path. Bearer placeholder reached anthropic verbatim → 401 "Invalid bearer token". Fix: when the bottle declares egress.routes, the agent's first hop is egress (publish egress port 9099 to host loopback, leave pipelock bundle-internal). Without routes, the agent dials pipelock directly. Same hop order as the docker backend.
2. provision_ca's update-ca-certificates SIGKILLed at ~100ms on Docker Desktop. Back-to-back `smolvm machine exec` calls immediately after machine_start hit a VM warm-up race in libkrun's exec channel; the second exec's child got SIGKILL'd before producing more than the first line of stdout. The agent's trust store never got the egress MITM CA's hash symlink, so curl/openssl couldn't validate the TLS chain. Fix: 1.5s sleep after machine_start (empirically enough), plus fold provision_ca's chown + chmod + update-ca-certificates into one `sh -c` so we only pay one exec round trip. Bail with a clear error if update-ca-certificates doesn't report "1 added" (failing silently was how the original SIGKILL went unnoticed).
Net effect on Docker Desktop / macOS: claude's HTTPS_PROXY is `http://127.0.0.1:<egress port>`, egress rewrites auth, pipelock allowlists + DLPs, request reaches api.anthropic.com with a real token. End-to-end verified.
Also drops the PRD-0023-chunk-3 EGRESS_LISTEN_HOST=127.0.0.1 mitigation. The original concern (agent bypassing pipelock by dialing egress's port on the bundle IP) doesn't apply in this topology: the agent can only reach whatever port we publish on host loopback, and egress is the only HTTP/HTTPS chokepoint that gets published.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
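The second fix in 5486170be1 (one `sh -c` round trip, then fail loudly unless the output reports "1 added") might be sketched as follows. The helper names, the `root:root` owner and the `644` mode are assumptions for illustration; the commit message does not state them.

```python
import re
import shlex


def build_ca_install_script(cert_path: str) -> str:
    """Fold chown + chmod + update-ca-certificates into a single shell script."""
    quoted = shlex.quote(cert_path)
    # Owner and mode are illustrative assumptions, not values from the commit.
    return (
        f"chown root:root {quoted} && chmod 644 {quoted} "
        "&& update-ca-certificates"
    )


def require_one_added(output: str) -> None:
    """Raise unless update-ca-certificates reported exactly '1 added'."""
    # The lookbehind keeps '11 added' or '21 added' from matching.
    if not re.search(r"(?<!\d)1 added", output):
        raise RuntimeError(
            "update-ca-certificates did not report '1 added'; output was: "
            + repr(output)
        )
```

The script string would then be passed as the single argument to `sh -c` in one `smolvm machine exec` call, so a truncated or killed run surfaces as an error instead of a silently empty trust store.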
|
|
4f136a9932 |
fix(smolmachines): agent dials bundle via host loopback ports, not docker bridge IP
Claude hung on outbound network calls under CLAUDE_BOTTLE_BACKEND=smolmachines:
Unable to connect to API (FailedToOpenSocket)
Root cause: the PRD-0023 design pinned the bundle at a docker bridge IP (192.168.X.2) and set the smolvm guest's TSI allowlist to `<bundle-ip>/32`. On native Linux this works — host shares the docker bridge's network namespace, TSI's syscall impersonation reaches the bridge IP directly. On Docker Desktop (macOS), the daemon runs in its own Linux VM and docker bridge IPs aren't reachable from macOS networking, so the smolvm guest's TSI requests die "Network is unreachable" before they hit pipelock.
Fix: publish each agent-facing bundle daemon's port on host loopback (-p 127.0.0.1::PORT), discover the random host-side ports after start, and route the agent through `127.0.0.1:<host port>` instead of the bridge IP. macOS loopback is the surface Docker Desktop's gvproxy forwards into the daemon's VM, so the chain (guest TSI -> macOS loopback -> daemon VM port-forward -> bundle container) works on both Docker Desktop and native Linux.
Concrete changes:
- BundleLaunchSpec: add `ports_to_publish` so start_bundle adds `-p 127.0.0.1::PORT` for the agent-facing ports (pipelock always; git-gate when upstreams declared; supervise when enabled). Egress's port stays bundle-internal.
- sidecar_bundle.bundle_host_port(): wrap `docker port <bundle> <container_port>/tcp` so launch can look up the random host-side mapping after start.
- launch.py: discover the host ports, build URLs of the form `http://127.0.0.1:<host port>` / `git://127.0.0.1:<host port>`, stamp onto guest_env + new agent_*_url fields on the plan.
- launch.py: TSI allow_cidrs flips to `["127.0.0.1/32"]`. The bundle IP is no longer the agent's target.
- prepare.py: stop synthesizing HTTPS_PROXY / GIT_GATE_URL / MCP_SUPERVISE_URL at prepare time — launch owns those now (the values depend on a port docker hasn't assigned yet).
- provision_git: gate_host from plan.agent_git_gate_host.
- provision_supervise: URL from plan.agent_supervise_url.
End-to-end verified on Docker Desktop / macOS: guest dials pipelock through TSI, pipelock forwards to api.anthropic.com, the API responds with 401 (i.e. it received the request).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
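The port-discovery step in 4f136a9932 (publish with `-p 127.0.0.1::PORT`, then read the mapping back from `docker port`) can be sketched as a pure parsing function. This is not the project's `bundle_host_port()`; `parse_docker_port_output` and `agent_url` are hypothetical names, and the sketch assumes `docker port` prints one `HOST:PORT` mapping per line.

```python
def parse_docker_port_output(output: str) -> int:
    """Extract the host-side port bound on 127.0.0.1 from `docker port` output."""
    for line in output.splitlines():
        line = line.strip()
        # With `-p 127.0.0.1::PORT` the mapping is expected on IPv4 loopback.
        if line.startswith("127.0.0.1:"):
            return int(line.rsplit(":", 1)[1])
    raise ValueError(f"no 127.0.0.1 mapping in docker port output: {output!r}")


def agent_url(scheme: str, host_port: int) -> str:
    """Build the loopback URL the agent dials, e.g. for HTTPS_PROXY."""
    return f"{scheme}://127.0.0.1:{host_port}"
```

For example, feeding the function the text `127.0.0.1:49153` yields port 49153, which `agent_url("http", 49153)` turns into the value stamped onto the guest environment.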
|
|
ac8c7ba696 |
feat(smolmachines): provision_ca + provision_git + provision_supervise (PRD 0023 chunk 4d)
End-to-end provisioning parity with the docker backend. After this
chunk a smolmachines bottle has a working trust store, git-gate
gitconfig, and supervise MCP registration — same shape as docker,
dispatched via `smolvm machine cp` / `smolvm machine exec` instead
of `docker cp` / `docker exec`.
Adds three new provision modules:
- ca.py: select egress vs pipelock CA (same logic as
docker), machine cp + update-ca-certificates,
log sha256 fingerprint.
- git.py: copy host .git when --cwd was passed; render
~/.gitconfig with insteadOf URLs. URL prefix is
`git://<bundle_ip>:9418/...` (no DNS in the
TSI-allowlisted guest) vs docker's
`git://git-gate/...`.
- supervise.py: `claude mcp add` via machine_exec; URL is
`http://<bundle_ip>:9100/`. Failure is logged but
non-fatal (matches docker).
Shared render: `render_git_gate_gitconfig` moves out of
backend/docker/provision/git.py into the platform-neutral
claude_bottle/git_gate.py (renamed to git_gate_render_gitconfig
for consistency with the existing git_gate_render_* helpers),
parameterized on a `gate_host` argument so both backends use the
same logic with different addresses.
Path/user fixups for the post-chunk-4c agent image (real
claude-bottle image, USER node, $HOME=/home/node):
- prompt.py default path moves from /root/... to
/home/node/.claude-bottle-prompt.txt; chown + chmod after
machine cp.
- skills.py default skills dir moves from /root/.claude/skills to
/home/node/.claude/skills; chown -R per skill.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
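To illustrate the `gate_host` parameterization described in ac8c7ba696, a gitconfig renderer shared by both backends might look like the sketch below. It is not the project's `git_gate_render_gitconfig`: the function name, the `upstreams` mapping shape, and the repository naming are assumptions; only the `git://<host>/...` prefix and the use of `insteadOf` come from the commit message.

```python
from typing import Mapping


def render_gitconfig(gate_host: str, upstreams: Mapping[str, str]) -> str:
    """Render [url ...] insteadOf sections pointing each upstream at the gate.

    gate_host: e.g. "git-gate" (docker) or "<bundle_ip>:9418" (smolmachines).
    upstreams: hypothetical mapping of gate repo name -> original remote URL.
    """
    sections = []
    for repo_name, original_url in sorted(upstreams.items()):
        sections.append(
            f'[url "git://{gate_host}/{repo_name}"]\n'
            f"\tinsteadOf = {original_url}\n"
        )
    return "".join(sections)
```

With this shape the two backends differ only in the `gate_host` argument they pass, which matches the intent of moving the renderer into a platform-neutral module.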
|
|
1dfc359141 |
feat(smolmachines): thread inner Plans + bundle daemons run (PRD 0023 chunk 4b)
Bundle daemons (pipelock, egress, optionally git-gate + supervise)
now actually start with their config files bind-mounted from the
inner Plans the docker backend already produces. Chunks 2d + 3
ran with daemons_csv="" so the bundle's init supervisor idled;
chunk 4b wires up the real path: agent → pipelock → egress →
internet (when routes declared) is now functional, modulo agent-
image gaps (claude-code / TLS-trust-store / git in the guest)
that chunk 4c addresses.
bottle_plan.py — added the four inner Plan fields:
proxy_plan: PipelockProxyPlan
git_gate_plan: GitGatePlan
egress_plan: EgressPlan
supervise_plan: SupervisePlan | None
Same shape the docker backend's plan uses. Docker-network-only
fields (internal_network, egress_network) stay at dataclass
defaults — the smolmachines bundle is on a per-bottle bridge
with a pinned IP, not docker's --internal + egress topology.
prepare.py — instantiates DockerPipelockProxy / DockerEgress /
DockerGitGate / DockerSupervise and calls their .prepare()
methods to write the per-bottle config files (pipelock.yaml,
routes.yaml, git-gate entrypoint/hooks, supervise queue dir)
under the per-bottle state dir. (The "Docker" prefix on the
class names is a misnomer here — .prepare() is platform-neutral,
inherited from each sidecar's ABC. A future cleanup could factor
the prepare logic out of the docker subpackage.)
launch.py — major rewrite:
- pipelock_tls_init at launch (always); egress_tls_init only
when the bottle declares routes (otherwise the CA files
aren't bind-mounted and openssl runs would be wasted).
- Inner Plans updated in place with launch-time CA paths +
EGRESS_UPSTREAM_PROXY = http://127.0.0.1:8888 (egress's
upstream is pipelock on the bundle's own loopback; same
container's network namespace).
- BundleLaunchSpec env + volumes built from the inner Plans:
pipelock.yaml + CA + key (always); egress routes + CAs +
upstream env + token-slot bare names (when routes); git-gate
entrypoint + hooks + per-upstream identity files (when
upstreams); supervise queue dir + env (when enabled).
- daemons_csv = ["egress", "pipelock"] + ["git-gate"] (if
upstreams) + ["supervise"] (if enabled).
- Token env values resolved from host env via
`egress_resolve_token_values` and threaded into the
docker-run subprocess env (bare-name -e entries in spec
inherit from there — values never land on argv).
Tests:
- 552 unit passing (no new unit cases; fixture updated to
populate the new plan fields).
- 5 integration cases passing locally (Darwin + smolvm + docker
+ not GITEA_ACTIONS):
* test_smoke_exec_echo — still works.
* test_localhost_reach_probe — host loopback still refused.
* test_egress_port_bypass_probe — <bundle-ip>:9099 still
refused, NOW WITH EGRESS ACTUALLY RUNNING (chunk 3's
127.0.0.1 bind-address is doing its job).
* test_prompt_file_lands_in_guest — still works.
* test_pipelock_answers_on_bundle_ip — NEW. From inside the
guest, wget to <bundle-ip>:8888 gets an HTTP response
(not "connection refused") — proves pipelock is actually
listening and the bind-mount + CA generation path works.
What's left in chunk 4:
- 4c: agent-image-conversion (claude-code + git + curl +
ca-certificates in the guest). Chunk 2d's alpine placeholder
stays for now.
- 4d: provision_ca + provision_git + provision_supervise once
the agent image has the required tools.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
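The daemon selection rule from 1dfc359141 (egress and pipelock always, git-gate and supervise conditionally) can be expressed as a small sketch. `bundle_daemons` and its boolean parameters are hypothetical; the real code derives these conditions from the inner Plans.

```python
from typing import List


def bundle_daemons(has_upstreams: bool, supervise_enabled: bool) -> List[str]:
    """Return the daemons the bundle's init supervisor should start, in order."""
    daemons = ["egress", "pipelock"]  # always started
    if has_upstreams:
        daemons.append("git-gate")
    if supervise_enabled:
        daemons.append("supervise")
    return daemons


def daemons_csv(has_upstreams: bool, supervise_enabled: bool) -> str:
    """Comma-joined form, assuming the bundle takes the list as a CSV string."""
    return ",".join(bundle_daemons(has_upstreams, supervise_enabled))
```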
|
|
9e3b7e441e |
feat(smolmachines): provision_prompt + provision_skills (PRD 0023 chunk 4a)
First slice of chunk 4: implement the two provisioning methods
that don't depend on agent-image tooling beyond `cp` and
`mkdir`. provision_ca / provision_git / provision_supervise
land once the agent-image gap is solved (chunk 4b+) — they need
update-ca-certificates, git, and the claude binary respectively,
none of which the chunk-2d alpine placeholder provides.
What this PR ships:
- `claude_bottle/backend/smolmachines/provision/` subpackage
with `prompt.py` + `skills.py`. Each routes through
`smolvm.machine_cp` / `machine_exec`. provision_prompt mirrors
the docker contract (file always copied; return value drives
--append-system-prompt-file iff the agent has a non-empty
prompt). provision_skills mkdir + cp per skill, matching
the docker backend's loop.
- prepare.py now writes the prompt file under
agent_state_dir(slug) with the agent's `prompt` body, mode
0o600. The in-guest path is `/root/.claude-bottle-prompt.txt`
(alpine has no `node` user; will become `/home/node/...` once
the real claude-bottle image lands).
- launch.py calls `provision(plan, machine_name)` after
machine_start. The returned prompt path threads to
SmolmachinesBottle so exec_claude can add
--append-system-prompt-file when the agent has a prompt.
- backend.py: provision_prompt / provision_skills now real;
provision_git is a deliberate stub (waiting on the git-gate
inner Plan + git in the agent image). provision_supervise
stays the chunk-2d stub.
Tests:
- 7 new unit cases (test_smolmachines_provision.py): argv
shape (mocked smolvm.machine_cp / .machine_exec),
prompt return-value contract, no-op-with-no-skills,
CLAUDE_BOTTLE_GUEST_SKILLS_DIR override, fail-on-missing-skill.
- 1 new integration case in test_smolmachines_launch.py:
end-to-end verification that the prompt file lands in the
alpine guest at /root/.claude-bottle-prompt.txt with the
expected content (via `bottle.exec("cat ...")`). The smoke +
the two TSI probes stay green.
552 unit + 4 integration (Darwin+smolvm+docker gated) passing.
What's left in chunk 4:
- 4b: thread the inner Plans (PipelockProxyPlan / EgressPlan /
GitGatePlan / SupervisePlan) through prepare + launch so the
bundle daemons actually run (currently daemons_csv="").
- 4c: the agent-image-conversion gap — get claude-code + git +
curl + ca-certificates into the guest image (build a
.smolmachine via `pack create --from-vm` after manual setup,
or push the docker image to a registry smolvm can pull).
- 4d: provision_ca + provision_git + provision_supervise once
4b + 4c land.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
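For the prompt-file step in 9e3b7e441e (write the agent's `prompt` body under the per-agent state dir with mode 0o600), one possible sketch is shown below. `write_prompt_file` and the file name are hypothetical, and the sketch assumes a POSIX host; it is not the code in `prepare.py`.

```python
import os
from pathlib import Path


def write_prompt_file(state_dir: Path, prompt: str) -> Path:
    """Write the prompt body so only the owning user can read it."""
    state_dir.mkdir(parents=True, exist_ok=True)
    path = state_dir / "prompt.txt"  # hypothetical file name
    # Create with 0o600 from the start so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.fchmod(fd, 0o600)  # enforce the mode even if the file already existed
        os.write(fd, prompt.encode("utf-8"))
    finally:
        os.close(fd)
    return path
```

The file is written even when the prompt is empty, which is consistent with the "file always copied" contract the commit describes for `provision_prompt`.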