Add docs/decisions/ with a convention README and back-fill two
decisions that previously had no in-repo home: merging PRs with
rebase (ADR 0001) and the agent-identity claimed-not-vouched trust
posture from PRD 0027 (ADR 0002). Point docs/INDEX.md at it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Analyze tracking feature requests in Gitea against the project's
in-repo PRDs/research notes, given the goal of keeping decision
history portable and not provider-locked. Recommends demoting issues
to an ephemeral inbox and reifying durable rationale into the repo.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
We use Gitea, not an abstract forge. Reword the pre-existing research
and PRD docs: the generic "Forge-API gate"/"forge tokens" become
"Git-host-API gate"/"Git-host tokens" (the gate still spans Gitea /
GitHub / GitLab), "Git/forge history" -> "Git/Gitea history", and the
KNOWN_FORGE_HOSTS / forge: manifest-field examples -> KNOWN_GIT_HOSTS
/ git_host:. Meaning preserved; only the word "forge" is dropped.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Flip Status: Draft -> Active for the 23 PRDs whose work has shipped to
main (including 0027, now that PR #95 has merged). Leaves the
terminal-status PRDs unchanged: 0007 and 0010 (Superseded) and 0014
(Retargeted) were replaced, not shipped as-is.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
README manifest section documents the agent git.user overlay, the
bottle-only git.remotes boundary, and the claimed-not-vouched trust
note. Collapses the example: implementer carries its own identity
against the shared dev bottle instead of an identity-only bottle.
Refs #94
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Agents may declare git.user (name/email); it overlays the referenced
bottle's git.user per-field at Manifest.bottle_for (agent wins on
non-empty), mirroring the extends: merge. git.remotes is rejected on
agents — it carries credentials and host trust and stays bottle-only.
The overlay lives at bottle_for, the single chokepoint both backends
use, so the docker/smolmachines git provisioners are unchanged. Adds
Manifest.git_identity_summary with per-field (agent)/(bottle)
provenance, printed in both preflights and `info`.
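
The per-field overlay described above can be sketched as follows. This is a minimal illustration, not the code in `Manifest.bottle_for`: the `GitUser` shape and the `overlay_git_user` helper name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GitUser:
    # Hypothetical shape: an empty string means "field not declared".
    name: str = ""
    email: str = ""


def overlay_git_user(bottle: GitUser, agent: GitUser) -> GitUser:
    """Return the bottle identity with the agent's non-empty fields applied."""
    return GitUser(
        name=agent.name or bottle.name,      # agent wins when non-empty
        email=agent.email or bottle.email,
    )
```

An agent that declares only an email keeps the bottle's name, which is the same behavior the `extends:` merge gives a child bottle.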
Refs #94
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Lift git.user (name/email) to the agent layer with a per-field
overlay onto the referenced bottle, mirroring the extends: merge.
git.remotes stays bottle-only. Includes identity provenance in
preflight/info and an example collapse.
Refs #94
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Delete CLAUDE.md in favor of AGENTS.md as the orientation doc, rebrand
the project from Codex-bottle to provider-agnostic bot-bottle, and
repoint every CLAUDE.md reference across PRDs, research notes, the
implementer agent example, and the yaml_subset comment.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The two Debian-family CA-layout constants lived in
docker/provision/ca.py, which forced the smolmachines backend to
import them cross-backend (smolmachines -> docker). Move them into
the shared backend/util.py next to select_ca_cert; docker, compose,
and smolmachines now all import from there. No behavior change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Both backends' provision_ca duplicated _select_ca_cert and the
SHA-256 fingerprint computation verbatim. Lift them into the shared
backend/util.py as select_ca_cert + log_ca_fingerprint; docker and
smolmachines now call the shared helpers. No behavior change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add an optional `extends: <bottle-name>` field to bottle
frontmatter. Two-pass load:
1. Collect raw frontmatter for every bottle file.
2. Recursively resolve each name into a merged Bottle via
`_resolve_one_bottle` + `_merge_bottles`.
Merge rules (per PRD 0025):
- env: dict merge, child wins on key collision
- git: full replace if child declares `git:`
- git_user: per-field overlay (child's non-empty fields win)
- egress: full replace if child declares `egress:`
- supervise: full replace if child declares `supervise:`
List-valued fields full-replace because partial merge is
ambiguous (ordering matters, name collisions ambiguous); env is
dict-merge because dict-keyed override is the natural shape.
git_user overlays per-field so a parent can declare just the
name and a child can add just the email.
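
A rough sketch of these merge rules over plain dicts, for illustration only. The real `_merge_bottles` works on Bottle objects; the `merge_bottles` name and the dict representation here are assumptions.

```python
from typing import Any

# Fields replaced wholesale whenever the child declares them.
_REPLACE_FIELDS = ("git", "egress", "supervise")


def merge_bottles(parent: dict[str, Any], child: dict[str, Any]) -> dict[str, Any]:
    merged = dict(parent)
    # env: dict merge, child wins on key collision.
    merged["env"] = {**parent.get("env", {}), **child.get("env", {})}
    # git / egress / supervise: full replace when the key is present,
    # so an explicit empty value clears the parent's value.
    for field in _REPLACE_FIELDS:
        if field in child:
            merged[field] = child[field]
    # git_user: per-field overlay, child's non-empty values win.
    user = dict(parent.get("git_user", {}))
    user.update({k: v for k, v in child.get("git_user", {}).items() if v})
    merged["git_user"] = user
    return merged
```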
Cycles / self-extends / missing-parent / non-string `extends:`
all die at parse time with an error that includes the chain (cycles)
or the available names (missing parent). Resolution is cached
per-name so a diamond reference graph doesn't reparse the same
parent N times.
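
The resolution pass could look roughly like the sketch below. It returns the ancestry chain as a stand-in for the merged Bottle; `resolve_chain` and its signature are hypothetical, not the signature of `_resolve_one_bottle`.

```python
def resolve_chain(
    name: str,
    raws: dict[str, dict],
    cache: dict[str, list[str]],
    stack: tuple[str, ...] = (),
) -> list[str]:
    """Return the ancestry of `name`, root first."""
    if name in cache:
        return cache[name]  # diamond graphs reuse the earlier result
    if name in stack:
        cycle = " -> ".join(stack[stack.index(name):] + (name,))
        raise ValueError(f"extends cycle: {cycle}")
    if name not in raws:
        raise ValueError(f"unknown bottle {name!r}; available: {sorted(raws)}")
    parent = raws[name].get("extends")
    if parent is None:
        result = [name]
    elif not isinstance(parent, str):
        raise ValueError(f"{name}: extends must be a string")
    else:
        result = resolve_chain(parent, raws, cache, stack + (name,)) + [name]
    cache[name] = result
    return result
```

A self-extend is caught by the same stack check as longer cycles, so it needs no special case.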
Both load paths threaded:
- `_load_bottles_from_dir` (md files) — collect raws, then
resolve.
- `Manifest.from_json_obj` (JSON / test fixtures) — same.
Tests (24, in `test_manifest_extends.py`):
- Leaf without extends parses unchanged
- Child inherits parent unchanged when child only declares
`extends:`
- env: disjoint union, collision (child wins), child-omits
- git: replace, omit, explicit-empty-clears-parent
- egress: same shape (replace, inherit)
- git_user: parent-only, child-overrides-both, partial fields
- 3-step chain (grandparent → parent → child)
- Errors: missing parent, self-extends, 2-node cycle, 3-node
cycle, non-string extends
685 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add a `git_user:` block to the example bottle frontmatter with a
one-paragraph note on what it does and that either field can be
set independently. Other doc surfaces (manifest module docstring,
provisioner module docstrings) were updated alongside the
implementation commits.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mirror the docker backend's third provisioning subcase in
`backend/smolmachines/provision/git.py`:
_provision_git_user(plan, target)
Runs `smolvm machine exec --name <M> -e HOME=/home/node -e
USER=node -- runuser -u node -- git config --global user.<X>
<value>` for each git_user field. No-op when
`git_user.is_empty()`.
`runuser -u node --` switches the UID without invoking a login
shell (matching the existing `Bottle.exec_claude` pattern).
HOME / USER are forced via `smolvm -e` because bare runuser
inherits root's HOME=/root, which would put --global in
/root/.gitconfig instead of /home/node/.gitconfig (where the
existing `_provision_git_gate_config` writes).
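
For illustration, the argv shape described above might be built like this. It is a sketch that only constructs the command lists; the `git_user_argvs` helper is hypothetical and nothing is executed.

```python
def git_user_argvs(machine: str, name: str = "", email: str = "") -> list[list[str]]:
    """Build one `git config --global` argv per non-empty git_user field."""
    prefix = [
        "smolvm", "machine", "exec", "--name", machine,
        # Force HOME/USER so --global targets /home/node/.gitconfig.
        "-e", "HOME=/home/node", "-e", "USER=node",
        # runuser switches UID without invoking a login shell.
        "--", "runuser", "-u", "node", "--",
    ]
    argvs = []
    for key, value in (("user.name", name), ("user.email", email)):
        if value:  # empty field: nothing to provision
            argvs.append(prefix + ["git", "config", "--global", key, value])
    return argvs
```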
4 unit tests in test_smolmachines_provision.TestProvisionGitUser:
no-op, both-set (asserts runuser prefix + HOME/USER env),
name-only, email-only. 661 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add a third provisioning subcase to
`backend/docker/provision/git.py`:
_provision_git_user(plan, target)
Runs `docker exec -u node <container> git config --global
user.{name,email} <value>` for each field the bottle's
`git_user` declares. No-op when `git_user.is_empty()`.
`-u node` so `--global` lands in /home/node/.gitconfig (matching
the existing `_provision_git_gate_config` write location, so
agent-side `git` reads both configs from the same dotfile).
Name and email apply independently — a bottle declaring only
name runs just the user.name line, etc.
4 unit tests in `test_docker_provision_git_user.py`: no-op,
both-set, name-only, email-only. 657 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add a per-bottle `git config --global user.name` / `user.email`
pair so the agent's commits inside the bottle land with a known
identity rather than the agent image's default (no user, or
whatever the image dropped in).
Schema:
    git_user:
      name: "Eric Bauerfeld"
      email: "eric+claude@dideric.is"
Either field can be set independently — name-only / email-only
configs are valid and apply just the field that's set. An
explicit `git_user:` block with both fields empty dies at parse
time rather than silently no-op'ing; an omitted block is the
no-op path (default GitUser is empty, provisioner skips).
Parse-time validation:
- Unknown sub-keys die (e.g., typo of `username`).
- Non-string name/email dies.
- Both-empty dies (half-finished edit hint).
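
The three validation rules can be sketched as below. This is an assumption-laden illustration: the real parser's function name, error type, and messages are not shown in this log, so `parse_git_user` and `ValueError` are stand-ins.

```python
def parse_git_user(raw: object) -> dict[str, str]:
    """Validate an explicit git_user block and return its fields."""
    if not isinstance(raw, dict):
        raise ValueError("git_user must be a mapping")
    unknown = set(raw) - {"name", "email"}
    if unknown:
        # Catches typos such as `username`.
        raise ValueError(f"git_user: unknown keys {sorted(unknown)}")
    parsed = {}
    for key in ("name", "email"):
        value = raw.get(key, "")
        if not isinstance(value, str):
            raise ValueError(f"git_user.{key} must be a string")
        parsed[key] = value
    if not any(parsed.values()):
        # An explicit block with nothing in it looks like a half-finished edit.
        raise ValueError("git_user declared but both name and email are empty")
    return parsed
```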
11 unit tests in `test_manifest_git_user.py`; 653 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pipelock was rejecting legitimate, credential-injected egress
traffic with a 403 'blocked: request header contains secret'. The
chain is `agent → egress → pipelock → internet`: egress injects
`Authorization: Bearer <token>` for routes with an `auth_scheme`,
then forwards upstream to pipelock. Pipelock has `scan_env:
true` + `scan_headers: true` + `header_mode: all`, and the
bundle supervisor spawned every daemon (egress, pipelock,
git-gate, supervise) inheriting the bundle container's full env
— including the `EGRESS_TOKEN_<n>` slots set via
`docker run -e`. So pipelock had the token value egress
injected sitting in its own env, matched it in the request
headers, and blocked.
The agent itself runs in a different machine and never sees
`EGRESS_TOKEN_*`, so stripping these from non-egress daemons'
env loses no DLP coverage — pipelock can't catch the exfil of
a value the agent doesn't have in the first place.
New helper `_env_for_daemon(name, base_env)` returns the
unchanged base for `egress` and a copy with `EGRESS_TOKEN_*`
filtered for everyone else. `_spawn` now passes the scoped env
to `subprocess.Popen`. Prefix-based filter (not exact-match) so
future egress-only env slots don't have to update this code.
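
A minimal sketch of the scoping rule, assuming the behavior described above. The body of the real `_env_for_daemon` is not reproduced here; this version always returns a fresh dict.

```python
_EGRESS_ONLY_PREFIX = "EGRESS_TOKEN_"


def env_for_daemon(name: str, base_env: dict[str, str]) -> dict[str, str]:
    """Return the env a daemon should be spawned with."""
    if name == "egress":
        # egress needs the tokens to inject Authorization headers.
        return dict(base_env)
    # Everyone else gets a copy without the egress-only slots.
    return {k: v for k, v in base_env.items() if not k.startswith(_EGRESS_ONLY_PREFIX)}
```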
Tests:
- `TestEnvForDaemon`: egress gets full env, pipelock /
git-gate / supervise lose `EGRESS_TOKEN_0` + `EGRESS_TOKEN_1`
but keep `PATH`, `EGRESS_UPSTREAM_PROXY`, `SUPERVISE_PORT`.
- Independent-dict invariant locked so callers can't
accidentally mutate the supervisor's env.
642 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- bottle.py:_PTY_RESIZE_SCRIPT docstring: strip the speculative
cwd-dependence explanation. The real reason to use absolute
path is just that the wrapper is self-contained; the original
rationale (tmux pane cwd) was a hypothesis we never confirmed
and wasn't load-bearing once we found the libkrun race.
- pty_resize.py:main: drop the long comment duplicating
`_STARTUP_SYNC_DELAY_SEC`'s docstring. Keep a one-liner
pointing at the constant + the operational note about
daemon=True.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The b9853ae stdin=DEVNULL fix wasn't sufficient. End-to-end
testing against a live VM in tmux revealed a second crash path:
libkrun spits "load `config.json`: parse error: trailing
garbage { "ociVersion": "1.0.2", ... }" and the main exec
dies (rc=1 or SIGKILL/rc=137, depending on race scheduling).
Root cause: each `smolvm machine exec` writes a per-invocation
OCI config.json to the same smolvm state dir during its bringup.
The wrapper's startup sync() fires within 1ms of Popen-ing the
main exec — both invocations write config.json concurrently,
libkrun loads one mid-write, and gets garbage. Trivial inner
commands (`sh -c "echo hi"`) finished before the overlap
mattered, masking the race in earlier tests. claude's slower
startup hits the race every time, and only inside tmux because
the outside-tmux foreground-handoff path takes a different
bringup sequence that happens to dodge the window.
Fix: schedule the initial sync on a 2-second `threading.Timer`
instead of calling it synchronously. By 2s the main exec is
past its bringup window, so the side-channel's config.json
write doesn't collide. Daemon thread so the timer doesn't
block exit when the child finishes quickly.
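
The deferred sync can be sketched with the standard library alone. The constant and function names below are illustrative, not the wrapper's actual identifiers.

```python
import threading

# Assumed value, taken from the 2-second delay described above.
STARTUP_SYNC_DELAY_SEC = 2.0


def schedule_initial_sync(push_size, delay: float = STARTUP_SYNC_DELAY_SEC) -> threading.Timer:
    """Run push_size() once after `delay` seconds instead of synchronously."""
    timer = threading.Timer(delay, push_size)
    # Daemon thread: a pending timer must not keep the wrapper alive
    # when the child exits before the delay elapses.
    timer.daemon = True
    timer.start()
    return timer
```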
Trade-off: the in-VM PTY uses smolvm's default size for the
first ~2s, then snaps to the host pane size when the timer
fires. Verified end-to-end against a live VM in tmux: claude
renders at the default size during bringup, then redraws at
full pane width once the deferred sync lands. Operator-driven
resizes (SIGWINCH) still bridge in real time via the
already-installed signal handler.
Also drop the diagnostic log added in 9c83ea6 — we have the
fix.
Regression test:
`TestStartupSyncDeferred.test_main_schedules_timer_does_not_call_sync_synchronously`
mocks Popen + Timer + _push_size and
asserts `main()` schedules the timer with the documented
delay constant and never invokes _push_size synchronously.
Catches a "let's just inline the sync() call" regression
immediately.
638 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
User reports the launch still crashes in tmux after b9853ae's
stdin=DEVNULL fix. Re-instrument to capture the next failure mode
(argv, ppid, sync size, child exit, Popen tracebacks).
Removable once the inside-tmux launch is confirmed stable.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Inside tmux the dashboard's smolmachines launch crashed within
~100ms of the wrapper Popen-ing the main smolvm exec child —
sometimes with rc=137 (SIGKILL), sometimes with smolvm
spitting a runc-style "load `config.json`: cannot parse the
data: parse error: trailing garbage" and exiting 1. The same
wrapper ran fine outside tmux. Diagnostic logs showed the
SIGKILL landed ~100ms after the wrapper kicked off its
initial `sync()` (which fires the side-channel smolvm exec).
Root cause: the side-channel `subprocess.run([smolvm, machine,
exec, --, sh, -c, ...])` did not specify `stdin=`, so it
inherited the wrapper's stdin — the tmux pane PTY. The main
smolvm child (the agent session) also had that PTY as stdin.
Two concurrent smolvm processes sharing the PTY's
foreground-process-group / input plumbing caused smolvm to
abort one of them. iTerm's PTY plumbing apparently tolerated
this; tmux's didn't.
Fix is one line in `_push_size`: `stdin=subprocess.DEVNULL`.
The side-channel never needs stdin — it runs a fire-and-forget
`stty` and exits. Verified end-to-end: pre-fix the wrapper
crashed under `tmux respawn-pane` against a live VM; post-fix
the same invocation completes cleanly.
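
A sketch of the side-channel call with the fix applied. The `run` parameter is an assumption added so the example can be exercised without smolvm installed, and the argv follows the `--name <M>` form used elsewhere in this log.

```python
import subprocess


def push_size(machine: str, cols: int, rows: int, run=subprocess.run):
    """Resize every in-VM PTY via a fire-and-forget side-channel exec."""
    script = f"for f in /dev/pts/*; do stty -F $f cols {cols} rows {rows}; done"
    return run(
        ["smolvm", "machine", "exec", "--name", machine, "--", "sh", "-c", script],
        # Never inherit the wrapper's stdin: the main smolvm child owns that PTY.
        stdin=subprocess.DEVNULL,
        check=False,
    )
```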
Also drop the diagnostic log added in 37bd11b — we have the
fix.
Regression test:
`test_side_channel_uses_devnull_stdin` locks the
`stdin=DEVNULL` invariant so a future "let's simplify the
subprocess.run kwargs" refactor surfaces this immediately.
637 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
User reports launch crashing only inside tmux (works outside).
The wrapper itself runs fine in standalone tmux repros, so the
break is in some interaction we can't see — curses eats stderr,
default tmux remain-on-exit is off, and the pane closes before
the operator can read anything.
Add an always-on per-pid log at ~/.claude-bottle/pty_resize.log:
- start record: argv, cwd, PATH, TMUX status
- sync record: window size observed
- child pid + exit rc
- any KeyboardInterrupt forwarding
- Popen failure traceback if it dies
Append-mode, small overhead, easy to grep + share.
Removable (along with the wrapper itself) once smolvm forwards
SIGWINCH natively.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The dashboard's launch path crashed inside tmux but worked
outside it. Root cause: `python -m
claude_bottle.backend.smolmachines.pty_resize` needs the
`claude_bottle` package on `sys.path`, which by default comes
from cwd. The outside-tmux path is `subprocess.run(...)` —
inherits the dashboard process's cwd (the repo root, where
`claude_bottle/` lives), so the import resolves. The
inside-tmux path is `tmux split-window / respawn-pane <argv>`,
and tmux opens the new pane with the pane's OWN cwd, not the
cwd of the process invoking split-window. If the operator
started their tmux pane anywhere outside the repo (typical:
`$HOME`), the wrapper hit `ModuleNotFoundError: No module
named 'claude_bottle'` and tmux closed the pane immediately.
Sidestep the cwd dependence by invoking the wrapper as
`python <absolute-path-to-pty_resize.py>` instead of
`python -m <dotted-path>`. The wrapper has no
`claude_bottle.*` imports — it's stdlib-only — so it runs as
a standalone script anywhere on the filesystem. The absolute
path comes from `pty_resize.__file__` at module-load time.
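
Roughly, the prefix changes from a module invocation to a script invocation. The `wrapper_prefix` helper below is hypothetical and takes the script path as a parameter; the real code derives it from `pty_resize.__file__`.

```python
import sys
from pathlib import Path


def wrapper_prefix(script: Path, machine: str) -> list[str]:
    """Build the wrapper argv prefix using an absolute script path."""
    # An absolute path makes the invocation independent of the pane's cwd,
    # unlike `python -m <dotted-path>`, which needs the package on sys.path.
    return [sys.executable, str(script.resolve()), machine, "--"]
```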
Tests:
- `test_pty_resize_wrapper_prefix`: updated to assert the
absolute-script-path shape rather than the `-m <dotted>`
shape.
- `test_no_wrapper_when_tty_false`: the substring check now
uses `any("pty_resize" in a for a in argv)` instead of
string-joining (so the absolute path's "pty_resize.py"
filename match still catches a regression).
636 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`smolvm machine exec -t` (smolvm 0.8.0) allocates an in-VM PTY but never
forwards the host terminal's window size — the PTY starts at
`0 0` and host resizes (tmux pane resize, terminal window
resize) go unnoticed, so the claude TUI inside a smolmachines
bottle renders for whatever tiny box it last saw and ignores
operator resizes. `docker exec -it` propagates window-size
changes automatically; smolvm doesn't.
Workaround: a small Python wrapper
(`backend/smolmachines/pty_resize.py`) that interposes between
the operator's terminal and `smolvm machine exec`. It spawns
smolvm as a child, traps host SIGWINCH, and on every resize
(plus once at startup) runs a side-channel
`smolvm machine exec --name <M> -- sh -c 'for f in /dev/pts/*;
do stty -F $f cols X rows Y; done'`. The kernel delivers
SIGWINCH to the in-VM foreground process group when the slave
PTY's size changes, so claude picks up the new dimensions
without extra signalling.
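
The host side of the bridge can be sketched as below: read the window size from the first descriptor that reports a usable one, then push it on every SIGWINCH. Names and signatures are illustrative, and `push_size` is assumed to be a callable taking `(cols, rows)`.

```python
import os
import signal


def read_winsize(fds=(0, 1, 2), get_size=os.get_terminal_size):
    """Return (cols, rows) from the first fd with a non-zero size, else None."""
    for fd in fds:
        try:
            size = get_size(fd)
        except OSError:
            continue  # this fd is not a terminal
        if size.columns > 0 and size.lines > 0:
            return size.columns, size.lines
    return None  # every fd was missing or reported `0 0`


def install_resize_bridge(push_size):
    """Call push_size(cols, rows) on every host SIGWINCH (Unix only)."""
    def on_winch(signum, frame):
        size = read_winsize()
        if size is not None:
            push_size(*size)
    signal.signal(signal.SIGWINCH, on_winch)
```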
`SmolmachinesBottle.claude_argv` prepends
`[sys.executable, -m, claude_bottle.backend.smolmachines.pty_resize,
<machine>, --, ...]` to the existing smolvm argv
in TTY mode. Non-TTY mode (provisioning shell-outs) skips the
wrapper — no PTY to resize.
The wrapper survives the dashboard's
`_build_resume_argv_with_fallback` shell-wrap because the
split-at-`claude` token still finds the right position — the
wrapper's prefix wraps the entire smolvm-exec framing.
Tests:
- `test_smolmachines_pty_resize.py` (new): argv parsing, the
side-channel command shape (cols/rows / for-loop over
/dev/pts/*), and `_read_winsize`'s fallback across
stdin/stdout/stderr, including the ironic case where the
smolvm-allocated PTY reports `0 0`.
- `test_smolmachines_bottle.py`: updated TTY-mode assertions
to unwrap the pty_resize prefix; added `TestClaudeArgvNoTTY`
to lock the non-TTY skip.
636 unit tests pass.
Removable when smolvm grows native SIGWINCH forwarding.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`./cli.py cleanup` previously called only the env-var-selected
backend's `prepare_cleanup` / `cleanup` — so a leftover smolvm
machine + bundle container + bundle network from a crashed
smolmachines bottle would survive a default `docker`-mode cleanup
indefinitely.
Smolmachines now has a real `cleanup` module (alongside
`enumerate.py` from issue #77) that walks:
- smolvm machines named `claude-bottle-*` (via
`smolvm machine ls --json`)
- bundle containers `claude-bottle-sidecars-*`
- bundle networks `claude-bottle-bundle-*`
Cleanup runs stop+delete on the machines, force-rm on the
containers, network rm on the networks. Each step is best-effort
so a failed rm doesn't block the rest.
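
The best-effort behavior amounts to running every step and collecting failures rather than raising on the first one. A sketch, with a hypothetical `run_best_effort` helper; the exact commands the real cleanup issues are not reproduced here.

```python
import subprocess


def run_best_effort(commands, run=subprocess.run):
    """Run every command; return (argv, reason) for each one that failed."""
    failures = []
    for argv in commands:
        try:
            result = run(argv, check=False, capture_output=True)
        except OSError as exc:
            # e.g. the binary is not installed; keep going.
            failures.append((argv, str(exc)))
            continue
        if result.returncode != 0:
            failures.append((argv, f"exit {result.returncode}"))
    return failures
```

For example, a failed `docker rm -f <container>` would be reported in the returned list while the later `docker network rm <network>` still runs.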
`cli.py cleanup` walks every backend in `known_backend_names()`
and runs each backend's `cleanup` after a single y/N prompt that
shows a combined plan.
State dirs (`~/.claude-bottle/state/<slug>/`) share their layout
with the docker backend, which still owns the orphan-state-dir
bucket. The docker backend's orphan check now consults
`enumerate_active_bottles()` for the cross-backend live identity
set so a running smolmachines bottle's state dir isn't reaped
during a cleanup.
Tests: smolmachines cleanup (prepare + cleanup ordering + failure
handling); cross-backend orphan protection on the docker
state-dir check; CLI cmd_cleanup walks both backends,
short-circuits on all-empty, aborts on N. 617 unit tests pass.
End-to-end verified on this host:
$ smolvm machine ls --json | jq '.[].name'
"claude-bottle-researcher-m3hxd"
$ ./cli.py cleanup
--- smolmachines backend ---
smolvm machine: claude-bottle-researcher-m3hxd
remove all of the above? [y/N]
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Launching a smolmachines agent from the dashboard inside tmux
crashed with
AttributeError: 'SmolmachinesBottle' object has no attribute
'claude_docker_argv'
because the tmux pane-respawn path called
`bottle.claude_docker_argv(...)` directly — a method that only
existed on DockerBottle. The foreground-handoff path (curses
endwin → subprocess.run → restore) doesn't hit it; it goes
through `bottle.exec_claude` which is on the ABC.
- Move the argv builder onto the `Bottle` ABC as
`claude_argv(argv, *, tty=True) -> list[str]`. Both backends
implement it; both `exec_claude` impls collapse to
`subprocess.run(self.claude_argv(argv, tty=tty), check=False)`.
- DockerBottle: rename `claude_docker_argv` → `claude_argv`,
body unchanged.
- SmolmachinesBottle: extract the argv-building from
`exec_claude` into `claude_argv`; the new method returns the
full `smolvm machine exec --name … -- runuser -u node --
claude …` argv. The `runuser` switch lives on the
exec-framing prefix so the dashboard's
`_build_resume_argv_with_fallback` split-at-"claude" trick
keeps the UID switch when wrapping the claude tail in
`sh -c "… --continue || …"`.
- Dashboard: drop the docker-specific wording — local + helper
arg names `docker_argv` → `claude_argv`; docstrings on
`_build_resume_argv_with_fallback`, `_build_split_pane_argv`,
`_build_respawn_pane_argv` now say "backend-exec argv". The
shell-fallback wrap is unchanged; the existing logic works
for smolmachines because `claude` is still the marker token.
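
The shape of the refactor, as a simplified sketch. Here `exec_claude` lives on the base class for brevity, whereas the commit keeps one implementation per backend; the constructor and the position of `-t` in the smolvm argv are assumptions.

```python
import subprocess
from abc import ABC, abstractmethod


class Bottle(ABC):
    @abstractmethod
    def claude_argv(self, argv: list[str], *, tty: bool = True) -> list[str]:
        """Return the full backend-exec argv that runs claude."""

    def exec_claude(self, argv: list[str], *, tty: bool = True):
        return subprocess.run(self.claude_argv(argv, tty=tty), check=False)


class SmolmachinesBottle(Bottle):
    def __init__(self, machine: str) -> None:
        self.machine = machine

    def claude_argv(self, argv: list[str], *, tty: bool = True) -> list[str]:
        prefix = ["smolvm", "machine", "exec", "--name", self.machine]
        if tty:
            prefix.append("-t")
        # runuser sits before the "claude" marker, so a split at "claude"
        # keeps the UID switch on the exec-framing side.
        return prefix + ["--", "runuser", "-u", "node", "--", "claude", *argv]
```

Because callers only see `claude_argv`, the tmux pane-respawn path no longer needs to know which backend it is talking to.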
Tests:
- `tests/unit/test_smolmachines_bottle.py` (new): locks down
the smolmachines argv shape: prompt-file flag injection,
guest-env `-e K=V` forwarding, TTY toggle, and the
runuser-precedes-claude invariant.
- `test_docker_bottle.py`: TestClaudeDockerArgv →
TestClaudeArgv; method renames follow.
- `test_dashboard_active_agents.py`: docstring follow.
615 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When starting a smolmachines agent from the dashboard the
docker-build output rendered on top of the curses preflight
modal — the build was kicked off before the operator had
confirmed launch. The docker backend's `prepare` is pure
resolution (no docker calls); smolmachines was inconsistent
because `prepare` called `_ensure_smolmachine` which ran
`docker build` → `docker save` → `crane push` → `smolvm pack
create`, several seconds of stderr noise rendered before the
y/N prompt.
Move the pipeline:
- `_ensure_smolmachine` (+ `_SMOLMACHINE_CACHE_DIR` + `_REPO_DIR`
+ the local-registry / smolvm imports) moves from
`backend/smolmachines/prepare.py` to
`backend/smolmachines/launch.py`. Called right before
`_smolvm.machine_create` so the resulting `.smolmachine`
sidecar path lands as a local in `launch`, not on the plan.
- `SmolmachinesBottlePlan.agent_from_path: Path` becomes
`agent_image_ref: str`. `prepare` stashes only the docker tag
(`$CLAUDE_BOTTLE_IMAGE` || `claude-bottle:latest`); `launch`
resolves it into the artifact at bringup.
This puts smolmachines on the same prepare-vs-launch boundary
the docker backend uses: the preflight summary in the dashboard
prints, the operator confirms, then `launch` runs — and its
stderr is routed via `_route_op_to_right_pane` (in tmux) or via
`curses.endwin` (foreground handoff) so the build output lands
cleanly.
Tests:
- `tests/unit/test_smolmachines_prepare_image.py` →
`tests/unit/test_smolmachines_launch_image.py`, updated to
import `_ensure_smolmachine` from `launch` rather than
`prepare`.
- `test_smolmachines_provision.py`: plan fixture switches
`agent_from_path` → `agent_image_ref`.
593 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>