Replace the two thin "docs live in …" lines with the conventions the
docs/ READMEs establish: the three document types (PRD / research note
/ decision record) with their numbering and the PRD Status lifecycle,
plus the cross-cutting rule that decision rationale stays self-contained
in the repo rather than in Gitea issue threads. Points at the per-folder
READMEs as the source of truth instead of duplicating them.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
We use Gitea, not an abstract forge. Reword the docs added in this
branch: "forge thread" -> "Gitea thread", and the research note's
generic "forge" -> "Gitea" / "hosting provider" as context demands,
keeping its portability argument coherent.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Document what research notes are (opinionated investigations of a
question/design space), their unnumbered kebab-case naming, and their
loose verdict-first shape — explicitly freeform, not a template. Point
the AGENTS.md research line at it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Remove the one-line docs/INDEX.md (its directory pointers are covered
by docs/README.md's "when to write which document" table). Add
docs/prds/README.md documenting the PRD naming, Status lifecycle, and
section format. Repoint the AGENTS.md repository-layout list at the
new READMEs and add the decisions/ dir.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the document-type comparison out of docs/decisions/README.md
(where it only surfaced if you were already in the decisions dir) up
to a new docs/README.md, renamed "When to write which document".
Leave a pointer from the decisions README.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per review on PR #97: an index that lists every ADR is a sync
burden. The files in docs/decisions/ are the index.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add an "Alternatives considered" section enumerating the design
options from issue #88 (duplicate bottles / agent-side bottle_config
/ bottle-side extends) and why extends won, so the PRD stands without
the forge thread. Repoint the two phrases that depended on the #88
comment thread at the new section.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add docs/decisions/ with a convention README and back-fill two
decisions that previously had no in-repo home: merging PRs with
rebase (ADR 0001) and the agent-identity claimed-not-vouched trust
posture from PRD 0027 (ADR 0002). Point docs/INDEX.md at it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Analyze tracking feature requests in Gitea against the project's
in-repo PRDs/research notes, given the goal of keeping decision
history portable and not provider-locked. Recommends demoting issues
to an ephemeral inbox and reifying durable rationale into the repo.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
We use Gitea, not an abstract forge. Reword the pre-existing research
and PRD docs: the generic "Forge-API gate"/"forge tokens" become
"Git-host-API gate"/"Git-host tokens" (the gate still spans Gitea /
GitHub / GitLab), "Git/forge history" -> "Git/Gitea history", and the
KNOWN_FORGE_HOSTS / forge: manifest-field examples -> KNOWN_GIT_HOSTS
/ git_host:. Meaning preserved; only the word "forge" is dropped.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Flip Status: Draft -> Active for the 23 PRDs whose work has shipped to
main (including 0027, now that PR #95 has merged). Leaves the
terminal-status PRDs unchanged: 0007 and 0010 (Superseded) and 0014
(Retargeted) were replaced, not shipped as-is.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
README manifest section documents the agent git.user overlay, the
bottle-only git.remotes boundary, and the claimed-not-vouched trust
note. Collapses the example: implementer carries its own identity
against the shared dev bottle instead of an identity-only bottle.
Refs #94
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Agents may declare git.user (name/email); it overlays the referenced
bottle's git.user per-field at Manifest.bottle_for (agent wins on
non-empty), mirroring the extends: merge. git.remotes is rejected on
agents — it carries credentials and host trust and stays bottle-only.
The overlay lives at bottle_for, the single chokepoint both backends
use, so the docker/smolmachines git provisioners are unchanged. Adds
Manifest.git_identity_summary with per-field (agent)/(bottle)
provenance, printed in both preflights and `info`.
Refs #94
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Lift git.user (name/email) to the agent layer with a per-field
overlay onto the referenced bottle, mirroring the extends: merge.
git.remotes stays bottle-only. Includes identity provenance in
preflight/info and an example collapse.
Refs #94
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Delete CLAUDE.md in favor of AGENTS.md as the orientation doc, rebrand
the project from Codex-bottle to provider-agnostic bot-bottle, and
repoint every CLAUDE.md reference across PRDs, research notes, the
implementer agent example, and the yaml_subset comment.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The two Debian-family CA-layout constants lived in
docker/provision/ca.py, which forced the smolmachines backend to
import them cross-backend (smolmachines -> docker). Move them into
the shared backend/util.py next to select_ca_cert; docker, compose,
and smolmachines now all import from there. No behavior change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Both backends' provision_ca duplicated _select_ca_cert and the
SHA-256 fingerprint computation verbatim. Lift them into the shared
backend/util.py as select_ca_cert + log_ca_fingerprint; docker and
smolmachines now call the shared helpers. No behavior change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add an optional `extends: <bottle-name>` field to bottle
frontmatter. Two-pass load:
1. Collect raw frontmatter for every bottle file.
2. Recursively resolve each name into a merged Bottle via
`_resolve_one_bottle` + `_merge_bottles`.
Merge rules (per PRD 0025):
- env: dict merge, child wins on key collision
- git: full replace if child declares `git:`
- git_user: per-field overlay (child's non-empty fields win)
- egress: full replace if child declares `egress:`
- supervise: full replace if child declares `supervise:`
List-valued fields full-replace because partial merge is
ambiguous (ordering matters, name collisions ambiguous); env is
dict-merge because dict-keyed override is the natural shape.
git_user overlays per-field so a parent can declare just the
name and a child can add just the email.
Cycles / self-extends / missing-parent / non-string `extends:`
all die at parse with a pointer that includes the chain (cycles)
or the available names (missing parent). Resolution is cached
per-name so a diamond reference graph doesn't reparse the same
parent N times.
Both load paths threaded:
- `_load_bottles_from_dir` (md files) — collect raws, then
resolve.
- `Manifest.from_json_obj` (JSON / test fixtures) — same.
Tests (24, in `test_manifest_extends.py`):
- Leaf without extends parses unchanged
- Child inherits parent unchanged when child only declares
`extends:`
- env: disjoint union, collision (child wins), child-omits
- git: replace, omit, explicit-empty-clears-parent
- egress: same shape (replace, inherit)
- git_user: parent-only, child-overrides-both, partial fields
- 3-step chain (grandparent → parent → child)
- Errors: missing parent, self-extends, 2-node cycle, 3-node
cycle, non-string extends
685 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add a `git_user:` block to the example bottle frontmatter with a
one-paragraph note on what it does + that either field can be
set independently. Other doc surfaces (manifest module docstring,
provisioner module docstrings) were updated alongside the
implementation commits.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mirror the docker backend's third provisioning subcase in
`backend/smolmachines/provision/git.py`:
_provision_git_user(plan, target)
Runs `smolvm machine exec --name <M> -e HOME=/home/node -e
USER=node -- runuser -u node -- git config --global user.<X>
<value>` for each git_user field. No-op when
`git_user.is_empty()`.
`runuser -u node --` switches the UID without invoking a login
shell (matching the existing `Bottle.exec_claude` pattern).
HOME / USER are forced via `smolvm -e` because bare runuser
inherits root's HOME=/root, which would put --global in
/root/.gitconfig instead of /home/node/.gitconfig (where the
existing `_provision_git_gate_config` writes).
4 unit tests in test_smolmachines_provision.TestProvisionGitUser:
no-op, both-set (asserts runuser prefix + HOME/USER env),
name-only, email-only. 661 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add a third provisioning subcase to
`backend/docker/provision/git.py`:
_provision_git_user(plan, target)
Runs `docker exec -u node <container> git config --global
user.{name,email} <value>` for each field the bottle's
`git_user` declares. No-op when `git_user.is_empty()`.
`-u node` so `--global` lands in /home/node/.gitconfig (matching
the existing `_provision_git_gate_config` write location, so
agent-side `git` reads both configs from the same dotfile).
Name and email apply independently — a bottle declaring only
name runs just the user.name line, etc.
4 unit tests in `test_docker_provision_git_user.py`: no-op,
both-set, name-only, email-only. 657 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per-bottle `git config --global user.name` / `user.email` pair
so the agent's commits inside the bottle land with a known
identity rather than the agent image's default (no user, or
whatever the image dropped in).
Schema:
git_user:
name: "Eric Bauerfeld"
email: "eric+claude@dideric.is"
Either field can be set independently — name-only / email-only
configs are valid and apply just the field that's set. An
explicit `git_user:` block with both fields empty dies at parse
time rather than silently no-op'ing; an omitted block is the
no-op path (default GitUser is empty, provisioner skips).
Parse-time validation:
- Unknown sub-keys die (e.g., typo of `username`).
- Non-string name/email dies.
- Both-empty dies (half-finished edit hint).
11 unit tests in `test_manifest_git_user.py`; 653 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pipelock was 403-blocking legitimate egress cred-injected
traffic with 'blocked: request header contains secret'. The
chain is `agent → egress → pipelock → internet`: egress injects
`Authorization: Bearer <token>` for routes with an `auth_scheme`,
then forwards upstream to pipelock. Pipelock has `scan_env:
true` + `scan_headers: true` + `header_mode: all`, and the
bundle supervisor spawned every daemon (egress, pipelock,
git-gate, supervise) inheriting the bundle container's full env
— including the `EGRESS_TOKEN_<n>` slots set via
`docker run -e`. So pipelock had the token value egress
injected sitting in its own env, matched it in the request
headers, and blocked.
The agent itself runs in a different machine and never sees
`EGRESS_TOKEN_*`, so stripping these from non-egress daemons'
env loses no DLP coverage — pipelock can't catch the exfil of
a value the agent doesn't have in the first place.
New helper `_env_for_daemon(name, base_env)` returns the
unchanged base for `egress` and a copy with `EGRESS_TOKEN_*`
filtered for everyone else. `_spawn` now passes the scoped env
to `subprocess.Popen`. Prefix-based filter (not exact-match) so
future egress-only env slots don't have to update this code.
Tests:
- `TestEnvForDaemon`: egress gets full env, pipelock /
git-gate / supervise lose `EGRESS_TOKEN_0` + `EGRESS_TOKEN_1`
but keep `PATH`, `EGRESS_UPSTREAM_PROXY`, `SUPERVISE_PORT`.
- Independent-dict invariant locked so callers can't
accidentally mutate the supervisor's env.
642 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- bottle.py:_PTY_RESIZE_SCRIPT docstring: strip the speculative
cwd-dependence explanation. The real reason to use absolute
path is just that the wrapper is self-contained; the original
rationale (tmux pane cwd) was a hypothesis we never confirmed
and wasn't load-bearing once we found the libkrun race.
- pty_resize.py:main: drop the long comment duplicating
`_STARTUP_SYNC_DELAY_SEC`'s docstring. Keep a one-liner
pointing at the constant + the operational note about
daemon=True.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The b9853ae stdin=DEVNULL fix wasn't sufficient. End-to-end
testing against a live VM in tmux revealed a second crash path:
libkrun spits "load \`config.json\`: parse error: trailing
garbage { \"ociVersion\": \"1.0.2\", ... }" and the main exec
dies (rc=1 or SIGKILL/rc=137, depending on race scheduling).
Root cause: each `smolvm machine exec` writes a per-invocation
OCI config.json to the same smolvm state dir during its bringup.
The wrapper's startup sync() fires within 1ms of Popen-ing the
main exec — both invocations write config.json concurrently,
libkrun loads one mid-write, and gets garbage. Trivial inner
commands (`sh -c "echo hi"`) finished before the overlap
mattered, masking the race in earlier tests. claude's slower
startup hits the race every time, and only inside tmux because
the outside-tmux foreground-handoff path takes a different
bringup sequence that happens to dodge the window.
Fix: schedule the initial sync on a 2-second `threading.Timer`
instead of calling it synchronously. By 2s the main exec is
past its bringup window, so the side-channel's config.json
write doesn't collide. Daemon thread so the timer doesn't
block exit when the child finishes quickly.
Trade-off: the in-VM PTY uses smolvm's default size for the
first ~2s, then snaps to the host pane size when the timer
fires. Verified end-to-end against a live VM in tmux: claude
renders at the default size during bringup, then redraws at
full pane width once the deferred sync lands. Operator-driven
resizes (SIGWINCH) still bridge in real time via the
already-installed signal handler.
Also drop the diagnostic log added in 9c83ea6 — we have the
fix.
Regression test:
`TestStartupSyncDeferred.test_main_schedules_timer_does_not_
call_sync_synchronously` mocks Popen + Timer + _push_size and
asserts `main()` schedules the timer with the documented
delay constant and never invokes _push_size synchronously.
Catches a "let's just inline the sync() call" regression
immediately.
638 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>