From 894bdea2889f0606ba92b0f5827ad1b63beeebea Mon Sep 17 00:00:00 2001
From: didericis
Date: Sun, 24 May 2026 21:39:58 -0400
Subject: [PATCH 1/5] =?UTF-8?q?docs:=20add=20PRD=200011=20=E2=80=94=20per-?=
 =?UTF-8?q?file=20Markdown=20manifest?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Specs the implementation chosen in the PR #16 closing comment:
per-file MD-with-YAML-frontmatter layout for both bottles and agents,
with a hand-rolled YAML subset parser (no PyYAML).

Layout:
- $HOME/.claude-bottle/bottles/.md (home-only)
- $HOME/.claude-bottle/agents/.md (home agents)
- $CWD/.claude-bottle/agents/.md (repo-supplied agents)

The trust boundary that PRD-0011-v1 (closed PR #15) tried to enforce
in the resolver now falls out of filesystem layout — $CWD/.claude-bottle/
has no bottles/ subdir, the loader doesn't look there. Filesystem
layout IS the enforcement.

Eight success criteria, including: stdlib-only (no new runtime dep),
idempotent migration command, agent files shaped close to Claude
Code's existing subagent spec so the same file can drop into
~/.claude/agents/.

PRD-only; no implementation in this commit. PRD slot 0011 is
intentionally reused — the v1 file was never merged to main.
---
 docs/prds/0011-per-file-md-manifest.md | 446 +++++++++++++++++++++++++
 1 file changed, 446 insertions(+)
 create mode 100644 docs/prds/0011-per-file-md-manifest.md

diff --git a/docs/prds/0011-per-file-md-manifest.md b/docs/prds/0011-per-file-md-manifest.md
new file mode 100644
index 0000000..a71f733
--- /dev/null
+++ b/docs/prds/0011-per-file-md-manifest.md
@@ -0,0 +1,446 @@
+# PRD 0011: Per-file Markdown manifest
+
+- **Status:** Draft
+- **Author:** didericis
+- **Created:** 2026-05-24
+
+## Summary
+
+Replace the single-file `claude-bottle.json` manifest with a
+per-file Markdown-with-YAML-frontmatter layout. Bottles live as
+`$HOME/.claude-bottle/bottles/.md`; agents live as
+`$HOME/.claude-bottle/agents/.md` (home-resident) and
+`$CWD/.claude-bottle/agents/.md` (repo-supplied). Each file
+carries its structured config in YAML frontmatter and (for agents)
+its system prompt in the Markdown body.
+
+The format change clears the way for the layout change: one file
+per bottle, one file per agent, two directories on each side of
+the `$HOME` / `$CWD` trust boundary. That boundary stops living in
+resolver logic (PRD 0011-v1's CwdExtension approach, closed in
+favor of this design) and becomes filesystem layout — `$CWD` has
+no `bottles/` subdirectory, period.
+
+The YAML we accept is bounded (flat keys → strings, lists, simple
+nested dicts), so the parser is hand-rolled and stdlib-only — no
+PyYAML dependency. The project's "low deps by default" stance
+(CLAUDE.md) stays intact.
+
+## Problem
+
+`claude-bottle.json` works fine at one bottle and one agent. The
+project is heading for many of both, and the single-JSON shape
+starts to fray:
+
+- **Discovery + diff scaling.** A user with 8 bottles and 12
+  agents lands at hundreds of lines of nested JSON. Two changes
+  to unrelated agents touch the same file; codeowners-style
+  ownership doesn't apply. File-globbing tools (`grep`, `fd`)
+  can't find one agent without parsing the whole file.
+
+- **No comments, no multi-line strings.** Agent prompts longer
+  than a sentence become single-line escaped horrors in JSON.
+  Documentation about why a bottle exists (which tokens it
+  holds, why these egress allowlist entries) has nowhere natural
+  to live in the manifest file itself; a sibling README drifts.
+
+- **Trust boundary lives in code, not on disk.** PRD 0011-v1
+  (closed; see PR #15) made the resolver reject cwd manifests
+  that try to define bottles. The rule is correct and enforced,
+  but it's invisible to anyone reading the on-disk layout —
+  there's no positive signal that `$HOME` is the only place
+  bottles can come from. A reader has to know the resolver's
+  rules to audit the security posture.
+
+The companion research
+(`docs/research/manifest-format-and-grouping.md`) walks the two
+axes (grouping × format) and lands on this design.
+
+## Goals / Success criteria
+
+Each test runs against a temporary `$HOME` and a temporary `$CWD`:
+
+1. **A bottle file under `$HOME/.claude-bottle/bottles/`
+   parses.** A `dev.md` file with YAML frontmatter declaring
+   `cred_proxy.routes`, `git`, `env`, `egress` produces a Bottle
+   dataclass equivalent to the current JSON shape.
+
+2. **An agent file under `$HOME/.claude-bottle/agents/` parses.**
+   `implementer.md` with frontmatter that names `bottle:`,
+   `skills:`, and other fields, with the body as the system
+   prompt, produces an Agent dataclass.
+
+3. **An agent file under `$CWD/.claude-bottle/agents/` parses
+   and overrides home-resident agents of the same name.** The
+   cwd agent's frontmatter and body win; the home bottle it
+   references stays intact.
+
+4. **A bottle file under `$CWD/.claude-bottle/bottles/` is
+   ignored.** The directory does not contribute to the
+   manifest; if a user accidentally creates one, the launcher
+   emits a `warn`-level log naming the offending files and
+   continues. Filesystem layout is the boundary; the warning
+   is a usability nicety, not a security gate.
+
+5. **No third-party Python dependencies introduced.** A fresh
+   clone with only stdlib + claude-bottle's own code runs every
+   parser test. Frontmatter parsing is hand-rolled against the
+   declared YAML subset.
+
+6. **Migration tool converts existing JSON to per-file MD.**
+   `./cli.py migrate-manifest` reads `$HOME/claude-bottle.json`
+   (and `$CWD/claude-bottle.json` if present), writes a tree of
+   per-file MD docs to the new locations, then prints what was
+   moved. Idempotent: rerunning is a no-op when the new layout
+   already exists. Does not delete the old JSON files
+   automatically (user-driven cleanup).
+
+7. **Existing tests pass against the new layout.** Tests today
+   build manifests via JSON literals against `Manifest.from_json_obj`.
+   That entry point keeps working for tests (used to construct
+   manifests programmatically); production resolution flows
+   through the new directory-globbing loader.
+
+8. **Agent files double as Claude Code subagent files.** The
+   `name`, `description`, `model`, `color`, and `memory` fields
+   from Claude Code's existing subagent spec are accepted in
+   our frontmatter alongside our own fields. Copying an agent
+   file from `$HOME/.claude-bottle/agents/` to
+   `~/.claude/agents/` produces a working Claude Code subagent
+   (subject to Claude Code's tolerance for the extra `bottle:`
+   and `claude_bottle:` fields — see Open Questions).
+
+## Non-goals
+
+- **A general YAML implementation.** The parser handles the
+  subset claude-bottle's frontmatter actually uses; documents
+  that exceed the subset (anchors, multi-line block scalars,
+  tags, implicit type coercion, flow style, etc.) die with a
+  pointer at the spec. We are not building a YAML library.
+
+- **Compatibility with the old JSON layout at runtime.** The
+  resolver no longer reads `claude-bottle.json` files. The
+  migration tool is the bridge; after migration the JSON file
+  is stale (and the user removes it). This is a breaking
+  change for v1 users; the migration cost is one command + a
+  manual delete.
+
+- **`$HOME/.claude/agents/` integration on the input side.** We
+  don't read agent files out of Claude Code's directory. Our
+  files can be copied into Claude Code's tree by the user if
+  they want, but the input path for claude-bottle is its own
+  directory.
+
+- **A signed-manifest scheme.** Out of scope per the
+  closed-PR-15 PRD; the trust boundary here is "your home
+  directory is yours."
+
+- **Per-bottle inheritance / composition.** Each bottle file is
+  self-contained. If shared egress allowlists become common we
+  can revisit, but the v1 of this PRD is one file = one bottle.
+
+- **Hot-reload.** Changes to manifest files take effect at next
+  `./cli.py start`; we do not watch the directory.
+
+## Scope
+
+### In scope
+
+- **Directory layout.**
+  - `$HOME/.claude-bottle/bottles/.md` — bottle
+    definitions (full schema; one Bottle per file).
+  - `$HOME/.claude-bottle/agents/.md` — home-resident
+    agents.
+  - `$CWD/.claude-bottle/agents/.md` — cwd-resident
+    agents; same schema as home agents, but bottle names must
+    resolve against the home set.
+  - `$CWD/.claude-bottle/bottles/` — ignored with a warn-level
+    log (see SC #4). Does not contribute to the manifest.
+  - `` is the file basename without `.md`. Filenames must
+    match `[a-z][a-z0-9-]*` (kebab-case, ASCII-only).
+
+- **File schema.** Markdown with YAML frontmatter. Frontmatter
+  delimited by `---` lines at the top of the file; everything
+  after the closing `---` is the body. For agents, body is the
+  system prompt. For bottles, body is human documentation
+  (optional, ignored by the parser).
+
+- **Agent frontmatter fields.**
+  - `bottle: ` (required) — bottle to launch in.
+  - `skills: [, ...]` (optional) — host-side skills under
+    `~/.claude/skills/`.
+  - `name`, `description`, `model`, `color`, `memory` — accepted
+    but treated as Claude Code passthrough; claude-bottle
+    ignores them at launch but doesn't reject. Lets the same
+    file double as a Claude Code subagent.
+  - Unknown top-level keys die with a hint listing accepted
+    keys. We don't silently ignore typos.
+
+- **Bottle frontmatter fields.** Same keys as today's JSON
+  schema: `env`, `git`, `cred_proxy.routes`, `egress.allowlist`,
+  `egress.dlp_action`. No semantic changes.
+
+- **YAML subset parser.** Hand-rolled, stdlib-only. Supports:
+  - Flat `key: value` pairs at the top level.
+  - String, int, bool (`true`/`false` only — no `yes`/`no`/`on`/
+    `off`), null (`null` / explicit `~`).
+  - Lists: block-style `- item` lines, items are strings or
+    flow lists/dicts of the same.
+  - Nested dicts: one level under a key, block-style.
+  - Quoted strings: single + double, escapes as JSON-style.
+  - Comments: `# ...` at end of line or on its own.
+
+  Rejects with a clear error: anchors (`&`/`*`), multi-line
+  block scalars (`|`, `>`), tags (`!!`), implicit-typed strings
+  (`NO`/`Norway`/dates auto-coerced to booleans/dates),
+  flow-style nested deeper than one level. Empty document is
+  fine; missing frontmatter delimiters is fine for bottles
+  (file = body-only is treated as no-frontmatter, which fails
+  the required-keys check — same diagnostic as malformed).
+
+- **Manifest assembly.** New resolver:
+  1. Walk `$HOME/.claude-bottle/bottles/*.md` → Bottle dict
+     keyed by filename.
+  2. Walk `$HOME/.claude-bottle/agents/*.md` → Agent dict.
+  3. Walk `$CWD/.claude-bottle/agents/*.md` → Agent dict; merge
+     into the home agent dict, cwd wins on name collision.
+  4. Validate every agent's `bottle:` against the bottle dict.
+  5. Warn if `$CWD/.claude-bottle/bottles/` exists with files.
+  6. Return Manifest dataclass — same shape as today.
+
+- **Migration command.** `./cli.py migrate-manifest`:
+  - Reads `$HOME/claude-bottle.json` and (if present)
+    `$CWD/claude-bottle.json`.
+  - Creates `$HOME/.claude-bottle/{bottles,agents}/` dirs.
+  - For each `bottles[]`, writes
+    `$HOME/.claude-bottle/bottles/.md` with frontmatter
+    rendered from the bottle dict, body empty (or a one-line
+    "Migrated from claude-bottle.json on " stub).
+  - For each home `agents[]`, writes
+    `$HOME/.claude-bottle/agents/.md` with frontmatter
+    (bottle, skills, etc.) and body = `prompt`.
+  - For each cwd `agents[]` (if cwd JSON existed),
+    writes `$CWD/.claude-bottle/agents/.md`.
+  - Refuses to overwrite existing MD files; if a target
+    already exists, prints what would have been written and
+    bails on that file (continues with the others).
+  - Prints a summary at the end: N bottles written, M agents
+    written, what was skipped.
+
+- **Docs.** README's manifest section rewrites against the new
+  layout. `claude-bottle.example.json` becomes
+  `examples/bottles/dev.md` + `examples/agents/implementer.md`.
+  The PRD 0010 example block in its own document gets a
+  follow-up commit noting the new layout (out of scope for
+  this PRD; only update README + example files here).
+
+- **Tests.**
+  - `tests/unit/test_yaml_subset_parser.py` — the parser
+    itself, including all the rejection cases listed above.
+  - `tests/unit/test_manifest_md_load.py` — directory-globbing
+    + assembly, the 8 success criteria.
+  - `tests/integration/test_migrate_manifest.py` — round-trip
+    JSON → MD; idempotency.
+  - Existing integration tests keep working (the only public
+    entry points they hit are `Manifest.resolve` and
+    `Manifest.from_json_obj`).
+
+### Out of scope
+
+- Watching the directory for changes mid-session.
+- A migration tool for moving back (MD → JSON).
+- Validating that frontmatter `name:` matches the filename.
+  Soft check via a warn log if mismatched, but not enforced.
+- A bottle/agent dependency graph beyond the existing `bottle:`
+  field. No "this agent extends this other agent."
+- IDE schemas / JSON Schema export for the MD format.
+
+## Proposed design
+
+### File layout
+
+```
+$HOME/.claude-bottle/
+├── bottles/
+│   ├── dev.md
+│   ├── gitea-dev.md
+│   └── ...
+└── agents/
+    ├── implementer.md
+    ├── researcher.md
+    └── ...
+
+$CWD/.claude-bottle/
+└── agents/
+    └── .md
+```
+
+`bottles/` only exists under `$HOME`. The directory's absence
+under `$CWD` is the boundary — the loader doesn't even look
+there.
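Reviewer-style aside (not part of the patch): the layout and the six assembly steps above can be sketched as a small discovery pass. This is a minimal sketch under stated assumptions, not the PRD's implementation. The function name `discover_manifest_files` and its return shape are hypothetical, and it only locates files; parsing and `bottle:` validation are left out.

```python
from pathlib import Path


def discover_manifest_files(
    home: Path, cwd: Path
) -> tuple[dict[str, Path], dict[str, Path], list[Path]]:
    """Return (bottle_files, agent_files, ignored_cwd_bottles).

    Hypothetical helper: names are keyed by file basename without `.md`.
    """

    def md_files(directory: Path) -> dict[str, Path]:
        # A missing directory simply contributes nothing.
        if not directory.is_dir():
            return {}
        return {p.stem: p for p in sorted(directory.glob("*.md"))}

    # Bottles come from $HOME only; $CWD is never consulted for them.
    bottles = md_files(home / ".claude-bottle" / "bottles")

    # Agents: home first, then cwd, so cwd wins on a name collision.
    agents = md_files(home / ".claude-bottle" / "agents")
    agents.update(md_files(cwd / ".claude-bottle" / "agents"))

    # Stray cwd bottle files are reported (for a warn-level log), never loaded.
    ignored = sorted(md_files(cwd / ".claude-bottle" / "bottles").values())
    return bottles, agents, ignored
```

A caller would log each path in the third element at warn level and carry on, which matches the "usability nicety, not a security gate" wording of SC #4.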
+
+### Example bottle file
+
+```markdown
+---
+cred_proxy:
+  routes:
+    - path: /anthropic/
+      upstream: https://api.anthropic.com
+      auth_scheme: Bearer
+      token_ref: CLAUDE_BOTTLE_OAUTH_TOKEN
+      role: anthropic-base-url
+    - path: /gitea/dideric/
+      upstream: https://gitea.dideric.is
+      auth_scheme: token
+      token_ref: GITEA_TOKEN
+      role: [git-insteadof, tea-login]
+git:
+  - Name: claude-bottle
+    Upstream: ssh://git@gitea.dideric.is:30009/didericis/claude-bottle.git
+    IdentityFile: ~/.ssh/gitea-delos-2.pem
+    ExtraHosts:
+      gitea.dideric.is: 100.78.141.42
+    KnownHostKey: ssh-rsa AAAAB3...
+egress:
+  allowlist:
+    - example.com
+---
+
+The `dev` bottle. Backs my work on personal projects:
+
+- Anthropic OAuth via cred-proxy
+- gitea.dideric.is over SSH (with PAT for tea API)
+- example.com in the egress allowlist
+```
+
+### Example agent file
+
+```markdown
+---
+name: implementer
+description: Implements features against PRDs in this repo.
+model: opus
+bottle: dev
+skills:
+  - init-prd
+---
+
+You are a feature-implementation agent running inside an
+ephemeral claude-bottle sandbox...
+```
+
+Drop the same file into `~/.claude/agents/implementer.md` and
+Claude Code picks it up as a subagent (assuming Claude Code
+tolerates the `bottle:` and `skills:` fields — see Open
+Questions).
+
+### YAML subset grammar
+
+```
+document     := frontmatter? body?
+frontmatter  := "---" "\n" yaml_block "---" "\n"
+yaml_block   := (line "\n")*
+line         := blank | comment | mapping_line | list_item
+mapping_line := indent key ":" (" " value)?
+key          := bare_string        ; matches [A-Za-z_][A-Za-z0-9_-]*
+value        := scalar | inline_list | inline_dict
+scalar       := number | bool | null | quoted_string | bare_string
+list_item    := indent "-" " " value
+```
+
+Notable rejections (each dies with a specific error):
+
+- Anchors (`&name`), aliases (`*name`).
+- Multi-line block scalars (`|`, `>`, `|-`, `>+`).
+- YAML tags (`!!str`, etc.).
+- `yes`/`no`/`on`/`off`/`Y`/`N` as booleans (we require
+  literal `true` / `false`).
+- Unquoted strings that resemble dates (`2026-05-24`) or octal
+  (`0123`) — the Norway problem and its kin. If a string would
+  be ambiguous, quote it.
+- Flow style mappings nested more than one level deep.
+
+Parser lives at `claude_bottle/yaml_subset.py`, ~300 lines.
+Public API:
+
+```python
+def parse_frontmatter(text: str) -> tuple[dict[str, object], str]:
+    """Return (frontmatter_dict, body_text). The dict's values are
+    str / int / bool / None / list / dict only; nesting capped at
+    two levels."""
+```
+
+### Existing code touched
+
+- **`claude_bottle/manifest.py`** — `Manifest.resolve` rewritten
+  to walk the new directories. `Manifest.from_json_obj` kept as
+  a programmatic entry point (used by tests). New
+  `Manifest.from_md_dirs(home_dir, cwd_dir)` for the loader.
+- **`claude_bottle/yaml_subset.py`** — new. The parser.
+- **`claude_bottle/cli/migrate_manifest.py`** — new. The
+  migration command.
+- **`claude_bottle/cli/__init__.py`** — wire the new subcommand.
+- **`README.md`** — manifest section rewritten against the new
+  layout.
+- **`claude-bottle.example.json`** — removed; replaced by an
+  `examples/` directory with one bottle file + one agent file.
+- **Tests** — new parser tests + new loader tests; existing
+  manifest tests adapt to either build via `from_json_obj`
+  (still supported) or use the new directory layout.
+
+### Data model
+
+No new dataclasses. `Bottle`, `Agent`, `Manifest`, `CredProxyRoute`,
+etc. all stay the same shape. Only the loader changes.
+
+### Backward compatibility
+
+This is a breaking change for v1 users. Mitigations:
+
+- `./cli.py migrate-manifest` does the heavy lifting in one
+  command.
+- If `claude-bottle.json` exists in `$HOME` or `$CWD` *and* the
+  new directories don't exist, the resolver dies with a clear
+  pointer at the migration command — not silently merging
+  formats, not silently dropping the JSON content.
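Reviewer-style aside (not part of the patch): the stale-JSON guard described in the backward-compatibility section might look roughly like the sketch below. The helper name `stale_json_manifests` is hypothetical. Treating "the new directories don't exist" as "no `.claude-bottle/` directory under either `$HOME` or `$CWD`" is an assumption, since the PRD text does not pin that down.

```python
from pathlib import Path


def stale_json_manifests(home: Path, cwd: Path) -> list[Path]:
    """Return legacy claude-bottle.json files that would otherwise be
    silently ignored. An empty list means there is nothing to complain about.

    Hypothetical helper; the 'either base has .claude-bottle/' reading of
    the rule is an assumption of this sketch.
    """
    new_layout_present = any(
        (base / ".claude-bottle").is_dir() for base in (home, cwd)
    )
    if new_layout_present:
        return []
    candidates = (home / "claude-bottle.json", cwd / "claude-bottle.json")
    return [p for p in candidates if p.is_file()]
```

The resolver would call this before walking the directories and die, naming the returned paths, whenever the list is non-empty.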
+
+## Open questions
+
+- **Claude Code tolerance for extra frontmatter fields.** Test
+  empirically before settling: drop a file with `bottle: dev`
+  in `~/.claude/agents/` and see whether Claude Code warns,
+  ignores, or breaks. If it warns, namespace the field
+  (`claude-bottle-bottle:` or a nested `claude_bottle:` block).
+- **Hidden directory vs visible.** Default `.claude-bottle/`
+  (hidden — matches `.config/`, `.ssh/`, `.docker/`). If users
+  routinely want to navigate to it from the file manager,
+  switch to `claude-bottle/`. Lean hidden.
+- **`description:` for bottles.** Should bottle frontmatter
+  carry a `description:` field for the y/N preflight? Default
+  no — bottle names are kebab-case and self-describing, and
+  the MD body is the place for human prose.
+- **Filename ↔ frontmatter `name:` drift.** If both are
+  present and disagree, warn (we use the filename as the
+  authoritative key). Same for agents.
+- **`include` / glob for shared egress allowlists.** A common
+  pattern will be "every bottle allows api.anthropic.com and
+  github.com"; do we want a way to share the list? Default no
+  for v1; revisit if it bites.
+- **Migration tool destructive vs additive.** Default
+  additive (writes new files, leaves old JSON in place). If
+  users find the half-migrated state confusing, switch to
+  printing a "delete claude-bottle.json now" reminder at the
+  end of the migration.
+
+## References
+
+- `docs/research/manifest-format-and-grouping.md` — the
+  analysis this PRD follows from.
+- Closed PR #15 — the resolver-layer trust-boundary attempt;
+  superseded by this PRD's filesystem-layout approach.
+- Closed PR #16 — the research doc + the option-B4 decision
+  comment that picked this design.
+- Claude Code subagent spec — `~/.claude/agents/.md`
+  with YAML frontmatter (existing convention this PRD aligns
+  agent files with).
-- 
2.52.0

From afa8ca67a48d37f89d208a31a4878f91819110f3 Mon Sep 17 00:00:00 2001
From: didericis
Date: Sun, 24 May 2026 21:46:22 -0400
Subject: [PATCH 2/5] docs(prd-0011): drop the migration command requirement
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

claude-bottle has a single primary user today; an automated JSON → MD
migration tool is overkill. Hand-rewriting one file is the migration
cost. The resolver still dies with a pointer at the README's manifest
section if a stale claude-bottle.json is found alongside no
.claude-bottle/ directory, so the breaking change isn't silent.

Drops: SC #6 (migration tool), the "Migration command" In Scope
sub-bullet, the migrate_manifest.py / cli wiring entries from Existing
code touched, the tests/integration/test_migrate_manifest.py entry
from Tests, the destructive-vs-additive open question.

Renumbers the remaining success criteria 6, 7 (formerly 7, 8).
Backward-compat section rewritten around hand-rewrite.
---
 docs/prds/0011-per-file-md-manifest.md | 72 +++++++-------------------
 1 file changed, 19 insertions(+), 53 deletions(-)

diff --git a/docs/prds/0011-per-file-md-manifest.md b/docs/prds/0011-per-file-md-manifest.md
index a71f733..7463526 100644
--- a/docs/prds/0011-per-file-md-manifest.md
+++ b/docs/prds/0011-per-file-md-manifest.md
@@ -87,21 +87,13 @@ Each test runs against a temporary `$HOME` and a temporary `$CWD`:
    parser test. Frontmatter parsing is hand-rolled against the
    declared YAML subset.
 
-6. **Migration tool converts existing JSON to per-file MD.**
-   `./cli.py migrate-manifest` reads `$HOME/claude-bottle.json`
-   (and `$CWD/claude-bottle.json` if present), writes a tree of
-   per-file MD docs to the new locations, then prints what was
-   moved. Idempotent: rerunning is a no-op when the new layout
-   already exists. Does not delete the old JSON files
-   automatically (user-driven cleanup).
-
-7. **Existing tests pass against the new layout.** Tests today
+6. **Existing tests pass against the new layout.** Tests today
    build manifests via JSON literals against `Manifest.from_json_obj`.
    That entry point keeps working for tests (used to construct
    manifests programmatically); production resolution flows
    through the new directory-globbing loader.
 
-8. **Agent files double as Claude Code subagent files.** The
+7. **Agent files double as Claude Code subagent files.** The
    `name`, `description`, `model`, `color`, and `memory` fields
    from Claude Code's existing subagent spec are accepted in
    our frontmatter alongside our own fields. Copying an agent
@@ -119,11 +111,11 @@ Each test runs against a temporary `$HOME` and a temporary `$CWD`:
   pointer at the spec. We are not building a YAML library.
 
 - **Compatibility with the old JSON layout at runtime.** The
-  resolver no longer reads `claude-bottle.json` files. The
-  migration tool is the bridge; after migration the JSON file
-  is stale (and the user removes it). This is a breaking
-  change for v1 users; the migration cost is one command + a
-  manual delete.
+  resolver no longer reads `claude-bottle.json` files. This is
+  a breaking change; existing users hand-rewrite their JSON
+  into the new per-file layout (claude-bottle has a single
+  primary user today, so the migration is one person rewriting
+  one file). Documented as part of the README rewrite.
 
 - **`$HOME/.claude/agents/` integration on the input side.** We
   don't read agent files out of Claude Code's directory. Our
@@ -208,25 +200,6 @@ Each test runs against a temporary `$HOME` and a temporary `$CWD`:
   5. Warn if `$CWD/.claude-bottle/bottles/` exists with files.
   6. Return Manifest dataclass — same shape as today.
 
-- **Migration command.** `./cli.py migrate-manifest`:
-  - Reads `$HOME/claude-bottle.json` and (if present)
-    `$CWD/claude-bottle.json`.
-  - Creates `$HOME/.claude-bottle/{bottles,agents}/` dirs.
-  - For each `bottles[]`, writes
-    `$HOME/.claude-bottle/bottles/.md` with frontmatter
-    rendered from the bottle dict, body empty (or a one-line
-    "Migrated from claude-bottle.json on " stub).
-  - For each home `agents[]`, writes
-    `$HOME/.claude-bottle/agents/.md` with frontmatter
-    (bottle, skills, etc.) and body = `prompt`.
-  - For each cwd `agents[]` (if cwd JSON existed),
-    writes `$CWD/.claude-bottle/agents/.md`.
-  - Refuses to overwrite existing MD files; if a target
-    already exists, prints what would have been written and
-    bails on that file (continues with the others).
-  - Prints a summary at the end: N bottles written, M agents
-    written, what was skipped.
-
 - **Docs.** README's manifest section rewrites against the new
   layout. `claude-bottle.example.json` becomes
   `examples/bottles/dev.md` + `examples/agents/implementer.md`.
@@ -238,9 +211,7 @@ Each test runs against a temporary `$HOME` and a temporary `$CWD`:
   - `tests/unit/test_yaml_subset_parser.py` — the parser
     itself, including all the rejection cases listed above.
   - `tests/unit/test_manifest_md_load.py` — directory-globbing
-    + assembly, the 8 success criteria.
-  - `tests/integration/test_migrate_manifest.py` — round-trip
-    JSON → MD; idempotency.
+    + assembly, the seven success criteria.
   - Existing integration tests keep working (the only public
     entry points they hit are `Manifest.resolve` and
     `Manifest.from_json_obj`).
@@ -248,7 +219,9 @@ Each test runs against a temporary `$HOME` and a temporary `$CWD`:
 ### Out of scope
 
 - Watching the directory for changes mid-session.
-- A migration tool for moving back (MD → JSON).
+- An automated migration command. Existing JSON users
+  hand-rewrite into the new layout. The README rewrite
+  documents the new shape; that's the migration surface.
 - Validating that frontmatter `name:` matches the filename.
   Soft check via a warn log if mismatched, but not enforced.
 - A bottle/agent dependency graph beyond the existing `bottle:`
@@ -378,9 +351,6 @@ def parse_frontmatter(text: str) -> tuple[dict[str, object], str]:
   a programmatic entry point (used by tests). New
   `Manifest.from_md_dirs(home_dir, cwd_dir)` for the loader.
 - **`claude_bottle/yaml_subset.py`** — new. The parser.
-- **`claude_bottle/cli/migrate_manifest.py`** — new. The
-  migration command.
-- **`claude_bottle/cli/__init__.py`** — wire the new subcommand.
 - **`README.md`** — manifest section rewritten against the new
   layout.
 - **`claude-bottle.example.json`** — removed; replaced by an
@@ -396,14 +366,15 @@ etc. all stay the same shape. Only the loader changes.
 
 ### Backward compatibility
 
-This is a breaking change for v1 users. Mitigations:
+This is a breaking change for v1 users. claude-bottle has a
+single primary user today, so migration is one person rewriting
+one file — no automated migration command is in scope.
 
-- `./cli.py migrate-manifest` does the heavy lifting in one
-  command.
-- If `claude-bottle.json` exists in `$HOME` or `$CWD` *and* the
-  new directories don't exist, the resolver dies with a clear
-  pointer at the migration command — not silently merging
-  formats, not silently dropping the JSON content.
+If `claude-bottle.json` exists in `$HOME` or `$CWD` *and* the
+new `.claude-bottle/` directory does not exist, the resolver
+dies with a clear pointer at the README's manifest section —
+not silently merging formats, not silently dropping the JSON
+content.
 
 ## Open questions
 
@@ -427,11 +398,6 @@ This is a breaking change for v1 users. Mitigations:
   pattern will be "every bottle allows api.anthropic.com and
   github.com"; do we want a way to share the list? Default no
   for v1; revisit if it bites.
-- **Migration tool destructive vs additive.** Default
-  additive (writes new files, leaves old JSON in place). If
-  users find the half-migrated state confusing, switch to
-  printing a "delete claude-bottle.json now" reminder at the
-  end of the migration.
 
 ## References
 
-- 
2.52.0

From 8c1e4d02201315102d0189da831a8603df123f82 Mon Sep 17 00:00:00 2001
From: didericis
Date: Sun, 24 May 2026 21:59:34 -0400
Subject: [PATCH 3/5] feat(yaml_subset): hand-rolled YAML-subset + frontmatter parser
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

claude_bottle/yaml_subset.py — stdlib-only, ~450 lines. Parses the
bounded shape claude-bottle's manifest files use:

- Block mappings (top-level + nested via indentation)
- Block lists (under a key, items can be scalars or block-style
  mappings whose keys align with the rest after the dash)
- Inline lists `[a, b]` and inline dicts `{a: 1}` for one-level
  leaves
- Quoted (single + double) and bare strings
- Scalars: string, int, true/false, null/~

Rejects, each with a clear pointer at the line number:

- `yes`/`no`/`on`/`off`/`Y`/`N`/`TRUE`/`FALSE` — only literal
  `true` / `false` are bools (the Norway problem stays solved by
  "quote your strings if they look like bools")
- Bare strings that look like dates / octals / hex / floats
- Anchors (`&`/`*`), aliases, YAML tags (`!!str`)
- Multi-line block scalars (`|`, `>`)
- Tabs in indentation
- Nested flow style (only one level allowed)

Public API:

  parse_yaml_subset(text) -> dict[str, object]
    Top level must be a mapping.
  parse_frontmatter(text) -> (dict, body_text)
    Strips `---` delimiters, parses content as YAML subset,
    returns the verbatim body text after the closing fence.

46 unit tests covering every construct the real manifest files use
(the cred_proxy.routes structure, role-as-inline-list, nested
ExtraHosts dicts) plus every rejection case listed in PRD 0011.
---
 claude_bottle/yaml_subset.py   | 569 +++++++++++++++++++++++++++++++++
 tests/unit/test_yaml_subset.py | 327 +++++++++++++++++++
 2 files changed, 896 insertions(+)
 create mode 100644 claude_bottle/yaml_subset.py
 create mode 100644 tests/unit/test_yaml_subset.py

diff --git a/claude_bottle/yaml_subset.py b/claude_bottle/yaml_subset.py
new file mode 100644
index 0000000..4267dc8
--- /dev/null
+++ b/claude_bottle/yaml_subset.py
@@ -0,0 +1,569 @@
+"""Hand-rolled YAML-subset parser for claude-bottle manifest files
+(PRD 0011).
+
+Why hand-rolled: the configs we accept have a bounded shape (flat
+top-level keys; values are strings / ints / bools / null / lists /
+nested dicts; no anchors, no multi-line block scalars, no tags, no
+implicit type coercion gotchas). A real YAML library is a much
+larger dependency surface than we need. The project's stdlib-only
+stance (CLAUDE.md) is the load-bearing reason; the safety
+properties — no Norway problem, no surprise date/octal coercion —
+are the bonus.
+
+Public API:
+
+    parse_yaml_subset(text) -> dict[str, object]
+        Parse a full document. Top level must be a mapping (the
+        shape every claude-bottle manifest file uses). Values are
+        str / int / bool / None / list / dict only.
+
+    parse_frontmatter(text) -> tuple[dict[str, object], str]
+        For a Markdown file with YAML frontmatter delimited by `---`
+        lines. Returns (frontmatter_dict, body_text).
+
+What we accept (block-style):
+
+    key: value              # mapping entry, value is inline
+    key:                    # mapping entry, value is block
+      nested_key: value
+
+    key:
+      - item                # list under a key
+      - item
+
+    key:
+      - subkey: value1      # list item that's a mapping
+        subkey2: value2
+      - subkey: value3
+
+What we accept (inline, scalar leaves only):
+
+    key: [a, b, "c d"]
+    key: {a: 1, b: 2}
+
+What we reject (each dies with a clear pointer):
+
+    &anchor / *alias        # anchors / aliases
+    !!tag                   # YAML tags
+    | / >                   # multi-line block scalars
+    yes / no / on / off     # only true / false count as bool
+    ambiguous bare strings  # numbers, dates, etc. when unquoted
+    tabs as indentation     # spaces only
+    flow-style nested deeper than one level
+
+Errors carry the line number from the source document.
+"""
+
+from __future__ import annotations
+
+import re
+from dataclasses import dataclass
+
+from .log import die
+
+
+# --- Tokenizer / line preprocessing ----------------------------------------
+
+
+@dataclass(frozen=True)
+class _Line:
+    """One non-blank, non-comment line from the source. `indent` is
+    the column of the first non-space character; `content` is the
+    line text from that column onward, with trailing whitespace and
+    trailing `# ...` comments stripped. `lineno` is the 1-based
+    line in the original document."""
+
+    indent: int
+    content: str
+    lineno: int
+
+
+def _strip_trailing_comment(s: str) -> str:
+    """Strip ` # comment` from end of line, but only when the `#`
+    isn't inside a quoted string. Returns the cleaned line."""
+    in_single = False
+    in_double = False
+    for i, ch in enumerate(s):
+        if ch == "'" and not in_double:
+            in_single = not in_single
+        elif ch == '"' and not in_single:
+            in_double = not in_double
+        elif ch == "#" and not in_single and not in_double:
+            # `#` must be preceded by whitespace to be a comment,
+            # otherwise it's just a literal character.
+ if i == 0 or s[i - 1] in (" ", "\t"): + return s[:i].rstrip() + return s.rstrip() + + +def _tokenize(text: str) -> list[_Line]: + """Drop blank / comment lines, parse indent + content for the + rest. Tabs in the indent area are rejected outright.""" + out: list[_Line] = [] + for n, raw in enumerate(text.splitlines(), start=1): + # Tabs in indent are a portability footgun — different + # editors render them differently and the spec says spaces. + leading = len(raw) - len(raw.lstrip(" \t")) + if "\t" in raw[:leading]: + die(f"yaml-subset: tab character in indent on line {n}") + stripped = raw.strip() + if not stripped: + continue + if stripped.startswith("#"): + continue + # Whole-line position: indent before first non-space. + indent = len(raw) - len(raw.lstrip(" ")) + content = _strip_trailing_comment(raw[indent:]) + if not content: + continue + out.append(_Line(indent=indent, content=content, lineno=n)) + return out + + +# --- Scalar parsing --------------------------------------------------------- + + +_BARE_RX = re.compile(r"^[A-Za-z_][A-Za-z0-9_.\-]*$") +_INT_RX = re.compile(r"^-?[0-9]+$") +_RESERVED_BOOL_LIKE = frozenset({"yes", "no", "on", "off", "y", "n", "Y", "N", + "YES", "NO", "ON", "OFF", "True", "False", + "TRUE", "FALSE"}) +# Yaml-ish ambiguity sources that an unquoted bare token COULD be +# mistaken for: dates, octals, etc. Detected and rejected so users +# quote their strings explicitly. We don't try to enumerate every +# ambiguity; the rule is "if it looks like a non-string literal, +# either parse it as that literal (true/false/null/int) or reject +# it with a 'quote it' hint." +_DATE_RX = re.compile(r"^-?\d{4}-\d{2}-\d{2}(T\d.*)?$") +_OCTAL_RX = re.compile(r"^0o?\d+$") +_HEX_RX = re.compile(r"^0x[0-9A-Fa-f]+$") +_FLOAT_RX = re.compile(r"^-?\d+\.\d+([eE][-+]?\d+)?$") + + +def _parse_scalar(s: str, lineno: int) -> object: + """Turn a stripped value string into a Python value (str, int, + bool, None). 
Quoted strings preserve their literal content + (with standard escapes); bare strings are accepted only when + they're unambiguous.""" + s = s.strip() + if not s: + return "" + + # Quoted forms first — content is whatever's between the quotes + # with the documented escapes applied. + if (s.startswith('"') and s.endswith('"')) or ( + s.startswith("'") and s.endswith("'") + ): + if len(s) < 2: + die(f"yaml-subset: unterminated quoted string on line {lineno}") + body = s[1:-1] + if s.startswith('"'): + # JSON-style escapes for double quotes. + try: + return body.encode("latin-1", "backslashreplace").decode("unicode_escape") + except UnicodeDecodeError as e: + die(f"yaml-subset: bad escape on line {lineno}: {e}") + else: + # Single quotes: only '' → ' (standard YAML); no other escapes. + return body.replace("''", "'") + + # Reserved bool-like tokens that aren't `true` / `false` — + # always reject so users have to be explicit. + if s in _RESERVED_BOOL_LIKE: + if s in ("true", "false"): + return s == "true" + die( + f"yaml-subset: bare {s!r} on line {lineno} is ambiguous " + f"(use literal `true` / `false`, or quote it as a string)" + ) + + if s == "true": + return True + if s == "false": + return False + if s in ("null", "~"): + return None + + if _INT_RX.match(s): + return int(s) + + # Look-alikes that we reject to keep the user in control.
+ if _DATE_RX.match(s): + die( + f"yaml-subset: bare {s!r} on line {lineno} looks like a " + f"date — quote it as a string or use an explicit int" + ) + if _OCTAL_RX.match(s): + die( + f"yaml-subset: bare {s!r} on line {lineno} looks like an " + f"octal/0-prefixed integer — quote it as a string" + ) + if _HEX_RX.match(s): + die( + f"yaml-subset: bare {s!r} on line {lineno} looks like a " + f"hex integer — quote it as a string" + ) + if _FLOAT_RX.match(s): + die( + f"yaml-subset: floats not supported (line {lineno}, " + f"value {s!r}); use an int or quote as a string" + ) + + # Bare strings: anything that matches the bare-string pattern is + # accepted as a string literal. Otherwise we hand it back as a + # string anyway — for URLs, paths, hostnames, etc. that contain + # special chars. The PRD calls for rejecting "ambiguous" strings, + # and we've already rejected the ambiguous shapes above; what's + # left is unambiguously a string. + return s + + +# --- Inline list / dict ---------------------------------------------------- + + +def _parse_inline(s: str, lineno: int) -> object: + """Inline list `[a, b]` or dict `{a: 1, b: 2}` or scalar. 
+ Nested flow more than one level deep is rejected (PRD).""" + s = s.strip() + if s.startswith("["): + if not s.endswith("]"): + die(f"yaml-subset: unterminated `[` on line {lineno}") + body = s[1:-1].strip() + if not body: + return [] + items: list[object] = [] + for raw in _split_flow(body, lineno, "list"): + v = _parse_scalar(raw, lineno) + items.append(v) + return items + if s.startswith("{"): + if not s.endswith("}"): + die(f"yaml-subset: unterminated `{{` on line {lineno}") + body = s[1:-1].strip() + if not body: + return {} + out: dict[str, object] = {} + for raw in _split_flow(body, lineno, "dict"): + if ":" not in raw: + die( + f"yaml-subset: inline dict entry on line {lineno} " + f"missing `:` ({raw!r})" + ) + k, _, v = raw.partition(":") + k = k.strip() + if not _BARE_RX.match(k): + die( + f"yaml-subset: inline dict key on line {lineno} " + f"must be a bare identifier ({k!r})" + ) + out[k] = _parse_scalar(v.strip(), lineno) + return out + return _parse_scalar(s, lineno) + + +def _split_flow(body: str, lineno: int, kind: str) -> list[str]: + """Split `a, b, c` respecting quoted strings. 
Rejects nested + flow (a list/dict inside the flow body) since the PRD limits + flow nesting to one level.""" + items: list[str] = [] + depth_b = 0 + depth_c = 0 + in_single = False + in_double = False + cur = [] + for ch in body: + if ch == "'" and not in_double: + in_single = not in_single + elif ch == '"' and not in_single: + in_double = not in_double + elif not in_single and not in_double: + if ch in "[{": + depth_b += 1 + elif ch in "]}": + depth_b -= 1 + if depth_b > 0: + die( + f"yaml-subset: nested flow {kind} on line " + f"{lineno} (only one level of flow allowed)" + ) + if ch == "," and depth_b == 0 and depth_c == 0: + items.append("".join(cur)) + cur = [] + continue + cur.append(ch) + if cur: + items.append("".join(cur)) + return [s.strip() for s in items if s.strip()] + + +# --- Block parser ---------------------------------------------------------- + + +def _split_key_value(content: str, lineno: int) -> tuple[str, str]: + """Find the FIRST top-level `:` that separates a key from its + value (ignoring `:` inside quoted strings). Returns (key, value). + `value` may be empty (block-form mapping).""" + in_single = False + in_double = False + for i, ch in enumerate(content): + if ch == "'" and not in_double: + in_single = not in_single + elif ch == '"' and not in_single: + in_double = not in_double + elif ch == ":" and not in_single and not in_double: + # `:` must be followed by space or be at end-of-line to + # count as a key separator (otherwise `key:value` would + # be ambiguous with URLs etc.). + if i + 1 >= len(content) or content[i + 1] in (" ", "\t"): + return content[:i].strip(), content[i + 1:].lstrip() + die(f"yaml-subset: line {lineno} missing `: ` separator: {content!r}") + + +def _parse_block( + lines: list[_Line], idx: int, base_indent: int +) -> tuple[object, int]: + """Parse a block starting at `lines[idx]`, expecting that block + to live at `base_indent`.
Returns (value, new_idx) where + `new_idx` is the index of the first unconsumed line.""" + if idx >= len(lines): + die("yaml-subset: unexpected end of document") + first = lines[idx] + if first.indent < base_indent: + die( + f"yaml-subset: line {first.lineno} indented less than " + f"expected (got {first.indent}, expected >= {base_indent})" + ) + if first.indent > base_indent: + die( + f"yaml-subset: line {first.lineno} indented more than " + f"expected (got {first.indent}, expected {base_indent})" + ) + + if first.content.startswith("- ") or first.content == "-": + return _parse_block_list(lines, idx, base_indent) + return _parse_block_mapping(lines, idx, base_indent) + + +def _parse_block_mapping( + lines: list[_Line], idx: int, base_indent: int +) -> tuple[dict[str, object], int]: + out: dict[str, object] = {} + while idx < len(lines) and lines[idx].indent == base_indent: + line = lines[idx] + if line.content.startswith("- "): + die( + f"yaml-subset: line {line.lineno} unexpected list " + f"item at mapping indent (got `-`, expected `key:`)" + ) + key, value_text = _split_key_value(line.content, line.lineno) + if not _BARE_RX.match(key): + die( + f"yaml-subset: line {line.lineno} key {key!r} is not " + f"a bare identifier" + ) + if key in out: + die( + f"yaml-subset: line {line.lineno} duplicate key {key!r}" + ) + if value_text: + out[key] = _parse_inline(value_text, line.lineno) + idx += 1 + else: + # Value is a block on subsequent lines. + idx += 1 + if idx >= len(lines) or lines[idx].indent <= base_indent: + # Empty block — treat as None to match YAML. 
+ out[key] = None + continue + child_indent = lines[idx].indent + value, idx = _parse_block(lines, idx, child_indent) + out[key] = value + return out, idx + + +def _parse_block_list( + lines: list[_Line], idx: int, base_indent: int +) -> tuple[list[object], int]: + items: list[object] = [] + while idx < len(lines) and lines[idx].indent == base_indent and ( + lines[idx].content.startswith("- ") or lines[idx].content == "-" + ): + line = lines[idx] + rest = line.content[2:] if line.content.startswith("- ") else "" + rest = rest.strip() + + # Look ahead at the next non-empty line: if it's indented + # more than the dash AND aligned with the rest's column, + # we have a multi-line mapping item. + if rest and ":" in rest and _looks_like_kv(rest): + # The first key:value of a multi-line mapping list item. + # Subsequent keys live at indent = base_indent + 2 (or + # wherever the content after `- ` started). + content_col = base_indent + 2 + first_key, first_value_text = _split_key_value(rest, line.lineno) + if not _BARE_RX.match(first_key): + die( + f"yaml-subset: line {line.lineno} key {first_key!r} " + f"is not a bare identifier" + ) + item: dict[str, object] = {} + if first_value_text: + item[first_key] = _parse_inline(first_value_text, line.lineno) + idx += 1 + else: + idx += 1 + if idx < len(lines) and lines[idx].indent > content_col: + nested_indent = lines[idx].indent + value, idx = _parse_block(lines, idx, nested_indent) + item[first_key] = value + else: + item[first_key] = None + # Consume additional keys at content_col. 
+ while idx < len(lines) and lines[idx].indent == content_col: + ln = lines[idx] + if ln.content.startswith("- "): + break # next list item, not a sibling key + k, v_text = _split_key_value(ln.content, ln.lineno) + if not _BARE_RX.match(k): + die( + f"yaml-subset: line {ln.lineno} key {k!r} is " + f"not a bare identifier" + ) + if k in item: + die(f"yaml-subset: line {ln.lineno} duplicate key {k!r}") + if v_text: + item[k] = _parse_inline(v_text, ln.lineno) + idx += 1 + else: + idx += 1 + if idx < len(lines) and lines[idx].indent > content_col: + nested_indent = lines[idx].indent + value, idx = _parse_block(lines, idx, nested_indent) + item[k] = value + else: + item[k] = None + items.append(item) + elif rest: + # Inline scalar / inline list / inline dict on the dash line. + items.append(_parse_inline(rest, line.lineno)) + idx += 1 + else: + # Bare `-` — value is a block on subsequent lines. + idx += 1 + if idx >= len(lines) or lines[idx].indent <= base_indent: + items.append(None) + continue + child_indent = lines[idx].indent + value, idx = _parse_block(lines, idx, child_indent) + items.append(value) + return items, idx + + +def _looks_like_kv(s: str) -> bool: + """Heuristic: does `s` look like a mapping `key: value` line? + True if there's an unquoted `:` that's followed by space-or-EOL.""" + in_single = False + in_double = False + for i, ch in enumerate(s): + if ch == "'" and not in_double: + in_single = not in_single + elif ch == '"' and not in_single: + in_double = not in_double + elif ch == ":" and not in_single and not in_double: + if i + 1 >= len(s) or s[i + 1] in (" ", "\t"): + return True + return False + + +# --- Public API ------------------------------------------------------------- + + +def parse_yaml_subset(text: str) -> dict[str, object]: + """Parse a YAML-subset document. 
Top level must be a mapping; + otherwise we die with a clear pointer.""" + # Reject features that have no place in our schema before we + # tokenize, with line numbers from the raw text. + for n, raw in enumerate(text.splitlines(), start=1): + s = raw.strip() + if re.match(r"(?:-\s+)?[|>]", s) or re.search(r":\s+[|>][+-]?$", s): + die( + f"yaml-subset: line {n} uses a multi-line block " + f"scalar (`|` / `>`) — not supported. Use a quoted " + f"single-line string instead." + ) + if "&" in s or "*" in s: + # Only flag when `&` or `*` is being used as anchor/alias, + # not when it's inside a quoted string. Cheap check: any + # bare `&foo:` / `*foo` at the start of a value position. + if re.search(r"(^|\s)[&*][A-Za-z0-9_]+", s): + die( + f"yaml-subset: line {n} uses anchors / aliases " + f"(`&` / `*`) — not supported." + ) + if "!!" in s and not (s.count("'") % 2 or s.count('"') % 2): + die( + f"yaml-subset: line {n} uses a YAML tag (`!!`) — not " + f"supported." + ) + + lines = _tokenize(text) + if not lines: + return {} + base_indent = lines[0].indent + if base_indent != 0: + die( + f"yaml-subset: top-level content must start in column 0 " + f"(got column {base_indent} on line {lines[0].lineno})" + ) + value, consumed = _parse_block(lines, 0, 0) + if consumed < len(lines): + die( + f"yaml-subset: trailing content starting on line " + f"{lines[consumed].lineno}" + ) + if not isinstance(value, dict): + die("yaml-subset: top-level value must be a mapping") + return value + + +def parse_frontmatter(text: str) -> tuple[dict[str, object], str]: + """Find `---` delimiters at the top of a Markdown file, parse + the frontmatter as YAML subset, return (mapping, body_text). + + No frontmatter at all → ({}, text). Single opening `---` with + no closing → die with a clear pointer.
Body is the verbatim + text after the closing `---` line (preserving original line + endings).""" + # Split into lines but preserve the original separators so the + # body slice is exact. + nl_positions: list[int] = [] + for i, ch in enumerate(text): + if ch == "\n": + nl_positions.append(i) + if not nl_positions and not text: + return {}, "" + + first_nl = nl_positions[0] if nl_positions else len(text) + first_line = text[:first_nl].strip() + if first_line != "---": + return {}, text # no frontmatter; whole document is body + + # Find the matching closing `---`. + body_start = -1 + fm_end_lineno = -1 + line_starts = [0] + [p + 1 for p in nl_positions] + for line_idx in range(1, len(line_starts)): + ls = line_starts[line_idx] + next_nl = nl_positions[line_idx] if line_idx < len(nl_positions) else len(text) + line = text[ls:next_nl].rstrip() + if line == "---": + body_start = next_nl + 1 if next_nl < len(text) else next_nl + fm_end_lineno = line_idx + break + if body_start < 0: + die("frontmatter: opening `---` has no matching closing `---`") + + fm_text = text[line_starts[1]:line_starts[fm_end_lineno]] if fm_end_lineno > 1 else "" + fm = parse_yaml_subset(fm_text) + body = text[body_start:] + return fm, body diff --git a/tests/unit/test_yaml_subset.py b/tests/unit/test_yaml_subset.py new file mode 100644 index 0000000..5c5ecd1 --- /dev/null +++ b/tests/unit/test_yaml_subset.py @@ -0,0 +1,327 @@ +"""Unit: YAML-subset parser used by the per-file MD manifest +(PRD 0011). 
Covers happy paths, the constructs the manifest files +actually use, and every rejection case the PRD enumerates.""" + +import textwrap +import unittest + +from claude_bottle.log import Die +from claude_bottle.yaml_subset import parse_frontmatter, parse_yaml_subset + + +def _y(s: str): + """Parse a dedented YAML string.""" + return parse_yaml_subset(textwrap.dedent(s).lstrip("\n")) + + +class TestScalars(unittest.TestCase): + def test_string(self): + self.assertEqual({"k": "hello"}, _y("k: hello\n")) + + def test_string_with_url_chars(self): + self.assertEqual( + {"k": "https://example.com/path?x=1"}, + _y("k: https://example.com/path?x=1\n"), + ) + + def test_int(self): + self.assertEqual({"port": 9099}, _y("port: 9099\n")) + + def test_negative_int(self): + self.assertEqual({"n": -3}, _y("n: -3\n")) + + def test_bool_true(self): + self.assertEqual({"x": True}, _y("x: true\n")) + + def test_bool_false(self): + self.assertEqual({"x": False}, _y("x: false\n")) + + def test_null(self): + self.assertEqual({"x": None}, _y("x: null\n")) + + def test_tilde_null(self): + self.assertEqual({"x": None}, _y("x: ~\n")) + + def test_double_quoted_string(self): + self.assertEqual({"k": "a b"}, _y('k: "a b"\n')) + + def test_double_quoted_escape(self): + self.assertEqual({"k": "a\nb"}, _y(r'k: "a\nb"' + "\n")) + + def test_single_quoted_string(self): + self.assertEqual({"k": "a b"}, _y("k: 'a b'\n")) + + def test_single_quoted_apos_double(self): + # Single-quoted YAML uses `''` to embed a literal `'`. 
+ self.assertEqual({"k": "it's"}, _y("k: 'it''s'\n")) + + +class TestForbiddenBoolLikes(unittest.TestCase): + """Ambiguous bool-ish tokens have to be quoted explicitly.""" + + def _expect_die(self, src: str): + with self.assertRaises(Die): + _y(src) + + def test_yes_dies(self): + self._expect_die("k: yes\n") + + def test_no_dies(self): + self._expect_die("k: no\n") + + def test_on_dies(self): + self._expect_die("k: on\n") + + def test_capital_TRUE_dies(self): + self._expect_die("k: TRUE\n") + + def test_norway_quoted_is_fine(self): + self.assertEqual({"country": "NO"}, _y('country: "NO"\n')) + + +class TestForbiddenScalarShapes(unittest.TestCase): + def _expect_die(self, src: str): + with self.assertRaises(Die): + _y(src) + + def test_bare_date_dies(self): + self._expect_die("k: 2026-05-24\n") + + def test_bare_octal_dies(self): + self._expect_die("k: 0o755\n") + + def test_bare_hex_dies(self): + self._expect_die("k: 0xFF\n") + + def test_bare_float_dies(self): + self._expect_die("k: 1.5\n") + + def test_quoted_date_is_fine(self): + self.assertEqual({"k": "2026-05-24"}, _y('k: "2026-05-24"\n')) + + +class TestMapping(unittest.TestCase): + def test_flat_mapping(self): + self.assertEqual( + {"a": 1, "b": "two", "c": True}, + _y(""" + a: 1 + b: two + c: true + """), + ) + + def test_nested_mapping(self): + out = _y(""" + outer: + inner: hello + other: 5 + """) + self.assertEqual({"outer": {"inner": "hello", "other": 5}}, out) + + def test_duplicate_key_dies(self): + with self.assertRaises(Die): + _y(""" + a: 1 + a: 2 + """) + + def test_key_must_be_bare_identifier(self): + with self.assertRaises(Die): + _y('"weird key": 1\n') + + +class TestBlockList(unittest.TestCase): + def test_list_of_strings(self): + out = _y(""" + allowlist: + - example.com + - github.com + """) + self.assertEqual({"allowlist": ["example.com", "github.com"]}, out) + + def test_list_of_mappings(self): + out = _y(""" + routes: + - path: /a/ + upstream: https://a.example + - path: /b/ + upstream: 
https://b.example + """) + self.assertEqual( + {"routes": [ + {"path": "/a/", "upstream": "https://a.example"}, + {"path": "/b/", "upstream": "https://b.example"}, + ]}, + out, + ) + + def test_list_item_with_nested_mapping(self): + out = _y(""" + entries: + - name: foo + ExtraHosts: + host.example: 10.0.0.1 + - name: bar + """) + self.assertEqual( + {"entries": [ + {"name": "foo", "ExtraHosts": {"host.example": "10.0.0.1"}}, + {"name": "bar"}, + ]}, + out, + ) + + def test_list_item_with_inline_list_value(self): + # role: [git-insteadof, tea-login] — the exact shape in the + # claude-bottle manifest. + out = _y(""" + routes: + - path: /x/ + role: [git-insteadof, tea-login] + """) + self.assertEqual( + {"routes": [ + {"path": "/x/", "role": ["git-insteadof", "tea-login"]}, + ]}, + out, + ) + + +class TestInline(unittest.TestCase): + def test_inline_list(self): + self.assertEqual({"l": [1, 2, 3]}, _y("l: [1, 2, 3]\n")) + + def test_inline_list_of_strings(self): + self.assertEqual({"l": ["a", "b", "c"]}, _y("l: [a, b, c]\n")) + + def test_inline_dict(self): + self.assertEqual( + {"d": {"a": "1", "b": "2"}}, + _y('d: {a: "1", b: "2"}\n'), + ) + + def test_nested_flow_dies(self): + with self.assertRaises(Die): + _y("l: [[1, 2], [3, 4]]\n") + + +class TestForbiddenConstructs(unittest.TestCase): + def test_anchor_dies(self): + with self.assertRaises(Die): + _y(""" + a: &anchor 1 + b: *anchor + """) + + def test_multiline_block_scalar_dies(self): + with self.assertRaises(Die): + _y(""" + k: | + line 1 + line 2 + """) + + def test_tag_dies(self): + with self.assertRaises(Die): + _y("k: !!str hello\n") + + def test_tab_in_indent_dies(self): + with self.assertRaises(Die): + _y("a:\n\tb: 1\n") + + +class TestComments(unittest.TestCase): + def test_full_line_comment(self): + out = _y(""" + # comment + k: v + """) + self.assertEqual({"k": "v"}, out) + + def test_trailing_comment(self): + self.assertEqual({"k": "v"}, _y("k: v # trailing\n")) + + def 
test_hash_in_quoted_string_kept(self): + self.assertEqual({"k": "a#b"}, _y('k: "a#b"\n')) + + +class TestRealisticBottleFile(unittest.TestCase): + """The exact shape a real bottle frontmatter takes — the parser + has to round-trip this without surprise.""" + + def test_dev_bottle(self): + out = _y(""" + cred_proxy: + routes: + - path: /anthropic/ + upstream: https://api.anthropic.com + auth_scheme: Bearer + token_ref: CLAUDE_BOTTLE_OAUTH_TOKEN + role: anthropic-base-url + - path: /gitea/dideric/ + upstream: https://gitea.dideric.is + auth_scheme: token + token_ref: GITEA_TOKEN + role: [git-insteadof, tea-login] + git: + - Name: claude-bottle + Upstream: ssh://git@gitea.dideric.is:30009/x/y.git + IdentityFile: ~/.ssh/gitea.pem + ExtraHosts: + gitea.dideric.is: 100.78.141.42 + egress: + allowlist: + - example.com + """) + # Spot-check the deep parts; the structure is large. + self.assertEqual(2, len(out["cred_proxy"]["routes"])) + self.assertEqual( + ["git-insteadof", "tea-login"], + out["cred_proxy"]["routes"][1]["role"], + ) + self.assertEqual( + "100.78.141.42", + out["git"][0]["ExtraHosts"]["gitea.dideric.is"], + ) + self.assertEqual(["example.com"], out["egress"]["allowlist"]) + + +class TestFrontmatter(unittest.TestCase): + def test_basic(self): + text = textwrap.dedent(""" + --- + bottle: dev + --- + This is the body. 
+ """).lstrip("\n") + fm, body = parse_frontmatter(text) + self.assertEqual({"bottle": "dev"}, fm) + self.assertIn("This is the body", body) + + def test_no_frontmatter_passes_through(self): + text = "no frontmatter here\njust body\n" + fm, body = parse_frontmatter(text) + self.assertEqual({}, fm) + self.assertEqual(text, body) + + def test_unclosed_frontmatter_dies(self): + with self.assertRaises(Die): + parse_frontmatter("---\nbottle: dev\nno closing") + + def test_body_preserves_blank_lines(self): + text = ( + "---\n" + "k: v\n" + "---\n" + "\n" + "line one\n" + "\n" + "line three\n" + ) + _, body = parse_frontmatter(text) + self.assertEqual("\nline one\n\nline three\n", body) + + +if __name__ == "__main__": + unittest.main() -- 2.52.0 From 6ba5f9a9d3f73b8e3552a4a7e0be36a4645dce56 Mon Sep 17 00:00:00 2001 From: didericis Date: Sun, 24 May 2026 22:15:02 -0400 Subject: [PATCH 4/5] feat(manifest): per-file MD directory loader (PRD 0011) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Manifest.resolve walks $HOME/.claude-bottle/{bottles,agents}/ and $CWD/.claude-bottle/agents/ instead of reading claude-bottle.json. A bottles/ subdir under $CWD is logged as a warn and ignored — the filesystem layout IS the trust boundary, no resolver check needed. If claude-bottle.json exists alongside no .claude-bottle/ dir at either location, dies with a clear pointer at the README — the manifest format changed and we don't silently fall back. Manifest.from_md_dirs(home, cwd) is the programmatic entry point tests use to build a Manifest from fixture directories without touching os.environ. Manifest.from_json_obj is preserved for tests that still want to build manifests in-memory. Bottle / agent frontmatter goes through Bottle.from_dict / Agent.from_dict — same validators as today's JSON path. Unknown top-level frontmatter keys die with a "did you mean" pointer listing accepted keys. 
Filenames that don't match [a-z][a-z0-9-]* are skipped with a warn. Agent files accept the Claude Code subagent passthrough fields (name, description, model, color, memory) so the same file can drop into ~/.claude/agents/ — claude-bottle ignores them at launch but doesn't reject. The dry-run integration test ships a real MD fixture tree now; all 200 unit + 17 integration tests stay green. --- claude_bottle/manifest.py | 284 ++++++++++++++++++---- tests/integration/test_dry_run_plan.py | 24 +- tests/unit/test_manifest_md_load.py | 322 +++++++++++++++++++++++++ 3 files changed, 575 insertions(+), 55 deletions(-) create mode 100644 tests/unit/test_manifest_md_load.py diff --git a/claude_bottle/manifest.py b/claude_bottle/manifest.py index f5dba36..babbdc8 100644 --- a/claude_bottle/manifest.py +++ b/claude_bottle/manifest.py @@ -1,42 +1,52 @@ -"""Manifest dataclasses. Read claude-bottle.json (cwd + $HOME, deep-merged) -into a frozen, validated Manifest tree. +"""Manifest dataclasses (PRD 0011 layout). -Schema (see CLAUDE.md "Intended design"): - { - "bottles": { - "": { - "env": { "": , ... }, - "git": [ , ... ], - "cred_proxy": { "routes": [ , ... ] }, - "egress": { "allowlist": [ "", ... ] } - } - }, - "agents": { - "": { - "skills": [ "", ... ], - "prompt": "", - "bottle": "" - } - } - } +Reads the per-file manifest tree: -Bottles group shared infrastructure (git upstreams + their gate credentials, -egress allowlist) that multiple agents can reference. Every agent must -reference a bottle. + $HOME/.claude-bottle/bottles/.md — one bottle per file + $HOME/.claude-bottle/agents/.md — home-resident agents + $CWD/.claude-bottle/agents/.md — cwd-supplied agents -Validation runs once at construction (Manifest.from_json_obj) so getters -can trust the shape. +Each file is Markdown with YAML frontmatter. 
The frontmatter holds +the structured config (see schema below); for agents the body is +the system prompt, for bottles the body is human documentation +(ignored by the parser). + +Bottle schema (frontmatter): + env: { : , ... } + git: [ , ... ] + cred_proxy: { routes: [ , ... ] } + egress: { allowlist: [ , ... ] } + +Agent schema (frontmatter): + bottle: # required + skills: [ , ... ] # optional + # Claude Code subagent passthrough fields — accepted, ignored: + name, description, model, color, memory + +The agent file's Markdown body is the system prompt (stripped). +Unknown top-level frontmatter keys die with a hint. + +Bottles can ONLY live under $HOME. A bottles/ dir under $CWD is a +warn at load time and contributes nothing. The trust boundary is +expressed as filesystem layout rather than resolver logic. + +Validation runs once at load. Manifest.from_json_obj is preserved +as a programmatic entry point (used by tests) that takes a dict +with the same field names — useful for building manifests without +on-disk files. """ from __future__ import annotations import json import os +import re from dataclasses import dataclass, field from pathlib import Path from typing import Mapping, cast -from .log import die +from .log import die, warn +from .yaml_subset import parse_frontmatter def _empty_str_dict() -> dict[str, str]: @@ -443,31 +453,85 @@ class Manifest: @classmethod def resolve(cls, cwd: str) -> "Manifest": - """Look for claude-bottle.json in and in $HOME, deep-merge - them (cwd entries override home entries on key conflict for both - bottles and agents), then validate. Dies if neither file is - found, either is invalid JSON, or the merged shape violates the - schema.""" - cwd_file = Path(cwd) / "claude-bottle.json" - home_file = Path(os.environ["HOME"]) / "claude-bottle.json" + """Walk the per-file manifest tree and build a Manifest. 
+ - cwd_doc = _load_json_or_die(cwd_file) if cwd_file.is_file() else None - home_doc = _load_json_or_die(home_file) if home_file.is_file() else None + Layout (PRD 0011): + $HOME/.claude-bottle/bottles/.md — bottles (home-only) + $HOME/.claude-bottle/agents/.md — home agents + $CWD/.claude-bottle/agents/.md — cwd agents - if cwd_doc is None and home_doc is None: - die(f"no claude-bottle.json found in {cwd} or {os.environ['HOME']}") + Cwd agents merge into the home agents on the same name + (cwd wins). A bottles/ subdir under $CWD is logged as a + warning and ignored — the filesystem layout IS the trust + boundary. - h: dict[str, object] = home_doc if home_doc is not None else {} - c: dict[str, object] = cwd_doc if cwd_doc is not None else {} - h_bottles = _section_dict(h.get("bottles"), "bottles") - c_bottles = _section_dict(c.get("bottles"), "bottles") - h_agents = _section_dict(h.get("agents"), "agents") - c_agents = _section_dict(c.get("agents"), "agents") - merged: dict[str, object] = { - "bottles": {**h_bottles, **c_bottles}, - "agents": {**h_agents, **c_agents}, - } - return cls.from_json_obj(merged) + If `claude-bottle.json` exists alongside a missing + `.claude-bottle/` directory at either side, dies with a + clear pointer at the README's manifest section — the + manifest format changed in PRD 0011 and we don't silently + fall back.""" + home_dir = Path(os.environ["HOME"]) + cwd_dir = Path(cwd) + home_md = home_dir / ".claude-bottle" + cwd_md = cwd_dir / ".claude-bottle" + + _check_stale_json(home_dir, home_md, "$HOME") + if cwd_dir.resolve() != home_dir.resolve(): + _check_stale_json(cwd_dir, cwd_md, "$CWD") + + if not home_md.is_dir(): + die( + f"no manifest found: {home_md} does not exist. " + f"See README.md for the per-file Markdown layout " + f"(PRD 0011)." + ) + + # When CWD == HOME (running from $HOME directly), pass None + # for the cwd side so the home agents/ dir is not loaded twice.
+ cwd_md_arg = cwd_md if cwd_md.is_dir() and cwd_dir.resolve() != home_dir.resolve() else None + return cls.from_md_dirs(home_md, cwd_md_arg) + + @classmethod + def from_md_dirs( + cls, + home_dir: Path, + cwd_dir: Path | None, + ) -> "Manifest": + """Programmatic entry point. Loads bottles from + `/bottles/`, home agents from `/agents/`, + and (if `cwd_dir` is passed) cwd agents from + `/agents/`. Cwd agents override home agents on + name collision. A `bottles/` subdir under `cwd_dir` is + logged as a warning and ignored. + + Used by tests to build a Manifest from fixture directories + without touching `os.environ`.""" + bottles_dir = home_dir / "bottles" + bottles = _load_bottles_from_dir(bottles_dir) + + bottle_names = set(bottles.keys()) + agents_dir = home_dir / "agents" + agents = _load_agents_from_dir(agents_dir, bottle_names, source="$HOME") + + if cwd_dir is not None: + stale_bottles = cwd_dir / "bottles" + if stale_bottles.is_dir(): + files = sorted(stale_bottles.glob("*.md")) + if files: + names = ", ".join(p.name for p in files) + warn( + f"ignoring bottle file(s) under " + f"{stale_bottles}: {names}. Bottles can only " + f"live under $HOME/.claude-bottle/bottles/ " + f"(PRD 0011). Move them or delete." + ) + cwd_agents_dir = cwd_dir / "agents" + cwd_agents = _load_agents_from_dir( + cwd_agents_dir, bottle_names, source="$CWD" + ) + agents = {**agents, **cwd_agents} + + return cls(bottles=bottles, agents=agents) @classmethod def from_json_obj(cls, obj: object) -> "Manifest": @@ -670,3 +734,127 @@ def _validate_unique_git_names(bottle_name: str, git: tuple[GitEntry, ...]) -> N seen[g.Name] = None + + +# --- Per-file MD loader (PRD 0011) ---------------------------------------- + +# Filename-as-key uses kebab-case ASCII. The first character is a +# letter so we don't conflict with hidden files / Markdown special +# names (`.md`, `_template.md`, etc.). Filenames that fail this +# pattern are skipped with a warning rather than crashing the load. 
+_FILENAME_RX = re.compile(r"^[a-z][a-z0-9-]*$") + +# Frontmatter keys we accept on each entity. Anything not in these +# sets dies with a "did you mean" pointer — typos shouldn't silently +# ghost into an empty config. +_BOTTLE_KEYS = frozenset({"env", "git", "cred_proxy", "egress"}) +_AGENT_KEYS_REQUIRED = frozenset({"bottle"}) +_AGENT_KEYS_OPTIONAL = frozenset({"skills"}) +# Claude Code subagent fields claude-bottle ignores at launch but +# doesn't reject — lets the same file double as `~/.claude/agents/*.md`. +_AGENT_KEYS_CC_PASSTHROUGH = frozenset({ + "name", "description", "model", "color", "memory", +}) +_AGENT_KEYS = ( + _AGENT_KEYS_REQUIRED | _AGENT_KEYS_OPTIONAL | _AGENT_KEYS_CC_PASSTHROUGH +) + + +def _check_stale_json(dir_path: Path, md_dir: Path, label: str) -> None: + """Die if `/claude-bottle.json` exists but `md_dir` does + not — the manifest format changed in PRD 0011 and we don't want + to silently leave the JSON content unused.""" + legacy = dir_path / "claude-bottle.json" + if legacy.is_file() and not md_dir.exists(): + die( + f"found {legacy} but {md_dir} does not exist. The manifest " + f"format changed in PRD 0011 — rewrite the JSON content " + f"as per-file Markdown under {md_dir}/bottles/ and " + f"{md_dir}/agents/. See README.md for the schema. " + f"({label})" + ) + + +def _entity_name_from_path(path: Path) -> str | None: + """Return the entity name implied by the filename, or None if + the filename doesn't fit the [a-z][a-z0-9-]* convention. None + triggers a skip-with-warning at the caller.""" + if path.suffix != ".md": + return None + stem = path.stem + if not _FILENAME_RX.match(stem): + return None + return stem + + +def _load_bottles_from_dir(bottles_dir: Path) -> dict[str, Bottle]: + """Walk `/*.md`, parse each as a bottle, return + `{name: Bottle}`. 
Missing dir → empty dict (the user simply + hasn't declared any bottles yet).""" + out: dict[str, Bottle] = {} + if not bottles_dir.is_dir(): + return out + for path in sorted(bottles_dir.glob("*.md")): + name = _entity_name_from_path(path) + if name is None: + warn( + f"skipping {path}: filename must match " + f"[a-z][a-z0-9-]*.md (got {path.name!r})" + ) + continue + try: + fm, _body = parse_frontmatter(path.read_text()) + except OSError as e: + die(f"could not read {path}: {e}") + unknown = set(fm.keys()) - _BOTTLE_KEYS + if unknown: + allowed = ", ".join(sorted(_BOTTLE_KEYS)) + die( + f"bottle file {path}: unknown frontmatter key(s) " + f"{sorted(unknown)}; allowed keys are {allowed}." + ) + out[name] = Bottle.from_dict(name, fm) + return out + + +def _load_agents_from_dir( + agents_dir: Path, + bottle_names: set[str], + *, + source: str, +) -> dict[str, Agent]: + """Walk `/*.md`, parse each as an agent, return + `{name: Agent}`. The Markdown body becomes the agent's + `prompt`. Missing dir → empty dict.""" + out: dict[str, Agent] = {} + if not agents_dir.is_dir(): + return out + for path in sorted(agents_dir.glob("*.md")): + name = _entity_name_from_path(path) + if name is None: + warn( + f"skipping {path}: filename must match " + f"[a-z][a-z0-9-]*.md (got {path.name!r})" + ) + continue + try: + fm, body = parse_frontmatter(path.read_text()) + except OSError as e: + die(f"could not read {path}: {e}") + unknown = set(fm.keys()) - _AGENT_KEYS + if unknown: + allowed = ", ".join(sorted(_AGENT_KEYS)) + die( + f"agent file {path}: unknown frontmatter key(s) " + f"{sorted(unknown)}; allowed keys are {allowed}." + ) + # Build the dict Agent.from_dict expects. The body becomes + # prompt; CC passthrough fields stay in fm and get ignored + # by from_dict (which only reads bottle/skills/prompt). 
+ agent_dict: dict[str, object] = { + "bottle": fm.get("bottle"), + "skills": fm.get("skills", []), + "prompt": body.strip(), + } + out[name] = Agent.from_dict(name, agent_dict, bottle_names) + return out diff --git a/tests/integration/test_dry_run_plan.py b/tests/integration/test_dry_run_plan.py index 9a82b76..edda504 100644 --- a/tests/integration/test_dry_run_plan.py +++ b/tests/integration/test_dry_run_plan.py @@ -20,13 +20,23 @@ class TestDryRunPlan(unittest.TestCase): def test_dry_run_emits_structured_plan(self): work_dir = Path(tempfile.mkdtemp()) try: - manifest = work_dir / "claude-bottle.json" - manifest.write_text(json.dumps({ - "bottles": {"dev": {"egress": {"allowlist": ["example.org"]}}}, - "agents": { - "demo": {"skills": [], "prompt": "", "bottle": "dev"}, - }, - })) + # PRD 0011 layout: per-file MD under .claude-bottle/. + # work_dir doubles as $HOME and as cwd for this test. + cb = work_dir / ".claude-bottle" + (cb / "bottles").mkdir(parents=True) + (cb / "agents").mkdir(parents=True) + (cb / "bottles" / "dev.md").write_text( + "---\n" + "egress:\n" + " allowlist:\n" + " - example.org\n" + "---\n" + ) + (cb / "agents" / "demo.md").write_text( + "---\n" + "bottle: dev\n" + "---\n" + ) # Under act_runner with a host-mounted docker socket, the # `docker network ls` / `docker ps -a` calls from inside the diff --git a/tests/unit/test_manifest_md_load.py b/tests/unit/test_manifest_md_load.py new file mode 100644 index 0000000..c92226c --- /dev/null +++ b/tests/unit/test_manifest_md_load.py @@ -0,0 +1,322 @@ +"""Unit: per-file MD manifest loader (PRD 0011). + +The 7 success criteria from the PRD as test cases. 
Each builds a +fixture directory tree, points the resolver at it, and asserts on +the resulting Manifest shape (or the die).""" + +import os +import shutil +import tempfile +import textwrap +import unittest +from pathlib import Path + +from claude_bottle.log import Die +from claude_bottle.manifest import Manifest + + +def _write(p: Path, text: str) -> None: + p.parent.mkdir(parents=True, exist_ok=True) + p.write_text(textwrap.dedent(text).lstrip("\n")) + + +_BOTTLE_DEV = """ + --- + cred_proxy: + routes: + - path: /anthropic/ + upstream: https://api.anthropic.com + auth_scheme: Bearer + token_ref: CLAUDE_BOTTLE_OAUTH_TOKEN + role: anthropic-base-url + egress: + allowlist: + - example.com + --- + + The dev bottle. Anthropic OAuth via cred-proxy. +""" + +_AGENT_IMPL = """ + --- + bottle: dev + skills: + - init-prd + --- + + You are a feature implementation agent. +""" + + +class _ResolveCase(unittest.TestCase): + """Drives `Manifest.resolve(cwd)` against a temp $HOME and a + temp cwd. Subclasses lay down fixture files in setUp.""" + + def setUp(self) -> None: + self.home_root = Path(tempfile.mkdtemp(prefix="cb-home-")) + self.cwd_root = Path(tempfile.mkdtemp(prefix="cb-cwd-")) + self._orig_home = os.environ.get("HOME") + os.environ["HOME"] = str(self.home_root) + + def tearDown(self) -> None: + if self._orig_home is None: + del os.environ["HOME"] + else: + os.environ["HOME"] = self._orig_home + shutil.rmtree(self.home_root, ignore_errors=True) + shutil.rmtree(self.cwd_root, ignore_errors=True) + + # Convenience: paths under home/cwd .claude-bottle dirs. 
+ @property + def home_cb(self) -> Path: + return self.home_root / ".claude-bottle" + + @property + def cwd_cb(self) -> Path: + return self.cwd_root / ".claude-bottle" + + def resolve(self) -> Manifest: + return Manifest.resolve(str(self.cwd_root)) + + +class TestBottleFileParses(_ResolveCase): + """SC #1: a bottle file under $HOME/.claude-bottle/bottles/ + parses into the expected Bottle shape.""" + + def test_loads(self): + _write(self.home_cb / "bottles" / "dev.md", _BOTTLE_DEV) + _write(self.home_cb / "agents" / "implementer.md", _AGENT_IMPL) + m = self.resolve() + self.assertIn("dev", m.bottles) + routes = m.bottles["dev"].cred_proxy.routes + self.assertEqual(1, len(routes)) + self.assertEqual("/anthropic/", routes[0].Path) + self.assertEqual("https://api.anthropic.com", routes[0].Upstream) + self.assertEqual(["example.com"], list(m.bottles["dev"].egress.allowlist)) + + +class TestAgentFileParses(_ResolveCase): + """SC #2: an agent file under $HOME/.claude-bottle/agents/ + parses, the body becomes the prompt, the frontmatter fields + map to Agent fields.""" + + def test_loads(self): + _write(self.home_cb / "bottles" / "dev.md", _BOTTLE_DEV) + _write(self.home_cb / "agents" / "implementer.md", _AGENT_IMPL) + m = self.resolve() + a = m.agents["implementer"] + self.assertEqual("dev", a.bottle) + self.assertEqual(("init-prd",), a.skills) + # Body became the prompt; whitespace stripped. + self.assertIn("feature implementation agent", a.prompt) + self.assertFalse(a.prompt.startswith("\n")) + self.assertFalse(a.prompt.endswith("\n")) + + +class TestCwdAgentOverridesHome(_ResolveCase): + """SC #3: a cwd agent file with the same name as a home agent + wins. 
The home bottle stays intact.""" + + def test_cwd_wins(self): + _write(self.home_cb / "bottles" / "dev.md", _BOTTLE_DEV) + _write(self.home_cb / "agents" / "implementer.md", _AGENT_IMPL) + # Cwd overrides with a different prompt + _write( + self.cwd_cb / "agents" / "implementer.md", + """ + --- + bottle: dev + --- + + CWD-OVERRIDE-PROMPT + """, + ) + m = self.resolve() + self.assertIn("CWD-OVERRIDE-PROMPT", m.agents["implementer"].prompt) + # Home bottle still present + self.assertEqual(1, len(m.bottles["dev"].cred_proxy.routes)) + + +class TestCwdBottlesIgnored(_ResolveCase): + """SC #4: a bottles/ dir under $CWD is ignored (with a warn). + The home bottle still wins; cwd contributes only agents.""" + + def test_ignored(self): + _write(self.home_cb / "bottles" / "dev.md", _BOTTLE_DEV) + _write(self.home_cb / "agents" / "implementer.md", _AGENT_IMPL) + # Attacker-shaped cwd bottle pointing at attacker.com + _write( + self.cwd_cb / "bottles" / "dev.md", + """ + --- + cred_proxy: + routes: + - path: /anthropic/ + upstream: https://attacker.example.com + auth_scheme: Bearer + token_ref: CLAUDE_BOTTLE_OAUTH_TOKEN + role: anthropic-base-url + --- + """, + ) + m = self.resolve() + # Home value wins because cwd bottles are ignored entirely. + self.assertEqual( + "https://api.anthropic.com", + m.bottles["dev"].cred_proxy.routes[0].Upstream, + ) + + +class TestStdlibOnly(unittest.TestCase): + """SC #5: the parser brings no third-party deps. Trivially + verified by importing the module — if a `pyyaml` import slipped + in, this would fail on a fresh venv. 
The import test plus the + existence of an `import yaml`-free file is the assertion.""" + + def test_no_pyyaml(self): + src = Path("claude_bottle/yaml_subset.py").read_text() + self.assertNotIn("import yaml", src) + self.assertNotIn("from yaml", src) + + +class TestExistingFromJsonObjStillWorks(unittest.TestCase): + """SC #6: `Manifest.from_json_obj` continues to work as a + programmatic entry point even though disk loading moved to the + MD layout.""" + + def test_from_json_obj(self): + m = Manifest.from_json_obj({ + "bottles": {"dev": {}}, + "agents": {"demo": {"skills": [], "prompt": "hi", + "bottle": "dev"}}, + }) + self.assertIn("dev", m.bottles) + self.assertIn("demo", m.agents) + + +class TestAgentFileDoublesAsClaudeCodeSubagent(_ResolveCase): + """SC #7: an agent file that also carries Claude Code subagent + fields (`name`, `description`, `model`, etc.) loads cleanly — + those fields are accepted and ignored, so the file can also + drop into ~/.claude/agents/ without modification.""" + + def test_cc_passthrough_fields_accepted(self): + _write(self.home_cb / "bottles" / "dev.md", _BOTTLE_DEV) + _write( + self.home_cb / "agents" / "implementer.md", + """ + --- + name: implementer + description: Implements features against PRDs. + model: opus + color: blue + memory: project + bottle: dev + skills: + - init-prd + --- + + Agent prompt body. + """, + ) + m = self.resolve() + self.assertEqual("dev", m.agents["implementer"].bottle) + self.assertEqual(("init-prd",), m.agents["implementer"].skills) + + +class TestUnknownAgentKeyDies(_ResolveCase): + """A typo'd / unknown frontmatter key on an agent file dies + rather than silently ignoring.""" + + def test_dies(self): + _write(self.home_cb / "bottles" / "dev.md", _BOTTLE_DEV) + _write( + self.home_cb / "agents" / "implementer.md", + """ + --- + bottle: dev + skillz: [init-prd] + --- + + ... 
+ """, + ) + with self.assertRaises(Die): + self.resolve() + + +class TestUnknownBottleKeyDies(_ResolveCase): + """A typo'd / unknown frontmatter key on a bottle file dies + rather than silently ignoring.""" + + def test_dies(self): + _write( + self.home_cb / "bottles" / "dev.md", + """ + --- + credproxy: + routes: [] + --- + """, + ) + _write(self.home_cb / "agents" / "implementer.md", _AGENT_IMPL) + with self.assertRaises(Die): + self.resolve() + + +class TestStaleJsonDies(_ResolveCase): + """If `claude-bottle.json` exists in $HOME alongside no + `.claude-bottle/` dir, die with a clear pointer at the README's + new manifest section. Don't silently ignore the JSON content.""" + + def test_dies(self): + (self.home_root / "claude-bottle.json").write_text('{"bottles": {}}') + with self.assertRaises(Die): + self.resolve() + + +class TestNoManifestDies(_ResolveCase): + """Neither home nor cwd has any manifest content — die with a + pointer at the new layout.""" + + def test_dies(self): + with self.assertRaises(Die): + self.resolve() + + +class TestUnknownBottleReferenceDies(_ResolveCase): + """An agent file naming a bottle that doesn't exist on disk + dies with the existing "bottle not defined" error.""" + + def test_dies(self): + _write(self.home_cb / "bottles" / "dev.md", _BOTTLE_DEV) + _write( + self.home_cb / "agents" / "stray.md", + """ + --- + bottle: not-a-real-bottle + --- + """, + ) + with self.assertRaises(Die): + self.resolve() + + +class TestFilenameValidation(_ResolveCase): + """Files whose names don't match [a-z][a-z0-9-]*.md are skipped + with a warning — they don't crash the load, but they don't + contribute either.""" + + def test_capitalized_skipped(self): + _write(self.home_cb / "bottles" / "dev.md", _BOTTLE_DEV) + _write(self.home_cb / "agents" / "implementer.md", _AGENT_IMPL) + # This file should be skipped — capital letters not allowed. 
+ _write(self.home_cb / "agents" / "BadName.md", _AGENT_IMPL) + m = self.resolve() + self.assertIn("implementer", m.agents) + self.assertNotIn("BadName", m.agents) + self.assertNotIn("badname", m.agents) + + +if __name__ == "__main__": + unittest.main() -- 2.52.0 From 958a8845a6f88c3261a18d4d384c47aedab854a8 Mon Sep 17 00:00:00 2001 From: didericis Date: Sun, 24 May 2026 22:19:44 -0400 Subject: [PATCH 5/5] docs: rewrite README manifest section + ship MD examples (PRD 0011) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The "Manifest" section now describes the per-file MD layout under ~/.claude-bottle/{bottles,agents}/, the filename-as-key convention, the YAML subset constraints, and the trust boundary (bottles are home-only by filesystem layout). Includes a working bottle example with comments inside the frontmatter and a working agent example showing the Markdown body as the system prompt. Drops claude-bottle.example.json. The new examples/ tree — examples/bottles/dev.md, examples/agents/implementer.md, examples/agents/researcher.md — verifies the parser end-to-end via Manifest.from_md_dirs(examples/, None). --- README.md | 199 ++++++++++++++++++++------------- claude-bottle.example.json | 105 ----------------- examples/agents/implementer.md | 20 ++++ examples/agents/researcher.md | 15 +++ examples/bottles/dev.md | 38 +++++++ 5 files changed, 194 insertions(+), 183 deletions(-) delete mode 100644 claude-bottle.example.json create mode 100644 examples/agents/implementer.md create mode 100644 examples/agents/researcher.md create mode 100644 examples/bottles/dev.md diff --git a/README.md b/README.md index 32b2469..e3048bd 100644 --- a/README.md +++ b/README.md @@ -186,87 +186,130 @@ left running; remove it with `docker rm -f `. 
## Manifest -Agents and the bottles they run in are declared in `claude-bottle.json` -in your project root or `$HOME` (both files merge if present, with -project entries overriding home entries on key conflict). +Bottles and agents live as Markdown files with YAML frontmatter under +`~/.claude-bottle/`. Each bottle is one file in `bottles/`, each agent +is one file in `agents/`: -```jsonc -{ - "bottles": { - "gitea-dev": { - "env": { - "GITEA_TOKEN": "?paste your Gitea API token", - "GITHUB_TOKEN": "${GH_PAT}", - "GIT_AUTHOR_NAME": "didericis" - }, - - "git": [ - { - "Name": "claude-bottle", - "Upstream": "ssh://git@gitea.dideric.is:30009/didericis/claude-bottle.git", - "IdentityFile": "/Users/didericis/.ssh/id_ed25519_gitea", - "KnownHostKey": "ssh-ed25519 AAAA..." - } - ], - - // Routes declared here are held by a per-bottle cred-proxy - // sidecar, not the agent. Each route names a path the agent - // dials, the upstream the proxy forwards to, an auth_scheme, - // and a token_ref (host env var). The value goes into the - // sidecar's environ via `docker create -e`, never touches - // argv or disk. Optional `role` tags drive agent-side - // rewrites: `anthropic-base-url` (sets ANTHROPIC_BASE_URL), - // `npm-registry` (writes ~/.npmrc), `git-insteadof` (writes - // ~/.gitconfig), `tea-login` (writes ~/.config/tea/config.yml). - // See `docs/prds/0010-cred-proxy.md`. 
- "cred_proxy": { - "routes": [ - { "path": "/anthropic/", "upstream": "https://api.anthropic.com", - "auth_scheme": "Bearer", "token_ref": "CLAUDE_BOTTLE_OAUTH_TOKEN", - "role": "anthropic-base-url" }, - { "path": "/gh-api/", "upstream": "https://api.github.com", - "auth_scheme": "Bearer", "token_ref": "GITHUB_PAT" }, - { "path": "/gh-git/", "upstream": "https://github.com", - "auth_scheme": "Bearer", "token_ref": "GITHUB_PAT", - "role": "git-insteadof" }, - { "path": "/npm/", "upstream": "https://registry.npmjs.org", - "auth_scheme": "Bearer", "token_ref": "NPM_TOKEN", - "role": "npm-registry" } - ] - }, - - // Egress is forced through a per-agent - // [pipelock](https://github.com/luckyPipewrench/pipelock) sidecar - // on a Docker `--internal` network — without the proxy the agent - // has no route off-box. The effective allowlist is the union of - // baked-in defaults (api.anthropic.com, claude.ai, ...) and the - // hostnames listed here. Pipelock also runs DLP scanning and - // detects URL-embedded high-entropy secrets. The resolved - // allowlist is shown in the y/N preflight before launch. - "egress": { - "allowlist": [ - "github.com", - "registry.npmjs.org", - "pypi.org" - ] - } - } - }, - - "agents": { - "gitea-helper": { - "bottle": "gitea-dev", - "skills": ["init-prd"], - "prompt": "You help maintain Gitea-hosted projects." - } - } -} +``` +~/.claude-bottle/ +├── bottles/ +│ ├── dev.md +│ └── gitea-dev.md +└── agents/ + ├── implementer.md + └── researcher.md ``` -Comments are illustrative; the file itself must be valid JSON. See -`claude-bottle.example.json` for a working starting point. Pipelock's -design lives in `docs/prds/0001-per-agent-egress-proxy-via-pipelock.md` -and the rationale in `docs/research/pipelock-assessment.md`. +The filename (without `.md`) is the entity's name. Filenames must +match `[a-z][a-z0-9-]*`; files that don't are skipped with a warning. 
+ +A repo can ship its own agent files alongside its code at +`/.claude-bottle/agents/.md`. Those agents reference +bottles defined in `~/.claude-bottle/bottles/` (the only place +bottles can come from); a `bottles/` subdir in a repo is ignored +with a warning. **This is the trust boundary**: bottle infrastructure +— credentials, egress allowlists, git remotes — comes from your home +directory only. A cloned repo cannot redirect a host env var to an +attacker-named upstream because it has no way to declare a bottle. + +### Example bottle (`~/.claude-bottle/bottles/gitea-dev.md`) + +````markdown +--- +env: + GIT_AUTHOR_NAME: didericis + +git: + - Name: claude-bottle + Upstream: ssh://git@gitea.dideric.is:30009/didericis/claude-bottle.git + IdentityFile: /Users/didericis/.ssh/id_ed25519_gitea + KnownHostKey: ssh-ed25519 AAAA... + +# Routes declared here are held by a per-bottle cred-proxy sidecar, +# not the agent. Each route names a path the agent dials, the +# upstream the proxy forwards to, an auth_scheme, and a token_ref +# (host env var). The value goes into the sidecar's environ via +# `docker create -e`, never touches argv or disk. Optional `role` +# tags drive agent-side rewrites: anthropic-base-url (sets +# ANTHROPIC_BASE_URL), npm-registry (writes ~/.npmrc), git-insteadof +# (writes ~/.gitconfig), tea-login (writes ~/.config/tea/config.yml). +# See docs/prds/0010-cred-proxy.md. 
+cred_proxy: + routes: + - path: /anthropic/ + upstream: https://api.anthropic.com + auth_scheme: Bearer + token_ref: CLAUDE_BOTTLE_OAUTH_TOKEN + role: anthropic-base-url + - path: /gh-api/ + upstream: https://api.github.com + auth_scheme: Bearer + token_ref: GH_PAT + - path: /gh-git/ + upstream: https://github.com + auth_scheme: Bearer + token_ref: GH_PAT + role: git-insteadof + - path: /npm/ + upstream: https://registry.npmjs.org + auth_scheme: Bearer + token_ref: NPM_TOKEN + role: npm-registry + +# Egress is forced through a per-agent pipelock sidecar on a Docker +# `--internal` network — without the proxy the agent has no route +# off-box. The effective allowlist is the union of baked-in defaults +# (api.anthropic.com, claude.ai, ...) and the hostnames listed here. +# Pipelock also runs DLP scanning and detects URL-embedded +# high-entropy secrets. The resolved allowlist is shown in the y/N +# preflight before launch. +egress: + allowlist: + - github.com + - registry.npmjs.org + - pypi.org +--- + +The `gitea-dev` bottle. Backs my work on personal projects: Anthropic +OAuth via cred-proxy, gitea.dideric.is over SSH (with PAT for tea +API), and npm for publishing scoped packages. +```` + +### Example agent (`~/.claude-bottle/agents/gitea-helper.md`) + +````markdown +--- +bottle: gitea-dev +skills: + - init-prd +--- + +You help maintain Gitea-hosted projects. +```` + +The agent's Markdown body is its system prompt (whitespace +stripped). The frontmatter declares the bottle to launch in and any +skills to mount. You can also include Claude Code subagent fields +(`name`, `description`, `model`, `color`, `memory`) in the +frontmatter — claude-bottle ignores them at launch but doesn't +reject them, so the same file can drop into `~/.claude/agents/` as a +Claude Code subagent. + +Unknown top-level frontmatter keys die at load with a "did you mean" +pointer; typos don't silently ghost into an empty config. 
+ +The YAML subset the frontmatter accepts is bounded (flat keys, +strings / ints / true-or-false bools / null / lists / one-level +nested dicts). Anchors, multi-line block scalars, tags, and +ambiguous bare strings (`yes` / `NO` / `2026-05-24` / +`0x...`) all die with a clear pointer at the spec — quote your +strings when in doubt. The full schema lives in +`claude_bottle/yaml_subset.py` (~450 lines, stdlib-only, no PyYAML). + +Working examples live under `examples/`. Pipelock's design lives in +`docs/prds/0001-per-agent-egress-proxy-via-pipelock.md` and the +rationale in `docs/research/pipelock-assessment.md`. The trust +boundary rationale lives in `docs/prds/0011-per-file-md-manifest.md`. ## Auth: OAuth token, not API key diff --git a/claude-bottle.example.json b/claude-bottle.example.json deleted file mode 100644 index c6be907..0000000 --- a/claude-bottle.example.json +++ /dev/null @@ -1,105 +0,0 @@ -{ - "bottles": { - "default": { - "env": {}, - "egress": { - "allowlist": [ - "github.com", - "objects.githubusercontent.com", - "registry.npmjs.org" - ] - } - }, - - "gitea-dev": { - "env": { - "GITEA_TOKEN": "?paste your Gitea API token", - "GITHUB_TOKEN": "${GH_PAT}", - "GIT_AUTHOR_NAME": "Eric Diderich", - "NODE_ENV": "development" - }, - "git": [ - { - "Name": "claude-bottle", - "Upstream": "ssh://git@gitea.dideric.is:30009/didericis/claude-bottle.git", - "IdentityFile": "/Users/didericis/.ssh/id_ed25519_gitea", - "KnownHostKey": "ssh-ed25519 AAAA...", - "ExtraHosts": { "gitea.dideric.is": "100.78.141.42" } - } - ], - "egress": { - "allowlist": [ - "github.com", - "objects.githubusercontent.com", - "registry.npmjs.org", - "pypi.org", - "files.pythonhosted.org" - ] - } - }, - - "agentic": { - "env": { - "GIT_AUTHOR_NAME": "Eric Diderich", - "NODE_ENV": "development" - }, - "cred_proxy": { - "routes": [ - { "path": "/anthropic/", - "upstream": "https://api.anthropic.com", - "auth_scheme": "Bearer", - "token_ref": "CLAUDE_BOTTLE_OAUTH_TOKEN", - "role": 
"anthropic-base-url" }, - - { "path": "/gh-api/", - "upstream": "https://api.github.com", - "auth_scheme": "Bearer", - "token_ref": "GH_PAT" }, - { "path": "/gh-git/", - "upstream": "https://github.com", - "auth_scheme": "Bearer", - "token_ref": "GH_PAT", - "role": "git-insteadof" }, - - { "path": "/gitea/dideric/", - "upstream": "https://gitea.dideric.is", - "auth_scheme": "token", - "token_ref": "GITEA_TOKEN", - "role": ["git-insteadof", "tea-login"] }, - - { "path": "/npm/", - "upstream": "https://registry.npmjs.org", - "auth_scheme": "Bearer", - "token_ref": "NPM_TOKEN", - "role": "npm-registry" } - ] - } - } - }, - - "agents": { - "researcher": { - "bottle": "default", - "skills": [], - "prompt": "You are a research assistant. Read widely, summarise concisely, and cite sources by URL. Do not write code unless explicitly asked." - }, - - "gitea-helper": { - "bottle": "gitea-dev", - "skills": ["init-prd"], - "prompt": "You help maintain Gitea-hosted projects. Prefer small, focused commits. Follow Conventional Commits. Run tests before pushing." - }, - - "agentic-helper": { - "bottle": "agentic", - "skills": [], - "prompt": "You operate against APIs whose credentials live in a per-bottle cred-proxy sidecar. Your environ carries only proxy URLs." - }, - - "minimal": { - "bottle": "default", - "skills": [], - "prompt": "" - } - } -} diff --git a/examples/agents/implementer.md b/examples/agents/implementer.md new file mode 100644 index 0000000..4880eef --- /dev/null +++ b/examples/agents/implementer.md @@ -0,0 +1,20 @@ +--- +name: implementer +description: Implements features against PRDs in this repo. +model: opus +bottle: dev +skills: + - init-prd +--- + +You are a feature-implementation agent running inside an ephemeral +claude-bottle sandbox. Treat the workspace's CLAUDE.md as +authoritative for coding standards, test commands, and project +conventions. 
Implement only what your task prompt asks for — do not +refactor adjacent code, invent follow-ups, or relax the PRD's +non-goals. Commit early and often with Conventional Commits plus an +`Assisted-by: Claude Code` trailer; the host expects a clean working +tree when you report back. Do not open, merge, or comment on the PR +— the host drives those steps. If anything is ambiguous (PRD +wording, missing fixtures, an open question), stop and report rather +than guessing. diff --git a/examples/agents/researcher.md b/examples/agents/researcher.md new file mode 100644 index 0000000..0d728eb --- /dev/null +++ b/examples/agents/researcher.md @@ -0,0 +1,15 @@ +--- +name: researcher +description: Investigates questions and writes well-cited research notes. +model: opus +bottle: dev +--- + +You are a research assistant. Read widely, summarise concisely, and +cite sources by URL. Do not write code unless explicitly asked. + +When given a research question, decompose it into sub-questions, +investigate systematically, evaluate sources critically (primary vs +secondary, recency, reliability), and synthesise findings with +appropriate confidence levels. Flag contradictions between sources +and note where additional evidence would change your answer. 
diff --git a/examples/bottles/dev.md b/examples/bottles/dev.md new file mode 100644 index 0000000..584b8c3 --- /dev/null +++ b/examples/bottles/dev.md @@ -0,0 +1,38 @@ +--- +env: + GIT_AUTHOR_NAME: Eric Diderich + NODE_ENV: development + +cred_proxy: + routes: + - path: /anthropic/ + upstream: https://api.anthropic.com + auth_scheme: Bearer + token_ref: CLAUDE_BOTTLE_OAUTH_TOKEN + role: anthropic-base-url + - path: /gh-api/ + upstream: https://api.github.com + auth_scheme: Bearer + token_ref: GH_PAT + - path: /gh-git/ + upstream: https://github.com + auth_scheme: Bearer + token_ref: GH_PAT + role: git-insteadof + - path: /gitea/dideric/ + upstream: https://gitea.dideric.is + auth_scheme: token + token_ref: GITEA_TOKEN + role: [git-insteadof, tea-login] + - path: /npm/ + upstream: https://registry.npmjs.org + auth_scheme: Bearer + token_ref: NPM_TOKEN + role: npm-registry +--- + +The `dev` bottle — backs a generic development workflow. + +Holds tokens for Anthropic, GitHub, a self-hosted Gitea, and npm. +Drop this file into `~/.claude-bottle/bottles/dev.md` and any agent +referencing `bottle: dev` will launch against this infrastructure. -- 2.52.0
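Reviewer sketch (not part of the patch): the loader rules in this series reduce to two small checks, filename-as-key validation and rejection of unknown frontmatter keys. The first function below mirrors `_entity_name_from_path` as it appears in the diff. `unknown_keys` and `suggest` are hypothetical helpers introduced only for illustration. In particular, the `difflib`-based suggestion is an assumption about how a "did you mean" pointer could be produced, since the `die()` messages in this patch list the allowed keys rather than a nearest match.

```python
from __future__ import annotations

import difflib
import re
from pathlib import Path

# Same pattern the patch applies to the filename stem.
_FILENAME_RX = re.compile(r"^[a-z][a-z0-9-]*$")


def entity_name_from_path(path: Path) -> str | None:
    """Return the entity name implied by the filename, or None when the
    file is not `.md` or the stem breaks the [a-z][a-z0-9-]* convention."""
    if path.suffix != ".md":
        return None
    stem = path.stem
    return stem if _FILENAME_RX.match(stem) else None


def unknown_keys(frontmatter: dict, allowed: frozenset) -> list:
    """Hypothetical helper: frontmatter keys outside the allowed set, sorted."""
    return sorted(set(frontmatter) - allowed)


def suggest(key: str, allowed: frozenset) -> str | None:
    """Hypothetical helper (assumption, not in the patch): closest allowed
    key by difflib similarity, or None when nothing is close enough."""
    matches = difflib.get_close_matches(key, sorted(allowed), n=1)
    return matches[0] if matches else None


if __name__ == "__main__":
    agent_keys = frozenset({"bottle", "skills"})
    for name in ("implementer.md", "BadName.md", "notes.txt"):
        print(name, "->", entity_name_from_path(Path("agents") / name))
    for key in unknown_keys({"bottle": "dev", "skillz": []}, agent_keys):
        print(key, "-> did you mean", suggest(key, agent_keys))
```

Returning `None` instead of raising keeps the skip-with-warning decision at the caller, which matches how the two `_load_*_from_dir` functions in the diff handle a bad filename.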