From da969a503dad30ddd434022a33fad6fe45110fd8 Mon Sep 17 00:00:00 2001
From: didericis
Date: Sun, 24 May 2026 21:12:43 -0400
Subject: [PATCH] docs(research): manifest format + grouping options

Captures the two open questions surfaced by PRD 0011: should bottles
and agents stay grouped in one file or split per file, and should the
format stay JSON or move to YAML / MD-with-frontmatter. Recommends
per-file MD-with-frontmatter (with agents shaped close to Claude
Code's subagent spec so they can drop into ~/.claude/agents/ as a side
effect), explicitly flags the PyYAML runtime dependency as a
user-decision crossing the project's "low deps by default" line, and
leaves several other choices (hidden dotdir vs visible, migration
tooling) as open questions.

Companion to docs/prds/0011-cwd-manifest-trust-boundary.md (which
solves the trust problem at the resolver layer); this doc explores a
structural alternative that would make the boundary self-documenting
on disk.
---
 docs/research/manifest-format-and-grouping.md | 378 ++++++++++++++++++
 1 file changed, 378 insertions(+)
 create mode 100644 docs/research/manifest-format-and-grouping.md

diff --git a/docs/research/manifest-format-and-grouping.md b/docs/research/manifest-format-and-grouping.md
new file mode 100644
index 0000000..52f28cc
--- /dev/null
+++ b/docs/research/manifest-format-and-grouping.md
@@ -0,0 +1,378 @@
+# Manifest format and grouping
+
+Two open questions for claude-bottle's manifest layer after PRD 0011:
+
+1. **Grouping.** Keep bottles and agents in the same manifest file
+   (today's shape), or split them — one file per bottle and one
+   file per agent.
+2. **Format.** Stay on JSON, switch to YAML, or move to a Markdown
+   spec with YAML frontmatter. The Markdown option splits into two
+   sub-flavors: reuse Claude Code's existing subagent format with
+   bottle-specific extensions, or invent a claude-bottle-owned
+   Markdown spec used for both agents and bottles.
+
+The trust boundary from PRD 0011 — bottle infrastructure lives in
+`$HOME`, agents may live in `$CWD` — is largely orthogonal to both
+axes. But the choice of grouping and format changes how naturally
+that boundary expresses on disk, and how comfortable the manifest
+will be once a user has 5+ bottles and 10+ agents.
+
+## Why this matters
+
+Current shape: one JSON file at `$HOME/claude-bottle.json` (and
+optionally `$CWD/claude-bottle.json` for cwd-defined agents). After
+PRD 0011, the home file owns bottles + home agents; the cwd file is
+agents-only.
+
+The single-file shape works fine for the project's first 1-2
+bottles. Real friction starts when:
+
+- A user has 5-10 bottles for different projects, each carrying
+  several `cred_proxy.routes` and a few `bottle.git` entries — the
+  home file becomes hundreds of lines of nested JSON.
+- Multiple humans share a `$HOME` manifest pattern (dotfiles repo,
+  shared workstation, CI machine baseline) and want to compose
+  pieces — JSON doesn't merge cleanly outside of the resolver.
+- Per-agent prompts grow long. JSON forces them onto a single
+  escaped line; multi-paragraph prompts become unreadable.
+- Documentation (why does this bottle exist? what's the threat
+  model for these credentials?) has nowhere natural to live in a
+  JSON file; you end up with a sibling README that drifts from the
+  config.
+
+JSON's strengths (stable parser, machine-readable, stdlib-only) are
+real and shouldn't be thrown away lightly. The question is whether
+the inflection point has been reached.
+
+## Axis 1 — grouping
+
+### Option A: one file for both (current)
+
+`$HOME/claude-bottle.json` contains `bottles:` and `agents:`. Cwd
+file (optional) contains `agents:` only.
+
+**Pros**
+
+- Zero new lifecycle. One file to discover, edit, version, diff.
+- Trust boundary lives entirely in the resolver — the on-disk
+  shape doesn't enforce or surface it.
+- Atomic edits: changing a bottle and the agents that reference it
+  is one commit, one save.
+
+**Cons**
+
+- Scales linearly with bottles + agents. A user with 8 bottles and
+  12 agents hits a ~600-line file even with terse formatting.
+- Diff conflicts: two changes to unrelated agents touch the same
+  file. Codeowners-style ownership doesn't apply cleanly.
+- Discovery harder beyond a point: searching for one agent
+  requires reading the whole file in a JSON parser, not
+  filename-globbing.
+- The trust boundary is invisible on disk — a reader can't tell at
+  a glance which entries are home-trusted vs cwd-supplied; they
+  have to know the resolver's rules.
+
+### Option B: file per thing
+
+Bottles live as `$HOME/.claude-bottle/bottles/.`. Agents
+live as `$HOME/.claude-bottle/agents/.` (home agents)
+and `$CWD/.claude-bottle/agents/.` (cwd agents). The
+resolver globs each directory.
+
+**Pros**
+
+- Scales to N bottles + N agents without any single file growing.
+- Trust boundary expresses on disk: `$HOME/.claude-bottle/bottles/`
+  is the only place bottles can come from. `$CWD/.claude-bottle/`
+  can only contribute agents. No resolver logic needed to enforce
+  it — the file paths are the enforcement.
+- Aligns with Claude Code's existing model: each subagent already
+  lives as `~/.claude/agents/.md`. Claude Code users will
+  recognize the directory shape.
+- Per-file ownership / codeowners / diff workflows just work.
+- Per-agent prompts grow without affecting other files.
+- Documentation per bottle/agent can live in the file itself
+  (e.g. comments, Markdown body).
+
+**Cons**
+
+- More lifecycle: creating, renaming, deleting agents/bottles
+  becomes file ops (mkdir, mv, rm) instead of editing one file.
+  Power users prefer that; new users may not.
+- Discovery requires `ls`, not "grep one file." Tooling helps
+  (e.g. `./cli.py list`) but the manifest is no longer a single
+  artifact to email or ship.
+- Atomicity: swapping a bottle name across agents touches
+  multiple files. Git handles this fine; a one-shot text editor
+  flow loses something.
+- Backwards compatibility: existing users have one JSON file.
+  Migration tool needed.
+
+### Interaction with trust boundary
+
+Option A keeps the resolver-enforced boundary from PRD 0011 as the
+only enforcement.
+
+Option B can express the boundary purely as filesystem layout:
+`/bottles/` is privileged; `/` directory only has an
+`agents/` subdirectory. The resolver becomes "glob the dirs, parse
+each file, validate the cross-references." Strictly cleaner than
+the current parse-and-reject logic, and more obvious to a reader
+auditing the security posture.
+
+## Axis 2 — format
+
+### Option 1: stay on JSON
+
+What we have today. The trust-boundary change in PRD 0011
+preserves this format.
+
+**Pros**
+
+- Zero migration cost.
+- Stdlib parser; no new dependency. The project's CLAUDE.md sets
+  "low dependencies by default" as a guideline.
+- Stable, predictable parse semantics. No type-coercion gotchas.
+- Tooling everywhere — IDE support, linters, jq.
+
+**Cons**
+
+- No native comments. (JSONC, JSON5, `_comment` fields are all
+  workarounds.)
+- Multi-line strings become escaped one-liners. Agent prompts
+  longer than a sentence become unreadable.
+- Trailing commas are an error. Hand-editing punishes small typos.
+- Verbose: every key + value gets quotes; nested structures grow
+  indent.
+
+### Option 2: full YAML
+
+`$HOME/claude-bottle.yaml` (or `.yml`). Parser pulls in PyYAML (or
+ruamel.yaml).
+
+**Pros**
+
+- Comments, multi-line strings (block scalars), anchors for repeated
+  blocks (e.g. shared egress allowlists across bottles).
+- Common config language for ops tooling (Kubernetes,
+  GitHub/Gitea Actions, Docker Compose, pipelock's own config).
+- Less syntactic noise than JSON for nested data.
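The multi-line string point is easy to see with the stdlib alone. The sketch below is illustrative only: the `agents` / `implementer` / `prompt` key names and the prompt text are hypothetical, not claude-bottle's actual schema. It shows that `json.dumps` has to encode every newline as a `\n` escape, so a multi-paragraph prompt collapses onto one physical line, which is the readability cost that YAML block scalars (or a Markdown body) avoid.

```python
import json

# Hypothetical multi-paragraph agent prompt (illustrative text only).
prompt = (
    "You are a feature-implementation agent.\n"
    "\n"
    "Work only inside the sandbox workspace.\n"
)

# JSON has no multi-line string syntax: each newline becomes a \n escape,
# so the whole prompt lands on a single physical line of the file.
encoded = json.dumps({"agents": {"implementer": {"prompt": prompt}}})

print(encoded)
print("physical lines in the JSON text:", len(encoded.splitlines()))
print("lines in the prompt itself:", len(prompt.splitlines()))
```

The round trip is lossless (`json.loads` returns the original prompt), so the cost is purely about hand-editing and reviewing diffs, not about correctness.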
+
+**Cons**
+
+- **New runtime dependency.** The project today uses zero
+  third-party Python packages for production code; YAML parsing
+  pulls in PyYAML. (CLAUDE.md: "bash-first, low-deps by default.")
+- YAML's footguns: indentation sensitivity, the Norway problem
+  (`country: NO` → boolean False), implicit type coercion that's
+  surprised non-trivial production projects.
+- Specifying schemas in YAML is harder to validate strictly —
+  parsers are forgiving where JSON is strict.
+- No native escape hatch for executable content / templating, but
+  users will reach for one (Jinja, Helm-style) and then we're in
+  yaml-as-template-language territory.
+
+### Option 3: reuse Claude Code's subagent spec (Markdown + YAML frontmatter), with claude-bottle extensions
+
+Claude Code already stores subagents at `~/.claude/agents/.md`
+with YAML frontmatter and a Markdown body. Frontmatter today
+carries fields like `name`, `description`, `model`, `color`,
+`memory`; the body is the system prompt. Adding fields like
+`bottle: dev` and a `claude_bottle:` sub-block to the same
+frontmatter would make each claude-bottle agent a drop-in addition
+to Claude Code's agent directory.
+
+```markdown
+---
+name: implementer
+description: Implements features against PRDs in this repo
+model: opus
+bottle: dev
+claude_bottle:
+  skills: [init-prd]
+---
+
+You are a feature-implementation agent running inside an
+ephemeral claude-bottle sandbox. The host has copied the user's
+project into /home/node/workspace...
+```
+
+Bottles don't fit Claude Code's agent schema — they're
+infrastructure, not behavior. Either:
+
+- (3a) Bottles stay JSON / YAML; only agents adopt the
+  MD+frontmatter format. Mixed-format manifest.
+- (3b) Bottles adopt MD+frontmatter too, using a claude-bottle-only
+  schema. Then we're really doing option 4 for bottles + option 3
+  for agents. Two formats but one parser.
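For a sense of how little code the "Markdown body plus frontmatter" split itself needs, here is a minimal stdlib-only sketch. `split_frontmatter` is a hypothetical helper, not an existing claude-bottle or Claude Code function, and it assumes the file opens with a line containing only `---` and that the frontmatter ends at the next such line. It only separates the two parts; the frontmatter text would still have to go through a YAML parser.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a Markdown document into (frontmatter_text, body).

    Assumption: the first line is exactly '---' and the frontmatter
    runs until the next line that is exactly '---'.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' line")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).strip()
            return frontmatter, body
    raise ValueError("missing closing '---' line")


doc = """---
name: implementer
bottle: dev
---

You are a feature-implementation agent.
"""

frontmatter, body = split_frontmatter(doc)
print(frontmatter)
print(body)
```

Rejecting files without a closing `---` (rather than treating the whole file as body) is a deliberate choice in this sketch: a silently empty frontmatter would hide a missing `bottle:` reference.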
+
+**Pros**
+
+- Existing Claude Code users already know this format and have a
+  directory full of these files. The mental model is "an agent is
+  a Markdown file."
+- Each agent's prompt lives naturally as Markdown body — long
+  prompts read well, can use headings/lists/code blocks.
+- File-per-thing falls out automatically (one MD per agent).
+- Claude Code may eventually consume claude-bottle's agent files
+  directly, doubling their utility.
+
+**Cons**
+
+- **Coupling to Claude Code's spec.** Anthropic owns that schema;
+  field names and semantics can change. Today's `model` /
+  `description` / `memory` are stable, but tomorrow's may not be.
+  Our `bottle:` / `claude_bottle:` extensions could collide with
+  future official fields.
+- The agent file's frontmatter starts to carry two unrelated
+  schemas: Claude Code's (model, description) and ours (bottle,
+  skills, ...). One file, two owners.
+- Bottles still need a format choice (3a vs 3b above) — we don't
+  escape that decision.
+- Parsing MD+frontmatter is more work than JSON. Either pull a
+  frontmatter library (python-frontmatter) or hand-parse the
+  `---` block and feed it to PyYAML. Either way, a new dep.
+
+### Option 4: invent a claude-bottle MD spec, used for both agents and bottles
+
+```markdown
+---
+# $HOME/.claude-bottle/agents/implementer.md
+bottle: dev
+skills: [init-prd]
+---
+
+You are a feature-implementation agent running inside an
+ephemeral claude-bottle sandbox...
+```
+
+```markdown
+---
+# $HOME/.claude-bottle/bottles/dev.md
+cred_proxy:
+  routes:
+    - path: /anthropic/
+      upstream: https://api.anthropic.com
+      auth_scheme: Bearer
+      token_ref: CLAUDE_BOTTLE_OAUTH_TOKEN
+      role: anthropic-base-url
+egress:
+  allowlist: [example.com]
+---
+
+The dev bottle. Backs my work on personal projects: Anthropic
+OAuth, the gitea instance at gitea.dideric.is, and an npm token
+for publishing scoped packages.
+```
+
+**Pros**
+
+- Single format, two directory layouts. One parser, one mental
+  model.
+- Bottle files get an MD body that's a natural home for
+  documentation (why does this bottle exist? what tokens does it
+  hold? who owns the keys?).
+- Not coupled to Claude Code's schema; we own the spec.
+- Trust boundary on disk: `$HOME/.claude-bottle/bottles/` is the
+  only place bottles can come from; `$CWD/.claude-bottle/agents/`
+  is the only thing cwd contributes.
+- Agent files in this spec are *almost* compatible with Claude
+  Code's subagent format. If we keep the `name` / `description`
+  conventions, the same files can drop into `~/.claude/agents/`
+  with no friction — best of both worlds without the formal
+  coupling.
+
+**Cons**
+
+- Invents a format. Users learn one more thing (small thing — MD
+  with frontmatter is widely understood).
+- Bottle file bodies have no built-in use case beyond
+  documentation; users may leave them empty, which looks weird
+  ("why is this file partly Markdown?").
+- Still requires a YAML parser for the frontmatter, so the
+  dependency cost is the same as option 3.
+
+## Synthesis
+
+Combining axes:
+
+|                  | JSON              | YAML              | MD reuse (3)                    | MD new (4)  |
+|------------------|-------------------|-------------------|---------------------------------|-------------|
+| **Grouped (A)**  | today             | yaml monolith     | not natural — MD wants per-file | not natural |
+| **Per-file (B)** | dir of JSON files | dir of YAML files | best fit                        | best fit    |
+
+Per-file × MD-with-frontmatter is the natural shape on both axes —
+the format wants to live one-per-file (the "MD doc with metadata"
+pattern doesn't lend itself to monoliths), and the file-per-thing
+grouping fits how users iterate on agents (write a prompt, save,
+launch).
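As a rough illustration of what the per-file layout asks of the resolver, the sketch below covers only the "glob the dirs" step. Everything in it is an assumption for illustration: the `discover` function is hypothetical, the `.md` extension follows option 4, and the policy for a name that exists in both home and cwd is left out entirely. The point it tries to show is that the trust boundary falls out of which directories get globbed: a `bottles/` directory under the cwd root is simply never read.

```python
import tempfile
from pathlib import Path


def discover(home_root: Path, cwd_root: Path):
    """Return (bottle_names, [(origin, agent_name), ...]).

    Bottles are globbed only under home_root; agents under both roots.
    cwd_root / "bottles" is never touched, so it cannot contribute.
    """
    bottles = sorted(p.stem for p in (home_root / "bottles").glob("*.md"))
    agents = [
        (origin, p.stem)
        for origin, root in (("home", home_root), ("cwd", cwd_root))
        for p in sorted((root / "agents").glob("*.md"))
    ]
    return bottles, agents


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "home" / ".claude-bottle"
    cwd = Path(tmp) / "project" / ".claude-bottle"
    for d in (home / "bottles", home / "agents", cwd / "agents", cwd / "bottles"):
        d.mkdir(parents=True)
    (home / "bottles" / "dev.md").write_text("---\n---\n")
    (home / "agents" / "implementer.md").write_text("---\nbottle: dev\n---\n")
    (cwd / "agents" / "reviewer.md").write_text("---\nbottle: dev\n---\n")
    # A bottle file planted in the project directory: never read.
    (cwd / "bottles" / "rogue.md").write_text("---\n---\n")

    bottles, agents = discover(home, cwd)
    print(bottles)
    print(agents)
```

Parsing each file and validating that every agent's `bottle:` names a discovered bottle would sit on top of this; those steps are omitted here.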
+
+Between option 3 (reuse CC spec) and option 4 (new spec): the
+appealing middle ground is "claude-bottle agents follow the CC
+subagent shape closely (name / description / model + bottle and
+skills extensions) so they drop into `~/.claude/agents/` as a
+side effect, while bottles use the same MD+frontmatter shape but
+with claude-bottle's own schema and live in a dedicated directory."
+This:
+
+- gives agents both a claude-bottle launch story AND a Claude Code
+  invocation story from the same file;
+- keeps bottles entirely under our schema (no Anthropic dependency
+  for the security-load-bearing config);
+- uses one parser, one body-format, two directories.
+
+Cost of moving:
+
+- New runtime dep: a YAML parser (PyYAML or a hand-parse-the-
+  frontmatter shim). PyYAML is the safest choice if we accept the
+  dependency.
+- Resolver rewrite: glob two directories, parse each file, validate
+  cross-references. Roughly the same complexity as today's JSON
+  merge once the boundary check is in place.
+- Migration tool: a one-shot script that splits today's JSON into
+  per-file MD docs. Five minutes of work for the tool, five
+  minutes of work for the user.
+- Docs: README's manifest section gets rewritten. Worth doing
+  alongside the move.
+
+## Recommendation
+
+Per-file MD with frontmatter (option B × option 4 with the option-3
+agent compatibility). The format change clears the way for the
+per-file grouping (which is the bigger UX win), and the per-file
+shape is what makes the trust boundary self-documenting on disk.
+
+The dependency cost (PyYAML) is the main thing that needs an
+explicit yes from the user — claude-bottle today has zero
+third-party Python deps for production code, and adopting one
+crosses a clean architectural line. If "low deps" stays a hard
+constraint, the alternative is to hand-parse the frontmatter block
+and feed it to a minimal YAML subset parser (the keys
+claude-bottle uses are all flat string/list/dict — no anchors, no
+multi-line block scalars, no implicit type coercion).
+
+If we don't want to commit to the move yet, the next-cheapest
+option is keeping JSON but splitting into per-file (option B ×
+option 1): `$HOME/.claude-bottle/bottles/.json` +
+`$HOME/.claude-bottle/agents/.json`. Most of the scaling
+wins; none of the body-prose or dependency story.
+
+## Open questions
+
+- **Does Claude Code object to extra frontmatter fields?** Test:
+  drop a file with `bottle:` in `~/.claude/agents/` and see if CC
+  warns / ignores / breaks. If it warns, we'd want a different
+  field name (e.g. `claude-bottle-bottle`) or a namespaced block.
+- **Migration story.** Is the project willing to ship a one-shot
+  `./cli.py migrate-manifest` command that does the JSON → MD
+  conversion? Or do users just rewrite by hand from the new docs?
+- **Bottle file body content.** If most bottle .md files have an
+  empty body, is the MD-with-frontmatter format still warranted?
+  An alternative is YAML for bottles only (no body, but with
+  comments) and MD+frontmatter for agents.
+- **Dotfiles vs not.** `$HOME/.claude-bottle/` or
+  `$HOME/claude-bottle/`? The hidden dotfile shape matches dev
+  conventions (`.config/`, `.ssh/`); the visible shape signals
+  "this is a real thing you own."
+- **PyYAML hard dep, or minimal subset parser?** Trade-off between
+  "honest about the dependency" and "stay stdlib-only."
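To help weigh the last question, here is a sketch of what the stdlib-only end of that trade-off might look like. `parse_flat_frontmatter` is hypothetical and deliberately incomplete: it handles only flat `key: value` strings and inline `[a, b]` lists, which is enough for the agent frontmatter shown earlier but not for nested mappings such as a bottle's `cred_proxy.routes`. It also does not strip quotes or handle commas inside list items. Every scalar stays a string, so the Norway problem cannot occur.

```python
def parse_flat_frontmatter(text: str) -> dict:
    """Parse 'key: value' lines into a dict.

    Values are plain strings or inline [a, b] lists of strings.
    No nesting, no anchors, no block scalars, no type coercion.
    """
    result = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not key:
            raise ValueError(f"line {lineno}: expected 'key: value'")
        if value.startswith("[") and value.endswith("]"):
            items = value[1:-1].split(",")
            result[key] = [item.strip() for item in items if item.strip()]
        else:
            result[key] = value
    return result


parsed = parse_flat_frontmatter(
    "# agent frontmatter\n"
    "bottle: dev\n"
    "skills: [init-prd, review]\n"
    "country: NO\n"
)
print(parsed)
```

Supporting bottle files would mean adding indentation-based nesting, at which point the shim starts to look like a partial YAML implementation; that is the maintenance cost to set against the PyYAML dependency.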