PRD 0011: Per-file Markdown manifest

  • Status: Active
  • Author: didericis
  • Created: 2026-05-24

Summary

Replace the single-file bot-bottle.json manifest with a per-file Markdown-with-YAML-frontmatter layout. Bottles live as $HOME/.bot-bottle/bottles/<name>.md; agents live as $HOME/.bot-bottle/agents/<name>.md (home-resident) and $CWD/.bot-bottle/agents/<name>.md (repo-supplied). Each file carries its structured config in YAML frontmatter and (for agents) its system prompt in the Markdown body.

The format change clears the way for the layout change: one file per bottle, one file per agent, with bottles/ and agents/ on the $HOME side of the trust boundary and only agents/ on the $CWD side. That boundary stops living in resolver logic (PRD 0011-v1's CwdExtension approach, closed in favor of this design) and becomes filesystem layout: $CWD has no bottles/ subdirectory, period.

The YAML we accept is bounded (flat keys → strings, lists, simple nested dicts), so the parser is hand-rolled and stdlib-only — no PyYAML dependency. The project's "low deps by default" stance (AGENTS.md) stays intact.

Problem

bot-bottle.json works fine at one bottle and one agent. The project is heading for many of both, and the single-JSON shape starts to fray:

  • Discovery + diff scaling. A user with 8 bottles and 12 agents lands at hundreds of lines of nested JSON. Two changes to unrelated agents touch the same file; codeowners-style ownership doesn't apply. File-globbing tools (grep, fd) can't find one agent without parsing the whole file.

  • No comments, no multi-line strings. Agent prompts longer than a sentence become single-line escaped horrors in JSON. Documentation about why a bottle exists (which tokens it holds, why these egress allowlist entries) has nowhere natural to live in the manifest file itself; a sibling README drifts.

  • Trust boundary lives in code, not on disk. PRD 0011-v1 (closed; see PR #15) made the resolver reject cwd manifests that try to define bottles. The rule is correct and enforced, but it's invisible to anyone reading the on-disk layout — there's no positive signal that $HOME is the only place bottles can come from. A reader has to know the resolver's rules to audit the security posture.

The companion research (docs/research/manifest-format-and-grouping.md) walks the two axes (grouping × format) and lands on this design.

Goals / Success criteria

Each test runs against a temporary $HOME and a temporary $CWD:

  1. A bottle file under $HOME/.bot-bottle/bottles/ parses. A dev.md file with YAML frontmatter declaring cred_proxy.routes, git, env, egress produces a Bottle dataclass equivalent to the current JSON shape.

  2. An agent file under $HOME/.bot-bottle/agents/ parses. implementer.md with frontmatter that names bottle:, skills:, and other fields, with the body as the system prompt, produces an Agent dataclass.

  3. An agent file under $CWD/.bot-bottle/agents/ parses and overrides home-resident agents of the same name. The cwd agent's frontmatter and body win; the home bottle it references stays intact.

  4. A bottle file under $CWD/.bot-bottle/bottles/ is ignored. The directory does not contribute to the manifest; if a user accidentally creates one, the launcher emits a warn-level log naming the offending files and continues. Filesystem layout is the boundary; the warning is a usability nicety, not a security gate.

  5. No third-party Python dependencies introduced. A fresh clone with only stdlib + bot-bottle's own code runs every parser test. Frontmatter parsing is hand-rolled against the declared YAML subset.

  6. Existing tests pass against the new layout. Tests today build manifests via JSON literals against Manifest.from_json_obj. That entry point keeps working for tests (used to construct manifests programmatically); production resolution flows through the new directory-globbing loader.

  7. Agent files double as Claude Code subagent files. The name, description, model, color, and memory fields from Claude Code's existing subagent spec are accepted in our frontmatter alongside our own fields. Copying an agent file from $HOME/.bot-bottle/agents/ to ~/.claude/agents/ produces a working Claude Code subagent (subject to Claude Code's tolerance for the extra bottle: and bot_bottle: fields — see Open Questions).

Non-goals

  • A general YAML implementation. The parser handles the subset bot-bottle's frontmatter actually uses; documents that exceed the subset (anchors, multi-line block scalars, tags, implicit type coercion, flow style nested more than one level deep, etc.) die with a pointer at the spec. We are not building a YAML library.

  • Compatibility with the old JSON layout at runtime. The resolver no longer reads bot-bottle.json files. This is a breaking change; existing users hand-rewrite their JSON into the new per-file layout (bot-bottle has a single primary user today, so the migration is one person rewriting one file). Documented as part of the README rewrite.

  • $HOME/.claude/agents/ integration on the input side. We don't read agent files out of Claude Code's directory. Our files can be copied into Claude Code's tree by the user if they want, but the input path for bot-bottle is its own directory.

  • A signed-manifest scheme. Out of scope per the closed-PR-15 PRD; the trust boundary here is "your home directory is yours."

  • Per-bottle inheritance / composition. Each bottle file is self-contained. If shared egress allowlists become common we can revisit, but the v1 of this PRD is one file = one bottle.

  • Hot-reload. Changes to manifest files take effect at next ./cli.py start; we do not watch the directory.

Scope

In scope

  • Directory layout.

    • $HOME/.bot-bottle/bottles/<name>.md — bottle definitions (full schema; one Bottle per file).
    • $HOME/.bot-bottle/agents/<name>.md — home-resident agents.
    • $CWD/.bot-bottle/agents/<name>.md — cwd-resident agents; same schema as home agents, but bottle names must resolve against the home set.
    • $CWD/.bot-bottle/bottles/ — ignored with a warn-level log (see SC #4). Does not contribute to the manifest.
    • <name> is the file basename without .md. Filenames must match [a-z][a-z0-9-]* (kebab-case, ASCII-only).
  • File schema. Markdown with YAML frontmatter. Frontmatter delimited by --- lines at the top of the file; everything after the closing --- is the body. For agents, body is the system prompt. For bottles, body is human documentation (optional, ignored by the parser).

  • Agent frontmatter fields.

    • bottle: <name> (required) — bottle to launch in.
    • skills: [<name>, ...] (optional) — host-side skills under ~/.claude/skills/.
    • name, description, model, color, memory — accepted but treated as Claude Code passthrough; bot-bottle ignores them at launch but doesn't reject. Lets the same file double as a Claude Code subagent.
    • Unknown top-level keys die with a hint listing accepted keys. We don't silently ignore typos.
  • Bottle frontmatter fields. Same keys as today's JSON schema: env, git, cred_proxy.routes, egress.allowlist, egress.dlp_action. No semantic changes.

  • YAML subset parser. Hand-rolled, stdlib-only. Supports:

    • Flat key: value pairs at the top level.
    • String, int, bool (true/false only; no yes/no/on/off), null (null / explicit ~).
    • Lists: block-style - item lines, items are strings or flow lists/dicts of the same.
    • Nested dicts: one level under a key, block-style.
    • Quoted strings: single + double, escapes as JSON-style.
    • Comments: # ... at end of line or on its own.

    Rejects with a clear error: anchors and aliases (&/*), multi-line block scalars (|, >), tags (!!), unquoted strings that YAML would implicitly coerce (NO read as a boolean in the Norway problem, date-like values read as dates), and flow style nested deeper than one level. An empty document is fine. Missing frontmatter delimiters are also not a parse error for bottles: a body-only file is treated as having no frontmatter, which then fails the required-keys check with the same diagnostic as malformed frontmatter.

  • Manifest assembly. New resolver:

    1. Walk $HOME/.bot-bottle/bottles/*.md → Bottle dict keyed by filename.
    2. Walk $HOME/.bot-bottle/agents/*.md → Agent dict.
    3. Walk $CWD/.bot-bottle/agents/*.md → Agent dict; merge into the home agent dict, cwd wins on name collision.
    4. Validate every agent's bottle: against the bottle dict.
    5. Warn if $CWD/.bot-bottle/bottles/ exists with files.
    6. Return Manifest dataclass — same shape as today.
  • Docs. README's manifest section rewrites against the new layout. bot-bottle.example.json becomes examples/bottles/dev.md + examples/agents/implementer.md. The PRD 0010 example block in its own document gets a follow-up commit noting the new layout (out of scope for this PRD; only update README + example files here).

  • Tests.

    • tests/unit/test_yaml_subset_parser.py — the parser itself, including all the rejection cases listed above.
    • tests/unit/test_manifest_md_load.py — directory-globbing + assembly, the seven success criteria.
    • Existing integration tests keep working (the only public entry points they hit are Manifest.resolve and Manifest.from_json_obj).
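
The manifest assembly steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's resolver: it assumes the three directory walks (steps 1 to 3) have already produced plain dicts, and the names `assemble`, `AssembledManifest`, and the logger name are hypothetical stand-ins for whatever `Manifest.from_md_dirs` ends up using.

```python
import logging
from dataclasses import dataclass

# Hypothetical logger name; the PRD only says "warn-level log".
log = logging.getLogger("bot_bottle.manifest")


@dataclass
class AssembledManifest:
    """Hypothetical stand-in for the real Manifest dataclass."""
    bottles: dict[str, dict]
    agents: dict[str, dict]


def assemble(
    home_bottles: dict[str, dict],
    home_agents: dict[str, dict],
    cwd_agents: dict[str, dict],
    cwd_bottle_files: tuple[str, ...] = (),
) -> AssembledManifest:
    # Step 3: merge agent dicts; the cwd entry wins on a name collision.
    agents = {**home_agents, **cwd_agents}

    # Step 4: every agent's bottle must resolve against the home bottle set.
    for name, agent in agents.items():
        bottle = agent.get("bottle")
        if bottle not in home_bottles:
            raise ValueError(f"agent {name!r} references unknown bottle {bottle!r}")

    # Step 5: stray cwd bottle files are reported but never loaded.
    if cwd_bottle_files:
        log.warning(
            "ignoring bottle files under $CWD/.bot-bottle/bottles/: %s",
            ", ".join(cwd_bottle_files),
        )

    # Step 6: bottles come from $HOME only.
    return AssembledManifest(bottles=dict(home_bottles), agents=agents)
```

Because cwd bottle files are passed in only as names to report, nothing in this sketch can turn a repo-supplied bottle into a loaded one, which is the property success criterion 4 asks for.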

Out of scope

  • Watching the directory for changes mid-session.
  • An automated migration command. Existing JSON users hand-rewrite into the new layout. The README rewrite documents the new shape; that's the migration surface.
  • Validating that frontmatter name: matches the filename. Soft check via a warn log if mismatched, but not enforced.
  • A bottle/agent dependency graph beyond the existing bottle: field. No "this agent extends this other agent."
  • IDE schemas / JSON Schema export for the MD format.

Proposed design

File layout

$HOME/.bot-bottle/
├── bottles/
│   ├── dev.md
│   ├── gitea-dev.md
│   └── ...
└── agents/
    ├── implementer.md
    ├── researcher.md
    └── ...

$CWD/.bot-bottle/
└── agents/
    └── <repo-specific>.md

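bottles/ only exists under $HOME. The directory's absence under $CWD is the boundary: the loader never reads bottle definitions from there, and looks at $CWD/.bot-bottle/bottles/ only to emit the warning described in SC #4.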
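
As a rough illustration of the discovery step for one of these directories, the sketch below globs `*.md` files and applies the filename rule from the scope section. The function name `discover` is hypothetical, and raising `ValueError` on a bad basename is an assumption; the PRD does not say how an invalid filename should fail.

```python
import re
from pathlib import Path

# Filename rule from the scope section: kebab-case, ASCII-only.
NAME_RE = re.compile(r"[a-z][a-z0-9-]*")


def discover(directory: Path) -> dict[str, Path]:
    """Map <name> -> path for each *.md file directly inside `directory`.

    A missing directory yields an empty dict. An invalid basename raises
    ValueError (an assumption; the failure mode is not specified).
    """
    found: dict[str, Path] = {}
    if not directory.is_dir():
        return found
    for path in sorted(directory.glob("*.md")):
        name = path.stem  # basename without the .md suffix
        if NAME_RE.fullmatch(name) is None:
            raise ValueError(f"{path}: name must match [a-z][a-z0-9-]*")
        found[name] = path
    return found
```

Calling `discover(Path.home() / ".bot-bottle" / "bottles")` would then give the key set that agent `bottle:` references are validated against.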

Example bottle file

---
cred_proxy:
  routes:
    - path: /anthropic/
      upstream: https://api.anthropic.com
      auth_scheme: Bearer
      token_ref: BOT_BOTTLE_CLAUDE_OAUTH_TOKEN
      role: anthropic-base-url
    - path: /gitea/dideric/
      upstream: https://gitea.dideric.is
      auth_scheme: token
      token_ref: GITEA_TOKEN
      role: [git-insteadof, tea-login]
git:
  remotes:
    gitea.dideric.is:
      Name: bot-bottle
      Upstream: ssh://git@gitea.dideric.is:30009/didericis/bot-bottle.git
      IdentityFile: ~/.ssh/gitea-delos-2.pem
      KnownHostKey: ssh-rsa AAAAB3...
egress:
  allowlist:
    - example.com
---

The `dev` bottle. Backs my work on personal projects:

- Anthropic OAuth via cred-proxy
- gitea.dideric.is over SSH (with PAT for tea API)
- example.com in the egress allowlist

Example agent file

---
name: implementer
description: Implements features against PRDs in this repo.
model: opus
bottle: dev
skills:
  - init-prd
---

You are a feature-implementation agent running inside an
ephemeral bot-bottle sandbox...

Drop the same file into ~/.claude/agents/implementer.md and Claude Code picks it up as a subagent (assuming Claude Code tolerates the bottle: and skills: fields — see Open Questions).
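
The key rules for agent frontmatter (required `bottle:`, optional `skills:`, Claude Code passthrough fields, unknown keys rejected with a hint) could be checked along these lines. This is a sketch under assumptions: `check_agent_frontmatter` is a hypothetical name, it operates on an already-parsed dict, and the exact error wording is illustrative.

```python
# Keys bot-bottle itself interprets.
BOT_BOTTLE_KEYS = {"bottle", "skills"}
# Claude Code passthrough keys: accepted, ignored at launch.
PASSTHROUGH_KEYS = {"name", "description", "model", "color", "memory"}
ACCEPTED_KEYS = BOT_BOTTLE_KEYS | PASSTHROUGH_KEYS


def check_agent_frontmatter(frontmatter: dict[str, object]) -> None:
    """Raise ValueError if the agent frontmatter breaks the key rules."""
    unknown = sorted(set(frontmatter) - ACCEPTED_KEYS)
    if unknown:
        # Typos are not silently ignored; the hint lists what is accepted.
        raise ValueError(
            f"unknown key(s): {', '.join(unknown)}; "
            f"accepted keys: {', '.join(sorted(ACCEPTED_KEYS))}"
        )
    if not isinstance(frontmatter.get("bottle"), str):
        raise ValueError("agent frontmatter requires bottle: <name>")
    skills = frontmatter.get("skills", [])
    if not isinstance(skills, list) or not all(isinstance(s, str) for s in skills):
        raise ValueError("skills: must be a list of skill names")
```

For the example agent file above, the parsed dict has `name`, `description`, `model`, `bottle`, and `skills`, all of which pass; a misspelling such as `bottel: dev` would be reported together with the accepted key list.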

YAML subset grammar

document   := frontmatter? body?
frontmatter := "---" "\n" yaml_block "---" "\n"
yaml_block := (line "\n")*
line       := blank | comment | mapping_line | list_item
mapping_line := indent key ":" (" " value)?
key        := bare_string  ; matches [A-Za-z_][A-Za-z0-9_-]*
value      := scalar | inline_list | inline_dict
scalar     := number | bool | null | quoted_string | bare_string
list_item  := indent "-" " " value

Notable rejections (each dies with a specific error):

  • Anchors (&name), aliases (*name).
  • Multi-line block scalars (|, >, |-, >+).
  • YAML tags (!!str, etc.).
  • yes/no/on/off/Y/N as booleans (we require literal true / false).
  • Unquoted strings that resemble dates (2026-05-24) or octal (0123) — the Norway problem and its kin. If a string would be ambiguous, quote it.
  • Flow style mappings nested more than one level deep.

Parser lives at bot_bottle/yaml_subset.py, ~300 lines. Public API:

def parse_frontmatter(text: str) -> tuple[dict[str, object], str]:
    """Return (frontmatter_dict, body_text). The dict's values are
    str / int / bool / None / list / dict only; nesting capped at
    two levels."""
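
The first stage of `parse_frontmatter`, separating the YAML block from the body, might look like the sketch below. `split_frontmatter` is a hypothetical helper, not part of the stated public API, and raising on an unclosed frontmatter block is an assumption about how "malformed" is reported.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (yaml_block, body).

    yaml_block is "" when the file does not open with a '---' line, in
    which case the whole text is the body.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].rstrip("\r\n") != "---":
        return "", text
    # Look for the closing delimiter after the opening line.
    for i in range(1, len(lines)):
        if lines[i].rstrip("\r\n") == "---":
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    raise ValueError("frontmatter opened with '---' but never closed")
```

A body-only file comes back with an empty YAML block, so the later required-keys check is what rejects it, matching the behavior described in the scope section.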

Existing code touched

  • bot_bottle/manifest.py — Manifest.resolve rewritten to walk the new directories. Manifest.from_json_obj kept as a programmatic entry point (used by tests). New Manifest.from_md_dirs(home_dir, cwd_dir) for the loader.
  • bot_bottle/yaml_subset.py — new. The parser.
  • README.md — manifest section rewritten against the new layout.
  • bot-bottle.example.json — removed; replaced by an examples/ directory with one bottle file + one agent file.
  • Tests — new parser tests + new loader tests; existing manifest tests adapt to either build via from_json_obj (still supported) or use the new directory layout.

Data model

No new dataclasses. Bottle, Agent, Manifest, CredProxyRoute, etc. all stay the same shape. Only the loader changes.

Backward compatibility

This is a breaking change for v1 users. bot-bottle has a single primary user today, so migration is one person rewriting one file — no automated migration command is in scope.

If bot-bottle.json exists in $HOME or $CWD and the new .bot-bottle/ directory does not exist, the resolver dies with a clear pointer at the README's manifest section — not silently merging formats, not silently dropping the JSON content.
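
A sketch of that guard, under stated assumptions: the condition is read per root (a legacy file in a root that has no `.bot-bottle/` directory of its own), and `LegacyManifestError` and `check_legacy_manifest` are hypothetical names. The real resolver may evaluate the condition across both roots at once.

```python
from pathlib import Path


class LegacyManifestError(Exception):
    """Hypothetical error type for a leftover bot-bottle.json."""


def check_legacy_manifest(home: Path, cwd: Path) -> None:
    """Fail loudly when a legacy JSON manifest exists without the new layout."""
    for root in (home, cwd):
        legacy = root / "bot-bottle.json"
        new_dir = root / ".bot-bottle"
        if legacy.is_file() and not new_dir.is_dir():
            raise LegacyManifestError(
                f"{legacy} uses the old single-file layout, which is no longer "
                "read; see the manifest section of the README for the "
                "per-file layout"
            )
```

Running this before the directory walks keeps the failure explicit: the JSON content is neither merged nor silently dropped.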

Open questions

  • Claude Code tolerance for extra frontmatter fields. Test empirically before settling: drop a file with bottle: dev in ~/.claude/agents/ and see whether Claude Code warns, ignores, or breaks. If it warns, namespace the field (bot-bottle-bottle: or a nested bot_bottle: block).
  • Hidden directory vs visible. Default .bot-bottle/ (hidden — matches .config/, .ssh/, .docker/). If users routinely want to navigate to it from the file manager, switch to bot-bottle/. Lean hidden.
  • description: for bottles. Should bottle frontmatter carry a description: field for the y/N preflight? Default no — bottle names are kebab-case and self-describing, and the MD body is the place for human prose.
  • Filename ↔ frontmatter name: drift. If both are present and disagree, warn (we use the filename as the authoritative key). Same for agents.
  • include / glob for shared egress allowlists. A common pattern will be "every bottle allows api.anthropic.com and github.com"; do we want a way to share the list? Default no for v1; revisit if it bites.

References

  • docs/research/manifest-format-and-grouping.md — the analysis this PRD follows from.
  • Closed PR #15 — the resolver-layer trust-boundary attempt; superseded by this PRD's filesystem-layout approach.
  • Closed PR #16 — the research doc + the option-B4 decision comment that picked this design.
  • Claude Code subagent spec — ~/.claude/agents/<name>.md with YAML frontmatter (existing convention this PRD aligns agent files with).