bot-bottle/docs/prds/0011-per-file-md-manifest.md
2026-05-28 17:56:14 -04:00

# PRD 0011: Per-file Markdown manifest
- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-24
## Summary
Replace the single-file `bot-bottle.json` manifest with a
per-file Markdown-with-YAML-frontmatter layout. Bottles live as
`$HOME/.bot-bottle/bottles/<name>.md`; agents live as
`$HOME/.bot-bottle/agents/<name>.md` (home-resident) and
`$CWD/.bot-bottle/agents/<name>.md` (repo-supplied). Each file
carries its structured config in YAML frontmatter and (for agents)
its system prompt in the Markdown body.
The format change clears the way for the layout change: one file
per bottle and one file per agent, with `bottles/` and `agents/`
under `$HOME` but only `agents/` under `$CWD`. The `$HOME` / `$CWD`
trust boundary stops living in resolver logic (PRD 0011-v1's
CwdExtension approach, closed in favor of this design) and becomes
filesystem layout: `$CWD` has no `bottles/` subdirectory, period.
The YAML we accept is bounded (flat keys → strings, lists, simple
nested dicts), so the parser is hand-rolled and stdlib-only — no
PyYAML dependency. The project's "low deps by default" stance
(CLAUDE.md) stays intact.
## Problem
`bot-bottle.json` works fine at one bottle and one agent. The
project is heading for many of both, and the single-JSON shape
starts to fray:
- **Discovery + diff scaling.** A user with 8 bottles and 12
agents lands at hundreds of lines of nested JSON. Two changes
to unrelated agents touch the same file; codeowners-style
ownership doesn't apply. File-oriented tools (`grep`, `fd`)
can't isolate one agent without parsing the whole file.
- **No comments, no multi-line strings.** Agent prompts longer
than a sentence become single-line escaped horrors in JSON.
Documentation about why a bottle exists (which tokens it
holds, why these egress allowlist entries) has nowhere natural
to live in the manifest file itself; a sibling README drifts.
- **Trust boundary lives in code, not on disk.** PRD 0011-v1
(closed; see PR #15) made the resolver reject cwd manifests
that try to define bottles. The rule is correct and enforced,
but it's invisible to anyone reading the on-disk layout —
there's no positive signal that `$HOME` is the only place
bottles can come from. A reader has to know the resolver's
rules to audit the security posture.
The companion research
(`docs/research/manifest-format-and-grouping.md`) walks the two
axes (grouping × format) and lands on this design.
## Goals / Success criteria
Each test runs against a temporary `$HOME` and a temporary `$CWD`:
1. **A bottle file under `$HOME/.bot-bottle/bottles/`
parses.** A `dev.md` file with YAML frontmatter declaring
`cred_proxy.routes`, `git`, `env`, `egress` produces a Bottle
dataclass equivalent to the current JSON shape.
2. **An agent file under `$HOME/.bot-bottle/agents/` parses.**
`implementer.md` with frontmatter that names `bottle:`,
`skills:`, and other fields, with the body as the system
prompt, produces an Agent dataclass.
3. **An agent file under `$CWD/.bot-bottle/agents/` parses
and overrides home-resident agents of the same name.** The
cwd agent's frontmatter and body win; the home bottle it
references stays intact.
4. **A bottle file under `$CWD/.bot-bottle/bottles/` is
ignored.** The directory does not contribute to the
manifest; if a user accidentally creates one, the launcher
emits a `warn`-level log naming the offending files and
continues. Filesystem layout is the boundary; the warning
is a usability nicety, not a security gate.
5. **No third-party Python dependencies introduced.** A fresh
clone with only stdlib + bot-bottle's own code runs every
parser test. Frontmatter parsing is hand-rolled against the
declared YAML subset.
6. **Existing tests pass against the new layout.** Tests today
build manifests via JSON literals against `Manifest.from_json_obj`.
That entry point keeps working for tests (used to construct
manifests programmatically); production resolution flows
through the new directory-globbing loader.
7. **Agent files double as Claude Code subagent files.** The
`name`, `description`, `model`, `color`, and `memory` fields
from Claude Code's existing subagent spec are accepted in
our frontmatter alongside our own fields. Copying an agent
file from `$HOME/.bot-bottle/agents/` to
`~/.claude/agents/` produces a working Claude Code subagent
(subject to Claude Code's tolerance for the extra `bottle:`
and `skills:` fields; see Open Questions).
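The temporary `$HOME` / `$CWD` setup these criteria share could be sketched as a small fixture. The helper name `manifest_dirs` and the `(home, cwd)` tuple it yields are assumptions for illustration, not part of the existing test suite.

```python
import contextlib
import tempfile
from pathlib import Path


@contextlib.contextmanager
def manifest_dirs():
    """Yield (home, cwd) temp directories with the PRD's layout pre-created.

    Hypothetical test helper; the names here are illustrative only.
    """
    with tempfile.TemporaryDirectory() as home, tempfile.TemporaryDirectory() as cwd:
        home_path, cwd_path = Path(home), Path(cwd)
        # Bottles exist only on the $HOME side of the trust boundary.
        (home_path / ".bot-bottle" / "bottles").mkdir(parents=True)
        (home_path / ".bot-bottle" / "agents").mkdir(parents=True)
        (cwd_path / ".bot-bottle" / "agents").mkdir(parents=True)
        yield home_path, cwd_path
```

A test for criterion 4 would then create `$CWD/.bot-bottle/bottles/` itself, since the fixture deliberately leaves it absent.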
## Non-goals
- **A general YAML implementation.** The parser handles the
subset bot-bottle's frontmatter actually uses; documents
that exceed the subset (anchors, multi-line block scalars,
tags, implicit type coercion, flow style, etc.) die with a
pointer at the spec. We are not building a YAML library.
- **Compatibility with the old JSON layout at runtime.** The
resolver no longer reads `bot-bottle.json` files. This is
a breaking change; existing users hand-rewrite their JSON
into the new per-file layout (bot-bottle has a single
primary user today, so the migration is one person rewriting
one file). Documented as part of the README rewrite.
- **`$HOME/.claude/agents/` integration on the input side.** We
don't read agent files out of Claude Code's directory. Our
files can be copied into Claude Code's tree by the user if
they want, but the input path for bot-bottle is its own
directory.
- **A signed-manifest scheme.** Out of scope per the
closed-PR-15 PRD; the trust boundary here is "your home
directory is yours."
- **Per-bottle inheritance / composition.** Each bottle file is
self-contained. If shared egress allowlists become common we
can revisit, but the v1 of this PRD is one file = one bottle.
- **Hot-reload.** Changes to manifest files take effect at next
`./cli.py start`; we do not watch the directory.
## Scope
### In scope
- **Directory layout.**
  - `$HOME/.bot-bottle/bottles/<name>.md`: bottle
    definitions (full schema; one Bottle per file).
  - `$HOME/.bot-bottle/agents/<name>.md`: home-resident
    agents.
  - `$CWD/.bot-bottle/agents/<name>.md`: cwd-resident
    agents; same schema as home agents, but bottle names must
    resolve against the home set.
  - `$CWD/.bot-bottle/bottles/`: ignored with a warn-level
    log (see SC #4). Does not contribute to the manifest.
  - `<name>` is the file basename without `.md`. Filenames must
    match `[a-z][a-z0-9-]*` (kebab-case, ASCII-only).
- **File schema.** Markdown with YAML frontmatter. Frontmatter
delimited by `---` lines at the top of the file; everything
after the closing `---` is the body. For agents, body is the
system prompt. For bottles, body is human documentation
(optional, ignored by the parser).
- **Agent frontmatter fields.**
  - `bottle: <name>` (required): bottle to launch in.
  - `skills: [<name>, ...]` (optional): host-side skills under
    `~/.claude/skills/`.
  - `name`, `description`, `model`, `color`, `memory`: accepted
    but treated as Claude Code passthrough; bot-bottle
    ignores them at launch but doesn't reject them. Lets the same
    file double as a Claude Code subagent.
  - Unknown top-level keys die with a hint listing accepted
    keys. We don't silently ignore typos.
- **Bottle frontmatter fields.** Same keys as today's JSON
schema: `env`, `git`, `cred_proxy.routes`, `egress.allowlist`,
`egress.dlp_action`. No semantic changes.
- **YAML subset parser.** Hand-rolled, stdlib-only. Supports:
  - Flat `key: value` pairs at the top level.
  - String, int, bool (`true`/`false` only; no `yes`/`no`/`on`/
    `off`), null (`null` / explicit `~`).
  - Lists: block-style `- item` lines; items are strings or
    flow lists/dicts of the same.
  - Nested dicts: one level under a key, block-style.
  - Quoted strings: single and double, with JSON-style escapes.
  - Comments: `# ...` at end of line or on its own line.

  Rejects with a clear error: anchors and aliases (`&`/`*`),
  multi-line block scalars (`|`, `>`), tags (`!!`), unquoted
  scalars that YAML would implicitly retype (`NO` read as a
  boolean, the "Norway problem"; date-like strings read as dates),
  and flow style nested deeper than one level. An empty document is
  fine. Missing frontmatter delimiters are not a parse error for
  bottles (a body-only file is treated as having no frontmatter,
  which then fails the required-keys check with the same diagnostic
  as malformed frontmatter).
- **Manifest assembly.** New resolver:
  1. Walk `$HOME/.bot-bottle/bottles/*.md` → Bottle dict
     keyed by filename.
  2. Walk `$HOME/.bot-bottle/agents/*.md` → Agent dict.
  3. Walk `$CWD/.bot-bottle/agents/*.md` → Agent dict; merge
     into the home agent dict, cwd wins on name collision.
  4. Validate every agent's `bottle:` against the bottle dict.
  5. Warn if `$CWD/.bot-bottle/bottles/` exists with files.
  6. Return the Manifest dataclass (same shape as today).
- **Docs.** README's manifest section rewrites against the new
layout. `bot-bottle.example.json` becomes
`examples/bottles/dev.md` + `examples/agents/implementer.md`.
The PRD 0010 example block in its own document gets a
follow-up commit noting the new layout (out of scope for
this PRD; only update README + example files here).
- **Tests.**
  - `tests/unit/test_yaml_subset_parser.py`: the parser
    itself, including all the rejection cases listed above.
  - `tests/unit/test_manifest_md_load.py`: directory globbing
    and assembly, covering the seven success criteria.
  - Existing integration tests keep working (the only public
    entry points they hit are `Manifest.resolve` and
    `Manifest.from_json_obj`).
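The merge-and-validate part of the manifest assembly steps above can be sketched on plain dicts. This is a minimal illustration, not the resolver itself: the function name `assemble`, the dict-based agent shape, and the returned dict (standing in for the Manifest dataclass) are all assumptions. Directory walking and the stray-bottles warning are left out.

```python
def assemble(home_bottles, home_agents, cwd_agents):
    """Merge agent dicts (cwd wins) and validate bottle references.

    All three arguments are plain dicts keyed by file basename; each agent
    value is a dict with at least a "bottle" key. Hypothetical helper that
    mirrors steps 3, 4, and 6 only.
    """
    agents = {**home_agents, **cwd_agents}  # later dict wins on collision
    for name, agent in agents.items():
        bottle = agent.get("bottle")
        # Bottles only ever come from the home set, so that is the
        # single place a reference can resolve against.
        if bottle not in home_bottles:
            known = ", ".join(sorted(home_bottles)) or "(none)"
            raise ValueError(
                f"agent {name!r} references unknown bottle {bottle!r}; "
                f"known bottles: {known}"
            )
    return {"bottles": dict(home_bottles), "agents": agents}
```

Because the bottle dict is built only from `$HOME`, a cwd agent can pick which home bottle it runs in but cannot introduce a new one.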
### Out of scope
- Watching the directory for changes mid-session.
- An automated migration command. Existing JSON users
hand-rewrite into the new layout. The README rewrite
documents the new shape; that's the migration surface.
- Validating that frontmatter `name:` matches the filename.
Soft check via a warn log if mismatched, but not enforced.
- A bottle/agent dependency graph beyond the existing `bottle:`
field. No "this agent extends this other agent."
- IDE schemas / JSON Schema export for the MD format.
## Proposed design
### File layout
```
$HOME/.bot-bottle/
├── bottles/
│   ├── dev.md
│   ├── gitea-dev.md
│   └── ...
└── agents/
    ├── implementer.md
    ├── researcher.md
    └── ...
$CWD/.bot-bottle/
└── agents/
    └── <repo-specific>.md
```
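`bottles/` only exists under `$HOME`. The directory's absence
under `$CWD` is the boundary: the loader never reads bottle
definitions from there, and only checks for stray files so it can
warn about them (see success criterion 4).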
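A minimal sketch of how discovery over this layout might look, using only `pathlib`. The function names, the logger name, and the choice to raise on a badly named file are assumptions; the PRD only states that filenames must match `[a-z][a-z0-9-]*` and that stray `$CWD` bottle files get a warn-level log.

```python
import logging
import re
from pathlib import Path

NAME_RE = re.compile(r"[a-z][a-z0-9-]*")  # kebab-case, ASCII-only
log = logging.getLogger("bot_bottle.manifest")  # logger name is an assumption


def discover(directory: Path) -> dict[str, Path]:
    """Map <name> -> path for every valid *.md file directly in `directory`."""
    found: dict[str, Path] = {}
    if not directory.is_dir():
        return found
    for path in sorted(directory.glob("*.md")):
        if not NAME_RE.fullmatch(path.stem):
            raise ValueError(f"{path}: name must match [a-z][a-z0-9-]*")
        found[path.stem] = path
    return found


def warn_stray_cwd_bottles(cwd: Path) -> list[Path]:
    """Warn about (but never load) files under $CWD/.bot-bottle/bottles/."""
    stray = sorted((cwd / ".bot-bottle" / "bottles").glob("*.md"))
    if stray:
        log.warning("ignoring bottle files under $CWD: %s",
                    ", ".join(str(p) for p in stray))
    return stray
```

`discover` would be called three times (home bottles, home agents, cwd agents); `warn_stray_cwd_bottles` is the only code that touches the `$CWD` bottles path, and it returns paths without parsing them.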
### Example bottle file
```markdown
---
cred_proxy:
  routes:
    - path: /anthropic/
      upstream: https://api.anthropic.com
      auth_scheme: Bearer
      token_ref: BOT_BOTTLE_OAUTH_TOKEN
      role: anthropic-base-url
    - path: /gitea/dideric/
      upstream: https://gitea.dideric.is
      auth_scheme: token
      token_ref: GITEA_TOKEN
      role: [git-insteadof, tea-login]
git:
  remotes:
    gitea.dideric.is:
      Name: bot-bottle
      Upstream: ssh://git@gitea.dideric.is:30009/didericis/bot-bottle.git
      IdentityFile: ~/.ssh/gitea-delos-2.pem
      ExtraHosts:
        gitea.dideric.is: 100.78.141.42
      KnownHostKey: ssh-rsa AAAAB3...
egress:
  allowlist:
    - example.com
---
The `dev` bottle. Backs my work on personal projects:
- Anthropic OAuth via cred-proxy
- gitea.dideric.is over SSH (with PAT for tea API)
- example.com in the egress allowlist
```
### Example agent file
```markdown
---
name: implementer
description: Implements features against PRDs in this repo.
model: opus
bottle: dev
skills:
  - init-prd
---
You are a feature-implementation agent running inside an
ephemeral bot-bottle sandbox...
```
Drop the same file into `~/.claude/agents/implementer.md` and
Claude Code picks it up as a subagent (assuming Claude Code
tolerates the `bottle:` and `skills:` fields — see Open
Questions).
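The accepted-key rule for agent frontmatter (required `bottle`, optional `skills`, Claude Code passthrough fields, everything else rejected) could be checked along these lines. The function name `check_agent_keys` and the set constants are hypothetical; the input is assumed to be the already-parsed frontmatter dict.

```python
REQUIRED = {"bottle"}
OPTIONAL = {"skills"}
PASSTHROUGH = {"name", "description", "model", "color", "memory"}
ACCEPTED = REQUIRED | OPTIONAL | PASSTHROUGH


def check_agent_keys(frontmatter: dict) -> dict:
    """Return only the fields bot-bottle acts on; reject typos loudly."""
    unknown = sorted(set(frontmatter) - ACCEPTED)
    if unknown:
        raise ValueError(
            f"unknown key(s) {unknown}; accepted keys: {sorted(ACCEPTED)}"
        )
    missing = sorted(REQUIRED - set(frontmatter))
    if missing:
        raise ValueError(f"missing required key(s): {missing}")
    # Claude Code passthrough fields are accepted but ignored at launch.
    return {k: v for k, v in frontmatter.items() if k not in PASSTHROUGH}
```

For the example file above, this would keep `bottle` and `skills` and drop `name`, `description`, and `model`; a misspelled `skils:` would fail with the list of accepted keys.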
### YAML subset grammar
```
document := frontmatter? body?
frontmatter := "---" "\n" yaml_block "---" "\n"
yaml_block := (line "\n")*
line := blank | comment | mapping_line | list_item
mapping_line := indent key ":" (" " value)?
key := bare_string ; matches [A-Za-z_][A-Za-z0-9_-]*
value := scalar | inline_list | inline_dict
scalar := number | bool | null | quoted_string | bare_string
list_item := indent "-" " " value
```
Notable rejections (each dies with a specific error):
- Anchors (`&name`), aliases (`*name`).
- Multi-line block scalars (`|`, `>`, `|-`, `>+`).
- YAML tags (`!!str`, etc.).
- `yes`/`no`/`on`/`off`/`Y`/`N` as booleans (the Norway problem;
we require literal `true` / `false`).
- Unquoted strings that resemble dates (`2026-05-24`) or octal
(`0123`). If a string would be ambiguous, quote it.
- Flow style mappings nested more than one level deep.
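To make the scalar-level rejections concrete, here is a minimal sketch of how one unquoted token might be classified. It is an illustration under assumptions, not the parser's actual code: the function name `parse_scalar` and the error wording are invented, and quoted strings, flow collections, and comments are out of its scope.

```python
import re

_INT = re.compile(r"-?(0|[1-9][0-9]*)")
_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
_YAML11_BOOLS = {"yes", "no", "on", "off", "y", "n"}


def parse_scalar(token: str):
    """Parse one unquoted scalar, rejecting anything outside the subset."""
    if token in ("null", "~"):
        return None
    if token in ("true", "false"):
        return token == "true"
    if token[:1] in ("&", "*"):
        raise ValueError(f"anchors/aliases are not supported: {token!r}")
    if token[:1] == "!":
        raise ValueError(f"YAML tags are not supported: {token!r}")
    if token[:1] in ("|", ">"):
        raise ValueError(f"block scalars are not supported: {token!r}")
    if token.lower() in _YAML11_BOOLS:
        raise ValueError(f"use literal true/false, or quote it: {token!r}")
    if _DATE.fullmatch(token):
        raise ValueError(f"date-like string must be quoted: {token!r}")
    if _INT.fullmatch(token):
        return int(token)
    if re.fullmatch(r"0[0-9]+", token):
        raise ValueError(f"octal-like string must be quoted: {token!r}")
    return token  # anything else is a bare string
```

Only the exact token `~` is null, so a path such as `~/.ssh/key` stays a bare string, and `Norway` parses as a string while `NO` is refused.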
Parser lives at `bot_bottle/yaml_subset.py`, ~300 lines.
Public API:
```python
def parse_frontmatter(text: str) -> tuple[dict[str, object], str]:
    """Return (frontmatter_dict, body_text). The dict's values are
    str / int / bool / None / list / dict only; nesting capped at
    two levels."""
```
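Before any YAML is parsed, the text between the `---` lines has to be separated from the body. A minimal sketch of that first step follows; the helper name `split_frontmatter` is hypothetical, and raising on an unclosed block is an assumption about the desired diagnostic.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a document into (yaml_block, body) on `---` delimiter lines.

    A file with no opening `---` line is treated as body-only.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].rstrip("\r\n") != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].rstrip("\r\n") == "---":
            # Everything between the delimiters is YAML; the rest is body.
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    raise ValueError("frontmatter opened with '---' but never closed")
```

Only the first closing `---` ends the frontmatter, so a later `---` in an agent's system prompt (a Markdown horizontal rule, for instance) stays part of the body.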
### Existing code touched
- **`bot_bottle/manifest.py`** — `Manifest.resolve` rewritten
to walk the new directories. `Manifest.from_json_obj` kept as
a programmatic entry point (used by tests). New
`Manifest.from_md_dirs(home_dir, cwd_dir)` for the loader.
- **`bot_bottle/yaml_subset.py`** — new. The parser.
- **`README.md`** — manifest section rewritten against the new
layout.
- **`bot-bottle.example.json`** — removed; replaced by an
`examples/` directory with one bottle file + one agent file.
- **Tests** — new parser tests + new loader tests; existing
manifest tests adapt to either build via `from_json_obj`
(still supported) or use the new directory layout.
### Data model
No new dataclasses. `Bottle`, `Agent`, `Manifest`, `CredProxyRoute`,
etc. all stay the same shape. Only the loader changes.
### Backward compatibility
This is a breaking change for v1 users. bot-bottle has a
single primary user today, so migration is one person rewriting
one file — no automated migration command is in scope.
If `bot-bottle.json` exists in `$HOME` or `$CWD` *and* the
new `.bot-bottle/` directory does not exist, the resolver
dies with a clear pointer at the README's manifest section —
not silently merging formats, not silently dropping the JSON
content.
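The legacy-file check described above might look like the following sketch. The function and exception names are hypothetical, and it assumes "the new `.bot-bottle/` directory does not exist" means the directory is absent from both `$HOME` and `$CWD`; the PRD does not spell that out.

```python
from pathlib import Path


class LegacyManifestError(RuntimeError):
    """Hypothetical error type for a leftover bot-bottle.json."""


def check_legacy_manifest(home: Path, cwd: Path) -> None:
    """Die if only the old single-file manifest is present."""
    legacy = [d / "bot-bottle.json" for d in (home, cwd)
              if (d / "bot-bottle.json").is_file()]
    # Assumption: the new layout counts as present if either location has it.
    has_new_layout = any((d / ".bot-bottle").is_dir() for d in (home, cwd))
    if legacy and not has_new_layout:
        found = ", ".join(str(p) for p in legacy)
        raise LegacyManifestError(
            f"found legacy manifest ({found}) but no .bot-bottle/ directory; "
            "see the README's manifest section for the per-file layout"
        )
```

Once a `.bot-bottle/` directory exists, a leftover `bot-bottle.json` no longer triggers the error under this reading.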
## Open questions
- **Claude Code tolerance for extra frontmatter fields.** Test
empirically before settling: drop a file with `bottle: dev`
in `~/.claude/agents/` and see whether Claude Code warns,
ignores, or breaks. If it warns, namespace the field
(`bot-bottle-bottle:` or a nested `bot_bottle:` block).
- **Hidden directory vs visible.** Default `.bot-bottle/`
(hidden — matches `.config/`, `.ssh/`, `.docker/`). If users
routinely want to navigate to it from the file manager,
switch to `bot-bottle/`. Lean hidden.
- **`description:` for bottles.** Should bottle frontmatter
carry a `description:` field for the y/N preflight? Default
no — bottle names are kebab-case and self-describing, and
the MD body is the place for human prose.
- **Filename ↔ frontmatter `name:` drift.** If both are
present and disagree, warn (we use the filename as the
authoritative key). Same for agents.
- **`include` / glob for shared egress allowlists.** A common
pattern will be "every bottle allows api.anthropic.com and
github.com"; do we want a way to share the list? Default no
for v1; revisit if it bites.
## References
- `docs/research/manifest-format-and-grouping.md` — the
analysis this PRD follows from.
- Closed PR #15 — the resolver-layer trust-boundary attempt;
superseded by this PRD's filesystem-layout approach.
- Closed PR #16 — the research doc + the option-B4 decision
comment that picked this design.
- Claude Code subagent spec — `~/.claude/agents/<name>.md`
with YAML frontmatter (existing convention this PRD aligns
agent files with).