# PRD 0033: Manifest Schema Boundaries - **Status:** Active - **Author:** didericis-codex - **Created:** 2026-06-02 - **Issue:** #125 ## Summary Split the manifest loader's schema validation, filesystem loading, `extends:` resolution, and compatibility passthrough policy into named internal boundaries without changing the public manifest format. The goal is to make `bot_bottle/manifest.py` cheaper to extend and review while preserving the strict validation behavior that keeps manifest mistakes visible. ## Problem `bot_bottle/manifest.py` has become a broad schema surface. It owns dataclass models, per-field validators, per-section unknown-key policy, Markdown frontmatter loading, two-pass bottle inheritance, merge semantics, and effective agent-to-bottle overlays in one file. The logic is deterministic and well covered, but the number of concerns makes schema changes expensive: reviewers have to re-derive loader behavior, parse-time validation, and post-parse composition rules together. One specific coupling is especially easy to miss: agent Markdown files are allowed to double as Claude Code subagent files, so the manifest parser accepts and ignores Claude Code frontmatter fields such as `name`, `description`, `model`, `color`, and `memory`. That compatibility rule is encoded as a passthrough allowlist alongside bot-bottle's own agent schema. If Claude Code adds a frontmatter field and users start sharing files between `~/.claude/agents/` and `.bot-bottle/agents/`, bot-bottle raises `ManifestError` until the local passthrough policy is updated. The current shape is workable, but it creates unnecessary risk for future manifest features. A new field can accidentally mix parsing, inheritance, and compatibility concerns in the same edit, or update one entry path (`from_json_obj`) without matching the Markdown path (`from_md_dirs`). ## Goals / Success Criteria - Preserve the existing public manifest schema and runtime behavior. - Keep `Manifest`, `Bottle`, `Agent`, `GitEntry`, `GitUser`, `AgentProvider`, `EgressRoute`, `EgressConfig`, and `PipelockRoutePolicy` import-compatible from `bot_bottle.manifest`. - Move Markdown file discovery and frontmatter loading behind a small internal loader boundary with tests that show `$HOME` bottles, `$HOME` agents, `$CWD` agent overrides, and ignored `$CWD` bottles still behave as before. - Move bottle `extends:` resolution and merge rules behind a named internal resolver boundary with tests for inheritance, replacement, cycle detection, missing parents, and per-field `git.user` overlays. - Centralize top-level allowed-key policy for bottle and agent schemas so unknown-key errors remain strict and the allowed set is visible in one place per schema. - Make Claude Code passthrough fields a named compatibility policy with focused tests that distinguish accepted passthrough keys from bot-bottle schema keys and true typos. - Keep both entry points, `Manifest.from_json_obj` and `Manifest.from_md_dirs`, covered by tests for shared validation and shared inheritance behavior. ## Non-goals - No manifest format changes. - No migration away from Markdown frontmatter or the stdlib-only YAML subset parser. - No dependency on Pydantic, PyYAML, JSON Schema, or another schema framework. - No relaxation of strict unknown-key validation for bot-bottle fields. - No provider-specific workspace, auth, launch, or egress changes. - No user-facing CLI behavior changes. ## Scope In scope: - Internal module organization for manifest loading and composition. - Validator helpers or schema-policy helpers that reduce duplicated unknown-key and type-checking logic. - Focused regression tests around the two existing load paths. - Documentation comments that clarify compatibility policy where it is encoded. Out of scope: - Renaming public dataclass fields or changing their capitalization. - Reworking callers outside the manifest boundary except for import updates that are mechanically required by an internal split. - Adding new manifest fields. - Changing how `bot-bottle.json` legacy-file errors are reported. ## Design Keep `bot_bottle.manifest` as the public facade. Existing imports should continue to work from that module, even if implementation moves into internal modules such as: - `bot_bottle/manifest_model.py` for dataclasses and field-level parsing. - `bot_bottle/manifest_loader.py` for filesystem layout, Markdown frontmatter loading, stale legacy-file checks, and `$CWD` override rules. - `bot_bottle/manifest_extends.py` for raw-bottle inheritance, cycle checks, and merge semantics. - `bot_bottle/manifest_schema.py` for allowed-key sets, passthrough policy, and small validation helpers. The exact filenames are not required. The required boundary is conceptual: raw input loading, schema validation, bottle inheritance, and effective agent-to-bottle overlays should be separable when reading and testing the code. `Manifest.from_json_obj` should continue to accept a raw JSON-like dict and feed the same raw bottle resolver used by Markdown loading. `Manifest.from_md_dirs` should perform only filesystem discovery and Markdown parsing before passing the same raw sections into the same validator/composer path. That shared path prevents a future schema field from working in one entry point but not the other. Claude Code passthrough fields should be represented as an explicit compatibility allowlist, named as such, and documented near the agent schema policy. The parser should still ignore those fields after validation. Tests should cover every passthrough field currently accepted and at least one unknown field that remains an error. The `extends:` resolver should remain raw-dict based until after inheritance is resolved. Merge rules stay unchanged: - scalar fields use child value when present. - `env` merges by key with child values winning. - `git.remotes` merges by upstream host, with child entries replacing duplicate hosts and explicit empty maps clearing inherited remotes. - `git.user` overlays per field. - `egress` remains full-replace when declared by the child. - cycles, missing parents, and self-reference remain `ManifestError`s. ## Implementation Chunks 1. Add focused characterization tests for agent allowed keys, Claude Code passthrough fields, and parity between `from_json_obj` and Markdown loading. 2. Extract allowed-key and compatibility policy helpers while keeping `bot_bottle.manifest` as the import surface. 3. Extract raw Markdown loading into a loader boundary and rerun existing PRD 0011 tests unchanged. 4. Extract bottle inheritance and merge rules into a resolver boundary and rerun existing PRD 0025 tests unchanged. 5. Trim `bot_bottle.manifest` to the public facade and model composition, leaving compatibility imports for existing callers. Each chunk should be mergeable on its own and should keep the test suite green. ## Testing Strategy Run the existing manifest-focused unit tests after each chunk: - `tests/unit/test_manifest_md_load.py` - `tests/unit/test_manifest_extends.py` - `tests/unit/test_manifest_git.py` - `tests/unit/test_manifest_git_user.py` - `tests/unit/test_manifest_agent_git_user.py` - `tests/unit/test_manifest_egress.py` - `tests/unit/test_manifest_runtime.py` Add new tests only where they lock down boundary behavior not already covered, especially compatibility passthrough and entry-point parity. ## Open Questions - Should the Claude Code passthrough allowlist intentionally track a documented upstream schema, or should bot-bottle keep a narrow local allowlist and update it only when users need a new shared-file field? - Should the public facade continue exposing every helper that tests currently import from `bot_bottle.manifest`, or should tests move to public behavior only during this cleanup?