Files
bot-bottle/docs/prds/0025-bottle-extends.md

256 lines
11 KiB
Markdown

# PRD 0025: Bottle composition via `extends:`
- **Status:** Active
- **Author:** didericis
- **Created:** 2026-05-27
- **Issue:** #88
## Summary
Let a bottle inherit from another bottle by name. A bottle frontmatter
gains an optional `extends: <bottle-name>` field; at manifest-load
time the parent's resolved config is the base and the child's
declared fields overlay it. Bottles remain home-only — the trust
boundary the README documents stays intact.
Solves the "I don't want to duplicate a 50-line bottle just to add
one egress route or env var" pain that motivated issue #88's
`bottle_config` proposal, without weakening the agent-vs-bottle
trust separation that proposal would have eroded.
## Problem
Today the only way to vary a bottle's config is to write a whole
new `bottles/<name>.md`. If `staging` is the same as `dev` but with
one extra egress route, the operator copy-pastes the entire `dev`
file. Drift between copies follows: an egress addition to `dev` is
silently absent from `staging` until someone notices.
Issue #88 proposed inlining a `bottle_config:` block in agent files
that would merge with (and override) the referenced bottle. That
design lets a `$CWD/.bot-bottle/agents/<name>.md` file from a
cloned repo redeclare egress routes, env mappings, and git remotes
— breaking the existing security model where bottles are
`$HOME`-only specifically so cloned repos can't influence them
(see README, "Manifest" section).
`extends:` solves the same composition pain *within the existing
trust boundary*: only `$HOME` bottles can declare it, only `$HOME`
bottles can be its target. Cloned repos still cannot author
bottle-equivalent config.
## Alternatives considered
The question raised in issue #88 was *where composition should live*.
Three points in that design space, recorded here so the decision
stands on its own without the issue thread:
1. **Duplicate bottles (status quo).** Copy `dev.md` to `staging.md`
and edit. Zero new mechanism, but every shared field drifts: a
route added to `dev` is silently missing from `staging`. This is
the pain that prompted #88.
2. **Agent-side `bottle_config:` override (the original #88
proposal).** Let an agent file carry an inline block that merges
over its referenced bottle. Ergonomically attractive — one file,
no second bottle — but it **breaks the trust boundary**: agent
files can come from `$CWD/.bot-bottle/agents/` in a cloned repo, so
a clone could redeclare egress routes, env mappings, and git
remotes — i.e. grant itself bottle-equivalent authority over
credentials and network egress. The home-only-bottle invariant
exists precisely to stop this.
3. **Bottle-side `extends:` (chosen).** Move composition to the
bottle layer, where it inherits the home-only property for free:
only `$HOME` bottles can declare `extends:`, and only `$HOME`
bottles can be its target. Identical duplication relief to option
2, none of its trust erosion. The cost is that an override requires
a (home-owned) child bottle rather than an inline agent block —
which is the *point*: the override authority stays in `$HOME`.
`extends:` wins because it solves the duplication pain entirely on the
trusted side of the agent-vs-bottle boundary. (PRD 0027 later lifts a
deliberately narrow, non-credential field — `git.user` — to the agent
layer, on the separate reasoning that commit identity is not a
capability; egress, credentials, and remotes stay bottle-only.)
## Goals / Success Criteria
- Add `extends: <bottle-name>` to the bottle frontmatter schema.
- At manifest load, resolve `extends:` chains into a fully-merged
effective config before the rest of the pipeline sees the
`Bottle` object. Downstream code (provisioners, compose
renderer, etc.) is unchanged.
- Defined, simple merge semantics (see "Merge rules" below).
- Cycle detection: `A extends B extends A` dies at parse with a
clear pointer.
- Missing parent dies at parse with a clear pointer + the list of
available bottle names.
- Existing bottles continue to parse identically — `extends:` is
opt-in.
- Backend-agnostic: docker + smolmachines behave the same because
they both consume `Bottle` after the merge.
## Non-goals
- **No agent-side `bottle_config:`.** Option 2 under "Alternatives
considered" — weighed and rejected on trust grounds. Don't
reintroduce it.
- **No additive list merges** (e.g., `routes: append` keyword).
The `extends:` design uses full-replace for list-valued fields
(see "Merge rules"); if a use case shows up that genuinely
needs `parent_routes + child_routes`, design that separately
rather than baking it in now.
- **No multi-parent inheritance.** A bottle has at most one
parent. Diamond resolution is out of scope and rarely worth
the complexity for a manifest of this size.
- **No agent-level extends.** Agents stay simple (bottle ref +
skills + prompt). Inheritance lives only on the bottle side.
## Design
### Schema
A new optional top-level frontmatter key on bottle files:
```yaml
---
extends: dev
egress:
routes:
- host: staging.example.com
auth:
scheme: Bearer
token_ref: STAGING_TOKEN
---
```
`extends:` is a string — the name of another bottle (without the
`.md`). Required to be one of the bottles loaded from
`$HOME/.bot-bottle/bottles/`. Self-reference (`extends: self`
in `self.md`) and longer cycles die at parse.
### Merge rules
Resolution walks `extends:` chains bottom-up: parent's
already-resolved config is the base, child's declared fields
overlay it. For each field on `Bottle`:
| Field | Type | Merge |
|--------------|-----------------------|---------------------------------------------|
| `env` | `Mapping[str, str]` | dict merge, child wins on key collision |
| `git.user` | `GitUser` | child overlay: child's non-empty fields win |
| `git.remotes`| `tuple[GitEntry,…]` | dict merge by host, child wins |
| `egress` | `EgressConfig` | full replace if child declares `egress:` |
| `supervise` | `bool` | full replace if child declares `supervise:` |
Why full-replace for `egress.routes[]`:
- **Ordering matters.** Egress route ordering is part of the
match semantics (first matching host wins). Merging two
ordered lists by name introduces "where does the child's route
go?" ambiguity.
- **Simpler precedence.** "Child declares X → X wins, full
stop" is one sentence; partial merges need a table per list.
The `env` dict is the one exception because dict-merge has no
ordering concern and dict-keyed overrides are the obvious user
expectation. (Same model as shell `export` precedence.)
`git.remotes` is also keyed, so it follows dict-style inheritance:
children can override one host without restating every remote. The
remote entry is replaced as a whole on host collision because
`Upstream`, `IdentityFile`, and `KnownHostKey` are tightly coupled.
The `git.user` dataclass-overlay (each non-empty field wins
individually) is so a parent can declare `git.user.name` and a
child can add just `git.user.email`. The default `GitUser()`
fields are empty strings, which are treated as "not set" for
overlay purposes — same `is_empty()` predicate the provisioner
uses.
### Resolution algorithm
```
bottles_raw: dict[name, raw_frontmatter_dict] # before parsing
def resolve(name, seen=()) -> Bottle:
if name in seen:
die(f"bottle '{name}' extends-cycle: {' -> '.join(seen + (name,))}")
raw = bottles_raw[name]
parent_name = raw.get("extends")
if parent_name is None:
return Bottle.from_dict(name, raw) # leaf
if parent_name not in bottles_raw:
die(f"bottle '{name}' extends '{parent_name}' which is not defined; "
f"available: {sorted(bottles_raw)}")
parent = resolve(parent_name, seen + (name,))
return _merge(parent, raw, name)
```
Resolution is cached per-name within a single `Manifest.from_*`
call so a diamond-like reference graph (multiple children
extending the same parent) doesn't reparse the parent N times.
Cycles are caught by the `seen` set; the error message includes
the full chain so operators can find the offending file.
### Trust boundary preservation
Bottles continue to be loaded from `$HOME/.bot-bottle/bottles/`
only (`Manifest.from_md_dirs` is unchanged). The `extends:` field
references another file in that same directory. No cwd-readable
file gains the ability to declare or modify bottle config — the
attack surface from option 2 ("Alternatives considered") stays closed.
If a future change ever introduces cwd-loaded bottles, the
`extends:` resolver should be gated to forbid a `$CWD` bottle
from extending a `$HOME` bottle (lest cwd-loaded config inherit
home-resident credentials via the merge step). Today this is
moot because there's only one bottle source.
## Implementation chunks
1. **PRD (this commit).** Sets the design.
2. **Resolver + schema + tests.**
- Add `"extends"` to `_BOTTLE_KEYS`.
- Add `extends: str = ""` to a new pre-merge raw shape (or
keep raw dicts and resolve as a separate pass before
`Bottle.from_dict`).
- Implement `_resolve_bottle_with_extends` recursive walk
with cycle detection.
- Wire into `_load_bottles_from_dir` so the public
`Manifest.bottles` dict already contains resolved Bottle
instances (downstream code unchanged).
- Implement `_merge(parent: Bottle, child_raw: dict, name: str)
-> Bottle` with the rules table above.
- Unit tests: simple two-bottle extends, env merge with
collision, host-keyed git remote merge, egress list-replace, git.user overlay,
supervise override, missing parent dies, cycle dies, deeper
chains (A extends B extends C).
3. **Docs.** Add an `extends:` example to the README's manifest
section. Note that the field is optional + how merge precedes.
## Testing strategy
- **Unit (must):** all the merge semantics, the parse-time
errors (cycle, missing parent), and a multi-step inheritance
chain locking the resolver's recursion.
- **No integration changes needed:** downstream code consumes
the already-merged `Bottle`. Existing integration tests cover
the docker / smolmachines provisioning paths and would catch
any regression in how `Bottle.git`, `Bottle.egress`, etc., are
consumed.
## Open questions
- **Should the parent appear in the preflight summary?** Right
now the y/N preflight prints the resolved bottle config; with
`extends:` the operator doesn't see *which* fields came from
which level. A short `extends:` annotation in the preflight
output ("inherits from `dev`") would let the operator spot
surprises. Cheap follow-up; out of scope for this PRD.
- **Should `cli.py info <agent>` show the resolution chain?**
Same shape as the preflight question — useful diagnostic
surface, doesn't change the runtime. Out of scope.