Files
bot-bottle/docs/prds/0031-split-merge-provider-route.md
T
didericis-claude f15721b424
test / unit (pull_request) Successful in 39s
test / integration (pull_request) Successful in 46s
complete(prd): mark PRD 0031 active
Provisioned-wins merge and _route_to_yaml_fields are implemented and all
tests pass.
2026-06-02 05:45:28 +00:00

242 lines
9.4 KiB
Markdown

# PRD 0031: Simplify egress route merge and consolidate Route types
- **Status:** Active
- **Author:** didericis-claude
- **Created:** 2026-06-02
- **Issue:** #120
## Summary
Replace `_merge_provider_route`'s five-case nested conditional with a
flat provisioned-wins merge, and make the mapping between the host-side
`EgressRoute` and the addon's `Route` explicit in one place. Covers the
two remaining open tasks from the #117 hotspot review.
## Problem
### 1. `_merge_provider_route` branching
`_merge_provider_route` in `bot_bottle/egress.py` handles five distinct
cases in a single function with interleaved conditions:
1. **append-new** — host not in manifest; append a fresh route.
2. **upgrade-bare** — host found, no existing auth; adopt provider auth.
3. **no-op** — host found, same auth; return unchanged.
4. **tls-passthrough upgrade** — same as no-op but provider sets
`tls_passthrough=True`; flip the flag on the existing route.
5. **conflict-die** — host found, different auth; hard error.
Cases 3 and 4 share a block with no-op as the invisible fall-through.
`_find_or_alloc_token_env` is duplicated between cases 2 and 1. In-place
replacements spell out every `EgressRoute` field explicitly, so a new
field added to the dataclass silently drops its value in any replacement
site that wasn't updated.
The root cause of the complexity is that the current merge tries to be
cooperative: it lets manifest routes coexist with provider routes and
attempts to upgrade bare manifest entries. This makes sense if the
manifest is authoritative, but the actual intended hierarchy is the
opposite — provider routes claim their hosts outright and the manifest
fills in what's left.
### 2. Three-way Route type fragmentation
`EgressRoute` (in `egress.py`) and `egress_addon_core.Route` are
separate dataclasses with overlapping but not identical field sets:
| Field | `EgressRoute` | addon `Route` |
|---|---|---|
| `host` | ✓ | ✓ |
| `path_allowlist` | ✓ | ✓ |
| `auth_scheme` | ✓ | ✓ |
| `token_env` | ✓ | ✓ |
| `token_ref` | ✓ (host-side) | — |
| `roles` | ✓ (host-side) | — |
| `tls_passthrough` | ✓ (pipelock concern) | — |
`egress_render_routes` serialises `EgressRoute` fields to YAML; the
addon's `load_routes` deserialises that YAML into `Route` objects. If a
field is added to `EgressRoute` that should appear in the YAML, both
`egress_render_routes` and `_parse_one` must be updated consistently.
The render function spells the field list out inline with no reference
to the addon's parser, so divergence is silent until runtime.
`egress_addon_core.Route` cannot be replaced by `EgressRoute` — the
addon file is copied flat into the sidecar container image (`/app/`) and
has no access to the `bot_bottle` package. The types must stay separate;
the risk is that they drift.
## Goals / Success Criteria
- `egress_routes_for_bottle` implements a flat provisioned-wins merge:
provider routes claim their hosts; manifest routes for unclaimed hosts
append. No upgrade logic, no conflict detection.
- Token slot assignment is a single pass over the merged list, not
interleaved with the merge.
- `egress_render_routes` uses a single `_route_to_yaml_fields` helper
that explicitly lists the addon-visible fields, creating one place
where the `EgressRoute``Route` mapping is spelled out.
- All existing `TestProviderRouteMerge` and `TestRenderRoutes` tests
pass (adjusting assertions for any semantics changes described below).
- No behaviour change for existing manifests that don't trigger the
conflict-die or upgrade-bare paths.
## Non-goals
- Merging `EgressRoute` and `egress_addon_core.Route` into one class
(impossible: addon runs in a stdlib-only container environment).
- Changing what the addon does with a route once it has one.
- Changing `decide()` or `is_git_push_request()` in `egress_addon_core`
— those are already pure functions with good separation.
## Design
### Merge: provisioned wins
The new hierarchy: **provisioned routes own their hosts; manifest routes
fill the gaps.**
```
provisioned_hosts = {pr.host.lower() for pr in provider_routes}
effective = list(provider_routes)
effective += [r for r in manifest_routes if r.host.lower() not in provisioned_hosts]
```
Token slot assignment runs as a final pass over `effective` in order:
```python
def _assign_token_slots(
routes: list[EgressRoute],
) -> tuple[EgressRoute, ...]:
slot_for_ref: dict[str, str] = {}
out: list[EgressRoute] = []
for r in routes:
if r.auth_scheme and r.token_ref and not r.token_env:
token_env = slot_for_ref.get(r.token_ref)
if token_env is None:
token_env = f"EGRESS_TOKEN_{len(slot_for_ref)}"
slot_for_ref[r.token_ref] = token_env
r = dataclasses.replace(r, token_env=token_env)
out.append(r)
return tuple(out)
```
This replaces `_merge_provider_route`, `_find_or_alloc_token_env`, and
the slot-assignment loop inside `egress_manifest_routes`.
#### Semantics change
Under the old design, a manifest route for a provisioned host with a
*different* `auth_scheme` or `token_ref` raised a hard error. Under
provisioned-wins, the manifest entry is silently dropped. Operators who
relied on the conflict error to catch misconfigurations should audit
their manifests, but in practice this path was only reachable when a
manifest declared auth for `api.openai.com` or `chatgpt.com` with a
token ref other than `CODEX_HOST_CREDENTIAL_TOKEN_REF` while also
enabling `forward_host_credentials` — an unlikely combination.
Similarly, the "upgrade-bare" path (provider adopts a bare manifest
route's `path_allowlist`) is dropped: a provisioned host takes the
provider route's fields wholesale, and the manifest's `path_allowlist`
for that host is ignored.
### Route type mapping: `_route_to_yaml_fields`
Add a pure function in `egress.py`:
```python
def _route_to_yaml_fields(r: EgressRoute) -> dict:
"""Return the addon-visible fields for one route.
This is the single authoritative mapping between `EgressRoute`
(host-side) and `egress_addon_core.Route` (sidecar-side). If a
field is added to `Route` that must appear in the YAML, add it
here and in `egress_addon_core._parse_one` together."""
fields: dict = {"host": r.host}
if r.auth_scheme and r.token_env:
fields["auth_scheme"] = r.auth_scheme
fields["token_env"] = r.token_env
if r.path_allowlist:
fields["path_allowlist"] = list(r.path_allowlist)
return fields
```
`egress_render_routes` delegates to it:
```python
for r in routes:
f = _route_to_yaml_fields(r)
lines.append(f' - host: "{f["host"]}"')
if "auth_scheme" in f:
lines.append(f' auth_scheme: "{f["auth_scheme"]}"')
lines.append(f' token_env: "{f["token_env"]}"')
if "path_allowlist" in f:
lines.append(" path_allowlist:")
for p in f["path_allowlist"]:
lines.append(f' - "{p}"')
```
The docstring on `_route_to_yaml_fields` is the explicit callout to
update both it and `_parse_one` together when the schema changes.
### `egress_manifest_routes` / `egress_routes_for_bottle`
`egress_manifest_routes` becomes a pure lifter with no slot assignment:
it reads each manifest route entry and returns an `EgressRoute` with
`token_env=""` (the slot to be filled later). The function's docstring
currently promises slot assignment; that promise moves to
`egress_routes_for_bottle`.
`egress_routes_for_bottle` becomes:
```python
def egress_routes_for_bottle(
bottle: Bottle,
provider_routes: tuple[EgressRoute, ...] = (),
) -> tuple[EgressRoute, ...]:
manifest = egress_manifest_routes(bottle)
provisioned_hosts = {pr.host.lower() for pr in provider_routes}
merged = list(provider_routes) + [
r for r in manifest if r.host.lower() not in provisioned_hosts
]
return _assign_token_slots(merged)
```
### Mitmproxy request logic
`egress_addon_core.decide()` and `is_git_push_request()` are already
pure functions; `egress_addon.EgressAddon.request()` is the minimal
mitmproxy glue (read host/path/headers from flow → call pure functions
→ apply result to flow). This split is already clean and requires no
structural change in this PRD.
## Test impact
- **`TestProviderRouteMerge`**: the `test_provider_route_upgrades_bare_manifest_route`
test asserts that a provider route preserves a bare manifest route's
`path_allowlist`. Under provisioned-wins that `path_allowlist` is
dropped. Update the test to reflect the new semantics.
- **`test_provider_route_conflicts_with_different_authed_manifest_route`**:
the conflict-die case no longer exists. Remove this test.
- All other merge and render tests should pass without modification.
## Implementation chunks
1. **PRD (this commit).** Sets the design.
2. **Merge refactor.** Replace `_merge_provider_route` and
`_find_or_alloc_token_env` with `_assign_token_slots` and the
flat provisioned-wins logic in `egress_routes_for_bottle`. Strip
slot assignment from `egress_manifest_routes`.
3. **Render consolidation.** Add `_route_to_yaml_fields`; update
`egress_render_routes` to use it.
4. **Test updates.** Adjust `TestProviderRouteMerge` for the semantics
changes above; confirm all render tests pass.
## References
- Issue #120: Refactor `_merge_provider_route` (expanded to include
Route type fragmentation).
- Issue #117: Complexity hotspots — source of both findings.
- PRD 0030: Deduplicate egress token resolution (prior egress cleanup).