242 lines
9.4 KiB
Markdown
242 lines
9.4 KiB
Markdown
# PRD 0031: Simplify egress route merge and consolidate Route types
|
|
|
|
- **Status:** Active
|
|
- **Author:** didericis-claude
|
|
- **Created:** 2026-06-02
|
|
- **Issue:** #120
|
|
|
|
## Summary
|
|
|
|
Replace `_merge_provider_route`'s five-case nested conditional with a
|
|
flat provisioned-wins merge, and make the mapping between the host-side
|
|
`EgressRoute` and the addon's `Route` explicit in one place. Covers the
|
|
two remaining open tasks from the #117 hotspot review.
|
|
|
|
## Problem
|
|
|
|
### 1. `_merge_provider_route` branching
|
|
|
|
`_merge_provider_route` in `bot_bottle/egress.py` handles five distinct
|
|
cases in a single function with interleaved conditions:
|
|
|
|
1. **append-new** — host not in manifest; append a fresh route.
|
|
2. **upgrade-bare** — host found, no existing auth; adopt provider auth.
|
|
3. **no-op** — host found, same auth; return unchanged.
|
|
4. **tls-passthrough upgrade** — same as no-op but provider sets
|
|
`tls_passthrough=True`; flip the flag on the existing route.
|
|
5. **conflict-die** — host found, different auth; hard error.
|
|
|
|
Cases 3 and 4 share a block with no-op as the invisible fall-through.
|
|
`_find_or_alloc_token_env` is duplicated between cases 2 and 1. In-place
|
|
replacements spell out every `EgressRoute` field explicitly, so a new
|
|
field added to the dataclass silently drops its value in any replacement
|
|
site that wasn't updated.
|
|
|
|
The root cause of the complexity is that the current merge tries to be
|
|
cooperative: it lets manifest routes coexist with provider routes and
|
|
attempts to upgrade bare manifest entries. This makes sense if the
|
|
manifest is authoritative, but the actual intended hierarchy is the
|
|
opposite — provider routes claim their hosts outright and the manifest
|
|
fills in what's left.
|
|
|
|
### 2. Three-way Route type fragmentation
|
|
|
|
`EgressRoute` (in `egress.py`) and `egress_addon_core.Route` are
|
|
separate dataclasses with overlapping but not identical field sets:
|
|
|
|
| Field | `EgressRoute` | addon `Route` |
|
|
|---|---|---|
|
|
| `host` | ✓ | ✓ |
|
|
| `path_allowlist` | ✓ | ✓ |
|
|
| `auth_scheme` | ✓ | ✓ |
|
|
| `token_env` | ✓ | ✓ |
|
|
| `token_ref` | ✓ (host-side) | — |
|
|
| `roles` | ✓ (host-side) | — |
|
|
| `tls_passthrough` | ✓ (pipelock concern) | — |
|
|
|
|
`egress_render_routes` serialises `EgressRoute` fields to YAML; the
|
|
addon's `load_routes` deserialises that YAML into `Route` objects. If a
|
|
field is added to `EgressRoute` that should appear in the YAML, both
|
|
`egress_render_routes` and `_parse_one` must be updated consistently.
|
|
The render function spells the field list out inline with no reference
|
|
to the addon's parser, so divergence is silent until runtime.
|
|
|
|
`egress_addon_core.Route` cannot be replaced by `EgressRoute` — the
|
|
addon file is copied flat into the sidecar container image (`/app/`) and
|
|
has no access to the `bot_bottle` package. The types must stay separate;
|
|
the risk is that they drift.
|
|
|
|
## Goals / Success Criteria
|
|
|
|
- `egress_routes_for_bottle` implements a flat provisioned-wins merge:
|
|
provider routes claim their hosts; manifest routes for unclaimed hosts
|
|
append. No upgrade logic, no conflict detection.
|
|
- Token slot assignment is a single pass over the merged list, not
|
|
interleaved with the merge.
|
|
- `egress_render_routes` uses a single `_route_to_yaml_fields` helper
|
|
that explicitly lists the addon-visible fields, creating one place
|
|
where the `EgressRoute`→`Route` mapping is spelled out.
|
|
- All existing `TestProviderRouteMerge` and `TestRenderRoutes` tests
|
|
pass (adjusting assertions for any semantics changes described below).
|
|
- No behaviour change for existing manifests that don't trigger the
|
|
conflict-die or upgrade-bare paths.
|
|
|
|
## Non-goals
|
|
|
|
- Merging `EgressRoute` and `egress_addon_core.Route` into one class
|
|
(impossible: addon runs in a stdlib-only container environment).
|
|
- Changing what the addon does with a route once it has one.
|
|
- Changing `decide()` or `is_git_push_request()` in `egress_addon_core`
|
|
— those are already pure functions with good separation.
|
|
|
|
## Design
|
|
|
|
### Merge: provisioned wins
|
|
|
|
The new hierarchy: **provisioned routes own their hosts; manifest routes
|
|
fill the gaps.**
|
|
|
|
```
|
|
provisioned_hosts = {pr.host.lower() for pr in provider_routes}
|
|
|
|
effective = list(provider_routes)
|
|
effective += [r for r in manifest_routes if r.host.lower() not in provisioned_hosts]
|
|
```
|
|
|
|
Token slot assignment runs as a final pass over `effective` in order:
|
|
|
|
```python
|
|
def _assign_token_slots(
|
|
routes: list[EgressRoute],
|
|
) -> tuple[EgressRoute, ...]:
|
|
slot_for_ref: dict[str, str] = {}
|
|
out: list[EgressRoute] = []
|
|
for r in routes:
|
|
if r.auth_scheme and r.token_ref and not r.token_env:
|
|
token_env = slot_for_ref.get(r.token_ref)
|
|
if token_env is None:
|
|
token_env = f"EGRESS_TOKEN_{len(slot_for_ref)}"
|
|
slot_for_ref[r.token_ref] = token_env
|
|
r = dataclasses.replace(r, token_env=token_env)
|
|
out.append(r)
|
|
return tuple(out)
|
|
```
|
|
|
|
This replaces `_merge_provider_route`, `_find_or_alloc_token_env`, and
|
|
the slot-assignment loop inside `egress_manifest_routes`.
|
|
|
|
#### Semantics change
|
|
|
|
Under the old design, a manifest route for a provisioned host with a
|
|
*different* `auth_scheme` or `token_ref` raised a hard error. Under
|
|
provisioned-wins, the manifest entry is silently dropped. Operators who
|
|
relied on the conflict error to catch misconfigurations should audit
|
|
their manifests, but in practice this path was only reachable when a
|
|
manifest declared auth for `api.openai.com` or `chatgpt.com` with a
|
|
token ref other than `CODEX_HOST_CREDENTIAL_TOKEN_REF` while also
|
|
enabling `forward_host_credentials` — an unlikely combination.
|
|
|
|
Similarly, the "upgrade-bare" path (provider adopts a bare manifest
|
|
route's `path_allowlist`) is dropped: a provisioned host takes the
|
|
provider route's fields wholesale, and the manifest's `path_allowlist`
|
|
for that host is ignored.
|
|
|
|
### Route type mapping: `_route_to_yaml_fields`
|
|
|
|
Add a pure function in `egress.py`:
|
|
|
|
```python
|
|
def _route_to_yaml_fields(r: EgressRoute) -> dict:
|
|
"""Return the addon-visible fields for one route.
|
|
|
|
This is the single authoritative mapping between `EgressRoute`
|
|
(host-side) and `egress_addon_core.Route` (sidecar-side). If a
|
|
field is added to `Route` that must appear in the YAML, add it
|
|
here and in `egress_addon_core._parse_one` together."""
|
|
fields: dict = {"host": r.host}
|
|
if r.auth_scheme and r.token_env:
|
|
fields["auth_scheme"] = r.auth_scheme
|
|
fields["token_env"] = r.token_env
|
|
if r.path_allowlist:
|
|
fields["path_allowlist"] = list(r.path_allowlist)
|
|
return fields
|
|
```
|
|
|
|
`egress_render_routes` delegates to it:
|
|
|
|
```python
|
|
for r in routes:
|
|
f = _route_to_yaml_fields(r)
|
|
lines.append(f' - host: "{f["host"]}"')
|
|
if "auth_scheme" in f:
|
|
lines.append(f' auth_scheme: "{f["auth_scheme"]}"')
|
|
lines.append(f' token_env: "{f["token_env"]}"')
|
|
if "path_allowlist" in f:
|
|
lines.append(" path_allowlist:")
|
|
for p in f["path_allowlist"]:
|
|
lines.append(f' - "{p}"')
|
|
```
|
|
|
|
The docstring on `_route_to_yaml_fields` is the explicit callout to
|
|
update both it and `_parse_one` together when the schema changes.
|
|
|
|
### `egress_manifest_routes` / `egress_routes_for_bottle`
|
|
|
|
`egress_manifest_routes` becomes a pure lifter with no slot assignment:
|
|
it reads each manifest route entry and returns an `EgressRoute` with
|
|
`token_env=""` (the slot to be filled later). The function's docstring
|
|
currently promises slot assignment; that promise moves to
|
|
`egress_routes_for_bottle`.
|
|
|
|
`egress_routes_for_bottle` becomes:
|
|
|
|
```python
|
|
def egress_routes_for_bottle(
|
|
bottle: Bottle,
|
|
provider_routes: tuple[EgressRoute, ...] = (),
|
|
) -> tuple[EgressRoute, ...]:
|
|
manifest = egress_manifest_routes(bottle)
|
|
provisioned_hosts = {pr.host.lower() for pr in provider_routes}
|
|
merged = list(provider_routes) + [
|
|
r for r in manifest if r.host.lower() not in provisioned_hosts
|
|
]
|
|
return _assign_token_slots(merged)
|
|
```
|
|
|
|
### Mitmproxy request logic
|
|
|
|
`egress_addon_core.decide()` and `is_git_push_request()` are already
|
|
pure functions; `egress_addon.EgressAddon.request()` is the minimal
|
|
mitmproxy glue (read host/path/headers from flow → call pure functions
|
|
→ apply result to flow). This split is already clean and requires no
|
|
structural change in this PRD.
|
|
|
|
## Test impact
|
|
|
|
- **`TestProviderRouteMerge`**: the `test_provider_route_upgrades_bare_manifest_route`
|
|
test asserts that a provider route preserves a bare manifest route's
|
|
`path_allowlist`. Under provisioned-wins that `path_allowlist` is
|
|
dropped. Update the test to reflect the new semantics.
|
|
- **`test_provider_route_conflicts_with_different_authed_manifest_route`**:
|
|
the conflict-die case no longer exists. Remove this test.
|
|
- All other merge and render tests should pass without modification.
|
|
|
|
## Implementation chunks
|
|
|
|
1. **PRD (this commit).** Sets the design.
|
|
2. **Merge refactor.** Replace `_merge_provider_route` and
|
|
`_find_or_alloc_token_env` with `_assign_token_slots` and the
|
|
flat provisioned-wins logic in `egress_routes_for_bottle`. Strip
|
|
slot assignment from `egress_manifest_routes`.
|
|
3. **Render consolidation.** Add `_route_to_yaml_fields`; update
|
|
`egress_render_routes` to use it.
|
|
4. **Test updates.** Adjust `TestProviderRouteMerge` for the semantics
|
|
changes above; confirm all render tests pass.
|
|
|
|
## References
|
|
|
|
- Issue #120: Refactor `_merge_provider_route` (expanded to include
|
|
Route type fragmentation).
|
|
- Issue #117: Complexity hotspots — source of both findings.
|
|
- PRD 0030: Deduplicate egress token resolution (prior egress cleanup).
|