Files
bot-bottle/docs/prds/0031-split-merge-provider-route.md
didericis-claude f15721b424
test / unit (pull_request) Successful in 39s
test / integration (pull_request) Successful in 46s
complete(prd): mark PRD 0031 active
Provisioned-wins merge and _route_to_yaml_fields are implemented and all
tests pass.
2026-06-02 05:45:28 +00:00

9.4 KiB

PRD 0031: Simplify egress route merge and consolidate Route types

  • Status: Active
  • Author: didericis-claude
  • Created: 2026-06-02
  • Issue: #120

Summary

Replace _merge_provider_route's five-case nested conditional with a flat provisioned-wins merge, and make the mapping between the host-side EgressRoute and the addon's Route explicit in one place. Covers the two remaining open tasks from the #117 hotspot review.

Problem

1. _merge_provider_route branching

_merge_provider_route in bot_bottle/egress.py handles five distinct cases in a single function with interleaved conditions:

  1. append-new — host not in manifest; append a fresh route.
  2. upgrade-bare — host found, no existing auth; adopt provider auth.
  3. no-op — host found, same auth; return unchanged.
  4. tls-passthrough upgrade — same as no-op but provider sets tls_passthrough=True; flip the flag on the existing route.
  5. conflict-die — host found, different auth; hard error.

Cases 3 and 4 share a block with no-op as the invisible fall-through. _find_or_alloc_token_env is duplicated between cases 2 and 1. In-place replacements spell out every EgressRoute field explicitly, so a new field added to the dataclass silently drops its value in any replacement site that wasn't updated.

The root cause of the complexity is that the current merge tries to be cooperative: it lets manifest routes coexist with provider routes and attempts to upgrade bare manifest entries. This makes sense if the manifest is authoritative, but the actual intended hierarchy is the opposite — provider routes claim their hosts outright and the manifest fills in what's left.

2. Three-way Route type fragmentation

EgressRoute (in egress.py) and egress_addon_core.Route are separate dataclasses with overlapping but not identical field sets:

Field EgressRoute addon Route
host
path_allowlist
auth_scheme
token_env
token_ref ✓ (host-side)
roles ✓ (host-side)
tls_passthrough ✓ (pipelock concern)

egress_render_routes serialises EgressRoute fields to YAML; the addon's load_routes deserialises that YAML into Route objects. If a field is added to EgressRoute that should appear in the YAML, both egress_render_routes and _parse_one must be updated consistently. The render function spells the field list out inline with no reference to the addon's parser, so divergence is silent until runtime.

egress_addon_core.Route cannot be replaced by EgressRoute — the addon file is copied flat into the sidecar container image (/app/) and has no access to the bot_bottle package. The types must stay separate; the risk is that they drift.

Goals / Success Criteria

  • egress_routes_for_bottle implements a flat provisioned-wins merge: provider routes claim their hosts; manifest routes for unclaimed hosts append. No upgrade logic, no conflict detection.
  • Token slot assignment is a single pass over the merged list, not interleaved with the merge.
  • egress_render_routes uses a single _route_to_yaml_fields helper that explicitly lists the addon-visible fields, creating one place where the EgressRouteRoute mapping is spelled out.
  • All existing TestProviderRouteMerge and TestRenderRoutes tests pass (adjusting assertions for any semantics changes described below).
  • No behaviour change for existing manifests that don't trigger the conflict-die or upgrade-bare paths.

Non-goals

  • Merging EgressRoute and egress_addon_core.Route into one class (impossible: addon runs in a stdlib-only container environment).
  • Changing what the addon does with a route once it has one.
  • Changing decide() or is_git_push_request() in egress_addon_core — those are already pure functions with good separation.

Design

Merge: provisioned wins

The new hierarchy: provisioned routes own their hosts; manifest routes fill the gaps.

provisioned_hosts = {pr.host.lower() for pr in provider_routes}

effective = list(provider_routes)
effective += [r for r in manifest_routes if r.host.lower() not in provisioned_hosts]

Token slot assignment runs as a final pass over effective in order:

def _assign_token_slots(
    routes: list[EgressRoute],
) -> tuple[EgressRoute, ...]:
    slot_for_ref: dict[str, str] = {}
    out: list[EgressRoute] = []
    for r in routes:
        if r.auth_scheme and r.token_ref and not r.token_env:
            token_env = slot_for_ref.get(r.token_ref)
            if token_env is None:
                token_env = f"EGRESS_TOKEN_{len(slot_for_ref)}"
                slot_for_ref[r.token_ref] = token_env
            r = dataclasses.replace(r, token_env=token_env)
        out.append(r)
    return tuple(out)

This replaces _merge_provider_route, _find_or_alloc_token_env, and the slot-assignment loop inside egress_manifest_routes.

Semantics change

Under the old design, a manifest route for a provisioned host with a different auth_scheme or token_ref raised a hard error. Under provisioned-wins, the manifest entry is silently dropped. Operators who relied on the conflict error to catch misconfigurations should audit their manifests, but in practice this path was only reachable when a manifest declared auth for api.openai.com or chatgpt.com with a token ref other than CODEX_HOST_CREDENTIAL_TOKEN_REF while also enabling forward_host_credentials — an unlikely combination.

Similarly, the "upgrade-bare" path (provider adopts a bare manifest route's path_allowlist) is dropped: a provisioned host takes the provider route's fields wholesale, and the manifest's path_allowlist for that host is ignored.

Route type mapping: _route_to_yaml_fields

Add a pure function in egress.py:

def _route_to_yaml_fields(r: EgressRoute) -> dict:
    """Return the addon-visible fields for one route.

    This is the single authoritative mapping between `EgressRoute`
    (host-side) and `egress_addon_core.Route` (sidecar-side). If a
    field is added to `Route` that must appear in the YAML, add it
    here and in `egress_addon_core._parse_one` together."""
    fields: dict = {"host": r.host}
    if r.auth_scheme and r.token_env:
        fields["auth_scheme"] = r.auth_scheme
        fields["token_env"] = r.token_env
    if r.path_allowlist:
        fields["path_allowlist"] = list(r.path_allowlist)
    return fields

egress_render_routes delegates to it:

for r in routes:
    f = _route_to_yaml_fields(r)
    lines.append(f'  - host: "{f["host"]}"')
    if "auth_scheme" in f:
        lines.append(f'    auth_scheme: "{f["auth_scheme"]}"')
        lines.append(f'    token_env: "{f["token_env"]}"')
    if "path_allowlist" in f:
        lines.append("    path_allowlist:")
        for p in f["path_allowlist"]:
            lines.append(f'      - "{p}"')

The docstring on _route_to_yaml_fields is the explicit callout to update both it and _parse_one together when the schema changes.

egress_manifest_routes / egress_routes_for_bottle

egress_manifest_routes becomes a pure lifter with no slot assignment: it reads each manifest route entry and returns an EgressRoute with token_env="" (the slot to be filled later). The function's docstring currently promises slot assignment; that promise moves to egress_routes_for_bottle.

egress_routes_for_bottle becomes:

def egress_routes_for_bottle(
    bottle: Bottle,
    provider_routes: tuple[EgressRoute, ...] = (),
) -> tuple[EgressRoute, ...]:
    manifest = egress_manifest_routes(bottle)
    provisioned_hosts = {pr.host.lower() for pr in provider_routes}
    merged = list(provider_routes) + [
        r for r in manifest if r.host.lower() not in provisioned_hosts
    ]
    return _assign_token_slots(merged)

Mitmproxy request logic

egress_addon_core.decide() and is_git_push_request() are already pure functions; egress_addon.EgressAddon.request() is the minimal mitmproxy glue (read host/path/headers from flow → call pure functions → apply result to flow). This split is already clean and requires no structural change in this PRD.

Test impact

  • TestProviderRouteMerge: the test_provider_route_upgrades_bare_manifest_route test asserts that a provider route preserves a bare manifest route's path_allowlist. Under provisioned-wins that path_allowlist is dropped. Update the test to reflect the new semantics.
  • test_provider_route_conflicts_with_different_authed_manifest_route: the conflict-die case no longer exists. Remove this test.
  • All other merge and render tests should pass without modification.

Implementation chunks

  1. PRD (this commit). Sets the design.
  2. Merge refactor. Replace _merge_provider_route and _find_or_alloc_token_env with _assign_token_slots and the flat provisioned-wins logic in egress_routes_for_bottle. Strip slot assignment from egress_manifest_routes.
  3. Render consolidation. Add _route_to_yaml_fields; update egress_render_routes to use it.
  4. Test updates. Adjust TestProviderRouteMerge for the semantics changes above; confirm all render tests pass.

References

  • Issue #120: Refactor _merge_provider_route (expanded to include Route type fragmentation).
  • Issue #117: Complexity hotspots — source of both findings.
  • PRD 0030: Deduplicate egress token resolution (prior egress cleanup).