Extends cred-proxy to filter (not just route) paths, including for unauthenticated upstreams via a new `auth_scheme: "none"` mode and `path_allowlist` field per route. Pipelock keeps its hostname allowlist + DLP role; cred-proxy adds path-level enforcement for routes that opt in.
Motivated by PR #25's follow-up note in _apply_pipelock_url: pipelock 2.3.0's api_allowlist is hostname-only, so approving pipelock-block opens the entire host. For shared platforms (github.com, gitlab.com, public registries) operators usually want narrower-than-host granularity.
Draft status; open questions on match semantics, the allow-route-with-empty-allowlist edge case, and the eventual MCP tool shape for agent-proposed path additions.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PRD 0017: Path-aware egress filtering via cred-proxy
- Status: Draft
- Author: didericis
- Created: 2026-05-25
Summary
Pipelock's api_allowlist is hostname-only — once a host is on the
list, every URL path at that host is reachable. For agents working
on shared platforms (github.com, gitlab.com, public registries),
this means approving access to one user's content also opens
access to every other user's content. Cred-proxy already
path-prefix-routes authenticated traffic; this PRD extends it to
filter (not just route) paths, including for unauthenticated hosts.
Per-bottle egress then has two complementary layers: pipelock for
hostname allow + DLP + body scanning, cred-proxy for path-level
allow on declared hosts.
Problem
PR #25's pipelock-block tool delivers an honest but coarse experience:
the agent reports "I tried hitting https://github.com/didericis,
pipelock 403'd it"; the operator approves and the agent now has
access to all of github.com. The path in the proposal is captured
as context but not enforced (PR #25 documents this in
_apply_pipelock_url's docstring).
The intended posture for many shared platforms is narrower than
hostname-level. "Allow the agent to read github.com/didericis but
not github.com/somebody-else" is a normal ask. Today the egress
stack can't express that, even though cred-proxy already has 80%
of the machinery: it path-routes authenticated traffic with
longest-prefix matching, and the manifest's cred_proxy.routes[]
shape is already a list of (path, upstream, ...) rules.
Goals / Success Criteria
A bottle manifest can declare a cred-proxy route with a
path_allowlist and auth_scheme: none. Agents dialing
http://cred-proxy:<port>/<route>/<suffix> hit a 403 from
cred-proxy when <suffix> doesn't match any allowlist entry, and
a normal forward (no auth header injected) when it does. For
existing authenticated routes the addition is opt-in: a route
without path_allowlist keeps its current permissive behaviour.
Demonstrable behaviour: a bottle manifest declares
{path: "/github/", upstream: "https://github.com", auth_scheme: "none", path_allowlist: ["/didericis/"]}; the agent reaches
http://cred-proxy:9099/github/didericis/some-repo successfully,
gets a 403 on http://cred-proxy:9099/github/someone-else/whatever.
Non-goals
- Replacing pipelock. Pipelock still does the hostname allowlist, DLP body scanning, MCP / WebSocket inspection. Path filtering is additive, sitting in front of pipelock for routes that opt in.
- Auto-routing arbitrary outbound HTTP through cred-proxy. The agent's HTTP_PROXY stays pointed at pipelock; cred-proxy is reached by explicit URL (with a git-insteadof-style rewrite for the few protocol-level helpers that need it).
- Reworking pipelock-block. The PR #25 tool stays hostname-only; whether a new path-aware proposal tool (or a richer pipelock-block) is wanted is an open question for a follow-on PRD.
- Live mutation of the running container or cred-proxy beyond what cred-proxy SIGHUP already supports (PRD 0014).
Scope
In scope
- A new optional auth_scheme: "none" mode on cred-proxy routes that suppresses Authorization injection while keeping path routing + (new) path filtering.
- A new optional path_allowlist: [<prefix>, ...] field per cred-proxy route. When present and non-empty, cred-proxy 403s requests whose in-route suffix doesn't match at least one prefix.
- Manifest schema + validation for the two new fields.
- Cred-proxy server logic: enforcement on each request after the longest-prefix route match.
- SIGHUP reload picks up path_allowlist changes (no new sidecar primitives — the existing reload path already re-reads routes.json).
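A minimal sketch of the reload behaviour in the last bullet, assuming routes.json holds a JSON list of route objects and that the routes live in memory between requests. RouteTable and install_sighup_reload are hypothetical names for illustration, not cred-proxy's actual code:

```python
import json
import signal


class RouteTable:
    """In-memory copy of routes.json, replaced wholesale on reload."""

    def __init__(self, routes_file: str):
        self.routes_file = routes_file
        self.routes = []
        self.reload()

    def reload(self) -> bool:
        """Re-read the file; on any read or parse error keep the previous routes."""
        try:
            with open(self.routes_file, encoding="utf-8") as fh:
                routes = json.load(fh)
        except (OSError, ValueError):
            return False
        # Rebinding one attribute means readers see either the old or the new list.
        self.routes = routes
        return True


def install_sighup_reload(table: RouteTable) -> None:
    """Reload on SIGHUP. signal.signal must be called from the main thread."""
    signal.signal(signal.SIGHUP, lambda signum, frame: table.reload())
```

Because each route object is re-read as a whole, a changed path_allowlist would arrive with the rest of the route and need no separate reload primitive, which is the point the bullet makes.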
Out of scope
- A new MCP tool for the agent to propose path_allowlist additions. Today the operator manages this via the manifest + the existing routes edit <bottle> TUI verb.
- Glob / regex matching. v1 ships prefix matching only; the open question lays out the trade-offs.
- Auto-migrating PR #25's pipelock-block proposals into cred-proxy routes. Manual operator decision per host.
- Provisioner-side dotfile changes for HTTPS-to-cred-proxy rewrites on bottles that opt unauth'd hosts onto cred-proxy. Out of scope for the engine work; the manifest can already encode it.
Proposed Design
Manifest schema additions
bottle.cred_proxy.routes[] gains two optional fields:
cred_proxy:
  routes:
    - path: "/github/"
      upstream: "https://github.com"
      auth_scheme: "none"    # new — no Authorization header
      token_ref: ""          # must be empty or absent when auth_scheme is "none"
      path_allowlist:        # new — prefix list; empty / absent = permissive
        - "/didericis/"
        - "/didericis-org/"
- auth_scheme: "none" joins the existing Bearer / token values. When none, token_ref must be empty or absent and no Authorization header is injected. The route still routes by path prefix and forwards to upstream.
- path_allowlist is a list of suffix prefixes (matched after the route's path is stripped). Empty / absent means permissive (current behaviour). When non-empty, the suffix must start with at least one of the allowlist entries.
cred-proxy server changes
Per request:
- Strip query string, longest-prefix-match against routes.
- Compute the suffix = request_path[len(route.path):].
- If route.path_allowlist is non-empty: require that "/" + suffix (or just suffix — pick a consistent normalization) starts with at least one allowlist entry. 403 if not.
- If auth_scheme == "none": skip the Authorization header step entirely; otherwise inject as today.
- Forward upstream, stream response (unchanged).
The 403 body should name the route + the disallowed suffix so the operator can diagnose. cred-proxy's existing log line at request time picks up the new outcome too.
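The per-request steps can be sketched as a pure decision function. This is an illustration under assumptions, not cred-proxy's code: Route, match_route and decide are hypothetical names, the token is passed in directly rather than resolved from token_ref, and the suffix is normalised to a leading "/" so it compares against entries of the form "/didericis/".

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class Route:
    path: str          # route prefix, e.g. "/github/"
    upstream: str      # e.g. "https://github.com"
    auth_scheme: str   # "Bearer", "token", or "none"
    path_allowlist: list = field(default_factory=list)


def match_route(routes, request_path: str) -> Optional[Route]:
    """Longest-prefix match of the request path against the route prefixes."""
    hits = [r for r in routes if request_path.startswith(r.path)]
    return max(hits, key=lambda r: len(r.path), default=None)


def decide(routes, target: str, token: str = ""):
    """Return (error_status, detail, headers); error_status None means forward to detail."""
    parts = urlsplit(target)  # separates the query string from the path
    route = match_route(routes, parts.path)
    if route is None:
        return 404, f"no route for {parts.path}", {}
    # Normalise the in-route suffix to a leading "/" before the prefix check.
    suffix = "/" + parts.path[len(route.path):].lstrip("/")
    if route.path_allowlist and not any(suffix.startswith(p) for p in route.path_allowlist):
        # The 403 detail names the route and the disallowed suffix.
        return 403, f"route {route.path}: path {suffix} not in path_allowlist", {}
    headers = {}
    if route.auth_scheme != "none":
        headers["Authorization"] = f"{route.auth_scheme} {token}"
    url = route.upstream.rstrip("/") + suffix
    if parts.query:
        url += "?" + parts.query
    return None, url, headers
```

With a route of /github/ and an allowlist of ["/didericis/"], decide forwards /github/didericis/some-repo and returns 403 for /github/someone-else/whatever. The sketch compares the raw path; a real implementation would probably also want to resolve ".." segments before the prefix check so that /github/didericis/../someone-else cannot slip through.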
Validation
At manifest load:
- auth_scheme must be one of Bearer, token, or none.
- When auth_scheme == "none", a non-empty token_ref is forbidden (clearer error than silently ignoring).
- path_allowlist entries must start with / and end with / (matching the existing convention for route.path).
- Duplicate prefixes are deduplicated with a warning, not an error.
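A sketch of these checks for a single route, assuming the route arrives as a plain dict, that auth_scheme is always present, and that an empty token_ref counts as absent. validate_route is a hypothetical helper, not an existing function:

```python
VALID_AUTH_SCHEMES = ("Bearer", "token", "none")


def validate_route(route: dict):
    """Return (errors, warnings, allowlist) for one manifest route dict."""
    errors, warnings, allowlist = [], [], []
    scheme = route.get("auth_scheme")
    if scheme not in VALID_AUTH_SCHEMES:
        errors.append(f"auth_scheme must be one of {', '.join(VALID_AUTH_SCHEMES)}; got {scheme!r}")
    # Assumption: token_ref "" is treated the same as a missing token_ref.
    if scheme == "none" and route.get("token_ref"):
        errors.append("token_ref is forbidden when auth_scheme is none")
    for entry in route.get("path_allowlist") or []:
        if not (isinstance(entry, str) and entry.startswith("/") and entry.endswith("/")):
            errors.append(f"path_allowlist entry {entry!r} must start and end with /")
        elif entry in allowlist:
            # Duplicates are dropped with a warning rather than rejected.
            warnings.append(f"duplicate path_allowlist entry {entry!r} dropped")
        else:
            allowlist.append(entry)
    return errors, warnings, allowlist
```

Returning the deduplicated list alongside the messages lets the caller decide whether warnings are printed, logged, or promoted to errors.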
Migration / backward compatibility
- Routes without path_allowlist behave exactly as today.
- Routes with auth_scheme: Bearer | token behave exactly as today.
- No existing manifests need editing; the new fields are opt-in.
Open questions
- Match semantics: prefix vs glob vs regex. Prefix is simple and matches the existing route.path convention. Glob (/users/*/repos/) adds power but is easy to get wrong (does * match a /?). Regex is the most powerful and the most footguny. Recommend prefix-only for v1, glob in a follow-up if operators ask for it.
- 403 body shape. Plain text vs JSON. Cred-proxy's existing errors use plain text (send_error(404, "no route for ...")). Match that.
- Auth-less routes and TLS interception. A none-auth route still routes outbound HTTPS through pipelock (cred-proxy's HTTPS_PROXY env), so pipelock's CA + body scanner still apply. Confirm that pipelock's allowlist needs the upstream host in this case — there's no token to make the cred-proxy → upstream leg special. Likely yes, same as today.
- MCP tool / pipelock-block evolution. Once path filtering exists, the operator may want a way for the agent to propose path additions (e.g. "I need /didericis-org/ added to the github route"). Today that requires manifest edit + cli.py rebuild, or routes edit via the dashboard. Whether a new MCP tool (or a richer pipelock-block) is wanted is a follow-on PRD open question.
- Allowlist semantics for the entire route prefix. Should an empty path_allowlist: [] be allowed? Equivalent to "block everything at this upstream" — possibly useful as a tombstone, more likely a typo. Recommend treating empty list the same as absent (permissive) and flagging in the validation note.
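On the match-semantics question, the "does * match a /?" ambiguity is concrete: Python's standard fnmatch module, for one, lets * cross path separators. A small comparison, where prefix_allows and glob_allows are hypothetical helpers and fnmatch only stands in for whatever glob implementation a follow-up might choose:

```python
from fnmatch import fnmatchcase


def prefix_allows(suffix: str, prefixes) -> bool:
    """v1 semantics: the suffix must start with one of the allowlist entries."""
    return any(suffix.startswith(p) for p in prefixes)


def glob_allows(suffix: str, patterns) -> bool:
    """Glob semantics via fnmatch: the whole suffix must match a pattern."""
    return any(fnmatchcase(suffix, p) for p in patterns)


patterns = ["/users/*/repos/"]
one_segment = glob_allows("/users/alice/repos/", patterns)        # True
two_segments = glob_allows("/users/alice/evil/repos/", patterns)  # also True: * crosses "/"
```

An operator who reads /users/*/repos/ as "exactly one user segment" would be surprised by the second result, which is part of why prefix-only is the safer starting point.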
References
- PRD 0010 — cred-proxy (the engine being extended).
- PRD 0015 — pipelock block remediation (whose hostname-only ceiling motivates this PRD).
- PR #25 — _apply_pipelock_url's docstring documents the follow-up that this PRD formalises.