docs(prd-0017): pivot to mitmproxy-based egress-proxy

Significant rewrite of PRD 0017 based on PR #25 design discussion.

The original draft proposed adding `path_allowlist` to the existing
cred-proxy. That bought opt-in path filtering for tools that voluntarily
routed through cred-proxy (Claude Code, git, npm), but raw
`curl https://github.com/foo` from the agent goes to HTTPS_PROXY=pipelock
and bypasses cred-proxy entirely, so any universal enforcement claim was
a lie.

New design: replace cred-proxy with a mitmproxy-based egress-proxy that
becomes the agent's HTTP_PROXY/HTTPS_PROXY. Every agent HTTP/HTTPS request
flows through it before reaching pipelock. Path-level allow/deny
enforcement is universal because the proxy is on every leg. The proxy
also absorbs cred-proxy's credential injection role (mitmproxy addon
hooks request → strip + inject Authorization).

Net sidecar count: unchanged. cred-proxy is replaced 1:1 by egress-proxy.
Pipelock stays as hostname allow + DLP downstream of egress-proxy.

Decisions baked in per PR-#25 discussion:
- Tool: mitmproxy (designed for this; Python addons; well-maintained).
- CA custody: egress-proxy holds the per-bottle MITM CA key
  (concentration accepted; documented in trust-domain section).
- Migration: hard cutover. Existing `bottle.cred_proxy.routes[]`
  manifests fail-fast at load time with a pointer at this PRD.

Open questions retained for the implementation PRs: addon distribution
(bake vs mount), prefix-vs-glob match, double-strip of Authorization
between egress-proxy and pipelock, whether pipelock keeps TLS
interception or stays hostname-only post-cutover, performance under
two-MITM-hops.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -0,0 +1,309 @@
# PRD 0017: Egress-proxy — universal MITM with path filtering + auth injection

- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-25
- **Supersedes:** the cred-proxy sidecar (PRD 0010), as a hard cutover.

## Summary

Replace the per-bottle cred-proxy sidecar with a new `egress-proxy`
sidecar built on mitmproxy. The egress-proxy is the agent's
`HTTP_PROXY` / `HTTPS_PROXY`: every agent HTTP/HTTPS request flows
through it before reaching pipelock. It owns three jobs that today
are split between cred-proxy and pipelock:

1. **MITM the agent's HTTPS.** Uses the per-bottle CA. Today pipelock
   holds that CA key; custody moves to the egress-proxy.
2. **Path-level allow/deny.** Manifest-declared `path_allowlist`
   per route. Coverage is universal: any HTTPS path the agent
   reaches for is inspected here, not just traffic that voluntarily
   dials the cred-proxy URL.
3. **Credential injection.** Continues cred-proxy's existing role:
   match by hostname (or hostname + path), strip the inbound
   `Authorization` header, and inject one based on `auth_scheme` +
   `token_ref`.

Pipelock's role narrows to hostname allowlist + DLP body scanning
on the egress-proxy → upstream leg. Pipelock no longer holds the
CA private key and is no longer the agent's direct proxy.

## Problem

PR #25's pipelock-block flow exposed an honest gap: pipelock's
`api_allowlist` is hostname-only (verified by probing the binary's
strict preset and the `pipelock check --url` output). Approving a
proposed `pipelock-block` opens the entire host, not just the URL's
path. For shared platforms (github.com, gitlab.com, public
registries) operators routinely want narrower-than-host granularity:
allow github.com/didericis but block github.com/somebody-else.

Cred-proxy already does path-prefix routing for credentialed APIs,
but it only sees the requests the agent voluntarily routes to it
(via `ANTHROPIC_BASE_URL`, `~/.gitconfig` insteadOf, npmrc
`registry=`). A raw `curl https://github.com/anyone` from the agent
goes to `HTTPS_PROXY=pipelock` directly and bypasses cred-proxy
entirely. So extending cred-proxy with `path_allowlist` (the earlier
PRD 0017 draft) buys *opt-in* path filtering, not enforcement.

Enforcement needs a layer that sits on the agent's `HTTPS_PROXY`
path, giving universal coverage of agent egress.

## Goals / Success Criteria

A bottle manifest declares an egress-proxy route with a
`path_allowlist`. From inside the bottle,
`curl https://github.com/didericis/foo` succeeds, while
`curl https://github.com/somebody-else/secret` gets a 403 from
egress-proxy and never reaches pipelock or the real github. The same
holds for any tool inside the bottle that respects
`HTTPS_PROXY`: claude-code, git over HTTPS, npm, raw curl, random
Python `requests`. No tool-specific rewrite is required for path
enforcement.

Existing cred-proxy responsibilities continue to work after the
cutover. Anthropic OAuth injection for claude-code happens via
proxy-side header injection rather than the dotfile rewrite.
git-insteadof routing into the proxy stays useful for hostname
canonicalisation but is no longer load-bearing for credential
delivery.

## Non-goals

- Replacing pipelock. Pipelock keeps doing hostname allowlist +
  DLP body scanning on the egress-proxy → upstream leg.
- Building our own MITM stack. mitmproxy already does it; we ship
  addons.
- Backward compatibility with `bottle.cred_proxy.routes[]`. Hard
  cutover (see Migration).
- Path-level rules in pipelock. An upstream feature request is a
  separate track (to be filed independently); this PRD doesn't
  depend on it.

## Scope

### In scope

- A new `egress-proxy` sidecar replacing the cred-proxy sidecar:
  a mitmproxy image, pinned by digest, with addons in Python.
- Per-bottle CA generation **moves from pipelock to egress-proxy**.
  The agent's trust store is rebuilt against the egress-proxy CA
  (was pipelock's CA).
- Manifest rename: `bottle.cred_proxy.routes[]` →
  `bottle.egress_proxy.routes[]`. The route shape gains optional
  `path_allowlist: [<prefix>, ...]` and supports
  `auth_scheme: "none"`.
- Agent's `HTTP_PROXY` / `HTTPS_PROXY` env vars repointed at the
  egress-proxy (was pipelock).
- Pipelock retains its sidecar slot and its own DLP + hostname
  scanner. The agent never dials it directly anymore; egress-proxy
  uses `HTTPS_PROXY=pipelock` for its outbound leg, matching the
  current cred-proxy → pipelock pattern.
- Existing PRDs that depend on cred-proxy:
  - PRD 0014 (cred-proxy-block remediation): rename and retarget
    the apply path. SIGHUP reload semantics carry over to egress-proxy.
  - PRD 0013 (supervise plane): the `cred-proxy-block` MCP tool stays;
    its proposed file format updates per the new route shape.
- Removal of the old cred-proxy code: `claude_bottle/cred_proxy.py`,
  `cred_proxy_server.py`, `backend/docker/cred_proxy.py`,
  `provision/cred_proxy.py`, and `Dockerfile.cred-proxy`. Tests
  updated.

### Out of scope

- Pipelock CA path: pipelock keeps generating its *own* CA for
  any internal TLS termination it still does (e.g., on the
  egress-proxy → upstream leg if pipelock is the MITM there).
  Whether pipelock needs that CA at all post-cutover is an open
  question (probably not: egress-proxy has already terminated, and
  pipelock is now downstream of a plain-HTTP forward from
  egress-proxy).
- Glob / regex matching in `path_allowlist`. v1 ships prefix
  matching; more expressive forms are a follow-up.
- An MCP tool for the agent to propose `path_allowlist`
  additions. Today the operator manages this via the manifest +
  the existing `routes edit <bottle>` TUI verb (renamed to
  `egress-proxy edit <bottle>`).

## Proposed design

### Topology

```
[Agent] --HTTP_PROXY=egress-proxy-->
    [egress-proxy (mitmproxy)]
       MITM with per-bottle CA
       path_allowlist enforcement
       Authorization header injection
    --HTTPS_PROXY=pipelock-->
    [pipelock]
       hostname allowlist
       DLP body scan
    --egress--> Internet
```

Universal coverage: every HTTP/HTTPS request the agent makes hits
egress-proxy first. cred-proxy's URL convention
(`http://cred-proxy:9099/...`) goes away: there's no need for the
agent to address the proxy by name because it's already on the
default proxy path.

### Manifest

```yaml
egress_proxy:
  routes:
    # Authenticated route (today's cred-proxy shape, slightly
    # renamed). path_allowlist optional.
    - host: "api.github.com"
      auth_scheme: "Bearer"
      token_ref: "GH_PAT"
      path_allowlist:
        - "/repos/didericis/"
        - "/users/didericis"
    # Unauthenticated path-filtered route.
    - host: "github.com"
      auth_scheme: "none"
      path_allowlist:
        - "/didericis/"
    # Bare-pass route: no auth injection, no path enforcement.
    # Useful when you want a host to skip path filtering but
    # still be DLP-scanned by pipelock.
    - host: "api.anthropic.com"
      auth_scheme: "none"
      # no path_allowlist → all paths pass
```

Route matching is on `host` (was `path` prefix). The hostname
gates whether a route applies; `path_allowlist` (if present)
constrains the URL path under that host.
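
As a rough illustration of the route shape and host-based matching described above, the sketch below turns an already-parsed manifest mapping into route objects and looks one up by hostname. The names `Route`, `parse_routes`, and `find_route` are hypothetical. Exact, case-insensitive host matching and the `"none"` default for `auth_scheme` are assumptions; the PRD does not pin either down.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    host: str
    auth_scheme: str = "none"          # assumed default, not specified by the PRD
    token_ref: Optional[str] = None
    # None means "no path_allowlist declared": every path passes.
    path_allowlist: Optional[tuple] = None


def parse_routes(manifest: dict) -> list:
    """Build Route objects from an already-parsed manifest mapping."""
    raw_routes = manifest.get("egress_proxy", {}).get("routes", [])
    routes = []
    for raw in raw_routes:
        allow = raw.get("path_allowlist")
        routes.append(
            Route(
                host=raw["host"].lower(),
                auth_scheme=raw.get("auth_scheme", "none"),
                token_ref=raw.get("token_ref"),
                path_allowlist=tuple(allow) if allow is not None else None,
            )
        )
    return routes


def find_route(routes: list, host: str) -> Optional[Route]:
    """Exact, case-insensitive hostname match; None when no route applies."""
    host = host.lower()
    for route in routes:
        if route.host == host:
            return route
    return None
```

A request whose host yields `None` here would be forwarded unchanged, leaving pipelock to hostname-gate it.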

### mitmproxy addon shape

The egress-proxy ships a small Python addon that:

- Loads the per-bottle routes from `/etc/egress-proxy/routes.yaml`
  (rendered by the prepare step and docker-cp'd in, like cred-proxy's
  current routes.json).
- On the `request` hook: matches `flow.request.host` against the
  routes. If no route matches, the request is forwarded unchanged
  (pipelock will hostname-gate it). If a route matches and has a
  `path_allowlist`, `flow.request.path` is checked against the
  prefix list; the addon answers 403 with a clear reason if no
  prefix matches.
- On approved requests: strips the inbound `Authorization` header and
  injects `Authorization: <auth_scheme> <token-from-env>` if
  `auth_scheme != "none"`.
- Hot-reloads `routes.yaml` on SIGHUP / file-mtime change (same
  cadence as today's cred-proxy SIGHUP path).
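
The request-hook behaviour in the list above can be sketched as a pure function, kept independent of mitmproxy so the decision logic is easy to test. `Route`, `Decision`, and `decide` are hypothetical names. Stripping the query string before prefix matching is an assumption, as is reading tokens from a plain mapping that stands in for the container environment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    host: str
    auth_scheme: str = "none"
    token_ref: Optional[str] = None
    path_allowlist: Optional[tuple] = None   # None = no path enforcement


@dataclass
class Decision:
    status: int      # 200 = forward upstream, 403 = blocked at egress-proxy
    reason: str
    headers: dict    # headers to send upstream if forwarded


def decide(routes, host, path, headers, env):
    """Apply route match, path check, and Authorization strip/inject."""
    route = next((r for r in routes if r.host == host.lower()), None)
    if route is None:
        # No route: forward unchanged; pipelock hostname-gates it.
        return Decision(200, "no route", dict(headers))

    if route.path_allowlist is not None:
        url_path = path.split("?", 1)[0]  # assumption: ignore the query string
        if not any(url_path.startswith(p) for p in route.path_allowlist):
            reason = f"path {url_path!r} not in path_allowlist for {route.host}"
            return Decision(403, reason, dict(headers))

    # Approved: drop whatever Authorization the agent sent (case-insensitive).
    out = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    if route.auth_scheme != "none":
        # Raises KeyError if the token is missing from the environment mapping.
        out["Authorization"] = f"{route.auth_scheme} {env[route.token_ref]}"
    return Decision(200, "allowed", out)
```

In an actual addon, the `request` hook would presumably call `decide` with `flow.request.host`, `flow.request.path`, and the request headers, then set a 403 response on the flow when the decision says so. Note that plain prefix matching treats an entry without a trailing slash, such as `/users/didericis`, as also matching `/users/didericis-other`.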

mitmproxy's standard CA generation handles per-host leaf certs at
SNI time. The per-bottle CA is generated at bottle launch (was
pipelock's tls-init step; now egress-proxy's). The agent's trust
store gets the egress-proxy CA installed in place of pipelock's.

### Trust-domain concentration

The egress-proxy now holds:

- Every credential the bottle declared in `egress_proxy.routes[]`
  (OAuth tokens, PATs, npm tokens).
- The per-bottle MITM CA private key.

This is a deliberate concentration. With the previous split:

- cred-proxy held tokens.
- pipelock held the CA.

A memory disclosure in cred-proxy exposed tokens; in pipelock,
the CA. Both were bad; neither exposed everything.

With the new egress-proxy, a single disclosure exposes both. Mitigations:

- mitmproxy runs as an unprivileged user inside the container.
- Tokens live in the container's environ (same as cred-proxy today).
  The CA private key is mounted from the host's stage_dir (mode 600).
- Pipelock stays as a separate sidecar, so a compromise of
  egress-proxy doesn't disable pipelock's hostname check + DLP on
  the outbound leg. The attacker can forge certs to the agent but
  can't easily exfiltrate from inside the agent without pipelock
  noticing.
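
The "mode 600" mitigation above could be staged on the host roughly as follows. This is only a sketch under stated assumptions: `write_private_key` and `assert_private` are hypothetical helpers, a POSIX host is assumed, and the PRD does not say how the key file is actually created.

```python
import os
import stat


def write_private_key(path: str, pem_bytes: bytes) -> None:
    """Create the key file owner-read/write only (mode 600)."""
    # O_EXCL refuses to overwrite an existing file. The mode is applied at
    # creation time, so the key is never briefly readable by group/other.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(pem_bytes)


def assert_private(path: str) -> None:
    """Raise PermissionError if group or other have any access bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        raise PermissionError(f"{path} has mode {oct(mode)}, expected 0o600")
```

Calling `assert_private` before mounting the key into the container would turn an accidentally widened file mode into a hard failure instead of a silent exposure.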

The user (per PR #25 discussion) accepted this concentration in
exchange for the one-sidecar consolidation. The PRD records it
explicitly.

### Migration — hard cutover

No backward-compat alias for `bottle.cred_proxy.routes[]`. At
manifest load, a `cred_proxy:` block triggers `die()` with a clear
pointer at this PRD and a migration recipe (rename to
`egress_proxy:`, rename `path` → `host`, drop the agent-side URL
prefix).

The rest of the cutover:

- `cred_proxy_routes` field on existing dataclasses removed.
- `Dockerfile.cred-proxy` deleted.
- `claude_bottle/cred_proxy*.py` deleted.
- `claude_bottle/backend/docker/cred_proxy*.py` consolidated into
  `egress_proxy*.py`.
- Provisioner files renamed.
- PRDs 0010 (cred-proxy) and 0014 (cred-proxy-block remediation)
  retroactively annotated as "superseded by 0017": old text
  preserved, header updated.
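
The load-time fail-fast for legacy manifests might look like the sketch below. `ManifestError` and `reject_legacy_manifest` are hypothetical names standing in for whatever `die()` does in the codebase, and the hint text is illustrative wording, not the final message.

```python
class ManifestError(Exception):
    """Hypothetical error type; the codebase's die() would play this role."""


MIGRATION_HINT = (
    "bottle.cred_proxy is no longer supported (see PRD 0017). "
    "Rename cred_proxy: to egress_proxy:, rename each route's path to host, "
    "and drop the agent-side URL prefix."
)


def reject_legacy_manifest(bottle: dict) -> None:
    """Fail fast at load time if the legacy cred_proxy block is present."""
    if "cred_proxy" in bottle:
        raise ManifestError(MIGRATION_HINT)
```

Running this check before any other manifest validation keeps the error message focused on the migration instead of on secondary schema failures.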

### Implementation chunks

Plausibly three implementation PRs after this PRD lands:

1. **egress-proxy sidecar core.** Dockerfile + mitmproxy addon +
   `routes.yaml` schema + lifecycle (prepare / start / stop / SIGHUP).
2. **Manifest + provisioner migration.** Rename cred-proxy
   throughout the codebase, hard-fail on legacy manifests, update
   agent CA trust to point at egress-proxy.
3. **PRD 0014 retargeting.** cred-proxy-block remediation's apply
   path repointed at egress-proxy (SIGHUP, audit log, etc.).
   Supervise tool description updated.

## Open questions

- **mitmproxy addon distribution.** Mount the addon Python file
  from stage_dir, or bake it into the image? Mount is more
  hot-reloadable; bake-in is more reproducible. Recommend bake-in,
  with routes.yaml as the only mounted state.
- **Path match semantics.** Prefix-only for v1 (in the spirit of
  the earlier PRD 0017 draft). Globs / regex are a follow-up if
  operators ask.
- **Mode for the `Authorization` strip on inbound.** Pipelock has a
  similar strip in `sensitive_headers`. Confirm there's no
  double-strip that makes a real header the agent set disappear
  unexpectedly. Probably want egress-proxy to be the only stripper
  for routes that match.
- **Pipelock's TLS interception post-cutover.** Today pipelock
  MITMs the cred-proxy → upstream leg using its own CA. After the
  cutover, that leg starts as a CONNECT tunnel from egress-proxy
  (egress-proxy treats pipelock as a plain HTTPS forward proxy).
  Does pipelock still need to MITM? Probably not: egress-proxy has
  already terminated the agent's TLS, and body content is (or could
  be) inspected by egress-proxy's addons before it reaches pipelock.
  But that means moving DLP from pipelock to egress-proxy, which
  expands egress-proxy's trust-domain *further*. Punted to the
  implementation PR to decide.
- **Performance.** Two MITM hops in the worst case (agent ↔
  egress-proxy and pipelock ↔ upstream if pipelock keeps its
  interception). Measure under realistic load; if it's a problem,
  the answer is probably to disable pipelock's TLS interception
  and let it operate at hostname-only.
- **Agent's existing dotfile rewrites.** Today cred-proxy
  provisions `~/.npmrc` with `registry=http://cred-proxy:9099/npm/`,
  `~/.gitconfig` with `insteadOf` rules, etc. After the cutover
  none of those rewrites are strictly necessary for routing
  (HTTPS_PROXY catches everything), but they may still be useful
  for canonicalisation (so the agent's `npm install` doesn't
  surprise itself by talking to a different registry). Decide per
  dotfile in the migration PR.

## References

- PRD 0010: cred-proxy (superseded by this PRD).
- PRD 0014: cred-proxy-block remediation (retargeted).
- PRD 0013: supervise plane (tool descriptions updated).
- PR #25: the supervise loop, whose `_apply_pipelock_url`
  docstring flagged the original "path filtering belongs
  somewhere" follow-up.
- mitmproxy (https://mitmproxy.org/): chosen as the egress-proxy
  engine because it's the canonical scriptable MITM forward proxy.

@@ -1,195 +0,0 @@
# PRD 0017: Path-aware egress filtering via cred-proxy

- **Status:** Draft
- **Author:** didericis
- **Created:** 2026-05-25

## Summary

Pipelock's `api_allowlist` is hostname-only: once a host is on the
list, every URL path at that host is reachable. For agents working
on shared platforms (github.com, gitlab.com, public registries),
this means approving access to one user's content also opens
access to every other user's content. Cred-proxy already
path-prefix-routes authenticated traffic; this PRD extends it to
filter (not just route) paths, including for unauthenticated hosts.
Per-bottle egress then has two complementary layers: pipelock for
hostname allow + DLP + body scanning, cred-proxy for path-level
allow on declared hosts.

## Problem

PR #25's pipelock-block tool delivers an honest but coarse experience:
the agent reports "I tried hitting `https://github.com/didericis`,
pipelock 403'd it"; the operator approves and the agent now has
access to all of github.com. The path in the proposal is captured
as context but not enforced (PR #25 documents this in
`_apply_pipelock_url`'s docstring).

The intended posture for many shared platforms is narrower than
hostname-level. "Allow the agent to read github.com/didericis but
not github.com/somebody-else" is a normal ask. Today the egress
stack can't express that, even though cred-proxy already has most
of the machinery: it path-routes authenticated traffic with
longest-prefix matching, and the manifest's `cred_proxy.routes[]`
shape is already a list of `(path, upstream, ...)` rules.

## Goals / Success Criteria

A bottle manifest can declare a cred-proxy route with a
`path_allowlist` and `auth_scheme: none`. Agents dialing
`http://cred-proxy:<port>/<route>/<suffix>` hit a 403 from
cred-proxy when `<suffix>` doesn't match any allowlist entry, and
a normal forward (no auth header injected) when it does. For
existing authenticated routes the addition is opt-in: a route
without `path_allowlist` keeps its current permissive behaviour.

Demonstrable behavior: a bottle manifest declares
`{path: "/github/", upstream: "https://github.com", auth_scheme: "none", path_allowlist: ["/didericis/"]}`;
the agent reaches
`http://cred-proxy:9099/github/didericis/some-repo` successfully,
and gets a 403 on `http://cred-proxy:9099/github/someone-else/whatever`.

## Non-goals

- Replacing pipelock. Pipelock still does the hostname allowlist,
  DLP body scanning, MCP / WebSocket inspection. Path filtering is
  additive, sitting in front of pipelock for routes that opt in.
- Auto-routing arbitrary outbound HTTP through cred-proxy. The
  agent's `HTTP_PROXY` stays pointed at pipelock; cred-proxy is
  reached by explicit URL (with a `git-insteadof`-style rewrite
  for the few protocol-level helpers that need it).
- Reworking pipelock-block. The PR #25 tool stays hostname-only;
  whether a new path-aware proposal tool (or a richer
  pipelock-block) is wanted is an open question for a follow-on
  PRD.
- Live mutation of the running container or cred-proxy beyond
  what cred-proxy SIGHUP already supports (PRD 0014).

## Scope

### In scope

- A new optional `auth_scheme: "none"` mode on cred-proxy routes
  that suppresses Authorization injection while keeping path
  routing + (new) path filtering.
- A new optional `path_allowlist: [<prefix>, ...]` field per
  cred-proxy route. When present, cred-proxy 403s requests whose
  in-route suffix doesn't match at least one prefix.
- Manifest schema + validation for the two new fields.
- Cred-proxy server logic: enforcement on each request after the
  longest-prefix route match.
- SIGHUP reload picks up `path_allowlist` changes (no new sidecar
  primitives: the existing reload path already re-reads
  `routes.json`).

### Out of scope

- A new MCP tool for the agent to propose `path_allowlist`
  additions. Today the operator manages this via the manifest +
  the existing `routes edit <bottle>` TUI verb.
- Glob / regex matching. v1 ships prefix matching only; the open
  question lays out the trade-offs.
- Auto-migrating PR #25's pipelock-block proposals into cred-proxy
  routes. Manual operator decision per host.
- Provisioner-side dotfile changes for HTTPS-to-cred-proxy rewrites
  on bottles that opt unauth'd hosts onto cred-proxy. Out of scope
  for the engine work; the manifest can already encode it.

## Proposed Design

### Manifest schema additions

`bottle.cred_proxy.routes[]` gains two optional fields:

```yaml
cred_proxy:
  routes:
    - path: "/github/"
      upstream: "https://github.com"
      auth_scheme: "none"        # new: no Authorization header
      token_ref: ""              # ignored when auth_scheme is "none"
      path_allowlist:            # new: prefix list; empty / absent = permissive
        - "/didericis/"
        - "/didericis-org/"
```

- `auth_scheme: "none"` joins the existing `Bearer` / `token` values.
  When `none`, `token_ref` must be empty or absent and no
  Authorization header is injected. The route still routes by path
  prefix and forwards to upstream.
- `path_allowlist` is a list of suffix prefixes (matched after the
  route's `path` is stripped). Empty / absent means permissive
  (current behaviour). When non-empty, the suffix must start with
  at least one of the allowlist entries.

### cred-proxy server changes

Per request:
1. Strip query string, longest-prefix-match against `routes`.
2. Compute the suffix: `suffix = request_path[len(route.path):]`.
3. If `route.path_allowlist` is non-empty: require that
   `"/" + suffix` (or just `suffix`; pick a consistent
   normalization) starts with at least one allowlist entry. 403 if
   not.
4. If `auth_scheme == "none"`: skip the `Authorization` header
   step entirely; otherwise inject as today.
5. Forward upstream, stream response (unchanged).

The 403 body should name the route + the disallowed suffix so the
operator can diagnose. cred-proxy's existing log line at request
time picks up the new outcome too.
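
Steps 1 to 3 above can be sketched as follows. The helper names `match_route` and `check_request` are hypothetical, routes are modelled as plain dicts, and the choice to keep a leading `/` on the suffix is one of the two normalizations the text leaves open.

```python
from typing import Optional


def match_route(routes: list, request_path: str) -> Optional[dict]:
    """Longest-prefix match of the query-stripped path against route['path']."""
    path = request_path.split("?", 1)[0]
    candidates = [r for r in routes if path.startswith(r["path"])]
    return max(candidates, key=lambda r: len(r["path"]), default=None)


def check_request(routes: list, request_path: str) -> tuple:
    """Return (status, detail) for steps 1-3; auth and forwarding are omitted."""
    route = match_route(routes, request_path)
    if route is None:
        return 404, f"no route for {request_path}"
    path = request_path.split("?", 1)[0]
    # Normalization choice (an assumption): keep a leading "/" on the suffix
    # so it compares directly against allowlist entries like "/didericis/".
    suffix = "/" + path[len(route["path"]):]
    allow = route.get("path_allowlist") or []   # empty / absent = permissive
    if allow and not any(suffix.startswith(p) for p in allow):
        return 403, f"route {route['path']}: suffix {suffix} not allowed"
    return 200, suffix
```

The 403 detail names both the route and the rejected suffix, matching the diagnostic requirement stated above.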

### Validation

At manifest load:
- `auth_scheme` must be one of `Bearer`, `token`, or `none`.
- When `auth_scheme == "none"`, `token_ref` is forbidden (clearer
  error than silently ignoring).
- `path_allowlist` entries must start with `/` and end with `/`
  (matching the existing convention for `route.path`).
- Duplicate prefixes are deduplicated with a warning, not an
  error.
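
A minimal sketch of these load-time rules, assuming routes arrive as plain dicts. `validate_route` is a hypothetical helper, and treating an empty `token_ref` string as "absent" follows the schema note that `token_ref` must be empty or absent when `auth_scheme` is `none`.

```python
VALID_SCHEMES = {"Bearer", "token", "none"}


def validate_route(route: dict) -> tuple:
    """Return (normalized_route, warnings); raise ValueError on hard errors."""
    warnings = []
    scheme = route.get("auth_scheme")
    if scheme not in VALID_SCHEMES:
        raise ValueError(
            f"auth_scheme must be one of {sorted(VALID_SCHEMES)}, got {scheme!r}"
        )
    # An empty string counts as absent; any real value is rejected.
    if scheme == "none" and route.get("token_ref"):
        raise ValueError("token_ref is forbidden when auth_scheme is 'none'")

    deduped = []
    for prefix in route.get("path_allowlist") or []:
        if not (prefix.startswith("/") and prefix.endswith("/")):
            raise ValueError(
                f"path_allowlist entry {prefix!r} must start and end with '/'"
            )
        if prefix in deduped:
            # Duplicates are dropped with a warning, not an error.
            warnings.append(f"duplicate path_allowlist entry {prefix!r} dropped")
            continue
        deduped.append(prefix)
    return {**route, "path_allowlist": deduped}, warnings
```

Returning warnings separately from raising errors keeps the "deduplicate with a warning" rule distinct from the hard failures.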

### Migration / backward compatibility

- Routes without `path_allowlist` behave exactly as today.
- Routes with `auth_scheme: Bearer | token` behave exactly as today.
- No existing manifests need editing; the new fields are opt-in.

## Open questions

- **Match semantics: prefix vs glob vs regex.** Prefix is simple
  and matches the existing `route.path` convention. Glob (`/users/*/repos/`)
  adds power but is easy to get wrong (does `*` match a `/`?).
  Regex is the most powerful and the most footguny. Recommend
  prefix-only for v1, glob in a follow-up if operators ask for it.
- **403 body shape.** Plain text vs JSON. Cred-proxy's existing
  errors use plain text (`send_error(404, "no route for ...")`).
  Match that.
- **Auth-less routes and TLS interception.** A `none`-auth route
  still routes outbound HTTPS through pipelock (cred-proxy's
  `HTTPS_PROXY` env), so pipelock's CA + body scanner still apply.
  Confirm that pipelock's allowlist needs the upstream host in
  this case, since there's no token to make the cred-proxy →
  upstream leg special. Likely yes, same as today.
- **MCP tool / pipelock-block evolution.** Once path filtering
  exists, the operator may want a way for the agent to propose
  path additions (e.g. "I need /didericis-org/ added to the
  github route"). Today that requires manifest edit + cli.py
  rebuild, or `routes edit` via the dashboard. Whether a new MCP
  tool (or a richer pipelock-block) is wanted is a follow-on PRD
  open question.
- **Allowlist semantics for the entire route prefix.** Should an
  empty `path_allowlist: []` be allowed? Equivalent to "block
  everything at this upstream": possibly useful as a tombstone,
  more likely a typo. Recommend treating empty list the same as
  absent (permissive) and flagging in the validation note.

## References

- PRD 0010: cred-proxy (the engine being extended).
- PRD 0015: pipelock block remediation (whose hostname-only
  ceiling motivates this PRD).
- PR #25: `_apply_pipelock_url`'s docstring documents the
  follow-up that this PRD formalises.