From a74dd2b97f34a1a48e1f2b90954d753358c1ab97 Mon Sep 17 00:00:00 2001
From: didericis <eric@dideric.is>
Date: Sun, 24 May 2026 23:39:17 -0400
Subject: [PATCH] docs: research on git-gate commit approval; link from PRD
 0012

---
 docs/prds/0012-stuck-agent-recovery-flow.md |   2 +-
 docs/research/git-gate-commit-approval.md   | 229 ++++++++++++++++++++
 2 files changed, 230 insertions(+), 1 deletion(-)
 create mode 100644 docs/research/git-gate-commit-approval.md

diff --git a/docs/prds/0012-stuck-agent-recovery-flow.md b/docs/prds/0012-stuck-agent-recovery-flow.md
index 7d69ef4..41a142a 100644
--- a/docs/prds/0012-stuck-agent-recovery-flow.md
+++ b/docs/prds/0012-stuck-agent-recovery-flow.md
@@ -74,7 +74,7 @@ A real stuck agent recovers end-to-end through the flow: the agent hits a missin
 - How does the dashboard handle rejection? Does the agent get a comment back saying "denied, here's why," or does the bottle just stay torn down?
 - How does the orchestrator know which PR / branch a given bottle maps to — recorded at bottle-spawn time, derived from the working tree, or specified in the manifest?
 - Concurrency: if multiple bottles request changes simultaneously, what does the dashboard surface and in what order?
-- How does the flow handle one-off exceptions to gitlock / pipelock denials — e.g. a commit that includes docs with intentionally-bogus tokens that the secret scanner correctly flags? The shape (agent blocked → ask via PR comment → user approves → continue) is the same as a manifest-change request, but the *resolution* is different: a per-operation override or a scoped allowlist entry, not a new manifest. Does this fold into the same `/request-bottle-change` slash command with a different request type, or is it a separate slash command (e.g. `/request-gate-exception`)? And how is an "exception" expressed safely — by commit SHA, by content hash, by a narrow allowlist rule? Either way, the approval must be auditable so a future reader can see what was waived and why.
+- How does the flow handle one-off exceptions to gitlock / pipelock denials — e.g. a commit that includes docs with intentionally-bogus tokens that the secret scanner correctly flags? The shape (agent blocked → ask via PR comment → user approves → continue) is the same as a manifest-change request, but the *resolution* is different: a per-operation override or a scoped allowlist entry, not a new manifest. Does this fold into the same `/request-bottle-change` slash command with a different request type, or is it a separate slash command (e.g. `/request-gate-exception`)? And how is an "exception" expressed safely — by commit SHA, by content hash, by a narrow allowlist rule? Either way, the approval must be auditable so a future reader can see what was waived and why. See `docs/research/git-gate-commit-approval.md` for a survey of gitleaks's native allowlist primitives and a recommendation.
 
 ## References
 
diff --git a/docs/research/git-gate-commit-approval.md b/docs/research/git-gate-commit-approval.md
new file mode 100644
index 0000000..463d7bc
--- /dev/null
+++ b/docs/research/git-gate-commit-approval.md
@@ -0,0 +1,229 @@
+# Approving specific commits past git-gate
+
+Research into (1) whether a dashboard or operator surface for the
+git-gate (a.k.a. "gitlock", PRD 0008) already exists, and (2) what a
+narrowly-scoped approval flow for false-positive gitleaks rejections
+could look like without compromising the gate's "if it's bypassable it
+isn't a gate" property.
+
+Motivated by PRD 0012's open question: when an agent commits docs
+containing intentionally-bogus tokens that the secret scanner
+correctly flags, the rejection is correct in the literal sense and
+wrong in the user-intent sense, and there is no way to say so.
+
+## Summary
+
+There is no dashboard for the git-gate today. The CLI ships
+`init / list / info / start / edit / cleanup` for bottles; the gate is
+visible only as a sidecar in `bottle_plan.py`'s preflight rendering.
+No `gate` subcommand exists.
+
+There is also no exception mechanism. The pre-receive hook calls
+`gitleaks git --log-opts="$range" --no-banner --redact` with no
+config path and no allowlist surface. PRD 0008 explicitly rejects
+exceptions ("Bypass for trusted commits. No `[skip gitleaks]`
+trailer, no allowlist by commit hash. If the gate is bypassable it
+isn't a gate.").
+
+That non-goal is correct under its own framing — *any path the agent
+can take* invalidates the gate — but it conflates two distinct
+questions: can the *agent* bypass the gate (must be no), and can the
+*user* approve a narrowly-scoped exception (could be yes, under
+constraints). PRD 0012's recovery flow is exactly the seam where a
+user-side, out-of-band approval could live without giving the agent
+any in-band bypass.
+
+The design problem is therefore not "should there be exceptions" but
+"how narrow must an exception be for the gate to still be a gate."
+This note surveys gitleaks's native allowlist primitives,
+sketches three approval-scope designs, and recommends a direction.
+
+## Question 1: Is there a dashboard / operator surface for git-gate?
+
+No, in three senses:
+
+- **No CLI subcommand.** `claude_bottle/cli/` has `_common, cleanup,
+  edit, info, init, list, start` and nothing gate-specific.
+  `claude-bottle list` shows bottles, not their gates' state or
+  recent rejections.
+- **No gate-side log surface.** Rejections are written to the
+  pre-receive hook's stderr (`echo "git-gate: gitleaks rejected push
+  to $ref" >&2`); the agent sees the rejection in its `git push`
+  output, but nothing persists outside the container's logs.
+- **No upstream UI for git-gate.** gitleaks itself is a CLI; it has
+  no built-in dashboard. The hosted secret-scanning UIs surveyed in
+  `git-secret-scanning-hardening.md` (ggshield, TruffleHog Enterprise)
+  are SaaS products that ship repo content to a vendor — explicitly
+  the wrong shape for a project whose premise is sandbox isolation.
+
+The PRD 0012 dashboard, when it exists, is the natural place for a
+git-gate operator surface to live: list pending change requests,
+show recent rejections per bottle, render the diff of any
+exception-approval request. There is no reason to build a separate
+gate dashboard.
+
+## Question 2: How could specific commits be approved?
+
+### What gitleaks gives you natively
+
+Gitleaks's TOML config supports an `[allowlist]` block (or
+`[[rules.allowlists]]` per-rule) with four selectors, plus a
+`condition` field that controls how they combine inside a single
+entry. As observed in current gitleaks (v8):
+
+- `paths` — list of regex against file paths.
+- `regexes` — list of regex matched against the *finding*; on
+  match, suppress the finding. `regexTarget` chooses whether the
+  regex applies to the secret group only (the default), the full
+  matched bytes (`match`), or the surrounding line (`line`).
+- `stopwords` — substrings that, if present in the finding's secret,
+  suppress it. Cheaper than `regexes` for literal matches.
+- `commits` — explicit commit SHAs to skip entirely.
+- `condition` — `OR` (default) or `AND` across the above selectors.
+  `AND` lets an entry require, e.g., both a path match *and* a
+  content match before suppressing; it must be set explicitly.
+
+`commits` is the bluntest tool and the easiest to misuse: a single
+SHA can hide arbitrary content. `paths + regexes` (with AND) is the
+narrowest scope: a finding is only suppressed if it lives at a
+specific path *and* matches a specific byte pattern. That's the
+shape that makes a per-finding exception still defensible.
+
+The hook today does not pass `--config` to gitleaks. Adding it would
+mean baking a config file into the gate image *or* mounting one in
+at `start` time. The image is built per `DockerGitGate.start`, so
+either is mechanically straightforward.
+
+### The design tension
+
+PRD 0008's "no bypass for trusted commits" non-goal is load-bearing
+*against the agent*. It is not load-bearing against the user, who
+already has every privilege the gate is trying to deny the agent.
+The risk of letting the user approve exceptions is not direct (the
+user can already do whatever they want); it is indirect:
+
+- **Prompt-injection laundering.** An attacker who has captured the
+  agent's prompt-stream can ask the agent to *request* an exception
+  that looks plausible ("I just need to commit the test fixture for
+  the new auth flow"). If the user rubber-stamps the request, the
+  attacker has used the user as a bypass channel. This is the same
+  risk as any human-in-the-loop control: it degrades to "no control"
+  if the human always says yes.
+- **Scope creep of a granted exception.** A commit-SHA allowlist
+  entry suppresses every finding in that commit, not only the one
+  the user reviewed, so unreviewed content can ride along. This is
+  why `commits` alone is unsafe; `paths + regexes` is the form
+  that stays tied to the reviewed bytes.
+- **Persistence past intent.** An exception granted "just for this
+  commit" that stays in the gate's config indefinitely is no longer
+  a per-commit exception; it's a permanent allowlist entry. Without
+  TTL or a clean teardown, exceptions accrete.
+
+These three risks shape the design constraints below.
+
+### Three design options
+
+**Option A — Reject and rotate.** Treat every gitleaks hit as
+"rewrite the commit to not contain the literal token, then re-push."
+For docs with fake tokens, use a sentinel string the repo's
+gitleaks config recognizes as obviously not a real secret (e.g.
+`AKIAIOSFODNN7EXAMPLE`, AWS's documented example key, or a project-
+specific placeholder like `<aws-access-key-id>`).
+
+- *Cost:* zero. No new code.
+- *Property:* gate stays unbypassable in both senses.
+- *Friction:* every author must know the placeholder convention. The
+  first time someone pastes a realistic-looking fake into a doc,
+  they get rejected and have to redo the commit. Probably fine for
+  the host repo; less fine for bottles authoring third-party content.
+- *Verdict:* this should be the *default*. The exception flow exists
+  only for cases where Option A genuinely fails (e.g. the example is
+  specifically about a real-looking token format, or the upstream
+  doc requires the literal pattern).
+
+**Option B — Per-finding narrow allowlist via PRD 0012 flow.** When
+the agent's push is rejected, the agent invokes
+`/request-gate-exception` (or `/request-bottle-change` with an
+exception variant). The slash command POSTs to the cred-proxy
+endpoint, carrying:
+
+- the file path that triggered the finding
+- the finding's matched-byte hash (not the bytes themselves, to keep
+  the request artifact non-secret on its own)
+- the gitleaks rule ID
+- a free-text justification ("docs example for AWS auth flow")
+
+The user reviews the request in the dashboard, sees the file and the
+diff, and approves an entry of shape `{ paths: [<exact path>],
+regexes: [<exact-match regex over matched bytes>], condition: AND }`.
+The gate restarts with that config entry merged into its
+`.gitleaks.toml`. A future commit on the same path with a *different*
+finding still hits the gate and rejects.
+
+- *Property:* approved exceptions are content-locked, not commit-
+  locked. Substituting bytes on the same path triggers a fresh
+  rejection.
+- *Auditability:* the approval is a manifest diff; it lives in git
+  history and in the PR conversation thread per PRD 0012.
+- *Open: TTL.* Should the entry expire? Plausible defaults: never
+  (it's content-locked anyway), or "until the next manifest version
+  bump." Lean "never" for v1; revisit if exception lists balloon.
+
+**Option C — Pre-flight scan with author signoff.** Run gitleaks
+client-side inside the bottle (as a non-gating advisory check) so
+the agent sees findings *before* attempting the push. The slash
+command then includes the pre-known findings; the dashboard shows
+the user the finding inline rather than having to go look at the
+rejection log. On approval, same Option-B-style allowlist entry
+gets added.
+
+- *Property:* identical end-state to Option B; better UX because
+  the agent stops before the rejected push, not after.
+- *Cost:* one more place that needs gitleaks installed (the bottle
+  image), and an in-bottle advisory check that the agent can in
+  principle ignore. That's fine because it's *advisory* — the gate
+  still rejects; the in-bottle check just avoids one round-trip.
+- *Verdict:* nice-to-have over Option B, not a substitute.
+
+### Recommendation
+
+Default to Option A as the canonical answer ("rewrite to use a
+placeholder"). Build Option B as the PRD 0012 exception path, scoped
+narrowly: `paths + regexes` with AND, no `commits` selector exposed
+to the approval flow. Defer Option C to a follow-up; it's an
+ergonomic win, not a security property.
+
+This answers PRD 0012's open question as follows:
+
+- Same recovery shape (`/request-bottle-change`), distinguishable
+  request type. The dashboard renders an exception request
+  differently from a manifest-change request because the *diff*
+  being approved is to the gate's allowlist, not to the manifest.
+- Exceptions are expressed as `(path, content-pattern)` pairs, not
+  commit SHAs. Re-pushing different bytes on the same path
+  re-triggers the gate.
+- The approval is recorded twice for audit: in the PR thread (free-
+  text), and as a versioned diff to the gate's allowlist config (or
+  the manifest field that materializes into it).
+
+## Cross-references
+
+- PRD 0008 — git-gate design and "no bypass" non-goal.
+- PRD 0010 — cred-proxy; the inbound endpoint PRD 0012 reuses for
+  exception requests.
+- PRD 0012 — stuck-agent recovery flow; the open question this note
+  informs.
+- `docs/research/git-secret-scanning-hardening.md` — prior research
+  on the secret-scanning tool landscape and why gitleaks is the fit.
+
+## Sources
+
+- [gitleaks configuration documentation](https://github.com/gitleaks/gitleaks#configuration)
+  — `[allowlist]` selectors (`paths`, `regexes`, `stopwords`,
+  `commits`, `regexTarget`, `condition`).
+- [AWS example access key (`AKIAIOSFODNN7EXAMPLE`)](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_iam-quotas.html)
+  — documented placeholder safe to use in examples without
+  triggering most secret scanners.
+- `claude_bottle/git_gate.py` — pre-receive hook implementation
+  (`gitleaks git --log-opts="$log_opts" --no-banner --redact`, no
+  `--config` argument today).
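
As a rough illustration of Option B in the research note above, the request and approval steps might be sketched in Python as follows. Everything named here (`ExceptionRequest`, `make_request`, `approve`, `suppresses`) is hypothetical and not taken from the codebase. SHA-256 is an assumed choice for the matched-byte hash, and `suppresses` is a simplified Python stand-in for how an `AND` entry behaves, not gitleaks's own matcher. The sketch also assumes that `re.escape` output is acceptable to the Go regexp engine gitleaks uses.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ExceptionRequest:
    """What the agent sends: identifies the finding without carrying its bytes."""
    path: str
    match_sha256: str
    rule_id: str
    justification: str


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_request(path: str, matched: str, rule_id: str, justification: str) -> ExceptionRequest:
    """Agent side: hash the matched bytes so the request itself is non-secret."""
    return ExceptionRequest(path, _sha256(matched), rule_id, justification)


def approve(request: ExceptionRequest, reviewed_bytes: str) -> dict:
    """User side: build a content-locked entry, refusing if the bytes differ."""
    if _sha256(reviewed_bytes) != request.match_sha256:
        raise ValueError("reviewed bytes do not match the requested finding")
    return {
        "description": f"{request.rule_id}: {request.justification}",
        "condition": "AND",      # set explicitly; gitleaks defaults to OR
        "regexTarget": "match",  # compare against the full matched bytes
        "paths": ["^" + re.escape(request.path) + "$"],
        "regexes": ["^" + re.escape(reviewed_bytes) + "$"],
    }


def suppresses(entry: dict, path: str, matched: str) -> bool:
    """Simplified AND evaluation using Python's re (illustration only)."""
    path_ok = any(re.search(p, path) for p in entry["paths"])
    bytes_ok = any(re.search(r, matched) for r in entry["regexes"])
    return path_ok and bytes_ok
```

With an entry built this way, the same bytes at a different path, or different bytes at the same path, are not suppressed, which is the content-locked property the note asks for. The keys mirror gitleaks allowlist fields so the dict could be serialized into the gate's TOML config; how that merge happens is left open here.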
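
For Option C, the in-bottle advisory check could hand the slash command a summary of findings without the secret bytes. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the scan is assumed to run without `--redact` so the `Match` field holds the real bytes to hash, and the report field names (`File`, `RuleID`, `Commit`, `StartLine`, `Match`) are those of gitleaks's JSON report format as understood at the time of writing.

```python
from __future__ import annotations

import hashlib
import json


def advisory_command(log_opts: str, report_path: str) -> list[str]:
    """argv for a non-gating in-bottle scan that writes a JSON report.

    No --redact here (an assumption of this sketch): the report then
    contains the matched bytes and should be deleted once summarized.
    """
    return [
        "gitleaks", "git",
        f"--log-opts={log_opts}",
        "--no-banner",
        "--report-format", "json",
        "--report-path", report_path,
    ]


def summarize(report_json: str) -> list[dict]:
    """Reduce a gitleaks JSON report to non-secret fields plus a byte hash."""
    findings = json.loads(report_json) or []
    return [
        {
            "path": finding["File"],
            "rule_id": finding["RuleID"],
            "commit": finding["Commit"],
            "line": finding["StartLine"],
            # Hash the matched bytes; the bytes themselves are dropped.
            "match_sha256": hashlib.sha256(
                finding["Match"].encode("utf-8")
            ).hexdigest(),
        }
        for finding in findings
    ]
```

The command would typically be run with `subprocess.run(..., check=False)`, since gitleaks signals findings through a non-zero exit status by default. Because the check is advisory, nothing here replaces the pre-receive hook: the gate still scans and rejects on its own.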