# Approving specific commits past git-gate

Research into (1) whether a dashboard or operator surface for the
git-gate (a.k.a. "gitlock", PRD 0008) already exists, and (2) what a
narrowly-scoped approval flow for false-positive gitleaks rejections
could look like without compromising the gate's "if it's bypassable it
isn't a gate" property.

Motivated by PRD 0012's open question: when an agent commits docs
containing intentionally-bogus tokens that the secret scanner
correctly flags, the rejection is correct in the literal sense and
wrong in the user-intent sense, and there is no way to say so.

## Summary

There is no dashboard for the git-gate today. The CLI ships
`init / list / info / start / edit / cleanup` for bottles; the gate is
visible only as a sidecar in `bottle_plan.py`'s preflight rendering.
No `gate` subcommand exists.

There is also no exception mechanism. The pre-receive hook calls
`gitleaks git --log-opts="$range" --no-banner --redact` with no
config path and no allowlist surface. PRD 0008 explicitly rejects
exceptions ("Bypass for trusted commits. No `[skip gitleaks]`
trailer, no allowlist by commit hash. If the gate is bypassable it
isn't a gate.").

That non-goal is correct under its own framing — *any path the agent
can take* invalidates the gate — but it conflates two distinct
questions: can the *agent* bypass the gate (must be no), and can the
*user* approve a narrowly-scoped exception (could be yes, under
constraints). PRD 0012's recovery flow is exactly the seam where a
user-side, out-of-band approval could live without giving the agent
any in-band bypass.

The design problem is therefore not "should there be exceptions" but
"how narrow does an exception have to be before the gate is still a
gate." This note surveys gitleaks's native allowlist primitives,
sketches three approval-scope designs, and recommends a direction.

## Question 1: Is there a dashboard / operator surface for git-gate?

No, in three senses:

- **No CLI subcommand.** `claude_bottle/cli/` has `_common, cleanup,
  edit, info, init, list, start` and nothing gate-specific.
  `claude-bottle list` shows bottles, not their gates' state or
  recent rejections.
- **No gate-side log surface.** Rejections are written to the
  pre-receive hook's stderr (`echo "git-gate: gitleaks rejected push
  to $ref" >&2`); the agent sees the rejection in its `git push`
  output, but nothing persists outside the container's logs.
- **No upstream UI for git-gate.** gitleaks itself is a CLI; it has
  no built-in dashboard. The hosted secret-scanning UIs surveyed in
  `git-secret-scanning-hardening.md` (ggshield, TruffleHog Enterprise)
  are SaaS products that ship repo content to a vendor — explicitly
  the wrong shape for a project whose premise is sandbox isolation.

The PRD 0012 dashboard, when it exists, is the natural place for a
git-gate operator surface to live: listing pending change requests,
showing recent rejections per bottle, and rendering the diff of any
exception-approval request. There is no reason to build a separate
gate dashboard.

## Question 2: How could specific commits be approved?

### What gitleaks gives you natively

Gitleaks's TOML config supports an `[allowlist]` block (or
`[[rules.allowlists]]` per-rule) with four selectors that can be
combined inside a single entry, plus a `condition` key that controls
how they combine. As observed in current gitleaks (v8):

- `paths`: list of regexes matched against file paths.
- `regexes`: list of regexes matched against the finding; on match,
  suppress the finding. `regexTarget` chooses whether the regex
  applies to the extracted secret (the default), the full match, or
  the surrounding line.
- `stopwords`: substrings that, if present in the finding, suppress
  it. Cheaper than `regexes` for literal matches.
- `commits`: explicit commit SHAs to skip entirely.
- `condition`: `OR` (default) or `AND` across the above selectors.
  `AND` lets an entry require, e.g., both a path match *and* a
  content match before suppressing; it has to be set explicitly.

`commits` is the bluntest tool and the easiest to misuse: a single
SHA can hide arbitrary content. `paths + regexes` (with AND) is the
narrowest scope: a finding is only suppressed if it lives at a
specific path *and* matches a specific byte pattern. That's the
shape that makes a per-finding exception still defensible.
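The difference in scope can be made concrete with a small model. The sketch below is not gitleaks's implementation: `Finding`, `AllowlistEntry`, and `is_suppressed` are hypothetical names, `stopwords` is left out, and the matching rules are simplified to the selectors discussed above.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Finding:
    commit: str  # SHA of the commit the finding came from
    path: str    # file path the finding lives at
    match: str   # text the rule matched


@dataclass
class AllowlistEntry:
    condition: str  # "AND" or "OR", always stated explicitly in this model
    paths: list[str] = field(default_factory=list)
    regexes: list[str] = field(default_factory=list)
    commits: list[str] = field(default_factory=list)


def is_suppressed(finding: Finding, entry: AllowlistEntry) -> bool:
    """Return True if the entry would hide the finding (simplified model)."""
    checks = []
    if entry.commits:
        checks.append(finding.commit in entry.commits)
    if entry.paths:
        checks.append(any(re.search(p, finding.path) for p in entry.paths))
    if entry.regexes:
        checks.append(any(re.search(r, finding.match) for r in entry.regexes))
    if not checks:
        return False  # an empty entry suppresses nothing
    return all(checks) if entry.condition == "AND" else any(checks)


# A commit-only entry hides whatever that commit contains.
by_commit = AllowlistEntry(condition="OR", commits=["c0ffee"])
# A path-and-content entry hides one specific string at one specific path.
narrow = AllowlistEntry(
    condition="AND",
    paths=[r"^docs/auth\.md$"],
    regexes=[r"^AKIAFAKEFAKEFAKE1234$"],
)
```

In this model, a finding with different bytes at `docs/auth.md` in commit `c0ffee` is still hidden by `by_commit` but not by `narrow`, which is the property the approval flow needs.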

The hook today does not pass `--config` to gitleaks. Adding it would
mean baking a config file into the gate image *or* mounting one in
at `start` time. The image is built per `DockerGitGate.start`, so
either is mechanically straightforward.
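As a rough illustration of how small the change is, the invocation can be modelled as an argument list with an optional `--config`. The real hook is a shell script; `gitleaks_argv` and the mount path `/etc/git-gate/gitleaks.toml` are hypothetical and only show where the flag would go.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def gitleaks_argv(log_opts: str, config: PurePosixPath | None = None) -> list[str]:
    """Build the gitleaks command line the pre-receive hook would run.

    Without `config` this mirrors today's invocation; with it, the gate
    points gitleaks at a config file baked into or mounted in the image.
    """
    argv = ["gitleaks", "git", f"--log-opts={log_opts}", "--no-banner", "--redact"]
    if config is not None:
        argv += ["--config", str(config)]
    return argv


# Hypothetical mount point for the gate's config file.
with_config = gitleaks_argv("old..new", PurePosixPath("/etc/git-gate/gitleaks.toml"))
```

Keeping the config optional means a gate started without any approved exceptions behaves exactly as it does today.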

### The design tension

PRD 0008's "no bypass for trusted commits" non-goal is load-bearing
*against the agent*. It is not load-bearing against the user, who
already has every privilege the gate is trying to deny the agent.
The risk of letting the user approve exceptions is not direct (the
user can already do whatever they want); it is indirect:

- **Prompt-injection laundering.** An attacker who has captured the
  agent's prompt-stream can ask the agent to *request* an exception
  that looks plausible ("I just need to commit the test fixture for
  the new auth flow"). If the user rubber-stamps the request, the
  attacker has used the user as a bypass channel. This is the same
  risk as any human-in-the-loop control: it degrades to "no control"
  if the human always says yes.
- **Scope creep of a granted exception.** A commit-SHA allowlist
  approved for one commit could, in principle, be re-targeted at a
  different commit if the allowlist isn't tied to the content. This
  is why `commits` alone is unsafe; `paths + regexes` is the form
  that survives content-substitution.
- **Persistence past intent.** An exception granted "just for this
  commit" that stays in the gate's config indefinitely is no longer
  a per-commit exception; it's a permanent allowlist entry. Without
  TTL or a clean teardown, exceptions accrete.

These three risks shape the design constraints below.

### Three design options

**Option A — Reject and rotate.** Treat every gitleaks hit as
"rewrite the commit to not contain the literal token, then re-push."
For docs with fake tokens, use a sentinel string that the gate's
gitleaks ruleset treats as obviously not a real secret (e.g.
`AKIAIOSFODNN7EXAMPLE`, AWS's documented example key, or a
project-specific placeholder like `<aws-access-key-id>`).

- *Cost:* zero. No new code.
- *Property:* gate stays unbypassable in both senses.
- *Friction:* every author must know the placeholder convention. The
  first time someone pastes a realistic-looking fake into a doc,
  they get rejected and have to redo the commit. Probably fine for
  the host repo; less fine for bottles authoring third-party content.
- *Verdict:* this should be the *default*. The exception flow exists
  only for cases where Option A genuinely fails (e.g. the example is
  specifically about a real-looking token format, or the upstream
  doc requires the literal pattern).
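The rewrite step in Option A is mechanical enough to sketch. The helper below is hypothetical, and its pattern for AWS access key IDs is a simplified assumption, not the rule gitleaks ships; it swaps any realistic-looking key ID for AWS's documented example key and reports how many it changed.

```python
import re

AWS_EXAMPLE_KEY_ID = "AKIAIOSFODNN7EXAMPLE"

# Simplified assumption about what an AWS access key ID looks like:
# a known 4-letter prefix followed by 16 uppercase letters or digits.
AWS_KEY_ID_RE = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def use_placeholder(text: str) -> tuple[str, int]:
    """Replace realistic-looking key IDs with the documented example key.

    Returns the rewritten text and the number of replacements made.
    """
    replaced = 0

    def swap(m: re.Match) -> str:
        nonlocal replaced
        if m.group(0) == AWS_EXAMPLE_KEY_ID:
            return m.group(0)  # already the placeholder, leave it alone
        replaced += 1
        return AWS_EXAMPLE_KEY_ID

    return AWS_KEY_ID_RE.sub(swap, text), replaced


fixed, count = use_placeholder("export AWS_ACCESS_KEY_ID=AKIAABCDEFGHIJKLMNOP")
```

Running the helper a second time on its own output changes nothing, so it is safe to apply before every re-push.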

**Option B — Per-finding narrow allowlist via PRD 0012 flow.** When
the agent's push is rejected, the agent invokes
`/request-gate-exception` (or `/request-bottle-change` with an
exception variant). The slash command POSTs to the cred-proxy
endpoint, carrying:

- the file path that triggered the finding
- the finding's matched-byte hash (not the bytes themselves, to keep
  the request artifact non-secret on its own)
- the gitleaks rule ID
- a free-text justification ("docs example for AWS auth flow")
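A minimal sketch of that request body, assuming SHA-256 as the hash and a flat JSON object as the wire format. The field names and `build_exception_request` are hypothetical; the cred-proxy endpoint's actual schema is not defined in this note.

```python
import hashlib
import json


def build_exception_request(
    path: str, matched: bytes, rule_id: str, justification: str
) -> dict:
    """Assemble an exception request that carries a hash, not the bytes."""
    return {
        "type": "gate-exception",  # hypothetical discriminator for the dashboard
        "path": path,
        "match_sha256": hashlib.sha256(matched).hexdigest(),
        "rule_id": rule_id,
        "justification": justification,
    }


request = build_exception_request(
    path="docs/auth.md",
    matched=b"AKIAFAKEFAKEFAKE1234",
    rule_id="aws-access-token",
    justification="docs example for AWS auth flow",
)
body = json.dumps(request)  # what the slash command would POST
```

Because only the digest travels, the request can be logged or quoted in a PR thread without reproducing the flagged string.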

The user reviews the request in the dashboard, sees the file and the
diff, and approves an entry of shape `{ paths: [<exact path>],
regexes: [<exact-match regex over matched bytes>], condition: AND }`.
The gate restarts with that config entry merged into its
`.gitleaks.toml`. A future commit on the same path with a *different*
finding still hits the gate and rejects.

- *Property:* approved exceptions are content-locked, not commit-
  locked. Substituting bytes on the same path triggers a fresh
  rejection.
- *Auditability:* the approval is a manifest diff; it lives in git
  history and in the PR conversation thread per PRD 0012.
- *Open: TTL.* Should the entry expire? Plausible defaults: never
  (it's content-locked anyway), or "until the next manifest version
  bump." Lean "never" for v1; revisit if exception lists balloon.
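One way the approved entry could be derived, as a sketch under assumptions: the dashboard has the exact path and the matched text from the diff the user reviewed, and `re.escape` output is acceptable to gitleaks's Go regex engine for the strings involved (worth verifying before relying on it). `allowlist_entry` and `render_toml_keys` are hypothetical helpers, and the rendered keys still have to be placed under whichever allowlist table the gate's config uses.

```python
import re


def allowlist_entry(path: str, matched: str) -> dict:
    """Build a content-locked entry: this exact path AND this exact match."""
    return {
        "condition": "AND",
        "regexTarget": "match",  # compare against the full match, not just the secret
        "paths": ["^" + re.escape(path) + "$"],
        "regexes": ["^" + re.escape(matched) + "$"],
    }


def render_toml_keys(entry: dict) -> str:
    """Render the entry's keys as TOML lines, using literal strings for regexes."""

    def literal(s: str) -> str:
        if "'''" in s or "\n" in s:
            raise ValueError("cannot render as a single-line TOML literal string")
        return f"'''{s}'''"

    return "\n".join(
        [
            f'condition = "{entry["condition"]}"',
            f'regexTarget = "{entry["regexTarget"]}"',
            "paths = [" + ", ".join(literal(p) for p in entry["paths"]) + "]",
            "regexes = [" + ", ".join(literal(r) for r in entry["regexes"]) + "]",
        ]
    )


entry = allowlist_entry("docs/auth.md", "AKIAFAKEFAKEFAKE1234")
toml_keys = render_toml_keys(entry)
```

Anchoring both patterns with `^` and `$` is what makes the entry content-locked: a longer path or a different token fails the match and the gate rejects again.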

**Option C — Pre-flight scan with author signoff.** Run gitleaks
client-side inside the bottle (as a non-gating advisory check) so
the agent sees findings *before* attempting the push. The slash
command then includes the pre-known findings; the dashboard shows
the user the finding inline rather than having to go look at the
rejection log. On approval, same Option-B-style allowlist entry
gets added.

- *Property:* identical end-state to Option B; better UX because
  the agent stops before the rejected push, not after.
- *Cost:* one more place that needs gitleaks installed (the bottle
  image), and an in-bottle advisory check that the agent can in
  principle ignore. That's fine because it's *advisory* — the gate
  still rejects; the in-bottle check just avoids one round-trip.
- *Verdict:* nice-to-have over Option B, not a substitute.
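The advisory check could hand its results to the slash command by parsing a gitleaks JSON report (produced with `--report-format json` and `--report-path`). The sketch below only does the parsing; `advisory_findings` is a hypothetical helper, and the report field names (`RuleID`, `File`, `StartLine`, `Commit`) reflect gitleaks v8 output as I understand it and should be checked against the version in the bottle image.

```python
import json


def advisory_findings(report_text: str) -> list[dict]:
    """Reduce a gitleaks JSON report to the fields an exception request needs.

    The secret itself is deliberately not copied out of the report.
    """
    findings = json.loads(report_text) or []  # tolerate a null report
    return [
        {
            "rule_id": f.get("RuleID"),
            "path": f.get("File"),
            "line": f.get("StartLine"),
            "commit": f.get("Commit"),
        }
        for f in findings
    ]


# Hand-written sample in the assumed report shape, not real scanner output.
sample_report = json.dumps(
    [
        {
            "RuleID": "aws-access-token",
            "File": "docs/auth.md",
            "StartLine": 12,
            "Commit": "c0ffee",
            "Secret": "REDACTED",
        }
    ]
)
pending = advisory_findings(sample_report)
```

Nothing here gates the push; an agent that skips this step simply learns about the findings from the gate's rejection instead.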

### Recommendation

Default to Option A as the canonical answer ("rewrite to use a
placeholder"). Build Option B as the PRD 0012 exception path, scoped
narrowly: `paths + regexes` with an explicit `condition = "AND"`, and
no `commits` selector exposed to the approval flow. Defer Option C to
a follow-up; it's an ergonomic win, not a security property.

This puts the answer to PRD 0012's open question as:

- Same recovery shape (`/request-bottle-change`), distinguishable
  request type. The dashboard renders an exception request
  differently from a manifest-change request because the *diff*
  being approved is to the gate's allowlist, not to the manifest.
- Exceptions are expressed as `(path, content-pattern)` pairs, not
  commit SHAs. Re-pushing different bytes on the same path
  re-triggers the gate.
- The approval is recorded twice for audit: in the PR thread (free-
  text), and as a versioned diff to the gate's allowlist config (or
  the manifest field that materializes into it).

## Cross-references

- PRD 0008 — git-gate design and "no bypass" non-goal.
- PRD 0010 — cred-proxy; the inbound endpoint PRD 0012 reuses for
  exception requests.
- PRD 0012 — stuck-agent recovery flow; the open question this note
  informs.
- `docs/research/git-secret-scanning-hardening.md` — prior research
  on the secret-scanning tool landscape and why gitleaks is the fit.

## Sources

- [gitleaks configuration documentation](https://github.com/gitleaks/gitleaks#configuration)
  — `[allowlist]` selectors (`paths`, `regexes`, `stopwords`,
  `commits`, `regexTarget`, `condition`).
- [AWS example access key (`AKIAIOSFODNN7EXAMPLE`)](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_iam-quotas.html)
  — documented placeholder safe to use in examples without
  triggering most secret scanners.
- `claude_bottle/git_gate.py` — pre-receive hook implementation
  (`gitleaks git --log-opts="$log_opts" --no-banner --redact`, no
  `--config` argument today).