docs: research on git-gate commit approval; link from PRD 0012
@@ -74,7 +74,7 @@ A real stuck agent recovers end-to-end through the flow: the agent hits a missin
- How does the dashboard handle rejection? Does the agent get a comment back saying "denied, here's why," or does the bottle just stay torn down?
- How does the orchestrator know which PR / branch a given bottle maps to — recorded at bottle-spawn time, derived from the working tree, or specified in the manifest?
- Concurrency: if multiple bottles request changes simultaneously, what does the dashboard surface and in what order?
- How does the flow handle one-off exceptions to gitlock / pipelock denials — e.g. a commit that includes docs with intentionally-bogus tokens that the secret scanner correctly flags? The shape (agent blocked → ask via PR comment → user approves → continue) is the same as a manifest-change request, but the *resolution* is different: a per-operation override or a scoped allowlist entry, not a new manifest. Does this fold into the same `/request-bottle-change` slash command with a different request type, or is it a separate slash command (e.g. `/request-gate-exception`)? And how is an "exception" expressed safely — by commit SHA, by content hash, by a narrow allowlist rule? Either way, the approval must be auditable so a future reader can see what was waived and why. See `docs/research/git-gate-commit-approval.md` for a survey of gitleaks's native allowlist primitives and a recommendation.

## References

@@ -0,0 +1,229 @@
# Approving specific commits past git-gate

Research into (1) whether a dashboard or operator surface for the
git-gate (a.k.a. "gitlock", PRD 0008) already exists, and (2) what a
narrowly-scoped approval flow for false-positive gitleaks rejections
could look like without compromising the gate's "if it's bypassable it
isn't a gate" property.

Motivated by PRD 0012's open question: when an agent commits docs
containing intentionally-bogus tokens that the secret scanner
correctly flags, the rejection is correct in the literal sense and
wrong in the user-intent sense, and there is no way to say so.

## Summary

There is no dashboard for the git-gate today. The CLI ships
`init / list / info / start / edit / cleanup` for bottles; the gate is
visible only as a sidecar in `bottle_plan.py`'s preflight rendering.
No `gate` subcommand exists.

There is also no exception mechanism. The pre-receive hook calls
`gitleaks git --log-opts="$range" --no-banner --redact` with no
config path and no allowlist surface. PRD 0008 explicitly rejects
exceptions ("Bypass for trusted commits. No `[skip gitleaks]`
trailer, no allowlist by commit hash. If the gate is bypassable it
isn't a gate.").

That non-goal is correct under its own framing — *any path the agent
can take* invalidates the gate — but it conflates two distinct
questions: can the *agent* bypass the gate (must be no), and can the
*user* approve a narrowly-scoped exception (could be yes, under
constraints). PRD 0012's recovery flow is exactly the seam where a
user-side, out-of-band approval could live without giving the agent
any in-band bypass.

The design problem is therefore not "should there be exceptions" but
"how narrow does an exception have to be before the gate is still a
gate." This note surveys gitleaks's native allowlist primitives,
sketches three approval-scope designs, and recommends a direction.

## Question 1: Is there a dashboard / operator surface for git-gate?

No, in three senses:

- **No CLI subcommand.** `claude_bottle/cli/` has `_common, cleanup,
  edit, info, init, list, start` and nothing gate-specific.
  `claude-bottle list` shows bottles, not their gates' state or
  recent rejections.
- **No gate-side log surface.** Rejections are written to the
  pre-receive hook's stderr (`echo "git-gate: gitleaks rejected push
  to $ref" >&2`); the agent sees the rejection in its `git push`
  output, but nothing persists outside the container's logs.
- **No upstream UI for git-gate.** gitleaks itself is a CLI; it has
  no built-in dashboard. The hosted secret-scanning UIs surveyed in
  `git-secret-scanning-hardening.md` (ggshield, TruffleHog Enterprise)
  are SaaS products that ship repo content to a vendor — explicitly
  the wrong shape for a project whose premise is sandbox isolation.

The PRD 0012 dashboard, when it exists, is the natural place for
git-gate operator surface to live: list pending change requests,
show recent rejections per bottle, render the diff of any
exception-approval request. There is no reason to build a separate
gate dashboard.

## Question 2: How could specific commits be approved?

### What gitleaks gives you natively

Gitleaks's TOML config supports an `[allowlist]` block (or
`[[rules.allowlists]]` per-rule) with four selectors that can be
combined inside a single entry, plus a `condition` key that controls
how they combine. As observed in current gitleaks (v8):

- `paths` — list of regexes matched against file paths.
- `regexes` — list of regexes matched against the finding; on match,
  suppress the finding. `regexTarget` chooses whether the regex
  applies to the extracted secret (the default), the full match, or
  the whole line.
- `stopwords` — substrings that, if present in the finding's
  extracted secret, suppress it. Cheaper than `regexes` for literal
  matches.
- `commits` — explicit commit SHAs to skip entirely.
- `condition` — `OR` (default) or `AND` across the above selectors.
  `AND` lets an entry require, e.g., both a path match *and* a
  content match before suppressing; it has to be set explicitly.

`commits` is the bluntest tool and the easiest to misuse: a single
SHA can hide arbitrary content. `paths + regexes` (with
`condition = "AND"`) is the narrowest scope: a finding is only
suppressed if it lives at a specific path *and* matches a specific
byte pattern. That's the shape that makes a per-finding exception
still defensible.
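
To make the `paths + regexes` shape concrete, here is a minimal Python sketch that renders one such entry as a TOML fragment. The helper names (`toml_literal`, `allowlist_entry`) are hypothetical and not part of claude-bottle. The sketch assumes the fragment is appended under the matching `[[rules]]` table, and that the output of Python's `re.escape` is accepted by gitleaks's Go regex engine; both assumptions should be checked against the gitleaks version in the gate image.

```python
import re


def toml_literal(value: str) -> str:
    """Quote value as a TOML literal string (no escape processing)."""
    if "'''" in value or "\n" in value:
        raise ValueError("value cannot be written as a one-line ''' literal")
    return "'''" + value + "'''"


def allowlist_entry(path: str, matched: str, description: str) -> str:
    """Render an entry that suppresses `matched` only when found at `path`."""
    # Anchor both patterns so a longer path or a longer match does not qualify.
    path_re = "^" + re.escape(path) + "$"
    match_re = "^" + re.escape(matched) + "$"
    lines = [
        "[[rules.allowlists]]",   # assumed placement: under the rule's [[rules]] table
        "description = " + toml_literal(description),
        'condition = "AND"',      # require the path AND the content to match
        'regexTarget = "match"',  # compare against the full match
        "paths = [" + toml_literal(path_re) + "]",
        "regexes = [" + toml_literal(match_re) + "]",
    ]
    return "\n".join(lines) + "\n"


entry = allowlist_entry(
    "docs/auth.md", "AKIAIOSFODNN7EXAMPLE", "docs example for AWS auth flow"
)
print(entry)
```

Because both patterns are anchored and escaped, changing either the path or the bytes produces a finding that this entry does not cover.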

The hook today does not pass `--config` to gitleaks. Adding it would
mean baking a config file into the gate image *or* mounting one in
at `start` time. The image is built per `DockerGitGate.start`, so
either is mechanically straightforward.
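
As a rough illustration of how small that change is, the sketch below assembles the gitleaks command line with an optional `--config` argument. It assumes the command is built in Python before being rendered into the hook, which may not match how `git_gate.py` actually works; `gitleaks_argv` and the config path are hypothetical names.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def gitleaks_argv(log_opts: str, config: Optional[PurePosixPath] = None) -> List[str]:
    """Build the gitleaks invocation, adding --config only when one is given."""
    argv = ["gitleaks", "git", f"--log-opts={log_opts}", "--no-banner", "--redact"]
    if config is not None:
        # Hypothetical path; it would be baked into or mounted in the gate image.
        argv += ["--config", str(config)]
    return argv


today = gitleaks_argv("old..new")
with_config = gitleaks_argv("old..new", PurePosixPath("/etc/git-gate/gitleaks.toml"))
```

Leaving `config` unset reproduces the current invocation, so the allowlist stays opt-in per gate.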

### The design tension

PRD 0008's "no bypass for trusted commits" non-goal is load-bearing
*against the agent*. It is not load-bearing against the user, who
already has every privilege the gate is trying to deny the agent.
The risk of letting the user approve exceptions is not direct (the
user can already do whatever they want); it is indirect:

- **Prompt-injection laundering.** An attacker who has captured the
  agent's prompt-stream can ask the agent to *request* an exception
  that looks plausible ("I just need to commit the test fixture for
  the new auth flow"). If the user rubber-stamps the request, the
  attacker has used the user as a bypass channel. This is the same
  risk as any human-in-the-loop control: it degrades to "no control"
  if the human always says yes.
- **Scope creep of a granted exception.** A commit-SHA allowlist
  approved for one commit could, in principle, be re-targeted at a
  different commit if the allowlist isn't tied to the content. This
  is why `commits` alone is unsafe; `paths + regexes` is the form
  that survives content-substitution.
- **Persistence past intent.** An exception granted "just for this
  commit" that stays in the gate's config indefinitely is no longer
  a per-commit exception; it's a permanent allowlist entry. Without
  TTL or a clean teardown, exceptions accrete.

These three risks shape the design constraints below.

### Three design options

**Option A — Reject and rotate.** Treat every gitleaks hit as
"rewrite the commit to not contain the literal token, then re-push."
For docs with fake tokens, use a sentinel string the repo's
gitleaks config recognizes as obviously not a real secret (e.g.
`AKIAIOSFODNN7EXAMPLE`, AWS's documented example key, or a
project-specific placeholder like `<aws-access-key-id>`).

- *Cost:* zero. No new code.
- *Property:* gate stays unbypassable in both senses.
- *Friction:* every author must know the placeholder convention. The
  first time someone pastes a realistic-looking fake into a doc,
  they get rejected and have to redo the commit. Probably fine for
  the host repo; less fine for bottles authoring third-party content.
- *Verdict:* this should be the *default*. The exception flow exists
  only for cases where Option A genuinely fails (e.g. the example is
  specifically about a real-looking token format, or the upstream
  doc requires the literal pattern).

**Option B — Per-finding narrow allowlist via PRD 0012 flow.** When
the agent's push is rejected, the agent invokes
`/request-gate-exception` (or `/request-bottle-change` with an
exception variant). The slash command POSTs to the cred-proxy
endpoint, carrying:

- the file path that triggered the finding
- the finding's matched-byte hash (not the bytes themselves, to keep
  the request artifact non-secret on its own)
- the gitleaks rule ID
- a free-text justification ("docs example for AWS auth flow")

The user reviews the request in the dashboard, sees the file and the
diff, and approves an entry of shape `{ paths: [<exact path>],
regexes: [<exact-match regex over matched bytes>], condition: AND }`.
The gate restarts with that config entry merged into its
`.gitleaks.toml`. A future commit on the same path with a *different*
finding still hits the gate and rejects.

- *Property:* approved exceptions are content-locked, not
  commit-locked. Substituting bytes on the same path triggers a fresh
  rejection.
- *Auditability:* the approval is a manifest diff; it lives in git
  history and in the PR conversation thread per PRD 0012.
- *Open: TTL.* Should the entry expire? Plausible defaults: never
  (it's content-locked anyway), or "until the next manifest version
  bump." Lean "never" for v1; revisit if exception lists balloon.
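
The request payload for Option B could be sketched as follows. All names here (`GateExceptionRequest`, `build_request`, `describes`, the field names, and the example rule ID) are hypothetical; the sketch assumes SHA-256 as the matched-byte hash, and assumes the approving side recomputes that hash from the bytes it sees in the diff rather than trusting the agent's description.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GateExceptionRequest:
    path: str           # file that triggered the finding
    rule_id: str        # gitleaks rule ID
    match_sha256: str   # hex digest of the matched bytes, never the bytes
    justification: str  # free-text reason shown to the user


def build_request(path: str, rule_id: str, matched: bytes,
                  justification: str) -> GateExceptionRequest:
    """Agent side: hash the matched bytes so the request is non-secret."""
    digest = hashlib.sha256(matched).hexdigest()
    return GateExceptionRequest(path, rule_id, digest, justification)


def describes(request: GateExceptionRequest, path: str, rule_id: str,
              candidate: bytes) -> bool:
    """Approver side: is the finding visible in the diff the one requested?"""
    digest = hashlib.sha256(candidate).hexdigest()
    return (
        request.path == path
        and request.rule_id == rule_id
        and hmac.compare_digest(request.match_sha256, digest)
    )


request = build_request(
    "docs/auth.md", "aws-access-token", b"AKIAIOSFODNN7EXAMPLE",
    "docs example for AWS auth flow",
)
body = json.dumps(asdict(request), sort_keys=True)  # what the POST would carry
```

One caveat of hashing alone: short or low-entropy matches can be guessed by brute force, so the hash keeps the artifact non-secret only for reasonably long tokens.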

**Option C — Pre-flight scan with author signoff.** Run gitleaks
client-side inside the bottle (as a non-gating advisory check) so
the agent sees findings *before* attempting the push. The slash
command then includes the pre-known findings; the dashboard shows
the user the finding inline rather than having to go look at the
rejection log. On approval, same Option-B-style allowlist entry
gets added.

- *Property:* identical end-state to Option B; better UX because
  the agent stops before the rejected push, not after.
- *Cost:* one more place that needs gitleaks installed (the bottle
  image), and an in-bottle advisory check that the agent can in
  principle ignore. That's fine because it's *advisory* — the gate
  still rejects; the in-bottle check just avoids one round-trip.
- *Verdict:* nice-to-have over Option B, not a substitute.
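
A sketch of the in-bottle side of Option C: turning a gitleaks JSON report into the pre-known findings the slash command would attach. It assumes the advisory run writes a JSON report and that each finding carries `RuleID`, `File`, and `StartLine` fields, which is how gitleaks's JSON output is recalled here and should be verified against the installed version. The function name and the sample report are hypothetical.

```python
import json
from typing import Dict, List, Union

Finding = Dict[str, Union[str, int]]


def advisory_findings(report_json: str) -> List[Finding]:
    """Reduce a gitleaks JSON report to the fields an exception request needs."""
    raw = json.loads(report_json) or []  # tolerate an empty or null report
    return [
        {"path": item["File"], "rule_id": item["RuleID"], "line": item["StartLine"]}
        for item in raw
    ]


# Hand-written sample, not captured from a real gitleaks run.
sample = '[{"RuleID": "aws-access-token", "File": "docs/auth.md", "StartLine": 12}]'
findings = advisory_findings(sample)
```

Nothing here gates anything: an empty list only means the advisory scan found nothing, and the pre-receive hook remains the authority.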

### Recommendation

Default to Option A as the canonical answer ("rewrite to use a
placeholder"). Build Option B as the PRD 0012 exception path, scoped
narrowly: `paths + regexes` with AND, no `commits` selector exposed
to the approval flow. Defer Option C to a follow-up; it's an
ergonomic win, not a security property.

This puts the answer to PRD 0012's open question as:

- Same recovery shape (`/request-bottle-change`), distinguishable
  request type. The dashboard renders an exception request
  differently from a manifest-change request because the *diff*
  being approved is to the gate's allowlist, not to the manifest.
- Exceptions are expressed as `(path, content-pattern)` pairs, not
  commit SHAs. Re-pushing different bytes on the same path
  re-triggers the gate.
- The approval is recorded twice for audit: in the PR thread
  (free-text), and as a versioned diff to the gate's allowlist config
  (or the manifest field that materializes into it).

## Cross-references

- PRD 0008 — git-gate design and "no bypass" non-goal.
- PRD 0010 — cred-proxy; the inbound endpoint PRD 0012 reuses for
  exception requests.
- PRD 0012 — stuck-agent recovery flow; the open question this note
  informs.
- `docs/research/git-secret-scanning-hardening.md` — prior research
  on the secret-scanning tool landscape and why gitleaks is the fit.

## Sources

- [gitleaks configuration documentation](https://github.com/gitleaks/gitleaks#configuration)
  — `[allowlist]` selectors (`paths`, `regexes`, `stopwords`,
  `commits`, `regexTarget`, `condition`).
- [AWS example access key (`AKIAIOSFODNN7EXAMPLE`)](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_iam-quotas.html)
  — documented placeholder safe to use in examples without
  triggering most secret scanners.
- `claude_bottle/git_gate.py` — pre-receive hook implementation
  (`gitleaks git --log-opts="$log_opts" --no-banner --redact`, no
  `--config` argument today).