docs: research on git-gate commit approval; link from PRD 0012
@@ -74,7 +74,7 @@ A real stuck agent recovers end-to-end through the flow: the agent hits a missin
- How does the dashboard handle rejection? Does the agent get a comment back saying "denied, here's why," or does the bottle just stay torn down?
- How does the orchestrator know which PR / branch a given bottle maps to — recorded at bottle-spawn time, derived from the working tree, or specified in the manifest?
- Concurrency: if multiple bottles request changes simultaneously, what does the dashboard surface and in what order?
- How does the flow handle one-off exceptions to gitlock / pipelock denials — e.g. a commit that includes docs with intentionally-bogus tokens that the secret scanner correctly flags? The shape (agent blocked → ask via PR comment → user approves → continue) is the same as a manifest-change request, but the *resolution* is different: a per-operation override or a scoped allowlist entry, not a new manifest. Does this fold into the same `/request-bottle-change` slash command with a different request type, or is it a separate slash command (e.g. `/request-gate-exception`)? And how is an "exception" expressed safely — by commit SHA, by content hash, by a narrow allowlist rule? Either way, the approval must be auditable so a future reader can see what was waived and why. See `docs/research/git-gate-commit-approval.md` for a survey of gitleaks's native allowlist primitives and a recommendation.

## References

@@ -0,0 +1,229 @@
# Approving specific commits past git-gate

Research into (1) whether a dashboard or operator surface for the
git-gate (a.k.a. "gitlock", PRD 0008) already exists, and (2) what a
narrowly-scoped approval flow for false-positive gitleaks rejections
could look like without compromising the gate's "if it's bypassable it
isn't a gate" property.

Motivated by PRD 0012's open question: when an agent commits docs
containing intentionally-bogus tokens that the secret scanner
correctly flags, the rejection is correct in the literal sense and
wrong in the user-intent sense, and there is no way to say so.

## Summary

There is no dashboard for the git-gate today. The CLI ships
`init / list / info / start / edit / cleanup` for bottles; the gate is
visible only as a sidecar in `bottle_plan.py`'s preflight rendering.
No `gate` subcommand exists.

There is also no exception mechanism. The pre-receive hook calls
`gitleaks git --log-opts="$range" --no-banner --redact` with no
config path and no allowlist surface. PRD 0008 explicitly rejects
exceptions ("Bypass for trusted commits. No `[skip gitleaks]`
trailer, no allowlist by commit hash. If the gate is bypassable it
isn't a gate.").

That non-goal is correct under its own framing — *any path the agent
can take* invalidates the gate — but it conflates two distinct
questions: can the *agent* bypass the gate (must be no), and can the
*user* approve a narrowly-scoped exception (could be yes, under
constraints). PRD 0012's recovery flow is exactly the seam where a
user-side, out-of-band approval could live without giving the agent
any in-band bypass.

The design problem is therefore not "should there be exceptions" but
"how narrow does an exception have to be before the gate is still a
gate." This note surveys gitleaks's native allowlist primitives,
sketches three approval-scope designs, and recommends a direction.

## Question 1: Is there a dashboard / operator surface for git-gate?

No, in three senses:

- **No CLI subcommand.** `claude_bottle/cli/` has `_common, cleanup,
  edit, info, init, list, start` and nothing gate-specific.
  `claude-bottle list` shows bottles, not their gates' state or
  recent rejections.
- **No gate-side log surface.** Rejections are written to the
  pre-receive hook's stderr (`echo "git-gate: gitleaks rejected push
  to $ref" >&2`); the agent sees the rejection in its `git push`
  output, but nothing persists outside the container's logs.
- **No upstream UI for git-gate.** gitleaks itself is a CLI; it has
  no built-in dashboard. The hosted secret-scanning UIs surveyed in
  `git-secret-scanning-hardening.md` (ggshield, TruffleHog Enterprise)
  are SaaS products that ship repo content to a vendor — explicitly
  the wrong shape for a project whose premise is sandbox isolation.

The PRD 0012 dashboard, when it exists, is the natural place for
git-gate operator surface to live: list pending change requests,
show recent rejections per bottle, render the diff of any
exception-approval request. There is no reason to build a separate
gate dashboard.

## Question 2: How could specific commits be approved?

### What gitleaks gives you natively

Gitleaks's TOML config supports an `[allowlist]` block (or
`[[rules.allowlists]]` per-rule) with four selectors that can be
combined inside a single entry, plus a `condition` key that says how
they combine. The keys observed in current gitleaks (v8) are:

- `paths` — list of regex against file paths.
- `regexes` — list of regex matched against the finding; on match,
  suppress the finding. `regexTarget` chooses whether the regex
  applies to the full match, the surrounding line, or the secret
  group only (the secret is the default target per the gitleaks
  README).
- `stopwords` — substrings that, if present in the finding, suppress
  it. Cheaper than `regexes` for literal matches.
- `commits` — explicit commit SHAs to skip entirely.
- `condition` — `OR` (the default per the gitleaks README) or `AND`
  across the above selectors; `AND` lets an entry require, e.g., both
  a path match *and* a content match before suppressing.

`commits` is the bluntest tool and the easiest to misuse: a single
SHA can hide arbitrary content. `paths + regexes` (with AND) is the
narrowest scope: a finding is only suppressed if it lives at a
specific path *and* matches a specific byte pattern. Because the
default is `OR`, such an entry has to set `condition = "AND"`
explicitly; otherwise either selector alone would suppress. That's
the shape that makes a per-finding exception still defensible.
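
To make the difference between the two `condition` values concrete, here is a minimal Python model of how one allowlist entry could decide whether to suppress a finding. It is a simplification for illustration, not gitleaks's implementation: `Finding`, `AllowlistEntry`, and `suppresses` are hypothetical names, and `regexTarget`, `stopwords`, and `commits` are left out.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Finding:
    """The two facts about a scanner finding that this model inspects."""
    path: str
    secret: str


@dataclass
class AllowlistEntry:
    """Simplified stand-in for one allowlist entry (hypothetical type)."""
    condition: str  # "AND" or "OR", always spelled out in this model
    paths: list = field(default_factory=list)
    regexes: list = field(default_factory=list)


def suppresses(entry: AllowlistEntry, finding: Finding) -> bool:
    """Return True if this model would suppress the finding."""
    checks = []
    if entry.paths:
        checks.append(any(re.search(p, finding.path) for p in entry.paths))
    if entry.regexes:
        checks.append(any(re.search(r, finding.secret) for r in entry.regexes))
    if not checks:
        return False  # an entry with no selectors suppresses nothing
    return all(checks) if entry.condition == "AND" else any(checks)


selectors = dict(paths=[r"^docs/auth\.md$"], regexes=[r"^AKIAIOSFODNN7EXAMPLE$"])
narrow = AllowlistEntry(condition="AND", **selectors)
broad = AllowlistEntry(condition="OR", **selectors)

in_docs = Finding(path="docs/auth.md", secret="AKIAIOSFODNN7EXAMPLE")
in_code = Finding(path="src/settings.py", secret="AKIAIOSFODNN7EXAMPLE")
```

In this model `narrow` suppresses `in_docs` but not `in_code`, while `broad` suppresses both: with `OR`, the content pattern alone is enough, so the same token is waived anywhere in the repo.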

The hook today does not pass `--config` to gitleaks. Adding it would
mean baking a config file into the gate image *or* mounting one in
at `start` time. The image is built per `DockerGitGate.start`, so
either is mechanically straightforward.
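
As a rough sketch of what the change amounts to, the function below builds the gitleaks argument list with and without a config file. The real hook is a shell script, so this Python rendering is only illustrative; `gitleaks_argv` and the `/etc/git-gate/gitleaks.toml` path are hypothetical names, while the flags are the ones the hook already uses plus gitleaks's `--config`.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def gitleaks_argv(log_opts: str, config: Optional[PurePosixPath] = None) -> List[str]:
    """Build the argv for the gate's scan, optionally pointing at a config."""
    argv = ["gitleaks", "git", f"--log-opts={log_opts}", "--no-banner", "--redact"]
    if config is not None:
        # Hypothetical addition: today's hook never passes --config.
        argv += ["--config", str(config)]
    return argv


today = gitleaks_argv("main..feature")
with_config = gitleaks_argv("main..feature", PurePosixPath("/etc/git-gate/gitleaks.toml"))
```

One thing to check when wiring this up: per the gitleaks README, a custom config replaces the default rule set unless it extends it (`[extend]` with `useDefault = true`), so an allowlist-only file would likely need that stanza to keep the existing detections.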

### The design tension

PRD 0008's "no bypass for trusted commits" non-goal is load-bearing
*against the agent*. It is not load-bearing against the user, who
already has every privilege the gate is trying to deny the agent.
The risk of letting the user approve exceptions is not direct (the
user can already do whatever they want); it is indirect:

- **Prompt-injection laundering.** An attacker who has captured the
  agent's prompt-stream can ask the agent to *request* an exception
  that looks plausible ("I just need to commit the test fixture for
  the new auth flow"). If the user rubber-stamps the request, the
  attacker has used the user as a bypass channel. This is the same
  risk as any human-in-the-loop control: it degrades to "no control"
  if the human always says yes.
- **Scope creep of a granted exception.** A commit-SHA allowlist
  approved for one commit could, in principle, be re-targeted at a
  different commit if the allowlist isn't tied to the content. This
  is why `commits` alone is unsafe; `paths + regexes` is the form
  that survives content-substitution.
- **Persistence past intent.** An exception granted "just for this
  commit" that stays in the gate's config indefinitely is no longer
  a per-commit exception; it's a permanent allowlist entry. Without
  TTL or a clean teardown, exceptions accrete.

These three risks shape the design constraints below.

### Three design options

**Option A — Reject and rotate.** Treat every gitleaks hit as
"rewrite the commit to not contain the literal token, then re-push."
For docs with fake tokens, use a sentinel string the repo's
gitleaks config recognizes as obviously not a real secret (e.g.
`AKIAIOSFODNN7EXAMPLE`, AWS's documented example key, or a
project-specific placeholder like `<aws-access-key-id>`).

- *Cost:* zero. No new code.
- *Property:* gate stays unbypassable in both senses.
- *Friction:* every author must know the placeholder convention. The
  first time someone pastes a realistic-looking fake into a doc,
  they get rejected and have to redo the commit. Probably fine for
  the host repo; less fine for bottles authoring third-party content.
- *Verdict:* this should be the *default*. The exception flow exists
  only for cases where Option A genuinely fails (e.g. the example is
  specifically about a real-looking token format, or the upstream
  doc requires the literal pattern).
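
For authors who would rather not memorize the convention, the rewrite step of Option A could be mechanized. The sketch below swaps anything shaped like an AWS access key ID for the documented example key before committing. The helper name `use_placeholder` is hypothetical, and the pattern is a deliberately simplified assumption (the prefix `AKIA` followed by 16 uppercase letters or digits), not the rule gitleaks actually ships.

```python
import re

# Simplified, assumed shape of an AWS access key ID; gitleaks's own rule differs.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

# AWS's documented example key, as cited in this note.
AWS_EXAMPLE_KEY_ID = "AKIAIOSFODNN7EXAMPLE"


def use_placeholder(text: str) -> str:
    """Replace every AWS-key-shaped token with the documented example key."""
    return AWS_KEY_ID.sub(AWS_EXAMPLE_KEY_ID, text)


doc_line = "export AWS_ACCESS_KEY_ID=AKIAABCDEFGHIJKLMNOP"
rewritten = use_placeholder(doc_line)
```

The function is idempotent (the example key matches the pattern and is replaced by itself), so running it twice over a doc is harmless. It does nothing for other token formats, which is where the friction described above remains.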

**Option B — Per-finding narrow allowlist via PRD 0012 flow.** When
the agent's push is rejected, the agent invokes
`/request-gate-exception` (or `/request-bottle-change` with an
exception variant). The slash command POSTs to the cred-proxy
endpoint, carrying:

- the file path that triggered the finding
- the finding's matched-byte hash (not the bytes themselves, to keep
  the request artifact non-secret on its own)
- the gitleaks rule ID
- a free-text justification ("docs example for AWS auth flow")
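
The four fields above could be carried in a small JSON body. The following is a sketch under stated assumptions: `GateExceptionRequest`, `build_request`, and `to_json` are hypothetical names, SHA-256 is an assumed choice of hash, and the rule ID in the usage line is only an example value. Nothing here reflects an existing cred-proxy schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GateExceptionRequest:
    """Hypothetical payload for an exception request."""
    path: str
    match_sha256: str
    rule_id: str
    justification: str


def build_request(path: str, matched: bytes, rule_id: str, justification: str) -> GateExceptionRequest:
    """Hash the matched bytes so the request never carries them verbatim."""
    digest = hashlib.sha256(matched).hexdigest()
    return GateExceptionRequest(path, digest, rule_id, justification)


def to_json(request: GateExceptionRequest) -> str:
    """Serialize with sorted keys so identical requests produce identical text."""
    return json.dumps(asdict(request), sort_keys=True)


request = build_request(
    "docs/auth.md",
    b"AKIAIOSFODNN7EXAMPLE",
    "aws-access-token",  # example rule ID, for illustration only
    "docs example for AWS auth flow",
)
body = to_json(request)
```

The hash identifies which finding is being waived without reproducing it, so the dashboard can check it against the bytes it reads from the rejected diff.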

The user reviews the request in the dashboard, sees the file and the
diff, and approves an entry of shape `{ paths: [<exact path>],
regexes: [<exact-match regex over matched bytes>], condition: AND }`.
The gate restarts with that config entry merged into its
`.gitleaks.toml`. A future commit on the same path with a *different*
finding still hits the gate and rejects.
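
A minimal sketch of how an approved exception might be rendered into that entry shape follows. Several things are assumptions: `exact_regex` and `render_entry` are hypothetical helpers, the dashboard is assumed to have the matched bytes from the diff it displays, `regexTarget = "match"` is one plausible reading of "regex over matched bytes", and Python's `re.escape` output is assumed to be accepted by gitleaks's regex engine for typical tokens. Where the entry attaches within the full config is left out.

```python
import json
import re


def exact_regex(literal: str) -> str:
    """Anchored regex that matches the literal text and nothing else."""
    return "^" + re.escape(literal) + "$"


def render_entry(path: str, matched: str, description: str) -> str:
    """Render one content-locked allowlist entry as TOML text."""
    lines = [
        "[[rules.allowlists]]",
        f"description = {json.dumps(description)}",
        'condition = "AND"',  # both selectors must match before suppressing
        'regexTarget = "match"',  # assumption: compare against the full match
        f"paths = [{json.dumps(exact_regex(path))}]",
        f"regexes = [{json.dumps(exact_regex(matched))}]",
    ]
    return "\n".join(lines) + "\n"


entry = render_entry(
    "docs/auth.md",
    "AKIAIOSFODNN7EXAMPLE",
    "docs example for AWS auth flow",
)
```

`json.dumps` is used here only as a convenient way to emit a double-quoted, backslash-escaped string; for ordinary ASCII paths and tokens that is also a valid TOML basic string. No `commits` key is ever emitted, which keeps the entry tied to content and not to a SHA.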

- *Property:* approved exceptions are content-locked, not
  commit-locked. Substituting bytes on the same path triggers a fresh
  rejection.
- *Auditability:* the approval is a manifest diff; it lives in git
  history and in the PR conversation thread per PRD 0012.
- *Open: TTL.* Should the entry expire? Plausible defaults: never
  (it's content-locked anyway), or "until the next manifest version
  bump." Lean "never" for v1; revisit if exception lists balloon.

**Option C — Pre-flight scan with author signoff.** Run gitleaks
client-side inside the bottle (as a non-gating advisory check) so
the agent sees findings *before* attempting the push. The slash
command then includes the pre-known findings; the dashboard shows
the user the finding inline rather than having to go look at the
rejection log. On approval, same Option-B-style allowlist entry
gets added.

- *Property:* identical end-state to Option B; better UX because
  the agent stops before the rejected push, not after.
- *Cost:* one more place that needs gitleaks installed (the bottle
  image), and an in-bottle advisory check that the agent can in
  principle ignore. That's fine because it's *advisory* — the gate
  still rejects; the in-bottle check just avoids one round-trip.
- *Verdict:* nice-to-have over Option B, not a substitute.
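
The advisory check could hand its findings to the slash command by reading gitleaks's JSON report (`--report-format json` with `--report-path`). The sketch below reduces such a report to the fields an exception request needs. `advisory_findings` is a hypothetical helper, and the field names `File`, `RuleID`, and `StartLine` are assumptions about the report format that should be checked against the gitleaks version in the bottle image.

```python
import json
from typing import Dict, List


def advisory_findings(report_json: str) -> List[Dict[str, object]]:
    """Reduce a gitleaks JSON report to path, rule ID, and line per finding."""
    findings = json.loads(report_json) or []  # tolerate an empty or null report
    return [
        {"path": f["File"], "rule_id": f["RuleID"], "line": f["StartLine"]}
        for f in findings
    ]


sample_report = json.dumps([
    {"RuleID": "aws-access-token", "File": "docs/auth.md",
     "StartLine": 12, "Secret": "REDACTED"},
])
pending = advisory_findings(sample_report)
```

The secret itself is deliberately not copied into the result, in keeping with Option B's rule that the request artifact stays non-secret.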

### Recommendation

Default to Option A as the canonical answer ("rewrite to use a
placeholder"). Build Option B as the PRD 0012 exception path, scoped
narrowly: `paths + regexes` with AND, no `commits` selector exposed
to the approval flow. Defer Option C to a follow-up; it's an
ergonomic win, not a security property.

This puts the answer to PRD 0012's open question as:

- Same recovery shape (`/request-bottle-change`), distinguishable
  request type. The dashboard renders an exception request
  differently from a manifest-change request because the *diff*
  being approved is to the gate's allowlist, not to the manifest.
- Exceptions are expressed as `(path, content-pattern)` pairs, not
  commit SHAs. Re-pushing different bytes on the same path
  re-triggers the gate.
- The approval is recorded twice for audit: in the PR thread
  (free-text), and as a versioned diff to the gate's allowlist config
  (or the manifest field that materializes into it).
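
For the second audit record, the dashboard could show the reviewer exactly what would change in the allowlist config. A minimal sketch using Python's standard `difflib` follows; `allowlist_diff` is a hypothetical helper, and the `.gitleaks.toml` file name is taken from the Option B description.

```python
import difflib


def allowlist_diff(before: str, after: str, name: str = ".gitleaks.toml") -> str:
    """Return a unified diff between two versions of the allowlist config."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(diff)


current = ""
proposed = '[[rules.allowlists]]\ncondition = "AND"\n'
review_diff = allowlist_diff(current, proposed)
```

An empty result means the approval would change nothing, which is a cheap guard against re-approving an exception that is already in place.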

## Cross-references

- PRD 0008 — git-gate design and "no bypass" non-goal.
- PRD 0010 — cred-proxy; the inbound endpoint PRD 0012 reuses for
  exception requests.
- PRD 0012 — stuck-agent recovery flow; the open question this note
  informs.
- `docs/research/git-secret-scanning-hardening.md` — prior research
  on the secret-scanning tool landscape and why gitleaks is the fit.

## Sources

- [gitleaks configuration documentation](https://github.com/gitleaks/gitleaks#configuration)
  — `[allowlist]` selectors (`paths`, `regexes`, `stopwords`,
  `commits`, `regexTarget`, `condition`).
- [AWS example access key (`AKIAIOSFODNN7EXAMPLE`)](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_iam-quotas.html)
  — documented placeholder safe to use in examples without
  triggering most secret scanners.
- `claude_bottle/git_gate.py` — pre-receive hook implementation
  (`gitleaks git --log-opts="$log_opts" --no-banner --redact`, no
  `--config` argument today).