docs(research): survey gitleaks dashboards + add baseline-file primitive
test / unit (pull_request) Successful in 13s
test / integration (pull_request) Successful in 24s

2026-05-24 23:54:46 -04:00
parent a74dd2b97f
commit c33930290f
+244 -104
@@ -13,86 +13,200 @@ wrong in the user-intent sense, and there is no way to say so.
## Summary
There is no dashboard for the git-gate today. The CLI ships
`init / list / info / start / edit / cleanup` for bottles; the gate is
visible only as a sidecar in `bottle_plan.py`'s preflight rendering.
No `gate` subcommand exists.
No off-the-shelf dashboard fits the shape claude-bottle needs
(per-bottle, host-local, integrated into a pre-receive rejection
with approval feeding back into the gate's own decision). Gitleaks
itself is a CLI with no UI and was declared **feature-complete** in
early 2026; the author's successor project **Betterleaks** is
explicitly "for the agentic era" but is also CLI-shaped and still
young. The closest open-source dashboard is **DefectDojo**, which
ingests gitleaks JSON but is post-hoc and org-scale — its "marked
as accepted" state does not feed back into the scanner. SaaS
dashboards (GitGuardian, TruffleHog Enterprise) ship repo content
to a vendor and were already disqualified by
`git-secret-scanning-hardening.md`.
There is also no exception mechanism. The pre-receive hook calls
`gitleaks git --log-opts="$range" --no-banner --redact` with no
config path and no allowlist surface. PRD 0008 explicitly rejects
exceptions ("Bypass for trusted commits. No `[skip gitleaks]`
trailer, no allowlist by commit hash. If the gate is bypassable it
isn't a gate.").
The git-gate ships no exception mechanism today: the pre-receive
hook calls `gitleaks git --log-opts="$range" --no-banner --redact`
with no `--config` and no `--baseline-path`, and PRD 0008
explicitly rejects exceptions ("Bypass for trusted commits. No
`[skip gitleaks]` trailer, no allowlist by commit hash. If the
gate is bypassable it isn't a gate.").
That non-goal is correct under its own framing — *any path the agent
can take* invalidates the gate — but it conflates two distinct
questions: can the *agent* bypass the gate (must be no), and can the
*user* approve a narrowly-scoped exception (could be yes, under
constraints). PRD 0012's recovery flow is exactly the seam where a
user-side, out-of-band approval could live without giving the agent
any in-band bypass.
That non-goal is correct against the *agent* but conflates two
questions: can the *agent* bypass the gate (must be no), and can
the *user* approve a narrowly-scoped exception out-of-band (could
be yes). PRD 0012's recovery flow is exactly the seam where the
user-side approval can live without giving the agent any in-band
bypass.
The design problem is therefore not "should there be exceptions" but
"how narrow does an exception have to be before the gate is still a
gate." This note surveys gitleaks's native allowlist primitives,
sketches three approval-scope designs, and recommends a direction.
Gitleaks does ship one native primitive that maps well to "approve
this specific finding" — the **baseline file** — which is
semantically a better fit for per-finding approval than the
allowlist config (a suppression *rule*). This note surveys the
dashboard landscape, the two native primitives (allowlist and
baseline), and recommends a direction.
## Question 1: Is there a dashboard / operator surface for git-gate?
## Question 1: Existing dashboards and control surfaces
No, in three senses:
### Inside claude-bottle today
- **No CLI subcommand.** `claude_bottle/cli/` has `_common, cleanup,
edit, info, init, list, start` and nothing gate-specific.
`claude-bottle list` shows bottles, not their gates' state or
recent rejections.
- **No gate-side log surface.** Rejections are written to the
pre-receive hook's stderr (`echo "git-gate: gitleaks rejected push
to $ref" >&2`); the agent sees the rejection in its `git push`
output, but nothing persists outside the container's logs.
- **No upstream UI for git-gate.** gitleaks itself is a CLI; it has
no built-in dashboard. The hosted secret-scanning UIs surveyed in
`git-secret-scanning-hardening.md` (ggshield, TruffleHog Enterprise)
are SaaS products that ship repo content to a vendor — explicitly
the wrong shape for a project whose premise is sandbox isolation.
`claude_bottle/cli/` has `_common, cleanup, edit, info, init, list,
start` — nothing gate-specific. The gate appears only as a sidecar
in `bottle_plan.py`'s preflight rendering. Rejections are written
to the pre-receive hook's stderr (`echo "git-gate: gitleaks
rejected push to $ref" >&2`) and surface only in the agent's
`git push` output — nothing persists outside the container's logs.
The PRD 0012 dashboard, when it exists, is the natural place for
git-gate operator surface to live: list pending change requests,
show recent rejections per bottle, render the diff of any
exception-approval request. There is no reason to build a separate
gate dashboard.
### Native gitleaks: CLI-only, and now feature-complete
Gitleaks has no built-in dashboard or web UI. As of early 2026 the
project has been declared **feature complete** — only security
patches will be merged going forward. The original maintainer
(Zachary Rice) has moved active work to Betterleaks (below), so
any dashboard built directly against gitleaks should treat the
gitleaks surface as frozen rather than evolving.
### Betterleaks: the same author's "agentic era" successor
Started February 2026 and explicitly framed for AI agents driving
the scanner: flag-based output for low-token-overhead consumption,
parallelized Git scanning, CEL-based filtering in place of the
TOML allowlist, and a roadmap that includes LLM-assisted
classification and automatic secret revocation via provider APIs.
Still CLI-shaped — no dashboard either.
Relevant to claude-bottle in two ways:
- The upstream direction of travel is *toward* agent-driven
scanners, which makes "the bottle invokes a scanner and reports
findings up" a supported pattern rather than a hack.
- CEL is a richer expression language for filter entries than
gitleaks's selector struct, which loosens the design space for
Option B (below). If claude-bottle ever swaps gitleaks for
Betterleaks, the approval-flow design should be expressible in
both.
### Output formats: SARIF + viewers
Both gitleaks and Betterleaks can emit SARIF. That plugs into
GitHub Advanced Security's Code Scanning tab (read-only viewer
with a dismiss-as-not-a-problem state) and assorted open-source
SARIF viewers (`sarif-web-component`, Microsoft's VS Code
extension). These render findings; they do not handle approval
state or feed back into the scanner. Useful for *seeing* findings;
not useful as the approval surface.
### Findings aggregators
[**DefectDojo**](https://defectdojo.com/integrations/gitleaks) is
the closest open-source thing to "a dashboard for gitleaks." It
ingests gitleaks JSON (and ~200 other scanners), aggregates and
deduplicates, lets you triage and mark findings as accepted or
false-positive in its UI, and tracks remediation state. Designed
for org-scale: one DefectDojo instance covers many repos and
scanners.
Shape mismatch for claude-bottle:
- DefectDojo's review state is *informational* — marking a finding
as accepted in DefectDojo does not write to gitleaks's allowlist
or baseline and does not change what the gate decides on the
next push.
- It expects findings as artifacts of CI runs, not as the
rejection-cause of an in-flight push.
- A single shared instance violates the one-sidecar-per-bottle
posture; per-bottle DefectDojo instances are absurd overhead.
Useful to know it exists, especially for long-term post-hoc
finding tracking. Not the v1 answer for the in-flight approval
flow PRD 0012 needs.
A separate [JupiterOne integration](https://github.com/gitleaks-findings/gitleaks)
exists but ships findings to JupiterOne's commercial platform and
has effectively zero public adoption (0 stars, 0 forks). Mentioned
only because its repo name suggests a gitleaks dashboard, which it
is not.
### SaaS dashboards (disqualified by sandbox premise)
GitGuardian / ggshield and TruffleHog Enterprise both offer
incident-triage UIs with finding-level approval state. Both ship
repo content to a vendor; already disqualified in
`git-secret-scanning-hardening.md` for a project whose entire
premise is sandbox isolation.
### Bottom line
No off-the-shelf dashboard fits claude-bottle's shape: per-bottle,
host-local, integrated into a pre-receive rejection with the
approval feeding back into the gate's own decision-making. The
nearest open-source analogue (DefectDojo) is post-hoc and
org-scale; the nearest UX (GitGuardian) is SaaS. The PRD 0012
dashboard — sharing surface with the broader stuck-agent recovery
flow — remains the right place to build this.
## Question 2: How could specific commits be approved?
### What gitleaks gives you natively
Gitleaks's TOML config supports an `[allowlist]` block (or
`[[rules.allowlists]]` per-rule) with four selectors that can be
combined inside a single entry. The selectors observed in current
gitleaks (v8) are:
Two distinct primitives, and the distinction matters for designing
an approval flow.
**Allowlists** are *suppression rules* — config-level patterns that
say "ignore findings matching X." Gitleaks's TOML config supports
an `[allowlist]` block (or `[[rules.allowlists]]` per-rule) with
four selectors:
- `paths` — list of regex against file paths.
- `regexes` — list of regex matched against the *finding's* matched
bytes; on match, suppress the finding. `regexTarget` chooses
whether the regex applies to the matched bytes, the surrounding
line, or the secret group only.
- `stopwords` — substrings that, if present in the finding, suppress
it. Cheaper than `regexes` for literal matches.
- `regexes` — list of regex matched against the finding bytes;
`regexTarget` directs the regex at the extracted secret
(default), the entire regex match, or the whole line.
- `stopwords` — substrings that, if present, suppress the finding.
- `commits` — explicit commit SHAs to skip entirely.
- `condition` — `AND` (default) or `OR` across the above selectors,
letting an entry require, e.g., both a path match *and* a content
match before suppressing.
`commits` is the bluntest tool and the easiest to misuse: a single
SHA can hide arbitrary content. `paths + regexes` (with AND) is the
narrowest scope: a finding is only suppressed if it lives at a
specific path *and* matches a specific byte pattern. That's the
shape that makes a per-finding exception still defensible.
Selectors combine with `condition = "OR"` (default; suppress if any
selector matches) or `condition = "AND"` (suppress only if all
match). `commits` is the bluntest tool and the easiest to misuse:
a single SHA can hide arbitrary content. `paths + regexes` with
AND is the narrowest scope, and the form that makes a per-finding
exception still defensible.
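
To make the AND / OR difference concrete, here is a minimal sketch of how a single entry's selectors might combine. It is a simplified model, not gitleaks's implementation: `AllowlistEntry` and `suppresses` are hypothetical names, only `paths` and `regexes` are modeled, and `regexTarget`, `stopwords`, and `commits` are ignored.

```python
import re
from dataclasses import dataclass, field


@dataclass
class AllowlistEntry:
    """Hypothetical stand-in for one allowlist entry (not gitleaks's struct)."""
    paths: list[str] = field(default_factory=list)
    regexes: list[str] = field(default_factory=list)
    condition: str = "OR"  # default, per the description above


def suppresses(entry: AllowlistEntry, file_path: str, secret: str) -> bool:
    """Return True if this simplified entry would suppress the finding."""
    checks = []
    if entry.paths:
        checks.append(any(re.search(p, file_path) for p in entry.paths))
    if entry.regexes:
        checks.append(any(re.search(r, secret) for r in entry.regexes))
    if not checks:
        return False  # an entry with no selectors suppresses nothing
    return all(checks) if entry.condition == "AND" else any(checks)


# Illustrative values only: the path is made up for this example.
selectors = dict(
    paths=[r"^docs/aws-auth\.md$"],
    regexes=[r"^AKIAIOSFODNN7EXAMPLE$"],
)
narrow = AllowlistEntry(condition="AND", **selectors)
broad = AllowlistEntry(condition="OR", **selectors)

print(suppresses(narrow, "docs/aws-auth.md", "AKIAIOSFODNN7EXAMPLE"))  # True
print(suppresses(narrow, "docs/aws-auth.md", "a-different-secret"))    # False
print(suppresses(broad, "docs/aws-auth.md", "a-different-secret"))     # True
```

With OR, the path match alone is enough to hide a different secret in the same file; with AND, the same selectors only cover the one documented placeholder.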
The hook today does not pass `--config` to gitleaks. Adding it would
mean baking a config file into the gate image *or* mounting one in
at `start` time. The image is built per `DockerGitGate.start`, so
either is mechanically straightforward.
**Baselines** are a *known-findings list* — a JSON file of
previously detected findings that gitleaks's `IsNew` function
compares against on the next scan, so only new findings get
reported. The file is generated by saving a scan's JSON output and
fed back in via `--baseline-path`. The comparison checks RuleID,
description, file path, line numbers, secret content, commit, and
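author/timestamp. When `--redact` is enabled, redacted Secret and
Match fields are ignored in the comparison so the baseline still
functions with redacted reports. Because the hook runs with
`--redact`, a baseline entry in the gate is matched on rule,
location, and commit metadata rather than on the secret bytes
themselves.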
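
The comparison can be modeled roughly as below. This is a sketch of the behavior described above, not gitleaks's source: the field list is an approximation (the real `IsNew` may compare more fields), the key names assume gitleaks's JSON report keys, and `is_new` is a hypothetical Python stand-in.

```python
# Fields assumed to be compared; an approximation of the list above.
COMPARED_FIELDS = (
    "RuleID", "Description", "File", "StartLine", "EndLine",
    "Commit", "Author", "Date",
)
# Fields that are skipped when the report was produced with --redact.
REDACTABLE_FIELDS = ("Secret", "Match")


def is_new(finding: dict, baseline: list[dict], redact: bool = False) -> bool:
    """Return False if an equivalent finding is already in the baseline."""
    fields = COMPARED_FIELDS if redact else COMPARED_FIELDS + REDACTABLE_FIELDS
    for known in baseline:
        if all(finding.get(f) == known.get(f) for f in fields):
            return False
    return True


# Illustrative finding; every value here is made up for the example.
seen = {
    "RuleID": "aws-access-token",
    "Description": "example finding",
    "File": "docs/aws-auth.md",
    "StartLine": 12,
    "EndLine": 12,
    "Commit": "0000000000000000000000000000000000000000",
    "Author": "example",
    "Date": "2026-01-01T00:00:00Z",
    "Secret": "AKIAIOSFODNN7EXAMPLE",
    "Match": "key = AKIAIOSFODNN7EXAMPLE",
}
baseline = [seen]

moved = {**seen, "StartLine": 40, "EndLine": 40}
other_secret = {**seen, "Secret": "something-else", "Match": "key = something-else"}

print(is_new(seen, baseline))                       # False
print(is_new(moved, baseline))                      # True
print(is_new(other_secret, baseline))               # True
print(is_new(other_secret, baseline, redact=True))  # False
```

The last case is the one to keep in mind for the gate: under redaction, the secret bytes no longer distinguish two findings, so the commit and location fields carry the pinning.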
Detection flow is: global allowlist → rule-specific allowlist →
baseline → reported finding. Allowlist suppressions therefore win
over baseline; baseline is the last gate before report.
The hook today passes neither `--config` nor `--baseline-path`.
Wiring either in is mechanically straightforward: the gate image
is built per `DockerGitGate.start`, so the config / baseline can be
baked into the image *or* mounted in at start.
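
As a rough illustration of that wiring, the hook's command line could be assembled as follows. `gitleaks_argv` is a hypothetical helper, not something in `git_gate.py`, and the file paths in the usage lines are placeholders; only the flags already named in this note are used.

```python
from typing import Optional


def gitleaks_argv(
    log_opts: str,
    config: Optional[str] = None,
    baseline: Optional[str] = None,
) -> list[str]:
    """Build the scan command; optional flags are added only when set."""
    argv = ["gitleaks", "git", f"--log-opts={log_opts}", "--no-banner", "--redact"]
    if config is not None:
        argv += ["--config", config]
    if baseline is not None:
        argv += ["--baseline-path", baseline]
    return argv


# Today's behavior: no config, no baseline.
print(gitleaks_argv("abc123..def456"))

# With both files mounted at placeholder paths.
print(gitleaks_argv(
    "abc123..def456",
    config="/etc/git-gate/gitleaks.toml",
    baseline="/etc/git-gate/baseline.json",
))
```

Keeping both flags optional means a bottle with no approvals runs exactly the command the hook runs today.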
**Allowlist vs baseline for approval storage.** Both can express
"don't reject this finding," but they imply different things about
intent:
- An *allowlist* entry says "any future finding that matches this
pattern is fine." Generative: it covers findings that don't
exist yet on commits that haven't been made.
- A *baseline* entry says "this exact finding I've already seen is
fine." Specific: it pins to the bytes / location / rule of one
observed finding; a different finding on the same path on a
later commit re-triggers.
For a per-commit user approval, baseline is the better semantic
match: each approval is an attestation about one observed finding,
not a rule that pre-approves a pattern. Baseline entries can also
be diffed in PRs trivially (the file is a JSON list), so they double
as the audit record.
### The design tension
@@ -141,40 +255,41 @@ specific placeholder like `<aws-access-key-id>`).
specifically about a real-looking token format, or the upstream
doc requires the literal pattern).
**Option B — Per-finding narrow allowlist via PRD 0012 flow.** When
the agent's push is rejected, the agent invokes
**Option B — Per-finding approval via PRD 0012 flow.** When the
agent's push is rejected, the agent invokes
`/request-gate-exception` (or `/request-bottle-change` with an
exception variant). The slash command POSTs to the cred-proxy
endpoint, carrying:
endpoint, carrying the gitleaks finding record (rule ID, file path,
line, redacted match) and a free-text justification ("docs example
for AWS auth flow").
- the file path that triggered the finding
- the finding's matched-byte hash (not the bytes themselves, to keep
the request artifact non-secret on its own)
- the gitleaks rule ID
- a free-text justification ("docs example for AWS auth flow")
The user reviews the request in the dashboard, sees the file and
the diff, and approves. The approval gets written into the gate's
**baseline file** — the JSON list of known-OK findings the gate
passes as `--baseline-path` to gitleaks. The gate restarts with
the new baseline.
The user reviews the request in the dashboard, sees the file and the
diff, and approves an entry of shape `{ paths: [<exact path>],
regexes: [<exact-match regex over matched bytes>], condition: AND }`.
The gate restarts with that config entry merged into its
`.gitleaks.toml`. A future commit on the same path with a *different*
finding still hits the gate and rejects.
- *Property:* approved exceptions are content-locked, not commit-
locked. Substituting bytes on the same path triggers a fresh
rejection.
- *Auditability:* the approval is a manifest diff; it lives in git
history and in the PR conversation thread per PRD 0012.
- *Open: TTL.* Should the entry expire? Plausible defaults: never
(it's content-locked anyway), or "until the next manifest version
bump." Lean "never" for v1; revisit if exception lists balloon.
- *Property:* approved findings are pinned to the specific
observed bytes / path / rule. A different secret on the same
path on a later commit re-triggers the gate.
- *Auditability:* the baseline file is JSON in git history; each PR
approval becomes a diff to that file. The free-text
justification lives in the PR thread per PRD 0012.
- *Fallback to allowlist for canonical cases.* If a particular
fixture file should be permanently understood as "examples only,"
the user can promote a baseline entry to an `[allowlist]` rule
with `paths + regexes` AND — explicit generalization, opt-in by
the user, never by the agent.
- *Open: TTL.* Should baseline entries expire? Baseline is specific
by construction, so the case for expiration is weaker than for
allowlist. Lean "never" for v1; revisit if baselines balloon.
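
The approval step of Option B might look something like the sketch below. It assumes the baseline file is a JSON array of finding records, as in a saved gitleaks JSON report; `ExceptionRequest` and `approve` are hypothetical names, and the finding values are illustrative.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ExceptionRequest:
    """Hypothetical request shape carried by the slash command."""
    finding: dict        # one (redacted) record from the gitleaks JSON report
    justification: str   # free text; lives in the PR thread, not the baseline


def approve(request: ExceptionRequest, baseline_path: Path) -> bool:
    """Append the finding to the baseline; return False if already present."""
    entries = json.loads(baseline_path.read_text()) if baseline_path.exists() else []
    if request.finding in entries:
        return False
    entries.append(request.finding)
    # Stable key order and indentation keep the resulting diff small.
    baseline_path.write_text(json.dumps(entries, indent=2, sort_keys=True) + "\n")
    return True


# Demo against a throwaway directory; all values are made up.
with tempfile.TemporaryDirectory() as tmp:
    baseline = Path(tmp) / "baseline.json"
    req = ExceptionRequest(
        finding={
            "RuleID": "aws-access-token",
            "File": "docs/aws-auth.md",
            "StartLine": 12,
            "Secret": "REDACTED",
        },
        justification="docs example for AWS auth flow",
    )
    first = approve(req, baseline)
    second = approve(req, baseline)
    stored = json.loads(baseline.read_text())

print(first, second, len(stored))  # True False 1
```

Only the user-side dashboard would call something like `approve`; the agent's part ends at submitting the request.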
**Option C — Pre-flight scan with author signoff.** Run gitleaks
client-side inside the bottle (as a non-gating advisory check) so
the agent sees findings *before* attempting the push. The slash
command then includes the pre-known findings; the dashboard shows
the user the finding inline rather than having to go look at the
rejection log. On approval, same Option-B-style allowlist entry
rejection log. On approval, same Option-B-style baseline entry
gets added.
- *Property:* identical end-state to Option B; better UX because
@@ -188,23 +303,28 @@ gets added.
### Recommendation
Default to Option A as the canonical answer ("rewrite to use a
placeholder"). Build Option B as the PRD 0012 exception path, scoped
narrowly: `paths + regexes` with AND, no `commits` selector exposed
to the approval flow. Defer Option C to a follow-up; it's an
ergonomic win, not a security property.
placeholder"). Build Option B as the PRD 0012 exception path,
storing approvals in the gate's **baseline file** (not in an
allowlist rule). Baseline is the right primitive because each
approval is an attestation about one observed finding, not a
generative pattern. Allowlist promotion is a separate, user-
initiated escalation for cases that genuinely deserve patterning.
The `commits` selector is never exposed to the approval flow under
either path — it hides arbitrary content. Defer Option C to a
follow-up; it's an ergonomic win, not a security property.
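
One way to enforce the "no `commits` selector" rule on allowlist promotion is a small validation step before any entry is written. This is a sketch under assumptions: `validate_promotion` is a hypothetical helper, and the entry is modeled as a plain dict mirroring the selector names used in this note.

```python
def validate_promotion(entry: dict) -> None:
    """Raise ValueError unless the entry is a narrow paths+regexes AND rule."""
    if "commits" in entry:
        raise ValueError("commits selector is not allowed in approvals")
    if entry.get("condition") != "AND":
        raise ValueError("condition must be AND")
    if not entry.get("paths") or not entry.get("regexes"):
        raise ValueError("both paths and regexes are required")


def is_valid(entry: dict) -> bool:
    """Convenience wrapper that reports validity as a boolean."""
    try:
        validate_promotion(entry)
    except ValueError:
        return False
    return True


# Illustrative entries only.
ok = {
    "paths": [r"^docs/aws-auth\.md$"],
    "regexes": [r"^AKIAIOSFODNN7EXAMPLE$"],
    "condition": "AND",
}
print(is_valid(ok))                                  # True
print(is_valid({**ok, "commits": ["deadbeef"]}))     # False
print(is_valid({**ok, "condition": "OR"}))           # False
print(is_valid({"paths": ok["paths"], "condition": "AND"}))  # False
```

Rejecting on the presence of the `commits` key, even when empty, keeps the check simple and leaves no room for a request to smuggle one in.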
This answers PRD 0012's open question as follows:
- Same recovery shape (`/request-bottle-change`), distinguishable
request type. The dashboard renders an exception request
differently from a manifest-change request because the *diff*
being approved is to the gate's allowlist, not to the manifest.
- Exceptions are expressed as `(path, content-pattern)` pairs, not
commit SHAs. Re-pushing different bytes on the same path
re-triggers the gate.
- The approval is recorded twice for audit: in the PR thread (free-
text), and as a versioned diff to the gate's allowlist config (or
the manifest field that materializes into it).
being approved is to the gate's baseline file, not to the
manifest.
- Exceptions are expressed as baseline-file entries — finding-
specific JSON records — not commit SHAs or regex patterns.
- The approval is recorded twice for audit: in the PR thread
(free-text justification), and as a versioned diff to the
baseline file (which is committed alongside the manifest).
## Cross-references
@@ -218,12 +338,32 @@ This puts the answer to PRD 0012's open question as:
## Sources
- [gitleaks configuration documentation](https://github.com/gitleaks/gitleaks#configuration)
— `[allowlist]` selectors (`paths`, `regexes`, `stopwords`,
`commits`, `regexTarget`, `condition`).
- [gitleaks repository](https://github.com/gitleaks/gitleaks) —
`[allowlist]` selectors (`paths`, `regexes`, `stopwords`,
`commits`, `regexTarget`, `condition`); also home of the
feature-complete notice.
- [Gitleaks allowlists & baselines (DeepWiki)](https://deepwiki.com/gitleaks/gitleaks/4.4-allowlists-and-baselines)
— detailed walk-through of the allowlist selector struct, the
baseline file format, the `IsNew` comparison logic, and the
global→rule→baseline detection order. Primary source for the
allowlist-vs-baseline distinction this note rests on.
- [Betterleaks (GitHub)](https://github.com/betterleaks/betterleaks)
— Zachary Rice's successor project; CEL filtering, agent-driven
output design, roadmap for LLM-assisted classification.
- [Help Net Security on Betterleaks](https://www.helpnetsecurity.com/2026/03/19/betterleaks-open-source-secrets-scanner/)
and [The New Stack](https://thenewstack.io/betterleaks-open-source-secret-scanner/)
— context on the "agentic era" framing and why gitleaks froze.
- [DefectDojo gitleaks parser](https://defectdojo.com/integrations/gitleaks)
— JSON ingest, finding triage UI, accept/false-positive state.
Open-source, generic, post-hoc; informational state only —
marking a finding as accepted does not feed back into the
scanner. Shape mismatch for in-flight per-bottle approval.
- [gitleaks-findings/gitleaks](https://github.com/gitleaks-findings/gitleaks)
— JupiterOne integration, not a dashboard. Listed because the
repo name is misleading.
- [AWS example access key (`AKIAIOSFODNN7EXAMPLE`)](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_iam-quotas.html)
— documented placeholder safe to use in examples without
triggering most secret scanners.
- `claude_bottle/git_gate.py` — pre-receive hook implementation
(`gitleaks git --log-opts="$log_opts" --no-banner --redact`, no
`--config` argument today).
- `claude_bottle/git_gate.py` — pre-receive hook implementation.
Today: `gitleaks git --log-opts="$log_opts" --no-banner
--redact`; no `--config`, no `--baseline-path`.