didericis/bot-bottle

didericis c33930290f

docs(research): survey gitleaks dashboards + add baseline-file primitive

2026-05-24 23:54:46 -04:00

Approving specific commits past git-gate

Research into (1) whether a dashboard or operator surface for the git-gate (a.k.a. "gitlock", PRD 0008) already exists, and (2) what a narrowly-scoped approval flow for false-positive gitleaks rejections could look like without compromising the gate's "if it's bypassable it isn't a gate" property.

Motivated by PRD 0012's open question: when an agent commits docs containing intentionally-bogus tokens that the secret scanner correctly flags, the rejection is correct in the literal sense and wrong in the user-intent sense, and there is no way to say so.

Summary

No off-the-shelf dashboard fits the shape claude-bottle needs (per-bottle, host-local, integrated into a pre-receive rejection with approval feeding back into the gate's own decision). Gitleaks itself is a CLI with no UI and was declared feature-complete in early 2026; the author's successor project Betterleaks is explicitly "for the agentic era" but is also CLI-shaped and still young. The closest open-source dashboard is DefectDojo, which ingests gitleaks JSON but is post-hoc and org-scale — its "marked as accepted" state does not feed back into the scanner. SaaS dashboards (GitGuardian, TruffleHog Enterprise) ship repo content to a vendor and were already disqualified by git-secret-scanning-hardening.md.

The git-gate ships no exception mechanism today: the pre-receive hook calls gitleaks git --log-opts="$range" --no-banner --redact with no --config and no --baseline-path, and PRD 0008 explicitly rejects exceptions ("Bypass for trusted commits. No [skip gitleaks] trailer, no allowlist by commit hash. If the gate is bypassable it isn't a gate.").

That non-goal is correct against the agent but conflates two questions: can the agent bypass the gate (must be no), and can the user approve a narrowly-scoped exception out-of-band (could be yes). PRD 0012's recovery flow is exactly the seam where the user-side approval can live without giving the agent any in-band bypass.

Gitleaks does ship one native primitive that maps well to "approve this specific finding" — the baseline file — which is semantically a better fit for per-finding approval than the allowlist config (a suppression rule). This note surveys the dashboard landscape, the two native primitives (allowlist and baseline), and recommends a direction.

Question 1: Existing dashboards and control surfaces

Inside claude-bottle today

claude_bottle/cli/ has _common, cleanup, edit, info, init, list, start — nothing gate-specific. The gate appears only as a sidecar in bottle_plan.py's preflight rendering. Rejections are written to the pre-receive hook's stderr (echo "git-gate: gitleaks rejected push to $ref" >&2) and surface only in the agent's git push output — nothing persists outside the container's logs.

Native gitleaks: CLI-only, and now feature-complete

Gitleaks has no built-in dashboard or web UI. As of early 2026 the project has been declared feature complete — only security patches will be merged going forward. The original maintainer (Zachary Rice) has moved active work to Betterleaks (below), so any dashboard built directly against gitleaks should treat the gitleaks surface as frozen rather than evolving.

Betterleaks: the same author's "agentic era" successor

Betterleaks was started in February 2026 and is explicitly framed for AI agents driving the scanner: flag-based output for low-token-overhead consumption, parallelized Git scanning, CEL-based filtering in place of the TOML allowlist, and a roadmap that includes LLM-assisted classification and automatic secret revocation via provider APIs. It is still CLI-shaped, with no dashboard either.

Relevant to claude-bottle in two ways:

The upstream direction of travel is toward agent-driven scanners, which makes "the bottle invokes a scanner and reports findings up" a supported pattern rather than a hack.
CEL is a richer expression language for filter entries than gitleaks's selector struct, which widens the design space for Option B (below). If claude-bottle ever swaps gitleaks for Betterleaks, the approval-flow design should be expressible in both.

Output formats: SARIF + viewers

Both gitleaks and Betterleaks can emit SARIF. That plugs into GitHub Advanced Security's Code Scanning tab (read-only viewer with a dismiss-as-not-a-problem state) and assorted open-source SARIF viewers (sarif-web-component, Microsoft's VS Code extension). These render findings; they do not handle approval state or feed back into the scanner. Useful for seeing findings; not useful as the approval surface.

Findings aggregators

DefectDojo is the closest open-source thing to "a dashboard for gitleaks." It ingests gitleaks JSON (and ~200 other scanners), aggregates and deduplicates, lets you triage and mark findings as accepted or false-positive in its UI, and tracks remediation state. Designed for org-scale: one DefectDojo instance covers many repos and scanners.

Shape mismatch for claude-bottle:

DefectDojo's review state is informational — marking a finding as accepted in DefectDojo does not write to gitleaks's allowlist or baseline and does not change what the gate decides on the next push.
It expects findings as artifacts of CI runs, not as the rejection-cause of an in-flight push.
A single shared instance violates the one-sidecar-per-bottle posture; per-bottle DefectDojo instances are absurd overhead.

Useful to know it exists, especially for long-term post-hoc finding tracking. Not the v1 answer for the in-flight approval flow PRD 0012 needs.

A separate JupiterOne integration exists but ships findings to JupiterOne's commercial platform and has effectively zero public adoption (0 stars, 0 forks). It is mentioned only because its repo name suggests it is "the dashboard", which it is not.

SaaS dashboards (disqualified by sandbox premise)

GitGuardian / ggshield and TruffleHog Enterprise both offer incident-triage UIs with finding-level approval state. Both ship repo content to a vendor; already disqualified in git-secret-scanning-hardening.md for a project whose entire premise is sandbox isolation.

Bottom line

No off-the-shelf dashboard fits claude-bottle's shape: per-bottle, host-local, integrated into a pre-receive rejection with the approval feeding back into the gate's own decision-making. The nearest open-source analogue (DefectDojo) is post-hoc and org-scale; the nearest UX (GitGuardian) is SaaS. The PRD 0012 dashboard — sharing surface with the broader stuck-agent recovery flow — remains the right place to build this.

Question 2: How could specific commits be approved?

What gitleaks gives you natively

Two distinct primitives, and the distinction matters for designing an approval flow.

Allowlists are suppression rules — config-level patterns that say "ignore findings matching X." Gitleaks's TOML config supports an [allowlist] block (or [[rules.allowlists]] per-rule) with four selectors:

paths — list of regex against file paths.
regexes — list of regex matched against the finding bytes; regexTarget directs the regex at the extracted secret (default), the entire regex match, or the whole line.
stopwords — substrings that, if present, suppress the finding.
commits — explicit commit SHAs to skip entirely.

Selectors combine with condition = "OR" (default; suppress if any selector matches) or condition = "AND" (suppress only if all match). commits is the bluntest tool and the easiest to misuse: a single SHA can hide arbitrary content. paths + regexes with AND is the narrowest scope, and the form that makes a per-finding exception still defensible.
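The difference between the two conditions can be illustrated with a small model. This is a simplified Python sketch, not gitleaks's implementation (which is written in Go): `Allowlist` and `is_suppressed` are hypothetical names, only the paths and regexes selectors are modeled, and the regex is matched against the extracted secret only.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Allowlist:
    """Simplified model of an allowlist (paths + regexes selectors only)."""
    paths: list[str] = field(default_factory=list)
    regexes: list[str] = field(default_factory=list)
    condition: str = "OR"  # "OR" (default) or "AND"


def is_suppressed(allow: Allowlist, file_path: str, secret: str) -> bool:
    """Return True if the modeled allowlist would suppress this finding."""
    checks = []
    if allow.paths:
        checks.append(any(re.search(p, file_path) for p in allow.paths))
    if allow.regexes:
        checks.append(any(re.search(r, secret) for r in allow.regexes))
    if not checks:
        return False  # an allowlist with no selectors suppresses nothing
    return all(checks) if allow.condition == "AND" else any(checks)


# Same selectors, different conditions.
narrow = Allowlist(
    paths=[r"^docs/examples/aws\.md$"],
    regexes=[r"^AKIAIOSFODNN7EXAMPLE$"],
    condition="AND",
)
broad = Allowlist(paths=narrow.paths, regexes=narrow.regexes, condition="OR")
```

In this model, `narrow` suppresses the example key only when it appears in `docs/examples/aws.md`, while `broad` suppresses the example key anywhere and any secret at all inside that file. That asymmetry is the reason the AND form is the defensible one for per-finding exceptions.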

Baselines are a known-findings list — a JSON file of previously detected findings that gitleaks's IsNew function compares against on the next scan, so only new findings get reported. The file is generated by saving a scan's JSON output and fed back in via --baseline-path. The comparison checks RuleID, description, file path, line numbers, secret content, commit, and author/timestamp. When --redact is enabled, redacted Secret and Match fields are ignored in the comparison so the baseline still functions with redacted reports.
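The comparison can be modeled in a few lines. The sketch below is a Python approximation of the behavior described above, not a port of gitleaks's Go code: the field tuple is an assumption assembled from the prose and from the key names in gitleaks's JSON report, and the real function may compare additional fields.

```python
# Fields compared on every scan (assumed subset; names follow the JSON report keys).
COMPARED_FIELDS = (
    "RuleID", "Description", "File", "StartLine", "EndLine",
    "Commit", "Author", "Date",
)
# Fields skipped when the report was produced with --redact.
SECRET_FIELDS = ("Secret", "Match")


def is_new(finding: dict, baseline: list[dict], redact: bool = False) -> bool:
    """Return True if no baseline entry matches the finding on every compared field."""
    fields = COMPARED_FIELDS if redact else COMPARED_FIELDS + SECRET_FIELDS
    for known in baseline:
        if all(finding.get(f) == known.get(f) for f in fields):
            return False  # an identical finding is already in the baseline
    return True
```

A finding that differs from every baseline entry in any compared field, such as the line number or the commit, counts as new and is reported.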

Detection flow is: global allowlist → rule-specific allowlist → baseline → reported finding. Allowlist suppressions therefore win over baseline; baseline is the last gate before report.

The hook today passes neither --config nor --baseline-path. Wiring either in is mechanically straightforward: the gate image is built per DockerGitGate.start, so the config / baseline can be baked into the image or mounted in at start.
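As a rough illustration of how small the wiring change is, the sketch below builds the gitleaks argument list with the two optional flags. The hook itself is shell; this Python helper (`gitleaks_argv`, a hypothetical name) only shows the shape of the command, and any file paths passed to it are placeholders rather than locations the gate image actually uses.

```python
from typing import Optional


def gitleaks_argv(
    log_opts: str,
    config_path: Optional[str] = None,
    baseline_path: Optional[str] = None,
) -> list[str]:
    """Build the gitleaks command line, adding --config / --baseline-path if given."""
    # Base invocation: the same flags the hook passes today.
    argv = ["gitleaks", "git", f"--log-opts={log_opts}", "--no-banner", "--redact"]
    if config_path:
        argv += ["--config", config_path]
    if baseline_path:
        argv += ["--baseline-path", baseline_path]
    return argv
```

Calling `gitleaks_argv("a..b")` reproduces today's invocation; passing `baseline_path` appends the one extra flag pair the approval flow would depend on.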

Allowlist vs baseline for approval storage. Both can express "don't reject this finding," but they imply different things about intent:

An allowlist entry says "any future finding that matches this pattern is fine." Generative: it covers findings that don't exist yet on commits that haven't been made.
A baseline entry says "this exact finding I've already seen is fine." Specific: it pins to the bytes / location / rule of one observed finding; a different finding on the same path on a later commit re-triggers.

For a per-commit user approval, baseline is the better semantic match: each approval is an attestation about one observed finding, not a rule that pre-approves a pattern. Baseline entries can also be diffed in PRs trivially (it's a JSON list) — they double as the audit record.

The design tension

PRD 0008's "no bypass for trusted commits" non-goal is load-bearing against the agent. It is not load-bearing against the user, who already has every privilege the gate is trying to deny the agent. The risk of letting the user approve exceptions is not direct (the user can already do whatever they want); it is indirect:

Prompt-injection laundering. An attacker who has captured the agent's prompt-stream can ask the agent to request an exception that looks plausible ("I just need to commit the test fixture for the new auth flow"). If the user rubber-stamps the request, the attacker has used the user as a bypass channel. This is the same risk as any human-in-the-loop control: it degrades to "no control" if the human always says yes.
Scope creep of a granted exception. A commit-SHA allowlist approved for one commit could, in principle, be re-targeted at a different commit if the allowlist isn't tied to the content. This is why commits alone is unsafe; paths + regexes is the form that survives content-substitution.
Persistence past intent. An exception granted "just for this commit" that stays in the gate's config indefinitely is no longer a per-commit exception; it's a permanent allowlist entry. Without TTL or a clean teardown, exceptions accrete.

These three risks shape the design constraints below.

Three design options

Option A — Reject and rotate. Treat every gitleaks hit as "rewrite the commit to not contain the literal token, then re-push." For docs with fake tokens, use a sentinel string the repo's gitleaks config recognizes as obviously not a real secret (e.g. AKIAIOSFODNN7EXAMPLE, AWS's documented example key, or a project-specific placeholder like <aws-access-key-id>).

Cost: zero. No new code.
Property: gate stays unbypassable in both senses.
Friction: every author must know the placeholder convention. The first time someone pastes a realistic-looking fake into a doc, they get rejected and have to redo the commit. Probably fine for the host repo; less fine for bottles authoring third-party content.
Verdict: this should be the default. The exception flow exists only for cases where Option A genuinely fails (e.g. the example is specifically about a real-looking token format, or the upstream doc requires the literal pattern).
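To lower the friction of the placeholder convention, an author-side check could flag key-shaped strings before a push is attempted. The sketch below is one possible shape for such a check, not part of the gate: `non_placeholder_keys` is a hypothetical helper, and the regex is a simplified approximation of the AWS access key ID format rather than gitleaks's own rule.

```python
import re

# Simplified shape of an AWS access key ID: a 4-letter prefix followed by
# 16 uppercase alphanumerics (an approximation, not gitleaks's rule).
AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")
DOCUMENTED_EXAMPLE = "AKIAIOSFODNN7EXAMPLE"


def non_placeholder_keys(text: str) -> list[str]:
    """Return key-shaped strings in text that are not the documented example key."""
    return [m for m in AWS_KEY_ID.findall(text) if m != DOCUMENTED_EXAMPLE]
```

An empty result means the text uses only the documented example key or a non-key-shaped placeholder such as `<aws-access-key-id>`; anything returned is a candidate for rewriting under Option A.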

Option B — Per-finding approval via PRD 0012 flow. When the agent's push is rejected, the agent invokes /request-gate-exception (or /request-bottle-change with an exception variant). The slash command POSTs to the cred-proxy endpoint, carrying the gitleaks finding record (rule ID, file path, line, redacted match) and a free-text justification ("docs example for AWS auth flow").

The user reviews the request in the dashboard, sees the file and the diff, and approves. The approval gets written into the gate's baseline file — the JSON list of known-OK findings the gate passes as --baseline-path to gitleaks. The gate restarts with the new baseline.

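Property: approved findings are pinned to the specific observed bytes / path / rule. A different secret on the same path on a later commit re-triggers the gate. Because the baseline comparison also includes the commit, the same finding on a rewritten commit (new SHA) re-triggers as well.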
Auditability: baseline file is JSON in git history; each PR approval becomes a diff to that file. The free-text justification lives in the PR thread per PRD 0012.
Fallback to allowlist for canonical cases. If a particular fixture file should be permanently understood as "examples only," the user can promote a baseline entry to an [allowlist] rule with paths + regexes AND — explicit generalization, opt-in by the user, never by the agent.
Open: TTL. Should baseline entries expire? Baseline is specific by construction, so the case for expiration is weaker than for allowlist. Lean "never" for v1; revisit if baselines balloon.
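The approval step in Option B amounts to appending one record to a JSON list. A minimal sketch follows, assuming the dashboard has the full finding record as gitleaks reported it on the gate side; `approve_finding` is a hypothetical name, and how the gate is restarted afterwards is out of scope here.

```python
import json
from pathlib import Path


def approve_finding(baseline_path: Path, finding: dict) -> bool:
    """Append finding to the baseline JSON list; return False if already present."""
    entries = json.loads(baseline_path.read_text()) if baseline_path.exists() else []
    if finding in entries:
        return False  # already approved; leave the file untouched
    entries.append(finding)
    # Stable formatting keeps the diff to one added record per approval.
    baseline_path.write_text(json.dumps(entries, indent=2, sort_keys=True) + "\n")
    return True
```

Writing the file with sorted keys and fixed indentation is a design choice for auditability: each approval then shows up in version control as a single added block, which is what makes the baseline usable as the audit record.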

Option C — Pre-flight scan with author signoff. Run gitleaks client-side inside the bottle (as a non-gating advisory check) so the agent sees findings before attempting the push. The slash command then includes the pre-known findings; the dashboard shows the user the finding inline rather than having to go look at the rejection log. On approval, same Option-B-style baseline entry gets added.

Property: identical end-state to Option B; better UX because the agent stops before the rejected push, not after.
Cost: one more place that needs gitleaks installed (the bottle image), and an in-bottle advisory check that the agent can in principle ignore. That's fine because it's advisory — the gate still rejects; the in-bottle check just avoids one round-trip.
Verdict: nice-to-have over Option B, not a substitute.
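For Option C, the in-bottle check would need to turn a gitleaks JSON report into the fields the exception request carries. The sketch below assumes the advisory scan wrote a JSON report (for example via `--report-format json` and `--report-path`); `summarize_report` and the output key names are hypothetical, while the input keys follow gitleaks's JSON report.

```python
import json


def summarize_report(report_json: str) -> list[dict]:
    """Reduce a gitleaks JSON report to the fields an exception request carries."""
    return [
        {
            "rule_id": f["RuleID"],
            "file": f["File"],
            "line": f["StartLine"],
            "commit": f.get("Commit", ""),
        }
        for f in json.loads(report_json)
    ]
```

The summary deliberately omits the `Secret` and `Match` fields, so the request that leaves the bottle never carries the flagged bytes themselves; the user inspects the file and diff in the dashboard instead.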

Recommendation

Default to Option A as the canonical answer ("rewrite to use a placeholder"). Build Option B as the PRD 0012 exception path, storing approvals in the gate's baseline file (not in an allowlist rule). Baseline is the right primitive because each approval is an attestation about one observed finding, not a generative pattern. Allowlist promotion is a separate, user-initiated escalation for cases that genuinely deserve patterning. The commits selector is never exposed to the approval flow under either path, because it hides arbitrary content. Defer Option C to a follow-up; it's an ergonomic win, not a security property.

This puts the answer to PRD 0012's open question as:

Same recovery shape (/request-bottle-change), distinguishable request type. The dashboard renders an exception request differently from a manifest-change request because the diff being approved is to the gate's baseline file, not to the manifest.
Exceptions are expressed as baseline-file entries (finding-specific JSON records), not commit SHAs or regex patterns.
The approval is recorded twice for audit: in the PR thread (free-text justification), and as a versioned diff to the baseline file (which is committed alongside the manifest).

Cross-references

PRD 0008 — git-gate design and "no bypass" non-goal.
PRD 0010 — cred-proxy; the inbound endpoint PRD 0012 reuses for exception requests.
PRD 0012 — stuck-agent recovery flow; the open question this note informs.
docs/research/git-secret-scanning-hardening.md — prior research on the secret-scanning tool landscape and why gitleaks is the fit.

Sources

gitleaks repository — [allowlist] selectors (paths, regexes, stopwords, commits, regexTarget, condition); also home of the feature-complete notice.
Gitleaks allowlists & baselines (DeepWiki) — detailed walk-through of the allowlist selector struct, the baseline file format, the IsNew comparison logic, and the global→rule→baseline detection order. Primary source for the allowlist-vs-baseline distinction this note rests on.
Betterleaks (GitHub) — Zachary Rice's successor project; CEL filtering, agent-driven output design, roadmap for LLM-assisted classification.
Help Net Security on Betterleaks and The New Stack — context on the "agentic era" framing and why gitleaks froze.
DefectDojo gitleaks parser — JSON ingest, finding triage UI, accept/false-positive state. Open-source, generic, post-hoc; informational state only — marking a finding as accepted does not feed back into the scanner. Shape mismatch for in-flight per-bottle approval.
gitleaks-findings/gitleaks — JupiterOne integration, not a dashboard. Listed because the repo name is misleading.
AWS example access key (AKIAIOSFODNN7EXAMPLE) — documented placeholder safe to use in examples without triggering most secret scanners.
claude_bottle/git_gate.py — pre-receive hook implementation. Today: gitleaks git --log-opts="$log_opts" --no-banner --redact; no --config, no --baseline-path.
