From 9dc0dfd5eed665c249fcd1886ff3ca275c78458e Mon Sep 17 00:00:00 2001
From: didericis-claude
Date: Fri, 29 May 2026 01:52:07 -0400
Subject: [PATCH 1/3] =?UTF-8?q?docs(prd):=20PRD=200028=20=E2=80=94=20git-g?=
 =?UTF-8?q?ate=20new-branch=20push=20scan=20scope?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

git-gate's pre-receive scans the full ancestry of a new branch, so the
repo's historical test-fixture findings block every new-branch push
(issue #106). Scope the new-ref scan to incoming commits
(`$new --not --all`) with no loss of coverage, and harden the forward
ssh against hangs.

Refs #106

Co-Authored-By: Claude Opus 4.8
---
 .../0028-git-gate-new-branch-scan-scope.md | 170 ++++++++++++++++++
 1 file changed, 170 insertions(+)
 create mode 100644 docs/prds/0028-git-gate-new-branch-scan-scope.md

diff --git a/docs/prds/0028-git-gate-new-branch-scan-scope.md b/docs/prds/0028-git-gate-new-branch-scan-scope.md
new file mode 100644
index 0000000..12426a9
--- /dev/null
+++ b/docs/prds/0028-git-gate-new-branch-scan-scope.md
@@ -0,0 +1,170 @@
+# PRD 0028: git-gate new-branch push scan scope
+
+- **Status:** Draft
+- **Author:** didericis-claude
+- **Created:** 2026-05-29
+- **Issue:** #106
+
+## Summary
+
+git-gate's pre-receive hook scans the **entire ancestry** of a *new*
+branch for secrets, so any pre-existing finding in repo history blocks
+every new-branch push. Scope the scan to the commits a push actually
+introduces (`$new --not --all`) so a push is gated on what *it* adds,
+not on what's already on the gate/upstream. Also harden the forward
+`ssh` against hangs. Net: new branches can be pushed through the gate
+again, with no loss of leak-detection coverage.
+
+## Problem
+
+In `git_gate_render_hook()` (`bot_bottle/git_gate.py`) the pre-receive
+hook chooses the gitleaks revision range per ref:
+
+```sh
+if [ "$old" = "$zero" ]; then
+    log_opts="$new"  # new branch: `git log ` = the FULL ancestry
+else
+    log_opts="$old..$new"  # existing branch: just the pushed delta
+fi
+gitleaks git --log-opts="$log_opts" ...  # exit 1 if ANY finding in range
+```
+
+For a **new** ref there is no `old` to diff against, so the hook passes
+`$new`, which `git log` expands to every commit reachable from the new
+tip. This repo's history contains 11 deliberately secret-shaped strings
+(demo manifests, `docs/demo.tape`, and the pipelock/sandbox-escape
+integration tests that exist *to exercise* the DLP). gitleaks reports
+`438 commits scanned … leaks found: 11`, the hook `exit 1`s, and the
+push is rejected. Confirmed live against issue #106's bottle: the
+branch never lands in the bare repo and is never forwarded.
+
+Consequence: **no new branch can ever be pushed through git-gate** as
+long as a single historical finding exists — which is permanent.
+
+Two adjacent problems surfaced while diagnosing #106:
+
+- The rejection is **invisible to the client** — over the `git://` +
+  smolmachines forward it presented as a ~75s silent hang, not a
+  `remote: git-gate: gitleaks rejected …` message.
+- The forward `ssh` lacks `BatchMode`/`ConnectTimeout`, so an
+  unreachable upstream or a prompt would hang the hook indefinitely.
+  (Not the cause of #106 — the forward itself works — but a latent
+  hang risk.)
+
+## Goals / Success Criteria
+
+- A new-branch push is scanned **only for the commits it introduces**
+  (reachable from `$new`, not from any ref the gate already has).
+- A new branch that adds no new findings **pushes successfully**, even
+  though historical fixtures still trip a full-history scan.
+- A new branch that *does* introduce a finding is **still rejected**.
+- No reduction in leak coverage for the commits a push actually brings
+  to the upstream (see "Security analysis").
+- Forward `ssh` fails fast (`BatchMode=yes` + `ConnectTimeout`) instead
+  of hanging on a prompt/unreachable upstream.
+- Existing git-gate unit + integration tests stay green; new tests lock
+  the scoped-scan behaviour.
+
+## Non-goals
+
+- **Scrubbing the historical fixture findings.** They're intentional
+  test/demo inputs; scoping the scan resolves the practical problem
+  without rewriting history.
+- **Relaxing the existing-branch path.** `$old..$new` already scans the
+  delta; this PRD only fixes the new-ref branch (and optionally unifies
+  on `--not --all`, see Open questions).
+- **The client-visibility fix is investigation-gated.** Surfacing the
+  rejection over the `git://` + smolmachines path may need separate work
+  (sideband relay); tracked here but may land as a follow-up rather than
+  block the scan-scope fix.
+
+## Design
+
+### Scoped scan
+
+Replace the new-ref range with one that excludes everything the gate
+already knows:
+
+```sh
+if [ "$new" = "$zero" ]; then
+    continue  # deletion: nothing to scan (unchanged)
+elif [ "$old" = "$zero" ]; then
+    log_opts="$new --not --all"  # new branch: only commits new to the gate
+else
+    log_opts="$old..$new"  # existing branch: the pushed delta
+fi
+```
+
+`git log $new --not --all` = commits reachable from the pushed tip but
+**not** reachable from any ref already in the gate's bare repo.
+
+### Security analysis (why excluding "already-on-the-gate" is safe)
+
+Commits enter the gate's bare repo by exactly two paths:
+
+1. **mirror-fetch from the upstream** — the bare repo is
+   `remote add --mirror=fetch origin`, and the access-hook fetches the
+   upstream before every upload-pack; and
+2. **a push through the gate** — which is gitleaks-scanned before it is
+   forwarded.
+
+Therefore every commit reachable from a gate ref is *already on the
+upstream* or *was already scanned when pushed*. A commit excluded by
+`--not --all` cannot be a new secret arriving at the upstream via this
+push:
+
+- if it's already upstream, re-scanning changes nothing — the content is
+  already there and blocking this branch wouldn't remove it; and
+- if it arrived via an earlier push, it was already scanned.
+
+The only commits that can carry a *new* secret upstream are the ones the
+push introduces — exactly the set `$new --not --all` scans. An agent
+cannot pre-seed a secret commit as "already known" to dodge the scan: it
+can't write refs into the bare repo except by pushing (which scans), and
+the mirror refs come only from the trusted upstream.
+
+**Invariant this relies on:** the bare repo's refs are populated *only*
+by upstream mirror-fetch and gitleaks-gated pushes. That holds in the
+current design (nothing writes refs out-of-band); revisit if that
+changes.
+
+### Forward ssh hardening
+
+Add `-o BatchMode=yes -o ConnectTimeout=` to the hook's `ssh_cmd` so
+a prompt or unreachable upstream fails fast with a clear error instead
+of hanging the receive-pack.
+
+## Implementation chunks
+
+1. **PRD (this commit).**
+2. **Hook scan scope + ssh hardening.** Edit `git_gate_render_hook()`:
+   the new-ref range → `$new --not --all`; add `BatchMode`/`ConnectTimeout`
+   to `ssh_cmd`. Unit tests in `test_git_gate.py` asserting the rendered
+   hook uses the scoped range for new refs and the hardened ssh flags.
+3. **Integration coverage.** A new-branch push carrying no new finding
+   succeeds through a gate whose history contains a fixture finding; a
+   new-branch push that introduces a finding is still rejected.
+4. **(Optional / follow-up) client visibility.** Make a gitleaks/forward
+   rejection reach the client as a `remote:` error over the git:// +
+   smolmachines path.
+
+## Testing strategy
+
+- **Unit (must):** rendered-hook assertions — new-ref uses
+  `$new --not --all`, existing-ref still `$old..$new`, deletion still
+  skipped; ssh_cmd carries `BatchMode=yes` + a `ConnectTimeout`.
+- **Integration (should):** against a real gate seeded with a
+  fixture-bearing history, a clean new branch forwards to the upstream;
+  a new branch with a planted secret is rejected. Skips cleanly on hosts
+  that can't run the bundle (same shape as the existing git-gate
+  integration test).
+
+## Open questions
+
+- **Unify both branches on `--not --all`?** It's also more robust than
+  `$old..$new` for non-fast-forward/force pushes (which can skip commits
+  off the direct path). Tempting to use it for the existing-ref case
+  too; deferred to keep this change tight, but worth a follow-up.
+- **Client visibility mechanism.** Whether the silent-hang is a git
+  daemon sideband-relay issue or specific to the smolmachines forward
+  needs a focused repro before committing to a fix.
-- 
2.52.0

From 6c673bece6815c247eaed926d36338a80328925e Mon Sep 17 00:00:00 2001
From: didericis-claude
Date: Fri, 29 May 2026 01:59:20 -0400
Subject: [PATCH 2/3] fix(git-gate): scope new-branch scan to incoming commits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

A new ref made the pre-receive hook scan the full ancestry
(`log_opts="$new"`), so historical test-fixture findings rejected every
new-branch push (#106). Scope it to `$new --not --all` — only commits
new to the gate, which (since the bare repo is populated solely by
upstream mirror-fetch and gitleaks-gated pushes) loses no coverage on
what a push actually brings to the upstream.

Also add BatchMode=yes + ConnectTimeout=10 to both the forward and
access-hook ssh so an unreachable upstream fails fast instead of
hanging.

Refs #106

Co-Authored-By: Claude Opus 4.8
---
 bot_bottle/git_gate.py      | 13 ++++++++++---
 tests/unit/test_git_gate.py | 25 +++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/bot_bottle/git_gate.py b/bot_bottle/git_gate.py
index e3cad28..5b010f9 100644
--- a/bot_bottle/git_gate.py
+++ b/bot_bottle/git_gate.py
@@ -280,7 +280,14 @@ while IFS=' ' read -r old new ref; do
     [ -z "$ref" ] && continue
     [ "$new" = "$zero" ] && continue
     if [ "$old" = "$zero" ]; then
-        log_opts="$new"
+        # New ref: scan only the commits this push introduces — those
+        # reachable from $new but not from any ref the gate already has.
+        # Everything already on the gate arrived via upstream mirror-fetch
+        # or a previously gitleaks-scanned push, so it's already-upstream
+        # or already-scanned; re-scanning it (the old `$new` full-ancestry
+        # range) only resurfaces historical findings and blocks every new
+        # branch. See PRD 0028 / issue #106.
+        log_opts="$new --not --all"
     else
         log_opts="$old..$new"
     fi
@@ -300,7 +307,7 @@ if [ ! -f "$hostsfile" ]; then
     echo "git-gate: add KnownHostKey to the bottle.git entry and restart the bottle" >&2
     exit 1
 fi
-ssh_cmd="ssh -i $keyfile -o UserKnownHostsFile=$hostsfile -o StrictHostKeyChecking=yes -o IdentitiesOnly=yes"
+ssh_cmd="ssh -i $keyfile -o UserKnownHostsFile=$hostsfile -o StrictHostKeyChecking=yes -o IdentitiesOnly=yes -o BatchMode=yes -o ConnectTimeout=10"

 while IFS=' ' read -r old new ref; do
     [ -z "$ref" ] && continue
@@ -355,7 +362,7 @@ if [ -z "$keyfile" ] || [ ! -f "$hostsfile" ]; then
     echo "git-gate: missing credentials for $repo_dir; refusing fetch" >&2
     exit 1
 fi
-ssh_cmd="ssh -i $keyfile -o UserKnownHostsFile=$hostsfile -o StrictHostKeyChecking=yes -o IdentitiesOnly=yes"
+ssh_cmd="ssh -i $keyfile -o UserKnownHostsFile=$hostsfile -o StrictHostKeyChecking=yes -o IdentitiesOnly=yes -o BatchMode=yes -o ConnectTimeout=10"

 echo "git-gate: refreshing $repo_dir from upstream" >&2
 if ! GIT_SSH_COMMAND="$ssh_cmd" git -C "$repo_dir" fetch origin --prune >&2; then
diff --git a/tests/unit/test_git_gate.py b/tests/unit/test_git_gate.py
index d83bec8..7e7884e 100644
--- a/tests/unit/test_git_gate.py
+++ b/tests/unit/test_git_gate.py
@@ -197,6 +197,24 @@ class TestHookRender(unittest.TestCase):
         # Stdin is buffered to a tempfile so both phases can re-read.
         self.assertIn("refs_file=$(mktemp)", hook)

+    def test_new_ref_scan_scoped_to_incoming_commits(self):
+        # A new branch (old=all-zeros) must scan only commits new to the
+        # gate, not the full ancestry — otherwise historical findings
+        # block every new-branch push (PRD 0028 / issue #106).
+        hook = git_gate_render_hook()
+        self.assertIn('log_opts="$new --not --all"', hook)
+        # The old over-broad full-ancestry range must be gone.
+        self.assertNotIn('log_opts="$new"', hook)
+        # Existing-branch delta scan is unchanged.
+        self.assertIn('log_opts="$old..$new"', hook)
+
+    def test_forward_ssh_is_non_interactive_and_bounded(self):
+        # No prompt (BatchMode) and a connect timeout, so an unreachable
+        # upstream fails fast instead of hanging the receive-pack.
+        hook = git_gate_render_hook()
+        self.assertIn("BatchMode=yes", hook)
+        self.assertIn("ConnectTimeout=", hook)
+

 class TestAccessHookRender(unittest.TestCase):
     def test_access_hook_refreshes_origin_on_upload_pack(self):
@@ -216,6 +234,13 @@ class TestAccessHookRender(unittest.TestCase):
         self.assertIn("refusing to serve stale data", hook)
         self.assertIn("exit 1", hook)

+    def test_access_hook_ssh_is_non_interactive_and_bounded(self):
+        # Same hardening as the forward path: the fetch ssh must not
+        # prompt and must time out rather than hang upload-pack.
+        hook = git_gate_render_access_hook()
+        self.assertIn("BatchMode=yes", hook)
+        self.assertIn("ConnectTimeout=", hook)
+

 class TestPrepare(unittest.TestCase):
     def setUp(self):
-- 
2.52.0

From 50baf63669bed96bcb0de58dbb184c22a7841916 Mon Sep 17 00:00:00 2001
From: didericis-codex
Date: Fri, 29 May 2026 02:27:42 -0400
Subject: [PATCH 3/3] docs(prd): mark PRD 0028 active

---
 docs/prds/0028-git-gate-new-branch-scan-scope.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/prds/0028-git-gate-new-branch-scan-scope.md b/docs/prds/0028-git-gate-new-branch-scan-scope.md
index 12426a9..d003169 100644
--- a/docs/prds/0028-git-gate-new-branch-scan-scope.md
+++ b/docs/prds/0028-git-gate-new-branch-scan-scope.md
@@ -1,6 +1,6 @@
 # PRD 0028: git-gate new-branch push scan scope

-- **Status:** Draft
+- **Status:** Active
 - **Author:** didericis-claude
 - **Created:** 2026-05-29
 - **Issue:** #106
-- 
2.52.0
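
Reviewer note: the scoped-scan rule in this series can be illustrated with a small toy model. The sketch below is not code from `bot_bottle`. It assumes a 40-character all-zeros object id (SHA-1 repositories) and represents history as a plain dict of commit id to parent ids. The names `choose_log_opts`, `reachable`, and `commits_to_scan` are hypothetical and exist only for this illustration. It mirrors the hook's three-way range choice and treats `$new --not --all` as a set difference over reachability.

```python
"""Toy model of the pre-receive scan-range choice described in PRD 0028.

Assumption: commits are modelled as a dict mapping a commit id to its
parent ids; none of these helper names exist in bot_bottle.
"""

ZERO = "0" * 40  # assumed all-zeros object id (SHA-1) meaning "no such ref"


def choose_log_opts(old, new):
    """Return the `git log` range for one pushed ref, or None for a deletion."""
    if new == ZERO:
        return None  # deletion: nothing to scan
    if old == ZERO:
        return f"{new} --not --all"  # new ref: only commits new to the gate
    return f"{old}..{new}"  # existing ref: the pushed delta


def reachable(parents, tips):
    """Every commit reachable from any of `tips` by following parent links."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


def commits_to_scan(parents, gate_tips, new_tip):
    """Set model of `git log <new_tip> --not --all` against the gate's refs."""
    return reachable(parents, [new_tip]) - reachable(parents, gate_tips)


# History a -> b -> c is already on the gate (ref tip "c"); the push adds d, e.
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"], "e": ["d"]}
full_ancestry = reachable(parents, ["e"])
scoped = commits_to_scan(parents, gate_tips=["c"], new_tip="e")
print(sorted(full_ancestry))  # ['a', 'b', 'c', 'd', 'e']
print(sorted(scoped))         # ['d', 'e']
```

In this model a finding in `a`, `b`, or `c` never enters the scoped set, while a finding in `d` or `e` still does. That matches the behaviour the PRD asks for: historical fixtures stop blocking new branches, and a secret introduced by the push itself is still scanned.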