diff --git a/docs/prds/0008-git-gate.md b/docs/prds/0008-git-gate.md
index 0c6fd3d..47033b5 100644
--- a/docs/prds/0008-git-gate.md
+++ b/docs/prds/0008-git-gate.md
@@ -6,11 +6,18 @@
 
 ## Summary
 
-Per-bottle sidecar that fronts the agent's git remotes, runs
-gitleaks against incoming refs via a `pre-receive` hook, and only
-forwards to the real upstream on a clean scan. Upstream push
-credentials live in the gate, not the agent — so a misbehaving
-agent cannot push a secret-bearing commit past it.
+Per-bottle sidecar that fronts the agent's git remotes as a
+transparent mirror. Push is gated: gitleaks scans incoming refs
+via a `pre-receive` hook, and only clean refs get forwarded to
+the real upstream. Fetch is mirrored: every `upload-pack` first
+runs `git fetch origin --prune` against the upstream via the
+daemon's `--access-hook`, so an agent fetch returns whatever the
+upstream has *now* (fail-closed if the upstream is unreachable).
+
+Upstream credentials live in the gate, not the agent — so a
+misbehaving agent cannot push a secret-bearing commit past it
+and cannot acquire push access by inspecting the agent's own
+filesystem.
 
 ## Problem
 
@@ -30,12 +37,19 @@ the agent can disable is not a gate.
 
 ## Goals / Success Criteria
 
-Integration test: spin up a bottle whose only push path for a
-declared upstream is the gate. Drop a synthetic high-entropy
-secret into a commit, run `git push` from inside the agent,
-observe a non-zero exit and a gitleaks finding in the gate's
-stderr. Repeat with a clean commit, observe exit 0 and the commit
-landing on the real upstream.
+Two integration tests, both with the gate as the only git path
+for a declared upstream:
+
+1. **Push:** drop a synthetic high-entropy secret into a commit,
+   run `git push` from inside the agent, observe a non-zero exit
+   and a gitleaks finding in the response. Repeat with a clean
+   commit and observe exit 0 + the commit landing on the real
+   upstream.
+2. **Fetch:** clone the upstream through the gate (`git clone`
+   against the gate URL), observe the upstream's content. Push
+   a new commit to the upstream out-of-band, refetch through the
+   gate, observe the new commit. The gate must never serve stale
+   data — every fetch refreshes from upstream first.
 
 ## Non-goals
 
@@ -71,12 +85,19 @@ landing on the real upstream.
   uses to push upstream. The agent gets no parallel `bottle.ssh`
   entry for those upstreams.
 - **Agent-side URL rewrite.** Provisioner emits `~/.gitconfig`
-  with `[url ""] insteadOf = ` so `git push
-  origin` from inside the agent transparently hits the gate.
+  with `[url ""] insteadOf = ` so every git
+  operation against the declared upstream (push, fetch, clone,
+  pull, ls-remote) transparently hits the gate.
 - **Pre-receive gitleaks hook.** Baked into the gate image. On a
   hit the hook exits non-zero and the push fails; on clean it
-  shells out `git push :` using the
-  gate-resident credential.
+  shells out `git push origin :` using the gate-resident
+  credential.
+- **Access-hook upstream refresh.** `git daemon --access-hook` runs
+  `git fetch origin --prune` against the upstream before every
+  `upload-pack` request, so a fetch through the gate is observably
+  equivalent to a fetch against the real upstream. Failure to reach
+  the upstream is fail-closed: the access hook exits non-zero and
+  the agent's fetch fails.
 - **Plan rendering / dry-run.** `bottle_plan.py` and the y/N
   preflight surface the gate sidecar (name, listed upstreams,
   which credential it holds per upstream).
@@ -86,10 +107,10 @@ landing on the real upstream.
 - Push policy beyond gitleaks. No commit-author allowlist, no
   branch-name policy, no signed-commit enforcement. gitleaks is
   the single rule for v1.
-- Fetch routing. Fetch can continue going through ssh-gate as
-  today, with the agent holding a read-scoped deploy key. Routing
-  fetch through the git-gate is a follow-up; this PRD is
-  push-side only. (Open question: revisit.)
+- Fetch caching / stale-while-revalidate. Every `upload-pack`
+  refresh is a synchronous round-trip to the upstream; there is
+  no TTL cache, no background refresh. If the upstream is slow,
+  the agent's fetch is slow.
 - Quarantine / replay. A rejected push is discarded; we do not
   stash it for the user to inspect.
 - Non-Docker backends. Implementation lands for Docker only; the
@@ -116,20 +137,34 @@ Mirror the existing sidecar layout:
   `stop` is idempotent `docker rm -f`. Container name:
   `claude-bottle-git-gate-`.
 
-Gate image: a minimal `git` + `gitleaks` + `openssh-server`
-image, pinned by digest (declared next to `PIPELOCK_IMAGE` and
-the socat image constant). For each declared upstream the gate
-hosts a bare repo at a stable local path (`/git/.git`)
-with `hooks/pre-receive` wired to gitleaks. On a clean scan the
-hook (or a `post-receive` companion) does `git push
-:` using the credential the gate holds for that
-upstream.
+Gate image: `git-daemon` + `openssh-client` over a
+`zricethezav/gitleaks` base (alpine + gitleaks), pinned by digest.
+For each declared upstream the gate hosts a bare repo at
+`/git/.git` with `remote.origin.url` set to the real
+upstream (via `git remote add --mirror=fetch`), `hooks/pre-receive`
+wired to gitleaks-then-`git push origin`, and the bare repo's
+config carrying per-upstream credential paths.
 
 Inside the bottle, the agent's `.gitconfig` rewrites the real
-upstream URL to the gate's local URL via `insteadOf`. A `git
-push origin main` therefore pushes to the gate; the gate scans;
-on success the gate pushes to the real upstream. The agent never
-sees the upstream push credential.
+upstream URL to the gate's `git://` URL via `insteadOf`. Every
+git operation against the declared upstream therefore hits the
+gate.
+
+For pushes, the pre-receive hook gitleaks-scans the incoming
+refs and, on clean, pushes each accepted ref to the real
+upstream using the credential the gate holds.
+
+For fetches (clone, pull, fetch, ls-remote), `git daemon`'s
+`--access-hook=` runs `git fetch origin --prune` against
+the real upstream before the upload-pack service serves the
+client. The bare repo therefore reflects the upstream's current
+state at the moment the agent's fetch begins; if the upstream
+is unreachable, the access hook exits non-zero and the agent's
+fetch fails — same observable behavior as if the agent were
+talking to the upstream directly.
+
+The agent never sees the upstream credential under either
+operation.
 
 ### Existing code touched
 
@@ -167,17 +202,12 @@ exposes it as, and the credential the gate uses to push upstream
 
 ### External dependencies
 
-- A minimal `git` + `gitleaks` + `openssh-server` image, pinned
-  by digest.
-- `gitleaks` binary, version pinned in the image build.
+- `zricethezav/gitleaks` base image, pinned by digest. The base
+  ships gitleaks + git; the gate Dockerfile adds `git-daemon` and
+  `openssh-client` on top.
 - No new Python packages.
 
 ## Future work
-
-- **Fetch through the gate.** A v2 could route fetch through the
-  gate too, so the agent holds no upstream credentials at all.
-  Today fetch falls back to ssh-gate; pushing through git-gate
-  alone is the v1 win.
 - **Smolmachines colocation.** The eventual smolmachines backend
   may pack pipelock + ssh-gate + git-gate into a single microVM,
   or split git-gate off because it holds push creds and the