diff --git a/docs/prds/0008-git-gate.md b/docs/prds/0008-git-gate.md new file mode 100644 index 0000000..0c6fd3d --- /dev/null +++ b/docs/prds/0008-git-gate.md @@ -0,0 +1,219 @@ +# PRD 0008: Git gate + +- **Status:** Draft +- **Author:** didericis +- **Created:** 2026-05-12 + +## Summary + +Per-bottle sidecar that fronts the agent's git remotes, runs +gitleaks against incoming refs via a `pre-receive` hook, and only +forwards to the real upstream on a clean scan. Upstream push +credentials live in the gate, not the agent — so a misbehaving +agent cannot push a secret-bearing commit past it. + +## Problem + +Today the agent holds its own SSH identity for each `bottle.ssh` +entry and pushes straight at gitea/github with ssh-gate doing dumb +L4 forwarding. There is no boundary between "the agent thinks this +commit is fine" and "the secret hits an external remote." If a +compromised or careless agent stages a `.env`, slips a token into +a fixture, or commits the `CLAUDE_BOTTLE_OAUTH_TOKEN` itself, `git +push` ships it. + +Host-side pre-commit / pre-push hooks are the usual defense, but +they live on the agent's side of the trust boundary: an agent with +shell access can `git push --no-verify` past them, edit +`.githooks/`, or `git config core.hooksPath /dev/null`. Anything +the agent can disable is not a gate. + +## Goals / Success Criteria + +Integration test: spin up a bottle whose only push path for a +declared upstream is the gate. Drop a synthetic high-entropy +secret into a commit, run `git push` from inside the agent, +observe a non-zero exit and a gitleaks finding in the gate's +stderr. Repeat with a clean commit, observe exit 0 and the commit +landing on the real upstream. + +## Non-goals + +- Pre-commit scanning. The gate is a `pre-receive` checkpoint + only; it does not run on `git commit`, does not block local + commits, and does not edit the agent's working tree. +- Git-protocol awareness beyond what `pre-receive` already gives + you. 
No bespoke pack inspection; gitleaks runs against the + incoming ref(s) in a bare repo, full stop. +- Per-user authentication on the agent → gate hop. The hop sits + inside a single bottle on an `--internal` Docker network; only + the bottle's agent can reach the gate. No additional ACLs. +- Subsuming ssh-gate or pipelock. Non-git SSH (if any) keeps + flowing through ssh-gate; HTTPS through pipelock. The git-gate + is git-only. +- Multi-tenant gate. One gate is provisioned per bottle, not + shared across bottles (same one-sidecar-per-agent posture as + pipelock / ssh-gate). +- Smolmachines / microVM colocation policy. Whether the future + smolmachines backend packs gates into one VM or runs them as + separate VMs is a backend decision, not a manifest or design + decision in this PRD. See "Future work." + +## Scope + +### In scope + +- **Gate sidecar lifecycle.** New `GitGate` + `DockerGitGate`, + mirroring `DockerSSHGate` and `DockerPipelockProxy` in shape and + network-attachment story. +- **Manifest field.** `bottle.git` — a list of git remotes the + bottle is allowed to talk to, each with the credential the gate + uses to push upstream. The agent gets no parallel `bottle.ssh` + entry for those upstreams. +- **Agent-side URL rewrite.** Provisioner emits `~/.gitconfig` + with `[url ""] insteadOf = ` so `git push + origin` from inside the agent transparently hits the gate. +- **Pre-receive gitleaks hook.** Baked into the gate image. On a + hit the hook exits non-zero and the push fails; on clean it + shells out `git push :` using the + gate-resident credential. +- **Plan rendering / dry-run.** `bottle_plan.py` and the y/N + preflight surface the gate sidecar (name, listed upstreams, + which credential it holds per upstream). + +### Out of scope + +- Push policy beyond gitleaks. No commit-author allowlist, no + branch-name policy, no signed-commit enforcement. gitleaks is + the single rule for v1. +- Fetch routing. 
Fetch can continue going through ssh-gate as + today, with the agent holding a read-scoped deploy key. Routing + fetch through the git-gate is a follow-up; this PRD is + push-side only. (Open question: revisit.) +- Quarantine / replay. A rejected push is discarded; we do not + stash it for the user to inspect. +- Non-Docker backends. Implementation lands for Docker only; the + `BottleBackend` abstraction gains the hook but other backends + are deferred. +- Bypass for trusted commits. No `[skip gitleaks]` trailer, no + allowlist by commit hash. If the gate is bypassable it isn't a + gate. + +## Proposed Design + +### New services / components + +Mirror the existing sidecar layout: + +- **`claude_bottle/git_gate.py`** (new): abstract `GitGate` + + `GitGatePlan` dataclass. `prepare` is host-side / side-effect- + free on docker; renders the per-upstream config and stages the + push credentials under `stage_dir`. +- **`claude_bottle/backend/docker/git_gate.py`** (new): + `DockerGitGate` concrete subclass. `start` does `docker create` + on the internal network, copies in the bare-repo skeleton, the + hook script, and per-upstream credentials, then `docker start`. + `stop` is idempotent `docker rm -f`. Container name: + `claude-bottle-git-gate-`. + +Gate image: a minimal `git` + `gitleaks` + `openssh-server` +image, pinned by digest (declared next to `PIPELOCK_IMAGE` and +the socat image constant). For each declared upstream the gate +hosts a bare repo at a stable local path (`/git/.git`) +with `hooks/pre-receive` wired to gitleaks. On a clean scan the +hook (or a `post-receive` companion) does `git push +:` using the credential the gate holds for that +upstream. + +Inside the bottle, the agent's `.gitconfig` rewrites the real +upstream URL to the gate's local URL via `insteadOf`. A `git +push origin main` therefore pushes to the gate; the gate scans; +on success the gate pushes to the real upstream. The agent never +sees the upstream push credential. 
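
The `insteadOf` rewrite described above could be rendered by the provisioner roughly as follows. This is a minimal sketch, not the actual provisioner code: `GateRemote`, `render_insteadof`, the `git-gate` hostname, and the example URLs are all hypothetical names chosen for illustration. The only git behaviour it relies on is the `url.<base>.insteadOf` config key, where any remote URL starting with the `insteadOf` value is rewritten to start with `<base>`.

```python
from dataclasses import dataclass

# Characters that would change how git parses a config line if left unquoted.
# Rejecting them outright keeps this sketch simple; a real renderer might
# quote or escape instead.
_UNSAFE_CHARS = ('"', "\\", "\n", "#", ";")


@dataclass(frozen=True)
class GateRemote:
    """Hypothetical pairing of a real upstream URL with its gate-local URL."""

    upstream_url: str  # URL the agent's remotes are configured with
    gate_url: str      # URL of the bare repo hosted by the gate (assumed shape)


def render_insteadof(remotes: list[GateRemote]) -> str:
    """Render one `[url "<gate>"] insteadOf = <upstream>` stanza per remote."""
    stanzas = []
    for remote in remotes:
        for value in (remote.upstream_url, remote.gate_url):
            if any(ch in value for ch in _UNSAFE_CHARS):
                raise ValueError(f"unsafe character in URL: {value!r}")
        # <base> is the gate URL; the insteadOf value is the upstream prefix.
        stanzas.append(
            f'[url "{remote.gate_url}"]\n'
            f"\tinsteadOf = {remote.upstream_url}\n"
        )
    return "\n".join(stanzas)


if __name__ == "__main__":
    example = GateRemote(
        upstream_url="git@github.com:example/repo.git",  # illustrative only
        gate_url="ssh://git@git-gate/git/repo.git",      # illustrative only
    )
    print(render_insteadof([example]), end="")
```

Note that `insteadOf` rewrites both fetch and push URLs. Git also provides a `pushInsteadOf` key that rewrites push URLs only; whether that is a better fit for a push-side-only v1 is left as a judgment for the design, not something this sketch decides.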
+ +### Existing code touched + +- **`claude_bottle/manifest.py`**: parse and validate the new + `bottle.git` block; reject `bottle.ssh` entries whose upstream + is also claimed by a `bottle.git` upstream (one path per + remote, no shadow route). +- **`claude_bottle/backend/docker/provision/git.py`** (new) or an + extension of the ssh provisioner: render the `insteadOf` config + and any extra `~/.gitconfig` plumbing. +- **`claude_bottle/backend/docker/backend.py`**: instantiate + `DockerGitGate` alongside `DockerPipelockProxy` and + `DockerSSHGate`; thread its `prepare` / `start` / `stop` + through `resolve_plan` / `launch`. +- **`claude_bottle/backend/docker/launch.py`**: add gate start / + stop to the `ExitStack` so the gate is up before any + provisioner that writes the agent's `~/.gitconfig`. +- **`claude_bottle/backend/docker/bottle_plan.py`**: new + `GitGatePlan` field on `DockerBottlePlan`; preflight rendering + surfaces the gate sidecar (name, per-upstream local paths, + upstream real URLs, which credential is in use). +- **Tests**: unit tests for `GitGate.prepare` and render shape; + manifest validator tests for the new field and the + no-shadow-route rule; an integration test in + `tests/integration/` for the push-with-secret (rejected) and + push-without-secret (forwarded) cases. + +### Data model changes + +`Bottle` grows an optional `git: list[GitEntry]` field. A +`GitEntry` carries the upstream URL, the local name the gate +exposes it as, and the credential the gate uses to push upstream +(initial shape: `identity_file` + `known_host_key`, matching +`bottle.ssh`). + +### External dependencies + +- A minimal `git` + `gitleaks` + `openssh-server` image, pinned + by digest. +- `gitleaks` binary, version pinned in the image build. +- No new Python packages. + +## Future work + +- **Fetch through the gate.** A v2 could route fetch through the + gate too, so the agent holds no upstream credentials at all. 
+ Today fetch falls back to ssh-gate; pushing through git-gate + alone is the v1 win. +- **Smolmachines colocation.** The eventual smolmachines backend + may pack pipelock + ssh-gate + git-gate into a single microVM, + or split git-gate off because it holds push creds and the + others don't. That decision belongs to the backend; the shared + `BottleBackend` interface keeps sidecars independent so either + packing is possible without touching this PRD's design. + +## Open questions + +- Protocol on the agent → gate hop: SSH (`sshd` + `git-shell` + inside the gate) or HTTP smart protocol (`git-http-backend` + behind a tiny webserver)? SSH matches the existing ssh-gate + patterns and the user's existing `~/.ssh` muscle memory; HTTP + is lighter on image size and avoids an `authorized_keys` + story. Default: SSH unless image size becomes a problem. +- Where gitleaks runs: pre-receive hook against a checkout of the + incoming ref vs. a wrapper around `git-receive-pack` that + inspects the pack file directly. Hook is canonical; defer the + wrapper variant. +- Rejection signalling: gitleaks failures surface as a normal + pre-receive reject (the user sees gitleaks's report on + stderr). Worth a "redacted" mode that hides the matched bytes + from the rejection message? Default: show file + line, hide + the matched bytes. +- Credential reuse vs. duplication from `bottle.ssh`. If a user + lists the same identity for ssh-gate (read) and git-gate + (write), we can either reference by name or require two + copies. Default: inline copies; revisit when it gets annoying. + +## References + +- PRD 0001: per-agent egress proxy via pipelock — sidecar + pattern this PRD reuses. +- PRD 0007: SSH egress gate — the L4 SSH forwarder this PRD + sits alongside; explicitly *not* the place to add + git-protocol awareness. +- `claude_bottle/ssh_gate.py` / `claude_bottle/pipelock.py` — + existing sidecar abstractions to mirror. +- gitleaks: