bot-bottle/docs/prds/0008-git-gate.md
didericis c91395425c
docs(prds): add PRD 0008 git gate
2026-05-12 18:24:33 -04:00

PRD 0008: Git gate

  • Status: Draft
  • Author: didericis
  • Created: 2026-05-12

Summary

Per-bottle sidecar that fronts the agent's git remotes, runs gitleaks against incoming refs via a pre-receive hook, and only forwards to the real upstream on a clean scan. Upstream push credentials live in the gate, not the agent — so a misbehaving agent cannot push a secret-bearing commit past it.

Problem

Today the agent holds its own SSH identity for each bottle.ssh entry and pushes straight to gitea/github, with ssh-gate doing dumb L4 forwarding. There is no boundary between "the agent thinks this commit is fine" and "the secret hits an external remote." If a compromised or careless agent stages a .env, slips a token into a fixture, or commits the CLAUDE_BOTTLE_OAUTH_TOKEN itself, git push ships it.

Host-side pre-commit / pre-push hooks are the usual defense, but they live on the agent's side of the trust boundary: an agent with shell access can git push --no-verify past them, edit .githooks/, or git config core.hooksPath /dev/null. Anything the agent can disable is not a gate.

Goals / Success Criteria

Integration test: spin up a bottle whose only push path for a declared upstream is the gate. Drop a synthetic high-entropy secret into a commit, run git push from inside the agent, observe a non-zero exit and a gitleaks finding in the gate's stderr. Repeat with a clean commit, observe exit 0 and the commit landing on the real upstream.

Non-goals

  • Pre-commit scanning. The gate is a pre-receive checkpoint only; it does not run on git commit, does not block local commits, and does not edit the agent's working tree.
  • Git-protocol awareness beyond what pre-receive already gives you. No bespoke pack inspection; gitleaks runs against the incoming ref(s) in a bare repo, full stop.
  • Per-user authentication on the agent → gate hop. The hop sits inside a single bottle on an --internal Docker network; only the bottle's agent can reach the gate. No additional ACLs.
  • Subsuming ssh-gate or pipelock. Non-git SSH (if any) keeps flowing through ssh-gate; HTTPS through pipelock. The git-gate is git-only.
  • Multi-tenant gate. One gate is provisioned per bottle, not shared across bottles (same one-sidecar-per-agent posture as pipelock / ssh-gate).
  • Smolmachines / microVM colocation policy. Whether the future smolmachines backend packs gates into one VM or runs them as separate VMs is a backend decision, not a manifest or design decision in this PRD. See "Future work."

Scope

In scope

  • Gate sidecar lifecycle. New GitGate + DockerGitGate, mirroring DockerSSHGate and DockerPipelockProxy in shape and network-attachment story.
  • Manifest field. bottle.git — a list of git remotes the bottle is allowed to talk to, each with the credential the gate uses to push upstream. The agent gets no parallel bottle.ssh entry for those upstreams.
  • Agent-side URL rewrite. Provisioner emits ~/.gitconfig with [url "<gate-url>"] insteadOf = <real-url> so git push origin from inside the agent transparently hits the gate.
  • Pre-receive gitleaks hook. Baked into the gate image. On a hit the hook exits non-zero and the push fails; on a clean scan it shells out to git push <upstream> <ref>:<ref> using the gate-resident credential.
  • Plan rendering / dry-run. bottle_plan.py and the y/N preflight surface the gate sidecar (name, listed upstreams, which credential it holds per upstream).
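The agent-side URL rewrite above can be sketched as a small renderer. This is a minimal illustration, not the provisioner's actual code: the function name render_insteadof, the gate hostname git-gate, and the example URLs are all hypothetical. Only the [url "<gate-url>"] insteadOf = <real-url> shape and the /git/<name>.git path come from this PRD.

```python
from typing import Mapping


def render_insteadof(rewrites: Mapping[str, str]) -> str:
    """Render one [url "<gate-url>"] stanza per real upstream URL.

    `rewrites` maps each real upstream URL to the gate URL that should
    replace it. Both names are hypothetical, for illustration only.
    """
    stanzas = []
    for real_url, gate_url in sorted(rewrites.items()):
        # Refuse values that could break out of the quoted section header
        # or smuggle in extra config lines.
        for value in (real_url, gate_url):
            if '"' in value or "\n" in value:
                raise ValueError(f"unsafe character in URL: {value!r}")
        stanzas.append(f'[url "{gate_url}"]\n\tinsteadOf = {real_url}\n')
    return "\n".join(stanzas)


if __name__ == "__main__":
    # Hypothetical upstream and gate URLs.
    print(render_insteadof({
        "git@github.com:acme/widgets.git": "ssh://git@git-gate/git/widgets.git",
    }))
```

git applies insteadOf as a prefix match on the URL, so listing the full repository URL, as here, keeps the rewrite scoped to the one declared upstream.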

Out of scope

  • Push policy beyond gitleaks. No commit-author allowlist, no branch-name policy, no signed-commit enforcement. gitleaks is the single rule for v1.
  • Fetch routing. Fetch can continue going through ssh-gate as today, with the agent holding a read-scoped deploy key. Routing fetch through the git-gate is a follow-up; this PRD is push-side only. (Open question: revisit.)
  • Quarantine / replay. A rejected push is discarded; we do not stash it for the user to inspect.
  • Non-Docker backends. Implementation lands for Docker only; the BottleBackend abstraction gains the hook but other backends are deferred.
  • Bypass for trusted commits. No [skip gitleaks] trailer, no allowlist by commit hash. If the gate is bypassable it isn't a gate.

Proposed Design

New services / components

Mirror the existing sidecar layout:

  • claude_bottle/git_gate.py (new): abstract GitGate + GitGatePlan dataclass. prepare is host-side / side-effect-free on docker; renders the per-upstream config and stages the push credentials under stage_dir.
  • claude_bottle/backend/docker/git_gate.py (new): DockerGitGate concrete subclass. start does docker create on the internal network, copies in the bare-repo skeleton, the hook script, and per-upstream credentials, then docker start. stop is idempotent docker rm -f. Container name: claude-bottle-git-gate-<slug>.
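The start / stop sequence described for DockerGitGate could reduce to a handful of docker invocations. The sketch below only builds the argument lists and does not run them. The helper names, the /gate destination path, and the example network and image values are assumptions for illustration; the container name pattern is the one given above.

```python
from typing import List


def gate_container_name(slug: str) -> str:
    # Naming pattern taken from the PRD: claude-bottle-git-gate-<slug>.
    return f"claude-bottle-git-gate-{slug}"


def start_commands(slug: str, image: str, network: str, stage_dir: str) -> List[List[str]]:
    """Commands for start: create on the internal network, copy in, then start."""
    name = gate_container_name(slug)
    return [
        ["docker", "create", "--name", name, "--network", network, image],
        # "<dir>/." copies the directory's contents; /gate is a hypothetical
        # destination for the repo skeleton, hook script, and credentials.
        ["docker", "cp", f"{stage_dir}/.", f"{name}:/gate"],
        ["docker", "start", name],
    ]


def stop_command(slug: str) -> List[str]:
    """Command for stop: force-remove the gate container."""
    return ["docker", "rm", "-f", gate_container_name(slug)]


if __name__ == "__main__":
    # All argument values here are placeholders.
    for cmd in start_commands("demo", "example/git-gate@sha256:<digest>",
                              "bottle-demo-internal", "/tmp/stage"):
        print(" ".join(cmd))
    print(" ".join(stop_command("demo")))
```

Copying files in between docker create and docker start means the hook and credentials are in place before sshd or any other gate process begins accepting pushes.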

Gate image: a minimal git + gitleaks + openssh-server image, pinned by digest (declared next to PIPELOCK_IMAGE and the socat image constant). For each declared upstream the gate hosts a bare repo at a stable local path (/git/<name>.git) with hooks/pre-receive wired to gitleaks. On a clean scan the hook (or a post-receive companion) does git push <upstream> <ref>:<ref> using the credential the gate holds for that upstream.

Inside the bottle, the agent's .gitconfig rewrites the real upstream URL to the gate's local URL via insteadOf. A git push origin main therefore pushes to the gate; the gate scans; on success the gate pushes to the real upstream. The agent never sees the upstream push credential.
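To make the hook's inputs concrete: git feeds pre-receive one line per updated ref on stdin, in the form "<old> <new> <ref>", and runs the hook before any ref is updated. A minimal sketch of the parsing step follows. The function names are hypothetical, 40-character SHA-1 object IDs are assumed, and how the resulting revision arguments are handed to gitleaks is left open, as is the gate's policy on forwarding ref deletions.

```python
from typing import List, NamedTuple, Optional

ZERO_OID = "0" * 40  # assumption: SHA-1 repositories (40 hex digits)


class RefUpdate(NamedTuple):
    old: str
    new: str
    ref: str


def parse_updates(stdin_text: str) -> List[RefUpdate]:
    """Parse pre-receive stdin: one '<old> <new> <ref>' line per ref."""
    updates = []
    for line in stdin_text.splitlines():
        if not line.strip():
            continue
        old, new, ref = line.split(" ", 2)
        updates.append(RefUpdate(old, new, ref))
    return updates


def scan_revisions(update: RefUpdate) -> Optional[List[str]]:
    """Revision arguments (git rev-list / git log style) for the new commits."""
    if update.new == ZERO_OID:
        return None  # ref deletion: no new commits to scan
    if update.old == ZERO_OID:
        # New ref: everything reachable from <new> but from no existing ref.
        return [update.new, "--not", "--all"]
    return [f"{update.old}..{update.new}"]


def forward_refspec(update: RefUpdate) -> Optional[str]:
    """Refspec for the upstream push, or None for deletions (policy not specified)."""
    if update.new == ZERO_OID:
        return None
    # The local ref still points at <old> inside pre-receive, so the push
    # source is the incoming object ID rather than the ref name.
    return f"{update.new}:{update.ref}"
```

One consequence worth noting: if forwarding happens inside pre-receive rather than in a post-receive companion, a literal <ref>:<ref> refspec would resolve to the old value, which is why this sketch pushes <new>:<ref>.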

Existing code touched

  • claude_bottle/manifest.py: parse and validate the new bottle.git block; reject bottle.ssh entries whose upstream is also claimed by a bottle.git upstream (one path per remote, no shadow route).
  • claude_bottle/backend/docker/provision/git.py (new) or an extension of the ssh provisioner: render the insteadOf config and any extra ~/.gitconfig plumbing.
  • claude_bottle/backend/docker/backend.py: instantiate DockerGitGate alongside DockerPipelockProxy and DockerSSHGate; thread its prepare / start / stop through resolve_plan / launch.
  • claude_bottle/backend/docker/launch.py: add gate start / stop to the ExitStack so the gate is up before any provisioner that writes the agent's ~/.gitconfig.
  • claude_bottle/backend/docker/bottle_plan.py: new GitGatePlan field on DockerBottlePlan; preflight rendering surfaces the gate sidecar (name, per-upstream local paths, upstream real URLs, which credential is in use).
  • Tests: unit tests for GitGate.prepare and render shape; manifest validator tests for the new field and the no-shadow-route rule; an integration test in tests/integration/ for the push-with-secret (rejected) and push-without-secret (forwarded) cases.
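The ordering requirement in launch.py (gate up before the ~/.gitconfig provisioner, gate torn down on exit) can be illustrated with contextlib.ExitStack. This is a sketch with a stand-in sidecar class. FakeSidecar, launch, and the event strings are hypothetical and only record the order of operations.

```python
from contextlib import ExitStack
from typing import List


class FakeSidecar:
    """Stand-in for a sidecar with start/stop; records calls instead of running docker."""

    def __init__(self, name: str, events: List[str]) -> None:
        self.name = name
        self.events = events

    def start(self) -> None:
        self.events.append(f"start {self.name}")

    def stop(self) -> None:
        self.events.append(f"stop {self.name}")


def launch(events: List[str], fail_provision: bool = False) -> List[str]:
    gate = FakeSidecar("git-gate", events)
    try:
        with ExitStack() as stack:
            # Register stop first: it is idempotent, so a half-started gate
            # is still cleaned up if start raises partway through.
            stack.callback(gate.stop)
            gate.start()
            if fail_provision:
                raise RuntimeError("provisioner failed")
            events.append("provision ~/.gitconfig")
            events.append("run agent")
    except RuntimeError:
        events.append("launch aborted")
    return events


if __name__ == "__main__":
    print(launch([]))
```

Because ExitStack unwinds callbacks in reverse registration order, sidecars registered earlier stay up until the later ones have stopped.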

Data model changes

Bottle grows an optional git: list[GitEntry] field. A GitEntry carries the upstream URL, the local name the gate exposes it as, and the credential the gate uses to push upstream (initial shape: identity_file + known_host_key, matching bottle.ssh).
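One possible shape for GitEntry and the no-shadow-route check, as a sketch only. The field names name and upstream are hypothetical (identity_file and known_host_key are the ones named above), and the check assumes that bottle.ssh entries are identified by host and that "same upstream" is compared at host granularity. The real validator may compare something narrower.

```python
from dataclasses import dataclass
from typing import Iterable
from urllib.parse import urlparse


@dataclass(frozen=True)
class GitEntry:
    name: str            # hypothetical: gate serves this as /git/<name>.git
    upstream: str        # hypothetical: the real upstream URL
    identity_file: str   # credential the gate uses to push upstream
    known_host_key: str  # pinned host key for the upstream


def upstream_host(url: str) -> str:
    """Extract the host from ssh://, https:// or scp-like [user@]host:path URLs."""
    if "://" in url:
        host = urlparse(url).hostname
        if not host:
            raise ValueError(f"cannot find host in {url!r}")
        return host.lower()
    head, sep, _ = url.partition(":")
    if not sep:
        raise ValueError(f"cannot find host in {url!r}")
    return head.rpartition("@")[2].lower()


def check_no_shadow_route(git_entries: Iterable[GitEntry], ssh_hosts: Iterable[str]) -> None:
    """Reject manifests where a bottle.ssh host is also claimed by bottle.git."""
    claimed = {upstream_host(entry.upstream) for entry in git_entries}
    clashes = sorted(claimed & {host.lower() for host in ssh_hosts})
    if clashes:
        raise ValueError(
            "bottle.ssh entries shadow bottle.git upstreams: " + ", ".join(clashes)
        )
```

Note that host-level comparison would also reject the read-scoped fetch path through ssh-gate to the same host, so a real implementation would need a finer key if v1 keeps fetch on ssh-gate.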

External dependencies

  • A minimal git + gitleaks + openssh-server image, pinned by digest.
  • gitleaks binary, version pinned in the image build.
  • No new Python packages.

Future work

  • Fetch through the gate. A v2 could route fetch through the gate too, so the agent holds no upstream credentials at all. Today fetch falls back to ssh-gate; pushing through git-gate alone is the v1 win.
  • Smolmachines colocation. The eventual smolmachines backend may pack pipelock + ssh-gate + git-gate into a single microVM, or split git-gate off because it holds push creds and the others don't. That decision belongs to the backend; the shared BottleBackend interface keeps sidecars independent so either packing is possible without touching this PRD's design.

Open questions

  • Protocol on the agent → gate hop: SSH (sshd + git-shell inside the gate) or HTTP smart protocol (git-http-backend behind a tiny webserver)? SSH matches the existing ssh-gate patterns and the user's existing ~/.ssh muscle memory; HTTP is lighter on image size and avoids an authorized_keys story. Default: SSH unless image size becomes a problem.
  • Where gitleaks runs: pre-receive hook against a checkout of the incoming ref vs. a wrapper around git-receive-pack that inspects the pack file directly. Hook is canonical; defer the wrapper variant.
  • Rejection signalling: gitleaks failures surface as a normal pre-receive reject (the user sees gitleaks's report on stderr). Worth a "redacted" mode that hides the matched bytes from the rejection message? Default: show file + line, hide the matched bytes.
  • Credential reuse vs. duplication from bottle.ssh. If a user lists the same identity for ssh-gate (read) and git-gate (write), we can either reference by name or require two copies. Default: inline copies; revisit when it gets annoying.
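For the rejection-signalling default (show file + line, hide the matched bytes), a sketch of a renderer over a gitleaks JSON report. It assumes the report is a JSON array of finding objects with RuleID, File, and StartLine keys, and deliberately never reads the Secret or Match fields. The function name and message format are hypothetical.

```python
import json


def render_findings(report_json: str) -> str:
    """Summarise findings as 'rule at file:line' without echoing secret bytes."""
    findings = json.loads(report_json) or []
    lines = [f"rejected: {len(findings)} finding(s)"]
    for finding in findings:
        # Only location and rule are surfaced; Secret / Match are never touched.
        lines.append(
            f"  {finding['RuleID']} at {finding['File']}:{finding['StartLine']}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Synthetic report for illustration; not real gitleaks output.
    sample = json.dumps([{
        "RuleID": "generic-api-key",
        "File": ".env",
        "StartLine": 3,
        "Secret": "s3cr3t",
        "Match": "TOKEN=s3cr3t",
    }])
    print(render_findings(sample))
```

Rendering from the structured report rather than filtering gitleaks's human-readable output avoids relying on the exact layout of that output.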

References

  • PRD 0001: per-agent egress proxy via pipelock — sidecar pattern this PRD reuses.
  • PRD 0007: SSH egress gate — the L4 SSH forwarder this PRD sits alongside; explicitly not the place to add git-protocol awareness.
  • claude_bottle/ssh_gate.py / claude_bottle/pipelock.py — existing sidecar abstractions to mirror.
  • gitleaks: https://github.com/gitleaks/gitleaks