PRD 0008: Git gate
- Status: Active
- Author: didericis
- Created: 2026-05-12
Summary
Per-bottle sidecar that fronts the agent's git remotes as a
transparent mirror. Push is gated: gitleaks scans incoming refs
via a pre-receive hook, and only clean refs get forwarded to
the real upstream. Fetch is mirrored: every upload-pack first
runs git fetch origin --prune against the upstream via the
daemon's --access-hook, so an agent fetch returns whatever the
upstream has now (fail-closed if the upstream is unreachable).
Upstream credentials live in the gate, not the agent — so a misbehaving agent cannot push a secret-bearing commit past it and cannot acquire push access by inspecting the agent's own filesystem.
Problem
Today the agent holds its own SSH identity for each bottle.ssh
entry and pushes straight at gitea/github with ssh-gate doing dumb
L4 forwarding. There is no boundary between "the agent thinks this
commit is fine" and "the secret hits an external remote." If a
compromised or careless agent stages a .env, slips a token into
a fixture, or commits the BOT_BOTTLE_CLAUDE_OAUTH_TOKEN itself, git push ships it.
Host-side pre-commit / pre-push hooks are the usual defense, but
they live on the agent's side of the trust boundary: an agent with
shell access can git push --no-verify past them, edit
.githooks/, or git config core.hooksPath /dev/null. Anything
the agent can disable is not a gate.
Goals / Success Criteria
Two integration tests, both with the gate as the only git path for a declared upstream:
- Push: drop a synthetic high-entropy secret into a commit, run git push from inside the agent, and observe a non-zero exit and a gitleaks finding in the response. Repeat with a clean commit and observe exit 0 plus the commit landing on the real upstream.
- Fetch: clone the upstream through the gate (git clone against the gate URL) and observe the upstream's content. Push a new commit to the upstream out-of-band, refetch through the gate, and observe the new commit. The gate must never serve stale data: every fetch refreshes from upstream first.
Non-goals
- Pre-commit scanning. The gate is a pre-receive checkpoint only; it does not run on git commit, does not block local commits, and does not edit the agent's working tree.
- Git-protocol awareness beyond what pre-receive already gives you. No bespoke pack inspection; gitleaks runs against the incoming ref(s) in a bare repo, full stop.
- Per-user authentication on the agent → gate hop. The hop sits inside a single bottle on an --internal Docker network; only the bottle's agent can reach the gate. No additional ACLs.
- Subsuming ssh-gate or pipelock. Non-git SSH (if any) keeps flowing through ssh-gate; HTTPS through pipelock. The git-gate is git-only.
- Multi-tenant gate. One gate is provisioned per bottle, not shared across bottles (same one-sidecar-per-agent posture as pipelock / ssh-gate).
- Smolmachines / microVM colocation policy. Whether the future smolmachines backend packs gates into one VM or runs them as separate VMs is a backend decision, not a manifest or design decision in this PRD. See "Future work."
Scope
In scope
- Gate sidecar lifecycle. New GitGate + DockerGitGate, mirroring DockerSSHGate and DockerPipelockProxy in shape and network-attachment story.
- Manifest field. bottle.git: a list of git remotes the bottle is allowed to talk to, each with the credential the gate uses to push upstream. The agent gets no parallel bottle.ssh entry for those upstreams. Each entry may also carry an ExtraHosts: { hostname: ip } map, surfaced to the gate as --add-host so the gate can resolve upstreams whose public DNS doesn't point at the reachable IP (e.g. Tailscale-only hosts). The agent-side insteadOf rewrite keys off the original hostname, so the manifest's Upstream URL stays human-readable.
- Agent-side URL rewrite. Provisioner emits ~/.gitconfig with [url "<gate-url>"] insteadOf = <real-url> so every git operation against the declared upstream (push, fetch, clone, pull, ls-remote) transparently hits the gate.
- Pre-receive gitleaks hook. Baked into the gate image. On a hit the hook exits non-zero and the push fails; on clean it shells out to git push origin <ref>:<ref> using the gate-resident credential.
- Access-hook upstream refresh. git daemon --access-hook runs git fetch origin --prune against the upstream before every upload-pack request, so a fetch through the gate is observably equivalent to a fetch against the real upstream. Failure to reach the upstream is fail-closed: the access hook exits non-zero and the agent's fetch fails.
- Plan rendering / dry-run. bottle_plan.py and the y/N preflight surface the gate sidecar (name, listed upstreams, which credential it holds per upstream).
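The URL-rewrite and ExtraHosts bullets above are mechanical enough to sketch. This is a minimal illustration, not the provisioner's actual code: the function names, the shape of the rewrites mapping, and the example URLs in the usage note are all hypothetical.

```python
from typing import Dict, List


def render_insteadof(rewrites: Dict[str, str]) -> str:
    """Render one [url "<gate-url>"] stanza per declared upstream.

    `rewrites` maps real upstream URL -> gate URL (hypothetical shape).
    """
    stanzas = []
    for real_url, gate_url in sorted(rewrites.items()):
        for value in (real_url, gate_url):
            # Keep the sketch simple: refuse values that would need
            # git-config quoting or escaping instead of trying to escape them.
            if any(ch in value for ch in '"\\\n#;'):
                raise ValueError(f"unsupported character in URL: {value!r}")
        stanzas.append(f'[url "{gate_url}"]\n\tinsteadOf = {real_url}\n')
    return "\n".join(stanzas)


def add_host_flags(extra_hosts: Dict[str, str]) -> List[str]:
    """Turn an ExtraHosts map into docker `--add-host hostname:ip` arguments."""
    flags: List[str] = []
    for hostname, ip in sorted(extra_hosts.items()):
        flags += ["--add-host", f"{hostname}:{ip}"]
    return flags
```

For example, render_insteadof({"ssh://git@example.com/org/repo.git": "git://git-gate/repo.git"}) yields a single stanza that sends every operation against the example.com URL to the gate, and add_host_flags({"gitea.internal": "100.64.0.7"}) yields the two arguments to append to the gate's docker create call.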
Out of scope
- Push policy beyond gitleaks. No commit-author allowlist, no branch-name policy, no signed-commit enforcement. gitleaks is the single rule for v1.
- Fetch caching / stale-while-revalidate. Every upload-pack refresh is a synchronous round-trip to the upstream; there is no TTL cache, no background refresh. If the upstream is slow, the agent's fetch is slow.
- Quarantine / replay. A rejected push is discarded; we do not stash it for the user to inspect.
- Non-Docker backends. Implementation lands for Docker only; the BottleBackend abstraction gains the hook but other backends are deferred.
- Bypass for trusted commits. No [skip gitleaks] trailer, no allowlist by commit hash. If the gate is bypassable it isn't a gate.
Proposed Design
New services / components
Mirror the existing sidecar layout:
- bot_bottle/git_gate.py (new): abstract GitGate + GitGatePlan dataclass. prepare is host-side / side-effect-free on docker; renders the per-upstream config and stages the push credentials under stage_dir.
- bot_bottle/backend/docker/git_gate.py (new): DockerGitGate concrete subclass. start does docker create on the internal network, copies in the bare-repo skeleton, the hook script, and per-upstream credentials, then docker start. stop is idempotent docker rm -f. Container name: bot-bottle-git-gate-<slug>.
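A rough sketch of the split described above, assuming the abstraction looks like the existing sidecars. Only the names GitGate, GitGatePlan, prepare, start, stop, stage_dir and the container-name pattern come from this PRD; the fields and method signatures are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict


@dataclass
class GitGatePlan:
    """Host-side description of one bottle's gate (hypothetical fields)."""

    slug: str
    stage_dir: Path
    # local repo name -> real upstream URL
    upstreams: Dict[str, str] = field(default_factory=dict)

    @property
    def container_name(self) -> str:
        # Naming pattern taken from the PRD: bot-bottle-git-gate-<slug>.
        return f"bot-bottle-git-gate-{self.slug}"


class GitGate(ABC):
    """Backend-neutral lifecycle; a Docker subclass would implement it."""

    @abstractmethod
    def prepare(self, slug: str, stage_dir: Path,
                upstreams: Dict[str, str]) -> GitGatePlan:
        """Render config and stage credentials without touching the runtime."""

    @abstractmethod
    def start(self, plan: GitGatePlan) -> None:
        """Create and start the sidecar described by `plan`."""

    @abstractmethod
    def stop(self, plan: GitGatePlan) -> None:
        """Remove the sidecar; must be safe to call more than once."""
```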
Gate image: git-daemon + openssh-client over a
zricethezav/gitleaks base (alpine + gitleaks), pinned by digest.
For each declared upstream the gate hosts a bare repo at
/git/<name>.git with remote.origin.url set to the real
upstream (via git remote add --mirror=fetch), hooks/pre-receive
wired to gitleaks-then-git push origin, and the bare repo's
config carrying per-upstream credential paths.
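The per-upstream repo setup can be expressed as a short command list. The sketch below assumes the credential is wired in through core.sshCommand and that the known-host key has been written to a file inside the gate. Both are illustrative choices, not something this PRD fixes, and the function and parameter names are hypothetical. Hook installation is left out.

```python
import shlex
from typing import List


def bare_repo_commands(name: str, upstream_url: str,
                       identity_file: str,
                       known_hosts_file: str) -> List[List[str]]:
    """Commands that create /git/<name>.git as a fetch mirror of the upstream."""
    repo = f"/git/{name}.git"
    # core.sshCommand is interpreted by the shell, so quote the paths.
    ssh_command = " ".join([
        "ssh",
        "-i", shlex.quote(identity_file),
        "-o", "IdentitiesOnly=yes",
        "-o", shlex.quote(f"UserKnownHostsFile={known_hosts_file}"),
    ])
    return [
        ["git", "init", "--bare", repo],
        ["git", "-C", repo, "remote", "add", "--mirror=fetch",
         "origin", upstream_url],
        # Assumption: the credential path lives in the repo's own config.
        ["git", "-C", repo, "config", "core.sshCommand", ssh_command],
    ]
```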
Inside the bottle, the agent's .gitconfig rewrites the real
upstream URL to the gate's git:// URL via insteadOf. Every
git operation against the declared upstream therefore hits the
gate.
For pushes, the pre-receive hook gitleaks-scans the incoming refs and, on clean, pushes each accepted ref to the real upstream using the credential the gate holds.
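A sketch of the hook's decision logic, assuming it is written in Python (the PRD does not say). git feeds pre-receive one "<old> <new> <ref>" line per pushed ref on stdin, and the refs are not yet updated when the hook runs, which is why the forward step here pushes the new object name rather than the ref name. The gitleaks flags are those of the v8 detect command and should be checked against the pinned image. All function names are hypothetical, and ref deletions are left to the caller because this PRD does not cover them.

```python
from typing import Iterable, List, NamedTuple, Optional


class RefUpdate(NamedTuple):
    old: str
    new: str
    ref: str


def is_zero(oid: str) -> bool:
    """True for the all-zero object name git uses to mean 'no object'."""
    return bool(oid) and set(oid) == {"0"}


def parse_updates(lines: Iterable[str]) -> List[RefUpdate]:
    """Parse pre-receive stdin: one '<old> <new> <ref>' line per pushed ref."""
    updates = []
    for line in lines:
        line = line.strip()
        if line:
            old, new, ref = line.split(" ", 2)
            updates.append(RefUpdate(old, new, ref))
    return updates


def scan_range(update: RefUpdate) -> Optional[List[str]]:
    """git-log revision arguments covering the commits this push would add."""
    if is_zero(update.new):
        return None  # deletion: nothing new to scan
    if is_zero(update.old):
        # New ref: everything reachable from it that no existing ref has.
        return [update.new, "--not", "--all"]
    return [f"{update.old}..{update.new}"]


def gitleaks_command(revs: List[str]) -> List[str]:
    """Assumed gitleaks v8 invocation; --redact hides the matched bytes."""
    return ["gitleaks", "detect", "--source", ".",
            "--log-opts", " ".join(revs), "--redact"]


def forward_command(update: RefUpdate) -> List[str]:
    """Push the accepted object to the same ref on the real upstream."""
    return ["git", "push", "origin", f"{update.new}:{update.ref}"]
```

A hook built on this would run gitleaks_command for every update first, exit non-zero on any finding, and only then run forward_command for each accepted ref.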
For fetches (clone, pull, fetch, ls-remote), git daemon's
--access-hook=<path> runs git fetch origin --prune against
the real upstream before the upload-pack service serves the
client. The bare repo therefore reflects the upstream's current
state at the moment the agent's fetch begins; if the upstream
is unreachable, the access hook exits non-zero and the agent's
fetch fails — same observable behavior as if the agent were
talking to the upstream directly.
The agent never sees the upstream credential under either operation.
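The fail-closed refresh can be sketched as follows, again assuming a Python hook. git daemon calls the access hook with the service name and the repository path as its first two arguments, and a non-zero exit declines the request. The function name and the injectable run parameter (used here so the logic can be exercised without a real upstream) are hypothetical.

```python
import subprocess
from typing import Callable


def access_hook(service: str, repo_path: str,
                run: Callable = subprocess.run) -> int:
    """Return the hook's exit status: 0 to serve the request, 1 to decline."""
    if service != "upload-pack":
        # Only fetch-style requests need a refresh; pushes are handled
        # by the pre-receive hook.
        return 0
    try:
        # Keep stdout quiet: on decline, git daemon may relay a line of the
        # hook's stdout to the client as the error message.
        result = run(
            ["git", "-C", repo_path, "fetch", "origin", "--prune"],
            stdout=subprocess.DEVNULL,
        )
    except OSError:
        return 1  # could not even run git: fail closed
    return 0 if result.returncode == 0 else 1
```

The wrapper script would call sys.exit(access_hook(sys.argv[1], sys.argv[2])). No timeout is applied, consistent with the out-of-scope note that a slow upstream simply makes the agent's fetch slow.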
Existing code touched
- bot_bottle/manifest.py: parse and validate the new bottle.git block; reject bottle.ssh entries whose upstream is also claimed by a bottle.git upstream (one path per remote, no shadow route).
- bot_bottle/backend/docker/provision/git.py (new) or an extension of the ssh provisioner: render the insteadOf config and any extra ~/.gitconfig plumbing.
- bot_bottle/backend/docker/backend.py: instantiate DockerGitGate alongside DockerPipelockProxy and DockerSSHGate; thread its prepare / start / stop through resolve_plan / launch.
- bot_bottle/backend/docker/launch.py: add gate start / stop to the ExitStack so the gate is up before any provisioner that writes the agent's ~/.gitconfig.
- bot_bottle/backend/docker/bottle_plan.py: new GitGatePlan field on DockerBottlePlan; preflight rendering surfaces the gate sidecar (name, per-upstream local paths, upstream real URLs, which credential is in use).
- Tests: unit tests for GitGate.prepare and render shape; manifest validator tests for the new field and the no-shadow-route rule; an integration test in tests/integration/ for the push-with-secret (rejected) and push-without-secret (forwarded) cases.
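One way the no-shadow-route rule could be checked, as a sketch only. It assumes the comparison is by hostname and that bottle.ssh entries can be reduced to a list of hostnames; neither is specified here, and both function names are hypothetical.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def upstream_host(url: str) -> str:
    """Hostname of a git upstream given as a URL or scp-like user@host:path."""
    if "://" in url:
        host = urlsplit(url).hostname
    elif ":" in url:
        # scp-like syntax: [user@]host:path
        host = url.split(":", 1)[0].rsplit("@", 1)[-1]
    else:
        host = None
    if not host:
        raise ValueError(f"cannot determine host for upstream: {url!r}")
    return host.lower()


def find_shadow_routes(ssh_hosts: Iterable[str],
                       git_upstreams: Iterable[str]) -> List[str]:
    """Hosts reachable both directly over ssh and through the git gate."""
    claimed = {upstream_host(u) for u in git_upstreams}
    return sorted({h.lower() for h in ssh_hosts} & claimed)
```

A validator would reject the manifest whenever find_shadow_routes returns a non-empty list, naming the offending hosts in the error.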
Data model changes
Bottle grows an optional git: list[GitEntry] field. A
GitEntry carries the upstream URL, the local name the gate
exposes it as, and the credential the gate uses to push upstream
(initial shape: identity_file + known_host_key, matching
bottle.ssh).
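A possible shape for the entry, as a sketch. The attribute names other than identity_file and known_host_key, the name pattern, and the gate_repo_path helper are hypothetical. The name is validated because it becomes a path component under /git/.

```python
import re
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical rule: the name ends up in /git/<name>.git, so keep it to a
# single safe path component.
_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")


@dataclass
class GitEntry:
    upstream: str          # real upstream URL
    name: str              # local name the gate exposes the repo as
    identity_file: str     # credential the gate uses to push upstream
    known_host_key: str    # pinned host key for the upstream
    extra_hosts: Dict[str, str] = field(default_factory=dict)  # hostname -> ip

    def __post_init__(self) -> None:
        if not self.upstream:
            raise ValueError("upstream URL must not be empty")
        if not _NAME_RE.match(self.name):
            raise ValueError(f"invalid gate repo name: {self.name!r}")

    @property
    def gate_repo_path(self) -> str:
        return f"/git/{self.name}.git"
```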
External dependencies
- zricethezav/gitleaks base image, pinned by digest. The base ships gitleaks + git; the gate Dockerfile adds git-daemon and openssh-client on top.
- No new Python packages.
Future work
- Smolmachines colocation. The eventual smolmachines backend
may pack pipelock + ssh-gate + git-gate into a single microVM,
or split git-gate off because it holds push creds and the
others don't. That decision belongs to the backend; the shared
BottleBackend interface keeps sidecars independent so either packing is possible without touching this PRD's design.
Open questions
- Protocol on the agent → gate hop: SSH (sshd + git-shell inside the gate) or HTTP smart protocol (git-http-backend behind a tiny webserver)? SSH matches the existing ssh-gate patterns and the user's existing ~/.ssh muscle memory; HTTP is lighter on image size and avoids an authorized_keys story. Default: SSH unless image size becomes a problem.
- Where gitleaks runs: pre-receive hook against a checkout of the incoming ref vs. a wrapper around git-receive-pack that inspects the pack file directly. Hook is canonical; defer the wrapper variant.
- Rejection signalling: gitleaks failures surface as a normal pre-receive reject (the user sees gitleaks's report on stderr). Worth a "redacted" mode that hides the matched bytes from the rejection message? Default: show file + line, hide the matched bytes.
- Credential reuse vs. duplication from bottle.ssh. If a user lists the same identity for ssh-gate (read) and git-gate (write), we can either reference by name or require two copies. Default: inline copies; revisit when it gets annoying.
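For the rejection-signalling question, the "show file + line, hide the matched bytes" default could be built on gitleaks's JSON report. The sketch below assumes the report is a JSON array whose objects carry File, StartLine, RuleID, Match and Secret keys, as gitleaks v8 emits. Verify this against the pinned version. The function name is hypothetical.

```python
import json
from typing import List


def summarize_findings(report_json: str) -> List[str]:
    """One 'file:line [rule]' line per finding, with no secret material."""
    lines = []
    for finding in json.loads(report_json):
        # Deliberately ignore the Match and Secret fields.
        file = finding.get("File", "?")
        line = finding.get("StartLine", "?")
        rule = finding.get("RuleID", "unknown-rule")
        lines.append(f"{file}:{line} [{rule}]")
    return lines
```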
References
- PRD 0001: per-agent egress proxy via pipelock — sidecar pattern this PRD reuses.
- PRD 0007: SSH egress gate — the L4 SSH forwarder this PRD sits alongside; explicitly not the place to add git-protocol awareness.
- bot_bottle/ssh_gate.py / bot_bottle/pipelock.py: existing sidecar abstractions to mirror.
- gitleaks: https://github.com/gitleaks/gitleaks