# PRD 0048: SSH Deploy-Key Provisioning
- **Status:** Active
- **Author:** didericis-claude
- **Created:** 2026-06-03
- **Issue:** #169
## Summary
Replace per-repo static SSH identity files with short-lived ed25519 deploy
keys that are generated at spin-up and revoked at teardown. Introduce
`bot_bottle/contrib/` as the package for platform-specific provisioners and
ship the first contrib sub-package: `bot_bottle/contrib/gitea/` with
`GiteaDeployKeyProvisioner`. A new `provisioned_key:` block in `git-gate.repos`
entries opts a repo into automatic key lifecycle management; `identity:` stays
valid for operators who supply their own key material.
## Problem
The current `git-gate.repos` entries require an `identity:` field pointing to
a host-side SSH private key (PRD 0047). Keys are static: the operator generates
them once, registers them with the upstream forge, and the same key is reused
across every bottle spin-up. This has several consequences:
- **No automatic revocation.** If a bottle misbehaves or a key leaks, the
operator must notice and manually delete the key from the forge. There is no
teardown hook that does it.
- **Broad blast radius.** A forge deploy key typically grants write access for
the lifetime of the key. A static key that survives bottle teardown continues
to grant that access.
- **Manual rotation burden.** Operators must manage key files on disk, keeping
them secure, rotating them on a schedule, and distributing them across hosts
that run `./cli.py start`.
## Goals / Success Criteria
- `git-gate.repos` entries accept `provisioned_key:` as an alternative to
`identity:`. The parser rejects entries that have both, or neither.
- `provisioned_key.provider: gitea` provisions and revokes deploy keys via the
Gitea HTTP API.
- At prepare time the provisioner generates a fresh ed25519 keypair, registers
the public half as a repo-scoped deploy key, and makes the private key
available to git-gate at the path it expects — the rest of the pipeline is
unchanged.
- At teardown the provisioner deletes the registered deploy key. Failure to
delete halts teardown and propagates the error loudly.
- `bot_bottle/contrib/` is introduced as the package for platform-specific
implementations; the core defines the abstract interface; contrib sub-packages
provide concrete implementations.
- Existing `identity:`-based repos continue to work without change.
- The unit test suite passes unchanged for `identity:` paths; new tests cover
`provisioned_key:` parse, validation, and provisioner dispatch.
## Non-goals
- GitHub, GitLab, or other forge providers (a future contrib sub-package each).
- Dashboard UI for listing or revoking orphaned deploy keys.
- SSH CA certificate approach (rejected in the issue thread in favour of
per-repo deploy keys for simpler revocation, smaller blast radius, and forge
compatibility).
- Key rotation mid-session (keys live for exactly one spin-up / teardown cycle).
- Any change to how `identity:` repos are provisioned.
## Design
### Manifest changes (builds on PRD 0047)
`git-gate.repos.<name>` currently accepts exactly:
```
url (required string)
identity (required string)
host_key (optional string)
```
After this PRD:
```
url (required string)
identity (optional string — mutually exclusive with provisioned_key)
provisioned_key (optional object — mutually exclusive with identity)
host_key (optional string)
```
Exactly one of `identity` or `provisioned_key` must be present. The parser
emits a targeted error for each violation:
```
bottle 'dev' git-gate.repos['bot-bottle'] must set exactly one of
'identity' or 'provisioned_key'; got neither.
bottle 'dev' git-gate.repos['bot-bottle'] must set exactly one of
'identity' or 'provisioned_key'; got both.
```
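The exactly-one-of rule could be enforced with a check along these lines. This is a minimal sketch, not the parser's actual code: `ManifestError` is a local stand-in for the project's exception, and `check_key_source` is a hypothetical helper name.

```python
class ManifestError(ValueError):
    """Local stand-in for the project's ManifestError (assumption)."""


def check_key_source(bottle: str, repo: str, entry: dict) -> str:
    """Return which key source the entry uses, or raise ManifestError.

    Hypothetical helper; the real parser may structure this differently.
    """
    has_identity = "identity" in entry
    has_provisioned = "provisioned_key" in entry
    # Exactly one of the two keys is present when the flags differ.
    if has_identity != has_provisioned:
        return "identity" if has_identity else "provisioned_key"
    got = "both" if has_identity else "neither"
    raise ManifestError(
        f"bottle {bottle!r} git-gate.repos[{repo!r}] must set exactly one of "
        f"'identity' or 'provisioned_key'; got {got}."
    )
```

Returning the name of the chosen source lets the caller branch once on the result instead of re-inspecting the entry.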
`provisioned_key` object schema:
```yaml
provisioned_key:
  provider: gitea          # required; names the contrib module to load
  token_env: GITEA_TOKEN   # required; name of a host env var holding the API token
  api_url: https://...     # optional; defaults to https://<host from url>
```
| Field | Type | Notes |
|-------|------|-------|
| `provider` | required string | Must match a sub-package under `bot_bottle/contrib/` |
| `token_env` | required string | Resolved at provision time via `os.environ`; never stored in plan |
| `api_url` | optional string | Override when the API endpoint differs from the git host |

**Example bottle manifest:**
```yaml
git-gate:
  user:
    name: implementer-bot
    email: eric+implementer@dideric.is
  repos:
    bot-bottle:
      url: ssh://git@gitea.dideric.is:30009/didericis/bot-bottle.git
      provisioned_key:
        provider: gitea
        token_env: GITEA_DEPLOY_TOKEN
      host_key: "ssh-rsa AAAA..."
```
### `contrib` package structure
```
bot_bottle/
  contrib/
    __init__.py          # empty; no core symbols
    gitea/
      __init__.py        # empty
      deploy_key_provisioner.py
```
`contrib` is a flat namespace of forge/platform sub-packages. Each sub-package
is self-contained; the core imports from contrib lazily (inside factory
functions) so that missing optional dependencies in a contrib sub-package don't
break unrelated features.
### Core interface
New file: `bot_bottle/deploy_key_provisioner.py`
```python
from abc import ABC, abstractmethod


class DeployKeyProvisioner(ABC):
    @abstractmethod
    def create(self, owner_repo: str, title: str) -> tuple[str, bytes]:
        """Generate a keypair and register the public half.

        owner_repo: '<owner>/<repo>' portion of the git upstream URL.
        title: human-readable label shown in the forge key list.

        Returns (key_id, private_key_pem) where key_id is opaque to
        the caller and is only passed back to delete()."""

    @abstractmethod
    def delete(self, owner_repo: str, key_id: str) -> None:
        """Delete the registered deploy key.

        Must not raise if the key is already absent (HTTP 404 is success).
        Must raise for all other failures so that teardown halts."""


def get_provisioner(provider: str, token: str, api_url: str) -> DeployKeyProvisioner:
    """Instantiate the named contrib provisioner.

    Raises ManifestError for unknown providers so the error is caught
    at parse time rather than at runtime."""
    if provider == "gitea":
        # Imported lazily so a contrib package's optional dependencies
        # cannot break unrelated features.
        from bot_bottle.contrib.gitea.deploy_key_provisioner import (
            GiteaDeployKeyProvisioner,
        )
        return GiteaDeployKeyProvisioner(token=token, api_url=api_url)
    from .manifest_util import ManifestError
    raise ManifestError(f"unknown provisioned_key provider: {provider!r}")
```
### Gitea contrib implementation
`bot_bottle/contrib/gitea/deploy_key_provisioner.py`:

`create(owner_repo, title)`:
1. Generate an ed25519 keypair via `ssh-keygen -t ed25519 -f <tmpfile> -N ''`
(uses the SSH tooling already required by git-gate; no new Python dependency).
2. Read the private key bytes and the `.pub` file.
3. `POST /api/v1/repos/{owner}/{repo}/keys` with the public key, `title`, and
`read_only: false` (deploy keys always need push access for git-gate).
4. Return `(str(response["id"]), private_key_bytes)`.
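The `create` steps above might look roughly like the sketch below. It is not the shipped implementation: the function names are hypothetical, and the `Authorization: token ...` header is an assumption, since the PRD does not say how the token is sent.

```python
import json
import subprocess
import tempfile
import urllib.request
from pathlib import Path


def generate_ed25519_keypair() -> tuple[bytes, str]:
    """Run ssh-keygen in a scratch directory; return (private_bytes, public_line)."""
    with tempfile.TemporaryDirectory() as tmp:
        # The target file must not exist yet, otherwise ssh-keygen prompts.
        key_path = Path(tmp) / "deploy-key"
        subprocess.run(
            ["ssh-keygen", "-q", "-t", "ed25519", "-f", str(key_path), "-N", ""],
            check=True,
            capture_output=True,
        )
        private_bytes = key_path.read_bytes()
        public_line = (Path(tmp) / "deploy-key.pub").read_text().strip()
    return private_bytes, public_line


def build_create_request(api_url: str, token: str, owner_repo: str,
                         title: str, public_key: str) -> urllib.request.Request:
    """Build the POST that registers the public key as a writable deploy key."""
    body = json.dumps({"title": title, "key": public_key, "read_only": False})
    return urllib.request.Request(
        f"{api_url.rstrip('/')}/api/v1/repos/{owner_repo}/keys",
        data=body.encode("utf-8"),
        # Auth header scheme is an assumption, not stated in the PRD.
        headers={"Authorization": f"token {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def create_deploy_key(api_url: str, token: str, owner_repo: str,
                      title: str) -> tuple[str, bytes]:
    """Generate a keypair, register it, and return (key_id, private_key_bytes)."""
    private_bytes, public_line = generate_ed25519_keypair()
    request = build_create_request(api_url, token, owner_repo, title, public_line)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return str(payload["id"]), private_bytes
```

Building the request in its own function keeps the endpoint and payload testable without running `ssh-keygen` or touching the network.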

`delete(owner_repo, key_id)`:
1. `DELETE /api/v1/repos/{owner}/{repo}/keys/{id}`.
2. Treat HTTP 404 as success (key already gone).
3. Raise `RuntimeError` for any other non-2xx response or network error,
including the status code and response body in the message.

HTTP calls use `urllib.request` from the stdlib; no new runtime dependency.
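The `delete` contract (404 is success, everything else raises) can be sketched with `urllib` as follows. The function name and the injectable `opener` parameter are illustrative assumptions, as is the `Authorization: token ...` header.

```python
import urllib.error
import urllib.request


def delete_deploy_key(api_url: str, token: str, owner_repo: str, key_id: str,
                      opener=urllib.request.urlopen) -> None:
    """Delete a deploy key; treat 404 as success, raise RuntimeError otherwise."""
    request = urllib.request.Request(
        f"{api_url.rstrip('/')}/api/v1/repos/{owner_repo}/keys/{key_id}",
        headers={"Authorization": f"token {token}"},  # scheme is an assumption
        method="DELETE",
    )
    try:
        # urlopen raises HTTPError for 4xx/5xx, so reaching the body means success.
        with opener(request):
            return
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return  # key already gone
        body = exc.read().decode("utf-8", errors="replace")
        raise RuntimeError(
            f"deploy key delete failed: HTTP {exc.code}: {body}"
        ) from exc
    except urllib.error.URLError as exc:
        # Network-level failure (DNS, refused connection, TLS, ...).
        raise RuntimeError(f"deploy key delete failed: {exc.reason}") from exc
```

`HTTPError` is a subclass of `URLError`, so it has to be caught first for the 404 branch to be reachable.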
### `GitEntry` dataclass changes
`bot_bottle/manifest_git.py`:
- Add `ProvisionedKeyConfig` dataclass:
  ```python
  @dataclass(frozen=True)
  class ProvisionedKeyConfig:
      provider: str
      token_env: str
      api_url: str  # empty string means "derive from UpstreamHost"
  ```
- `GitEntry`:
  - `IdentityFile: str` unchanged internally; empty string when
    `provisioned_key` is used; set at provision time, not parse time.
  - New field: `ProvisionedKey: ProvisionedKeyConfig | None = None`
- `from_repos_entry` validates the mutually-exclusive constraint and parses
  the `provisioned_key` block when present.
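Parsing the `provisioned_key` block might look like the sketch below. It assumes the raw manifest value arrives as a plain `dict`; `parse_provisioned_key` is a hypothetical helper, and `ManifestError` is a local stand-in for the project's exception.

```python
from dataclasses import dataclass


class ManifestError(ValueError):
    """Local stand-in for the project's ManifestError (assumption)."""


@dataclass(frozen=True)
class ProvisionedKeyConfig:
    provider: str
    token_env: str
    api_url: str  # empty string means "derive from UpstreamHost"


_ALLOWED = {"provider", "token_env", "api_url"}
_REQUIRED = ("provider", "token_env")


def parse_provisioned_key(raw: object) -> ProvisionedKeyConfig:
    """Validate a raw provisioned_key mapping (hypothetical helper)."""
    if not isinstance(raw, dict):
        raise ManifestError("provisioned_key must be a mapping")
    unknown = sorted(str(key) for key in set(raw) - _ALLOWED)
    if unknown:
        raise ManifestError(f"provisioned_key has unknown key(s): {unknown}")
    for name in _REQUIRED:
        value = raw.get(name)
        if not isinstance(value, str) or not value:
            raise ManifestError(
                f"provisioned_key.{name} is required and must be a non-empty string"
            )
    api_url = raw.get("api_url", "")
    if not isinstance(api_url, str):
        raise ManifestError("provisioned_key.api_url must be a string")
    return ProvisionedKeyConfig(raw["provider"], raw["token_env"], api_url)
```

Note that only the env var's name is stored. The token itself is never read at parse time, which matches the "never stored in plan" requirement.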
### `GitGateUpstream` / prepare-time changes
`bot_bottle/git_gate.py` and `bot_bottle/backend/docker/provision/git.py`:

The existing path writes the identity file path into `GitGateUpstream.IdentityFile`
and copies the file into `/git-gate/creds/<name>-key` with `docker cp`. That path
stays unchanged for `identity:` repos.

For `provisioned_key:` repos, a new helper `provision_deploy_key(entry,
stage_dir, bottle_name)` runs before the git-gate sidecar starts:
1. Resolve `token = os.environ[entry.ProvisionedKey.token_env]`. A missing
environment variable raises `RuntimeError` with a message that names the variable.
2. Resolve `api_url = entry.ProvisionedKey.api_url or f"https://{entry.UpstreamHost}"`.
3. Instantiate `get_provisioner(entry.ProvisionedKey.provider, token, api_url)`.
4. Call `provisioner.create(owner_repo, title)` where
`title = f"bot-bottle:{bottle_name}:{entry.Name}"`.
5. Write private key to `stage_dir / f"{entry.Name}-key"` (mode 0o600).
6. Write key ID to `stage_dir / f"{entry.Name}-deploy-key-id"` (plain text).
7. Return the key file path; caller sets `GitGateUpstream.IdentityFile` to it.

`owner_repo` is extracted from `entry.UpstreamPath` (the path component of the
`ssh://` URL) by removing the leading `/` and the trailing `.git`, e.g.
`/didericis/bot-bottle.git` → `didericis/bot-bottle`.
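The staging steps can be sketched as below. This is an illustration under stated assumptions rather than the real `provision_deploy_key`: it takes plain strings instead of a `GitEntry`, and `resolve_token`, `owner_repo_from_path`, and `stage_deploy_key` are hypothetical names. `provisioner` is any object with the `create()` method from the core interface.

```python
import os
from pathlib import Path


def resolve_token(token_env: str) -> str:
    """Read the API token from the host environment, naming the var on failure."""
    try:
        return os.environ[token_env]
    except KeyError:
        raise RuntimeError(
            f"environment variable {token_env!r} is not set; "
            "it must hold the forge API token"
        ) from None


def owner_repo_from_path(upstream_path: str) -> str:
    """Map '/didericis/bot-bottle.git' to 'didericis/bot-bottle'."""
    return upstream_path.strip("/").removesuffix(".git")


def stage_deploy_key(provisioner, stage_dir: Path, bottle_name: str,
                     repo_name: str, upstream_path: str) -> Path:
    """Create a deploy key and write the key file and key-ID file to stage_dir."""
    title = f"bot-bottle:{bottle_name}:{repo_name}"
    key_id, private_key = provisioner.create(
        owner_repo_from_path(upstream_path), title
    )

    key_path = stage_dir / f"{repo_name}-key"
    # Request mode 0o600 at creation so a new key file is never
    # group- or world-readable, even briefly.
    fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(private_key)

    # The key ID is what teardown later passes back to delete().
    (stage_dir / f"{repo_name}-deploy-key-id").write_text(key_id)
    return key_path
```

Writing the key ID to disk rather than keeping it in memory is what lets a later teardown process revoke a key that a different process created.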
### Teardown changes
`bot_bottle/backend/docker/cleanup.py` (or the equivalent teardown path):
After the git-gate sidecar stops, for each `GitEntry` with `ProvisionedKey`
set:
1. Check that `stage_dir / f"{entry.Name}-deploy-key-id"` exists; skip if
absent (provision never ran or already cleaned up).
2. Resolve token and API URL as above.
3. Instantiate the provisioner and call `provisioner.delete(owner_repo, key_id)`,
where `key_id` is the content of the key-ID file from step 1.
4. On success, log at INFO. On failure, allow the exception to propagate —
teardown halts and the error surfaces to the operator.

A stranded deploy key is a security concern: the operator must know about it
and address it manually. Silent continuation is not acceptable.

The private key file in `stage_dir` is cleaned up as part of normal stage-dir
teardown (no extra step needed).
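The per-repo teardown step might be sketched as follows. `revoke_deploy_key` is a hypothetical name, and `provisioner` is any object with the `delete()` method from the core interface. The important property is that there is no `try`/`except` around `delete()`.

```python
import logging
from pathlib import Path

log = logging.getLogger(__name__)


def revoke_deploy_key(provisioner, stage_dir: Path, repo_name: str,
                      owner_repo: str) -> bool:
    """Delete the staged deploy key; return False if none was provisioned."""
    id_file = stage_dir / f"{repo_name}-deploy-key-id"
    if not id_file.exists():
        # Provisioning never ran, or the stage dir was already cleaned up.
        return False
    key_id = id_file.read_text().strip()
    # Deliberately unguarded: any failure propagates and halts teardown.
    provisioner.delete(owner_repo, key_id)
    log.info("revoked deploy key %s for %s", key_id, owner_repo)
    return True
```

Because the key-ID file is left in place on failure, an operator who fixes the underlying problem can rerun teardown and retry the revocation.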
## Testing strategy
```
python3 -m unittest discover -s tests/unit
```
New / modified test files:
- `tests/unit/test_manifest_git.py`, add cases for:
  - `provisioned_key:` accepted with valid `provider`, `token_env`, optional `api_url`
  - Both `identity` and `provisioned_key` present → `ManifestError`
  - Neither `identity` nor `provisioned_key` present → `ManifestError`
  - Unknown key inside `provisioned_key` block → `ManifestError`
  - Missing `provider` or `token_env` inside `provisioned_key` → `ManifestError`
- `tests/unit/test_deploy_key_provisioner.py` (new):
  - `get_provisioner("gitea", ...)` returns `GiteaDeployKeyProvisioner`
  - `get_provisioner("unknown", ...)` raises `ManifestError`
- `tests/unit/test_contrib_gitea_deploy_key.py` (new, using `unittest.mock`
  to stub `urllib.request.urlopen` and `subprocess.run`):
  - `create()` calls `ssh-keygen`, POSTs to correct endpoint, returns key ID
  - `delete()` DELETEs to correct endpoint
  - `delete()` tolerates HTTP 404 (already-deleted key)
  - `delete()` raises `RuntimeError` on non-404 HTTP error
## Open questions
None.