diff --git a/docs/prds/0046-ssh-config-provisioning.md b/docs/prds/0046-ssh-config-provisioning.md
new file mode 100644
index 0000000..83850ef
--- /dev/null
+++ b/docs/prds/0046-ssh-config-provisioning.md
@@ -0,0 +1,230 @@
+# PRD 0046: SSH Config Provisioning
+
+- **Status:** Draft
+- **Author:** didericis-codex
+- **Created:** 2026-06-02
+- **Issue:** #150
+
+## Summary
+
+Add top-level bottle SSH support so operators can provide approved
+`known_hosts` lines and `Host` stanzas to the agent and git environment.
+At the same time, simplify `git.remotes` to a logical name-to-upstream-URL map.
+This lets SSH remotes that rely on host aliases work inside a bottle while
+HTTP/API traffic continues to resolve through the declared egress route.
+
+## Problem
+
+`git.remotes` currently mixes repository identity, SSH key material, known host
+material, and optional host overrides in one block. `ExtraHosts` is especially
+awkward: it is a hosts-file override, but the Gitea remote case really needs SSH
+client config and host key verification.
+
+The Gitea remote case needs the bottle to carry operator-approved SSH client
+config into the agent/git environment. Local SSH config can make
+`git@gitea:didericis/bot-bottle.git` resolve to the same endpoint as
+`ssh://git@gitea.dideric.is:30009/didericis/bot-bottle.git`, but encoding that
+as `ExtraHosts` can affect Docker `extra_hosts`, sidecar hosts config, agent
+hosts config, and egress DNS. That coupling can break HTTP/API access because
+the egress path sees the public hostname resolving to an internal address.
+
+Known host keys have the same ownership problem. They are SSH client trust
+material, not repository metadata. They should be declared under `ssh`, rendered
+to the default SSH known-hosts file, and kept independent from egress DNS and
+HTTP/API behavior.
+
+## Goals / Success Criteria
+
+- The manifest parser accepts a simplified `git.remotes` mapping from logical
+  remote name to upstream URL.
+- The manifest parser accepts a top-level `ssh.known_hosts` list of OpenSSH
+  `known_hosts` lines.
+- The manifest parser accepts a top-level `ssh.config` list.
+- Each config entry supports at least `Host`, `Hostname`, `Port`, `User`, and
+  `IdentityFile`.
+- `IdentityFile` values are host-side paths, and provisioning stages or mounts
+  the referenced key material through the existing key-handling path.
+- Generated in-bottle SSH config rewrites `IdentityFile` to the staged
+  in-bottle key path.
+- Private key contents are never printed, logged, committed, or inlined into
+  the manifest or generated config outside the intended staged key file.
+- Known host lines are rendered to the default SSH known-hosts file for the
+  agent user, normally `~/.ssh/known_hosts`.
+- Git operations using SSH host aliases, such as
+  `git@gitea:didericis/bot-bottle.git`, work because SSH sees the provisioned
+  `Host gitea` stanza.
+- `ssh.config` entries do not alter Docker `extra_hosts`, sidecar hosts config,
+  agent hosts config, or egress route DNS.
+- `ExtraHosts`, `IdentityFile`, and known-host fields are removed from the
+  `git.remotes` target model; after this PRD, `git.remotes` carries only remote
+  names and upstream URLs.
+- Documentation distinguishes SSH client config, git `insteadOf` rewrites, and
+  egress DNS/HTTP policy.
+
+## Non-goals
+
+- No gitconfig-only alias feature.
+- No hosts-file override replacement. This PRD removes `ExtraHosts` from the
+  target git remote schema instead of adding a new hosts override elsewhere.
+- No automatic import of the operator's full host `~/.ssh/config`.
+- No SSH config support for arbitrary OpenSSH directives beyond the fields
+  listed in this PRD.
+- No private key material in manifests, logs, PRDs, tests, or generated
+  non-key config files.
+- No changes to HTTP/API egress auth or DNS routing semantics.
+
+## Scope
+
+In scope:
+
+- Change the target manifest model for `git.remotes` to
+  `name: upstream-url`.
+- Remove `ExtraHosts`, `IdentityFile`, and embedded known-host fields from the
+  git remote target schema.
+- Add manifest model and schema support for top-level `ssh.known_hosts`.
+- Add manifest model and schema support for top-level `ssh.config`.
+- Validate known-host entries as non-empty strings.
+- Validate required SSH config fields and reject malformed entries with clear
+  manifest errors.
+- Add a shared provisioning plan for staged SSH config and referenced identity
+  files, plus rendered known-hosts files.
+- Apply the provisioning plan in both Docker and smolmachines agent/git
+  environments.
+- Update focused unit tests for parsing, rendered SSH config, key path
+  rewriting, known-host rendering, and hosts/DNS isolation.
+- Update user documentation and examples.
+
+Out of scope:
+
+- Integration tests that require a live SSH server.
+- Reworking git-gate gitleaks scanning.
+- Supporting `Include`, `Match`, `ProxyCommand`, `CertificateFile`, or other
+  advanced SSH config directives.
+- Per-command SSH config injection for tools outside the bottle's provisioned
+  environment.
+
+## Design
+
+Add a top-level `ssh` manifest block:
+
+```yaml
+git:
+  remotes:
+    bot-bottle: ssh://git@100.78.141.42:30009/didericis/bot-bottle.git
+
+ssh:
+  known_hosts:
+    - "[100.78.141.42]:30009 ssh-rsa ..."
+  config:
+    - Host: gitea
+      Hostname: 100.78.141.42
+      Port: 30009
+      User: git
+      IdentityFile: ~/.ssh/gitea-delos-2.pem
+    - Host: gitea.dideric.is
+      Hostname: 100.78.141.42
+      Port: 30009
+      User: git
+      IdentityFile: ~/.ssh/gitea-delos-2.pem
+```
+
+Represent remotes as the existing `GitEntry` equivalent, but with only:
+
+```python
+@dataclass(frozen=True)
+class GitEntry:
+    Name: str
+    Upstream: str
+```
+
+Represent each entry with a small manifest dataclass, for example:
+
+```python
+@dataclass(frozen=True)
+class SshConfig:
+    known_hosts: tuple[str, ...]
+    config: tuple[SshConfigEntry, ...]
+
+@dataclass(frozen=True)
+class SshConfigEntry:
+    Host: str
+    Hostname: str
+    Port: int
+    User: str
+    IdentityFile: str
+```
+
+`BottleManifest` should expose parsed SSH data on the bottle object, similar to
+existing `git`, `env`, and `egress` accessors. Parser validation should require
+non-empty strings for each `known_hosts` entry and for `Host`, `Hostname`,
+`User`, and `IdentityFile`; `Port` must be an integer in the valid TCP port
+range.
+
+### Provisioning
+
+Build an SSH provisioning plan during backend prepare. The plan should:
+
+1. Expand each host-side `IdentityFile` path using the existing tilde and
+   key-file validation behavior.
+2. Stage or mount each referenced key through the same private-key handling path
+   used for git remotes.
+3. Assign each staged key a stable in-bottle path with private file
+   permissions.
+4. Render an OpenSSH-compatible known-hosts file from `ssh.known_hosts`.
+5. Render an OpenSSH-compatible config file where each `IdentityFile` points to
+   the staged in-bottle key path.
+6. Install the rendered files where agent and git commands will use them by
+   default: normally `~/.ssh/known_hosts` and `~/.ssh/config` for the `node`
+   user.
+
+The rendered config should contain only SSH directives and staged key paths. The
+rendered known-hosts file should contain only the declared known-host lines.
+Neither file may contain private key contents, host-side private key paths, or
+secret-derived material.
+
+### Isolation from Hosts and Egress
+
+`ssh.config` is SSH client configuration only. It must not be translated into:
+
+- Docker `extra_hosts`.
+- Sidecar hosts config.
+- Agent hosts config.
+- Egress route DNS or auth config.
+
+`ssh.known_hosts` is SSH trust material only. It must not be translated into
+hosts-file mappings, egress allowlists, or HTTP/API trust.
+
+Git remotes continue to use `insteadOf` rewrites for git-gate routing. The git
+remote manifest block should only answer "which logical repo names are routed
+to which upstream URLs." SSH config and known-host verification live under
+`ssh`; egress DNS and HTTP/API behavior continue to live under `egress`.
+
+## Testing Strategy
+
+- Unit-test manifest parsing for the simplified `git.remotes` mapping.
+- Unit-test manifest parsing for valid `ssh.known_hosts` and `ssh.config`
+  entries.
+- Unit-test parser errors for malformed `git.remotes` values.
+- Unit-test parser errors for empty known-host entries.
+- Unit-test parser errors for missing fields, empty string fields, non-integer
+  ports, and out-of-range ports.
+- Unit-test known-host rendering to prove declared lines are emitted exactly
+  once into the planned known-hosts file.
+- Unit-test SSH config rendering to prove `Host`, `Hostname`, `Port`, `User`,
+  and rewritten `IdentityFile` lines are emitted correctly.
+- Unit-test duplicate `IdentityFile` handling so repeated keys are staged once
+  or otherwise handled deterministically.
+- Unit-test Docker and smolmachines provisioning plans install the same
+  logical known-hosts file, SSH config, and staged key paths.
+- Unit-test that `ssh.config` entries do not appear in Docker `extra_hosts`,
+  sidecar hosts config, agent hosts config, or egress route config.
+- Unit-test documentation examples through the existing manifest loader where
+  practical.
+
+Run:
+
+- `python3 -m unittest discover -s tests/unit`
+
+## Open Questions
+
+None.
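
The key staging and config rendering steps under Provisioning (steps 3 and 5) could be sketched roughly as follows. This is only an illustration under stated assumptions: `plan_staged_keys`, `render_ssh_config`, and the `STAGED_KEY_DIR` path are hypothetical names that do not come from the PRD or the existing codebase, and the sketch skips tilde expansion, key-file validation, and file permissions, which the PRD assigns to the existing key-handling path.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class SshConfigEntry:
    Host: str
    Hostname: str
    Port: int
    User: str
    IdentityFile: str  # host-side path, as written in the manifest


# Hypothetical staging directory; the PRD does not fix the in-bottle key path.
STAGED_KEY_DIR = PurePosixPath("/home/node/.ssh/keys")


def plan_staged_keys(entries):
    """Map each distinct host-side IdentityFile to one stable in-bottle path."""
    staged = {}
    for entry in entries:
        if entry.IdentityFile not in staged:
            # Index-based names keep host-side file names out of the bottle.
            staged[entry.IdentityFile] = str(STAGED_KEY_DIR / f"key-{len(staged)}")
    return staged


def render_ssh_config(entries, staged):
    """Render Host stanzas whose IdentityFile points at the staged key path."""
    lines = []
    for entry in entries:
        lines += [
            f"Host {entry.Host}",
            f"    Hostname {entry.Hostname}",
            f"    Port {entry.Port}",
            f"    User {entry.User}",
            f"    IdentityFile {staged[entry.IdentityFile]}",
            "",
        ]
    return "\n".join(lines)


# The two entries from the manifest example share one key file.
entries = (
    SshConfigEntry("gitea", "100.78.141.42", 30009, "git", "~/.ssh/gitea-delos-2.pem"),
    SshConfigEntry("gitea.dideric.is", "100.78.141.42", 30009, "git", "~/.ssh/gitea-delos-2.pem"),
)
staged = plan_staged_keys(entries)
rendered = render_ssh_config(entries, staged)
```

Because the staging map is keyed on the host-side path, the repeated `IdentityFile` in the example is staged once, and the rendered config never mentions the host-side path. A real plan would likely key on the expanded, validated path instead of the raw manifest string.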