PRD 0007: SSH egress gate
- Status: Draft
- Author: didericis
- Created: 2026-05-12
Summary
Per-agent TCP-forwarder sidecar built from bottle.ssh entries; SSH stops
going through pipelock; pipelock keeps full TLS interception with no
SSH carve-outs.
Problem
git fetch over SSH from inside an implementer-agent bottle is broken
on main. The error surfaced after PRD 0006 enabled pipelock's
native tls_interception:
kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535
fatal: Could not read from remote repository.
The agent's ssh client tunnels through pipelock via a ProxyCommand (socat - PROXY:pipelock:%h:%p), and pipelock now bumps that CONNECT
tunnel. SSH sends its banner instead of a TLS ClientHello; pipelock's
SNI gate rejects it; the tunnel closes mid-kex. Every bottle with an
ssh entry hits this — including the implementer agent used by the
free-agent workflow, which can't pull or push.
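
The mismatch shows up in the first bytes each protocol puts on the wire. The sketch below is illustrative only and is not pipelock's gate logic; `classify_first_bytes` is a hypothetical helper. It relies on two protocol facts: a TLS connection opens with a handshake record (content type 0x16, then major version 0x03), while an SSH peer opens with an identification string starting with `SSH-`.

```python
def classify_first_bytes(data: bytes) -> str:
    """Guess the protocol from the first bytes a client sends (illustrative)."""
    # TLS record header: content type 0x16 (handshake), then major version 0x03.
    if len(data) >= 2 and data[0] == 0x16 and data[1] == 0x03:
        return "tls"
    # SSH identification string (RFC 4253) begins with "SSH-".
    if data.startswith(b"SSH-"):
        return "ssh"
    return "unknown"


if __name__ == "__main__":
    print(classify_first_bytes(b"SSH-2.0-OpenSSH_9.6\r\n"))
    print(classify_first_bytes(bytes([0x16, 0x03, 0x01, 0x02, 0x00])))
```

Any interceptor that expects a ClientHello inside the CONNECT tunnel will see the SSH banner as something other than TLS, which is consistent with the tunnel closing mid-kex.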
Goals / Success Criteria
Integration test: spin up a bottle with an SSH entry, exec git fetch against a real-ish SSH host from inside the agent, observe
exit 0. This is the same signal that's broken today; flipping it
back to green is the test.
Non-goals
- Pluggable forwarder backend. One TCP forwarder image is baked in; abstracting over haproxy / nginx-stream / etc. is deferred.
- SSH-protocol awareness. The gate stays at L4. No SSH-version sniffing, no kex inspection, no per-key gating beyond what ssh itself enforces inside the agent.
- Replacing pipelock for anything else. HTTPS / HTTP traffic continues to flow through pipelock unchanged. This PRD adds a sidecar; it doesn't displace one.
- Connection rate limits or quotas. No per-host or per-agent rate limiting on the gate; future PRD if it ever matters.
Scope
In scope
- Gate sidecar lifecycle. DockerSSHGate class with prepare/start/stop, mirroring DockerPipelockProxy's shape and network attachment story.
- ssh provisioner rewrite. provision/ssh.py drops the socat ProxyCommand; ~/.ssh/config points each Host at the gate container and the per-host listen port.
- Pipelock carve-out removal. Strip pipelock_bottle_ssh_trusted_domains, pipelock_bottle_ssh_ip_cidrs, and the related code paths in pipelock_build_config + tests. After this PRD, pipelock has no knowledge of bottle.ssh.
- Plan rendering / dry-run. bottle_plan.py and the y/N preflight surface the new gate sidecar (name, listen ports, upstream targets).
Out of scope
- SSH key generation / rotation. Bottle keys are still user-supplied via IdentityFile; the gate doesn't manage key material.
- Per-host audit logging. The gate is dumb TCP forwarding; no in-band visibility into SSH session content. (Connection-level logs from socat are a nice-to-have, not a goal.)
- Non-Docker backends. Implementation lands for Docker only; the BottleBackend abstraction can grow the hook but other backends are deferred.
- Manifest schema changes. bottle.ssh stays exactly as it is today; this PRD is internals-only.
Proposed Design
New services / components
Mirror the pipelock layout:
- claude_bottle/ssh_gate.py (new): abstract SSHGate + SSHGatePlan dataclass. prepare is host-side / side-effect-free on docker; renders the forwarder config under stage_dir.
- claude_bottle/backend/docker/ssh_gate.py (new): DockerSSHGate concrete subclass. start does docker create on the internal network, copies the config in, attaches the egress network, then docker start. stop is idempotent docker rm -f. Container name: claude-bottle-ssh-gate-<slug>.
Forwarder image: alpine/socat, pinned by digest. One socat
process per ssh entry, multiplexed inside the same gate container
via an entrypoint script that backgrounds N socat invocations:
socat TCP-LISTEN:<port_i>,reuseaddr,fork TCP:<Hostname_i>:<Port_i>
Listen ports are assigned deterministically per ssh entry (e.g.
30000 + index). One container, N listeners, N upstreams.
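
As a rough illustration of the port assignment and entrypoint rendering described above, here is a minimal sketch. Everything named in it is hypothetical (`Forward`, `assign_forwards`, `render_entrypoint`, `BASE_PORT`); it is not the real SSHGatePlan shape. The base port comes from the "30000 + index" example, and the trailing `wait` that keeps the container alive is an assumption about how the entrypoint might block.

```python
from dataclasses import dataclass
import shlex

BASE_PORT = 30000  # assumption: taken from the "30000 + index" example


@dataclass(frozen=True)
class Forward:
    """One listener inside the gate container (hypothetical shape)."""
    listen_port: int
    upstream_host: str
    upstream_port: int


def assign_forwards(entries: list[tuple[str, int]]) -> list[Forward]:
    """Map each (Hostname, Port) ssh entry to a deterministic listen port."""
    return [
        Forward(BASE_PORT + index, host, port)
        for index, (host, port) in enumerate(entries)
    ]


def render_entrypoint(forwards: list[Forward]) -> str:
    """Render a POSIX sh script that backgrounds one socat per forward."""
    lines = ["#!/bin/sh", "set -eu"]
    for fwd in forwards:
        # Quote the upstream address so odd hostnames cannot break the script.
        target = shlex.quote(f"TCP:{fwd.upstream_host}:{fwd.upstream_port}")
        lines.append(
            f"socat TCP-LISTEN:{fwd.listen_port},reuseaddr,fork {target} &"
        )
    # Block on the background listeners so the container stays up.
    lines.append("wait")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_entrypoint(assign_forwards([("github.com", 22), ("git.example.org", 2222)])))
```

Because the port depends only on the entry's position, the same manifest always yields the same listen ports, which lets the ssh provisioner and the gate compute them independently. Reordering entries would shift ports, so a real implementation may prefer a different deterministic key.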
Existing code touched
- claude_bottle/backend/docker/provision/ssh.py: drop the ProxyCommand socat - PROXY:... plumbing and the pipelock_proxy_host_port import. The rendered ~/.ssh/config block per entry becomes:
    Host <name>
      HostName <gate-container>
      User <user>
      Port <listen-port>
      IdentityAgent <public-socket>
  known_hosts entries are keyed off <name> and the new [<gate-container>]:<listen-port> form so OpenSSH's strict host-key checking still matches.
- claude_bottle/pipelock.py: delete pipelock_bottle_ssh_hostnames, pipelock_bottle_ssh_trusted_domains, pipelock_bottle_ssh_ip_cidrs, and the calls into them from pipelock_effective_allowlist and pipelock_build_config. The effective allowlist becomes baked-defaults ∪ bottle.egress.allowlist.
- claude_bottle/backend/docker/backend.py: instantiate DockerSSHGate alongside DockerPipelockProxy; thread its prepare/start/stop through resolve_plan / launch.
- claude_bottle/backend/docker/launch.py: add gate start / stop to the ExitStack in the right order: the gate must be up before provision_ssh runs so the agent can dial it on first boot.
- claude_bottle/backend/docker/bottle_plan.py: new SSHGatePlan field on DockerBottlePlan; preflight rendering surfaces the gate sidecar (name, per-entry listen ports, upstream Hostname:Port targets).
- Tests: update tests/fixtures.py callers; rewrite tests/unit/test_pipelock_yaml.py::TestBuildConfig::test_ssh_shape to assert pipelock no longer reflects ssh entries; add unit tests for SSHGate.prepare + render shape; add an integration test in tests/integration/ for the git fetch round-trip.
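
The per-entry ssh config block and the known_hosts host field could be rendered along these lines. This is a sketch under assumptions, not the provisioner's actual code: `render_host_block` and `known_hosts_names` are hypothetical helpers, and the argument values in the usage example are made up. The `[host]:port` form is the standard known_hosts notation for a non-default port.

```python
def render_host_block(name: str, gate_container: str, user: str,
                      listen_port: int, identity_agent: str) -> str:
    """Render one ~/.ssh/config Host block that points at the gate."""
    lines = [
        f"Host {name}",
        f"    HostName {gate_container}",
        f"    User {user}",
        f"    Port {listen_port}",
        f"    IdentityAgent {identity_agent}",
    ]
    return "\n".join(lines) + "\n"


def known_hosts_names(name: str, gate_container: str, listen_port: int) -> str:
    """Comma-separated host field for a known_hosts line.

    Covers both the Host alias and the bracketed gate address.
    """
    return f"{name},[{gate_container}]:{listen_port}"


if __name__ == "__main__":
    # Illustrative values only.
    print(render_host_block("github", "claude-bottle-ssh-gate-demo",
                            "git", 30000, "/run/agent.sock"))
    print(known_hosts_names("github", "claude-bottle-ssh-gate-demo", 30000))
```

A full known_hosts line would append the upstream's key type and public key after this host field, since the gate forwards bytes untouched and the agent still sees the upstream's host key.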
Data model changes
None. bottle.ssh schema is unchanged; one new internal plan
dataclass (SSHGatePlan) under claude_bottle/ssh_gate.py.
External dependencies
alpine/socat image, pinned by digest (declared next to the PIPELOCK_IMAGE constant). No new Python packages.
Open questions
- Network topology: does the gate need its own per-agent egress bridge, or can it share pipelock's egress network? Sharing is simpler; per-gate isolates failure modes. Decide during implementation; default to "share pipelock's egress network" unless a concrete reason emerges.
- Socat container restart policy: a single socat that crashes takes one upstream offline; do we want a wrapper that restarts individual listeners, or just rely on docker restart? Default to no-restart for v1 (matches pipelock).
- Connection-level audit log: socat's -v mode logs every connect/close. Worth piping into the bottle's stderr stream, or is that noise? Default off, reconsider if debugging gets hard.
- Docker DNS for the <gate-container> hostname inside the agent: Resolved. Spike confirmed: a container on a --internal user-defined network resolves another container's name via the embedded resolver at 127.0.0.11 and reaches it over TCP, while egress to the public internet remains blocked. The PRD's design assumption holds.
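
The DNS spike result can be re-checked from inside a container with a small probe like the one below. This is a sketch, not the spike's actual script; `probe` is a hypothetical helper that treats any resolution or connection failure as "unreachable".

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if `host` resolves and accepts a TCP connection on `port`."""
    try:
        # create_connection resolves the name with the container's resolver
        # (127.0.0.11 on user-defined Docker networks), then connects.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Given the spike result, running `probe("<gate-container>", <listen-port>)` inside the agent would be expected to return True, while a probe against a public host should return False on the --internal network.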
References
- PRD 0001: per-agent egress proxy via pipelock — the parent topology this PRD slots into.
- PRD 0006: pipelock native TLS interception — the change that surfaced this regression by making pipelock incompatible with SSH-over-CONNECT.
- claude_bottle/backend/docker/provision/ssh.py: current SSH provisioning that this PRD rewrites.
- claude_bottle/pipelock.py: current pipelock config builder, which today carries the bottle.ssh-derived fields this PRD removes.