# PRD 0057: Promote smolmachines to default backend; convert Docker to example-only

- **Status:** Active
- **Author:** didericis
- **Created:** 2026-06-06
- **Issue:** #206

## Summary

Make smolmachines the default bot-bottle backend and demote Docker to an example-only configuration. This closes, on the default path, the DNS sinkhole gap that exists in the Docker backend: the mitmproxy egress addon intercepts HTTP(S) but cannot see raw port-53 DNS queries, so an agent can exfiltrate data via DNS tunnelling without the egress guard seeing it. The smolmachines backend eliminates this gap at the VMM layer: DNS filtering is built in and the agent cannot bypass it.

## Problem

The current default backend is Docker. The egress addon (PRDs 0052/0053) intercepts HTTP(S) traffic and scans request/response surfaces, but it is an HTTP proxy: raw UDP/TCP port-53 DNS queries go to the OS resolver and never pass through it. An agent can encode secrets as base32 or hex subdomains in a DNS query (`<encoded>.attacker.com`) and exfiltrate them silently.

The smolmachines backend already solves this. Its Transparent Socket Impersonation (TSI) layer enforces a CIDR allowlist at the VMM layer, and DNS is handled via vsock port 6002: the guest's `/etc/resolv.conf` points at `127.0.0.1`, and a guest-side DNS proxy tunnels queries over vsock to the host, which returns NXDOMAIN for anything not on the allowlist. The agent cannot bypass this by hardcoding IPs or by configuring an alternate resolver, because both mechanisms are enforced below the guest OS.

Docker has no equivalent. Adding dnsmasq to the Docker backend would close the gap at some cost (dnsmasq sidecar, iptables `NET_ADMIN`, per-launch config generation), but it is the wrong direction if smolmachines supersedes Docker anyway.

## Goals / Success Criteria

- `BOT_BOTTLE_BACKEND` defaults to `smolmachines` when not set.
- The existing Docker backend remains functional (not removed) but is no longer the default and is documented as legacy/example-only.
- Example bottles (`examples/bottles/`) reference smolmachines, not Docker.
- `AGENTS.md` documents the backend choice and the DNS gap closure.
- Existing Docker-backed integration tests continue to pass; they select Docker explicitly via `BOT_BOTTLE_BACKEND=docker` rather than relying on the default.

## Non-goals

- Removing the Docker backend or its tests.
- Implementing a dnsmasq layer for the Docker backend (this change closes the gap on the default path, so it is not needed there).
- iptables / `NET_ADMIN` work for Docker (deferred).
- Subdomain-depth filtering for allowlisted zones (documented residual gap; tracked separately per the issue).

## Design

### Default backend change

`bot_bottle/backend/__init__.py`, line ~440:

```python
# Before
resolved = name or os.environ.get("BOT_BOTTLE_BACKEND") or "docker"

# After
resolved = name or os.environ.get("BOT_BOTTLE_BACKEND") or "smolmachines"
```
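
The precedence in that one-line change (explicit name, then environment variable, then default) can be sketched as a standalone function. This is an illustration only: `resolve_backend_name`, `DEFAULT_BACKEND`, and the `environ` parameter are hypothetical names, not the actual code in `bot_bottle/backend/__init__.py`.

```python
import os

# Hypothetical constant; the value was "docker" before this PRD.
DEFAULT_BACKEND = "smolmachines"


def resolve_backend_name(name=None, environ=None):
    """Pick a backend: explicit name, then BOT_BOTTLE_BACKEND, then the default.

    `environ` is injectable so the precedence can be checked without
    touching the real process environment.
    """
    env = os.environ if environ is None else environ
    # `or` treats an empty string like "unset", matching the original one-liner.
    return name or env.get("BOT_BOTTLE_BACKEND") or DEFAULT_BACKEND


if __name__ == "__main__":
    print(resolve_backend_name(environ={}))
    print(resolve_backend_name(environ={"BOT_BOTTLE_BACKEND": "docker"}))
```

Because the chain uses `or`, an empty `BOT_BOTTLE_BACKEND=` falls through to the default rather than selecting an empty backend name.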

### DNS gap closure (how smolmachines handles it)

When the smolmachines backend launches an agent VM:

1. The VM's network device uses TSI (`--allow-host` / `--allow-cidr` flags), which enforces a CIDR allowlist at the VMM layer. The guest cannot dial IPs outside the allowlist even with raw sockets.
2. The guest's `/etc/resolv.conf` is set to `127.0.0.1`; a guest-side DNS proxy relays queries over vsock port 6002 to the host.
3. The host-side DNS filter returns NXDOMAIN for any hostname not in the allowlist derived from `egress.routes` in the bottle manifest.

This means DNS exfiltration via non-allowlisted hostnames is blocked by NXDOMAIN before the query leaves the host. Even if the agent hardcoded the IP of an attacker-controlled server, TSI would drop the packet at the VMM layer.
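
The two checks described above can be sketched in a few lines of Python. This is not the smolmachines implementation: `dns_decision` and `ip_allowed` are hypothetical helpers, and the matching rule (an allowlisted host also admits its subdomains) is an assumption made for illustration.

```python
import ipaddress


def dns_decision(qname, allowed_hosts):
    """Return "FORWARD" for an allowlisted name, otherwise "NXDOMAIN".

    Assumption: a name matches if it equals an allowlisted host or is a
    subdomain of one. Matching is case-insensitive and ignores a trailing dot.
    """
    name = qname.rstrip(".").lower()
    for host in allowed_hosts:
        host = host.rstrip(".").lower()
        if name == host or name.endswith("." + host):
            return "FORWARD"
    return "NXDOMAIN"


def ip_allowed(addr, allowed_cidrs):
    """Return True if `addr` falls inside any allowlisted CIDR block."""
    ip = ipaddress.ip_address(addr)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


if __name__ == "__main__":
    hosts = ["api.anthropic.com"]
    cidrs = ["198.51.100.0/24"]  # documentation range, used as a placeholder
    print(dns_decision("api.anthropic.com", hosts))
    print(dns_decision("mfrggzdf.attacker.com", hosts))
    print(ip_allowed("203.0.113.7", cidrs))
```

The name check and the address check are independent, which is why a hardcoded IP does not help an agent whose DNS query was refused: the address still has to pass the CIDR allowlist.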

**Residual gap:** if the attacker controls a subdomain of an allowlisted zone (e.g., a legitimate zone like `api.anthropic.com` that the attacker can inject into via a separate compromise), DNS queries for that subdomain would be forwarded. This is accepted and documented; subdomain-depth filtering is a non-goal of this PRD and is tracked separately.

### Example bottles

Update `examples/bottles/dev.md` and `examples/bottles/claude.md` to remove Docker-specific notes and reference smolmachines as the runtime.

### Integration test migration

Tests that exercise the Docker backend explicitly should set `BOT_BOTTLE_BACKEND=docker` rather than relying on the default. Tests that are backend-agnostic continue to use whatever `BOT_BOTTLE_BACKEND` is set to (defaulting to smolmachines in the test environment if available).
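
One way a Docker-specific test could pin its backend is to patch the environment for the duration of each test. The sketch below uses only the standard library; `resolve_backend_name` is a hypothetical stand-in for the real resolution logic, and the project's actual test layout may differ.

```python
import os
import unittest
from unittest import mock


def resolve_backend_name(name=None):
    """Hypothetical stand-in for the backend resolution in bot_bottle."""
    return name or os.environ.get("BOT_BOTTLE_BACKEND") or "smolmachines"


class DockerBackendTest(unittest.TestCase):
    def setUp(self):
        # Select Docker explicitly instead of relying on the default.
        patcher = mock.patch.dict(os.environ, {"BOT_BOTTLE_BACKEND": "docker"})
        patcher.start()
        # The original environment is restored after each test.
        self.addCleanup(patcher.stop)

    def test_runs_against_docker(self):
        self.assertEqual(resolve_backend_name(), "docker")


if __name__ == "__main__":
    unittest.main()
```

Scoping the override to the test case keeps backend-agnostic tests in the same run unaffected, since `patch.dict` restores the previous value of `BOT_BOTTLE_BACKEND` on cleanup.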

## Resolved questions

- **TSI + egress proxy loopback.** The implementation uses a per-bottle loopback alias rather than broad `127.0.0.1` passthrough. The smolmachines launch integration test now asserts that the guest receives proxy env vars on a `127.x` alias, can reach an allowlisted host through the proxy, cannot reach the same host directly with proxy vars unset, and cannot reach a non-allowlisted host through the proxy.
- **smolmachines availability check.** The smolmachines preflight error points operators at the smolvm installer and explicitly suggests `BOT_BOTTLE_BACKEND=docker` / `--backend=docker` for legacy Docker-backed runs.
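
A preflight check of the kind described in the last item might look like the sketch below. Everything here is an assumption for illustration: the `smolvm` binary name, detection via `PATH`, the `BackendUnavailableError` class, and the message wording are hypothetical rather than taken from the implementation.

```python
import shutil


class BackendUnavailableError(RuntimeError):
    """Hypothetical error for a backend that cannot run on this host."""


def preflight_smolmachines(binary="smolvm", which=shutil.which):
    """Fail early with an actionable message if the VM runtime is missing.

    `which` is injectable so the check can be exercised without the
    runtime being installed.
    """
    if which(binary) is None:
        raise BackendUnavailableError(
            f"{binary} was not found on PATH. Install it with the smolvm "
            "installer, or use the legacy Docker backend with "
            "BOT_BOTTLE_BACKEND=docker or --backend=docker."
        )


if __name__ == "__main__":
    try:
        preflight_smolmachines()
        print("smolmachines backend available")
    except BackendUnavailableError as exc:
        print(f"preflight failed: {exc}")
```

Naming the Docker fallback in the error text gives operators without the VM runtime a way forward from the failure message itself.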

## References

- `docs/research/smolmachines-as-vm-backend.md`: smolmachines evaluation
- `docs/research/network-egress-guard.md`: Approach 4 (DNS-based egress control)
- `docs/research/secret-exfil-tripwire-encodings.md`: DNS exfil discussion
- PRD 0052, PRD 0053: egress DLP addon (HTTP-level; partial mitigation only)