bot-bottle/docs/prds/prd-new-smolmachines-default.md
didericis-codex 17fc44d0d8
complete(prd): mark smolmachines default active
2026-06-09 03:27:58 +00:00

PRD prd-new: Promote smolmachines to default backend; convert Docker to example-only

  • Status: Active
  • Author: didericis
  • Created: 2026-06-06
  • Issue: #206

Summary

Make smolmachines the default bot-bottle backend and demote Docker to an example-only configuration. This closes the DNS sinkhole gap that exists in the Docker backend: the mitmproxy egress addon intercepts HTTP(S) but cannot see raw UDP/TCP port-53 DNS queries, so an agent can exfiltrate data via DNS tunnelling without the egress guard seeing it. The smolmachines backend eliminates this gap at the VMM layer: DNS filtering is built in and the agent cannot bypass it from inside the guest.

Problem

The current default backend is Docker. The egress addon (PRDs 0052/0053) intercepts HTTPS and scans request/response surfaces, but it is an HTTP proxy: raw UDP/TCP port-53 DNS queries go to the OS resolver and never pass through it. An agent can encode secrets as base32 or hex subdomains in a DNS query (<encoded>.attacker.com) and exfiltrate them silently.
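To make the gap concrete, the sketch below shows how little is needed to turn bytes into a query name of the `<encoded>.attacker.com` form, which is the kind of name a DNS-level filter has to refuse. The function names are hypothetical and the 253-character limit on a full DNS name is not handled; it is intended only as a fixture for exercising a filter.

```python
import base64

MAX_LABEL = 63  # RFC 1035 limit on the length of a single DNS label


def encode_as_dns_name(secret: bytes, zone: str) -> str:
    """Pack bytes into lowercase base32 labels under `zone` (illustrative only)."""
    encoded = base64.b32encode(secret).decode("ascii").rstrip("=").lower()
    labels = [encoded[i:i + MAX_LABEL] for i in range(0, len(encoded), MAX_LABEL)]
    return ".".join(labels + [zone])


def decode_from_dns_name(qname: str, zone: str) -> bytes:
    """Recover the bytes on the receiving side, restoring base32 padding."""
    payload = qname[: -len(zone) - 1].replace(".", "").upper()
    payload += "=" * (-len(payload) % 8)
    return base64.b32decode(payload)
```

Nothing in such a lookup touches HTTP, which is why an HTTP-level proxy has no opportunity to inspect it.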

The smolmachines backend already solves this. Its Transparent Socket Impersonation (TSI) layer enforces a CIDR allowlist at the VMM layer, and DNS is handled via vsock port 6002: the guest's /etc/resolv.conf points at 127.0.0.1, and a guest-side DNS proxy tunnels queries over vsock to the host, which returns NXDOMAIN for anything not on the allowlist. The agent cannot bypass this by hardcoding IPs or by configuring an alternate resolver, because both mechanisms are enforced below the guest OS.

Docker has no equivalent. Adding dnsmasq to the Docker backend would close the gap at some cost (dnsmasq sidecar, iptables NET_ADMIN, per-launch config generation), but it is the wrong direction if smolmachines supersedes Docker anyway.

Goals / Success Criteria

  • BOT_BOTTLE_BACKEND defaults to smolmachines when not set.
  • The existing Docker backend remains functional (not removed) but is no longer the default and is documented as legacy/example-only.
  • Example bottles (examples/bottles/) reference smolmachines, not Docker.
  • AGENTS.md documents the backend choice and the DNS gap closure.
  • Existing Docker-backed integration tests continue to pass; they select Docker explicitly via BOT_BOTTLE_BACKEND=docker rather than relying on the default.

Non-goals

  • Removing the Docker backend or its tests.
  • Implementing a dnsmasq layer for the Docker backend (this change closes the DNS gap on the default path, so the layer is not needed there).
  • iptables / NET_ADMIN work for Docker (deferred).
  • Subdomain-depth filtering for allowlisted zones (documented residual gap; tracked separately per the issue).

Design

Default backend change

bot_bottle/backend/__init__.py, line ~440:

# Before
resolved = name or os.environ.get("BOT_BOTTLE_BACKEND") or "docker"

# After
resolved = name or os.environ.get("BOT_BOTTLE_BACKEND") or "smolmachines"
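
The precedence in the line above (explicit name, then environment variable, then default) can be exercised in isolation. The following is a minimal sketch, not the actual contents of bot_bottle/backend/__init__.py; the function name `resolve_backend_name`, the `env` parameter, and the `DEFAULT_BACKEND` constant are hypothetical and exist only to make the lookup testable.

```python
import os
from typing import Mapping, Optional

DEFAULT_BACKEND = "smolmachines"  # was "docker" before this change


def resolve_backend_name(name: Optional[str] = None,
                         env: Optional[Mapping[str, str]] = None) -> str:
    """Return the backend name: explicit argument, then env var, then default."""
    env = os.environ if env is None else env
    # `or` treats an empty string like an unset value, matching the original line.
    return name or env.get("BOT_BOTTLE_BACKEND") or DEFAULT_BACKEND
```

Because the chain uses `or`, setting BOT_BOTTLE_BACKEND to an empty string falls through to the default rather than selecting an empty backend name.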

DNS gap closure (how smolmachines handles it)

When the smolmachines backend launches an agent VM:

  1. The VM's network device uses TSI (--allow-host / --allow-cidr flags), which enforces a CIDR allowlist at the VMM layer. The guest cannot dial IPs outside the allowlist even with raw sockets.
  2. The guest's /etc/resolv.conf is set to 127.0.0.1; a guest-side DNS proxy relays queries over vsock port 6002 to the host.
  3. The host-side DNS filter returns NXDOMAIN for any hostname not in the allowlist derived from egress.routes in the bottle manifest.

This means DNS exfiltration via unknown subdomains is blocked by NXDOMAIN before the query leaves the host, and even if the agent hardcoded the IP of an attacker-controlled server, TSI would drop the packet at the VMM layer.
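
The two checks described above can be sketched as a pair of pure functions. This is an illustration under assumptions, not the smolmachines implementation: the function names are hypothetical, and the matching rule (an exact hostname or any subdomain of an allowlisted hostname) is inferred from the residual gap this PRD documents rather than taken from the filter's source.

```python
import ipaddress
from typing import Iterable


def dns_decision(qname: str, allowed_hosts: Iterable[str]) -> str:
    """Return "FORWARD" for allowlisted names, "NXDOMAIN" for everything else."""
    name = qname.rstrip(".").lower()
    for host in allowed_hosts:
        host = host.rstrip(".").lower()
        # Assumed rule: exact match, or a subdomain of an allowlisted host.
        if name == host or name.endswith("." + host):
            return "FORWARD"
    return "NXDOMAIN"


def ip_allowed(dest: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True only if `dest` falls inside one of the allowlisted CIDRs."""
    addr = ipaddress.ip_address(dest)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
```

With an allowlist of `["api.anthropic.com"]`, a query for `mfrggzdf.attacker.com` yields `"NXDOMAIN"`, and a connection to an address outside the allowlisted CIDRs is rejected regardless of how the agent obtained that address.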

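Residual gap: if an attacker controls a subdomain of an allowlisted zone (for example, by separately compromising a legitimate zone such as api.anthropic.com and injecting records under it), DNS queries for that subdomain are still forwarded, so data encoded in those labels can leave the host. This gap is accepted and documented; subdomain-depth filtering is listed under Non-goals and tracked separately.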

Example bottles

Update examples/bottles/dev.md and examples/bottles/claude.md to remove Docker-specific notes and reference smolmachines as the runtime.

Integration test migration

Tests that exercise the Docker backend explicitly should set BOT_BOTTLE_BACKEND=docker rather than relying on the default. Tests that are backend-agnostic continue to use whatever BOT_BOTTLE_BACKEND is set to (defaulting to smolmachines in the test environment if available).
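
One way a Docker-specific test could pin its backend is a small context manager around the environment variable. This is a sketch using only the standard library; the helper name `pinned_backend` is hypothetical and the project's real test fixtures may look different (pytest's `monkeypatch.setenv` would serve the same purpose).

```python
import os
from contextlib import contextmanager
from typing import Iterator
from unittest import mock


@contextmanager
def pinned_backend(name: str) -> Iterator[None]:
    """Set BOT_BOTTLE_BACKEND for the duration of the block, then restore it."""
    with mock.patch.dict(os.environ, {"BOT_BOTTLE_BACKEND": name}):
        yield
```

A Docker-only test would wrap its body in `with pinned_backend("docker"):`, so it keeps passing regardless of what the default resolves to.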

Resolved questions

  • TSI + egress proxy loopback. The implementation uses a per-bottle loopback alias rather than broad 127.0.0.1 passthrough. The smolmachines launch integration test now asserts that the guest receives proxy env vars on a 127.x alias, can reach an allowlisted host through the proxy, cannot reach the same host directly with proxy vars unset, and cannot reach a non-allowlisted host through the proxy.
  • smolmachines availability check. The smolmachines preflight error points operators at the smolvm installer and explicitly suggests BOT_BOTTLE_BACKEND=docker / --backend=docker for legacy Docker-backed runs.
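
The availability check in the second item can be sketched as a PATH lookup that fails with an actionable message. The binary name `smolvm`, the exception class, and the exact wording are assumptions for illustration; only the suggested BOT_BOTTLE_BACKEND=docker / --backend=docker fallback comes from this PRD.

```python
import shutil
from typing import Callable, Optional


class BackendUnavailable(RuntimeError):
    """Raised when the selected backend cannot run on this host."""


def preflight_smolmachines(
    which: Callable[[str], Optional[str]] = shutil.which,
    binary: str = "smolvm",  # assumed executable name
) -> str:
    """Return the path to the binary, or raise with a hint about the fallback."""
    path = which(binary)
    if path is None:
        raise BackendUnavailable(
            f"{binary} not found on PATH. Install it with the smolvm installer, "
            "or set BOT_BOTTLE_BACKEND=docker / pass --backend=docker "
            "for a legacy Docker-backed run."
        )
    return path
```

Injecting `which` keeps the check testable on machines where the binary is absent.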

References

  • docs/research/smolmachines-as-vm-backend.md — smolmachines evaluation
  • docs/research/network-egress-guard.md — Approach 4 (DNS-based egress control)
  • docs/research/secret-exfil-tripwire-encodings.md — DNS exfil discussion
  • PRD 0052, PRD 0053 — egress DLP addon (HTTP-level; partial mitigation only)