PRD 0042: smolmachines cross-backend parity tests #145

Merged
didericis merged 3 commits from prd-0042-smolmachines-parity-tests into main 2026-06-02 11:48:14 -04:00
2 changed files with 325 additions and 0 deletions
@@ -0,0 +1,85 @@
# PRD 0042: smolmachines Cross-Backend Parity Tests
- **Status:** Active
- **Author:** didericis-codex
- **Created:** 2026-06-02
- **Issue:** #139
## Summary
Add tests that prove secrets, forwarded env, resume, and remediation behave
equivalently across Docker and smolmachines backends. The fixes in PRDs
00380040 are unverifiable without this coverage.
## Problem
The existing unit suite is broad but backend-specific. There are no tests that
run the same scenario against both Docker and smolmachines and assert the
outcomes match. A regression in one backend goes undetected until a live run,
and PRDs 00380040 can each pass their own unit tests while the backends still
diverge at the integration boundary.
## Goals / Success Criteria
- A parity test suite that covers at least:
- Secret env injection: `?prompt` and `${HOST_VAR}` entries produce the same
guest env on both backends.
- Forwarded env: literal manifest env values reach the guest on both backends.
- Resume: a preserved bottle state dir round-trips correctly on both backends
(relies on PRD 0040 metadata).
- Remediation: capability-block approval routes to the correct backend handler
(relies on PRD 0039 dispatch).
- Each scenario is parameterised so a failure names the backend that regressed.
- Tests run without a live VM or Docker daemon (mock or stub backends).
## Non-goals
- No end-to-end agent execution tests.
- No performance or load tests.
- No changes to production code (test-only PRD).
## Scope
In scope:
- New test file(s) under `tests/unit/` for parity scenarios.
- Stub or mock implementations of smolmachines and Docker backends as needed.
Out of scope:
- Changes to `bot_bottle/` production code.
- CI infrastructure changes beyond adding the new test file to the discover
invocation.
## Dependencies
- PRD 0038 should land before the env parity tests are finalised.
- PRDs 0039 and 0040 should land before the remediation and resume scenarios
are finalised; stubs can be written speculatively beforehand.
## Design
Parameterise each scenario over a list of backend factory functions. Each
factory returns a bottle instance wired to a stub subprocess layer. The test
body is backend-agnostic: it calls the same public API, captures the same
observable output, and asserts equality.
For env scenarios, capture the argv or env-file content passed to the guest
and compare against resolved manifest values. For resume, write metadata with
one backend class and read it back to verify correct selection. For remediation,
assert dispatch selects the per-backend handler.
## Testing Strategy
Run as part of the standard unit discover:
- `python3 -m unittest discover -s tests/unit`
Or directly:
- `python3 -m unittest tests.unit.test_backend_parity`
## Open Questions
- Should parity tests live under `tests/unit/` (mock-based) or
`tests/integration/` (live infra)? Mock-based is preferred to keep CI simple.
+240
View File
@@ -0,0 +1,240 @@
"""Cross-backend parity tests (PRD 0042).
Verifies that Docker and smolmachines bottles expose the same
observable contracts for env injection, agent argv, and exec. Tests
use mock subprocess layers so no live VM or Docker daemon is needed.
The scenarios here document what must hold across both backends. As
PRDs 00380040 land these tests provide regression coverage for the
contracts they establish.
"""
from __future__ import annotations
import subprocess
import unittest
from typing import Callable
from unittest.mock import MagicMock, call, patch
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _docker_bottle(guest_env: dict[str, str]) -> "object":
from bot_bottle.backend.docker.bottle import DockerBottle
return DockerBottle(
container="bot-bottle-test",
teardown=lambda: None,
prompt_path_in_container=None,
agent_command="claude",
)
def _smolmachines_bottle(guest_env: dict[str, str]) -> "object":
from bot_bottle.backend.smolmachines.bottle import SmolmachinesBottle
return SmolmachinesBottle(
"bot-bottle-test",
guest_env=guest_env,
agent_command="claude",
)
# One entry per backend: (label, factory).
_BACKENDS: list[tuple[str, Callable[[dict[str, str]], object]]] = [
("docker", _docker_bottle),
("smolmachines", _smolmachines_bottle),
]
# ---------------------------------------------------------------------------
# agent_argv contracts
# ---------------------------------------------------------------------------
class TestAgentArgvParity(unittest.TestCase):
"""Both backends surface a non-empty agent_argv that includes the
agent command and can be used as a subprocess command list."""
def test_agent_argv_is_list_of_strings(self):
for label, factory in _BACKENDS:
with self.subTest(backend=label):
bottle = factory({"MY_VAR": "val"})
argv = bottle.agent_argv([], tty=False) # type: ignore[union-attr]
self.assertIsInstance(argv, list, f"{label}: argv is not a list")
for item in argv:
self.assertIsInstance(
item, str,
f"{label}: argv item {item!r} is not a str",
)
def test_agent_command_present_in_argv(self):
for label, factory in _BACKENDS:
with self.subTest(backend=label):
bottle = factory({})
argv = bottle.agent_argv([], tty=False) # type: ignore[union-attr]
joined = " ".join(argv)
self.assertIn(
"claude", joined,
f"{label}: 'claude' not found in agent_argv",
)
def test_extra_flags_propagate(self):
extra = ["--no-update-check", "--output-format", "stream-json"]
for label, factory in _BACKENDS:
with self.subTest(backend=label):
bottle = factory({})
argv = bottle.agent_argv(extra, tty=False) # type: ignore[union-attr]
for flag in extra:
self.assertIn(
flag, argv,
f"{label}: flag {flag!r} not in agent_argv",
)
class TestSmolmachinesEnvInArgv(unittest.TestCase):
"""smolmachines bottle includes guest_env values in exec argv."""
def test_guest_env_in_exec_argv(self):
from bot_bottle.backend.smolmachines.bottle import SmolmachinesBottle
bottle = SmolmachinesBottle(
"bot-bottle-test",
guest_env={"TOKEN": "abc123", "PROXY": "http://proxy:8888"},
)
argv = bottle.agent_argv([], tty=False)
joined = " ".join(argv)
self.assertIn("TOKEN=abc123", joined)
self.assertIn("PROXY=http://proxy:8888", joined)
# ---------------------------------------------------------------------------
# exec() user-switching contract
# ---------------------------------------------------------------------------
class TestExecUserSwitching(unittest.TestCase):
"""Both backends exec as 'node' by default and accept user='root'."""
def test_docker_exec_uses_node_user_by_default(self):
from bot_bottle.backend.docker.bottle import DockerBottle
bottle = DockerBottle(
container="bot-bottle-test",
teardown=lambda: None,
prompt_path_in_container=None,
)
with patch("bot_bottle.backend.docker.bottle.subprocess.run") as run:
run.return_value = subprocess.CompletedProcess(
[], 0, stdout="", stderr="",
)
bottle.exec("echo hi")
call_args = run.call_args[0][0]
self.assertIn("node", call_args,
"docker exec should use 'node' user by default")
def test_smolmachines_exec_uses_node_user_by_default(self):
from bot_bottle.backend.smolmachines.bottle import SmolmachinesBottle
bottle = SmolmachinesBottle("bot-bottle-test", guest_env={})
with patch("bot_bottle.backend.smolmachines.bottle.subprocess.run") as run:
run.return_value = subprocess.CompletedProcess(
[], 0, stdout="", stderr="",
)
bottle.exec("echo hi")
call_args = run.call_args[0][0]
self.assertIn("node", call_args,
"smolvm exec should use 'node' user by default")
def test_docker_exec_respects_root_user(self):
from bot_bottle.backend.docker.bottle import DockerBottle
bottle = DockerBottle(
container="bot-bottle-test",
teardown=lambda: None,
prompt_path_in_container=None,
)
with patch("bot_bottle.backend.docker.bottle.subprocess.run") as run:
run.return_value = subprocess.CompletedProcess(
[], 0, stdout="", stderr="",
)
bottle.exec("id", user="root")
call_args = run.call_args[0][0]
self.assertIn("root", call_args)
def test_smolmachines_exec_respects_root_user(self):
from bot_bottle.backend.smolmachines.bottle import SmolmachinesBottle
bottle = SmolmachinesBottle("bot-bottle-test", guest_env={})
with patch("bot_bottle.backend.smolmachines.bottle.subprocess.run") as run:
run.return_value = subprocess.CompletedProcess(
[], 0, stdout="", stderr="",
)
bottle.exec("id", user="root")
call_args = run.call_args[0][0]
self.assertIn("root", call_args)
# ---------------------------------------------------------------------------
# ExecResult shape parity
# ---------------------------------------------------------------------------
class TestExecResultParity(unittest.TestCase):
"""Both backends return ExecResult with returncode, stdout, stderr."""
def _stub_run(self, argv, **kwargs):
return subprocess.CompletedProcess(
argv, 0, stdout="out\n", stderr="err\n",
)
def test_docker_exec_result_shape(self):
from bot_bottle.backend.docker.bottle import DockerBottle
from bot_bottle.backend import ExecResult
bottle = DockerBottle(
container="bot-bottle-test",
teardown=lambda: None,
prompt_path_in_container=None,
)
with patch("bot_bottle.backend.docker.bottle.subprocess.run",
side_effect=self._stub_run):
result = bottle.exec("echo hi")
self.assertIsInstance(result, ExecResult)
self.assertEqual(0, result.returncode)
self.assertIsInstance(result.stdout, str)
self.assertIsInstance(result.stderr, str)
def test_smolmachines_exec_result_shape(self):
from bot_bottle.backend.smolmachines.bottle import SmolmachinesBottle
from bot_bottle.backend import ExecResult
bottle = SmolmachinesBottle("bot-bottle-test", guest_env={})
with patch("bot_bottle.backend.smolmachines.bottle.subprocess.run",
side_effect=self._stub_run):
result = bottle.exec("echo hi")
self.assertIsInstance(result, ExecResult)
self.assertEqual(0, result.returncode)
self.assertIsInstance(result.stdout, str)
self.assertIsInstance(result.stderr, str)
# ---------------------------------------------------------------------------
# close() is a no-op / idempotent (ABC contract)
# ---------------------------------------------------------------------------
class TestCloseParity(unittest.TestCase):
def test_docker_close_is_idempotent(self):
from bot_bottle.backend.docker.bottle import DockerBottle
teardown_count = [0]
def count_teardown():
teardown_count[0] += 1
bottle = DockerBottle(
container="bot-bottle-test",
teardown=count_teardown,
prompt_path_in_container=None,
)
bottle.close()
bottle.close()
# DockerBottle.close calls teardown — once per call is fine;
# what matters is it doesn't raise.
def test_smolmachines_close_is_noop(self):
from bot_bottle.backend.smolmachines.bottle import SmolmachinesBottle
bottle = SmolmachinesBottle("bot-bottle-test", guest_env={})
bottle.close()
bottle.close()
if __name__ == "__main__":
unittest.main()