PRD 0042: smolmachines Cross-Backend Parity Tests
- Status: Active
- Author: didericis-codex
- Created: 2026-06-02
- Issue: #139
Summary
Add tests that prove secrets, forwarded env, resume, and remediation behave equivalently across Docker and smolmachines backends. The fixes in PRDs 0038–0040 are unverifiable without this coverage.
Problem
The existing unit suite is broad but backend-specific. There are no tests that run the same scenario against both Docker and smolmachines and assert the outcomes match. A regression in one backend goes undetected until a live run, and PRDs 0038–0040 can each pass their own unit tests while the backends still diverge at the integration boundary.
Goals / Success Criteria
- A parity test suite that covers at least:
  - Secret env injection: ?prompt and ${HOST_VAR} entries produce the same guest env on both backends.
  - Forwarded env: literal manifest env values reach the guest on both backends.
  - Resume: a preserved bottle state dir round-trips correctly on both backends (relies on PRD 0040 metadata).
  - Remediation: capability-block approval routes to the correct backend handler (relies on PRD 0039 dispatch).
- Each scenario is parameterised so a failure names the backend that regressed.
- Tests run without a live VM or Docker daemon (mock or stub backends).
Non-goals
- No end-to-end agent execution tests.
- No performance or load tests.
- No changes to production code (test-only PRD).
Scope
In scope:
- New test file(s) under tests/unit/ for parity scenarios.
- Stub or mock implementations of smolmachines and Docker backends as needed.
Out of scope:
- Changes to bot_bottle/ production code.
- CI infrastructure changes beyond adding the new test file to the discover invocation.
Dependencies
- PRD 0038 should land before the env parity tests are finalised.
- PRDs 0039 and 0040 should land before the remediation and resume scenarios are finalised; stubs can be written speculatively beforehand.
Design
Parameterise each scenario over a list of backend factory functions. Each factory returns a bottle instance wired to a stub subprocess layer. The test body is backend-agnostic: it calls the same public API, captures the same observable output, and asserts equality.
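The parameterisation described above could look roughly like the following sketch. Every name in it is hypothetical and chosen for illustration: the stub classes StubDockerBottle and StubSmolmachinesBottle, the resolve_env helper, the guest_env() method, and the assumption that one backend passes env via -e argv flags while the other writes an env file. None of these are taken from bot_bottle. The standard-library subTest context manager labels each failure with the backend name, which is one way to make a failure name the backend that regressed.

```python
import unittest

# Hypothetical manifest: a literal value, a host-variable reference, a prompted secret.
MANIFEST_ENV = {"MODE": "ci", "TOKEN": "${HOST_TOKEN}", "API_KEY": "?prompt"}


def resolve_env(manifest_env, host_env, prompted):
    """Resolve manifest entries to concrete values (illustrative rules only)."""
    resolved = {}
    for key, value in manifest_env.items():
        if value == "?prompt":
            resolved[key] = prompted[key]
        elif value.startswith("${") and value.endswith("}"):
            resolved[key] = host_env[value[2:-1]]
        else:
            resolved[key] = value
    return resolved


class StubDockerBottle:
    """Stub (assumed shape): records the argv a container launch would receive."""
    name = "docker"

    def __init__(self, env):
        self.argv = ["run"]
        for key, value in sorted(env.items()):
            self.argv += ["-e", f"{key}={value}"]

    def guest_env(self):
        # Recover KEY=VALUE pairs from the arguments that follow each "-e".
        pairs = [self.argv[i + 1] for i, arg in enumerate(self.argv) if arg == "-e"]
        return dict(pair.split("=", 1) for pair in pairs)


class StubSmolmachinesBottle:
    """Stub (assumed shape): records the env-file content a VM launch would receive."""
    name = "smolmachines"

    def __init__(self, env):
        self.env_file = "".join(f"{k}={v}\n" for k, v in sorted(env.items()))

    def guest_env(self):
        return dict(line.split("=", 1) for line in self.env_file.splitlines())


# One factory per backend; the test body below never mentions a backend by name.
BACKEND_FACTORIES = [StubDockerBottle, StubSmolmachinesBottle]


class EnvParityTest(unittest.TestCase):
    def test_guest_env_matches_resolved_manifest(self):
        expected = resolve_env(
            MANIFEST_ENV, {"HOST_TOKEN": "t0k"}, {"API_KEY": "s3cret"}
        )
        for factory in BACKEND_FACTORIES:
            # subTest tags any failure with the backend that produced it.
            with self.subTest(backend=factory.name):
                self.assertEqual(factory(expected).guest_env(), expected)
```

Adding a backend to the suite then means appending one factory to BACKEND_FACTORIES; the assertions stay unchanged.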
For env scenarios, capture the argv or env-file content passed to the guest and compare against resolved manifest values. For resume, write metadata with one backend class and read it back to verify correct selection. For remediation, assert dispatch selects the per-backend handler.
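The resume and remediation checks can be sketched the same way. This is an illustration under stated assumptions, not the actual design: the metadata.json file name and its "backend" field stand in for whatever PRD 0040 defines, and dispatch_remediation and handle_capability_block stand in for the PRD 0039 dispatch. All class and function names here are hypothetical.

```python
import json
import tempfile
import unittest
from pathlib import Path


class StubDockerBackend:
    """Hypothetical stub backend; the handler reports who handled the request."""
    name = "docker"

    def handle_capability_block(self, request):
        return (self.name, request)


class StubSmolmachinesBackend:
    """Hypothetical stub backend with the same public surface."""
    name = "smolmachines"

    def handle_capability_block(self, request):
        return (self.name, request)


BACKENDS = {cls.name: cls for cls in (StubDockerBackend, StubSmolmachinesBackend)}


def write_metadata(state_dir, backend):
    # Assumed metadata format: a JSON file naming the backend that owns the state dir.
    path = Path(state_dir) / "metadata.json"
    path.write_text(json.dumps({"backend": backend.name}))


def select_backend(state_dir):
    # Read the metadata back and instantiate the backend it names.
    data = json.loads((Path(state_dir) / "metadata.json").read_text())
    return BACKENDS[data["backend"]]()


def dispatch_remediation(backend, request):
    # Assumed dispatch shape: delegate to the backend's own handler.
    return backend.handle_capability_block(request)


class ResumeAndRemediationParityTest(unittest.TestCase):
    def test_resume_selects_writing_backend(self):
        for name, cls in BACKENDS.items():
            with self.subTest(backend=name), tempfile.TemporaryDirectory() as state_dir:
                write_metadata(state_dir, cls())
                self.assertIsInstance(select_backend(state_dir), cls)

    def test_remediation_routes_to_backend_handler(self):
        for name, cls in BACKENDS.items():
            with self.subTest(backend=name):
                handled_by, _ = dispatch_remediation(cls(), {"capability": "net"})
                self.assertEqual(handled_by, name)
```

Using a temporary directory per sub-test keeps the resume scenario free of any live VM or Docker daemon, in line with the goal of mock-only tests.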
Testing Strategy
Run as part of the standard unit discover:
python3 -m unittest discover -s tests/unit
Or directly:
python3 -m unittest tests.unit.test_backend_parity
Open Questions
- Should parity tests live under tests/unit/ (mock-based) or tests/integration/ (live infra)? Mock-based is preferred to keep CI simple.