bot-bottle/docs/prds/0042-smolmachines-parity-tests.md
didericis-claude 2c061d9cd9
docs: mark PRD 0042 Active
2026-06-02 11:30:54 -04:00


PRD 0042: smolmachines Cross-Backend Parity Tests

  • Status: Active
  • Author: didericis-codex
  • Created: 2026-06-02
  • Issue: #139

Summary

Add tests that prove secrets, forwarded env, resume, and remediation behave equivalently across Docker and smolmachines backends. The fixes in PRDs 0038–0040 are unverifiable without this coverage.

Problem

The existing unit suite is broad but backend-specific. There are no tests that run the same scenario against both Docker and smolmachines and assert the outcomes match. A regression in one backend goes undetected until a live run, and PRDs 0038–0040 can each pass their own unit tests while the backends still diverge at the integration boundary.

Goals / Success Criteria

  • A parity test suite that covers at least:
    • Secret env injection: ?prompt and ${HOST_VAR} entries produce the same guest env on both backends.
    • Forwarded env: literal manifest env values reach the guest on both backends.
    • Resume: a preserved bottle state dir round-trips correctly on both backends (relies on PRD 0040 metadata).
    • Remediation: capability-block approval routes to the correct backend handler (relies on PRD 0039 dispatch).
  • Each scenario is parameterised so a failure names the backend that regressed.
  • Tests run without a live VM or Docker daemon (mock or stub backends).

Non-goals

  • No end-to-end agent execution tests.
  • No performance or load tests.
  • No changes to production code (test-only PRD).

Scope

In scope:

  • New test file(s) under tests/unit/ for parity scenarios.
  • Stub or mock implementations of smolmachines and Docker backends as needed.

Out of scope:

  • Changes to bot_bottle/ production code.
  • CI infrastructure changes beyond adding the new test file to the discover invocation.

Dependencies

  • PRD 0038 should land before the env parity tests are finalised.
  • PRDs 0039 and 0040 should land before the remediation and resume scenarios are finalised; stubs can be written speculatively beforehand.

Design

Parameterise each scenario over a list of backend factory functions. Each factory returns a bottle instance wired to a stub subprocess layer. The test body is backend-agnostic: it calls the same public API, captures the same observable output, and asserts equality.
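A minimal sketch of this parameterisation, using unittest's subTest so that a failure report names the backend. Every name here (RecordingRunner, the two stub bottle classes, start, observed_guest_env, BACKEND_FACTORIES) is hypothetical and stands in for whatever the real backends expose; the argv shapes are placeholders, not the production command lines.

```python
import unittest


class RecordingRunner:
    """Hypothetical stub subprocess layer: records calls instead of spawning."""

    def __init__(self):
        self.calls = []

    def run(self, argv, env=None):
        self.calls.append((list(argv), dict(env or {})))
        return 0


class StubDockerBottle:
    """Hypothetical stand-in that passes env to the guest as -e KEY=VALUE args."""

    def __init__(self, runner):
        self.runner = runner

    def start(self, env):
        argv = ["docker", "run"]
        for key, value in sorted(env.items()):
            argv += ["-e", f"{key}={value}"]
        self.runner.run(argv)

    def observed_guest_env(self):
        argv, _ = self.runner.calls[-1]
        pairs = [argv[i + 1] for i, arg in enumerate(argv) if arg == "-e"]
        return dict(pair.split("=", 1) for pair in pairs)


class StubSmolmachinesBottle:
    """Hypothetical stand-in that passes env through the process environment."""

    def __init__(self, runner):
        self.runner = runner

    def start(self, env):
        # Placeholder argv; the real launcher command is not assumed here.
        self.runner.run(["stub-vm-launcher"], env=env)

    def observed_guest_env(self):
        _, env = self.runner.calls[-1]
        return env


# Each factory returns a bottle wired to its own stub subprocess layer.
BACKEND_FACTORIES = [
    ("docker", lambda: StubDockerBottle(RecordingRunner())),
    ("smolmachines", lambda: StubSmolmachinesBottle(RecordingRunner())),
]


class EnvParityTest(unittest.TestCase):
    def test_literal_env_reaches_guest(self):
        manifest_env = {"LOG_LEVEL": "debug", "REGION": "eu-west-1"}
        for name, factory in BACKEND_FACTORIES:
            # subTest labels any failure with the backend that regressed.
            with self.subTest(backend=name):
                bottle = factory()
                bottle.start(manifest_env)
                self.assertEqual(bottle.observed_guest_env(), manifest_env)
```

The test body never branches on the backend; only the factories and the stubs know how each backend hands env to the guest.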

For env scenarios, capture the argv or env-file content passed to the guest and compare against resolved manifest values. For resume, write metadata with one backend class and read it back to verify correct selection. For remediation, assert dispatch selects the per-backend handler.
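The resume and remediation checks could be sketched along the following lines. This is an illustration under stated assumptions: the metadata.json file name, its "backend" field, and the helper and handler names are all hypothetical, since the actual metadata format and dispatch API are defined by PRDs 0040 and 0039 rather than here.

```python
import json
import tempfile
import unittest
from pathlib import Path


class StubDockerBottle:
    backend_name = "docker"


class StubSmolmachinesBottle:
    backend_name = "smolmachines"


BACKENDS = {cls.backend_name: cls for cls in (StubDockerBottle, StubSmolmachinesBottle)}


def write_metadata(state_dir, bottle_cls):
    """Hypothetical writer: records which backend owns the state dir."""
    payload = {"backend": bottle_cls.backend_name}
    (Path(state_dir) / "metadata.json").write_text(json.dumps(payload))


def select_backend(state_dir):
    """Hypothetical reader: picks the backend class named in the metadata."""
    payload = json.loads((Path(state_dir) / "metadata.json").read_text())
    return BACKENDS[payload["backend"]]


# Hypothetical per-backend remediation handlers; each tags its result.
REMEDIATION_HANDLERS = {
    "docker": lambda block: ("docker-handler", block),
    "smolmachines": lambda block: ("smolmachines-handler", block),
}


def dispatch_remediation(backend_name, block):
    return REMEDIATION_HANDLERS[backend_name](block)


class ResumeAndRemediationParityTest(unittest.TestCase):
    def test_metadata_round_trip_selects_same_backend(self):
        for name, bottle_cls in BACKENDS.items():
            with self.subTest(backend=name):
                with tempfile.TemporaryDirectory() as state_dir:
                    write_metadata(state_dir, bottle_cls)
                    self.assertIs(select_backend(state_dir), bottle_cls)

    def test_dispatch_routes_to_backend_handler(self):
        for name in BACKENDS:
            with self.subTest(backend=name):
                handler_tag, block = dispatch_remediation(name, "net-access")
                self.assertEqual(handler_tag, f"{name}-handler")
                self.assertEqual(block, "net-access")
```

In the real suite the stubs and handlers would be replaced by the production classes wired to a stub subprocess layer, so the assertions exercise the actual selection and dispatch code.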

Testing Strategy

Run as part of the standard unit discover:

  • python3 -m unittest discover -s tests/unit

Or directly:

  • python3 -m unittest tests.unit.test_backend_parity

Open Questions

  • Should parity tests live under tests/unit/ (mock-based) or tests/integration/ (live infra)? Mock-based is preferred to keep CI simple.