test: reorganize suite into unit/integration/canaries directories

Replace the hand-maintained INTEGRATION_NAMES classifier (and the bespoke
run_tests.py around it) with a directory-driven split:

  tests/unit/         unit tests, always run
  tests/integration/  Docker-dependent, skip cleanly without Docker
  tests/canaries/     upstream-regression checks, opt-in via
                      CLAUDE_BOTTLE_RUN_CANARIES=1

The pinned-pipelock-image check moves to the canary suite: it tests
upstream packaging, not our code, so it shouldn't gate every dev push.
A scheduled canaries.yml workflow runs it weekly.

The manifest-runtime tests collapse the four assertRaises cases for
distinct 'runtime' values into one subTest loop and drop the
error-message-wording assertions; the contract is "any value is
rejected", not "the error literally contains 'auto-detect'".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 16:23:02 -04:00
parent 83fe5741f5
commit 4462863d56
16 changed files with 157 additions and 207 deletions
@@ -8,47 +8,58 @@ tests need Docker and skip cleanly otherwise.

 ```
 tests/
-  run_tests.py                    # entry point
-  fixtures.py                     # JSON manifest builders
-  _docker.py                      # docker-availability skip helper
-  test_pipelock_naming.py         # unit
-  test_pipelock_classify.py       # unit
-  test_pipelock_allowlist.py      # unit
-  test_pipelock_yaml.py           # unit
-  test_pipelock_image.py          # integration
-  test_pipelock_sidecar_smoke.py  # integration
-  test_dry_run_plan.py            # integration
-  test_orphan_cleanup.py          # integration
+  fixtures.py                       # JSON manifest builders (shared)
+  _docker.py                        # docker-availability skip helper (shared)
+  unit/
+    test_pipelock_classify.py
+    test_pipelock_allowlist.py
+    test_pipelock_yaml.py
+    test_manifest_runtime.py
+  integration/
+    test_pipelock_sidecar_smoke.py
+    test_dry_run_plan.py
+    test_orphan_cleanup.py
+  canaries/
+    test_pipelock_image.py          # opt-in; see below
 ```

+Classification falls out of the directory — no hand-maintained list to
+keep in sync.
+
 ## Running

 ```bash
-tests/run_tests.py                                  # everything
-tests/run_tests.py unit                             # unit only
-tests/run_tests.py integration                      # integration only
-tests/run_tests.py tests/test_pipelock_yaml.py      # one file
+python -m unittest discover -t . -s tests/unit -v         # unit only
+python -m unittest discover -t . -s tests/integration -v  # integration only
+python -m unittest discover -t . -s tests -v              # both (recursive)
+python -m unittest tests.unit.test_pipelock_yaml          # one file
 ```

-You can also run via `python -m unittest`:
-
-```bash
-python -m unittest discover -s tests
-python -m unittest tests.test_pipelock_yaml
-```
+Discovery is invoked with `-t .` (top-level dir = repo root) so the
+`claude_bottle` package on `sys.path` resolves correctly.

 ## What the integration tests cover

-- `test_pipelock_image.py` — the pinned digest is reachable, ENTRYPOINT
-  is `/pipelock`, and `CMD` includes `run`.
-- `test_pipelock_sidecar_smoke.py` — `docker create` + `docker cp` the
-  generated YAML to `/etc/pipelock.yaml` + `docker start`, then probe
-  `/health`.
-- `test_dry_run_plan.py` — `cli.py start --dry-run` shows the resolved
-  egress allowlist and creates zero docker resources.
-- `test_orphan_cleanup.py` — network_remove and pipelock_stop are
-  idempotent against missing resources, so the EXIT trap can call them
-  unconditionally.
+- `test_pipelock_sidecar_smoke.py` — drives `DockerPipelockProxy.prepare`
+  + `.start` (the production code path) against a real Docker daemon and
+  probes the sidecar's `/health` from an in-network curl container.
+- `test_dry_run_plan.py` — `cli.py start --dry-run --format=json` emits
+  a structured plan that contains the resolved egress allowlist and
+  the bottle's runtime, and creates zero Docker resources.
+- `test_orphan_cleanup.py` — `network_remove` and `PipelockProxy.stop`
+  are idempotent against missing resources, so the EXIT trap can call
+  them unconditionally.
+
+## Canaries
+
+`tests/canaries/` holds upstream-regression checks (e.g. the pinned
+pipelock digest's binary still runs). These are gated on
+`CLAUDE_BOTTLE_RUN_CANARIES=1` and not part of the per-push suite.
+They're invoked by the scheduled `canaries` workflow.
+
+```bash
+CLAUDE_BOTTLE_RUN_CANARIES=1 python -m unittest discover -t . -s tests/canaries -v
+```

 ## What's NOT covered

@@ -60,9 +71,10 @@ python -m unittest tests.test_pipelock_yaml

 ## Adding a test

-1. Pick a filename: `test_<topic>.py`. Add it to `INTEGRATION_NAMES`
-   in `run_tests.py` if it needs Docker.
-2. Boilerplate:
+1. Pick the directory: `tests/unit/` for a pure unit test,
+   `tests/integration/` for one that needs Docker.
+2. Filename: `test_<topic>.py`.
+3. Boilerplate:
   ```python
   import unittest

@@ -75,5 +87,5 @@ python -m unittest tests.test_pipelock_yaml
   if __name__ == "__main__":
       unittest.main()
   ```
-3. For Docker-dependent tests, decorate the class with
+4. For Docker-dependent tests, decorate the class with
   `@skip_unless_docker()` from `tests._docker`.
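`tests/_docker.py` is not touched by this diff, so the body of `skip_unless_docker()` is not shown. A minimal sketch of how such a helper could look, assuming it only needs a `docker` binary on `PATH` and a daemon that answers `docker info` (the `_docker_available` name, the timeout, and the whole body are hypothetical, not the repository's code):

```python
import shutil
import subprocess
import unittest


def _docker_available() -> bool:
    """Return True only if a docker CLI is on PATH and the daemon responds."""
    if shutil.which("docker") is None:
        return False
    try:
        proc = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,  # assumed bound so a hung daemon cannot stall the suite
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return proc.returncode == 0


def skip_unless_docker():
    """Decorator factory: skip the decorated test class or method without Docker."""
    return unittest.skipUnless(_docker_available(), "docker is not available")


@skip_unless_docker()
class TestNeedsDocker(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)  # stand-in for a real Docker-backed assertion
```

Because the helper returns a `unittest.skipUnless` decorator, a machine without Docker reports the class as skipped, not failed, which matches the "skip cleanly" wording in the README.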
@@ -1,7 +1,13 @@
-"""Integration: the pinned pipelock image's binary actually runs.
-Catches a broken upstream packaging at the pinned digest. Requires
-docker."""
+"""Canary: the pinned pipelock image's binary actually runs.

+This test exists to catch a broken upstream packaging at the pinned
+digest. It is NOT part of the per-push suite — that would couple every
+dev push to upstream registry availability. Set
+CLAUDE_BOTTLE_RUN_CANARIES=1 to opt in (a scheduled CI workflow does
+this; humans can run it ad-hoc the same way).
+"""
+
+import os
 import subprocess
 import unittest

@@ -9,6 +15,10 @@ from claude_bottle.backend.docker.pipelock import PIPELOCK_IMAGE
 from tests._docker import skip_unless_docker


+@unittest.skipUnless(
+    os.environ.get("CLAUDE_BOTTLE_RUN_CANARIES") == "1",
+    "canary suite is opt-in; set CLAUDE_BOTTLE_RUN_CANARIES=1 to run",
+)
 @skip_unless_docker()
 class TestPipelockImage(unittest.TestCase):
     @classmethod
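The opt-in gate is an ordinary `unittest.skipUnless` whose condition is evaluated once, when the class statement executes (at import time in the real module). The sketch below illustrates that behaviour in isolation; it reuses the variable name from the diff, while `build_canary_case`, `run_case`, and `skipped_count` are hypothetical helpers written only for this illustration:

```python
import os
import unittest

ENV_FLAG = "CLAUDE_BOTTLE_RUN_CANARIES"  # name taken from the diff


def build_canary_case():
    """Define the test class now, so the decorator reads the current environment."""

    @unittest.skipUnless(
        os.environ.get(ENV_FLAG) == "1",
        f"canary suite is opt-in; set {ENV_FLAG}=1 to run",
    )
    class TestCanary(unittest.TestCase):
        def test_upstream_still_works(self):
            self.assertTrue(True)  # placeholder for a real upstream check

    return TestCanary


def run_case(case):
    """Run one TestCase class and return the populated TestResult."""
    result = unittest.TestResult()
    unittest.defaultTestLoader.loadTestsFromTestCase(case).run(result)
    return result


def skipped_count(flag_value):
    """Count skips for a class defined while ENV_FLAG holds flag_value (None = unset)."""
    saved = os.environ.get(ENV_FLAG)
    try:
        if flag_value is None:
            os.environ.pop(ENV_FLAG, None)
        else:
            os.environ[ENV_FLAG] = flag_value
        return len(run_case(build_canary_case()).skipped)
    finally:
        # Restore whatever the caller had set.
        if saved is None:
            os.environ.pop(ENV_FLAG, None)
        else:
            os.environ[ENV_FLAG] = saved
```

Two consequences follow from the comparison `== "1"`: values such as `true` or `yes` leave the suite skipped, and setting the variable after the module has been imported has no effect on an already-decorated class.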
@@ -1,91 +0,0 @@
-#!/usr/bin/env python3
-"""Test runner. Wraps unittest's discovery so we can split unit /
-integration the same way the bash runner did.
-
-Usage:
-  tests/run_tests.py             # unit + integration
-  tests/run_tests.py unit        # unit only (no docker)
-  tests/run_tests.py integration # integration only (need docker)
-  tests/run_tests.py tests/test_x.py  # one specific file (or path)
-
-Tests are auto-classified as integration when their filename matches
-one of INTEGRATION_NAMES below; everything else is a unit test.
-"""
-
-from __future__ import annotations
-
-import sys
-import unittest
-from pathlib import Path
-
-REPO_ROOT = Path(__file__).resolve().parent.parent
-TESTS_DIR = REPO_ROOT / "tests"
-
-INTEGRATION_NAMES = {
-    "test_dry_run_plan.py",
-    "test_orphan_cleanup.py",
-    "test_pipelock_image.py",
-    "test_pipelock_sidecar_smoke.py",
-}
-
-
-def _all_test_files() -> list[Path]:
-    return sorted(TESTS_DIR.glob("test_*.py"))
-
-
-def _classify(path: Path) -> str:
-    return "integration" if path.name in INTEGRATION_NAMES else "unit"
-
-
-def _modname(path: Path) -> str:
-    return f"tests.{path.stem}"
-
-
-def _build_suite(files: list[Path]) -> unittest.TestSuite:
-    loader = unittest.TestLoader()
-    suite = unittest.TestSuite()
-    for f in files:
-        suite.addTests(loader.loadTestsFromName(_modname(f)))
-    return suite
-
-
-def usage() -> None:
-    sys.stderr.write(
-        "usage: tests/run_tests.py [unit|integration|path/to/test.py]\n"
-    )
-
-
-def main(argv: list[str]) -> int:
-    sys.path.insert(0, str(REPO_ROOT))
-
-    if not argv:
-        files = _all_test_files()
-    else:
-        arg = argv[0]
-        if arg in ("-h", "--help"):
-            usage()
-            return 0
-        if arg == "unit":
-            files = [f for f in _all_test_files() if _classify(f) == "unit"]
-        elif arg == "integration":
-            files = [f for f in _all_test_files() if _classify(f) == "integration"]
-        else:
-            p = Path(arg).resolve()
-            if not p.is_file():
-                sys.stderr.write(f"no such file: {arg}\n")
-                usage()
-                return 2
-            files = [p]
-
-    if not files:
-        sys.stderr.write("no test files found\n")
-        return 2
-
-    suite = _build_suite(files)
-    runner = unittest.TextTestRunner(verbosity=2)
-    result = runner.run(suite)
-    return 0 if result.wasSuccessful() else 1
-
-
-if __name__ == "__main__":
-    sys.exit(main(sys.argv[1:]))
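With the runner gone, classification depends entirely on which directory discovery starts from. The following sketch reproduces that with the standard `unittest.TestLoader.discover` API on a throwaway tree; `demo_suite` is a hypothetical stand-in for the repository's `tests/` package, and the sketch assumes each subdirectory carries an `__init__.py`, since discovery requires the start directory to be importable from the top-level directory:

```python
import tempfile
import textwrap
import unittest
from pathlib import Path

PACKAGE = "demo_suite"  # hypothetical stand-in for the repo's tests/ package

TEST_BODY = textwrap.dedent(
    """
    import unittest

    class TestDemo(unittest.TestCase):
        def test_ok(self):
            self.assertTrue(True)
    """
)


def make_tree(root: Path) -> None:
    """Create demo_suite/{unit,integration}/ with one test module each."""
    for sub, module in (("unit", "test_alpha.py"), ("integration", "test_beta.py")):
        directory = root / PACKAGE / sub
        directory.mkdir(parents=True)
        (directory / "__init__.py").write_text("")
        (directory / module).write_text(TEST_BODY)
    (root / PACKAGE / "__init__.py").write_text("")


def discovered_ids(root: Path, start: str) -> list[str]:
    """Rough equivalent of `python -m unittest discover -t <root> -s <root>/<start>`."""
    loader = unittest.TestLoader()
    suite = loader.discover(str(root / start), top_level_dir=str(root))
    ids = []

    def walk(node):
        # discover() returns nested suites; flatten them to test ids.
        for item in node:
            if isinstance(item, unittest.TestSuite):
                walk(item)
            else:
                ids.append(item.id())

    walk(suite)
    return sorted(ids)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    make_tree(root)
    unit_ids = discovered_ids(root, f"{PACKAGE}/unit")
    integration_ids = discovered_ids(root, f"{PACKAGE}/integration")
    all_ids = discovered_ids(root, PACKAGE)
```

Starting at a subdirectory yields only that directory's tests, while starting at the package root recurses into every subpackage. By the same mechanism, `-s tests` in the real repository would presumably also walk `tests/canaries/`, where the tests report as skipped unless the opt-in variable is set.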
@@ -1,68 +0,0 @@
-"""Unit: bottle 'runtime' field is no longer supported (PRD 0003).
-
-gVisor is now auto-detected by the Docker factory. A manifest carrying
-the legacy 'runtime' field must fail loudly with a message pointing the
-user at the auto-detect behavior, rather than silently ignoring."""
-
-import io
-import sys
-import unittest
-
-from claude_bottle.log import Die
-from claude_bottle.manifest import Bottle, Manifest
-
-
-_ABSENT = object()
-
-
-def _manifest(runtime_value: object) -> dict:
-    """Build a minimal manifest JSON shape with one bottle whose runtime
-    field is set (or absent if `runtime_value is _ABSENT`)."""
-    bottle: dict = {}
-    if runtime_value is not _ABSENT:
-        bottle["runtime"] = runtime_value
-    return {
-        "bottles": {"dev": bottle},
-        "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}},
-    }
-
-
-class TestManifestRuntimeRemoved(unittest.TestCase):
-    def test_loads_when_runtime_absent(self):
-        m = Manifest.from_json_obj(_manifest(_ABSENT))
-        self.assertIn("dev", m.bottles)
-
-    def test_bottle_dataclass_has_no_runtime_attribute(self):
-        """Structural check: the field has been removed from the dataclass."""
-        b = Bottle()
-        self.assertFalse(hasattr(b, "runtime"))
-
-    def test_rejects_runsc_value_with_helpful_message(self):
-        captured = io.StringIO()
-        old_stderr = sys.stderr
-        sys.stderr = captured
-        try:
-            with self.assertRaises(Die):
-                Manifest.from_json_obj(_manifest("runsc"))
-        finally:
-            sys.stderr = old_stderr
-        msg = captured.getvalue()
-        self.assertIn("'runtime'", msg, "error names the field")
-        self.assertIn("auto-detect", msg, "error points at the new behavior")
-
-    def test_rejects_runc_value(self):
-        with self.assertRaises(Die):
-            Manifest.from_json_obj(_manifest("runc"))
-
-    def test_rejects_unknown_value(self):
-        with self.assertRaises(Die):
-            Manifest.from_json_obj(_manifest("kata-runtime"))
-
-    def test_rejects_non_string(self):
-        """Any presence of the field is an error; type is not consulted."""
-        with self.assertRaises(Die):
-            Manifest.from_json_obj(_manifest(42))
-
-
-if __name__ == "__main__":
-    unittest.main()
@@ -0,0 +1,39 @@
+"""Unit: bottle 'runtime' field is no longer supported (PRD 0003).
+
+gVisor is now auto-detected by the Docker factory. A manifest carrying
+the legacy 'runtime' field must fail, regardless of value, rather than
+silently ignoring."""
+
+import unittest
+
+from claude_bottle.log import Die
+from claude_bottle.manifest import Bottle, Manifest
+
+
+def _manifest_with_runtime(value: object) -> dict:
+    return {
+        "bottles": {"dev": {"runtime": value}},
+        "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}},
+    }
+
+
+class TestManifestRuntimeRemoved(unittest.TestCase):
+    def test_loads_when_runtime_absent(self):
+        m = Manifest.from_json_obj({
+            "bottles": {"dev": {}},
+            "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}},
+        })
+        self.assertIn("dev", m.bottles)
+
+    def test_bottle_dataclass_has_no_runtime_attribute(self):
+        self.assertFalse(hasattr(Bottle(), "runtime"))
+
+    def test_any_runtime_value_is_rejected(self):
+        for value in ("runsc", "runc", "kata-runtime", "", 42, None):
+            with self.subTest(value=value):
+                with self.assertRaises(Die):
+                    Manifest.from_json_obj(_manifest_with_runtime(value))
+
+
+if __name__ == "__main__":
+    unittest.main()
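The `subTest` loop keeps running after one value fails and records each failing value separately, so the falsy entries (`""` and `None`) still earn their place in the list. A self-contained sketch of that behaviour, using a hypothetical `Rejected` exception and hypothetical loaders in place of `Die` and `Manifest.from_json_obj`:

```python
import unittest


class Rejected(Exception):
    """Hypothetical stand-in for claude_bottle.log.Die."""


def load_bottle(bottle: dict) -> dict:
    """Hypothetical loader: rejects any bottle that carries a 'runtime' key."""
    if "runtime" in bottle:
        raise Rejected("'runtime' is no longer supported")
    return bottle


def buggy_load_bottle(bottle: dict) -> dict:
    """Deliberately wrong variant: a truthiness check lets "" and None through."""
    if bottle.get("runtime"):
        raise Rejected("'runtime' is no longer supported")
    return bottle


def make_case(loader):
    """Build the same subTest loop as the diff, bound to the given loader."""

    class TestAnyRuntimeValueIsRejected(unittest.TestCase):
        def test_any_runtime_value_is_rejected(self):
            for value in ("runsc", "runc", "kata-runtime", "", 42, None):
                with self.subTest(value=value):
                    with self.assertRaises(Rejected):
                        loader({"runtime": value})

    return TestAnyRuntimeValueIsRejected


def failure_count(loader) -> int:
    """Run the case and return how many sub-tests were recorded as failures."""
    result = unittest.TestResult()
    unittest.defaultTestLoader.loadTestsFromTestCase(make_case(loader)).run(result)
    return len(result.failures)
```

Against the presence-based loader every sub-test passes. Against the truthiness-based one, the loop reports two separate failures, for `""` and `None`, where a plain loop without `subTest` would stop at the first failing value.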