test: reorganize suite into unit/integration/canaries directories

Replace the hand-maintained INTEGRATION_NAMES classifier (and the bespoke
run_tests.py around it) with a directory-driven split:

  tests/unit/         unit tests, always run
  tests/integration/  Docker-dependent, skip cleanly without Docker
  tests/canaries/     upstream-regression checks, opt-in via
                      CLAUDE_BOTTLE_RUN_CANARIES=1

The pinned-pipelock-image check moves to the canary suite: it tests
upstream packaging, not our code, so it shouldn't gate every dev push.
A scheduled canaries.yml workflow runs it weekly.

The manifest-runtime tests collapse the four assertRaises cases for
distinct 'runtime' values into one subTest loop and drop the
error-message-wording assertions; the contract is "any value is
rejected", not "the error literally contains 'auto-detect'".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 16:23:02 -04:00
parent 83fe5741f5
commit 4462863d56
16 changed files with 157 additions and 207 deletions
@@ -0,0 +1,31 @@
 # Weekly canary suite. Catches upstream regressions (broken pipelock
 # image packaging at the pinned digest, etc.) without coupling every
 # dev push to upstream registry availability.
 #
 # Opt-in via CLAUDE_BOTTLE_RUN_CANARIES=1 so the same files can be run
 # locally with the same gating.
 name: canaries
 on:
  schedule:
    # 12:00 UTC every Monday.
    - cron: "0 12 * * 1"
  workflow_dispatch:
 jobs:
  canaries:
    runs-on: ubuntu-latest
    env:
      CLAUDE_BOTTLE_RUN_CANARIES: "1"
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Run canaries
        run: python3 -m unittest discover -t . -s tests/canaries -v
@@ -1,10 +1,14 @@
-# Run the project's full test suite on every PR push and on push to main.
+# Run the project's test suite on every PR push and on push to main.
 #
-# The suite uses stdlib `unittest` (see tests/run_tests.py) — no external
+# The suite uses stdlib `unittest` discovery — no external Python
-# Python dependencies are required to execute it. Integration tests need a
+# dependencies are required to execute it. Tests are split by directory:
-# reachable Docker daemon; if Docker is unavailable on the runner those
+#
-# tests skip cleanly via tests/_docker.py:skip_unless_docker, so the job
+#   tests/unit/         — pure unit tests; always run
-# still passes (with skips visible in the run output).
+#   tests/integration/  — need a reachable Docker daemon; skip cleanly
 #                         (via tests/_docker.py:skip_unless_docker) when
 #                         Docker isn't available on the runner
 #   tests/canaries/     — upstream regression canaries; run on a separate
 #                         schedule (see canaries.yml), not here
 #
 # This workflow assumes the Gitea Actions runner exposes the host Docker
 # socket to the job container so `docker` commands inside the job can
@@ -20,8 +24,21 @@ on:
  pull_request:
 jobs:
-  test:
+  unit:
-    name: run tests/run_tests.py
+    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Run unit tests
        run: python3 -m unittest discover -t . -s tests/unit -v
  integration:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
@@ -41,5 +58,5 @@ jobs:
            echo "docker not on PATH — integration tests will skip"
          fi
-      - name: Run full test suite
+      - name: Run integration tests
-        run: python3 tests/run_tests.py
+        run: python3 -m unittest discover -t . -s tests/integration -v
@@ -8,47 +8,58 @@ tests need Docker and skip cleanly otherwise.
 ```
 tests/
-  run_tests.py                    # entry point
+  fixtures.py                       # JSON manifest builders (shared)
-  fixtures.py                     # JSON manifest builders
+  _docker.py                        # docker-availability skip helper (shared)
-  _docker.py                      # docker-availability skip helper
+  unit/
-  test_pipelock_naming.py         # unit
+    test_pipelock_classify.py
-  test_pipelock_classify.py       # unit
+    test_pipelock_allowlist.py
-  test_pipelock_allowlist.py      # unit
+    test_pipelock_yaml.py
-  test_pipelock_yaml.py           # unit
+    test_manifest_runtime.py
-  test_pipelock_image.py          # integration
+  integration/
-  test_pipelock_sidecar_smoke.py  # integration
+    test_pipelock_sidecar_smoke.py
-  test_dry_run_plan.py            # integration
+    test_dry_run_plan.py
-  test_orphan_cleanup.py          # integration
+    test_orphan_cleanup.py
  canaries/
    test_pipelock_image.py          # opt-in; see below
 ```
 Classification falls out of the directory — no hand-maintained list to
 keep in sync.
 ## Running
 ```bash
-tests/run_tests.py                                  # everything
+python -m unittest discover -t . -s tests/unit -v         # unit only
-tests/run_tests.py unit                             # unit only
+python -m unittest discover -t . -s tests/integration -v  # integration only
-tests/run_tests.py integration                      # integration only
+python -m unittest discover -t . -s tests -v              # both (recursive)
-tests/run_tests.py tests/test_pipelock_yaml.py      # one file
+python -m unittest tests.unit.test_pipelock_yaml          # one file
 ```
-You can also run via `python -m unittest`:
+Discovery is invoked with `-t .` (top-level dir = repo root) so the
-
+`claude_bottle` package on `sys.path` resolves correctly.
 ```bash
 python -m unittest discover -s tests
 python -m unittest tests.test_pipelock_yaml
 ```
 ## What the integration tests cover
- `test_pipelock_image.py` — the pinned digest is reachable, ENTRYPOINT
+- `test_pipelock_sidecar_smoke.py` — drives `DockerPipelockProxy.prepare`
-  is `/pipelock`, and `CMD` includes `run`.
+  + `.start` (the production code path) against a real Docker daemon and
- `test_pipelock_sidecar_smoke.py` — `docker create` + `docker cp` the
+  probes the sidecar's `/health` from an in-network curl container.
-  generated YAML to `/etc/pipelock.yaml` + `docker start`, then probe
+- `test_dry_run_plan.py` — `cli.py start --dry-run --format=json` emits
-  `/health`.
+  a structured plan that contains the resolved egress allowlist and
- `test_dry_run_plan.py` — `cli.py start --dry-run` shows the resolved
+  the bottle's runtime, and creates zero Docker resources.
-  egress allowlist and creates zero docker resources.
+- `test_orphan_cleanup.py` — `network_remove` and `PipelockProxy.stop`
- `test_orphan_cleanup.py` — network_remove and pipelock_stop are
+  are idempotent against missing resources, so the EXIT trap can call
-  idempotent against missing resources, so the EXIT trap can call them
+  them unconditionally.
-  unconditionally.
+
 ## Canaries
 `tests/canaries/` holds upstream-regression checks (e.g. the pinned
 pipelock digest's binary still runs). These are gated on
 `CLAUDE_BOTTLE_RUN_CANARIES=1` and not part of the per-push suite.
 They're invoked by the scheduled `canaries` workflow.
 ```bash
 CLAUDE_BOTTLE_RUN_CANARIES=1 python -m unittest discover -t . -s tests/canaries -v
 ```
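The opt-in gate is a plain `unittest.skipUnless` on the environment variable. As a rough sketch of how that gate could be factored into a reusable class decorator (the names `canary` and `CANARY_ENV` are hypothetical and do not exist in this repo):

```python
import os
import unittest

CANARY_ENV = "CLAUDE_BOTTLE_RUN_CANARIES"


def canary(cls):
    """Hypothetical decorator: skip the class unless the opt-in variable is "1"."""
    return unittest.skipUnless(
        os.environ.get(CANARY_ENV) == "1",
        f"canary suite is opt-in; set {CANARY_ENV}=1 to run",
    )(cls)


@canary
class TestExampleCanary(unittest.TestCase):
    def test_placeholder(self):
        # Stand-in for a real upstream check.
        self.assertTrue(True)
```

The condition is evaluated when the decorator is applied, that is, when the test module is imported, so the variable has to be set before discovery starts rather than from inside a test.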
 ## What's NOT covered
@@ -60,9 +71,10 @@ python -m unittest tests.test_pipelock_yaml
 ## Adding a test
-1. Pick a filename: `test_<topic>.py`. Add it to `INTEGRATION_NAMES`
+1. Pick the directory: `tests/unit/` for a pure unit test,
-   in `run_tests.py` if it needs Docker.
+   `tests/integration/` for one that needs Docker.
-2. Boilerplate:
+2. Filename: `test_<topic>.py`.
 3. Boilerplate:
   ```python
   import unittest
@@ -75,5 +87,5 @@ python -m unittest tests.test_pipelock_yaml
   if __name__ == "__main__":
       unittest.main()
   ```
-3. For Docker-dependent tests, decorate the class with
+4. For Docker-dependent tests, decorate the class with
   `@skip_unless_docker()` from `tests._docker`.
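
`tests/_docker.py` itself is not part of this diff. Purely as an illustration of the pattern, and not the actual implementation, a helper of this shape could probe the daemon with `docker info`; the function `docker_available` and the timeout value are assumptions.

```python
import shutil
import subprocess
import unittest


def docker_available(timeout: float = 5.0) -> bool:
    """Return True if a docker CLI is on PATH and `docker info` succeeds."""
    if shutil.which("docker") is None:
        return False
    try:
        proc = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return proc.returncode == 0


def skip_unless_docker():
    """Assumed shape of the helper: a decorator factory that skips without Docker."""
    return unittest.skipUnless(docker_available(), "docker daemon not reachable")
```

Because the result is a standard `unittest` skip decorator, a skipped class is reported as skipped rather than failed, which matches the "skip cleanly" behaviour described for `tests/integration/`.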
@@ -1,7 +1,13 @@
-"""Integration: the pinned pipelock image's binary actually runs.
+"""Canary: the pinned pipelock image's binary actually runs.
 Catches a broken upstream packaging at the pinned digest. Requires
 docker."""
 This test exists to catch a broken upstream packaging at the pinned
 digest. It is NOT part of the per-push suite — that would couple every
 dev push to upstream registry availability. Set
 CLAUDE_BOTTLE_RUN_CANARIES=1 to opt in (a scheduled CI workflow does
 this; humans can run it ad-hoc the same way).
 """
 import os
 import subprocess
 import unittest
@@ -9,6 +15,10 @@ from claude_bottle.backend.docker.pipelock import PIPELOCK_IMAGE
 from tests._docker import skip_unless_docker
@unittest.skipUnless(
    os.environ.get("CLAUDE_BOTTLE_RUN_CANARIES") == "1",
    "canary suite is opt-in; set CLAUDE_BOTTLE_RUN_CANARIES=1 to run",
 )
@skip_unless_docker()
 class TestPipelockImage(unittest.TestCase):
    @classmethod
@@ -1,91 +0,0 @@
 #!/usr/bin/env python3
 """Test runner. Wraps unittest's discovery so we can split unit /
 integration the same way the bash runner did.
 Usage:
  tests/run_tests.py             # unit + integration
  tests/run_tests.py unit        # unit only (no docker)
  tests/run_tests.py integration # integration only (need docker)
  tests/run_tests.py tests/test_x.py  # one specific file (or path)
 Tests are auto-classified as integration when their filename matches
 one of INTEGRATION_NAMES below; everything else is a unit test.
 """
 from __future__ import annotations
 import sys
 import unittest
 from pathlib import Path
 REPO_ROOT = Path(__file__).resolve().parent.parent
 TESTS_DIR = REPO_ROOT / "tests"
 INTEGRATION_NAMES = {
    "test_dry_run_plan.py",
    "test_orphan_cleanup.py",
    "test_pipelock_image.py",
    "test_pipelock_sidecar_smoke.py",
 }
 def _all_test_files() -> list[Path]:
    return sorted(TESTS_DIR.glob("test_*.py"))
 def _classify(path: Path) -> str:
    return "integration" if path.name in INTEGRATION_NAMES else "unit"
 def _modname(path: Path) -> str:
    return f"tests.{path.stem}"
 def _build_suite(files: list[Path]) -> unittest.TestSuite:
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for f in files:
        suite.addTests(loader.loadTestsFromName(_modname(f)))
    return suite
 def usage() -> None:
    sys.stderr.write(
        "usage: tests/run_tests.py [unit|integration|path/to/test.py]\n"
    )
 def main(argv: list[str]) -> int:
    sys.path.insert(0, str(REPO_ROOT))
    if not argv:
        files = _all_test_files()
    else:
        arg = argv[0]
        if arg in ("-h", "--help"):
            usage()
            return 0
        if arg == "unit":
            files = [f for f in _all_test_files() if _classify(f) == "unit"]
        elif arg == "integration":
            files = [f for f in _all_test_files() if _classify(f) == "integration"]
        else:
            p = Path(arg).resolve()
            if not p.is_file():
                sys.stderr.write(f"no such file: {arg}\n")
                usage()
                return 2
            files = [p]
    if not files:
        sys.stderr.write("no test files found\n")
        return 2
    suite = _build_suite(files)
    runner = unittest.TextTestRunner(verbosity=2)
    result = runner.run(suite)
    return 0 if result.wasSuccessful() else 1
 if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
@@ -1,68 +0,0 @@
 """Unit: bottle 'runtime' field is no longer supported (PRD 0003).
 gVisor is now auto-detected by the Docker factory. A manifest carrying
 the legacy 'runtime' field must fail loudly with a message pointing the
 user at the auto-detect behavior, rather than silently ignoring."""
 import io
 import sys
 import unittest
 from claude_bottle.log import Die
 from claude_bottle.manifest import Bottle, Manifest
 _ABSENT = object()
 def _manifest(runtime_value: object) -> dict:
    """Build a minimal manifest JSON shape with one bottle whose runtime
    field is set (or absent if `runtime_value is _ABSENT`)."""
    bottle: dict = {}
    if runtime_value is not _ABSENT:
        bottle["runtime"] = runtime_value
    return {
        "bottles": {"dev": bottle},
        "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}},
    }
 class TestManifestRuntimeRemoved(unittest.TestCase):
    def test_loads_when_runtime_absent(self):
        m = Manifest.from_json_obj(_manifest(_ABSENT))
        self.assertIn("dev", m.bottles)
    def test_bottle_dataclass_has_no_runtime_attribute(self):
        """Structural check: the field has been removed from the dataclass."""
        b = Bottle()
        self.assertFalse(hasattr(b, "runtime"))
    def test_rejects_runsc_value_with_helpful_message(self):
        captured = io.StringIO()
        old_stderr = sys.stderr
        sys.stderr = captured
        try:
            with self.assertRaises(Die):
                Manifest.from_json_obj(_manifest("runsc"))
        finally:
            sys.stderr = old_stderr
        msg = captured.getvalue()
        self.assertIn("'runtime'", msg, "error names the field")
        self.assertIn("auto-detect", msg, "error points at the new behavior")
    def test_rejects_runc_value(self):
        with self.assertRaises(Die):
            Manifest.from_json_obj(_manifest("runc"))
    def test_rejects_unknown_value(self):
        with self.assertRaises(Die):
            Manifest.from_json_obj(_manifest("kata-runtime"))
    def test_rejects_non_string(self):
        """Any presence of the field is an error; type is not consulted."""
        with self.assertRaises(Die):
            Manifest.from_json_obj(_manifest(42))
 if __name__ == "__main__":
    unittest.main()
@@ -0,0 +1,39 @@
 """Unit: bottle 'runtime' field is no longer supported (PRD 0003).
 gVisor is now auto-detected by the Docker factory. A manifest carrying
 the legacy 'runtime' field must fail, regardless of value, rather than
 silently ignoring."""
 import unittest
 from claude_bottle.log import Die
 from claude_bottle.manifest import Bottle, Manifest
 def _manifest_with_runtime(value: object) -> dict:
    return {
        "bottles": {"dev": {"runtime": value}},
        "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}},
    }
 class TestManifestRuntimeRemoved(unittest.TestCase):
    def test_loads_when_runtime_absent(self):
        m = Manifest.from_json_obj({
            "bottles": {"dev": {}},
            "agents": {"demo": {"skills": [], "prompt": "", "bottle": "dev"}},
        })
        self.assertIn("dev", m.bottles)
    def test_bottle_dataclass_has_no_runtime_attribute(self):
        self.assertFalse(hasattr(Bottle(), "runtime"))
    def test_any_runtime_value_is_rejected(self):
        for value in ("runsc", "runc", "kata-runtime", "", 42, None):
            with self.subTest(value=value):
                with self.assertRaises(Die):
                    Manifest.from_json_obj(_manifest_with_runtime(value))
 if __name__ == "__main__":
    unittest.main()