Adopt ADR 0004: stop chasing a single global coverage number and
measure what matters instead.
- Omit the genuinely interactive `cli/init.py` shell (its read_tty_line
prompt loops) alongside the already-omitted `cli/tui.py`, with a
rationale comment in .coveragerc. Subprocess/backend orchestration is
NOT omitted: it stays visible and is scored via the integration suite.
- scripts/coverage.sh runs unit + integration under one coverage
measurement (the policy's yardstick) and can also report coverage for
the critical security/logic core, which is held to the >=90% target.
- scripts/diff_coverage.py is a stdlib-only gate (no diff-cover dep):
new/changed executable lines must be >=90% covered. This is the
enforced regression guard; the global number is informational.
- CI gains a `coverage` job: combined report + the diff-coverage gate.
- Unit-test `cli/__init__.py` dispatch/exit-code mapping (it's logic,
not I/O, so it earns tests rather than an omit).
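For reviewers, the idea behind the diff-coverage gate can be sketched roughly as below. This is an illustration only, not the contents of scripts/diff_coverage.py: the function names `added_lines`, `diff_coverage`, and `gate` are hypothetical, the input is assumed to be `git diff` output, and the per-file `executable`/`covered` line sets are assumed to come from the combined coverage report (how they are extracted is not shown). Requires Python 3.9+.

```python
import re

# Hunk header, e.g. "@@ -10,3 +12,5 @@"; group 1 is the first new-file line number.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_lines(diff_text):
    """Map each file in `git diff` output to the set of line numbers it adds."""
    added = {}
    path = None
    new_lineno = None  # None while we are outside a hunk
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            path, new_lineno = None, None
        elif new_lineno is None and line.startswith("+++ "):
            target = line[4:].strip()
            # Deleted files point at /dev/null and contribute no new lines.
            path = None if target == "/dev/null" else target.removeprefix("b/")
        elif (match := HUNK_RE.match(line)):
            new_lineno = int(match.group(1))
        elif new_lineno is not None and path is not None:
            if line.startswith("+"):
                added.setdefault(path, set()).add(new_lineno)
                new_lineno += 1
            elif line.startswith(" ") or line == "":
                new_lineno += 1  # context line: present in the new file
            # "-" lines and "\ No newline" markers do not advance numbering
    return added


def diff_coverage(added, executable, covered):
    """Return (hit, total) counted over changed lines that are executable."""
    hit = total = 0
    for path, lines in added.items():
        relevant = lines & executable.get(path, set())
        total += len(relevant)
        hit += len(relevant & covered.get(path, set()))
    return hit, total


def gate(hit, total, threshold=0.90):
    """Pass when nothing executable changed or the covered ratio meets the threshold."""
    return total == 0 or hit / total >= threshold
```

Restricting the ratio to lines that are both changed and executable keeps comments, blank lines, and untouched legacy code from moving the result in either direction.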
Combined unit+integration coverage now reports 83% global / 87% across
the critical modules; per-module ratcheting toward 90% is the ongoing
work this policy frames.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9
Replace the hand-maintained INTEGRATION_NAMES classifier (and the
bespoke run_tests.py around it) with a directory-driven split:
  tests/unit/         unit tests, always run
  tests/integration/  Docker-dependent, skip cleanly without Docker
  tests/canaries/     upstream-regression checks, opt-in via
                      CLAUDE_BOTTLE_RUN_CANARIES=1
The pinned-pipelock-image check moves to the canary suite — it tests
upstream packaging, not our code, so it shouldn't gate every dev push.
A scheduled canaries.yml workflow runs it weekly.
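One way the opt-in and skip behavior could be expressed with plain unittest is sketched below. This is an assumption-laden illustration rather than the code in this change: the helper names are hypothetical, and checking for a `docker` executable on PATH is only one possible way to detect Docker.

```python
import os
import shutil
import unittest


def canaries_enabled(environ=None):
    """True only when CLAUDE_BOTTLE_RUN_CANARIES is set to exactly "1"."""
    environ = os.environ if environ is None else environ
    return environ.get("CLAUDE_BOTTLE_RUN_CANARIES") == "1"


def docker_available():
    # Assumption for this sketch: a `docker` binary on PATH means Docker is usable.
    return shutil.which("docker") is not None


# Decorators are evaluated once at import time, so the suite reports
# skips (not failures) when the precondition is missing.
requires_canaries = unittest.skipUnless(
    canaries_enabled(), "set CLAUDE_BOTTLE_RUN_CANARIES=1 to run canaries"
)
requires_docker = unittest.skipUnless(docker_available(), "Docker not available")


@requires_canaries
@requires_docker
class PinnedImageCanary(unittest.TestCase):
    """Placeholder canary; the real check inspects the pinned upstream image."""

    def test_placeholder(self):
        self.assertTrue(True)
```

Skipping at the class level keeps a default `python -m unittest` run green on machines without Docker while still listing the canaries as skipped.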
The manifest-runtime tests collapse the four assertRaises cases for
distinct 'runtime' values into one subTest loop and drop the
assertions on error-message wording; the contract is "any value is
rejected", not "the error literally contains 'auto-detect'".
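The shape of the collapsed test is roughly as follows. This is a self-contained sketch: `ManifestError`, `load_manifest`, and the four sample values are hypothetical stand-ins, not the project's real names or the values used in the actual tests.

```python
import unittest


class ManifestError(ValueError):
    """Hypothetical error type used only in this sketch."""


def load_manifest(data):
    # Hypothetical validator: any 'runtime' key is rejected, whatever its value.
    if "runtime" in data:
        raise ManifestError("'runtime' is not a supported manifest key")
    return data


class RuntimeKeyRejected(unittest.TestCase):
    def test_any_runtime_value_is_rejected(self):
        # Placeholder values; the point is that the value does not matter.
        for value in ("docker", "podman", "auto", ""):
            with self.subTest(runtime=value):
                # Assert only on the exception type, not on message wording.
                with self.assertRaises(ManifestError):
                    load_manifest({"runtime": value})
```

With subTest, a failure for one value is reported with its `runtime=` label and the loop continues, so the single test still pinpoints which value regressed.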
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>