Risk-weighted coverage policy + diff-coverage gate (ADR 0004) #294

2026-06-25T21:29:27-04:00

didericis-claude commented

Stacked on #290 (base = cover-egress-addon-adapter).

Implements ADR 0004 — risk-weighted coverage: instead of chasing one global percentage (which pushes the most test effort onto the least safety-relevant code and invites performative line-chasing), measure what matters.

## What this does

- **Omit only genuinely-interactive shells.** `cli/init.py` (a `read_tty_line()` prompt loop) joins `cli/tui.py` in `.coveragerc`, with a rationale comment. **Subprocess/backend orchestration is deliberately NOT omitted**: that code tears down sandboxes and wires networks, so it stays visible and is scored via the integration suite instead.
- **Count integration coverage.** `scripts/coverage.sh` runs unit + integration under one measurement (the policy's yardstick) and can print the critical security/logic core (`--critical`).
- **Enforce a diff-coverage gate.** `scripts/diff_coverage.py` is stdlib-only (no `diff-cover` dependency, per the repo's low-dep rule): new/changed executable lines must be ≥90% covered. This is the enforced regression guard; the global number is informational.
- **CI** gains a `coverage` job: combined report + the diff gate (degrades gracefully when the runner lacks Docker: integration skips, the diff gate is Docker-independent).
- **Unit-test `cli/__init__` dispatch.** It's exit-code logic, not I/O, so it earns tests rather than an omit (9 tests).
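For readers unfamiliar with diff-coverage gates, here is a minimal stdlib-only sketch of the idea. It is not the contents of `scripts/diff_coverage.py`; the function names (`changed_lines`, `diff_coverage`, `passes_gate`) are hypothetical. It assumes the diff comes from `git diff` (so every file section starts with a `diff --git` line) and that executed/missing line sets are available per file, for example from the `executed_lines` and `missing_lines` keys of a coverage.py JSON report.

```python
import re

# Matches a unified-diff hunk header and captures the new-file start line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def changed_lines(diff_text):
    """Map each file path to the set of new-file line numbers added by the diff."""
    changed = {}
    path = None
    lineno = 0
    in_hunk = False
    for line in diff_text.splitlines():
        match = HUNK_RE.match(line)
        if match:
            lineno = int(match.group(1))
            in_hunk = True
        elif line.startswith("diff --git"):
            # A new file section begins; wait for its "+++" header.
            in_hunk = False
            path = None
        elif not in_hunk:
            if line.startswith("+++ "):
                target = line[4:].strip()
                if target == "/dev/null":
                    path = None  # file was deleted, nothing to cover
                else:
                    path = target[2:] if target.startswith("b/") else target
        elif line.startswith("+"):
            if path is not None:
                changed.setdefault(path, set()).add(lineno)
            lineno += 1
        elif line.startswith(" "):
            lineno += 1  # context line: advances the new-file counter only
        # Removed ("-") lines do not exist in the new file, so no advance.
    return changed


def diff_coverage(changed, executed, missing):
    """Return (covered, total) counted over changed lines that are executable.

    `executed` and `missing` map file path -> set of line numbers. Changed
    lines in neither set (blank lines, comments) are not executable and are
    ignored.
    """
    covered = total = 0
    for path, lines in changed.items():
        hit = lines & executed.get(path, set())
        miss = lines & missing.get(path, set())
        covered += len(hit)
        total += len(hit) + len(miss)
    return covered, total


def passes_gate(covered, total, threshold=90.0):
    """A diff with no executable changes passes trivially."""
    if total == 0:
        return True
    return 100.0 * covered / total >= threshold
```

Only changed lines that coverage considers executable count toward the denominator, which is why a docs-only or comment-only diff passes without any tests.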

## Numbers

| Measurement | Coverage |
|---|---|
| Unit only (old badge) | 79% |
| **Combined unit+integration (new yardstick)** | **83%** |
| Critical security/logic core (aggregate) | 87% |

Per-module ratcheting of the critical core toward 90% (egress_addon 76→, git_gate 80→, yaml_subset 83→, …) is the ongoing work this policy frames — see ADR 0004.
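To make the distinction between the aggregate figure and the per-module ratchet concrete, here is a small sketch. It assumes the aggregate is statement-weighted (covered statements over total statements across the critical modules, the way a coverage.py `TOTAL` row is computed); this PR does not spell out the exact formula. The function names and the module names in the usage note are hypothetical.

```python
def aggregate_percent(modules):
    """Statement-weighted coverage across modules.

    `modules` maps a module name to (covered_statements, total_statements).
    """
    covered = sum(c for c, _ in modules.values())
    total = sum(t for _, t in modules.values())
    return 100.0 * covered / total if total else 100.0


def below_target(modules, target=90.0):
    """Names of modules whose own coverage is still under the target."""
    return sorted(
        name
        for name, (c, t) in modules.items()
        if t and 100.0 * c / t < target
    )
```

With made-up counts such as `{"mod_a": (50, 100), "mod_b": (285, 300)}`, the larger module dominates the aggregate while `below_target` still flags `mod_a`, which is why the ratchet is tracked per module rather than on the aggregate alone.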

Full unit suite (1329 tests) passes; pyright clean; pylint 10.00 on new files.

didericis-claude added 1 commit 2026-06-25 21:29:29 -04:00

ci(coverage): risk-weighted coverage policy + diff-coverage gate

lint / lint (push) Successful in 1m52s

test / unit (pull_request) Successful in 46s

test / integration (pull_request) Successful in 16s

test / coverage (pull_request) Successful in 1m2s

632ab002ed

Adopt ADR 0004: stop chasing a single global coverage number and
measure what matters instead.

- Omit the genuinely-interactive `cli/init.py` shell (read_tty_line
  prompt loops) alongside the existing `cli/tui.py`, with a rationale
  comment in .coveragerc. Subprocess/backend orchestration is NOT
  omitted — it stays visible and is scored via the integration suite.
- scripts/coverage.sh runs unit + integration under one coverage
  measurement (the policy's yardstick) and can report the critical
  security/logic core held to the >=90% target.
- scripts/diff_coverage.py is a stdlib-only gate (no diff-cover dep):
  new/changed executable lines must be >=90% covered. This is the
  enforced regression guard; the global number is informational.
- CI gains a `coverage` job: combined report + the diff-coverage gate.
- Unit-test `cli/__init__.py` dispatch/exit-code mapping (it's logic,
  not I/O, so it earns tests rather than an omit).

Combined unit+integration coverage now reports 83% global / 87% across
the critical modules; per-module ratcheting toward 90% is the ongoing
work this policy frames.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NkwFXLFff9PYPy4wgVBJp9

didericis-claude referenced this pull request

2026-06-25 21:54:53 -04:00

Ratchet egress_addon coverage to >=90% (ADR 0004) #295

Checkout

From your project repository, check out a new branch and test the changes.

git fetch -u origin cover-global-90:cover-global-90

git checkout cover-global-90

Reference: didericis/bot-bottle#294