feat(health): scale coverage deductions by uncovered fraction #314

Merged
RaghavChamadiya merged 1 commit into main from feat/health-coverage-scaled-severity on May 29, 2026

@RaghavChamadiya
Member

Adds a continuous coverage biomarker, coverage_gradient, that deducts health in direct proportion to a file's uncovered fraction: 4.0 × (1 − line_coverage_pct/100), capped. It applies only to files with known coverage and stays silent when no coverage was ingested (absent coverage is never imputed as uncovered).
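
For readers who want the arithmetic spelled out, here is a minimal sketch of the deduction formula. It is not the shipped detector: the function name `coverage_gradient_deduction` and both constants' names are hypothetical, and treating "capped" as the 2.0 magnitude of the test_coverage_gradient category cap is an assumption. Weighting is left out.

```python
from typing import Optional

GRADIENT_SCALE = 4.0  # multiplier quoted in the PR description
GRADIENT_CAP = 2.0    # ASSUMPTION: "capped" means the category cap magnitude (2.0)


def coverage_gradient_deduction(line_coverage_pct: Optional[float]) -> Optional[float]:
    """Return the unweighted deduction for one file, or None if coverage is unknown."""
    if line_coverage_pct is None:
        # Absent coverage is never imputed as uncovered: no finding at all.
        return None
    # Clamp defensively so a malformed report cannot produce a negative deduction.
    pct = min(max(line_coverage_pct, 0.0), 100.0)
    uncovered_fraction = 1.0 - pct / 100.0
    return min(GRADIENT_SCALE * uncovered_fraction, GRADIENT_CAP)


if __name__ == "__main__":
    for pct in (None, 100.0, 95.0, 90.0, 50.0, 0.0):
        print(pct, coverage_gradient_deduction(pct))
```

Under these assumptions a file at 90% coverage loses about 0.4 points, while anything at or below 50% coverage hits the cap.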

Why

The two existing coverage biomarkers fire only below hard thresholds (~40–60% line coverage). On well-tested codebases, where most files sit at 85–99%, the score was therefore effectively blind to coverage, even though the uncovered fraction still carries defect signal. The gradient fires across the whole 0–100% range and recovers the magnitude the binary gates discard.

How

A new optional deduction override on BiomarkerResult lets a finding carry a continuous magnitude that replaces the discrete severity → deduction table in score_file; the value is still weighted and category-capped, so per-finding health_impact stays linear and attributable. The gradient lives in its own capped category (test_coverage_gradient, −2.0) so the additive signal neither squeezes nor is squeezed by the binary gates.
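
The override and the separate category cap can be sketched as follows. This is an illustration under stated assumptions, not the project's code: only the names BiomarkerResult, score_file, `deduction`, and test_coverage_gradient (with its 2.0 cap magnitude) come from the PR. The severity table values, the `test_coverage` cap, the `low_coverage` finding name, the field layout, and the base score of 10.0 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# HYPOTHETICAL values for illustration; the real table is not shown in the PR.
SEVERITY_DEDUCTION = {"low": 0.5, "medium": 1.0, "high": 2.0}
# Cap magnitudes per category; only the gradient's 2.0 is taken from the PR.
CATEGORY_CAP = {"test_coverage": 3.0, "test_coverage_gradient": 2.0}


@dataclass
class BiomarkerResult:
    name: str
    category: str
    severity: str = "low"
    weight: float = 1.0
    # Continuous override; None means "fall back to the severity table".
    deduction: Optional[float] = None


def score_file(findings: List[BiomarkerResult], base: float = 10.0) -> float:
    """Sum weighted deductions per category, cap each category, subtract from base."""
    per_category: Dict[str, float] = {}
    for f in findings:
        raw = f.deduction if f.deduction is not None else SEVERITY_DEDUCTION[f.severity]
        per_category[f.category] = per_category.get(f.category, 0.0) + raw * f.weight
    total = sum(
        min(amount, CATEGORY_CAP.get(category, float("inf")))
        for category, amount in per_category.items()
    )
    return max(base - total, 0.0)


if __name__ == "__main__":
    gate = BiomarkerResult("low_coverage", "test_coverage", severity="high")
    gradient = BiomarkerResult(
        "coverage_gradient", "test_coverage_gradient", deduction=3.5
    )
    print(score_file([gate, gradient]))
```

Because each category is capped on its own, the oversized gradient finding above is clipped to 2.0 without reducing the room left for the binary gate, which is the "neither squeezes nor is squeezed" property the description refers to.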

Validation

Calibrated offline against the defect corpus, then validated by re-aggregating the cached corpus findings through the shipped detector + score_file. The re-aggregation gives +0.041 corpus AUC [95% CI +0.023, +0.059] on the covered subset, is Popt-neutral, and the delta is exactly zero on repos without ingested coverage, so the change is purely additive. Zero added walk cost (arithmetic on already-parsed coverage).

Snapshot, biomarker tests, and docs updated in the same PR (biomarker count 24 → 25). 219 health tests pass; ruff clean.

Add a continuous coverage biomarker, coverage_gradient, that deducts
health in direct proportion to a file's uncovered fraction
(4.0 * (1 - line_coverage_pct/100), capped) for files with known
coverage, and stays silent when no coverage was ingested (absent is
never imputed as uncovered).

The two existing coverage biomarkers only fire below hard thresholds
(~40-60% line coverage), so on well-tested codebases - where most files
sit at 85-99% - the score was effectively blind to coverage even though
the uncovered fraction still carries defect signal. The gradient fires
across the whole 0-100% range and recovers the magnitude the binary
gates discard.

Mechanism: a new optional `deduction` override on BiomarkerResult lets a
finding carry a continuous magnitude that replaces the discrete severity
-> deduction table in score_file; the value is still weighted and
category-capped, so per-finding health_impact stays linear and
attributable. The gradient lives in its own capped category
(test_coverage_gradient, -2.0) so the additive signal neither squeezes
nor is squeezed by the binary gates.

Calibrated offline against the defect corpus: +0.043 corpus AUC
[95% CI +0.023, +0.061] on the covered subset, Popt-neutral, exactly
zero on repos without ingested coverage. Validated by re-aggregating the
cached corpus findings through the shipped detector + score_file
(reproduced +0.041 [+0.023, +0.059]). Zero added walk cost.

Snapshot, biomarker tests, and docs updated in the same commit.
RaghavChamadiya requested a review from swati510 as a code owner on May 29, 2026 20:24
RaghavChamadiya merged commit b691c9b into main on May 29, 2026
5 checks passed
RaghavChamadiya deleted the feat/health-coverage-scaled-severity branch on May 29, 2026 20:26

2 participants