fix(detection): resolve Kubernetes compare regressions #566

Merged
mstykow merged 1 commit into main from fix/kubernetes-compare-followups
Apr 4, 2026

mstykow (Owner) commented Apr 4, 2026

Summary

  • fix the real regressions exposed by the kubernetes/kubernetes compare-outputs --profile common run, including embedded-PEM source scanning, local license-reference cleanup, Conda false positives on generic *env*.yaml, and holder extraction for plain year-range copyright lines
  • add focused regression coverage for the Kubernetes-driven fixes, rerun the affected copyright golden, and document the remaining deltas (cases where Provenant is better, or where the difference is not a regression) as accepted outcomes
  • add docs/BENCHMARKS.md as the canonical package-detection verification record, and simplify the package-detection scorecard and docs so that they point there for detailed compare results

Scope and exclusions

  • Included:
    • scanner text-detection fixes for source files that embed PEM blocks
    • post-processing cleanup for local LICENSE references that should not retain unknown-license-reference noise
    • Conda environment parsing guardrails to avoid package false positives on generic env-style YAML fixtures
    • copyright holder extraction improvements for plain year-range lines
    • package-detection benchmark documentation and scorecard simplification
  • Explicit exclusions:
    • deleting docs/implementation-plans/package-detection/PARSER_VERIFICATION_SCORECARD.md
    • adding new benchmark targets beyond the currently verified Boost and Kubernetes examples
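
To illustrate the kind of Conda guardrail listed above, here is a minimal sketch in Python. It is not Provenant's implementation; the function name `looks_like_conda_environment` and the exact acceptance rule (a top-level `dependencies` list of spec strings or a single `pip` sub-list) are assumptions chosen for illustration. The idea is that a filename matching *env*.yaml is not enough on its own, so the parsed content is checked before any packages are reported.

```python
from typing import Any


def looks_like_conda_environment(doc: Any) -> bool:
    """Hypothetical guard: True only if parsed YAML resembles a Conda environment file."""
    # A Conda environment file is a mapping at the top level.
    if not isinstance(doc, dict):
        return False
    # Assumption: a real environment file carries a "dependencies" list.
    deps = doc.get("dependencies")
    if not isinstance(deps, list):
        return False
    for entry in deps:
        # Plain spec strings such as "python=3.11" are accepted.
        if isinstance(entry, str):
            continue
        # A single-key mapping {"pip": [...]} is the nested pip section.
        if isinstance(entry, dict) and set(entry) == {"pip"} and isinstance(entry["pip"], list):
            continue
        # Anything else suggests a generic env-style YAML document.
        return False
    return True


# Example inputs, already parsed from YAML into Python objects.
conda_doc = {
    "name": "demo",
    "channels": ["conda-forge"],
    "dependencies": ["python=3.11", {"pip": ["requests"]}],
}
generic_doc = {"env": [{"name": "LOG_LEVEL", "value": "debug"}]}
```

With this rule, `conda_doc` is accepted and `generic_doc` is rejected, so a generic fixture that merely has "env" in its filename would produce no package data.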

Intentional differences from Python

  • keep Provenant's broader Dockerfile and go.work package coverage where it is more correct than ScanCode
  • keep cleaner license-expression, URL, and metadata outcomes where the remaining differences are accepted non-regressions rather than parity losses

Expected-output fixture changes

  • Files changed: testdata/copyright-golden/copyrights/misco2/ocaml.txt.yml
  • Why the new expected output is correct: the improved year-range holder extraction now keeps the holder attached to Copyright 2013-2020 by OCamlPro, so the golden should record both the fuller copyright line and the OCamlPro holder instead of underreporting the result
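
The year-range case can be sketched as follows. This is a hedged Python illustration, not the detector's actual grammar: the pattern `YEAR_RANGE_LINE` and the helper `extract_holder` are hypothetical names, and the sketch only handles the simple "Copyright YYYY-YYYY [by] Holder" shape discussed here.

```python
import re
from typing import Optional

# Hypothetical pattern: "Copyright", an optional "(c)", a plain year range,
# an optional "by", and then the holder text up to the end of the line.
YEAR_RANGE_LINE = re.compile(
    r"""^\s*Copyright\s+(?:\(c\)\s+)?
        (?P<years>\d{4}\s*-\s*\d{4})\s*,?\s+
        (?:by\s+)?
        (?P<holder>\S.*?)\s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def extract_holder(line: str) -> Optional[str]:
    """Return the holder from a plain year-range copyright line, or None."""
    match = YEAR_RANGE_LINE.match(line)
    if match is None:
        return None
    return match.group("holder")


holder = extract_holder("Copyright 2013-2020 by OCamlPro")
```

Here `holder` is "OCamlPro", which matches the holder the updated golden file is expected to record. A line with a year range but no trailing name yields None rather than an empty holder.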

mstykow merged commit 4ab5b1e into main on Apr 4, 2026
13 checks passed
mstykow deleted the fix/kubernetes-compare-followups branch on April 4, 2026 18:40