ci: benchmark comparisons always use main branch baseline #309

Copilot · 2025-10-28T07:48:08Z

Benchmark comparisons were using PR-specific cache as the baseline, masking cumulative performance degradation across multiple commits. A PR with commit 1 degrading by 5% and commit 2 degrading by another 5% would show as two separate 5% regressions instead of cumulative 10% degradation from main.

Changes

Cache Strategy

Before: Restore keys prioritized PR-specific cache, then fell back to main
After: Only restore from main branch baseline
All PR commits now compare against the same main baseline

Override Benchmark Threshold

Added override-benchmark-threshold PR label support
When present: benchmark runs and comments but doesn't fail PR
Enables intentional performance trade-offs with full visibility
Label check includes error handling with safe defaults (fails on error)
Only applies to PRs; main branch always fails on regression

Configuration

restore-keys: |
  ${{ format('{0}-benchmark-main-', runner.os) }}

fail-on-alert: ${{ github.event_name != 'pull_request' || (steps.check-label.outputs.has_label || 'false') != 'true' }}

Documentation

Updated benchmark README with comparison strategy and label usage
Added micro-benchmark section to main README

Original prompt

The cache key for benchmarks should always compare to the last successful benchmark in the main branch. Otherwise, we might see failures when committing several commits to a branch that still succeed when compared to the merged code. We should also have a way to accept degraded performance in CI.

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

- Change benchmark cache restore keys to prioritize main branch - Remove fallback to PR/branch-specific cache for comparisons - Add support for accept-performance-degradation label to skip fail-on-alert - Update documentation to explain new behavior - Ensures cumulative performance regressions across multiple commits are detected Co-authored-by: LayZeeDK <6364586+LayZeeDK@users.noreply.github.com>

- Add error handling for gh pr view command failures - Default to fail-on-alert if label check fails - Use OR operator with default value in fail-on-alert expression - Addresses code review feedback Co-authored-by: LayZeeDK <6364586+LayZeeDK@users.noreply.github.com>

LayZeeDK · 2025-10-28T08:17:55Z

@copilot Change the PR label to override-benchmark-threshold.

- Rename accept-performance-degradation to override-benchmark-threshold - Update all references in workflow, README, and benchmark docs - More descriptive name for bypassing benchmark failure Co-authored-by: LayZeeDK <6364586+LayZeeDK@users.noreply.github.com>

Copilot · 2025-10-28T08:22:38Z

@copilot Change the PR label to override-benchmark-threshold.

Done in 739baef. Renamed the label from accept-performance-degradation to override-benchmark-threshold across all files (workflow, README, and benchmark documentation).

Initial plan

eeeedd4

Copilot AI assigned Copilot and LayZeeDK Oct 28, 2025

Copilot started work on behalf of LayZeeDK October 28, 2025 07:48 View session

Copilot AI and others added 2 commits October 28, 2025 07:57

Copilot AI changed the title ~~[WIP] Fix benchmark cache key for CI comparisons~~ fix(ci): benchmark comparisons always use main branch baseline Oct 28, 2025

Copilot finished work on behalf of LayZeeDK October 28, 2025 08:06

Copilot AI requested a review from LayZeeDK October 28, 2025 08:06

Copilot started work on behalf of LayZeeDK October 28, 2025 08:18 View session

Copilot finished work on behalf of LayZeeDK October 28, 2025 08:23

Merge branch 'main' into copilot/fix-benchmark-cache-key

c2ec671

LayZeeDK marked this pull request as ready for review October 28, 2025 20:46

LayZeeDK approved these changes Oct 28, 2025

View reviewed changes

LayZeeDK changed the title ~~fix(ci): benchmark comparisons always use main branch baseline~~ vi: benchmark comparisons always use main branch baseline Oct 28, 2025

LayZeeDK changed the title ~~vi: benchmark comparisons always use main branch baseline~~ ci: benchmark comparisons always use main branch baseline Oct 28, 2025

LayZeeDK merged commit 64e2ffb into main Oct 28, 2025
13 checks passed

LayZeeDK deleted the copilot/fix-benchmark-cache-key branch October 28, 2025 20:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!

ci: benchmark comparisons always use main branch baseline #309

ci: benchmark comparisons always use main branch baseline #309

Copilot AI commented Oct 28, 2025 •

edited

Loading

Uh oh!

LayZeeDK commented Oct 28, 2025

Uh oh!

Copilot AI commented Oct 28, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Uh oh!

ci: benchmark comparisons always use main branch baseline #309

ci: benchmark comparisons always use main branch baseline #309

Conversation

Copilot AI commented Oct 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Cache Strategy

Override Benchmark Threshold

Configuration

Documentation

Uh oh!

LayZeeDK commented Oct 28, 2025

Uh oh!

Copilot AI commented Oct 28, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Oct 28, 2025 •

edited

Loading