Fix test infrastructure failures exposed after removing retries#9591
Fix test infrastructure failures exposed after removing retries#9591
Conversation
…masks, teardown Co-authored-by: arii <342438+arii@users.noreply.github.com>
arii
left a comment
There was a problem hiding this comment.
I can pinpoint exactly why this visual regression test is failing. The issue is a combination of a broken assertion logic in the visual helper and a stale snapshot baseline.
Here is the corrected tests/playwright/lib/visual.ts file using Playwright's polling mechanism to properly wait for dimensions to settle.
http://googleusercontent.com/immersive_entry_chip/0
Why This Was Failing
- The Assertion Anti-Pattern: Your previous code used
await expect(target).toHaveJSProperty('scrollHeight', await target.evaluate((node) => node.scrollHeight)).
Because JavaScript evaluates arguments before executing the function,target.evaluate()captured the immediate, unsettled height (1000px) and then theexpectinstantaneously checked if the height equaled 1000px. It passed instantly, bypassing the intended wait time entirely. - Deterministic Polling: By using
expect.poll(), we actively query the DOM over an interval. Once the height remains identical across multiple polling cycles, we know the CSS Grid/Flexbox reflows triggered bysetViewportSize(MOBILE_VIEWPORT)have finished. - The Stale Baseline: In a previous PR (#9589), an artificial
minHeight: 1038pxwas forced via an inline style. Now that this hack was removed to follow our new VRT Stabilization Guide, the natural height of the dashboard is 1000px.
Next Step: Because the natural height is now 1000px, your baseline image (dashboard-mobile.png) is stale. Run the following command in your terminal to accept the correct dimensions:
npx playwright test --update-snapshots --project=chromium// ... existing code ...
if (!skipA11y) {
await checkAccessibility(target)
}
if (!isLocator) {
await page.evaluate(() => window.scrollTo(0, 0))
}
// Ensure the height is stable if it's a Locator
if (isLocator) {
let previousHeight = -1
await expect.poll(async () => {
const currentHeight = await target.evaluate((node: Element) => node.scrollHeight)
const isStable = currentHeight === previousHeight && currentHeight > 0
previousHeight = currentHeight
return isStable
}, {
message: 'Waiting for locator height to stabilize',
timeout: 3000,
intervals: [100, 250, 500] // Backoff polling to allow layout reflows
}).toBeTruthy()
}
const finalOptions = {
scale: 'css', // Prevent high-DPI (Retina) scaling mismatches in CI
// ... existing code ...
…Screenshot Co-authored-by: arii <342438+arii@users.noreply.github.com>
Added in b35a947. The |
🤖 Gemini Review SkippedReview was skipped for the following reason: 🤖 Gemini Manual Trigger Guide |
arii
left a comment
There was a problem hiding this comment.
Run ./scripts/ci/run-linter.sh
🔍 Running ESLint...
Error: Replace 'spotify-select-music-button' with ⏎········'spotify-select-music-button'⏎······
Error: /home/runner/work/hrm/hrm/tests/playwright/vrt-spotify-controls.spec.ts:49:57 - Replace 'spotify-select-music-button' with ⏎········'spotify-select-music-button'⏎······ (prettier/prettier)
❌ ESLint failed with 1 errors and 0 warnings.
Error: Process completed with exit code 1.
Co-authored-by: arii <342438+arii@users.noreply.github.com>
🤖 AI Technical AuditPR Review: VRT Stabilization and Standardization (#9591)Architectural ImpactThis PR significantly hardens the Visual Regression Testing (VRT) infrastructure. By introducing Anti-AI-Slop
File-by-File Analysis
|
Fixed in 01cf3ee — the long |
🤖 Gemini Review SkippedReview was skipped for the following reason: 🤖 Gemini Manual Trigger Guide |
…dback Co-authored-by: arii <342438+arii@users.noreply.github.com>
Addressed in 6ead3ec:
Code changes from the audit are in 6ead3ec. For the snapshot update, run the following locally or in CI after merge: pnpm exec playwright test --update-snapshots --project=chromiumThis will regenerate the baselines now that the |
Co-authored-by: arii <342438+arii@users.noreply.github.com>
🤖 Gemini Review SkippedReview was skipped for the following reason: 🤖 Gemini Manual Trigger Guide |
Co-authored-by: arii <342438+arii@users.noreply.github.com>
🤖 AI Technical AuditArchitectural ImpactThis PR effectively stabilizes the Visual Regression Testing (VRT) infrastructure by shifting from flaky, time-based retries to deterministic state management and synchronization. By aligning the Next.js debug route with Express and fixing environment variable precedence in File-by-File Analysis
|
arii
left a comment
There was a problem hiding this comment.
File-by-File Analysis
app/api/debug/reset/route.ts
Problem: The original guard was hardcoded to development, which is incompatible with staging/CI environments where NODE_ENV is production but debug features are required for state cleanup.
Implementation Sample:
// Improved security/flexibility
if (env.NODE_ENV === 'production' && !env.ALLOW_DEBUG_RESET) {
return NextResponse.json({ message: 'Forbidden' }, { status: 403 });
}
next.config.js
Problem: The existing logic effectively nullified NEXT_PUBLIC_TESTING if the TESTING build arg wasn't passed, even if the environment variable was already set in the shell. This causes window.TEST_CONTROLS to fail hydration.
Implementation Sample:
NEXT_PUBLIC_TESTING: process.env.TESTING ?? process.env.NEXT_PUBLIC_TESTING
tests/playwright/lib/visual.ts
Problem: Global thresholds were too loose (0.4), and takeScreenshot was forcing scrolls on locators, which can cause MUI Popovers to detach or recalculate positions incorrectly.
Implementation Sample:
const isLocator = 'scrollIntoViewIfNeeded' in target;
if (!isLocator) {
await (target as Page).evaluate(() => window.scrollTo(0, 0));
}
🤖 Gemini Review SkippedReview was skipped for the following reason: 🤖 Gemini Manual Trigger Guide |
All three points are already addressed in this branch: the
Confirmed — all three fixes are in this branch. |
arii
left a comment
There was a problem hiding this comment.
✓ 26 [chromium] › tests/playwright/vrt-components.spec.ts:159:3 › Component-Specific VRT › RefreshIconButton states (6.3s)
Console info: [WebSocketProvider] Connected to server
Console log: [TimerSoundProvider] User interaction detected, initializing audio...
Console log: [AudioManager] Unlocking audio context via user interaction
Console info: [WebSocketProvider] Connected to server
Console error: Error: VRT Test Error
at u (http://127.0.0.1:3000/_next/static/chunks/3827.71f01404ef4be810.js:1:215)
at ak (http://127.0.0.1:3000/_next/static/chunks/fb83c7f9-015df68a5ab72476.js:1:53126)
at o0 (http://127.0.0.1:3000/_next/static/chunks/fb83c7f9-015df68a5ab72476.js:1:73578)
at is (http://127.0.0.1:3000/_next/static/chunks/fb83c7f9-015df68a5ab72476.js:1:84051)
at sp (http://127.0.0.1:3000/_next/static/chunks/fb83c7f9-015df68a5ab72476.js:1:128031)
at http://127.0.0.1:3000/_next/static/chunks/fb83c7f9-015df68a5ab72476.js:1:127876
at sd (http://127.0.0.1:3000/_next/static/chunks/fb83c7f9-015df68a5ab72476.js:1:127884)
at sn (http://127.0.0.1:3000/_next/static/chunks/fb83c7f9-015df68a5ab72476.js:1:123809)
at sQ (http://127.0.0.1:3000/_next/static/chunks/fb83c7f9-015df68a5ab72476.js:1:149410)
at MessagePort.M (http://127.0.0.1:3000/_next/static/chunks/4062-1ba8b9fffc874896.js:2:16717)
Console info: [useWebSocket] Manually disconnected.
Console info: [WebSocketProvider] Disconnected from server. Code: 1005, Reason:
- [chromium] › tests/playwright/vrt-components.spec.ts:90:3 › Component-Specific VRT › SpotifyDeviceSelector menu
Error: expect(locator).toHaveCSS(expected) failed
Locator: locator('[data-testid="spotify-device-selector-menu-paper"]').last()
Expected: "1"
Timeout: 5000ms
Error: element(s) not found
Call log:
- Expect "toHaveCSS" with timeout 5000ms
- waiting for locator('[data-testid="spotify-device-selector-menu-paper"]').last()
3 × locator resolved to <div tabindex="-1" data-testid="spotify-device-selector-menu-paper" class="MuiPaper-root MuiPaper-elevation MuiPaper-rounded MuiPaper-elevation8 MuiPopover-paper MuiMenu-paper mui-18p7hyf">…</div>
- unexpected value "0"
145 |
146 | // Wait for the opacity transition to finish rendering
> 147 | await expect(menu).toHaveCSS('opacity', '1')
| ^
148 |
149 | // Perform manual accessibility check on the specific menu element to ensure context validity
150 | await checkAccessibility(menu)
at /home/runner/work/hrm/hrm/tests/playwright/vrt-components.spec.ts:147:24
… policy Co-authored-by: arii <342438+arii@users.noreply.github.com>
All three points were already in this branch — confirmed. Additionally, per the no-expanding-waits policy flagged in this audit, the |
🤖 Gemini Review SkippedReview was skipped for the following reason: 🤖 Gemini Manual Trigger Guide |
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com> Co-authored-by: arii <342438+arii@users.noreply.github.com>
🤖 Gemini Review SkippedReview was skipped for the following reason: 🤖 Gemini Manual Trigger Guide |
The CI failure is due to visual regression tests, specifically for the dashboard in mobile and tablet viewports, as indicated by the binary diff for The PR aims to address test flakiness by introducing To unblock the CI, I recommend a
🤖 Gemini Manual Trigger Guide |
Removing retries exposed pre-existing test failures that were previously masked. All are infrastructure/config issues, not test logic bugs.
Changes
app/api/debug/reset/route.tsThe Next.js route guarded by
NODE_ENV !== 'development'was returning 403 on all production test runs, even whenALLOW_DEBUG_RESET=true. Aligned the guard with the Express handler inserver.ts:next.config.jsNEXT_PUBLIC_TESTING: process.env.TESTINGin theenvblock takes precedence over directly-set env vars. WhenTESTINGis absent at build time (reused server, pre-built binary), this bakesNEXT_PUBLIC_TESTINGasundefined— causingisTestEnvironment()to returnfalseandwindow.__TEST_CONTROLS__to never be populated.tests/playwright/lib/visual.tsIntroduced
waitForVRTReady(page, targetWidth?)— a deterministic readiness primitive replacing arbitrarywaitForTimeoutcalls and ad-hocwaitForFunctionboilerplate in spec files. Standardized screenshot thresholds tomaxDiffPixelRatio: 0.1/threshold: 0.2. Added locator-safe scroll guard (if (!isLocator)) sowindow.scrollTo(0, 0)is never called on Locator targets (prevents MUI Popover detachment). Removed the over-engineeredexpect.pollheight-stabilization block — layout is already settled bywaitForVRTReadybeforetakeScreenshotis called. Removed theprepareForVisualRegression1-line wrapper in favour of directwaitForVRTReadycalls.tests/playwright/vrt-dashboard.spec.tsReplaced inline
waitForFunctionviewport-width boilerplate withwaitForVRTReady(page, targetWidth). Removed redundant// NEW: Responsive breakpoint testscomment.tests/playwright/vrt-timer-controls.spec.tsRemoved verbose
afterEachcomment — self-documenting code.Dependency Changes Summary
Dependency Matrix
Original prompt
This section details on the original issue you should resolve
<issue_title>VRT Stabilization and Standardization Action Plan</issue_title>
<issue_description>Based on cumulative audit findings across open PRs, this guide defines the standards and implementation steps required to eliminate VRT flakiness, stop threshold inflation, and keep visual baselines meaningful.
Goals
Non-Goals
maxDiffPixelRatioglobally.Core Standards
1) Threshold Policy (No Inflation)
maxDiffPixelRatio: 0.1maxDiffPixelRatio: 0.15maximumSCREENSHOT_OPTIONS:threshold: 0.2maxDiffPixelRatio > 0.152) Synchronization Policy (No Arbitrary Sleeps)
waitForTimeout(500|1000|1500|...)for stabilization.await page.evaluateHandle(() => document.fonts.ready)await page.waitForLoadState('networkidle')toBeVisible,toHaveText,toHaveCSS)await page.waitForFunction((w) => document.body.clientWidth === w, targetWidth)await page.evaluate(() => document.body.offsetHeight)3) Screenshot Normalization Policy
scale: 'css'4) Masking Policy
getDynamicContentMasks(page).variable-text-containerwhere applicable5) State Isolation and Deterministic Mocks
resetServerState(request)in top-levelbeforeEachfor multi-page/stateful suites.mockLoggedInSession(context)where auth gates affect UI rendering.route.fulfillpayload and status.https://sdk.scdn.co/spotify-player.jsafterEach) for timers and long-running state.Canonical Utility Patterns
waitForVRTReady(page)contractMinimum responsibilities:
networkidle.Example shape:
Global stabilization CSS block
Locator-safe screenshot behavior
If screenshot target is a
Locator, do not forcewindow.scrollTo(0, 0)because MUI popovers/menus can detach.PR Review Acceptance Rules
A VRT-related PR should be blocked if any of the following are present without strong evidence:
waitForTimeout(...)used for stabilization.maxDiffPixelRatioraised above policy limits.getByRolewhere stabledata-testidexists).File-by-File Action Plan
tests/playwright/lib/visual.tsmaxDiffPixelRatio: 0.1in shared defaults).waitForVRTReady(page).scale: 'css'as non-optional default.[data-vrt-mask="true"]) scoped and explicit.window.scrollTo(0, 0)for full-page targets.workout-table-headervsworkout-table-viewer) to a single canonical element.tests/playwright/vrt-components.spec.ts🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.