Print runner stderr in Windows E2E test script#17789
larryliu0820 wants to merge 3 commits into main
The Voxtral CUDA Windows E2E test is failing with exit code -1073740791 and no useful diagnostics. The runner's stderr was being captured to a file but silently deleted without printing. Surface it in CI output so we can diagnose the crash. This PR was authored with the assistance of Claude.
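The change described above amounts to "read the captured stderr file and echo it before removing it." The actual test script is not shown in this conversation, so the following is only a minimal Python sketch of that pattern; the helper name `run_and_surface_stderr` and the banner text are hypothetical, not taken from the PR.

```python
import os
import subprocess
import sys
import tempfile


def run_and_surface_stderr(cmd):
    """Run cmd with stderr captured to a temp file, echo it, then delete the file.

    Hypothetical helper for illustration; returns (exit_code, stderr_text).
    """
    fd, err_path = tempfile.mkstemp(suffix=".stderr.txt")
    os.close(fd)
    try:
        # Capture stderr to a file, as the original script did.
        with open(err_path, "wb") as err_file:
            proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=err_file)

        # Read the capture back BEFORE deleting it, so a crash leaves a trace.
        with open(err_path, "r", encoding="utf-8", errors="replace") as err_file:
            stderr_text = err_file.read()

        if stderr_text:
            print("--- runner stderr ---", file=sys.stderr)
            print(stderr_text, file=sys.stderr)

        return proc.returncode, stderr_text
    finally:
        # Clean up only after the contents have been surfaced.
        os.remove(err_path)
```

The read happens inside the `try` so the capture file is removed even if decoding or printing fails, while the contents still reach the CI log in the normal case.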
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17789
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 1 Cancelled Job, 3 Unrelated Failures
As of commit 4eb9475 with merge base 389ea94:
NEW FAILURE - The following job has failed:
CANCELLED JOB - The following job was cancelled. Please retry:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
18db1cb to 1562d4f
The IRunner refactoring in #17741 split generate() into separate prefill() and decode_from_token() calls. On Windows, calling any sub-method from generate() triggers STATUS_STACK_BUFFER_OVERRUN (0xC0000409). This appears to be a Windows-specific issue with the function call pattern (confirmed by SSH debugging: inlining the prefill loop works, but calling it as a method crashes even with an 8 MB stack).

Fix by restoring the monolithic generate(vector, ...) implementation that keeps all prefill and decode logic inline, matching the pre-#17741 pattern that works on Windows. The separate prefill() and decode_from_token() methods are retained for external callers and the prefill-then-generate workflow.

Also:
- Pass std::function params to decode_from_token by const ref
- Increase voxtral_runner stack to 8 MB on Windows as a safety net
- Print runner stderr in the Windows E2E test for diagnostics

This PR was authored with the assistance of Claude.
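For readers connecting the two numbers in this thread: the exit code -1073740791 from the PR description and the status 0xC0000409 named here are the same 32-bit value, read as signed versus unsigned. A minimal Python sketch of that conversion follows; the helper names are hypothetical and are not part of the test script.

```python
# NTSTATUS value named in the commit message above.
STATUS_STACK_BUFFER_OVERRUN = 0xC0000409


def to_ntstatus(exit_code: int) -> int:
    """Reinterpret a signed 32-bit process exit code as an unsigned value."""
    return exit_code & 0xFFFFFFFF


def format_exit_code(exit_code: int) -> str:
    """Render an exit code as an 8-digit hex string for easier lookup."""
    return f"0x{to_ntstatus(exit_code):08X}"


if __name__ == "__main__":
    code = -1073740791  # exit code reported by the failing E2E test
    print(format_exit_code(code))
    print(to_ntstatus(code) == STATUS_STACK_BUFFER_OVERRUN)
```

Logging the hex form next to the raw exit code in CI makes it easier to match a failure against the documented NTSTATUS values.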
1562d4f to d734f75