Print runner stderr in Windows E2E test script#17789

Open
larryliu0820 wants to merge 3 commits into main from fix-voxtral-ci-print-stderr
Conversation

@larryliu0820
Contributor

The Voxtral CUDA Windows E2E test is failing with exit code -1073740791 and no useful diagnostics. The runner's stderr was being captured to a file but silently deleted without printing. Surface it in CI output so we can diagnose the crash.
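The test script itself is not shown in this conversation, so the following is only a language-neutral sketch of the "print before delete" idea, written in C++17 to match the runner code discussed here. The helper name `dump_and_remove` and the banner text are assumptions made for illustration, not names from the PR.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <system_error>

// Hypothetical helper: echo a captured stderr file into `out`, then delete it.
// Returns the text that was echoed (empty if the file is missing or empty).
std::string dump_and_remove(const std::filesystem::path& stderr_file,
                            std::ostream& out) {
  std::string text;
  {
    std::ifstream in(stderr_file, std::ios::binary);
    if (in) {
      std::ostringstream buf;
      buf << in.rdbuf();
      text = buf.str();
    }
  }  // The stream is closed here; Windows will not delete a file that is open.

  if (!text.empty()) {
    out << "--- runner stderr ---\n" << text << "\n--- end runner stderr ---\n";
  }

  std::error_code ec;
  std::filesystem::remove(stderr_file, ec);  // Best-effort cleanup, never throws.
  return text;
}
```

Calling `dump_and_remove(path, std::cerr)` after the runner exits keeps the cleanup step while making sure the captured text reaches the CI log first.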

This PR was authored with the assistance of Claude.
@pytorch-bot

pytorch-bot bot commented Mar 2, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17789

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Cancelled Job, 3 Unrelated Failures

As of commit 4eb9475 with merge base 389ea94 (image):

NEW FAILURE - The following job has failed:

CANCELLED JOB - The following job was cancelled. Please retry:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Mar 2, 2026
@github-actions

github-actions bot commented Mar 2, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

The IRunner refactoring in #17741 split generate() into separate
prefill() and decode_from_token() calls. On Windows, calling any
sub-method from generate() triggers STATUS_STACK_BUFFER_OVERRUN
(0xC0000409, i.e. exit code -1073740791 as a signed 32-bit value).
This appears to be a Windows-specific issue with the function call
pattern. SSH debugging confirmed that inlining the prefill loop works,
but calling it as a method crashes even with an 8 MB stack.
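The negative exit code reported by CI and the NTSTATUS value named above are the same 32-bit pattern read as signed versus unsigned. A minimal sketch of that conversion follows. The function name `to_ntstatus_hex` is made up for illustration and is not part of the runner or the test script.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Compile-time check: the signed exit code seen in CI and the NTSTATUS
// value STATUS_STACK_BUFFER_OVERRUN share one bit pattern.
static_assert(static_cast<uint32_t>(int32_t{-1073740791}) == 0xC0000409u,
              "exit code should map to 0xC0000409");

// Hypothetical helper: format a signed process exit code as the
// unsigned hexadecimal form normally used for NTSTATUS values.
std::string to_ntstatus_hex(int32_t exit_code) {
  // Conversion to an unsigned type is defined modulo 2^32.
  const unsigned int status = static_cast<uint32_t>(exit_code);
  char buf[11];  // "0x" + 8 hex digits + terminating NUL
  std::snprintf(buf, sizeof(buf), "0x%08X", status);
  return std::string(buf);
}
```

Logging the hex form next to the raw exit code makes a crash status like this one easier to recognize in CI output.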

Fix by restoring the monolithic generate(vector, ...) implementation
that keeps all prefill and decode logic inline, matching the pre-#17741
pattern that works on Windows. The separate prefill() and
decode_from_token() methods are retained for external callers and the
prefill-then-generate workflow.

Also:
- Pass std::function params to decode_from_token by const ref
- Increase voxtral_runner stack to 8 MB on Windows as a safety net
- Print runner stderr in the Windows E2E test for diagnostics
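The first bullet can be illustrated with a small sketch. The real decode_from_token() signature is not shown in this conversation, so `decode_tokens` and its parameters below are hypothetical stand-ins. The point is only that a `const std::function<...>&` parameter avoids copying the callable, and any state it captured, on every call.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a decode loop that reports each token through
// a callback. Taking the std::function by const reference means the caller's
// callable is not copied when the function is invoked.
int decode_tokens(
    const std::vector<uint64_t>& tokens,
    const std::function<void(const std::string&)>& token_callback) {
  int emitted = 0;
  for (uint64_t token : tokens) {
    if (token_callback) {  // An empty std::function must not be called.
      token_callback(std::to_string(token));
      ++emitted;
    }
  }
  return emitted;
}
```

A lambda can still be passed directly, for example `decode_tokens(ids, [&](const std::string& s) { out += s; })`. A temporary `std::function` is created at the call site and bound to the const reference for the duration of the call.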

This PR was authored with the assistance of Claude.
@larryliu0820 larryliu0820 force-pushed the fix-voxtral-ci-print-stderr branch from 1562d4f to d734f75 Compare March 3, 2026 05:35