[ET-VK][ez] Use tree reduction in q8ta_linear_gemv shader #17792
meta-codesync[bot] merged 1 commit into gh/SS-JIA/453/base
Replace the serial O(WGS) reduction loop with a tree reduction that takes O(log2(WGS)) steps. Previously, only thread 0 summed all 64 partial accumulators sequentially. Now the threads of the workgroup cooperate in a classic halving reduction, matching the pattern already used in linear_q4gsw_coop.glsl.

Authored by Claude.

Differential Revision: [D94949137](https://our.internmc.facebook.com/intern/diff/D94949137/)
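For readers unfamiliar with the pattern, here is a minimal CPU-side sketch of the two reduction strategies in Python. It is not the shader code from this PR: the function names `serial_reduce` and `tree_reduce` are hypothetical, the list `shared` merely stands in for workgroup shared memory, and the sketch assumes the workgroup size is a power of two (64 here, as in the description).

```python
def serial_reduce(partials):
    """Old pattern: a single thread (thread 0) adds every partial in turn."""
    total = 0
    steps = 0
    for value in partials:
        total += value
        steps += 1  # one sequential add per partial: O(WGS) steps
    return total, steps


def tree_reduce(partials):
    """Halving reduction. Assumes len(partials) is a power of two."""
    shared = list(partials)  # hypothetical stand-in for shared memory
    steps = 0
    stride = len(shared) // 2
    while stride > 0:
        # On a GPU, each thread with tid < stride performs this add in
        # parallel, and a barrier separates one step from the next.
        # Here the inner loop simulates those parallel threads.
        for tid in range(stride):
            shared[tid] += shared[tid + stride]
        steps += 1  # one parallel step per halving: O(log2(WGS)) steps
        stride //= 2
    return shared[0], steps


WGS = 64
partials = list(range(WGS))  # integer stand-ins for the partial accumulators

print(serial_reduce(partials))  # (2016, 64)
print(tree_reduce(partials))    # (2016, 6)
```

Both strategies produce the same sum for these integer inputs, but the tree version needs 6 parallel steps instead of 64 sequential ones. With floating-point accumulators the two orderings can differ slightly in rounding, since the additions are grouped differently.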
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17792
Note: Links to docs will display an error until the docs builds have been completed.

❌ 15 New Failures, 1 Cancelled Job, 2 Unrelated Failures
As of commit a1e2aa2 with merge base ae41854.
BROKEN TRUNK: some jobs failed but were also failing on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Merged commit ad8ff12 into gh/SS-JIA/453/base
ghstack-source-id: 346524552
Pull Request resolved: #17792