-
Notifications
You must be signed in to change notification settings - Fork 217
fix RVV index_max/min kernels returning wrong index #802
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi @Ka-zam! Oh! Wasn't aware that's a bug; thanks for submitting a fix! Um, are we perhaps missing a unit test there, or is the existing test borked? |
1e6670a to
4095edb
Compare
Hi! (edit) |
…mprovements, asin/acos optimization PR gnuradio#801 - NEON/NEONv8 kernel implementations: - Add NEON/NEONv8 implementations for many kernels with 2-20x speedups - Fix NEON asin/acos sqrt approximation for large values PR gnuradio#802 - RVV bug fix: - Fix RVV index_max/min kernels returning wrong index PR gnuradio#803 - Test output improvements: - Print tolerance, max_err, fail variables - Tablify test output for better readability PR gnuradio#804 - Optimize asin/acos kernels: - Use Sollya-generated polynomial with two-range algorithm - Improve accuracy from ~5.5e-4 to ~1.5e-6 (67x better) - Achieve ~20-27x speedup on x86, ~4.5x on ARM - Tighten test tolerance from 1e-2 to 1e-5 with edge cases - Add ARMv7 NEON sqrt intrinsic with Newton-Raphson iteration
|
Thanks for having a look at this! So, are we consistently failing here across implementations? size_t maxIndex = 0;
float maximum = -INFTY;
for (size_t index = 0; index < vec.size(); ++index){
if (vec[index] > maximum){
maximum = vec[index];
maxIndex = index;
}
}I assume this would be the simple and correct algorithm. Thus the result for This is the kind of bug and test behavior why I introduced googletest. I'd like to add tests like this there to encode expected behavior and make it possible to test against that. If you want to add tests for this case, please go ahead. Besides, we can merge this PR. |
Yes, your implementation is how I understand it should work. We are consistently failing across all implementations when injecting floats that are bit equivalent since none of the implementations keep track of which lane holds the minimum value. The reason RVV failed for this run was simply that the test generated identical input values spaced close enough. So, we fix and inject known problematic test patterns to test this? |
7d42ef6 to
87a19b7
Compare
|
Added edge case to index_* tests: // Index kernels need identical values to test tie-breaking (first index wins)
volk_test_params_t test_params_index(test_params.make_tol(0));
test_params_index.add_float_edge_cases({
1.0f,
1.0f,
1.0f,
1.0f, // 4 identical (SSE lane width)
1.0f,
1.0f,
1.0f,
1.0f, // 8 total (AVX lane width)
1.0f,
1.0f,
1.0f,
1.0f, // 12
1.0f,
1.0f,
1.0f,
1.0f, // 16 total (AVX512 lane width)
});
QA(VOLK_INIT_TEST(volk_32f_index_max_16u, test_params_index))
QA(VOLK_INIT_TEST(volk_32f_index_max_32u, test_params_index))
QA(VOLK_INIT_TEST(volk_32f_index_min_16u, test_params_index))
QA(VOLK_INIT_TEST(volk_32f_index_min_32u, test_params_index))Fixed kernels: |
jdemel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for noticing and fixing this bug. LGTM.
|
I linked #700 against this PR since it discusses the same root cause. Thanks again! |
|
Merging the other PR #801 created merge conflicts. That's very unfortunate. Can you rebase? |
87a19b7 to
2f7d264
Compare
Signed-off-by: Magnus Lundmark <magnuslundmark@gmail.com>
Signed-off-by: Magnus Lundmark <magnuslundmark@gmail.com>
2f7d264 to
0478a46
Compare
|
This can be merged cleanly now. |
index_??? should return first index meeting criteria.