fix the issue of GEMM validation failure #378
base: amd-staging
cc @neon60 @j-stephan @adeljo-amd I wonder if this example overlaps with matrix multiplication from #375? If they are similar enough, we should probably just keep one.
I think these two examples use different kernels, so they should not be redundant.
zichguan-amd
left a comment
I'm OK with the change. The CPU and GPU errors should be in the same range, and we can definitely use double for more precision. I'll let others weigh in.
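To make the "use double for more precision" suggestion concrete, here is a minimal sketch of a host-side reference that accumulates in double and a check that uses a relative tolerance. It is not the code from this PR: the names reference_gemm and validate, the row-major layout, and the tolerance handling are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical host reference (not from the PR): C[m x n] = A[m x k] * B[k x n],
// row-major, with each dot product accumulated in double to limit rounding error.
inline std::vector<double> reference_gemm(const std::vector<float>& A,
                                          const std::vector<float>& B,
                                          std::size_t m, std::size_t k, std::size_t n)
{
    std::vector<double> C(m * n, 0.0);
    for (std::size_t i = 0; i < m; ++i)
    {
        for (std::size_t j = 0; j < n; ++j)
        {
            double acc = 0.0;
            for (std::size_t p = 0; p < k; ++p)
            {
                acc += static_cast<double>(A[i * k + p]) * static_cast<double>(B[p * n + j]);
            }
            C[i * n + j] = acc;
        }
    }
    return C;
}

// Hypothetical check (not from the PR): the allowed error scales with the
// magnitude of the reference value, so large sums are not held to the same
// absolute bound as small ones.
inline bool validate(const std::vector<float>& gpu_result,
                     const std::vector<double>& reference,
                     double rel_tol)
{
    if (gpu_result.size() != reference.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < reference.size(); ++i)
    {
        const double err   = std::fabs(static_cast<double>(gpu_result[i]) - reference[i]);
        const double scale = std::max(1.0, std::fabs(reference[i]));
        if (err > rel_tol * scale)
        {
            return false;
        }
    }
    return true;
}
```

In a sample like this one, the float result copied back from the device would be passed as gpu_result. A suitable rel_tol depends on the inner dimension and the input range, so it would need to be chosen for the sizes under test.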
constexpr float b_value = 0.02F;
std::fill(B.begin(), B.end(), b_value);
// Set matrix elements to random value on the host.
for (size_t i = 0; i < A.size(); ++i) A[i] = static_cast<float>(rand() / RAND_MAX );
This should be static_cast<double>(rand()) / RAND_MAX. As written, static_cast<float>(rand() / RAND_MAX ) evaluates to 0 almost every time because the division is performed on integers.
Thanks for the reminder. We can change it to "static_cast<float>(rand() / (RAND_MAX+1.0f) );", which generates a random float in [0,1).
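The integer-division issue discussed in this thread can be sketched as follows. The helper names truncated_ratio, unit_ratio, and fill_random are hypothetical and are not part of the PR. The sketch divides in double rather than float, which is a swapped-in variant of the expression proposed above. Note that once the quotient is narrowed to float, values very close to 1 may round up to exactly 1.0f on platforms where RAND_MAX exceeds float precision, so the code only relies on the closed range [0, 1].

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// The pattern from the diff: both operands are int, so the division truncates.
// The result is 0 for every r < RAND_MAX and 1 only when r == RAND_MAX.
inline float truncated_ratio(int r)
{
    return static_cast<float>(r / RAND_MAX);
}

// Hypothetical fix (not the PR's exact expression): convert to double before
// dividing so the quotient keeps its fractional part. After narrowing to float
// the value lies in [0, 1].
inline float unit_ratio(int r)
{
    return static_cast<float>(static_cast<double>(r) / (static_cast<double>(RAND_MAX) + 1.0));
}

// Hypothetical helper: fill a host-side matrix buffer with pseudo-random values.
inline void fill_random(std::vector<float>& matrix)
{
    for (std::size_t i = 0; i < matrix.size(); ++i)
    {
        matrix[i] = unit_ratio(std::rand());
    }
}
```

Calling fill_random on the A and B buffers would replace the two loops in the diff. std::rand is kept here only to mirror the sample; the generators in the C++ random header are the usual choice when reproducibility or distribution quality matters.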
std::fill(B.begin(), B.end(), b_value);
// Set matrix elements to random value on the host.
for (size_t i = 0; i < A.size(); ++i) A[i] = static_cast<float>(rand() / RAND_MAX );
for (size_t i = 0; i < B.size(); ++i) B[i] = static_cast<float>(rand() / RAND_MAX );
Same as above
I have updated it to "static_cast<float>(rand() / (RAND_MAX+1.0f) )" and verified that it generates random float values in [0,1).
#include <cstdlib>
#include <cassert>
#include <cstddef>
#include <memory>
Is this necessary?
I have removed them in the new commit.
Motivation
When running the GEMM sample on an R9700, validation fails if A_rows, A_cols, and B_cols are changed from their default values to 4096.
Technical Details
Test Plan
./hip_matrix_multiplication
Matrix multiplication: [2048x1024] * [1024x1024], block size: 16x16
Validation passed.
./hip_matrix_multiplication --A_rows 4096 --A_cols 4096 --B_cols 4096
Matrix multiplication: [4096x4096] * [4096x4096], block size: 16x16
Validation passed.
./hip_matrix_multiplication --A_rows 4096 --A_cols 512 --B_cols 2048
Matrix multiplication: [4096x512] * [512x2048], block size: 16x16
Validation passed.
Test Result
All tests pass.
Added/Updated documentation?
Included Visual Studio files?
Submission Checklist