-
Notifications
You must be signed in to change notification settings - Fork 14
Added xxh3 64bit hash streaming implementation of WritableSequentialD… #566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Jasper Potts <1466205+jasperpotts@users.noreply.github.com>
…ata. Added JMH benchmark for model object hashCode(). Rewritten code generation to generate hashCode64() methods and hashCode() using xxhash. Signed-off-by: Jasper Potts <1466205+jasperpotts@users.noreply.github.com>
Integration Test Report 406 files + 2 406 suites +2 18m 7s ⏱️ +11s For more details on these failures, see this check. Results for commit a5d1b7b. ± Comparison against base commit d5d325e. ♻️ This comment has been updated with latest results. |
Signed-off-by: Jasper Potts <1466205+jasperpotts@users.noreply.github.com>
| @Override | ||
| public int hashCode() { | ||
| return(int)hashCode64(); | ||
| } | ||
| /** | ||
| * Extended 64bit hashCode method for to make hashCode better distributed and follows protobuf rules | ||
| * for default values. This is important for backward compatibility. This also lazy computes and caches the | ||
| * hashCode for future calls. It is designed to be thread safe. | ||
| */ | ||
| public long hashCode64() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. I like this approach.
Still unsure if (int)long is the best way to go for the legacy hash code, but overall, the approach in this PR looks a lot better than the one in the Bytes PR.
| """ | ||
| if ($unknownFields != null) { | ||
| for (int i = 0; i < $unknownFields.size(); i++) { | ||
| hashingStream.writeInt($unknownFields.get(i).hashCode()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: I'm wondering if we could just throw the unknown field bytes into the hashingStream and let it compute a proper 64-bit hash for all the unknown fields data?
Signed-off-by: Jasper Potts <1466205+jasperpotts@users.noreply.github.com>
...j-compiler/src/main/java/com/hedera/pbj/compiler/impl/generators/ModelHashCodeGenerator.java
Outdated
Show resolved
Hide resolved
…sh quality testing Signed-off-by: Jasper Potts <1466205+jasperpotts@users.noreply.github.com>
| @Override | ||
| public long hashCode64() { | ||
| return bytes.hashCode64(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this. But you want to also consider feeding the wire type and the field number into the hasher. Otherwise the hash code may not be as dispersed as we'd like.
| /** | ||
| * Interface for objects that can be hashed to a 64-bit long value. | ||
| */ | ||
| public interface SixtyFourBitHashable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: I have no strong opinion on this name and the current one does look logical, but HashCode64 might be another option too as it's basically equal to the very method name introduced here.
| * A test to evaluate the quality of non-cryptographic hash functions by checking how many unique hashes can be | ||
| * generated from 4.5 billion StateKey inputs. | ||
| */ | ||
| public final class PbjObjHashQualityStateKeyTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thank you! And what are the results? It'd be interesting to compare the old one and the new one, if possible.
| @@ -0,0 +1,230 @@ | |||
| #!/usr/bin/env python3 | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: can we haz Java? :)
… hashing Strings Signed-off-by: Jasper Potts <1466205+jasperpotts@users.noreply.github.com>
Signed-off-by: Jasper Potts <1466205+jasperpotts@users.noreply.github.com>
Added xxh3 64bit hash streaming implementation of WritableSequentialData. Added JMH benchmark for model object hashCode(). Rewritten code generation to generate hashCode64() methods and hashCode() using xxhash.