-
Notifications
You must be signed in to change notification settings - Fork 59
Directly Build UstrMap For Properties #525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
For what it's worth, this was a performance decision because a lot of time is spent inserting into that hashmap: #467 On my machine the change in this PR actually degrades performance by quite a lot: Are you able to replicate the boost in performance? What are your system specs like? |
|
Excellent, I assumed it must be that way for performance reasons, but I still don't understand how it's faster to build a vec and then build a hashmap instead of just directly building a hashmap. Is is because of rehashing while allocating more capacity? I'm running the benchmarks on Linux using a 5950X like I mentioned on #529. I reran the benchmarks first on Seems positive and reproducible. |
|
This change overall degrades performance on my M1 MacBook Air. Here is the deserialize Miner's Haven benchmark plotted (n = 1000, red is latest master, blue is this branch): I'm not entirely sure why there is such a significant difference, but I did some cursory profiling and noticed that these changes cause ~10% more L1 misses and ~12% more L2 misses when deserializing miners-haven.rbxl on my machine. So, one reason could be that using a scratch Vec for instance builder properties somehow improves cache locality over UstrMap (at least, on my machine and on this workload). I highly recommend resolving #529 before moving on with any performance work! The benchmarks are only as good as the data they can collect - if the data is polluted with e.g. CPU boosts then it's much, much harder to get a good signal |
|
I looked up how to set the cpu performance governor, and it can be done using Which supposedly just sets it to always run at the maximum frequency. The default governer was called "powersave". Running benchmarks repeatedly on the Ustr uses precomputed hashes which I assume are the same for every hashmap. Could it be accidental quadratic behaviour when doing |
|
Tried this again to see if any of the other changes had magically fixed the performance regression but nope, this still regresses performance by a fair bit. I ran it around a dozen times and it was worse every time, so I'm still not sure what you're observing. |
|
I've updated the description to include pertinent information. The originally reported performance uplift was because I was comparing to a low-performance run of master branch. Compared to the high-performance variant, which is the expected master branch performance, this change introduces a substantial performance regression. |
|
Closing after revisiting benchmarks with a new governor tool, see #529 (comment). |
|
which corresponds to this line: rbx-dom/rbx_binary/src/deserializer/state.rs Line 527 in 1c5b4bc
I hit a segfault on the next run: Very curious! Undefined behaviour! (Nov 20) (Nov 29) |



Fixes Unreliable Benchmarks #529If there's no reason why InstanceBuilder.properties can't be directly assigned to Instance.properties, then let's do that. The same logic is applied to building the properties in rbx_xml.
The "Miner's Haven/Deserialize" benchmark says throughput is increased by 30%. I was mistaken, the performance uplift was due to #529 comparing to a bad baseline run on master branch.There seems to be a performance disparity between Windows and Linux for this change.