fix: use O(n) hash map for duplicate case detection instead of O(n²)#22735
fix: use O(n) hash map for duplicate case detection instead of O(n²)#22735dkorpel merged 23 commits intodlang:masterfrom
Conversation
|
Thanks for your pull request and interest in making D better, @hariprakazz! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment. Bugzilla referencesYour PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog. Testing this PR locallyIf you don't have a local development environment setup, you can use Digger to test this PR: dub run digger -- build "master + dmd#22735" |
|
Thanks for the review @ibuclaw! You're right — I was incorrectly assuming pointer equality. I've updated the fix to use separate hash maps keyed by value: |
|
Looks like moving the check to |
|
Fixed the two remaining issues: Preserved original case expression ('a' instead of 97) in error messages by storing initialExp in cs.extra before folding Ready for re-review @ibuclaw @limepoutine |
|
Fixed null dereference for cases expanded from CaseRangeStatement — added null check for cs.extra so it falls back to cs.exp when initialExp wasn't stored. Ready for re-review @ibuclaw @limepoutine |
533c572 to
bec5574
Compare
|
Simplified the fix — moved the O(n) duplicate check back to visitCase where initialExp is already available, eliminating the need for cs.extra and the CaseRange null pointer issue. Error messages now correctly show the original expression (e.g. 'a' not 97). Ready for re-review @ibuclaw @limepoutine |
|
@hariprakazz having a think about possible alternatives to the current algorithm, I think best place would be to put an AA in See how This will get you started off (put it somewhere in /**
* This struct is needed for the Expression of a CaseStatement to be the key
* in an associative array.
*/
private struct CaseExpressionBox
{
Expression exp;
size_t hash;
this(Expression exp)
{
assert(exp.op == EXP.int64 || exp.op == EXP.string_);
this.exp = exp;
if (exp.isIntegerExp())
hash = hashOf(exp.toInteger());
else
hash = hashOf(exp.toStringExp().peekData());
}
size_t toHash() const @safe pure nothrow
{
return hash;
}
bool opEquals(ref const CaseExpressionBox s) @trusted const
{
static if (__VERSION__ < 2099) // https://bugzilla-archive.dlang.org/bugs/22717/
return s.exp.equals(exp);
else
return exp.equals(s.exp);
}
}Then add a new field to void* switchCases;The rest should just be initialising the above field with |
|
@ibuclaw @limepoutine ready for re-review! |
|
@hariprakazz tah! |
|
Fixed the segfault — (*seen)[box] was wrong, changed to seen[box] since seen is already a value not a pointer. All issues resolved, ready for re-review @ibuclaw @limepoutine |
Other than PR is mostly unreviewable with > 10k changes. |
|
Why is there a |
|
Fixed: Changed condition to cs.exp.isIntegerExp() || cs.exp.isStringExp() to exclude variable cases from duplicate detection. Replaced switch_10000.d with static foreach version (9 lines). Removed stray files (diff.txt, fulldiff.txt, rcdmdstatementsem.d). Ready for re-review @ibuclaw @limepoutine |
|
@ibuclaw thanks for the cleanup! What's the correct way to initialize the AA in switchCases that works with pre-2.101? Should I heap-allocate a pointer to the AA instead? |
|
I'm seeing lots of green now. :-) |
|
@ibuclaw the macOS bootstrap failure is in std.experimental.allocator.building_blocks.allocator_list (line 500) — unrelated to this PR's changes. It's a pre-existing flaky test that passes in debug64 but fails in release64. The CircleCI failure is also the frontend.h spacing issue now fixed in the latest commit. Could you please re-run the checks? Thanks! |
|
@ibuclaw all checks are now passing including the CircleCI frontend.h sync. Ready for merge whenever you are! |
|
@ibuclaw all checks are now passing including the CircleCI frontend.h sync. Ready for merge whenever you are! |
dkorpel
left a comment
There was a problem hiding this comment.
All the associative array workarounds aren't pleasant but could be refactored later.
|
Thanks everyone for the reviews and guidance. Happy to address any follow-ups if needed. |
Fixes #22710
The duplicate case detection in
visitCasewas O(n²) — for each caseit iterated over all previously added cases to check for duplicates.
This moves the check to
visitSwitchafter all cases are collected,using
AssocArrayfor O(n) lookup instead of a nested loop.Benchmark from the issue shows 3000→10000 cases causes a 6.3x jump
in compile time with the old approach. This fix brings it to near-linear.