fix: use O(n) hash map for duplicate case detection instead of O(n²) #22735

Merged: dkorpel merged 23 commits into dlang:master from hariprakazz:fix-switch-quadratic on Apr 4, 2026

Conversation

@hariprakazz
Contributor

Fixes #22710

The duplicate case detection in visitCase was O(n²) — for each case
it iterated over all previously added cases to check for duplicates.

This moves the check to visitSwitch after all cases are collected,
using AssocArray for O(n) lookup instead of a nested loop.

Benchmark from the issue shows 3000→10000 cases causes a 6.3x jump
in compile time with the old approach. This fix brings it to near-linear.
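
As a rough illustration of the complexity difference described above, here is a minimal sketch that is not taken from the compiler sources. The function names are hypothetical and plain `long` values stand in for case expressions; the first function mirrors the nested-loop check, the second does one associative-array lookup per case.

```d
import std.stdio : writeln;

/// Quadratic check: compares each value against every earlier value.
bool hasDuplicateQuadratic(const long[] cases)
{
    foreach (i, v; cases)
        foreach (w; cases[0 .. i])
            if (v == w)
                return true;
    return false;
}

/// Average-case linear check: one associative-array lookup per value.
bool hasDuplicateHashed(const long[] cases)
{
    bool[long] seen;
    foreach (v; cases)
    {
        if (v in seen)
            return true;
        seen[v] = true;
    }
    return false;
}

void main()
{
    long[] unique = [1, 2, 3, 4];
    long[] repeated = [1, 2, 3, 2];

    // Both functions agree on the result; only the amount of work differs.
    writeln(hasDuplicateQuadratic(unique), " ", hasDuplicateHashed(unique));
    writeln(hasDuplicateQuadratic(repeated), " ", hasDuplicateHashed(repeated));
}
```

For n cases the first version performs on the order of n²/2 comparisons in the worst case, while the second performs n hash lookups.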

@dlang-bot
Contributor

Thanks for your pull request and interest in making D better, @hariprakazz! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#22735"

Member

@ibuclaw left a comment

Failing tests say duplicate case detection breaks with this change. Need a hash map with a custom equality test, not just taking the hash of the pointer to the expression object.

@hariprakazz
Contributor Author

Thanks for the review @ibuclaw! You're right — I was incorrectly assuming pointer equality. I've updated the fix to use separate hash maps keyed by value: ulong for integer cases and const(char)[] for string cases, so the comparison is now value-based using the actual case values.
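
The pointer-versus-value distinction raised in the review can be shown with a small sketch. `Node` and both helper functions are hypothetical stand-ins, not compiler types: two separately allocated nodes holding the same value are different keys when the map is keyed by address, but the same key when it is keyed by the value itself.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a folded integer case expression.
struct Node
{
    long value;
}

/// Keyed by address: finds `probe` only if it is the very same object.
bool seenByPointer(Node* stored, Node* probe)
{
    bool[Node*] seen;
    seen[stored] = true;
    return (probe in seen) !is null;
}

/// Keyed by value: finds `probe` whenever the values match.
bool seenByValue(Node* stored, Node* probe)
{
    bool[long] seen;
    seen[stored.value] = true;
    return (probe.value in seen) !is null;
}

void main()
{
    auto a = new Node(97);
    auto b = new Node(97); // same value, different allocation

    writeln(seenByPointer(a, b)); // false: the duplicate is missed
    writeln(seenByValue(a, b));   // true: the duplicate is detected
}
```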

@ibuclaw
Member

ibuclaw commented Mar 16, 2026

Looks like moving the check to SwitchStatement means that the original case value is lost.

Test 'fail_compilation/b15909.d' failed: 
expected:
----
fail_compilation/b15909.d(12): Error: duplicate `case 'a'` in `switch` statement
----
actual:
----
fail_compilation/b15909.d(12): Error: duplicate `case 97` in `switch` statement
fail_compilation/b15909.d(9): Error: `switch` statement without a `default`; use `final switch` or add `default: assert(0);` or add `default: break;`
----

@hariprakazz
Contributor Author

Fixed the two remaining issues:

  • Preserved the original case expression ('a' instead of 97) in error messages by storing initialExp in cs.extra before folding
  • Fixed string hash collision handling with chained linked list entries

Ready for re-review @ibuclaw @limepoutine

@hariprakazz
Contributor Author

Fixed null dereference for cases expanded from CaseRangeStatement — added null check for cs.extra so it falls back to cs.exp when initialExp wasn't stored. Ready for re-review @ibuclaw @limepoutine

@hariprakazz force-pushed the fix-switch-quadratic branch from 533c572 to bec5574 on March 16, 2026 17:18
@hariprakazz
Contributor Author

Simplified the fix — moved the O(n) duplicate check back to visitCase where initialExp is already available, eliminating the need for cs.extra and the CaseRange null pointer issue. Error messages now correctly show the original expression (e.g. 'a' not 97). Ready for re-review @ibuclaw @limepoutine

@ibuclaw
Member

ibuclaw commented Mar 16, 2026

@hariprakazz having a think about possible alternatives to the current algorithm, I think best place would be to put an AA in Scope. As you can't have a class as the key for an AA however, you'll need to create a "box" struct to wrap the case statement expression.

See how TemplateInstanceBox is used as an example of how to do it.

This will get you started off (put it somewhere in dmd/statementsem.d)

/**
 * This struct is needed for the Expression of a CaseStatement to be the key
 * in an associative array.
 */
private struct CaseExpressionBox
{
    Expression exp;
    size_t hash;

    this(Expression exp)
    {
        assert(exp.op == EXP.int64 || exp.op == EXP.string_);
        this.exp = exp;

        if (exp.isIntegerExp())
            hash = hashOf(exp.toInteger());
        else
            hash = hashOf(exp.toStringExp().peekData());
    }

    size_t toHash() const @safe pure nothrow
    {
        return hash;
    }

    bool opEquals(ref const CaseExpressionBox s) @trusted const
    {
        static if (__VERSION__ < 2099) // https://bugzilla-archive.dlang.org/bugs/22717/
            return s.exp.equals(exp);
        else
            return exp.equals(s.exp);
    }
}

Then add a new field to struct Scope in dmd/dscope.d - after switchStatement will do just fine.

    void* switchCases;

The rest should just be initialising the above field with new CaseStatement[CaseExpressionBox], inserting the CaseStatements into the AA with (*cast(...*)&sc.switchCases)[box] = cs, and using in to find duplicates.
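
To make the box idea concrete outside the compiler, here is a self-contained sketch under stated assumptions. `MiniExp`, `MiniBox` and `duplicateIndices` are hypothetical stand-ins for `Expression`, `CaseExpressionBox` and the duplicate check. A typed associative array is used directly, so the `void*` field in `Scope` and its cast are deliberately left out.

```d
import std.stdio : writeln;

// Hypothetical miniature of a case expression: an integer or a string.
final class MiniExp
{
    bool isString;
    long intValue;
    string strValue;

    this(long v) { intValue = v; }
    this(string s) { isString = true; strValue = s; }

    /// Value comparison, playing the role of Expression.equals.
    bool sameValue(const MiniExp o) const
    {
        if (isString != o.isString)
            return false;
        return isString ? strValue == o.strValue : intValue == o.intValue;
    }
}

// Box struct: caches a value-based hash and compares keys by value.
struct MiniBox
{
    MiniExp exp;
    size_t hash;

    this(MiniExp exp)
    {
        this.exp = exp;
        hash = exp.isString ? hashOf(exp.strValue) : hashOf(exp.intValue);
    }

    size_t toHash() const @safe pure nothrow { return hash; }

    bool opEquals(ref const MiniBox s) const { return exp.sameValue(s.exp); }
}

/// Returns the indices of cases whose value already appeared earlier.
size_t[] duplicateIndices(MiniExp[] cases)
{
    size_t[MiniBox] firstSeen; // box -> index of first occurrence
    size_t[] dups;
    foreach (i, e; cases)
    {
        auto box = MiniBox(e);
        if (box in firstSeen)
            dups ~= i;
        else
            firstSeen[box] = i;
    }
    return dups;
}

void main()
{
    auto cases = [new MiniExp(1), new MiniExp("a"),
                  new MiniExp(2), new MiniExp("a")];
    writeln(duplicateIndices(cases)); // [3]
}
```

Caching the hash in the box means it is computed once per case, and `opEquals` is only consulted when two hashes collide or match.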

@hariprakazz
Contributor Author

@ibuclaw @limepoutine ready for re-review!

@ibuclaw
Member

ibuclaw commented Mar 17, 2026

@hariprakazz tah!

@hariprakazz
Contributor Author

Fixed the segfault — (*seen)[box] was wrong, changed to seen[box] since seen is already a value not a pointer. All issues resolved, ready for re-review @ibuclaw @limepoutine

@ibuclaw
Member

ibuclaw commented Mar 19, 2026

> All issues resolved,

Other than that the PR is mostly unreviewable with > 10k changes.

@Herringway
Contributor

Herringway commented Mar 19, 2026

Why is there a diff.txt, a fulldiff.txt and a rcdmdstatementsem.d\uF03Aq in this PR? They don't seem like they belong in it.

@hariprakazz
Contributor Author

Fixed:

  • Changed the condition to cs.exp.isIntegerExp() || cs.exp.isStringExp() to exclude variable cases from duplicate detection.
  • Replaced switch_10000.d with a static foreach version (9 lines).
  • Removed stray files (diff.txt, fulldiff.txt, rcdmdstatementsem.d).

Ready for re-review @ibuclaw @limepoutine
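
For readers unfamiliar with the technique, a `static foreach` can generate a large number of case labels from a few lines of source. The sketch below only illustrates that idea and is not the contents of the actual test file; the function name `lookup` and the case count `N` are assumptions chosen to keep compile time short.

```d
// Hypothetical case count; the test discussed above targets 10000 cases.
enum N = 1_000;

/// Maps x in [0, N) to 2 * x through N generated case labels.
int lookup(int x)
{
    switch (x)
    {
        // static foreach introduces no scope, so each iteration
        // contributes one `case` label directly to the switch body.
        static foreach (i; 0 .. N)
        {
            case i:
                return i * 2;
        }
        default:
            return -1;
    }
}

void main()
{
    assert(lookup(0) == 0);
    assert(lookup(N - 1) == (N - 1) * 2);
    assert(lookup(N) == -1);
}
```

Raising `N` makes the switch larger without growing the source file, which keeps a stress test of this kind easy to review.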

@hariprakazz
Contributor Author

@ibuclaw thanks for the cleanup! What's the correct way to initialize the AA in switchCases that works with pre-2.101? Should I heap-allocate a pointer to the AA instead?

@ibuclaw
Member

ibuclaw commented Mar 20, 2026

I'm seeing lots of green now. :-)

@hariprakazz
Contributor Author

@ibuclaw the macOS bootstrap failure is in std.experimental.allocator.building_blocks.allocator_list (line 500) — unrelated to this PR's changes. It's a pre-existing flaky test that passes in debug64 but fails in release64. The CircleCI failure is also the frontend.h spacing issue now fixed in the latest commit. Could you please re-run the checks? Thanks!

@hariprakazz
Contributor Author

@ibuclaw frontend.h and scope.h both contain switchCases correctly in the latest commit (766cf5b). The CircleCI failure appears to be the same header sync check — could you please re-run the CI checks? The macOS bootstrap failure is also a pre-existing flaky test unrelated to this PR. Thank you!

@hariprakazz
Contributor Author

@ibuclaw all checks are now passing including the CircleCI frontend.h sync. Ready for merge whenever you are!

@hariprakazz requested a review from ibuclaw on March 26, 2026 17:39
Contributor

@dkorpel left a comment

All the associative array workarounds aren't pleasant but could be refactored later.

@hariprakazz
Contributor Author

Thanks everyone for the reviews and guidance. Happy to address any follow-ups if needed.

@dkorpel merged commit 14340ef into dlang:master on Apr 4, 2026
42 checks passed

Development

Successfully merging this pull request may close these issues.

Switch statement compile time: O(n²) scaling at 10000+ cases

6 participants