[UX BUG] Search fails to match names with underscores (e.g. "house prices nominal" misses "house_prices_nominal") #385

@Siddhartha-singh01

Description: The current search implementation makes datasets hard to discover when users type a name with spaces but the dataset identifier uses underscores.

Steps to Reproduce:

1 - Search OpenML for house prices nominal.
2 - Notice that the top results are relevant but the exact match house_prices_nominal is missing or buried.
3 - Search for house_prices_nominal with underscores.
4 - Notice it appears immediately as the top result.

Impact: This is a high-priority UX issue because users naturally search with spaces. Having to know the exact underscore placement makes the search feel "broken" or "incomplete" for any dataset whose name contains underscores.

Root Cause Analysis: The analyzer behind the current Elasticsearch search appears to keep underscores inside a single token rather than treating them as word delimiters. If so, house_prices_nominal is indexed as one term, while the query house prices nominal is split on spaces into three terms, none of which equals that indexed term.
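
To illustrate the suspected mismatch, here is a minimal Python sketch. It is not the actual OpenML search code: the two tokenizers are rough stand-ins, the helper names are hypothetical, and the assumption that the live analyzer keeps underscores inside tokens has not been verified against the real index mapping. The settings dict at the end shows one possible fix using Elasticsearch's `pattern_replace` character filter; the names `underscore_to_space` and `name_analyzer` are made up for illustration.

```python
import re


def tokenize_keep_underscores(text):
    # Stand-in for an analyzer that does NOT split on underscores
    # (assumed behaviour of the current search; \w includes "_").
    return re.findall(r"\w+", text.lower())


def tokenize_split_underscores(text):
    # Candidate fix: turn underscores into spaces before tokenizing.
    return re.findall(r"\w+", text.lower().replace("_", " "))


def matches(query_tokens, name_tokens):
    # Simplified matching rule: every query term must be an indexed term.
    return set(query_tokens) <= set(name_tokens)


name = "house_prices_nominal"
query = "house prices nominal"

# Suspected current behaviour: three query terms vs. one indexed term.
print(matches(tokenize_keep_underscores(query), tokenize_keep_underscores(name)))

# With underscores treated as delimiters, both sides yield the same terms.
print(matches(tokenize_split_underscores(query), tokenize_split_underscores(name)))

# Hypothetical index settings expressing the same idea in Elasticsearch.
# The analyzer would still need to be attached to the name field's mapping.
INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "char_filter": {
                "underscore_to_space": {
                    "type": "pattern_replace",
                    "pattern": "_",
                    "replacement": " ",
                }
            },
            "analyzer": {
                "name_analyzer": {
                    "type": "custom",
                    "char_filter": ["underscore_to_space"],
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            },
        }
    }
}
```

Because the same normalization runs on both the indexed name and the query, a search for house_prices_nominal with underscores would keep working in this sketch. Changing an analyzer on an existing index generally requires reindexing, so this would need to be checked against how the OpenML index is built.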