Use the upodesh crate for dictionary suggestion search#45

Merged
mominul merged 2 commits into master from upodesh on Jun 22, 2025

@mominul commented on Jun 22, 2025

upodesh is built on the Finite State Transducer (FST) data structure, which makes dictionary suggestion search substantially faster than the previous regular-expression-based approach. The design is inspired by Mehdi Hasan Khan's Go project libavrophonetic, which uses a trie data structure.
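
To illustrate why a prefix-indexed structure can beat scanning a word list with a regex, here is a minimal sketch of trie-based prefix lookup. It is not upodesh's API and it is a plain trie rather than an FST (an FST additionally shares common suffixes to save space); the names `Trie`, `TrieNode`, `insert`, and `with_prefix` are hypothetical and exist only for this example.

```rust
use std::collections::BTreeMap;

/// One trie node: children keyed by character, plus an end-of-word flag.
/// Hypothetical type for illustration only; not taken from upodesh.
#[derive(Default)]
struct TrieNode {
    children: BTreeMap<char, TrieNode>,
    is_word: bool,
}

/// A minimal trie over Unicode scalar values.
#[derive(Default)]
struct Trie {
    root: TrieNode,
}

impl Trie {
    /// Insert a word, creating nodes along its path as needed.
    fn insert(&mut self, word: &str) {
        let mut node = &mut self.root;
        for ch in word.chars() {
            node = node.children.entry(ch).or_default();
        }
        node.is_word = true;
    }

    /// Return every stored word starting with `prefix`, in sorted order.
    /// Reaching the prefix node costs one map lookup per prefix character,
    /// independent of how many words the dictionary holds.
    fn with_prefix(&self, prefix: &str) -> Vec<String> {
        let mut node = &self.root;
        for ch in prefix.chars() {
            match node.children.get(&ch) {
                Some(next) => node = next,
                None => return Vec::new(),
            }
        }
        let mut out = Vec::new();
        let mut buf = prefix.to_string();
        collect(node, &mut buf, &mut out);
        out
    }
}

/// Depth-first walk that collects all complete words below `node`.
fn collect(node: &TrieNode, buf: &mut String, out: &mut Vec<String>) {
    if node.is_word {
        out.push(buf.clone());
    }
    for (ch, child) in &node.children {
        buf.push(*ch);
        collect(child, buf, out);
        buf.pop();
    }
}
```

A regex-based search has to test the pattern against candidate words, whereas the lookup above walks only the path spelled by the prefix and then enumerates the subtree beneath it.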

Benchmarks

upodesh is significantly faster than the heavily optimized regex-based search previously used in riti. In recent benchmarks it is roughly 21× to 58× faster, depending on the input. This is a substantial gain over regex, especially in cases where large patterns previously caused bottlenecks.

📊 Summary of the Benchmark

This benchmark was performed on an Apple MacBook Air M1:

| Word | upodesh Time | regex Time | Speedup |
| --- | --- | --- | --- |
| a | ~3.341 µs | ~194.34 µs | ~58× faster |
| arO | ~11.840 µs | ~246.53 µs | ~20.8× faster |
| bistari | ~9.734 µs | ~353.74 µs | ~36.3× faster |
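
The speedup column is the regex time divided by the upodesh time for the same word. The short sketch below recomputes it from the timings reported above; the `speedup` helper is a hypothetical name used only for this check, not part of riti or upodesh.

```rust
/// Speedup factor: baseline time divided by the new time (same unit for both).
/// Hypothetical helper for checking the reported figures.
fn speedup(baseline_us: f64, new_us: f64) -> f64 {
    baseline_us / new_us
}

/// Timings from the benchmark summary, as (word, upodesh µs, regex µs).
fn reported_rows() -> Vec<(&'static str, f64, f64)> {
    vec![
        ("a", 3.341, 194.34),
        ("arO", 11.840, 246.53),
        ("bistari", 9.734, 353.74),
    ]
}

/// Format one line per word, with the speedup rounded to one decimal place.
fn summary_lines() -> Vec<String> {
    reported_rows()
        .into_iter()
        .map(|(word, new_us, old_us)| format!("{word}: {:.1}x", speedup(old_us, new_us)))
        .collect()
}
```

Rounded to one decimal place, the ratios come out to 58.2, 20.8, and 36.3, matching the figures reported in the summary.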

cc @mugli @gulshan

@mominul merged commit 1701dab into master on Jun 22, 2025
14 checks passed
@mominul deleted the upodesh branch on June 22, 2025 19:05