perf: significant search and sort performance improvements #40
Open
ParthJadhav wants to merge 9 commits into master
- Skip regex entirely when only a file extension is specified (no search input, not strict, not ignore_case); use a direct OsStr comparison instead
- Replace path.display().to_string() with to_string_lossy().into_owned() for cheaper path conversion
- Improve benchmark harness with warmup iterations, median timing, and an "all" mode for running all benchmarks together
- Add dirs as dev-dependency for benchmarks
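The extension-only fast path described above might look roughly like the following. This is a minimal sketch, not the PR's actual code: `has_extension` is a hypothetical helper name, and the comparison is case-sensitive, matching the "not ignore_case" condition.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Hypothetical helper: true when `path` has exactly the extension `wanted`
/// (given without a leading dot). The OsStr values are compared directly,
/// so no regex is compiled and no String is allocated.
fn has_extension(path: &Path, wanted: &OsStr) -> bool {
    path.extension().is_some_and(|ext| ext == wanted)
}
```

A caller would check `has_extension(entry_path, OsStr::new("txt"))` for each directory entry and fall back to the regex path whenever a search input, strict mode, or case-insensitive matching is requested.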
- Replace std::sync::mpsc with crossbeam-channel for faster multi-producer single-consumer communication
- Pre-filter files by extension using ignore crate's TypesBuilder, reducing callback invocations for non-matching files
- Increase thread count to 2x CPU cores for better I/O overlap during directory traversal
- Skip filter_entry closure when no filters are configured
- Add controlled benchmark suite with 10,000-file test directory for reliable matching-path measurement
Use a precomputed-scores approach instead of recomputing file name extraction, lowercasing, and Jaro-Winkler similarity on every comparison.
Before: O(n log n) comparisons, each computing 2 scores (redundant work)
After: O(n) score computations + O(n log n) float comparisons
Also:
- Return &str instead of String from file_name_from_path to avoid alloc
- Use sort_unstable_by for better cache locality on float comparisons
- Apply in-place permutation to reorder results without extra allocation
Benchmark results (1000 items): 1.313ms -> 155µs (8.4x faster)
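The precomputed-scores idea can be sketched with the standard library alone. The assumptions here are deliberate: `toy_similarity` is a stand-in prefix score, not Jaro-Winkler, `similarity_sort_sketch` is a hypothetical name, and the reorder step uses a temporary vector of `Option`s rather than the in-place permutation the commit describes.

```rust
use std::path::Path;

/// Borrow the file name from a path string without allocating.
fn file_name_from_path(path: &str) -> &str {
    Path::new(path)
        .file_name()
        .and_then(|name| name.to_str())
        .unwrap_or(path)
}

/// Stand-in score in [0.0, 1.0]: shared leading chars over the longer length.
/// This is NOT Jaro-Winkler; it only keeps the sketch dependency-free.
fn toy_similarity(a: &str, b: &str) -> f64 {
    let shared = a.chars().zip(b.chars()).take_while(|(x, y)| x == y).count();
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 { 1.0 } else { shared as f64 / longest as f64 }
}

/// Hypothetical sketch: score every path once, then sort the cheap
/// (score, index) pairs instead of rescoring inside the comparator.
fn similarity_sort_sketch(paths: &mut Vec<String>, input: &str) {
    let needle = input.to_lowercase();

    // O(n): one file-name extraction, lowercase, and score per path.
    let mut scored: Vec<(f64, usize)> = paths
        .iter()
        .enumerate()
        .map(|(i, p)| (toy_similarity(&file_name_from_path(p).to_lowercase(), &needle), i))
        .collect();

    // O(n log n) float comparisons, best score first; ties keep input order.
    scored.sort_unstable_by(|a, b| b.0.total_cmp(&a.0).then(a.1.cmp(&b.1)));

    // Reorder by moving each String out of its old slot (simpler than an
    // in-place permutation, at the cost of one temporary Vec).
    let mut slots: Vec<Option<String>> = std::mem::take(paths).into_iter().map(Some).collect();
    paths.extend(scored.iter().filter_map(|&(_, i)| slots[i].take()));
}
```

Sorting `["/a/zebra.txt", "/b/report.txt", "/c/rep.md"]` against the input `report` with this stand-in score yields `report.txt`, `rep.md`, `zebra.txt`, and the score function is called exactly three times.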
Remove the num_cpus dependency in favor of the standard library's available_parallelism() (stable since Rust 1.59). This reduces the dependency count and uses the platform-native CPU detection.
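A minimal sketch of the standard-library replacement follows. `worker_threads` and its `multiplier` parameter are hypothetical names, and the fallback to a single core when detection fails is an assumption rather than something the PR states.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Hypothetical helper: worker count derived from the detected parallelism.
/// Falls back to 1 core if the platform cannot report a value.
fn worker_threads(multiplier: usize) -> usize {
    let cores = thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1);
    // A multiplier of 0 is treated as 1 so the result is never zero.
    cores.saturating_mul(multiplier.max(1))
}
```

Calling `worker_threads(2)` gives the "2x CPU cores" figure mentioned in the traversal commit.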
Use rayon's par_iter for computing Jaro-Winkler scores in parallel when the dataset exceeds 5,000 items. Below the threshold, use sequential iteration to avoid rayon thread pool overhead. Also scale up controlled benchmark to 100,000 files for better stress testing of matching and sorting paths.
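The threshold gating can be illustrated without rayon. The sketch below swaps in `std::thread::scope` with manual chunking as a stand-in for `par_iter`, so it shows the control flow only and not rayon's work-stealing behaviour. `score_all` and `PARALLEL_THRESHOLD` are hypothetical names.

```rust
use std::thread;

/// Assumed cut-over point, taken from the commit message (5,000 items).
const PARALLEL_THRESHOLD: usize = 5_000;

/// Hypothetical sketch: score sequentially for small inputs, and split the
/// work across scoped threads only when the input exceeds the threshold.
fn score_all<F>(names: &[String], score: F) -> Vec<f64>
where
    F: Fn(&str) -> f64 + Sync,
{
    if names.len() <= PARALLEL_THRESHOLD {
        // Small input: thread start-up would cost more than it saves.
        return names.iter().map(|n| score(n.as_str())).collect();
    }

    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let chunk_len = (names.len() + workers - 1) / workers;
    let score = &score;

    thread::scope(|s| {
        // One scoped thread per chunk; chunks are spawned and joined in order,
        // so the output order matches the input order.
        let handles: Vec<_> = names
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || chunk.iter().map(|n| score(n.as_str())).collect::<Vec<f64>>())
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("scoring thread panicked"))
            .collect()
    })
}
```

Both branches return scores in input order, so the sort that follows does not need to know which path was taken.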
…arch
- Add AcceptAll matcher variant: when the types pre-filter handles extension matching, skip redundant per-entry extension checks
- Use entry.into_path().into_os_string().into_string() for zero-copy String conversion when paths are valid UTF-8 (99.9% of cases)
- Remove add_defaults() from TypesBuilder to avoid loading hundreds of predefined type definitions on every search
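The zero-copy conversion can be sketched as follows. This version starts from an owned `PathBuf` rather than the `ignore` crate's directory entry, and both `path_into_string` and the lossy fallback for non-UTF-8 paths are assumptions about how the rare failure case would be handled.

```rust
use std::path::PathBuf;

/// Hypothetical helper: convert an owned path into a String.
/// For valid UTF-8 the existing buffer is reused, so nothing is copied.
fn path_into_string(path: PathBuf) -> String {
    match path.into_os_string().into_string() {
        Ok(s) => s,
        // Non-UTF-8 path: fall back to a lossy copy with replacement characters.
        Err(os) => os.to_string_lossy().into_owned(),
    }
}
```

`OsString::into_string` returns the original `OsString` in its `Err` variant, which is what makes the fallback possible without cloning the path up front.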
Document all optimization checkpoints, what worked (Schwartzian transform 9x sort speedup, crossbeam-channel, zero-copy paths), what didn't work (rayon overhead for small datasets), and benchmark results for each iteration.
Add "system" mode that benchmarks searching from the root filesystem, covering ext-only, regex, limit, no-filter, hidden, strict, and case-insensitive search patterns across ~3.9M real files. Refactor benchmark helpers to reduce duplication.
- Rename `matched` to `is_match` to avoid similar_names lint
- Use `is_some_and` instead of `map_or(false, ...)` for Option check
- Remove needless borrows in test files
- Remove unused `Path` import in bench
Summary
- Removed the `num_cpus` dependency, replaced with `std::thread::available_parallelism()`
- Added `crossbeam-channel` and `rayon` dependencies
Benchmark results
Full system benchmark (3.9M files, searching from /)
Key optimizations
- `similarity_sort`: precompute all Jaro-Winkler scores once, sort by float
- `crossbeam-channel` in place of `std::sync::mpsc`
- `AcceptAll` matcher skips redundant checks for ext-only searches
- Zero-copy path conversion via `into_path().into_os_string().into_string()`
Test plan