Add a benchmark targeting NFA to DFA tradeoffs. #492
sayrer wants to merge 3 commits into timbray:main from
Cool benchmarks, thanks, will probably adopt. But I'm missing something: the benchmarks don't call nfa2Dfa, so how do you arrive at the conclusions at the top of this thread?
You can only do damage to these numbers with a lazy or eager (nfa2dfa) DFA implementation. The benchmarks set the baseline: always-NFA in the presence of wildcards. So if you look at the patches here (picking and choosing), you'll see it: main...sayrer:quamina:lazy_dfa
Got it. Need to finish first-cut nfa2dfa.
Shouldn't we just check in the benchmark now? Then hammer on it and declare victory? I've shown that's possible, but maybe not in a way you're cool with.
Probably. Will take a closer look in the near future.
Here's the tradeoff this file attempts to measure (see #481):
Small state space, eager fits in budget → eager wins, it's faster, no cache overhead.
Large state space, predictable input → lazy wins by a mile. Eager can't even attempt it.
Large state space, adversarial input → lazy falls back to NFA-with-overhead. Eager also falls back to NFA because it blew the budget. You're in the same place, maybe ~2x slower judging by your varied-input benchmarks, but that's 2x slower than a path that was already the fallback.
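To make the three cases above concrete, here is a toy sketch of a lazy DFA with a transition-cache budget. It is not Quamina's implementation and not the code on the lazy_dfa branch: the `nfa`, `lazyDFA`, `buildAStarB`, and `newLazyDFA` names are hypothetical, the hard-wired pattern is the shellstyle `a*b`, and counting the budget in cached transitions is an assumption made only for illustration. Repeated (predictable) input is served from the cache; once the budget is exhausted, every miss pays for both the cache lookup and the NFA step, which is the "NFA-with-overhead" fallback.

```go
package main

import (
	"sort"
	"strconv"
	"strings"
)

// nfa is a toy automaton: state 0 is the start state, no epsilon transitions.
type nfa struct {
	trans  map[int]map[byte][]int
	accept map[int]bool
}

// buildAStarB builds an NFA for the shellstyle pattern "a*b" over alphabet.
func buildAStarB(alphabet string) *nfa {
	n := &nfa{
		trans:  map[int]map[byte][]int{0: {'a': {1}}, 1: {}},
		accept: map[int]bool{2: true},
	}
	for i := 0; i < len(alphabet); i++ {
		c := alphabet[i]
		if c == 'b' {
			n.trans[1][c] = []int{1, 2} // stay in the wildcard, or finish
		} else {
			n.trans[1][c] = []int{1} // wildcard loop
		}
	}
	return n
}

// step advances a set of NFA states over one byte; the result is sorted.
func (n *nfa) step(set []int, b byte) []int {
	seen := map[int]bool{}
	for _, s := range set {
		for _, t := range n.trans[s][b] {
			seen[t] = true
		}
	}
	out := make([]int, 0, len(seen))
	for t := range seen {
		out = append(out, t)
	}
	sort.Ints(out)
	return out
}

// cacheKey identifies one DFA transition: (set of NFA states, input byte).
type cacheKey struct {
	set string
	b   byte
}

func setKey(set []int) string {
	parts := make([]string, len(set))
	for i, s := range set {
		parts[i] = strconv.Itoa(s)
	}
	return strings.Join(parts, ",")
}

// lazyDFA memoizes NFA steps until budget transitions have been cached.
type lazyDFA struct {
	n            *nfa
	cache        map[cacheKey][]int
	budget       int // hypothetical cap, counted in cached transitions
	hits, misses int
}

func newLazyDFA(n *nfa, budget int) *lazyDFA {
	return &lazyDFA{n: n, cache: map[cacheKey][]int{}, budget: budget}
}

func (d *lazyDFA) match(input string) bool {
	cur := []int{0}
	for i := 0; i < len(input); i++ {
		k := cacheKey{setKey(cur), input[i]}
		next, ok := d.cache[k]
		if ok {
			d.hits++
		} else {
			d.misses++
			next = d.n.step(cur, input[i]) // fall back to the NFA step
			if len(d.cache) < d.budget {
				d.cache[k] = next
			}
		}
		cur = next
	}
	for _, s := range cur {
		if d.n.accept[s] {
			return true
		}
	}
	return false
}
```

Calling `newLazyDFA(buildAStarB("abx"), 16).match("axxb")` twice exercises the predictable-input case: the second call is answered entirely from the cache. A budget of 0 forces every step through the NFA path while still paying for key construction, which is the adversarial case in miniature. A real implementation would key on DFA state IDs rather than rebuilding a string per step.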
This file contains 5 shellstyle wildcard benchmarks designed to characterize NFA vs DFA tradeoffs: