Add --drop-exec option to filter warmup samples #120
This commit introduces a new --drop-exec option that allows users to drop the first N execution samples when running with --exec-multisample. This is useful for mitigating warmup effects that can skew performance measurements. The option accepts an integer N specifying how many initial samples to drop, and works with all execution modes (normal, --exec, and --exec-interleaved-builds).
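The dropping step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `drop_warmup_samples` and its parameters are hypothetical, and raising an error when N is not smaller than the number of samples is an assumption, since the description does not say how that case is handled.

```python
from typing import List, Sequence


def drop_warmup_samples(samples: Sequence[float], drop_count: int) -> List[float]:
    """Return the samples with the first `drop_count` entries removed.

    Hypothetical helper: the name and the error handling are assumptions
    made for illustration, not taken from the actual implementation.
    """
    if drop_count < 0:
        raise ValueError("drop_count must be non-negative")
    # Assumption: dropping every sample would leave nothing to report,
    # so treat it as a usage error.
    if drop_count >= len(samples):
        raise ValueError("drop_count must be smaller than the number of samples")
    # Samples are assumed to be in execution order, so the warmup runs
    # are the leading entries.
    return list(samples[drop_count:])


# Example: four execution samples where the first one is a slow warmup run.
timings = [1.90, 1.21, 1.20, 1.19]
steady_state = drop_warmup_samples(timings, 1)
```

With one sample dropped, `steady_state` holds only the last three timings, so a mean or median computed from it is no longer pulled up by the slow first run.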
General question: one could argue that warmup handling should be done at the benchmark level. For example, GoogleBenchmark does that: it discards some number of warmup runs and then iterates until there is a stable result. That way, when you run a benchmark, you get a result that is immediately usable.
Is that not the case for the benchmarks that are part of the LLVM test suite? Do you see an actual difference between results with and without dropping "warmup" runs?
    # Check for incompatible options
    if opts.only_compile:
        self._fatal("--drop-exec cannot be used with --only-compile")
    if opts.build:
I don't think --build means that we're necessarily skipping the tests, right? Can't you have both --build and --exec, in which case it would make sense to have --build and --drop-exec at the same time?
I think --build and --exec is redundant as a concept because that's the default behavior. These options are only useful as a stopping point.
That said, there's an issue: I don't think the current implementation is self-consistent. --exec (née --test-prebuilt) is supposed to skip previous steps, which is fine. But --build doesn't skip the --configure phase (and doing so seems like bad UX).
For some benchmark suites that are plugged into the test-suite as an "external" suite, the current best practice is to drop the first run. I don't have evidence that this makes a meaningful difference, but it is the recommendation I'm hearing. Intuitively it does make some sense: for example, when I run some industry-standard benchmarks I first build with high parallelism (
