Commit 0dc2767
feat(benchmarking): enable tests to run in dedicated environment or in docker (#3157)
* refactor: move spamoor benchmark into testify suite in test/e2e/benchmark
- Create test/e2e/benchmark/ subpackage with SpamoorSuite (testify/suite)
- Move spamoor smoke test into suite as TestSpamoorSmoke
- Split helpers into focused files: traces.go, output.go, metrics.go
- Introduce resultWriter for defer-based benchmark JSON output
- Export shared symbols from evm_test_common.go for cross-package use
- Restructure CI to fan-out benchmark jobs and fan-in publishing
- Run benchmarks on PRs only when benchmark-related files change
* fix: correct BENCH_JSON_OUTPUT path for spamoor benchmark
go test sets the working directory to the package under test, so the
env var should be relative to test/e2e/benchmark/, not test/e2e/.
* fix: place package pattern before test binary flags in benchmark CI
go test treats all arguments after an unknown flag (--evm-binary) as
test binary args, so ./benchmark/ was never recognized as a package
pattern.
* fix: adjust evm-binary path for benchmark subpackage working directory
go test sets the cwd to the package directory (test/e2e/benchmark/),
so the binary path needs an extra parent traversal.
* wip: erc20 benchmark test
* fix: exclude benchmark subpackage from make test-e2e
The benchmark package doesn't define the --binary flag that test-e2e
passes. It has its own CI workflow so it doesn't need to run here.
* fix: replace FilterLogs with header iteration and optimize spamoor config
collectBlockMetrics hit reth's 20K FilterLogs limit at high tx volumes.
Replace with direct header iteration over [startBlock, endBlock] and add
Phase 1 metrics: non-empty ratio, block interval p50/p99, gas/block and
tx/block p50/p99.
Optimize spamoor configuration for 100ms block time:
- --slot-duration 100ms, --startup-delay 0 on daemon
- throughput=50 per 100ms slot (500 tx/s per spammer)
- max_pending=50000 to avoid 3s block poll backpressure
- 5 staggered spammers with 50K txs each
Results: 55 MGas/s, 1414 TPS, 19.8% non-empty blocks (up from 6%).
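The Phase 1 metrics listed above could be computed from per-block samples along these lines. This is a sketch under assumptions: the `blockSample` type, millisecond timestamps, and the nearest-rank percentile method are illustrative choices, and the actual `collectBlockMetrics` implementation may differ.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// blockSample is a hypothetical per-block record gathered by iterating headers.
type blockSample struct {
	TimeMs  int64  // block timestamp in milliseconds (assumed unit)
	GasUsed uint64 // zero gas is treated as an empty block
}

// percentile returns the nearest-rank percentile of values (p in 0..100).
func percentile(values []float64, p float64) float64 {
	if len(values) == 0 {
		return 0
	}
	sorted := append([]float64(nil), values...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// nonEmptyRatio is the fraction of blocks that used any gas.
func nonEmptyRatio(blocks []blockSample) float64 {
	if len(blocks) == 0 {
		return 0
	}
	n := 0
	for _, b := range blocks {
		if b.GasUsed > 0 {
			n++
		}
	}
	return float64(n) / float64(len(blocks))
}

// intervals returns the gaps between consecutive block timestamps.
func intervals(blocks []blockSample) []float64 {
	var out []float64
	for i := 1; i < len(blocks); i++ {
		out = append(out, float64(blocks[i].TimeMs-blocks[i-1].TimeMs))
	}
	return out
}

func main() {
	blocks := []blockSample{{0, 0}, {100, 21000}, {200, 0}, {450, 42000}}
	iv := intervals(blocks)
	fmt.Println("non-empty ratio:", nonEmptyRatio(blocks))
	fmt.Println("interval p50/p99 (ms):", percentile(iv, 50), percentile(iv, 99))
}
```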
* fix: improve benchmark measurement window and reliability
- Move startBlock capture after spammer creation to exclude warm-up
- Replace 20s drain sleep with smart poll (waitForDrain)
- Add deleteAllSpammers cleanup to handle stale spamoor DB entries
- Lower trace sample rate to 10% to prevent Jaeger OOM
* fix: address PR review feedback for benchmark suite
- make reth tag configurable via EV_RETH_TAG env var (default pr-140)
- fix OTLP config: remove duplicate env vars, use http/protobuf protocol
- use require.Eventually for host readiness polling
- rename requireHTTP to requireHostUp
- use non-fatal logging in resultWriter.flush deferred context
- fix stale doc comment (setupCommonEVMEnv -> SetupCommonEVMEnv)
- rename loop variable to avoid shadowing testing.TB convention
- add block/internal/executing/** to CI path trigger
- remove unused require import from output.go
* chore: specify http
* chore: filter out benchmark tests from test-e2e
* refactor: centralize reth config and lower ERC20 spammer count
move EV_RETH_TAG resolution and rpc connection limits into setupEnv
so all benchmark tests share the same reth configuration. lower ERC20
spammer count from 5 to 2 to reduce resource contention on local
hardware while keeping the loop for easy scaling on dedicated infra.
* chore: collect all traces at once
* chore: self review
* refactor: extract benchmark helpers to slim down ERC20 test body
- add blockMetricsSummary with summarize(), log(), and entries() methods
- add evNodeOverhead() for computing ProduceBlock vs ExecuteTxs overhead
- add collectTraces() suite method to deduplicate trace collection pattern
- add addEntries() convenience method on resultWriter
- slim TestERC20Throughput from ~217 to ~119 lines
- reuse collectTraces in TestSpamoorSmoke
* docs: add detailed documentation to benchmark helper methods
* ci: add ERC20 throughput benchmark job
* chore: remove span assertions
* chore: adding gas burner test
* feat: allowing for env vars to be set for external resources
* chore: adding private key as env var
* chore: use host networking for spamoor in external mode
Bumps tastora to pick up host network support in the spamoor builder.
Spamoor in external mode now uses host networking so it can resolve
the same hostnames as the host machine.
* fix: pin tastora to correct commit with host network support
* chore: change service name
* chore: scope victoria traces queries to current test run
Record startTime when the provider is created and use it as the lower
bound for trace queries, preventing spans from previous runs being
included in the analysis.
* chore: paginate victoria traces queries
Fetch traces in pages of 1000 using offset parameter to avoid hitting
the VictoriaTraces limit cap while still collecting all spans.
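The offset-based paging described above follows a common pattern, sketched here against a generic `fetch` callback. The callback and the `fetchAll` helper are hypothetical; this does not reproduce the VictoriaTraces API or its parameters.

```go
package main

import "fmt"

// fetchAll requests pages of pageSize items at increasing offsets and stops at
// the first short (or empty) page, which signals that the source is exhausted.
func fetchAll[T any](pageSize int, fetch func(offset, limit int) ([]T, error)) ([]T, error) {
	var all []T
	for offset := 0; ; offset += pageSize {
		page, err := fetch(offset, pageSize)
		if err != nil {
			return nil, err
		}
		all = append(all, page...)
		if len(page) < pageSize {
			return all, nil
		}
	}
}

func main() {
	// Fake backend holding 2500 items, served in slices like a paged endpoint.
	data := make([]int, 2500)
	for i := range data {
		data[i] = i
	}
	fetch := func(offset, limit int) ([]int, error) {
		if offset >= len(data) {
			return nil, nil
		}
		end := offset + limit
		if end > len(data) {
			end = len(data)
		}
		return data[offset:end], nil
	}

	all, err := fetchAll(1000, fetch)
	fmt.Println(len(all), err)
}
```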
* chore: reduce tryCollectSpans timeout to 15s for victoria provider
Avoids 3-minute wait when a service (e.g. ev-reth) has no traces.
* chore: increase gas burner spammers from 4 to 8
* chore: revert gas burner spammers back to 4
* chore: switch victoria traces to LogsQL API
Replace the Jaeger-compatible API with VictoriaTraces native LogsQL
endpoint which streams all results without the 1000 trace limit.
* fix: use _stream filter syntax for LogsQL query
* chore: tune gasburner config for 100M gas limit blocks
* chore: match spamoor slot-duration to deployed 250ms block time
* chore: increase gasburner tx count for sustained load
* chore: use 8 spammers for sustained gasburner load
* chore: revert to 4 spammers to avoid nonce collisions
* chore: bump gasburner fees to 1000/500 gwei to clear stale txs
* chore: adding support for rich flowchart display
* adding host name to span
* chore: add re-broadcast
* chore: adding flowchart
* chore: bump fee
* chore: bump wallet amounts
* chore: increase throughput
* chore: 4x tx volume
* chore: fewer tx, higher gas
* chore: remove amount
* Bumped from 50/20 ETH to 500/200 ETH per wallet
* feat: parameterize benchmark config via env vars and add engine span metrics
- Make gas_units_to_burn, max_wallets, num_spammers, throughput, and
warmup_txs configurable via BENCH_* env vars
- Add rethExecutionRate() for ev-reth GGas/s measurement
- Add engineSpanEntries() for ProduceBlock/GetPayload/NewPayload timing
- Switch local benchmarks from Jaeger to VictoriaTraces
- Add setupExternalEnv for running against pre-deployed infrastructure
- Update tastora to 2ee1b0a (victoriatraces support)
* refactor: extract benchConfig and benchmarkResult types
- benchConfig consolidates all BENCH_* env vars into a single struct
constructed once per test via newBenchConfig(serviceName)
- benchmarkResult collects all output metrics (block summary, overhead,
GGas/s, engine span timing, seconds_per_gigagas, span averages) and
produces entries via a single entries() call
- Removes scattered envInt/envOrDefault calls from test files
- Removes manual entry-by-entry result assembly from each test
- Net reduction of ~129 lines across existing files
* chore: cleaning up flowchart display and fixing UI link error
* refactor: address PR review feedback on benchmark harness
- replace time.Sleep with require.EventuallyWithT for spammer checks
- use benchConfig env vars in TestERC20Throughput instead of hardcoded constants
- remove dead truncateID function
- fix stale Jaeger comment in smoke test
- deduplicate HTTP boilerplate in trace fetching via fetchLogStream helper
- fix fragile string comparison for ProduceBlock avg logging
- make waitForMetricTarget responsive to context cancellation
- add BENCH_WAIT_TIMEOUT env var support

1 parent 6e0bd9a commit 0dc2767
File tree
12 files changed: +1139 -230 lines changed
- test/e2e
  - benchmark