Commit 0dc2767

feat(benchmarking): enable tests to run in dedicated environment or in docker (#3157)
* refactor: move spamoor benchmark into testify suite in test/e2e/benchmark
  - Create test/e2e/benchmark/ subpackage with SpamoorSuite (testify/suite)
  - Move spamoor smoke test into suite as TestSpamoorSmoke
  - Split helpers into focused files: traces.go, output.go, metrics.go
  - Introduce resultWriter for defer-based benchmark JSON output
  - Export shared symbols from evm_test_common.go for cross-package use
  - Restructure CI to fan-out benchmark jobs and fan-in publishing
  - Run benchmarks on PRs only when benchmark-related files change
* fix: correct BENCH_JSON_OUTPUT path for spamoor benchmark
  go test sets the working directory to the package under test, so the env var should be relative to test/e2e/benchmark/, not test/e2e/.
* fix: place package pattern before test binary flags in benchmark CI
  go test treats all arguments after an unknown flag (--evm-binary) as test binary args, so ./benchmark/ was never recognized as a package pattern.
* fix: adjust evm-binary path for benchmark subpackage working directory
  go test sets the cwd to the package directory (test/e2e/benchmark/), so the binary path needs an extra parent traversal.
* wip: erc20 benchmark test
* fix: exclude benchmark subpackage from make test-e2e
  The benchmark package doesn't define the --binary flag that test-e2e passes. It has its own CI workflow so it doesn't need to run here.
* fix: replace FilterLogs with header iteration and optimize spamoor config
  collectBlockMetrics hit reth's 20K FilterLogs limit at high tx volumes. Replace with direct header iteration over [startBlock, endBlock] and add Phase 1 metrics: non-empty ratio, block interval p50/p99, gas/block and tx/block p50/p99.
  Optimize spamoor configuration for 100ms block time:
  - --slot-duration 100ms, --startup-delay 0 on daemon
  - throughput=50 per 100ms slot (500 tx/s per spammer)
  - max_pending=50000 to avoid 3s block poll backpressure
  - 5 staggered spammers with 50K txs each
  Results: 55 MGas/s, 1414 TPS, 19.8% non-empty blocks (up from 6%).
* fix: improve benchmark measurement window and reliability
  - Move startBlock capture after spammer creation to exclude warm-up
  - Replace 20s drain sleep with smart poll (waitForDrain)
  - Add deleteAllSpammers cleanup to handle stale spamoor DB entries
  - Lower trace sample rate to 10% to prevent Jaeger OOM
* fix: address PR review feedback for benchmark suite
  - make reth tag configurable via EV_RETH_TAG env var (default pr-140)
  - fix OTLP config: remove duplicate env vars, use http/protobuf protocol
  - use require.Eventually for host readiness polling
  - rename requireHTTP to requireHostUp
  - use non-fatal logging in resultWriter.flush deferred context
  - fix stale doc comment (setupCommonEVMEnv -> SetupCommonEVMEnv)
  - rename loop variable to avoid shadowing testing.TB convention
  - add block/internal/executing/** to CI path trigger
  - remove unused require import from output.go
* chore: specify http
* chore: filter out benchmark tests from test-e2e
* refactor: centralize reth config and lower ERC20 spammer count
  move EV_RETH_TAG resolution and rpc connection limits into setupEnv so all benchmark tests share the same reth configuration. lower ERC20 spammer count from 5 to 2 to reduce resource contention on local hardware while keeping the loop for easy scaling on dedicated infra.
* chore: collect all traces at once
* chore: self review
* refactor: extract benchmark helpers to slim down ERC20 test body
  - add blockMetricsSummary with summarize(), log(), and entries() methods
  - add evNodeOverhead() for computing ProduceBlock vs ExecuteTxs overhead
  - add collectTraces() suite method to deduplicate trace collection pattern
  - add addEntries() convenience method on resultWriter
  - slim TestERC20Throughput from ~217 to ~119 lines
  - reuse collectTraces in TestSpamoorSmoke
* docs: add detailed documentation to benchmark helper methods
* ci: add ERC20 throughput benchmark job
* chore: remove span assertions
* chore: adding gas burner test
* feat: allowing for env vars to be set for external resources
* chore: adding private key as env var
* chore: use host networking for spamoor in external mode
  Bumps tastora to pick up host network support in the spamoor builder. Spamoor in external mode now uses host networking so it can resolve the same hostnames as the host machine.
* fix: pin tastora to correct commit with host network support
* chore: change service name
* chore: scope victoria traces queries to current test run
  Record startTime when the provider is created and use it as the lower bound for trace queries, preventing spans from previous runs being included in the analysis.
* chore: paginate victoria traces queries
  Fetch traces in pages of 1000 using offset parameter to avoid hitting the VictoriaTraces limit cap while still collecting all spans.
* chore: reduce tryCollectSpans timeout to 15s for victoria provider
  Avoids 3-minute wait when a service (e.g. ev-reth) has no traces.
* chore: increase gas burner spammers from 4 to 8
* chore: revert gas burner spammers back to 4
* chore: switch victoria traces to LogsQL API
  Replace the Jaeger-compatible API with VictoriaTraces native LogsQL endpoint which streams all results without the 1000 trace limit.
* fix: use _stream filter syntax for LogsQL query
* chore: tune gasburner config for 100M gas limit blocks
* chore: match spamoor slot-duration to deployed 250ms block time
* chore: increase gasburner tx count for sustained load
* chore: use 8 spammers for sustained gasburner load
* chore: revert to 4 spammers to avoid nonce collisions
* chore: bump gasburner fees to 1000/500 gwei to clear stale txs
* chore: adding support for rich flowchart display
* adding host name to span
* chore: add re-broadcast
* chore: adding flowchart
* chore: bump fee
* chore: bump wallet amounts
* chore: increase throughput
* chore: 4x tx volume
* chore: fewer tx, higher gas
* chore: remove amount
* Bumped from 50/20 ETH to 500/200 ETH per wallet
* feat: parameterize benchmark config via env vars and add engine span metrics
  - Make gas_units_to_burn, max_wallets, num_spammers, throughput, and warmup_txs configurable via BENCH_* env vars
  - Add rethExecutionRate() for ev-reth GGas/s measurement
  - Add engineSpanEntries() for ProduceBlock/GetPayload/NewPayload timing
  - Switch local benchmarks from Jaeger to VictoriaTraces
  - Add setupExternalEnv for running against pre-deployed infrastructure
  - Update tastora to 2ee1b0a (victoriatraces support)
* refactor: extract benchConfig and benchmarkResult types
  - benchConfig consolidates all BENCH_* env vars into a single struct constructed once per test via newBenchConfig(serviceName)
  - benchmarkResult collects all output metrics (block summary, overhead, GGas/s, engine span timing, seconds_per_gigagas, span averages) and produces entries via a single entries() call
  - Removes scattered envInt/envOrDefault calls from test files
  - Removes manual entry-by-entry result assembly from each test
  - Net reduction of ~129 lines across existing files
* chore: cleaning up flowchart display and fixing UI link error
* refactor: address PR review feedback on benchmark harness
  - replace time.Sleep with require.EventuallyWithT for spammer checks
  - use benchConfig env vars in TestERC20Throughput instead of hardcoded constants
  - remove dead truncateID function
  - fix stale Jaeger comment in smoke test
  - deduplicate HTTP boilerplate in trace fetching via fetchLogStream helper
  - fix fragile string comparison for ProduceBlock avg logging
  - make waitForMetricTarget responsive to context cancellation
  - add BENCH_WAIT_TIMEOUT env var support
1 parent 6e0bd9a commit 0dc2767

12 files changed: +1139 -230 lines changed

test/e2e/benchmark/config.go

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
+//go:build evm
+
+package benchmark
+
+import (
+	"os"
+	"strconv"
+	"testing"
+	"time"
+)
+
+// benchConfig holds all tunable parameters for a benchmark run.
+// fields are populated from BENCH_* env vars with sensible defaults.
+type benchConfig struct {
+	ServiceName string
+
+	// infrastructure (used by setupLocalEnv)
+	BlockTime      string
+	SlotDuration   string
+	GasLimit       string
+	ScrapeInterval string
+
+	// load generation (used by test functions)
+	NumSpammers     int
+	CountPerSpammer int
+	Throughput      int
+	WarmupTxs       int
+	GasUnitsToBurn  int
+	MaxWallets      int
+	WaitTimeout     time.Duration
+}
+
+func newBenchConfig(serviceName string) benchConfig {
+	return benchConfig{
+		ServiceName:     serviceName,
+		BlockTime:       envOrDefault("BENCH_BLOCK_TIME", "100ms"),
+		SlotDuration:    envOrDefault("BENCH_SLOT_DURATION", "250ms"),
+		GasLimit:        envOrDefault("BENCH_GAS_LIMIT", ""),
+		ScrapeInterval:  envOrDefault("BENCH_SCRAPE_INTERVAL", "1s"),
+		NumSpammers:     envInt("BENCH_NUM_SPAMMERS", 2),
+		CountPerSpammer: envInt("BENCH_COUNT_PER_SPAMMER", 2000),
+		Throughput:      envInt("BENCH_THROUGHPUT", 200),
+		WarmupTxs:       envInt("BENCH_WARMUP_TXS", 200),
+		GasUnitsToBurn:  envInt("BENCH_GAS_UNITS_TO_BURN", 1_000_000),
+		MaxWallets:      envInt("BENCH_MAX_WALLETS", 500),
+		WaitTimeout:     envDuration("BENCH_WAIT_TIMEOUT", 10*time.Minute),
+	}
+}
+
+func (c benchConfig) totalCount() int {
+	return c.NumSpammers * c.CountPerSpammer
+}
+
+func (c benchConfig) log(t testing.TB) {
+	t.Logf("load: spammers=%d, count_per=%d, throughput=%d, warmup=%d, gas_units=%d, max_wallets=%d",
+		c.NumSpammers, c.CountPerSpammer, c.Throughput, c.WarmupTxs, c.GasUnitsToBurn, c.MaxWallets)
+	t.Logf("infra: block_time=%s, slot_duration=%s, gas_limit=%s, scrape_interval=%s",
+		c.BlockTime, c.SlotDuration, c.GasLimit, c.ScrapeInterval)
+}
+
+func envOrDefault(key, fallback string) string {
+	if v := os.Getenv(key); v != "" {
+		return v
+	}
+	return fallback
+}
+
+// envInt returns the integer value of the given env var, or fallback if unset
+// or unparseable. Invalid values silently fall back to the default.
+func envInt(key string, fallback int) int {
+	v := os.Getenv(key)
+	if v == "" {
+		return fallback
+	}
+	n, err := strconv.Atoi(v)
+	if err != nil {
+		return fallback
+	}
+	return n
+}
+
+// envDuration returns the duration value of the given env var (e.g. "5m", "30s"),
+// or fallback if unset or unparseable.
+func envDuration(key string, fallback time.Duration) time.Duration {
+	v := os.Getenv(key)
+	if v == "" {
+		return fallback
+	}
+	d, err := time.ParseDuration(v)
+	if err != nil {
+		return fallback
+	}
+	return d
+}
