Parallelize fitness evaluation with rayon by urmzd · Pull Request #9 · urmzd/linear-gp

urmzd · 2026-03-29T04:04:43Z

Summary

Parallelizes fitness evaluation with rayon's par_iter_mut(), replacing the sequential loop over population individuals
Adds Send + Sync + Clone bounds to State and environment traits to support parallel execution
Refactors eval_fitness to take &[Self::State] (an immutable slice) instead of &mut Vec<Self::State>, cloning the trials for each individual for thread safety
Adds a Criterion benchmark comparing sequential vs parallel fitness evaluation on the Iris problem
Stops debug/verbose logging from stalling computation by switching all stdout writers to non-blocking I/O via tracing_appender::non_blocking
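The core pattern described above (an immutable trials slice, one clone of the trials per individual, individuals evaluated concurrently) can be sketched without any dependencies. This is a minimal illustration, not the crate's code: State, ToyState, Individual and this eval_fitness signature are hypothetical stand-ins, and std::thread::scope is used in place of rayon so the block compiles on its own. With rayon, the per-individual body would sit inside population.par_iter_mut().for_each(...).

```rust
use std::thread;

/// Hypothetical stand-in for the crate's state trait: cloneable and shareable across threads.
pub trait State: Clone + Send + Sync {
    /// Advance the trial with an action and return the reward for that step.
    fn step(&mut self, action: f64) -> f64;
}

/// Toy trial: the reward is higher the closer the action is to `target`.
#[derive(Clone)]
pub struct ToyState {
    pub target: f64,
}

impl State for ToyState {
    fn step(&mut self, action: f64) -> f64 {
        -(self.target - action).abs()
    }
}

/// Hypothetical individual: a constant "program" plus its cached fitness.
pub struct Individual {
    pub gene: f64,
    pub fitness: Option<f64>,
}

/// Evaluate every individual against the shared, immutable trials.
/// Each individual works on its own clones, so no trial is mutated by two threads.
pub fn eval_fitness<S: State>(population: &mut [Individual], trials: &[S], n_threads: usize) {
    if population.is_empty() || trials.is_empty() {
        return;
    }
    let n = n_threads.max(1);
    // Ceiling division so every individual lands in exactly one chunk.
    let chunk = (population.len() + n - 1) / n;
    thread::scope(|scope| {
        for part in population.chunks_mut(chunk) {
            scope.spawn(move || {
                for ind in part.iter_mut() {
                    // Clone each trial, run it, and sum the scores.
                    let total: f64 = trials
                        .iter()
                        .cloned()
                        .map(|mut trial| trial.step(ind.gene))
                        .sum();
                    // Reduce the total to an average over trials.
                    ind.fitness = Some(total / trials.len() as f64);
                }
            });
        }
    });
}
```

Because the trials are only ever read and cloned, the caller keeps an untouched copy that can be reused for the next generation.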

Benchmark Results

Population Size	Sequential	Parallel	Speedup
50	1.57 ms	462 µs	3.4x
100	1.83 ms	416 µs	4.4x
200	3.17 ms	600 µs	5.3x
500	6.18 ms	1.04 ms	5.9x

Speedup grows with population size, as expected, because rayon distributes individual evaluations across the available cores.
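The Speedup column is the sequential time divided by the parallel time. As a quick check of the arithmetic, the sketch below recomputes it from the timings in the table (converted to microseconds); the names RESULTS, speedup and round1 are illustrative only.

```rust
/// (population size, sequential µs, parallel µs), copied from the table above.
pub const RESULTS: [(u32, f64, f64); 4] = [
    (50, 1570.0, 462.0),
    (100, 1830.0, 416.0),
    (200, 3170.0, 600.0),
    (500, 6180.0, 1040.0),
];

/// Speedup = sequential time / parallel time, both in the same unit.
pub fn speedup(sequential_us: f64, parallel_us: f64) -> f64 {
    sequential_us / parallel_us
}

/// Round to one decimal place, matching the precision used in the table.
pub fn round1(x: f64) -> f64 {
    (x * 10.0).round() / 10.0
}

/// Recompute the Speedup column for every row.
pub fn speedups() -> Vec<f64> {
    RESULTS
        .iter()
        .map(|&(_, seq, par)| round1(speedup(seq, par)))
        .collect()
}
```

For example, the first row gives 1570 / 462 ≈ 3.398, which rounds to the reported 3.4x.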

Test plan

cargo bench --bench parallel_fitness runs successfully and shows speedups
cargo test passes (no functional changes to evaluation logic)
Verify existing performance_after_training benchmark still works with --features gym

Implement parallel fitness evaluation using rayon's par_iter_mut() to evaluate individuals concurrently across multiple trials. Change eval_fitness signature from mutable trials vector to immutable slice reference to support parallel iteration. Add Clone + Send + Sync bounds to Core::State trait to enable safe parallel access. Refactor the fitness calculation logic to compute total score in parallel and reduce to average, improving performance on multi-core systems.

Enable gym environments to work with parallel fitness evaluation by adding Send + Sync trait bounds. Add Send + Sync requirements to GymRsEnvExt trait and its Observation type parameter across all impl blocks. This allows gym-based problem definitions to be safely used in the parallel evaluation framework.
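To illustrate why the bounds are needed on both the environment trait and its associated observation type, here is a small sketch. EnvExt and CartPoleLike are hypothetical stand-ins, not the real GymRsEnvExt definition, and std::thread::scope again replaces rayon so the example is self-contained.

```rust
/// Hypothetical stand-in for a gym-style environment trait.
pub trait EnvExt: Send + Sync {
    /// Observations cross thread boundaries too, so they carry the same bounds.
    type Observation: Send + Sync;
    fn observe(&self) -> Self::Observation;
}

/// Toy environment with a single observable value.
pub struct CartPoleLike {
    pub position: f64,
}

impl EnvExt for CartPoleLike {
    type Observation = [f64; 1];
    fn observe(&self) -> [f64; 1] {
        [self.position]
    }
}

/// Compiles only if `T` can be sent to and shared between threads.
pub fn assert_send_sync<T: Send + Sync>() {}

/// Observe every environment from its own thread.
/// `E: Sync` lets `&E` be used inside the spawned closure;
/// `Observation: Send` lets the result travel back to the caller.
pub fn observe_all<E: EnvExt>(envs: &[E]) -> Vec<E::Observation> {
    std::thread::scope(|s| {
        let handles: Vec<_> = envs
            .iter()
            .map(|env| s.spawn(move || env.observe()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Removing either bound from the trait makes observe_all fail to compile, which is the same kind of error a parallel iterator would report for a non-Send environment.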

Make IrisState cloneable to support parallel fitness evaluation. Add Clone derive macro to enable trial state cloning in the parallel fitness evaluation pipeline. Required by updated Core::State trait bounds.

Update benchmark tools to align with the new eval_fitness signature that accepts immutable trials slice. Change trials from mutable vector binding to immutable to match the updated API contract. Update eval_fitness call to pass immutable reference.

…talling computation

Stdout logging was using blocking I/O, causing the system to hang or crawl in verbose/debug mode due to the high volume of trace events from hot paths (instruction execution, fitness eval). Switch all stdout writers to tracing_appender::non_blocking, matching the existing file logging approach. Introduce TracingGuard to hold all WorkerGuards.
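The idea behind a non-blocking writer is that the hot path only pushes a message onto a channel, while a background thread performs the actual (slow) write; a guard flushes the remaining messages when it is dropped. The sketch below shows that mechanism with the standard library only. It is not tracing_appender's implementation: NonBlockingWriter, Guard, SharedBuf and this non_blocking function are hypothetical, shaped loosely after the (writer, guard) pair that tracing_appender::non_blocking returns.

```rust
use std::io::Write;
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::JoinHandle;

enum Msg {
    Line(String),
    Shutdown,
}

/// Cheap handle for the hot path: sending to the channel never waits on the sink.
#[derive(Clone)]
pub struct NonBlockingWriter {
    tx: Sender<Msg>,
}

impl NonBlockingWriter {
    pub fn log(&self, line: &str) {
        // A send error only means the worker has already shut down.
        let _ = self.tx.send(Msg::Line(line.to_string()));
    }
}

/// Must be kept alive; dropping it drains queued lines and joins the worker.
pub struct Guard {
    tx: Sender<Msg>,
    worker: Option<JoinHandle<()>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        let _ = self.tx.send(Msg::Shutdown);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}

/// Move `sink` to a background thread and return a writer plus its guard.
pub fn non_blocking<W: Write + Send + 'static>(mut sink: W) -> (NonBlockingWriter, Guard) {
    let (tx, rx) = channel::<Msg>();
    let worker = std::thread::spawn(move || {
        for msg in rx {
            match msg {
                Msg::Line(line) => {
                    let _ = writeln!(sink, "{line}");
                }
                Msg::Shutdown => break,
            }
        }
        let _ = sink.flush();
    });
    (NonBlockingWriter { tx: tx.clone() }, Guard { tx, worker: Some(worker) })
}

/// In-memory sink so the behaviour can be checked without touching stdout.
#[derive(Clone, Default)]
pub struct SharedBuf(pub Arc<Mutex<Vec<u8>>>);

impl Write for SharedBuf {
    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
        self.0.lock().unwrap().extend_from_slice(buf);
        Ok(buf.len())
    }
    fn flush(&mut self) -> std::io::Result<()> {
        Ok(())
    }
}
```

Holding several such guards in one struct for the lifetime of the program is the role the commit assigns to TracingGuard: if a guard is dropped early, lines still in the queue at shutdown may never reach the sink.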

Compare rayon-parallelized fitness evaluation against the sequential baseline across population sizes (50, 100, 200, 500) using the Iris problem. Demonstrates 3.4x-5.9x speedups scaling with population size.

Add --n-threads option to HyperParameters, ExperimentParams, and ExperimentConfig to control the number of rayon threads used for parallel fitness evaluation. Defaults to all available cores when not specified.
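The "defaults to all available cores" behaviour can be sketched as follows. The function names and the hand-rolled argument parsing are hypothetical (the PR's actual CLI wiring is not shown here), and treating 0 as "use the default" is an assumption of this sketch. The resolved count would typically be handed to rayon via rayon::ThreadPoolBuilder::new().num_threads(n).build_global().

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Resolve an optional thread count: an explicit positive value wins,
/// otherwise fall back to the number of available cores (or 1 if unknown).
pub fn resolve_n_threads(requested: Option<usize>) -> usize {
    match requested {
        Some(n) if n > 0 => n,
        _ => thread::available_parallelism()
            .map(NonZeroUsize::get)
            .unwrap_or(1),
    }
}

/// Extract the value following a `--n-threads` flag from raw arguments.
/// Returns `None` if the flag is absent or its value is not a number.
pub fn parse_n_threads(args: &[&str]) -> Option<usize> {
    args.windows(2)
        .find(|pair| pair[0] == "--n-threads")
        .and_then(|pair| pair[1].parse().ok())
}
```

Keeping the value optional all the way through the configuration structs means an unspecified flag stays distinguishable from an explicit choice until the thread pool is built.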

urmzd added 8 commits March 28, 2026 22:29

feat(iris): derive Clone for IrisState

19a4b85


test(benchmark): add parallel vs sequential fitness evaluation benchmark

a3986a8


feat(core): expose n_threads CLI flag for parallel evaluation

5c5a2f4


style: fix rustfmt formatting

0568a99

urmzd merged commit cffdc5e into main Mar 29, 2026
2 checks passed

urmzd deleted the feat/parallel-fitness-benchmark branch March 29, 2026 04:56
