Merged
Implement parallel fitness evaluation using rayon's par_iter_mut() to evaluate individuals concurrently across multiple trials. Change eval_fitness signature from mutable trials vector to immutable slice reference to support parallel iteration. Add Clone + Send + Sync bounds to Core::State trait to enable safe parallel access. Refactor the fitness calculation logic to compute total score in parallel and reduce to average, improving performance on multi-core systems.
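The evaluation shape described in this commit can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `TrialState`, `Individual`, `score`, and `eval_population` are hypothetical names, and `std::thread::scope` with `chunks_mut` stands in for rayon's `par_iter_mut()` so the block compiles without external crates.

```rust
use std::thread;

/// Hypothetical stand-in for a `Core::State` implementor; it derives `Clone`
/// so every individual can work on a private copy of the trials.
#[derive(Clone)]
pub struct TrialState {
    pub features: [f64; 2],
    pub label: usize,
    pub steps: u32,
}

/// Hypothetical individual; `fitness` is filled in by evaluation.
pub struct Individual {
    pub weights: [f64; 2],
    pub fitness: f64,
}

/// Scores one trial: 1.0 for a correct prediction, 0.0 otherwise.
/// The trial is mutated, which is why shared trials must be cloned first.
fn score(ind: &Individual, trial: &mut TrialState) -> f64 {
    trial.steps += 1;
    let activation: f64 = ind
        .weights
        .iter()
        .zip(trial.features.iter())
        .map(|(w, x)| w * x)
        .sum();
    let predicted = if activation > 0.0 { 1 } else { 0 };
    if predicted == trial.label { 1.0 } else { 0.0 }
}

/// Takes an immutable slice, clones it, sums the scores, and averages.
pub fn eval_fitness(ind: &mut Individual, trials: &[TrialState]) {
    if trials.is_empty() {
        ind.fitness = 0.0;
        return;
    }
    let mut local = trials.to_vec();
    let total: f64 = {
        let ind_ref: &Individual = ind;
        local.iter_mut().map(|t| score(ind_ref, t)).sum()
    };
    ind.fitness = total / trials.len() as f64;
}

/// Splits the population into chunks and evaluates each chunk on its own
/// scoped thread. With rayon, the loop would instead be
/// `population.par_iter_mut().for_each(...)`.
pub fn eval_population(population: &mut [Individual], trials: &[TrialState], n_threads: usize) {
    if population.is_empty() {
        return;
    }
    let n_threads = n_threads.max(1);
    let chunk_len = (population.len() + n_threads - 1) / n_threads;
    thread::scope(|s| {
        for chunk in population.chunks_mut(chunk_len) {
            s.spawn(move || {
                for ind in chunk.iter_mut() {
                    eval_fitness(ind, trials);
                }
            });
        }
    });
}
```

Because each worker only reads the shared slice and writes to its own individuals, no locking is needed; the cost is one clone of the trials per individual.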
Enable gym environments to work with parallel fitness evaluation by adding Send + Sync trait bounds. Add Send + Sync requirements to GymRsEnvExt trait and its Observation type parameter across all impl blocks. This allows gym-based problem definitions to be safely used in the parallel evaluation framework.
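A small sketch of what such bounds buy, under assumed names: `EnvExt` and `LineEnv` below are hypothetical and only mirror the shape of a trait with an `Observation` associated type; they are not the `GymRsEnvExt` definition. Putting `Send + Sync` on both the trait and the associated type lets generic code rely on thread safety without repeating the bounds at every call site.

```rust
use std::thread;

/// Hypothetical environment trait: implementors and their observations
/// must both be safe to share across threads.
pub trait EnvExt: Send + Sync {
    type Observation: Send + Sync;
    fn observe(&self) -> Self::Observation;
}

/// Hypothetical one-dimensional environment.
pub struct LineEnv {
    pub position: f64,
}

impl EnvExt for LineEnv {
    type Observation = f64;
    fn observe(&self) -> f64 {
        self.position
    }
}

/// Compile-time check: this only type-checks because the trait itself
/// guarantees `Send + Sync` for `E` and `E::Observation`.
pub fn assert_parallel_safe<E: EnvExt>() {
    fn is_send_sync<T: Send + Sync>() {}
    is_send_sync::<E>();
    is_send_sync::<E::Observation>();
}

/// Reads every environment from its own scoped thread. Sharing `&LineEnv`
/// across threads requires `LineEnv: Sync`; returning the observation
/// from the thread requires it to be `Send`.
pub fn observe_all(envs: &[LineEnv]) -> Vec<f64> {
    thread::scope(|s| {
        let handles: Vec<_> = envs
            .iter()
            .map(|env| s.spawn(move || env.observe()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

An implementor holding a non-thread-safe field such as `Rc<T>` would be rejected at its `impl EnvExt` block rather than deep inside the parallel evaluation code.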
Make IrisState cloneable to support parallel fitness evaluation. Add Clone derive macro to enable trial state cloning in the parallel fitness evaluation pipeline. Required by updated Core::State trait bounds.
Update benchmark tools to align with the new eval_fitness signature that accepts immutable trials slice. Change trials from mutable vector binding to immutable to match the updated API contract. Update eval_fitness call to pass immutable reference.
…talling computation
Stdout logging used blocking I/O, which made the system hang or crawl in verbose/debug mode because hot paths (instruction execution, fitness eval) emit a high volume of trace events. Switch all stdout writers to tracing_appender::non_blocking, matching the existing file logging approach. Introduce TracingGuard to hold all WorkerGuards.
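The idea behind a non-blocking writer plus a guard can be illustrated with a std-only sketch. Everything here (`ChannelWriter`, `FlushGuard`, `Msg`, and this local `non_blocking` function) is hypothetical and is not how `tracing_appender` is implemented; in the real crate, `tracing_appender::non_blocking(std::io::stdout())` returns a writer and a `WorkerGuard` that must be kept alive so buffered output is flushed. The sketch shows the general pattern: hot paths only push bytes onto a channel, a background thread does the slow I/O, and dropping the guard drains the queue.

```rust
use std::io::{self, Write};
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

enum Msg {
    Bytes(Vec<u8>),
    Shutdown,
}

/// Writer handed to the logging layer: `write` only enqueues, never blocks on I/O.
pub struct ChannelWriter {
    tx: Sender<Msg>,
}

impl Write for ChannelWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.tx
            .send(Msg::Bytes(buf.to_vec()))
            .map_err(|_| io::Error::new(io::ErrorKind::BrokenPipe, "log worker has stopped"))?;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Keeps the worker alive; dropping it drains pending messages and joins.
pub struct FlushGuard<W> {
    tx: Sender<Msg>,
    worker: Option<JoinHandle<W>>,
}

impl<W> FlushGuard<W> {
    /// Stops the worker and returns the underlying sink.
    pub fn finish(mut self) -> Option<W> {
        self.shutdown()
    }

    fn shutdown(&mut self) -> Option<W> {
        let worker = self.worker.take()?;
        let _ = self.tx.send(Msg::Shutdown);
        worker.join().ok()
    }
}

impl<W> Drop for FlushGuard<W> {
    fn drop(&mut self) {
        let _ = self.shutdown();
    }
}

/// Spawns the background worker that owns the slow sink.
pub fn non_blocking<W: Write + Send + 'static>(mut sink: W) -> (ChannelWriter, FlushGuard<W>) {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || {
        for msg in rx {
            match msg {
                Msg::Bytes(bytes) => {
                    let _ = sink.write_all(&bytes);
                }
                Msg::Shutdown => break,
            }
        }
        let _ = sink.flush();
        sink
    });
    (ChannelWriter { tx: tx.clone() }, FlushGuard { tx, worker: Some(worker) })
}
```

A wrapper such as the `TracingGuard` mentioned above would plausibly hold several guards like this one, so that all of them are dropped together at the end of `main`.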
Compare rayon-parallelized fitness evaluation against the sequential baseline across population sizes (50, 100, 200, 500) using the Iris problem. Demonstrates 3.4x-5.9x speedups scaling with population size.
Add --n-threads option to HyperParameters, ExperimentParams, and ExperimentConfig to control the number of rayon threads used for parallel fitness evaluation. Defaults to all available cores when not specified.
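The defaulting rule can be sketched as a small helper. `resolve_n_threads` is a hypothetical name, and treating `0` like "not specified" is an assumption made for this example. With rayon, the resolved value would typically be handed to `rayon::ThreadPoolBuilder::new().num_threads(n).build_global()` once at startup.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Returns the requested thread count, or the number of available cores
/// when nothing (or zero, by assumption) was requested. Falls back to 1
/// if the platform cannot report its parallelism.
pub fn resolve_n_threads(requested: Option<usize>) -> usize {
    match requested {
        Some(n) if n > 0 => n,
        _ => thread::available_parallelism()
            .map(NonZeroUsize::get)
            .unwrap_or(1),
    }
}
```

Calling `resolve_n_threads(Some(4))` yields 4, while `resolve_n_threads(None)` depends on the machine it runs on.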
Summary
- Parallelize fitness evaluation with rayon's `par_iter_mut()`, replacing the sequential loop over population individuals
- Add `Send + Sync + Clone` bounds to `State` and environment traits to support parallel execution
- Change `eval_fitness` to take `&[Self::State]` (immutable slice) instead of `&mut Vec<Self::State>`, cloning trials per-individual for thread safety
- Switch stdout logging to `tracing_appender::non_blocking`

Benchmark Results
Speedup scales with population size, as expected: rayon distributes individual evaluations across the available cores.
Test plan
- `cargo bench --bench parallel_fitness` runs successfully and shows speedups
- `cargo test` passes (no functional changes to evaluation logic)
- `performance_after_training` benchmark still works with `--features gym`