Fix: Auto-increment seed across batch_run iterations #2841
base: main
Conversation
I agree that this needs to be fixed. However, using subsequent integers with Mersenne Twister, Python's default RNG, is a bad idea. From Wikipedia: "A consequence of poor diffusion is that two instances of the generator, started with initial states that are almost the same, will usually output nearly the same sequence for many iterations". Also, using a seed with many zeros (like 42) is actually bad as well. One option is to just use [...]

As an aside, numpy's rng is much better, and I believe we should move all mesa code over to using it while deprecating the use of Python's stdlib random library.
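The concern above about correlated streams can be addressed with numpy's hierarchical seeding. A minimal sketch (not code from this PR): `numpy.random.SeedSequence` mixes a base seed through a hash, so child sequences spawned from it produce well-separated generators even when the base seed is small or low-entropy, avoiding the "almost the same initial state" problem of consecutive Mersenne Twister seeds.

```python
import numpy as np

# One base seed per batch run; spawn() derives independent child
# seed sequences, one per iteration.
base = np.random.SeedSequence(42)
children = base.spawn(3)

# Each child seeds its own generator; streams are decorrelated by design.
rngs = [np.random.default_rng(child) for child in children]
draws = [rng.random() for rng in rngs]
```

The whole tree is reproducible: rebuilding `SeedSequence(42)` and spawning again yields the same child streams.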
Considering how important this is, maybe we should just go all in, do the switch to numpy and its rng, and then offer seed options like system time and hierarchical seeding? Does it have to be a breaking change? Could we keep the old behavior and just add a warning?
Moving the internals over should be possible as a non-breaking change.
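One illustrative shape for such a non-breaking migration (this is a sketch under assumptions, not Mesa's actual code): keep the existing stdlib `random.Random` attribute untouched and add a numpy generator alongside it, both derived from the same seed, so old models keep working while new code can opt into the better rng.

```python
import random

import numpy as np


class Model:
    """Hypothetical model base class; attribute names are assumptions."""

    def __init__(self, seed=None):
        # Existing stdlib API, behavior unchanged for current users.
        self.random = random.Random(seed)
        # New numpy generator offered alongside, seeded from the same value.
        self.rng = np.random.default_rng(seed)
```

Both generators stay reproducible from the single `seed` argument, and neither interferes with the other's stream.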
```python
seed_value = kwargs["seed"]
if isinstance(seed_value, (int, float)) and not isinstance(seed_value, bool):
    kwargs = kwargs.copy()
    kwargs["seed"] = int(seed_value) + iteration
```
Suggested change:

```diff
- kwargs["seed"] = int(seed_value) + iteration
+ kwargs["seed"] = seed_value + time.time()
```
This is all that is needed to ensure a much better spread of seeding values and thus better randomness.
Considering this, do we want to move forward with this PR?
Where/how should we do this (without breaking the API)?
When using batch_run() with a single seed value and multiple iterations, all iterations were using the same seed, producing identical results instead of independent replications. This defeats the purpose of running multiple iterations.

This commit modifies _model_run_func to automatically increment the seed for each iteration (seed, seed+1, seed+2, ...) when a numeric seed is provided. This ensures:

- Each iteration produces different random outcomes
- Results remain reproducible (same base seed → same sequence)
- Backward compatibility with seed arrays (no modification if seed is already an iterable passed via parameters)
- Unchanged behavior when no seed is specified (each iteration gets a random seed from the OS)

The fix only applies when:

1. A 'seed' parameter exists in kwargs
2. The seed value is not None
3. The iteration number is > 0
4. The seed is a single numeric value (int/float, not bool)
Shifting to numpy's rng requires changes in e.g., [...]

The return is [...]

Alternatively, you can keep things as they are and just document the behavior. Or, perhaps even better: raise a ValueError if [...]. In my view, we might even consider deprecating [...].
Problem

When using batch_run() with a single seed value and multiple iterations, all iterations use the same seed, producing identical results instead of independent replications. See #2835.

Solution

Modify _model_run_func to automatically increment the seed for each iteration: 42, 43, 44, etc.

Behavior changes

- seed=42, iterations=3: currently all iterations use 42, now uses 42, 43, 44
- seed=[42, 43, 44], iterations=1: unchanged

Code that passes a single seed with multiple iterations will get different results. The current behavior seems like a bug (why run multiple identical iterations?), but this technically breaks existing code.
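The underlying bug is easy to demonstrate with the stdlib alone: two runs seeded with the same value replicate each other exactly, so repeating iterations with one seed adds no information, while offsetting the seed per iteration diverges the streams.

```python
import random


def run(seed: int, n: int = 3) -> list[float]:
    """Draw n values from a freshly seeded stdlib generator."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# Same seed -> identical "replications" (the reported bug).
identical = run(42) == run(42)

# Incremented seed per iteration -> distinct draws (the proposed fix).
distinct = run(42) != run(43)
```

This also shows why the result stays reproducible under the fix: each iteration's stream is fully determined by base seed + iteration number.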
Review
I'm in doubt about this. What if users have other random elements in their model? Are we doing them a service by obscuring this?
Secondly, is this a bugfix or a breaking change? Should we treat it as a fix and merge, or wait for a major version?
Might close #2835. @dylan-munson curious what you think.