[enhancement] Refactor of independent_sampling.py#797
[enhancement] Refactor of independent_sampling.py#797
independent_sampling.py#797Conversation
independent_sampling.pyindependent_sampling.py
|
Here are distributions. I ran it on the v2.3 branch and this branch. I didn't see any major differences between the distributions. population_synthesis_analysis_v2.3.pdf |
| If m1 is None or has incorrect size. | ||
| """ | ||
| if rng is None: | ||
| rng = np.random.default_rng() |
There was a problem hiding this comment.
Maybe we can handle this in a separate PR because I noticed that we make this replacement in a few places when there is an undefined RNG, I think outside of this .py file as well; should we still apply a specified seed (or entropy) if one has been set in the .ini?
There was a problem hiding this comment.
The other place this happens is binarypopulation.py line 988.
There was a problem hiding this comment.
Would you like me to remove this part? I added is because previously we were just using uniform directly without allowing a RNG to be passed through. If the RNG is passed, it should be defined using the entropy in the configuration file. However, I don't fully remember if this is the case here
There was a problem hiding this comment.
My concern is minor and basically about semantics, so feel free to leave it for another PR of course, but I'll explain in more detail below.
We set up the PopulationManager in BinaryPopulation.__init__() using
192 self.RNG = np.random.default_rng(seed=seed_seq)
193 self.kwargs['RNG'] = self.RNG
in binarypopulation.py, where seed_seq is based on the entropy, which can be defined in the .ini file. This RNG gets passed to PopulationManager() and then to BinaryGenerator(). So from a regular user's perspective, RNG will never be None and it will be defined using the entropy set in the .ini. I think that is OK because we do not expose RNG as an option in the .ini. Incidentally though, any user-defined RNG will get overwritten during BinaryPopulation.__init__().
However, if someone constructs BinaryGenerator() independently, these lines (also in binarypopulation.py)
988 if RNG is None:
989 self.RNG = np.random.default_rng()
990 self.entropy = self.RNG.bit_generator._seed_seq.entropy
will force the system's entropy to be used, regardless of whatever entropy might be set in kwargs and likewise for similar cases in distributions.py. This behavior seems inconsistent somewhat. In BinaryPopulation() you can set the entropy in kwargs, whereas in other places you'd need to construct the RNG outside of the object with a specific entropy and pass it in kwargs.
In BinaryGenerator(), I'd suggest something like:
988 if RNG is None:
989 self.entropy = self.kwargs.get('entropy', None)
990 seq = np.random.SeedSequence(entropy=self.entropy)
991 self.RNG = np.random.default_rng(seed=seed_seq)
992 self.entropy = self.RNG.bit_generator._seed_seq.entropy
993 else:
994 assert isinstance(RNG, np.random.Generator)
995 self.RNG = RNG
996 self.entropy = self.RNG.bit_generator._seed_seq.entropy
and similarly doing that in distributions.py, etc. Or maybe defining a helper function to construct a default RNG, checking for any existing entropy setting.
There was a problem hiding this comment.
I see. I agree with you on the BinaryGenerator case. There is seems logical to define the RNG with the kwargs. However, I don't think we should propagate a kwargs check down to the rvs functions in distributions.py.
These are lower level functions and should not interface with such kwargs parameters. In our normal scenario, the rng parameters are never be not set, right? I would be okay with forcing rng to always be a parameter though, instead of setting it inside the function.
There was a problem hiding this comment.
I think that makes sense.
That's right, in the normal case the rng parameters should always be set prior. If we want to keep these lower level functions as is, I think it may be good to still set the rng within the function and keep the argument optional then (as it is right now), in case a person does not care what the RNG is. Maybe throw a warning that a default RNG is being used?
|
This PR looks good to me. @maxbriel are there any additional specific checks you'd like to carry out or help with? |
Co-authored-by: Seth Gossage <sethg45@gmail.com>
This refactors
independent_samplingto use thedistributionsandIMFsfor sampling.This additionally adds:
inverse_samplerandrejection_samplerI have tested this within the context of the unit tests. Not within initial population sampling. However, the norm_pop unit tests contain quite a few tests where the initial samplers are used. These seem to function similarly to before.
This PR does not make any functional changes.
This is the first step to solving issues related the initial sampling, including the difference between setting$m_2$ or $q$ boundaries.