feat: minor adjustments and fixes made to align with paper #9
Description
This PR introduces minor changes and bug fixes to bring the library more in line with the paper it is based on, and to address some design issues.
Referenced paper:
Jason Adair, Gabriela Ochoa, and Katherine M. Malan. 2019. Local optima networks for continuous fitness landscapes. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '19). Association for Computing Machinery, New York, NY, USA, 1407–1414. https://doi.org/10.1145/3319619.3326852
Changes made
src/lonpy/lon.py
- Added the `compute_performance_metrics` method, which returns the `success` and `deviation` metrics as defined in the paper, to allow for better benchmarking. The change also splits the functionality of `compute_metrics` into `compute_performance_metrics` and `compute_network_metrics`; `compute_metrics` itself now returns the aggregate results of the two.
- To make the performance metrics computable, added a `final_run_values` field to the `LON` class: a pandas Series tracking the final value of each sampler run. It is later used to compute the `success` metric.
- Improved `lnodes` creation in the `LON` class so that the dataframe is built in a single function call, instead of creating two separate dataframes and merging them later.
- Changed the `strength` metric computation in `CMLON` to be defined the same way as in the `LON` class. The issue was that the `LON` class normalises the strength of the global sink relative to the strengths of all other nodes, while the `CMLON` class normalised it relative to the other local sinks, which produced inconsistent results between the classes. This change is up for discussion.
- Removed the `_simplify_with_edge_sum` call from `CMLON` instance creation, as the `_contract_vertices` call made earlier already merges the multiple edges properly.

src/lonpy/sampling.py
- Changed the sampler stopping condition: instead of performing `self.config.n_iterations` iterations in a single run, a run now stops once there have been `self.config.n_iterations` iterations without an improvement, to align the implementation with the definition of the algorithm in the paper.
- Changed the default `BasinHoppingSamplerConfig` values to match the paper.
- In `unbounded_perturbation`, added handling to avoid `-0.0` points appearing during hashing. This behaviour made creating `LON` vertices from hashes inconsistent.
- `domain` lists are now parsed to numpy arrays for a cleaner implementation, and the typings of the passed `lower_bounds` and `upper_bounds` were changed to `Sequence` to allow for passing numpy arrays.

.gitignore
- Added `__pycache__` folders.

This is more of a question, as I did not change anything in the implementation here, but should we allow for rounding the `x`'s in the following part of `sampling.py`?

The code still performs its computations at the usual numpy precision; only the results are rounded. Should it not be up to the user whether to round the end results, so that higher-precision calculations are available where possible?
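For reviewers, the stopping condition described above for `sampling.py` can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `run_until_stagnation`, `objective` and `perturb` are hypothetical names, the local minimisation step of basin hopping is omitted, and it assumes the stagnation counter resets whenever an improvement is found.

```python
import random


def run_until_stagnation(objective, perturb, x0, n_iterations, rng):
    """Hypothetical sketch: stop after n_iterations iterations without improvement."""
    best_x, best_f = x0, objective(x0)
    stagnant = 0  # iterations since the last improvement
    total = 0     # total iterations performed in this run
    while stagnant < n_iterations:
        candidate = perturb(best_x, rng)
        f = objective(candidate)
        total += 1
        if f < best_f:
            # Improvement found: accept it and reset the counter (assumption).
            best_x, best_f = candidate, f
            stagnant = 0
        else:
            stagnant += 1
    return best_x, best_f, total


# Usage on a toy 1-D problem (illustrative only).
rng = random.Random(0)
x, f, total = run_until_stagnation(
    objective=lambda v: v * v,
    perturb=lambda v, r: v + r.uniform(-0.5, 0.5),
    x0=3.0,
    n_iterations=20,
    rng=rng,
)
```

Compared with a fixed iteration budget, the run length now depends on the landscape: a run always performs at least `n_iterations` iterations and may perform many more if improvements keep arriving.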
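The `-0.0` hashing inconsistency mentioned above can be reproduced in a few lines. This is a minimal sketch under stated assumptions: `hash_point` and its string format are hypothetical and not the hashing used in lonpy, and adding `0.0` is just one common way to turn negative zero into positive zero, not necessarily what this PR does.

```python
import numpy as np


def hash_point(x, decimals, normalise_zero):
    """Hypothetical hash: round coordinates and join them into a string."""
    rounded = np.round(np.asarray(x, dtype=float), decimals)
    if normalise_zero:
        # Under IEEE 754 round-to-nearest, -0.0 + 0.0 == +0.0, so the sign is dropped.
        rounded = rounded + 0.0
    return "_".join(f"{v:.{decimals}f}" for v in rounded)


# Two points that round to the same location, approached from opposite sides of zero.
a = [-0.0004, 1.0]
b = [0.0004, 1.0]

# Without normalisation, the rounded -0.0 keeps its sign and the hashes differ.
raw_a = hash_point(a, 2, normalise_zero=False)
raw_b = hash_point(b, 2, normalise_zero=False)

# With normalisation, both points map to the same hash.
norm_a = hash_point(a, 2, normalise_zero=True)
norm_b = hash_point(b, 2, normalise_zero=True)
```

In this sketch the two raw hashes differ only by the sign of the zero, which would split one local optimum into two vertices, while the normalised hashes agree.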