@pFornagiel

Description

This PR introduces minor changes and bugfixes to bring the library more in line with the paper it is based on, and to address some design issues.

Referenced paper:
Jason Adair, Gabriela Ochoa, and Katherine M. Malan. 2019. Local optima networks for continuous fitness landscapes. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '19). Association for Computing Machinery, New York, NY, USA, 1407–1414. https://doi.org/10.1145/3319619.3326852

Changes made

src/lonpy/lon.py

  • Added a compute_performance_metrics method, which returns the success and deviation metrics defined in the paper, to allow for better benchmarking. As part of this change, the compute_metrics functionality was split into compute_performance_metrics and compute_network_metrics, with compute_metrics itself now returning the aggregated results of the two.

  • To be able to compute the performance metrics, added a final_run_values field to the LON class: a pandas Series tracking the final value of each sampler run, later used to compute the success metric.
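    For reference, the paper's metrics could be sketched roughly as below. Only final_run_values (a pandas Series of final fitness values per run) comes from this PR; the function name, the tolerance parameter, and the exact aggregation are hypothetical stand-ins.

    ```python
    import pandas as pd

    def performance_metrics(final_run_values: pd.Series,
                            global_optimum: float,
                            tol: float = 1e-8) -> dict:
        """Hypothetical sketch of the paper's performance metrics.

        success:   proportion of runs whose final value reached the
                   global optimum (within `tol`)
        deviation: mean absolute gap between final values and the
                   global optimum
        """
        gaps = (final_run_values - global_optimum).abs()
        return {
            "success": float((gaps <= tol).mean()),
            "deviation": float(gaps.mean()),
        }

    # three of four runs reach the optimum 0.0
    metrics = performance_metrics(pd.Series([0.0, 0.0, 0.5, 0.0]), 0.0)
    ```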

  • Improved lnodes creation in the LON class to build the dataframe in a single function call, instead of creating two separate dataframes and merging them afterwards.
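    The refactor pattern looks roughly like this; the column names are made up for illustration and are not the library's actual schema.

    ```python
    import pandas as pd

    nodes = ["a", "b"]
    fitness = [1.0, 2.0]

    # before: two frames built separately, then merged
    df_old = pd.DataFrame({"node": nodes}).merge(
        pd.DataFrame({"node": nodes, "fitness": fitness}), on="node")

    # after: one constructor call produces the same frame
    df_new = pd.DataFrame({"node": nodes, "fitness": fitness})
    ```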

  • Changed the strength metric computation in CMLON to be defined the same way as in the LON class. The issue was that the LON class normalises the strength of the global sink against all the other nodes' strengths, while the CMLON class normalised it relative to the other local sinks only, which produced inconsistent results between the classes. This change is open to discussion.
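    A toy numeric example of why the two normalisations disagreed; the node names and strength values are purely illustrative.

    ```python
    # Illustrative only: four nodes, two of which are sinks.
    strengths = {"a": 1.0, "b": 2.0, "sink1": 3.0, "global_sink": 4.0}
    sinks = {"sink1": 3.0, "global_sink": 4.0}

    # LON-style: normalise the global sink against all node strengths
    lon_norm = strengths["global_sink"] / sum(strengths.values())    # 4/10 = 0.4

    # previous CMLON-style: normalise only against the local sinks
    cmlon_norm = sinks["global_sink"] / sum(sinks.values())          # 4/7 ~ 0.571
    ```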

  • Removed the _simplify_with_edge_sum call from CMLON instance creation, since the earlier _contract_vertices call already merges the multiple edges properly.

src/lonpy/sampling.py

  • Changed the sampler stopping condition from performing self.config.n_iterations iterations in a single run to stopping once there have been self.config.n_iterations runs without an improvement, to align the implementation with the algorithm's definition in the paper.
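    The new stopping rule can be sketched as a non-improvement counter. Everything here except the n_iterations name is a hypothetical stand-in for the sampler internals, not the library's actual code.

    ```python
    def run_until_stagnation(step, x0, f0, n_iterations: int):
        """Stop after `n_iterations` consecutive steps without improvement
        (the paper's condition), instead of a fixed iteration budget."""
        best_x, best_f = x0, f0
        no_improvement = 0
        while no_improvement < n_iterations:
            x, f = step(best_x)
            if f < best_f:            # an improvement resets the counter
                best_x, best_f = x, f
                no_improvement = 0
            else:
                no_improvement += 1
        return best_x, best_f

    # toy step: halve the point, with fitness bottoming out at 1.0
    best = run_until_stagnation(lambda x: (x / 2, max(x / 2, 1.0)), 16.0, 16.0, 3)
    ```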

  • Changed the default BasinHoppingSamplerConfig values to match the paper.

  • In unbounded_perturbation added

    rounded = rounded + 0.0
    

    to avoid -0.0 values appearing at the hashing stage. This behaviour made creating LON vertices from hashes inconsistent.
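    The motivation can be demonstrated directly: np.round can produce -0.0, whose byte representation (and therefore any byte-based hash) differs from +0.0, and adding 0.0 normalises the sign. The rounding and byte-comparison details below are a general illustration, not the library's exact code.

    ```python
    import numpy as np

    x = np.round(np.array([-0.0001]), 2)   # rounds to -0.0
    y = np.array([0.0])

    assert x[0] == y[0]                     # numerically equal...
    assert x.tobytes() != y.tobytes()       # ...but different bytes, so
                                            # byte-based hashes disagree

    fixed = x + 0.0                         # IEEE 754: -0.0 + 0.0 == +0.0
    assert fixed.tobytes() == y.tobytes()
    ```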

  • Redefined the domain lists to be parsed into numpy arrays for a cleaner implementation, and changed the typing of the passed lower_bounds and upper_bounds to Sequence, to allow passing numpy arrays.
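    A minimal sketch of the typing change under these assumptions; the function name and the (2, n) layout are hypothetical, not the library's actual signature.

    ```python
    from typing import Sequence

    import numpy as np

    def make_domain(lower_bounds: Sequence[float],
                    upper_bounds: Sequence[float]) -> np.ndarray:
        """Accept lists, tuples or numpy arrays and return a (2, n) array."""
        return np.asarray([np.asarray(lower_bounds, dtype=float),
                           np.asarray(upper_bounds, dtype=float)])

    # lists and numpy arrays are both accepted as bounds
    domain = make_domain([-5.0, -5.0], np.array([5.0, 5.0]))
    ```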

.gitignore

  • Updated .gitignore to ignore the __pycache__ folders

This is more of a question, as I did not change anything in the implementation here: should we allow rounding the x's in the following part of sampling.py?

if self.config.opt_digits < 0:
    current_x = res.x
    current_f = res.fun
else:
    current_x = np.round(res.x, self.config.opt_digits)
    current_f = np.round(func(current_x), self.config.opt_digits)

The code still performs the computations at numpy's usual precision; only the results are rounded. Should it not be up to the user whether to round the final results, so that the higher precision remains available where possible?
