Chriss4123/EvolutionaryForest

 
 

Evolutionary Forest

An open source Python library for automated feature engineering based on Genetic Programming.

Introduction

Feature engineering has long been a burden for machine learning practitioners. In recent years, deep learning techniques have significantly reduced the need for manual feature engineering. However, the features discovered by deep learning methods are difficult to interpret.

In the domain of interpretable machine learning, genetic programming has proven to be a promising method for automated feature construction, as it can improve the performance of traditional machine learning systems while maintaining similar interpretability. Nonetheless, practitioners rarely mention this method. We believe the main reason is the lack of a mature package that can automatically build features with genetic programming. We therefore propose this package, with the goal of providing a powerful feature construction tool for enhancing existing state-of-the-art machine learning algorithms, particularly decision-tree-based algorithms.
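
To make the idea of a constructed feature concrete, here is a minimal sketch of how genetic programming typically represents one: as a small expression tree over the original columns that can be both evaluated and printed. The primitive set, the tuple encoding, and the helper names (`random_tree`, `evaluate`, `to_string`, `height`) are assumptions made for illustration only; they are not this package's internals.

```python
import operator
import random

# Hypothetical primitive set, chosen only for illustration.
PRIMITIVES = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def random_tree(n_features, max_height, rng):
    """Grow a random expression tree no taller than max_height."""
    if max_height == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)  # leaf: index of an original feature
    name = rng.choice(sorted(PRIMITIVES))
    return (name,
            random_tree(n_features, max_height - 1, rng),
            random_tree(n_features, max_height - 1, rng))

def evaluate(tree, row):
    """Compute the constructed feature's value for one data row."""
    if isinstance(tree, int):
        return row[tree]
    name, left, right = tree
    return PRIMITIVES[name](evaluate(left, row), evaluate(right, row))

def to_string(tree):
    """Render the tree as a readable formula; this is what keeps it interpretable."""
    if isinstance(tree, int):
        return f"x{tree}"
    name, left, right = tree
    return f"{name}({to_string(left)}, {to_string(right)})"

def height(tree):
    """Height of the tree; a bare leaf has height 0."""
    if isinstance(tree, int):
        return 0
    return 1 + max(height(tree[1]), height(tree[2]))

rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
feature = ("add", ("mul", 0, 1), 2)  # x0 * x1 + x2
new_column = [evaluate(feature, row) for row in rows]
print(to_string(feature), new_column)

# A randomly grown candidate, as an evolutionary search would start from.
candidate = random_tree(n_features=3, max_height=3, rng=random.Random(0))
print(to_string(candidate), height(candidate))
```

A genetic programming search evolves a population of such trees, keeping those whose output columns help the downstream model most. Limiting tree height keeps the resulting formulas short enough to read.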

Features

  • A powerful feature construction tool for generating interpretable machine learning features.
  • A reliable machine learning model with strong performance on small datasets.
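
As a rough illustration of why constructed features can help decision-tree-based learners, the sketch below compares a single-split decision stump on raw features with the same stump on a product feature. The toy data, the `best_stump_accuracy` helper, and the choice of `x0 * x1` as the constructed feature are all assumptions made for this example; nothing here uses the package itself.

```python
def best_stump_accuracy(values, labels):
    """Best accuracy of a one-split decision stump on a single feature."""
    n = len(labels)
    best = 0.0
    for threshold in sorted(set(values)):
        preds = [v > threshold for v in values]
        correct = sum(p == l for p, l in zip(preds, labels))
        # Either side of the split may be labelled positive.
        best = max(best, correct / n, (n - correct) / n)
    return best

# Toy data: the label depends on the sign of the interaction x0 * x1.
points = [(-2.0, -1.0), (-1.0, 2.0), (1.0, -2.0), (2.0, 1.0),
          (-1.5, -0.5), (-0.5, 1.5), (0.5, -1.5), (1.5, 0.5)]
labels = [x0 * x1 > 0 for x0, x1 in points]

acc_x0 = best_stump_accuracy([p[0] for p in points], labels)
acc_x1 = best_stump_accuracy([p[1] for p in points], labels)
acc_product = best_stump_accuracy([p[0] * p[1] for p in points], labels)
print(acc_x0, acc_x1, acc_product)
```

No single axis-aligned split on `x0` or `x1` separates the classes, while one split on the constructed product does. A tree would need several levels to approximate what the constructed feature expresses in one.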

Installation

pip install -U evolutionary_forest

Example

An example of usage:

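from sklearn.datasets import load_diabetes
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
# Import path assumed from the package layout; adjust it if your installed version differs.
from evolutionary_forest.forest import EvolutionaryForestRegressor

X, y = load_diabetes(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
r = EvolutionaryForestRegressor(max_height=3, normalize=True, select='AutomaticLexicase',
                                gene_num=10, boost_size=100, n_gen=20, n_pop=200, cross_pb=1,
                                base_learner='Random-DT', verbose=True)
r.fit(x_train, y_train)
print(r2_score(y_test, r.predict(x_test)))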
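
The value printed by the example is the coefficient of determination, R² = 1 - SS_res / SS_tot. For readers unfamiliar with the metric, a minimal pure-Python sketch follows; the `r2` helper and the sample numbers are illustrative assumptions, not part of this package or of scikit-learn.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]
perfect = r2(y_true, y_true)        # predictions match the targets exactly
mean_only = r2(y_true, [6.0] * 4)   # always predicting the mean of y_true
print(perfect, mean_only)
```

A score of 1 means perfect predictions, 0 means no better than always predicting the mean, and negative values mean worse than that baseline.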

An example of improvements brought about by constructed features:

[Figure: Constructed Features]

Tutorials

Here are some notebook examples of using Evolutionary Forest:

Documentation

Tutorial: English Version | Chinese Version

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
