Evolutionary optimization of a CNN for MNIST digit classification — no backpropagation required.
This project evolves a population of CNNs using genetic algorithms instead of gradient descent. The evolutionary strategy includes:
- Elite selection: Top performers survive to the next generation
- Layer crossover: Random swapping of entire convolutional layers between parents
- Parameter crossover: Weighted averaging of parent parameters
- Mutation: Reinitialization of the last convolutional layers
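The four operators above can be sketched in plain NumPy, treating each network as a dict mapping layer names to weight arrays. This is an illustrative simplification, not the actual `src/evolution.py` API — the function names and population layout here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def elite_selection(population, fitnesses, n_elite=2):
    """Keep the top-n_elite individuals (highest fitness) unchanged."""
    order = np.argsort(fitnesses)[::-1]
    return [population[i] for i in order[:n_elite]]

def parameter_crossover(parent_a, parent_b, alpha=0.5):
    """Weighted average of the two parents' parameters, layer by layer."""
    return {name: alpha * parent_a[name] + (1 - alpha) * parent_b[name]
            for name in parent_a}

def layer_crossover(parent_a, parent_b):
    """Take each layer whole from one parent or the other at random."""
    return {name: (parent_a if rng.random() < 0.5 else parent_b)[name]
            for name in parent_a}

def mutate(child, layer_names, scale=0.1):
    """Reinitialize the named layers (e.g. the last conv layers) with fresh
    Gaussian weights, leaving the rest of the child untouched."""
    out = dict(child)
    for name in layer_names:
        out[name] = rng.normal(0.0, scale, size=out[name].shape)
    return out
```

A generation then consists of selecting elites, filling the rest of the population with crossover offspring, and mutating some of those offspring before re-evaluating fitness.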
The model (SimpleCNN) has 10 convolutional layers with BatchNorm and achieves ~98.9% accuracy on MNIST after 160 generations.
```
EvolutionaryOptimization/
├── src/
│   ├── __init__.py      # Package exports
│   ├── model.py         # SimpleCNN architecture
│   ├── evolution.py     # Crossover, mutation, population evolution
│   ├── evaluate.py      # Model evaluation and visualization
│   ├── utils.py         # Data loading, model save/load
│   └── logger.py        # EvolutionLogger for tracking metrics
├── train.py             # Main training script
├── Experiment_1.ipynb   # Original experiment notebook
├── requirements.txt     # Python dependencies
└── README.md
```
Install dependencies:

```
pip install -r requirements.txt
```

Run the full evolutionary training:

```
python train.py
```

Or use the notebook Experiment_1.ipynb for interactive experimentation.
After 160 generations with a population of 10:
- Best accuracy: 99.1% (generation 150)
- For the first ~100 generations, accuracy plateaus around 19% while structure-based crossover explores the search space
- A breakthrough occurs around generation 102, after which accuracy rapidly climbs to 98%+