This project explores and visualizes the internal mechanics of widely used optimization algorithms in machine learning. By visualizing how each optimizer behaves during training, it aims to help students and ML practitioners understand how the algorithms differ and when each one is a good fit.

The optimizers covered are:
- Stochastic Gradient Descent (SGD) with Momentum
- RMSProp
- Adadelta
- AdaGrad
- Adam
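
As a rough illustration of how these five optimizers differ, the sketch below writes each update rule as a small NumPy function that returns the parameter change for one step. It is not taken from the project's notebooks: the function names, the `run` helper, the quadratic test surface, and all hyperparameter values are assumptions chosen for readability.

```python
import numpy as np

# Each step function takes the gradient g, a mutable state dict, and the
# 1-based step index t, and returns the parameter update (delta).
# All hyperparameter defaults are illustrative assumptions.

def sgd_momentum(g, s, t, lr=0.01, beta=0.9):
    s["v"] = beta * s.get("v", 0.0) + g          # velocity accumulates gradients
    return -lr * s["v"]

def adagrad(g, s, t, lr=0.5, eps=1e-8):
    s["G"] = s.get("G", 0.0) + g**2              # running sum of squared gradients
    return -lr * g / (np.sqrt(s["G"]) + eps)

def rmsprop(g, s, t, lr=0.05, rho=0.9, eps=1e-8):
    s["Eg2"] = rho * s.get("Eg2", 0.0) + (1 - rho) * g**2   # decaying average
    return -lr * g / (np.sqrt(s["Eg2"]) + eps)

def adadelta(g, s, t, rho=0.95, eps=1e-6):
    s["Eg2"] = rho * s.get("Eg2", 0.0) + (1 - rho) * g**2
    # Ratio of RMS of past updates to RMS of gradients replaces the learning rate.
    dx = -np.sqrt(s.get("Edx2", 0.0) + eps) / np.sqrt(s["Eg2"] + eps) * g
    s["Edx2"] = rho * s.get("Edx2", 0.0) + (1 - rho) * dx**2
    return dx

def adam(g, s, t, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    s["m"] = b1 * s.get("m", 0.0) + (1 - b1) * g       # first-moment estimate
    s["v"] = b2 * s.get("v", 0.0) + (1 - b2) * g**2    # second-moment estimate
    m_hat = s["m"] / (1 - b1**t)                       # bias correction
    v_hat = s["v"] / (1 - b2**t)
    return -lr * m_hat / (np.sqrt(v_hat) + eps)

def run(step_fn, grad_fn, x0, steps=500):
    """Apply step_fn repeatedly and return the visited points as an array."""
    x = np.asarray(x0, dtype=float)
    state, path = {}, [x.copy()]
    for t in range(1, steps + 1):
        x = x + step_fn(grad_fn(x), state, t)
        path.append(x.copy())
    return np.array(path)

# Hypothetical test surface: an elongated quadratic bowl with minimum at (0, 0).
def loss(p):
    return p[0] ** 2 + 10 * p[1] ** 2

def grad(p):
    return np.array([2 * p[0], 20 * p[1]])

optimizers = {
    "sgd_momentum": sgd_momentum,
    "adagrad": adagrad,
    "rmsprop": rmsprop,
    "adadelta": adadelta,
    "adam": adam,
}
paths = {name: run(fn, grad, [-4.0, 3.0]) for name, fn in optimizers.items()}
for name, path in paths.items():
    print(f"{name:>12}: final loss = {loss(path[-1]):.3e}")
```

Keeping every rule behind the same `(g, state, t)` signature means one loop can drive all five, which is convenient when recording paths for plotting. How quickly each optimizer converges here depends heavily on the assumed hyperparameters, so the printed numbers should not be read as a ranking.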
Each optimizer is analyzed and compared based on:
- Update rules and formulas
- Handling of gradients (magnitude, direction, and adaptation)
- Performance on convex and non-convex functions
- Convergence speed and stability
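
One possible way to put numbers on the last two criteria is sketched below. It counts the steps an optimizer needs to push the loss under a tolerance (speed) and how many steps increased the loss (a crude stability signal), on one convex and one non-convex surface. The helper `convergence_stats`, both test surfaces, the starting points, and the hyperparameters are assumptions for illustration, and only plain gradient descent and momentum are included to keep the sketch short.

```python
import numpy as np

def gd_step(g, state, lr=0.01):
    """Plain gradient descent: move against the gradient."""
    return -lr * g

def momentum_step(g, state, lr=0.01, beta=0.9):
    """Gradient descent with momentum: accumulate a velocity, then step."""
    state["v"] = beta * state.get("v", 0.0) + g
    return -lr * state["v"]

def convergence_stats(step_fn, loss_fn, grad_fn, x0, tol=1e-6, max_steps=5000):
    """Return (steps needed to reach loss < tol, or None; number of loss increases)."""
    x = np.asarray(x0, dtype=float)
    state, prev, increases = {}, loss_fn(x), 0
    for t in range(1, max_steps + 1):
        x = x + step_fn(grad_fn(x), state)
        cur = loss_fn(x)
        increases += int(cur > prev)   # count steps where the loss went up
        prev = cur
        if cur < tol:
            return t, increases
    return None, increases

# Hypothetical surfaces, both with minimum loss 0:
#   convex bowl:  x^2 + 10 y^2
#   double well:  (x^2 - 1)^2 / 4 + y^2 / 2, non-convex with a saddle at (0, 0)
surfaces = {
    "convex bowl": (
        lambda p: p[0] ** 2 + 10 * p[1] ** 2,
        lambda p: np.array([2 * p[0], 20 * p[1]]),
        [-4.0, 3.0],
    ),
    "double well": (
        lambda p: 0.25 * (p[0] ** 2 - 1) ** 2 + 0.5 * p[1] ** 2,
        lambda p: np.array([p[0] * (p[0] ** 2 - 1), p[1]]),
        [0.01, 1.0],   # starts close to the saddle's attracting direction
    ),
}

results = {}
for surface, (loss_fn, grad_fn, x0) in surfaces.items():
    for name, step_fn in {"gd": gd_step, "momentum": momentum_step}.items():
        results[surface, name] = convergence_stats(step_fn, loss_fn, grad_fn, x0)
        print(f"{surface:>12} | {name:>8} | steps, increases = {results[surface, name]}")
```

The count of loss increases is only a rough proxy: momentum typically overshoots and oscillates, so a non-zero count there does not by itself mean the run is diverging.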
The project includes:
- 2D and 3D contour plots of loss surfaces showing optimization paths
- Animations showing step-by-step movement of optimizers
- Visualizations of how optimizers navigate challenging features like saddle points
- Comparison plots of loss vs. epoch
- Visual demonstrations of optimizer behavior on various loss surfaces
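
For readers who want a feel for the plots listed above before opening the notebook, here is a minimal Matplotlib sketch of a contour plot with optimizer paths next to a loss-vs-step comparison. It is an assumed stand-in, not the notebook's code: the double-well surface (which has a saddle point at the origin), the `descend` helper, the starting point, the hyperparameters, and the output filename are all hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical non-convex surface: minima at (+1, 0) and (-1, 0), saddle at (0, 0).
def loss(p):
    x, y = p
    return 0.25 * (x**2 - 1) ** 2 + 0.5 * y**2

def grad(p):
    x, y = p
    return np.array([x * (x**2 - 1), y])

def descend(x0, lr, beta, steps=150):
    """Gradient descent with momentum; beta=0 gives plain gradient descent."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    path = [x.copy()]
    for _ in range(steps):
        v = beta * v + grad(x)
        x = x - lr * v
        path.append(x.copy())
    return np.array(path)

# Both runs start near the direction that leads straight into the saddle.
paths = {
    "GD": descend([0.05, 1.5], lr=0.05, beta=0.0),
    "GD + momentum": descend([0.05, 1.5], lr=0.05, beta=0.9),
}

# Evaluate the loss on a grid for the contour plot.
xs, ys = np.meshgrid(np.linspace(-2, 2, 200), np.linspace(-2, 2, 200))
zs = loss((xs, ys))

fig, (ax_path, ax_loss) = plt.subplots(1, 2, figsize=(11, 4.5))
ax_path.contour(xs, ys, zs, levels=30, cmap="viridis")
for name, path in paths.items():
    ax_path.plot(path[:, 0], path[:, 1], marker="o", markersize=2, label=name)
    ax_loss.plot([loss(p) for p in path], label=name)
ax_path.set(title="Optimizer paths", xlabel="x", ylabel="y")
ax_loss.set(title="Loss vs. step", xlabel="step", ylabel="loss", yscale="log")
ax_path.legend()
ax_loss.legend()
fig.tight_layout()
fig.savefig("optimizer_paths_sketch.png")  # hypothetical output filename
```

Recording the full path as an array is what makes both panels possible from a single run; the same arrays could also feed a frame-by-frame animation.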
Repository structure:
optimizer-visualization/
│
├── visualizations.ipynb - Main notebook with all visualizations and experiments
├── notebooks/ - Individual optimizer exploration notebooks
├── optimizer_paths.gif - 2D animation of optimizer paths on loss surface
├── optimizer_paths_3d.gif - 3D visualization of optimizer behavior
├── optimizer_paths_saddle_point.gif - Visualization of saddle point navigation
├── optimizers_3d.mp4 - Video demonstration of 3D optimization paths
├── prd.md - Product Requirements Document
└── README.md - This file
The project uses:
- Python for all implementations
- NumPy and PyTorch/TensorFlow for numerical experiments
- Matplotlib/Seaborn for static visualizations
- Plotly/ipywidgets for interactive visualizations
To get started:
- Clone this repository
- Open and run the `visualizations.ipynb` notebook to see all optimizer comparisons
- Explore the individual optimizer notebooks in the `notebooks` directory for deeper insights
References:
- Diederik P. Kingma and Jimmy Ba. Adam: A Method for Stochastic Optimization (2014)
- Stanford CS231n Notes
- Distill.pub – Visualizing Optimization Algorithms
- PyTorch & TensorFlow official documentation
Authors:
- Kien Tran
- Ken Lam

