A growing, hands-on study of MLOps tooling — starting with experiment tracking using Weights & Biases (W&B) and MLflow. The current examples are built around a real PyTorch FashionMNIST image classification task.
Both training scripts (one per tool) share the same model architectures, hyperparameters, and training loop so the differences come from the tracking tool, not the ML code.
| | WandB | MLflow |
|---|---|---|
| Script | `train_fashionmnist_wandb.py` | `train_fashionmnist_mlflow.py` |
| Hosting | Cloud-first (wandb.ai) | Local-first (`mlruns/` on disk) |
| Account required | Yes (free tier available) | No |
| Gradient tracking | Built-in (`wandb.watch()`) | Manual hooks needed |
| Model serving | Separate product | Built-in (`mlflow models serve`) |
```
├── train_fashionmnist_wandb.py   # Training with W&B logging
├── train_fashionmnist_mlflow.py  # Training with MLflow logging
├── search_mlflow_runs.py         # Query & filter MLflow runs programmatically
├── wandb_vs_mlflow.md            # Detailed side-by-side learning guide
├── medium_article.md             # Companion Medium article
├── checkpoints/                  # W&B model checkpoints (MLP & CNN)
├── checkpoints_mlflow/           # MLflow model checkpoints (MLP & CNN)
├── data/FashionMNIST/            # Dataset (auto-downloaded on first run)
├── mlruns/                       # MLflow local tracking store
└── wandb/                        # W&B local run logs
```
Two architectures are included for comparison:

- MLP — A simple fully-connected network (`Flatten → 256 → 128 → 10`)
- CNN — A two-layer convolutional network (`Conv2d → MaxPool → Conv2d → MaxPool → FC`)

Switch between them by setting `model_name` in `get_config()` to `"mlp"` or `"cnn"`.
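The CNN's flattened feature size can be checked with quick shape arithmetic. The sketch below assumes 3×3 convolutions with padding 1 and 2×2 max-pooling (the kernel sizes and padding are assumptions, not values taken from the scripts):

```python
def conv2d_out(size: int, kernel: int = 3, stride: int = 1, padding: int = 1) -> int:
    """Standard conv output-size formula: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

def maxpool_out(size: int, kernel: int = 2) -> int:
    """2x2 max-pooling halves each spatial dimension."""
    return size // kernel

size = 28                             # FashionMNIST images are 28x28
size = maxpool_out(conv2d_out(size))  # Conv2d -> MaxPool
size = maxpool_out(conv2d_out(size))  # Conv2d -> MaxPool
flattened = 64 * size * size          # cnn_channels_2 = 64 output channels
print(size, flattened)                # spatial size and FC input size
```

Under those assumptions the spatial size goes 28 → 14 → 7, so the final FC layer sees 64 × 7 × 7 = 3136 features.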
Both scripts log the same information using each tool's API:
- Hyperparameters — learning rate, batch size, hidden dims, seed, etc.
- Batch-level metrics — training loss every 100 steps
- Epoch-level metrics — train loss, validation loss, validation accuracy
- Best model tracking — best validation accuracy and corresponding checkpoint
- Visual artifacts — sample predictions, misclassified examples, confusion matrix
- Model checkpoints — best and final `.pt` files saved as artifacts
- Packaged model (MLflow only) — `mlflow.pytorch.log_model()` for serving
- Python 3.10+
- A WandB account (free tier) for the W&B script
```bash
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate   # Linux/macOS
# venv\Scripts\activate    # Windows

# Install dependencies
pip install torch torchvision numpy matplotlib wandb mlflow
```

```bash
# Authenticate (one-time)
wandb login

# Train
python train_fashionmnist_wandb.py
```

Results stream live to your wandb.ai dashboard.
```bash
# Train (runs are saved to ./mlruns/ automatically)
python train_fashionmnist_mlflow.py

# Launch the MLflow UI to view results
mlflow ui
# Open http://localhost:5000 in your browser
```

```bash
# Start the MLflow tracking server first
mlflow server --host 0.0.0.0 --port 5000

# Query runs
python search_mlflow_runs.py
```

| Concept | WandB | MLflow |
|---|---|---|
| Top-level namespace | Project | Experiment |
| Single training execution | Run | Run |
| Sub-grouping of runs | Group | Tag |
| Hyperparameters | Config (`wandb.config`) | Params (`mlflow.log_params()`) |
| Time-series scalars | `wandb.log()` | `mlflow.log_metric()` |
| Best/final scalar | `wandb.run.summary` | Last logged metric value |
| File uploads | `wandb.save()` | `mlflow.log_artifact()` |
| Packaged model | W&B Model Registry | `mlflow.pytorch.log_model()` |
Hyperparameters are defined in `get_config()` at the top of each script. Key options:

| Parameter | Default | Description |
|---|---|---|
| `model_name` | `"cnn"` | `"mlp"` or `"cnn"` |
| `batch_size` | `128` | Training batch size |
| `learning_rate` | `1e-3` | Adam optimizer LR |
| `epochs` | `10` | Number of training epochs |
| `hidden_dim_1` | `256` | MLP first hidden layer |
| `hidden_dim_2` | `128` | MLP second hidden layer |
| `cnn_channels_1` | `32` | CNN first conv channels |
| `cnn_channels_2` | `64` | CNN second conv channels |
| `seed` | `42` | Random seed for reproducibility |
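Putting those defaults together, `get_config()` plausibly looks something like the following. This is a sketch reconstructed from the table above, not the scripts' exact code:

```python
def get_config() -> dict:
    """Hyperparameters shared by both training scripts (defaults per the table)."""
    return {
        "model_name": "cnn",    # "mlp" or "cnn"
        "batch_size": 128,
        "learning_rate": 1e-3,  # Adam optimizer LR
        "epochs": 10,
        "hidden_dim_1": 256,    # MLP first hidden layer
        "hidden_dim_2": 128,    # MLP second hidden layer
        "cnn_channels_1": 32,   # CNN first conv channels
        "cnn_channels_2": 64,   # CNN second conv channels
        "seed": 42,             # random seed for reproducibility
    }

config = get_config()
print(config["model_name"])  # -> cnn
```

Keeping every knob in one dict is what lets both scripts pass the identical config to `wandb.init(config=...)` and `mlflow.log_params(...)`.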
- wandb_vs_mlflow.md — Comprehensive side-by-side comparison with code examples for every concept
- medium_article.md — Blog-style walkthrough of the key differences
This project is for educational purposes. Feel free to use and modify it.