
LearnMLOpsTools

A growing, hands-on study of MLOps tooling — starting with experiment tracking using Weights & Biases (W&B) and MLflow. The current examples are built around a real PyTorch FashionMNIST image classification task.

Both training scripts (one per tool) share the same model architectures, hyperparameters, and training loop so the differences come from the tracking tool, not the ML code.

Overview

|                  | WandB                         | MLflow                         |
|------------------|-------------------------------|--------------------------------|
| Script           | `train_fashionmnist_wandb.py` | `train_fashionmnist_mlflow.py` |
| Hosting          | Cloud-first (wandb.ai)        | Local-first (`mlruns/` on disk)|
| Account required | Yes (free tier available)     | No                             |
| Gradient tracking| Built-in (`wandb.watch()`)    | Manual hooks needed            |
| Model serving    | Separate product              | Built-in (`mlflow models serve`)|

Project Structure

```
├── train_fashionmnist_wandb.py   # Training with W&B logging
├── train_fashionmnist_mlflow.py  # Training with MLflow logging
├── search_mlflow_runs.py         # Query & filter MLflow runs programmatically
├── wandb_vs_mlflow.md            # Detailed side-by-side learning guide
├── medium_article.md             # Companion Medium article
├── checkpoints/                  # W&B model checkpoints (MLP & CNN)
├── checkpoints_mlflow/           # MLflow model checkpoints (MLP & CNN)
├── data/FashionMNIST/            # Dataset (auto-downloaded on first run)
├── mlruns/                       # MLflow local tracking store
└── wandb/                        # W&B local run logs
```

Models

Two architectures are included for comparison:

  • MLP — A simple fully-connected network (Flatten → 256 → 128 → 10)
  • CNN — A two-layer convolutional network (Conv2d → MaxPool → Conv2d → MaxPool → FC)
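As a rough PyTorch sketch, the two architectures could look like this. This is a hypothetical reconstruction using the README's default sizes (256/128 hidden dims, 32/64 conv channels); the actual classes in the training scripts may be named and structured differently:

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Fully-connected net: Flatten -> 256 -> 128 -> 10."""
    def __init__(self, hidden_dim_1=256, hidden_dim_2=128, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                                # 1x28x28 -> 784
            nn.Linear(28 * 28, hidden_dim_1), nn.ReLU(),
            nn.Linear(hidden_dim_1, hidden_dim_2), nn.ReLU(),
            nn.Linear(hidden_dim_2, num_classes),
        )

    def forward(self, x):
        return self.net(x)

class CNN(nn.Module):
    """Two conv blocks (Conv2d -> MaxPool) followed by a linear classifier."""
    def __init__(self, channels_1=32, channels_2=64, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, channels_1, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 28x28 -> 14x14
            nn.Conv2d(channels_1, channels_2, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(channels_2 * 7 * 7, num_classes)

    def forward(self, x):
        return self.classifier(torch.flatten(self.features(x), 1))
```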

Switch between them by setting model_name in get_config() to "mlp" or "cnn".
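A minimal sketch of what `get_config()` might look like, assuming it returns a plain dict with the defaults listed in the Configuration table below (the real function in each script may differ in shape or extra keys):

```python
def get_config():
    """Central place for hyperparameters, so both scripts stay in sync."""
    return {
        "model_name": "cnn",      # set to "mlp" for the fully-connected net
        "batch_size": 128,
        "learning_rate": 1e-3,
        "epochs": 10,
        "hidden_dim_1": 256,
        "hidden_dim_2": 128,
        "cnn_channels_1": 32,
        "cnn_channels_2": 64,
        "seed": 42,
    }

config = get_config()
config["model_name"] = "mlp"      # select the MLP instead of the CNN
```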

What Gets Logged

Both scripts log the same information using each tool's API:

  • Hyperparameters — learning rate, batch size, hidden dims, seed, etc.
  • Batch-level metrics — training loss every 100 steps
  • Epoch-level metrics — train loss, validation loss, validation accuracy
  • Best model tracking — best validation accuracy and corresponding checkpoint
  • Visual artifacts — sample predictions, misclassified examples, confusion matrix
  • Model checkpoints — best and final .pt files saved as artifacts
  • Packaged model (MLflow only) — mlflow.pytorch.log_model() for serving

Getting Started

Prerequisites

Python 3 with `pip`; all Python dependencies are installed in the step below.

Installation

```bash
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Linux/macOS
# venv\Scripts\activate   # Windows

# Install dependencies
pip install torch torchvision numpy matplotlib wandb mlflow
```

Run with Weights & Biases

```bash
# Authenticate (one-time)
wandb login

# Train
python train_fashionmnist_wandb.py
```

Results stream live to your wandb.ai dashboard.

Run with MLflow

```bash
# Train (runs are saved to ./mlruns/ automatically)
python train_fashionmnist_mlflow.py

# Launch the MLflow UI to view results
mlflow ui
# Open http://localhost:5000 in your browser
```

Search MLflow Runs Programmatically

```bash
# Start the MLflow tracking server first
mlflow server --host 0.0.0.0 --port 5000

# Query runs
python search_mlflow_runs.py
```

Key Vocabulary Map

| Concept                   | WandB                      | MLflow                        |
|---------------------------|----------------------------|-------------------------------|
| Top-level namespace       | Project                    | Experiment                    |
| Single training execution | Run                        | Run                           |
| Sub-grouping of runs      | Group                      | Tag                           |
| Hyperparameters           | Config (`wandb.config`)    | Params (`mlflow.log_params()`)|
| Time-series scalars       | `wandb.log()`              | `mlflow.log_metric()`         |
| Best/final scalar         | `wandb.run.summary`        | Last logged metric value      |
| File uploads              | `wandb.save()`             | `mlflow.log_artifact()`       |
| Packaged model            | W&B Model Registry         | `mlflow.pytorch.log_model()`  |

Configuration

Hyperparameters are defined in get_config() at the top of each script. Key options:

| Parameter        | Default | Description                    |
|------------------|---------|--------------------------------|
| `model_name`     | `"cnn"` | `"mlp"` or `"cnn"`             |
| `batch_size`     | 128     | Training batch size            |
| `learning_rate`  | 1e-3    | Adam optimizer LR              |
| `epochs`         | 10      | Number of training epochs      |
| `hidden_dim_1`   | 256     | MLP first hidden layer         |
| `hidden_dim_2`   | 128     | MLP second hidden layer        |
| `cnn_channels_1` | 32      | CNN first conv channels        |
| `cnn_channels_2` | 64      | CNN second conv channels       |
| `seed`           | 42      | Random seed for reproducibility|

Learn More

See `wandb_vs_mlflow.md` for the detailed side-by-side learning guide and `medium_article.md` for the companion article.

License

This project is for educational purposes. Feel free to use and modify it.

