
LearnMLOpsTools

A growing, hands-on study of MLOps tooling — starting with experiment tracking using Weights & Biases (W&B) and MLflow. The current examples are built around a real PyTorch FashionMNIST image classification task.

Both training scripts (one per tool) share the same model architectures, hyperparameters, and training loop so the differences come from the tracking tool, not the ML code.

Overview

|                  | WandB                         | MLflow                         |
|------------------|-------------------------------|--------------------------------|
| Script           | `train_fashionmnist_wandb.py` | `train_fashionmnist_mlflow.py` |
| Hosting          | Cloud-first (wandb.ai)        | Local-first (`mlruns/` on disk)|
| Account required | Yes (free tier available)     | No                             |
| Gradient tracking| Built-in (`wandb.watch()`)    | Manual hooks needed            |
| Model serving    | Separate product              | Built-in (`mlflow models serve`)|

Project Structure

```
├── train_fashionmnist_wandb.py   # Training with W&B logging
├── train_fashionmnist_mlflow.py  # Training with MLflow logging
├── search_mlflow_runs.py         # Query & filter MLflow runs programmatically
├── wandb_vs_mlflow.md            # Detailed side-by-side learning guide
├── medium_article.md             # Companion Medium article
├── checkpoints/                  # W&B model checkpoints (MLP & CNN)
├── checkpoints_mlflow/           # MLflow model checkpoints (MLP & CNN)
├── data/FashionMNIST/            # Dataset (auto-downloaded on first run)
├── mlruns/                       # MLflow local tracking store
└── wandb/                        # W&B local run logs
```

Models

Two architectures are included for comparison:

  • MLP — A simple fully-connected network (Flatten → 256 → 128 → 10)
  • CNN — A two-layer convolutional network (Conv2d → MaxPool → Conv2d → MaxPool → FC)
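As a rough PyTorch sketch, the two architectures could look like this. This is a hypothetical reconstruction using the README's default sizes (256/128 hidden dims, 32/64 conv channels); the actual classes in the training scripts may be named and structured differently:

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Fully-connected net: Flatten -> 256 -> 128 -> 10."""
    def __init__(self, hidden_dim_1=256, hidden_dim_2=128, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                                # 1x28x28 -> 784
            nn.Linear(28 * 28, hidden_dim_1), nn.ReLU(),
            nn.Linear(hidden_dim_1, hidden_dim_2), nn.ReLU(),
            nn.Linear(hidden_dim_2, num_classes),
        )

    def forward(self, x):
        return self.net(x)

class CNN(nn.Module):
    """Two conv blocks (Conv2d -> MaxPool) followed by a linear classifier."""
    def __init__(self, channels_1=32, channels_2=64, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, channels_1, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 28x28 -> 14x14
            nn.Conv2d(channels_1, channels_2, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(channels_2 * 7 * 7, num_classes)

    def forward(self, x):
        return self.classifier(torch.flatten(self.features(x), 1))
```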

Switch between them by setting model_name in get_config() to "mlp" or "cnn".
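A minimal sketch of what `get_config()` might look like, assuming it returns a plain dict with the defaults listed in the Configuration table below (the real function in each script may differ in shape or extra keys):

```python
def get_config():
    """Central place for hyperparameters, so both scripts stay in sync."""
    return {
        "model_name": "cnn",      # set to "mlp" for the fully-connected net
        "batch_size": 128,
        "learning_rate": 1e-3,
        "epochs": 10,
        "hidden_dim_1": 256,
        "hidden_dim_2": 128,
        "cnn_channels_1": 32,
        "cnn_channels_2": 64,
        "seed": 42,
    }

config = get_config()
config["model_name"] = "mlp"      # select the MLP instead of the CNN
```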

What Gets Logged

Both scripts log the same information using each tool's API:

  • Hyperparameters — learning rate, batch size, hidden dims, seed, etc.
  • Batch-level metrics — training loss every 100 steps
  • Epoch-level metrics — train loss, validation loss, validation accuracy
  • Best model tracking — best validation accuracy and corresponding checkpoint
  • Visual artifacts — sample predictions, misclassified examples, confusion matrix
  • Model checkpoints — best and final .pt files saved as artifacts
  • Packaged model (MLflow only) — mlflow.pytorch.log_model() for serving

Getting Started

Prerequisites

Python 3 with `pip`; all Python dependencies are installed in the step below.

Installation

```bash
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Linux/macOS
# venv\Scripts\activate   # Windows

# Install dependencies
pip install torch torchvision numpy matplotlib wandb mlflow
```

Run with Weights & Biases

```bash
# Authenticate (one-time)
wandb login

# Train
python train_fashionmnist_wandb.py
```

Results stream live to your wandb.ai dashboard.

Run with MLflow

```bash
# Train (runs are saved to ./mlruns/ automatically)
python train_fashionmnist_mlflow.py

# Launch the MLflow UI to view results
mlflow ui
# Open http://localhost:5000 in your browser
```

Search MLflow Runs Programmatically

```bash
# Start the MLflow tracking server first
mlflow server --host 0.0.0.0 --port 5000

# Query runs
python search_mlflow_runs.py
```

Key Vocabulary Map

| Concept                   | WandB                      | MLflow                        |
|---------------------------|----------------------------|-------------------------------|
| Top-level namespace       | Project                    | Experiment                    |
| Single training execution | Run                        | Run                           |
| Sub-grouping of runs      | Group                      | Tag                           |
| Hyperparameters           | Config (`wandb.config`)    | Params (`mlflow.log_params()`)|
| Time-series scalars       | `wandb.log()`              | `mlflow.log_metric()`         |
| Best/final scalar         | `wandb.run.summary`        | Last logged metric value      |
| File uploads              | `wandb.save()`             | `mlflow.log_artifact()`       |
| Packaged model            | W&B Model Registry         | `mlflow.pytorch.log_model()`  |

Configuration

Hyperparameters are defined in get_config() at the top of each script. Key options:

| Parameter        | Default | Description                    |
|------------------|---------|--------------------------------|
| `model_name`     | `"cnn"` | `"mlp"` or `"cnn"`             |
| `batch_size`     | 128     | Training batch size            |
| `learning_rate`  | 1e-3    | Adam optimizer LR              |
| `epochs`         | 10      | Number of training epochs      |
| `hidden_dim_1`   | 256     | MLP first hidden layer         |
| `hidden_dim_2`   | 128     | MLP second hidden layer        |
| `cnn_channels_1` | 32      | CNN first conv channels        |
| `cnn_channels_2` | 64      | CNN second conv channels       |
| `seed`           | 42      | Random seed for reproducibility|

Learn More

See `wandb_vs_mlflow.md` for the detailed side-by-side learning guide and `medium_article.md` for the companion article.

License

This project is for educational purposes. Feel free to use and modify it.

