
ML Model Serving API with MLOps

A production-ready machine learning model serving API with comprehensive MLOps capabilities including model versioning, monitoring, and automated training pipelines.


Problem Statement

Deploying machine learning models to production is often a fragmented, fragile process that is hard to scale. Teams face:

  • Inconsistent model versions across environments
  • Lack of observability into inference performance and data drift
  • Manual retraining and deployment with no governance
  • Poor scalability under real-world traffic
  • No standardized monitoring of latency, errors, or throughput

This leads to unreliable predictions, delayed updates, and high operational overhead.


Solution

This project delivers a production-grade, end-to-end MLOps platform that enables:

  • Reliable, versioned model serving via FastAPI with support for multiple formats
  • Real-time monitoring & alerting using Prometheus and Grafana
  • Automated drift detection and performance tracking
  • CI/CD-integrated training pipelines with model registry
  • Scalable, cloud-native deployment using Docker and Kubernetes
  • Observability by design with latency, error, and throughput metrics

Result: Faster, safer, and more reliable ML in production.


Features

  • RESTful API for model predictions with FastAPI
  • Model Versioning with support for multiple model formats
  • Real-time Monitoring with Prometheus and Grafana
  • Data Drift Detection and model performance tracking (see the sketch after this list)
  • Automated Training Pipelines with model registry
  • Docker & Kubernetes ready deployment
  • CI/CD Integration with GitHub Actions
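
For the drift detection feature, one common approach is a per-feature two-sample Kolmogorov-Smirnov test against a reference dataset. The sketch below only illustrates that idea; it is not necessarily how this project implements drift detection.

# Illustrative drift check using a two-sample Kolmogorov-Smirnov test per feature.
# The project's actual drift-detection logic is not documented here and may differ.
import numpy as np
from scipy import stats

def feature_drift(reference: np.ndarray, current: np.ndarray, alpha: float = 0.05) -> list[int]:
    """Return indices of features (columns) whose distribution shifted significantly."""
    drifted = []
    for i in range(reference.shape[1]):
        _, p_value = stats.ks_2samp(reference[:, i], current[:, i])
        if p_value < alpha:
            drifted.append(i)
    return drifted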

Tech Stack

Category            Technology
API Framework       FastAPI (async/await)
Machine Learning    Scikit-learn, TensorFlow, PyTorch
Monitoring          Prometheus, Grafana
Database            PostgreSQL (async)
Storage             Local filesystem, S3, or cloud storage
Containerization    Docker, Docker Compose
Orchestration       Kubernetes manifests
CI/CD               GitHub Actions

Architecture

The system follows a microservices architecture:

  1. API Service: Handles prediction requests and model management (sketched below)
  2. Model Registry: Manages model versions and metadata
  3. Monitoring Service: Tracks predictions, errors, and data drift
  4. Training Pipeline: Automated model training and evaluation
  5. Database: Stores model metadata and prediction history
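
To make the request path concrete, here is a minimal sketch of how the API service could expose the prediction endpoint and look up a model by version. The class names and the in-memory MODELS registry are illustrative assumptions; the actual service stores model metadata in PostgreSQL and loads artifacts from disk or S3.

# Minimal, illustrative prediction endpoint; the real project layout may differ.
from typing import Dict, List

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="ML Model Serving API")

class PredictionRequest(BaseModel):
    model_version: str
    features: List[List[float]]

class PredictionResponse(BaseModel):
    model_version: str
    predictions: List[float]

# Hypothetical in-memory registry keyed by version (stand-in for the real registry).
MODELS: Dict[str, object] = {}

@app.post("/api/v1/predict", response_model=PredictionResponse)
async def predict(request: PredictionRequest) -> PredictionResponse:
    model = MODELS.get(request.model_version)
    if model is None:
        raise HTTPException(status_code=404, detail="Unknown model version")
    predictions = model.predict(request.features)  # scikit-learn style predict()
    return PredictionResponse(
        model_version=request.model_version,
        predictions=[float(p) for p in predictions],
    )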

Quick Start

Prerequisites

  • Python 3.9+
  • Docker and Docker Compose
  • PostgreSQL (for production)

Local Development

  1. Clone the repository:

    git clone https://github.com/mosesachizz/ml-model-serving.git
    cd ml-model-serving
  2. Set up environment:

    cp .env.example .env
    # Edit .env with your configuration
  3. Install dependencies:

    pip install -r requirements/dev.txt
  4. Run with Docker Compose:

    docker-compose up -d
  5. Access the services: the API is available at http://localhost:8000, with Prometheus metrics exposed at http://localhost:8000/metrics.

API Usage

Make a prediction:

curl -X POST "http://localhost:8000/api/v1/predict" -H "Content-Type: application/json" -d '{
  "model_version": "v1",
  "features": [[5.1, 3.5, 1.4, 0.2]]
}'
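
The same call can be made from Python; a short sketch using the requests library (the exact response schema is not documented above, so the printed fields are an assumption):

# Calling the prediction endpoint from Python (response field names are assumed).
import requests

payload = {
    "model_version": "v1",
    "features": [[5.1, 3.5, 1.4, 0.2]],
}
resp = requests.post("http://localhost:8000/api/v1/predict", json=payload, timeout=10)
resp.raise_for_status()
print(resp.json())  # e.g. {"model_version": "v1", "predictions": [...]}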

List available models:

curl "http://localhost:8000/api/v1/models"

Get model statistics:

curl "http://localhost:8000/api/v1/monitoring/models/v1/stats"

Model Formats Supported

  • Pickle (.pkl)
  • Joblib (.joblib)
  • TensorFlow/Keras (.h5)
  • ONNX (.onnx)
  • PyTorch (.pt) - via custom loading
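
A common way to support these formats is to dispatch on the artifact's file extension at load time. The sketch below is illustrative only and may not match the project's actual loader:

# Illustrative loader dispatch by file extension; the project's loader may differ.
from pathlib import Path

def load_model(path: str):
    suffix = Path(path).suffix.lower()
    if suffix in {".pkl", ".pickle"}:
        import pickle
        with open(path, "rb") as f:
            return pickle.load(f)
    if suffix == ".joblib":
        import joblib
        return joblib.load(path)
    if suffix == ".h5":
        from tensorflow import keras
        return keras.models.load_model(path)
    if suffix == ".onnx":
        import onnxruntime as ort
        return ort.InferenceSession(path)
    if suffix == ".pt":
        import torch
        return torch.load(path, map_location="cpu")  # custom deserialization may be needed
    raise ValueError(f"Unsupported model format: {suffix}")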

Monitoring and Metrics

The API exposes Prometheus metrics at /metrics:

  • model_predictions_total: Total predictions count
  • model_prediction_latency_seconds: Prediction latency histogram
  • model_prediction_errors_total: Error counts by type
  • model_throughput_predictions_per_second: Real-time throughput
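
As a rough illustration, metrics with these names could be declared with the prometheus_client library as follows. The label names are assumptions, and note that prometheus_client appends the _total suffix to Counters when exporting:

# Sketch of how the exposed metrics might be declared with prometheus_client.
import time

from prometheus_client import Counter, Gauge, Histogram

PREDICTIONS = Counter(
    "model_predictions", "Total predictions count", ["model_version"]
)
PREDICTION_LATENCY = Histogram(
    "model_prediction_latency_seconds", "Prediction latency histogram", ["model_version"]
)
PREDICTION_ERRORS = Counter(
    "model_prediction_errors", "Error counts by type", ["model_version", "error_type"]
)
THROUGHPUT = Gauge(
    "model_throughput_predictions_per_second", "Real-time throughput", ["model_version"]
)

def observe_prediction(model_version: str, started_at: float) -> None:
    """Record one successful prediction and its latency."""
    PREDICTIONS.labels(model_version=model_version).inc()
    PREDICTION_LATENCY.labels(model_version=model_version).observe(time.time() - started_at)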

Training Pipeline

The training pipeline includes:

  1. Data validation and preprocessing
  2. Model training with hyperparameter tuning
  3. Model evaluation and validation
  4. Model registration and versioning
  5. Automated testing and deployment

Run the training pipeline:

python -m training_pipeline.pipeline
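
For orientation, a compact version of such a pipeline might look like the sketch below. The dataset, estimator, quality gate, and filesystem registry are stand-ins, not the project's actual implementation:

# Compact, illustrative pipeline: validate -> train (with tuning) -> evaluate -> register.
# The real pipeline in training_pipeline/ may differ.
import json
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

def run_pipeline(registry_dir: str = "model_registry", version: str = "v1") -> None:
    # 1. Data validation and preprocessing (placeholder dataset).
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # 2. Model training with hyperparameter tuning.
    search = GridSearchCV(
        RandomForestClassifier(random_state=42),
        param_grid={"n_estimators": [100, 200], "max_depth": [None, 5]},
        cv=3,
    )
    search.fit(X_train, y_train)

    # 3. Model evaluation and validation (simple accuracy gate).
    accuracy = float(accuracy_score(y_test, search.best_estimator_.predict(X_test)))
    if accuracy < 0.9:
        raise RuntimeError(f"Model below quality gate: accuracy={accuracy:.3f}")

    # 4. Model registration and versioning (filesystem registry as a stand-in).
    out = Path(registry_dir) / version
    out.mkdir(parents=True, exist_ok=True)
    joblib.dump(search.best_estimator_, out / "model.joblib")
    (out / "metadata.json").write_text(json.dumps({"accuracy": accuracy, **search.best_params_}))

if __name__ == "__main__":
    run_pipeline()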

Deployment

Kubernetes

Deploy to Kubernetes:

kubectl apply -f kubernetes/

Cloud Deployment

The application can be deployed to:

  • AWS ECS/EKS
  • Google Cloud Run/GKE
  • Azure Container Instances/AKS
  • Heroku with Docker

CI/CD Pipeline

GitHub Actions workflows included:

  • CI: Run tests on pull requests
  • CD: Deploy to staging/production
  • Training: Automated model training pipeline
  • Monitoring: Data drift detection and alerts

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests and ensure they pass
  5. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.
