Auto-Retraining Gradient Boosting Pipeline

End-to-end pipeline for periodic retraining of a gradient boosting model with:

data drift detection (Evidently)
experiment tracking and model registry (MLflow)
orchestration (Prefect 2)
PostgreSQL storage for training data and incoming data

What’s inside

src/consumer: reads data/inference.zip and writes raw rows to new_data
src/data_processing: feature engineering + validation, moves rows into train
src/retraining: Prefect flow that detects drift, trains XGBoost, logs to MLflow, and promotes the best model
src/database: DB helpers and schema
models/: baseline model artifact
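The decision logic of the retraining flow (detect drift, then promote only a better model) can be sketched in plain Python. This is a minimal illustration and not the repository's code: the actual flow relies on Evidently for drift detection and on MLflow for the registry. The stand-in below uses a population stability index (PSI) instead, and the function names, the 0.2 threshold, and the lower-is-better RMSE comparison are all assumptions.

```python
import math
from typing import Sequence


def psi(reference: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population stability index between two numeric samples.

    Stand-in for an Evidently drift report (assumption); bin edges are
    taken from the min/max range of the reference sample.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant columns

    def shares(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small epsilon keeps empty bins from producing log(0).
        return [(c + 1e-6) / (len(values) + 1e-6 * bins) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def should_retrain(reference: Sequence[float], current: Sequence[float],
                   threshold: float = 0.2) -> bool:
    """Flag drift when PSI exceeds the threshold (0.2 is a common rule of thumb)."""
    return psi(reference, current) > threshold


def should_promote(candidate_rmse: float, production_rmse: float) -> bool:
    """Promote only if the candidate beats the current model (lower RMSE assumed)."""
    return candidate_rmse < production_rmse
```

In this sketch, `reference` would hold one feature column from train and `current` the same column from new_data; a real check would repeat this per feature.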

Quick start (Docker Compose)

Create .env (or use defaults):

PG_DB=mlops
PG_USER=mlops
PG_PASSWORD=mlops

EXPERIMENT_NAME=xgb-retrain
MODEL_NAME=xgb-model

Build and start core services:

docker compose up --build db mlflow prefect

Ingest raw data into new_data:

docker compose up --build consumer

Log the first model into MLflow:

docker compose run --rm bootstrap_mlflow

Run the retraining flow once:

docker compose run --rm retraining

UIs

Prefect: http://localhost:4200
MLflow: http://localhost:5001

Environment variables

Used for DB and tracking (required unless marked optional):

PG_DB, PG_USER, PG_PASSWORD, PG_HOST, PG_PORT
EXPERIMENT_NAME, MODEL_NAME
MLFLOW_TRACKING_URI (default in compose: http://mlflow:5001)
PREFECT_API_URL (default in compose: http://prefect:4200/api)
MODEL_PATH (optional, overrides models/model.ubj)
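As a rough sketch of how a service might consume these variables, the helper below builds a PostgreSQL connection URL and resolves the model path. The function names are hypothetical, and the `localhost` and `5432` fallbacks are assumptions (5432 is PostgreSQL's standard port); the `mlops` defaults and `models/model.ubj` come from the values shown in this README.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def postgres_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a PostgreSQL URL from PG_* variables (hypothetical helper)."""
    env = os.environ if env is None else env
    user = env.get("PG_USER", "mlops")
    # Percent-encode the password so characters like '@' do not break the URL.
    password = quote(env.get("PG_PASSWORD", "mlops"), safe="")
    host = env.get("PG_HOST", "localhost")  # assumed fallback
    port = env.get("PG_PORT", "5432")       # assumed fallback
    db = env.get("PG_DB", "mlops")
    return f"postgresql://{user}:{password}@{host}:{port}/{db}"


def model_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Return MODEL_PATH if set, else the baseline artifact path."""
    env = os.environ if env is None else env
    return env.get("MODEL_PATH", "models/model.ubj")
```

Passing a mapping instead of reading `os.environ` directly keeps the helpers easy to test without touching the real environment.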

Notes

consumer reads data/inference.zip from the repo and writes to new_data.
Feature engineering also validates rows before moving them to train.
MLflow uses a local SQLite backend inside the container volume.
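The validate-then-move step mentioned above could look roughly like the following. This is only a sketch under stated assumptions: the column names are hypothetical placeholders, the rule (every required column must be a finite number) is illustrative, and the repository's actual checks in src/data_processing may differ.

```python
import math
from typing import Any, Iterable

# Hypothetical column names; the real schema lives in src/database.
REQUIRED_COLUMNS = ("feature_a", "feature_b", "target")


def is_valid(row: dict[str, Any]) -> bool:
    """Accept a row only if every required column is a finite number."""
    for col in REQUIRED_COLUMNS:
        value = row.get(col)
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if not math.isfinite(value):
            return False
    return True


def split_rows(rows: Iterable[dict[str, Any]]) -> tuple[list[dict], list[dict]]:
    """Separate rows fit for train from rows to reject."""
    valid, rejected = [], []
    for row in rows:
        (valid if is_valid(row) else rejected).append(row)
    return valid, rejected
```

Only the `valid` rows would then be written to train; keeping the rejected rows separate makes it possible to inspect what was filtered out.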

Troubleshooting

If MLflow returns Invalid Host header, ensure MLFLOW_SERVER_ALLOWED_HOSTS is set in docker-compose.yml.
If Prefect reports a client/server version mismatch, rebuild the retraining image (it pins Prefect 2.20.18).
