Stock Price Prediction with MLOps

🎯 Course Project

Objective

The goal of this project is to apply everything learned in the course to build an end-to-end machine learning system with a full MLOps workflow.

📍 Problem Statement

This project aims to build a sustainable and maintainable stock price prediction system, implementing the complete MLOps lifecycle including data collection, feature engineering, model training, experiment tracking, real-time inference, deployment, and monitoring.

Users can query predicted stock prices and historical trend charts through a web interface. Developers can periodically retrain models, track experiments, monitor performance and data drift, and trigger auto-retraining.

🧰 Technologies Used

| Category | Tools & Frameworks |
| --- | --- |
| Cloud / Infra | Docker Compose (extendable to EC2), MinIO, PostgreSQL, ClickHouse |
| ML Pipeline | FastAPI, Scikit-learn, Pandas, MLflow |
| Workflow Orchestration | Prefect 2 |
| Monitoring | Evidently + Prometheus + Grafana |
| CI/CD | GitHub Actions |
| Testing | pytest (unit + integration tests) |
| Formatting / Hooks | black, pre-commit, flake8 |
| IaC | Docker Compose + Volume + Network (extendable to Terraform) |

🏗️ Project Structure

.
├── backend/                  # Backend with API, ML logic, workflows
│   ├── api/                  # FastAPI routes (train, predict)
│   ├── src/                  # Feature engineering, model training/inference
│   ├── monitor/              # Monitoring logic using Evidently
│   ├── tasks/                # Celery async tasks
│   ├── workflows/            # Prefect ETL & training flows
│   └── tests/                # Unit & integration tests
├── frontend/                 # Frontend (Vite + React)
├── data/, db/, pgdata/       # Data and DB initialization folders
├── monitor/                  # Prometheus & Grafana configurations
├── Dockerfile.*, docker-compose.yml
├── Makefile, setup.md, implementation_log.md
├── .github/                  # GitHub Actions configuration
│   └── workflows/            # GitHub Actions CI/CD workflow
├── .pre-commit-config.yaml   # Pre-commit configuration
├── README.md

🔁 Model Lifecycle

- ETL and training pipelines are triggered regularly via Prefect
- Training results are logged to MLflow and registered as versioned models
- FastAPI serves /predict and /train APIs (Celery-supported)
- Evidently exports model drift metrics to Prometheus
- Grafana dashboards visualize prediction accuracy, drift metrics, and system metrics
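
To make the drift-export step above more concrete, the sketch below renders a gauge in the Prometheus text exposition format using only the standard library. It is an illustration, not this repository's code: the metric name `stock_feature_drift_score`, the `ticker` and `feature` labels, and the scores themselves are hypothetical, and the real service may well use a client library instead.

```python
def render_gauge(name, help_text, samples):
    """Render one gauge metric in the Prometheus text exposition format.

    `samples` maps a tuple of (label_name, label_value) pairs to a float.
    Label values are assumed to need no escaping in this sketch.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples.items():
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical drift scores, made up for illustration only.
drift_scores = {
    (("ticker", "AAPL"), ("feature", "close")): 0.12,
    (("ticker", "2330.TW"), ("feature", "volume")): 0.31,
}

exposition = render_gauge(
    "stock_feature_drift_score",
    "Drift score per ticker and feature (illustrative).",
    drift_scores,
)
print(exposition)
```

A Prometheus scrape target would serve text like this over HTTP, and Grafana would then query the stored series.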

🖥️ System Architecture (Mermaid)

graph TD
  %% ------------------- User / Frontend -------------------
  U[User Browser] -->|HTTP/WS Requests| NG[Nginx<br>Static + Reverse Proxy]

  subgraph Nginx_Proxy["Nginx Proxy"]
    NG -->|/api/predict| UP1
    NG -->|/api/train| UP2
    NG -->|/api/| UP3
    NG -->|/ws| W
    NG -->|Static files<br>/index.html, /js, /css...| Static[React Build]
  end

  %% ------------------- Upstream Pools -------------------
  subgraph Upstream_Pools["Upstream Pools"]
    direction TB
    UP1["backend_predict<br>70% to backend1<br>30% to backend2"]
    UP2["backend_train<br>30% to backend1<br>70% to backend2"]
    UP3["backend_api<br>1:1 to backend1, backend2"]
  end

  %% ------------------- Backend Containers -------------------
  subgraph Backend_API["Backend API multiple containers"]
    B1[backend1:8000]
    B2[backend2:8000]
  end

  UP1 --> B1
  UP1 --> B2
  UP2 --> B1
  UP2 --> B2
  UP3 --> B1
  UP3 --> B2

  %% ------------------- Data / ETL -------------------
  subgraph Data_ETL["Data and ETL"]
    P[Prefect Workflow<br>backend/src/workflows] -->|ETL processing| D1[(raw_db<br>PostgreSQL)]
    P -->|Cleaned data| D2[(OLAP<br>ClickHouse)]
  end

  B1 -->|Query cleaned data| D2
  B2 -->|Query cleaned data| D2
  B1 -->|Push task| E[Redis]
  B2 -->|Push task| E

  %% ------------------- Model Training -------------------
  subgraph Model_Training["Model Training & MLflow"]
    L[Celery Worker] -->|Read cleaned data| D2
    L -->|Execute training| G[Model training logic]
    G -->|Model version management| H[MLflow Registry]
    G -->|Update model metadata| D3[(mlflow-db<br>PostgreSQL)]
    H -->|Model Artifact| S[(MinIO<br>Model storage)]
    H --> D4[(mlflow internal DB<br>PostgreSQL)]
  end

  %% ------------------- Monitoring -------------------
  subgraph Monitoring["Monitoring & Real-time Push"]
    W[ws_monitor<br>Kafka Consumer + WebSocket]
    Q[metrics_publisher<br>Fetch & send to Kafka every 5s]
    N1[Kafka - prediction topic] -->|Prediction result| W
    N2[Kafka - metrics topic]
    Q --> N2
    N2 -->|Metrics| W
    J[Prometheus]
    J -->|Historical data| K[Grafana Dashboard]
  end

  %% ------------------- Async Queue -------------------
  subgraph Async_Tasks["Async Task Queue"]
    E --> |Execute| L
  end

  %% ------------------- Styles -------------------
  classDef frontend fill:#FFD966,stroke:#333,stroke-width:2px;
  classDef nginx fill:#FFB347,stroke:#333,stroke-width:2px;
  classDef upstream fill:#85C1E9,stroke:#333,stroke-width:2px;
  classDef backend fill:#ABEBC6,stroke:#333,stroke-width:2px;
  classDef db fill:#F9E79F,stroke:#333,stroke-width:2px;
  classDef cache fill:#F5B7B1,stroke:#333,stroke-width:2px;
  classDef mlflow fill:#D7BDE2,stroke:#333,stroke-width:2px;
  classDef monitoring fill:#FAD7A0,stroke:#333,stroke-width:2px;
  classDef prom fill:#D5F5E3,stroke:#333,stroke-width:2px;

  class U frontend
  class NG,Static nginx
  class UP1,UP2,UP3 upstream
  class B1,B2 backend
  class D1,D2,D3,D4,S db
  class E,L cache
  class G,H mlflow
  class W,Q,N1,N2 monitoring
  class J,K prom

Visual diagram of the Docker Compose services

graph TD
  subgraph Users
    A[Browser]
  end

  subgraph Frontend
    B[Vite + React]
  end

  subgraph Backend
    C[FastAPI API]
    D[Model Training / Inference]
    E[Celery Worker]
    F[Prefect Flows]
  end

  subgraph Storage
    G[PostgreSQL as raw_db]
    H[ClickHouse as cleaned data]
    I[MinIO as Model Artifacts]
    J[MLflow as Tracking DB]
  end

  subgraph Monitoring
    K[Prometheus]
    L[Grafana]
    M[Evidently]
  end

  subgraph Messaging
    N[Kafka]
    O[Redis]
  end

  subgraph CICD["CI/CD"]
    P[GitHub Actions]
  end

  A --> B
  B --> C
  C --> D
  D --> E
  E --> G
  E --> H
  D --> J
  D --> I
  F --> G
  F --> H
  M --> K
  K --> L
  D --> N
  M --> N
  E --> O
  P -->|CI/CD| C

📈 Evaluation Checklist

✅ Problem Definition

✔️ Well-defined scope: stock prediction + model lifecycle

☁️ Infrastructure

✔️ Docker Compose setup with multiple services
✔️ IaC-friendly (MinIO, DB volumes, Prometheus)

🔬 Experiment Tracking

✔️ MLflow for logging experiments and model versioning
- here

📅 Workflow Orchestration

✔️ Prefect 2 for ETL and training flows
- here

🚀 Model Deployment

✔️ FastAPI for model inference (containerized API)

📊 Monitoring

✔️ Evidently + Prometheus + Grafana for data/model monitoring
- docker-compose.monitor.yml
- docker-compose.kafka.yml
- Webhook notifications to Discord
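
As a rough idea of what a data drift check measures, the sketch below computes the Population Stability Index (PSI) between a reference window and a current window of prices. This is a hand-written stand-in using only the standard library, not Evidently's API, and the bin count, the sample values, and the 0.2 threshold (a commonly quoted rule of thumb) are all assumptions.

```python
import math


def population_stability_index(reference, current, bins=10):
    """Compute PSI between two samples using equal-width bins.

    Bin edges come from the reference sample; current values outside that
    range are clipped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for value in values:
            idx = int((value - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        eps = 1e-6  # floor so that empty bins do not produce log(0)
        return [max(count / len(values), eps) for count in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Made-up closing prices: a reference window and a shifted current window.
reference_prices = [100 + 0.5 * i for i in range(200)]
shifted_prices = [price + 40 for price in reference_prices]

psi_same = population_stability_index(reference_prices, reference_prices)
psi_shifted = population_stability_index(reference_prices, shifted_prices)

DRIFT_THRESHOLD = 0.2  # hypothetical alerting threshold
drift_detected = psi_shifted > DRIFT_THRESHOLD
print(psi_same, psi_shifted, drift_detected)
```

A flag like `drift_detected` is the kind of signal that could be exported as a metric, sent as a notification, or used to trigger retraining.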

🔁 Reproducibility

✔️ Makefile + setup.md + requirements + Docker for consistent setup
```
make dev-setup
```

🧪 Best Practices

- Unit tests
  - train unit test code
  - predict unit test code
- Integration tests
  - predict api test code
  - train api test code
- Code formatting (black, flake8)
  - refer to pre-commit-config.yaml
- Makefile automation
  - refer to Makefile
- Pre-commit hooks
  - refer to pre-commit-config.yaml
- GitHub Actions for CI
  - refer to .github/workflows/ci-tests.yml
  - refer to .github/workflows/cd-deploy.yml

⚙️ Installation Guide

# Create virtual environment
python -m venv .venv
source .venv/bin/activate
pip install -r backend/requirements.txt

# Start all services
docker compose up --build

# Run one-off training or the Prefect workflow
make train
make workflow

📊 Dataset

Historical stock data from the Taiwan (TW) and US markets (e.g., 2330.TW, AAPL, TSM):

- Source: Yahoo Finance
- Transformed via ETL and stored in Parquet format (see workflows/parquet/)
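
The transform step might look roughly like the sketch below, which cleans a handful of daily bars and derives a daily return. It is only an illustration under stated assumptions: the row layout (`ticker`, `date`, `close`), the function name `clean_daily_bars`, and the sample prices are hypothetical and not taken from the repository's ETL flow, and the code handles a single ticker at a time.

```python
from datetime import date


def clean_daily_bars(rows):
    """Drop rows without a close, sort by date, and add a daily return.

    Assumes all rows belong to the same ticker. The first remaining row has
    no previous close, so its daily return is None.
    """
    valid = [row for row in rows if row.get("close") is not None]
    valid.sort(key=lambda row: row["date"])

    cleaned = []
    prev_close = None
    for row in valid:
        if prev_close is None:
            daily_return = None
        else:
            daily_return = row["close"] / prev_close - 1.0
        cleaned.append({**row, "daily_return": daily_return})
        prev_close = row["close"]
    return cleaned


# Made-up rows: out of order and with one missing close.
raw_rows = [
    {"ticker": "AAPL", "date": date(2024, 1, 3), "close": 110.0},
    {"ticker": "AAPL", "date": date(2024, 1, 2), "close": 100.0},
    {"ticker": "AAPL", "date": date(2024, 1, 4), "close": None},
]

cleaned_rows = clean_daily_bars(raw_rows)
for row in cleaned_rows:
    print(row)
```

In a pandas-based pipeline, the cleaned rows could then be loaded into a DataFrame and written out with `DataFrame.to_parquet`.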

🔗 Useful Resources

📜 License

MIT License.
