An intelligent machine learning system with automated MLOps pipeline for detecting gambling-related content in user comments using natural language processing techniques.
This project implements a complete MLOps pipeline using GitHub Actions that automatically handles the entire machine learning lifecycle:
| Stage | Description | Tools Used |
|---|---|---|
| Environment Setup | Python 3.9, dependency installation | setup-python@v4, pip |
| Data Management | Auto-generate dummy data if missing | Custom Python script |
| Model Training | Train ML model with the latest data (see the sketch below) | scikit-learn, joblib |
| Model Evaluation | Generate performance metrics and reports | Custom evaluation script |
| Artifact Management | Version control for model files | Git commits, upload-artifact@v4 |
| Auto Deployment | Deploy to Hugging Face Spaces | Git LFS, HF Spaces API |
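The Model Training stage above is handled by `src/train.py`. As a rough illustration only, here is a minimal sketch of what that step could look like, assuming a TF-IDF vectorizer, a logistic-regression classifier, and `text`/`is_gambling` columns in `data/comments.csv` (the actual script may differ):

```python
# Hypothetical training sketch: TF-IDF + LogisticRegression and the column names
# ("text", "is_gambling") are assumptions, not necessarily what the pipeline uses.
import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Load the training dataset (auto-generated by the pipeline if missing)
df = pd.read_csv("data/comments.csv")

# Fit the text vectorizer and the classifier
vectorizer = TfidfVectorizer(max_features=5000)
X = vectorizer.fit_transform(df["text"])
model = LogisticRegression(max_iter=1000)
model.fit(X, df["is_gambling"])

# Persist both artifacts so later stages can version and deploy them
joblib.dump(model, "model/saved_model.joblib")
joblib.dump(vectorizer, "model/vectorizer.joblib")
```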
The pipeline is triggered in two ways:
- Automatic: every push to the `main` branch
- Manual: workflow dispatch for on-demand runs
```
gamble-comment-detector/
├── 📁 .github/
│   └── workflows/
│       └── pipeline.yml           # Pipeline Configuration
├── 📁 app/
│   ├── __init__.py
│   └── main.py                    # FastAPI application
├── 📁 data/
│   └── comments.csv               # Training dataset
├── 📁 model/
│   ├── eval_report.json           # Automated evaluation metrics
│   ├── saved_model.joblib         # Auto-generated ML model
│   └── vectorizer.joblib          # Auto-generated text vectorizer
├── 📁 notebooks/
│   └── Baseline.ipynb             # Jupyter notebook for experimentation
├── 📁 scripts/
│   └── generate_dummy_data.py     # Auto data generation
├── 📁 src/
│   ├── __init__.py
│   ├── evaluate.py                # Automated model evaluation
│   ├── inference.py               # Prediction logic
│   ├── preprocessing.py           # Data preprocessing
│   └── train.py                   # Automated model training
├── app.py                         # Main application entry point
├── requirements.txt               # Python dependencies
└── README.md                      # This file
```
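For orientation, `src/inference.py` is where the saved artifacts are loaded and applied to new text. The following is only a sketch, assuming the saved model exposes `predict_proba` and a 0.5 decision threshold; the real module may differ:

```python
# Hypothetical sketch of the prediction logic; file paths follow the project tree,
# while the label convention and 0.5 threshold are assumptions.
import joblib

# Load the artifacts produced by the training stage
model = joblib.load("model/saved_model.joblib")
vectorizer = joblib.load("model/vectorizer.joblib")

def predict(text: str) -> dict:
    """Return a gambling verdict with a confidence score for one comment."""
    features = vectorizer.transform([text])
    confidence = float(model.predict_proba(features)[0][1])
    return {"is_gambling": confidence >= 0.5, "confidence": round(confidence, 2)}
```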
| Method | Endpoint | Description |
|---|---|---|
| POST | `/predict` | Analyze a single comment |
| POST | `/predict/batch` | Analyze multiple comments |
| GET | `/health` | Health check endpoint |
| GET | `/model/info` | Get model information and version |
Single Prediction:

```
POST /predict
{
  "text": "I just won big at the casino last night!"
}
```

Response:

```json
{
  "is_gambling": true,
  "confidence": 0.87,
  "model_version": "v1.2.3",
  "timestamp": "2024-01-15T10:30:00Z"
}
```
The automated pipeline (`pipeline.yml`) includes:
```
# Key Pipeline Steps
1. Environment Setup (Python 3.9)
2. Dependency Installation
3. Automated Model Training
4. Performance Evaluation
5. Model Artifact Versioning
6. Auto-Deployment to HF Spaces
```

For the pipeline to work, set these GitHub repository secrets:
- `HF_TOKEN`: Hugging Face API token for deployment
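To make the deployment step concrete, here is a hedged sketch of how the repository could be pushed to a Space with the `huggingface_hub` client; the Space id is a placeholder, and the actual workflow may deploy via Git LFS instead:

```python
# Hypothetical deployment helper; "your-username/gamble-comment-detector" is a
# placeholder Space id, and the real pipeline may push via Git LFS instead.
import os
from huggingface_hub import HfApi

api = HfApi(token=os.environ["HF_TOKEN"])  # uses the HF_TOKEN repository secret
api.upload_folder(
    folder_path=".",
    repo_id="your-username/gamble-comment-detector",
    repo_type="space",
    ignore_patterns=[".git/*", "notebooks/*"],
)
```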
```bash
# Run tests
pytest tests/ -v

# Code formatting
black src/ app/
flake8 src/ app/

# Generate dummy data
python scripts/generate_dummy_data.py

# Test pipeline locally (requires Act)
act -j train-and-deploy
```

We welcome contributions! The automated pipeline will test your changes automatically.
- Fork the repository
- Create a feature branch (`git checkout -b feature/-------`)
- Commit your changes (`git commit -m '-----------'`)
- Push to the branch (`git push origin feature/---------`)
- Open a Pull Request
- ✅ Basic ML model implementation
- ✅ FastAPI REST API
- ✅ Automated MLOps pipeline with GitHub Actions
- ✅ Auto-deployment to Hugging Face Spaces
- ✅ Model evaluation metrics and reporting
- 🔄 A/B testing framework
- 🔄 Model performance monitoring dashboard
- 🔄 Advanced deep learning models
- 🔄 Multi-language support
- 🔄 Docker containerization
- 🔄 Kubernetes deployment
Try the live model: Hugging Face Spaces
Updated automatically with every model improvement!
This project is licensed under the MIT License - see the LICENSE file for details.
- Thanks to all contributors who helped build this project
- Built with FastAPI, scikit-learn, GitHub Actions, and Hugging Face
- Found a bug? Open an issue
- Have a feature request? Start a discussion
- Pipeline Issues? Check Actions tab
⭐ Don't forget to star this repository if you found it helpful!