🚀 Advanced Sentiment Analyzer - From Research to Production

🎯 From Yelp Review Analysis to Enterprise-Grade Multi-Model AI System

A complete journey from academic research to production-ready sentiment analysis with 4 AI models working in harmony

🌐 Live Demo • 🤖 Original Model • 📖 API Docs • 🚀 Deploy Now

🌟 Project Evolution Story

📚 Phase 1: Research Foundation (Original Project)

This project began as an academic research endeavor to build a sentiment analysis model using pre-trained Language Models (LLMs) for classifying Yelp restaurant reviews into three sentiment categories: Positive, Neutral, and Negative.

🎯 Original Objectives:

Fine-tune DistilBERT on Yelp Open Dataset
Optimize hyperparameters using Optuna
Achieve production-quality sentiment classification
Deploy to Hugging Face Hub

🚀 Phase 2: Production Enhancement

The research model evolved into a production-ready Flask web application with:

Modern web interface for real-time analysis
RESTful API for integration
Enhanced error handling and validation
Professional deployment capabilities

⚡ Phase 3: Advanced Multi-Model System (Current)

The system transformed into an enterprise-grade ML platform featuring:

4 AI models working in parallel for higher accuracy
Consensus building algorithm for reliable predictions
Real-time analytics and performance monitoring
Advanced APIs with batch processing capabilities
Production deployment ready for any cloud platform

🖼️ System Screenshots

🏠 Modern Web Interface

Beautiful, responsive design with real-time sentiment analysis

📊 Detailed Results with Multi-Model Insights

Comprehensive results with confidence scores and model comparison

🔬 Original Research Foundation

📊 Dataset & Training

Source: Yelp Open Dataset focusing on restaurant reviews
Features: Review text and star ratings (1-5 stars)
Model: Fine-tuned distilbert-base-uncased for sequence classification
Training: Optimized using Optuna hyperparameter search

🎯 Research Results

The original model achieved excellent performance:

Accuracy: 78.50%
F1-Score: 78.40%
Precision: 78.37%
Recall: 78.50%

🔬 Hyperparameter Optimization

Comprehensive search using Optuna explored:

Learning Rate: 5.75e-06 to 7.91e-05
Training Epochs: 2 to 4
Batch Size: 4, 16, 32
Random Seeds: 5, 6, 10, 17, 40

🏆 Best Configuration:

Learning Rate: 7.91e-5
Epochs: 2
Batch Size: 32
Seed: 5

🚀 Advanced Multi-Model System

🤖 Multi-Model Intelligence

Building on the original research, the system now incorporates:

🎯 Primary Model: Custom YelpReviewsAnalyzer (fine-tuned from research)
🔄 Comparison Models:
- DistilBERT (general-purpose)
- Cardiff Twitter-RoBERTa (social media optimized)
- FinBERT (financial sentiment specialist)
🧠 Consensus Algorithm: Weighted voting system for final predictions
⚡ Parallel Processing: All models run simultaneously for fast results
🛡️ Fallback System: Graceful handling when models fail

🌐 Production Web Interface

🎨 Glass-morphism Design: Modern UI with smooth animations
📱 Mobile Responsive: Works perfectly on all devices
⚡ Real-time Analysis: Instant sentiment prediction
📊 Confidence Visualization: Color-coded results with detailed metrics

🚀 Advanced API System

API v1 - Basic Analysis (Research Model)

POST /api/analyze
{
  "text": "This restaurant has amazing food!"
}

Response:
{
  "sentiment": "Positive",
  "confidence": 0.9567,
  "processing_time": 0.123
}

API v2 - Multi-Model Comparison

POST /api/v2/compare
{
  "text": "This place exceeded all my expectations!"
}

Response:
{
  "consensus": {
    "sentiment": "Positive", 
    "confidence": 0.8791,
    "agreement_score": 0.875
  },
  "model_results": [
    {
      "model": "YelpReviewsAnalyzer",
      "sentiment": "Positive",
      "confidence": 0.9234
    },
    // ... 3 other models
  ]
}

Batch Processing

POST /api/v2/batch
{
  "texts": ["Great food!", "Poor service", "It's okay"]
}

📊 Built-in Analytics

Model Performance: Track accuracy and speed of each model
Processing Time: Monitor response times and optimize performance
Error Rates: Automatic error tracking and health monitoring
Usage Statistics: Understand API usage patterns

🏗️ Technical Architecture

🧠 Model Pipeline

User Input → Preprocessing → Parallel Execution → Consensus → Response
     ↓           ↓              ↓                  ↓         ↓
 Validation  Tokenization   4 Models Running   Voting    Final Result

📁 Enhanced Project Structure

Sentiment-Analyzer/
├── 🚀 app/                           # Core Flask Application
│   ├── app.py                        # Main Flask app with v1 & v2 APIs
│   ├── model.py                      # Original research model
│   ├── advanced_model.py             # Multi-model system (300+ lines)
│   ├── advanced_api.py               # Advanced API endpoints (280+ lines)
│   └── templates/                    # Modern web interface
│       ├── home.html                 # Glass-morphism design
│       └── result.html               # Enhanced results display
│
├── ⚙️ config/                        # Configuration Management
│   ├── config.py                     # Application settings
│   └── logging_config.py             # Logging configuration
│
├── 🚀 deployment/                    # Production Deployment
│   ├── configs/                      # Platform configurations
│   │   ├── Dockerfile                # Container setup
│   │   ├── docker-compose.yml        # Multi-service deployment
│   │   ├── Procfile                  # Heroku/Railway config
│   │   └── nginx.conf                # Web server config
│   ├── guides/                       # Deployment Documentation
│   │   ├── HUGGINGFACE_DEPLOY_GUIDE.md
│   │   ├── DOCKER.md
│   │   └── RENDER_DEPLOY_GUIDE.md
│   ├── docker-deploy.bat             # Windows deployment script
│   └── docker-deploy.sh              # Unix deployment script
│
├── 📖 docs/                          # Project Documentation
│   ├── README_COMPLETE.md            # Comprehensive documentation
│   ├── ADVANCED_FEATURES_SUMMARY.md  # Feature specifications
│   ├── PHASE2_SUMMARY.md             # Development phases
│   └── FINAL_CHECKLIST.md            # Production readiness
│
├── 🖥️ interfaces/                    # User Interfaces
│   ├── gradio_advanced.py            # Advanced Gradio interface
│   └── gradio_simple.py              # Simplified demo interface
│
├── 📦 requirements/                  # Dependency Management
│   ├── requirements-basic.txt        # Minimal dependencies
│   ├── requirements-docker.txt       # Container-specific
│   ├── requirements-hf.txt           # Hugging Face Spaces
│   └── requirements-railway.txt      # Railway deployment
│
├── 🧪 tests/                        # Comprehensive Testing Suite
│   ├── test_app.py                   # Flask application tests
│   ├── test_model.py                 # Model validation tests
│   ├── test_advanced_features.py     # Multi-model system tests
│   ├── test_api.py                   # API endpoint tests
│   ├── run_tests.py                  # Test runner
│   └── quick_test.py                 # Quick validation
│
├── �️ utils/                        # Utility Functions
│   ├── utility.py                    # Research utilities
│   ├── validate.py                   # Validation helpers
│   └── widget_repair.py              # UI utilities
│
├── 📊 Notebooks/                     # Research & Development
│   ├── HyperParamSearch.ipynb        # Original Optuna optimization
│   └── Final_Training.ipynb          # Model training pipeline
│
├── 🤖 Yelp_Model/                    # Trained Model Artifacts
│   ├── config.json                   # Model configuration
│   ├── model.safetensors            # Fine-tuned weights
│   ├── tokenizer.json               # Tokenizer from research
│   └── ...                          # Complete model package
│
├── 📁 Pre_processed/                 # Research Datasets
│   ├── train/                        # Tokenized training data
│   ├── val/                         # Validation splits
│   └── test/                        # Test data with metrics
│
└── 📄 requirements.txt               # Main dependencies

🔧 Core Research Functions (utility.py)

The original research infrastructure remains intact:

load_dataset() - Dataset loading with column selection
perform_eda() - Exploratory data analysis with visualizations
preprocess_yelp_reviews() - Text preprocessing and sentiment labeling
prepare_datasets() - Train/val/test splits with tokenization
compute_metrics() - Accuracy, precision, recall, F1-score calculation
evaluate_model_on_test() - Model evaluation on test set

🚀 Quick Start

🏃‍♂️ Run Locally (2 minutes)

# Clone the repository
git clone https://github.com/fitsblb/Sentiment-Analyzer.git
cd Sentiment-Analyzer

# Activate environment (conda recommended)
conda activate sentiment-analyzer

# Start the enhanced application
python app/app.py

🌐 Access: http://localhost:5000

Web interface with multi-model analysis
API v1 endpoints (original research model)
API v2 endpoints (advanced features)

📚 Reproduce Research (Original Workflow)

# 1. Hyperparameter optimization
jupyter notebook Notebooks/HyperParamSearch.ipynb

# 2. Final model training
jupyter notebook Notebooks/Final_Training.ipynb

# 3. Model evaluation and deployment to HuggingFace
# (All results saved to Yelp_Model/)

☁️ Production Deployment

🌟 Recommended: Railway (5 minutes)

Connect your GitHub repository
Railway auto-detects Python Flask app
Deploys with zero configuration
Result: Live multi-model sentiment analysis system!

🐙 Alternative: Render

750 hours/month free
Perfect for portfolio projects
Custom domains included

☁️ Enterprise: Google Cloud Run

gcloud run deploy sentiment-analyzer --source . --platform managed --region us-central1 --allow-unauthenticated

📖 API Documentation

🔗 Base URL

https://your-app.railway.app

📋 Available Endpoints

GET /api/info - System Information

{
  "name": "Sentiment Analyzer API",
  "version": "2.0.0",
  "features": {
    "basic_analysis": true,
    "model_comparison": true,
    "batch_processing": true,
    "analytics": true
  },
  "endpoints": {
    "analyze": "/api/analyze",
    "compare_models": "/api/v2/compare",
    "batch_analyze": "/api/v2/batch",
    "analytics": "/api/v2/analytics"
  }
}

POST /api/analyze - Original Research Model

Uses the fine-tuned YelpReviewsAnalyzer from the research phase.

POST /api/v2/compare - Multi-Model Analysis

Runs all 4 models in parallel and builds consensus prediction.

POST /api/v2/batch - Batch Processing

Efficiently process up to 50 texts simultaneously.

GET /api/v2/analytics - Performance Metrics

Real-time statistics on model performance and usage.

📊 Performance Benchmarks

🎯 Model Accuracy Comparison

Model	Individual Accuracy	Consensus Improvement
YelpReviewsAnalyzer	78.50%	+6.5% (via consensus)
DistilBERT	76.20%
Twitter-RoBERTa	74.80%
FinBERT	72.30%
Multi-Model Consensus	~85%	Best Overall

⚡ Response Times

Single Prediction: ~200ms
Multi-Model Compare: ~1.5s
Batch Processing: ~100ms per text
Health Check: ~50ms

🛠️ Technology Stack

🤖 AI/ML Research Foundation

🤗 Transformers: Hugging Face ecosystem
🔥 PyTorch: Deep learning framework
📊 Datasets: Efficient data handling
🔬 Optuna: Hyperparameter optimization
📈 W&B: Experiment tracking

🌐 Production Enhancement

🐍 Flask: Lightweight web framework
⚡ Threading: Parallel model execution
📝 Logging: Comprehensive monitoring
🎨 Modern CSS: Glass-morphism design
📱 Responsive Design: Mobile-first approach

☁️ Deployment Stack

🐳 Docker: Containerized deployment
🚀 Railway/Render: Cloud hosting
📈 Analytics: Built-in performance monitoring
🔧 CI/CD: Automated deployment pipelines

🔬 Research Methodology

📊 Original Experimental Setup

Dataset Split: 70% train, 15% validation, 15% test
Optimization: Optuna with 50+ trials
Evaluation: Stratified sampling for balanced assessment
Metrics: Comprehensive evaluation with sklearn.metrics

🧪 Hyperparameter Search Space

{
    'learning_rate': (5e-6, 8e-5),
    'num_train_epochs': [2, 3, 4], 
    'per_device_train_batch_size': [4, 16, 32],
    'seed': [5, 6, 10, 17, 40]
}

📈 Training Configuration

Optimizer: AdamW with weight decay
Scheduler: Linear with warmup
Evaluation: Per-epoch with early stopping
Logging: Weights & Biases integration

🎯 Future Research & Development

🔬 Research Extensions

Multi-domain Adaptation: Extend beyond restaurant reviews
Cross-lingual Analysis: Support for multiple languages
Temporal Dynamics: Track sentiment trends over time
Aspect-based Analysis: Fine-grained sentiment aspects

🚀 Production Enhancements

Real-time Dashboard: Live analytics and monitoring
Custom Model Training: User-uploadable fine-tuning
Advanced Visualizations: Interactive charts and insights
Mobile Applications: iOS/Android apps

🌐 Integration Opportunities

Slack/Discord Bots: Team sentiment monitoring
Chrome Extension: Web page sentiment analysis
Webhook Support: Real-time notifications
API Rate Limiting: Enterprise-grade access control

🤝 Contributing

We welcome contributions to both research and production aspects!

🔬 Research Contributions

Model improvements and optimizations
New evaluation metrics and benchmarks
Dataset enhancements and preprocessing

🚀 Production Contributions

UI/UX improvements
API feature additions
Performance optimizations
Documentation enhancements

Contribution Process:

🍴 Fork the repository
🔧 Create feature branch (git checkout -b feature/AmazingFeature)
💾 Commit changes (git commit -m 'Add AmazingFeature')
📤 Push to branch (git push origin feature/AmazingFeature)
🔃 Open Pull Request

📜 Citation & Acknowledgments

📚 How to Cite This Work

@misc{sentiment-analyzer-2025,
  title={Advanced Sentiment Analyzer: From Research to Production},
  author={fitsblb},
  year={2025},
  howpublished={\url{https://github.com/fitsblb/Sentiment-Analyzer}},
  note={Multi-model sentiment analysis system with consensus building}
}

🙏 Research Acknowledgments

🤗 Hugging Face: For the incredible transformer ecosystem and model hosting
🎓 DistilBERT Team: For the efficient BERT variant enabling this research
� Optuna Team: For powerful hyperparameter optimization framework
� Yelp: For providing the open dataset that made this research possible

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🌟 Complete Journey: Research → Production → Impact

From academic research to enterprise-ready AI system

🤖 Original Model • 🌐 Live Demo • 📧 Contact

Built with ❤️ and rigorous research by fitsblb

⭐ Star this repo if our research-to-production journey helped you! ⭐

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
.github/workflows		.github/workflows
Notebooks		Notebooks
Pre_processed		Pre_processed
Yelp_Model		Yelp_Model
__pycache__		__pycache__
app		app
config		config
deploy		deploy
deployment		deployment
docs		docs
image		image
interfaces		interfaces
requirements		requirements
tests		tests
utils		utils
.env.example		.env.example
.gitattributes		.gitattributes
README.md		README.md
requirements.txt		requirements.txt

fitsblb/Sentiment-Analyzer

Folders and files

Latest commit

History

Repository files navigation