A machine learning-powered web application for predicting fantasy cricket player performance and generating optimal team selections. Upload a ball-by-ball T20 cricket dataset, train custom models, and generate data-driven predictions for Dream11 and other fantasy platforms.
- Universal dataset support for any T20 cricket league (IPL, BBL, CPL, etc.)
- On-demand ML model training with configurable scope
- Multiple algorithm evaluation (Random Forest, XGBoost, Gradient Boosting)
- Dream11-compliant fantasy points calculation
- Contextual analysis (venue, opposition, form, consistency)
- Interactive team builder with 22-player pool
- Model library for managing multiple league-specific models
- Material Design UI with professional analytics dashboard
pip install -r requirements.txt

streamlit run app.py

The application will launch at http://localhost:8501
Navigate to "Data Ingestion" and upload a ball-by-ball cricket CSV file with the following required columns:
- match_id
- batting_team
- bowling_team
- striker
- bowler
- runs_off_bat
- extras
- venue
The platform automatically creates a total_runs column from runs_off_bat and extras.
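As a rough illustration of the column check and the derived column described above, here is a minimal sketch in dependency-free Python. It assumes that total_runs is simply runs_off_bat plus extras for each delivery; the function name `load_deliveries` and the sample rows are hypothetical and are not taken from data_loader.py.

```python
import csv
import io

# Required columns, as listed in this README.
REQUIRED_COLUMNS = [
    "match_id", "batting_team", "bowling_team", "striker",
    "bowler", "runs_off_bat", "extras", "venue",
]


def load_deliveries(csv_file):
    """Read ball-by-ball rows, check required columns, and add total_runs.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(csv_file)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing required columns: {', '.join(missing)}")

    rows = []
    for row in reader:
        row["runs_off_bat"] = int(row["runs_off_bat"])
        row["extras"] = int(row["extras"])
        # Assumption: total_runs is the per-delivery sum of the two columns.
        row["total_runs"] = row["runs_off_bat"] + row["extras"]
        rows.append(row)
    return rows


# Made-up sample data: one boundary, then one delivery with a single extra.
sample = io.StringIO(
    "match_id,batting_team,bowling_team,striker,bowler,runs_off_bat,extras,venue\n"
    "1,Team A,Team B,Batter One,Bowler One,4,0,Ground X\n"
    "1,Team A,Team B,Batter One,Bowler One,0,1,Ground X\n"
)
deliveries = load_deliveries(sample)
print([d["total_runs"] for d in deliveries])
```

A file that lacks any of the required columns raises a ValueError naming the missing ones, which is one plausible way to surface the problem before training starts.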
Go to "Model Training" to configure and run training:
- Choose training scope (complete dataset or recent matches only)
- Monitor real-time progress and performance metrics
- Save trained models to the repository for future use
To generate predictions, complete the workflow:
- Select competing teams (Squad Configuration)
- Choose match venue (Venue Analysis)
- Build 22-player pool (Roster Management)
- Generate performance forecast
The system will output:
- Ranked list of all 22 players with predicted points
- Top 3 picks (Captain, Vice-Captain, Top Pick)
- Optimal 11-player lineup
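The three outputs above can be sketched as a simple ranking step. This is only an illustration under a strong assumption: that the lineup is the top 11 players by predicted points and that the top three picks are the first three in that ranking. The README does not describe the actual rules in team_selector.py (for example role or per-team limits), and `summarize_predictions` and the player names are hypothetical.

```python
def summarize_predictions(predicted_points, lineup_size=11):
    """Rank a player pool by predicted points and derive the headline picks.

    predicted_points: dict mapping player name -> predicted fantasy points.
    Assumption: no role or per-team constraints are applied.
    """
    ranked = sorted(predicted_points.items(), key=lambda kv: kv[1], reverse=True)
    names = [name for name, _ in ranked]
    return {
        "ranking": ranked,              # all players, best first
        "captain": names[0],
        "vice_captain": names[1],
        "top_pick": names[2],
        "lineup": names[:lineup_size],  # first 11 of the ranking
    }


# Made-up 22-player pool with descending predicted points.
pool = {f"Player {i:02d}": 80 - 3 * i for i in range(1, 23)}
summary = summarize_predictions(pool)
print(summary["captain"], summary["vice_captain"], summary["top_pick"])
print(summary["lineup"])
```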
fantasy-cricket-analyzer/
├── app.py                          # Main Streamlit application
├── requirements.txt                # Python dependencies
├── models/
│   └── library/                    # Saved model repository
├── src/
│   ├── data/
│   │   └── data_loader.py          # Dataset processing
│   ├── fantasy/
│   │   └── points_calculator.py    # Dream11 points engine
│   ├── ml/
│   │   ├── feature_engineering.py  # Feature extraction
│   │   ├── trainer.py              # Model training pipeline
│   │   ├── predictor.py            # Prediction engine
│   │   └── model_library.py        # Model persistence
│   └── optimization/
│       └── team_selector.py        # Team optimization
└── scripts/
    └── train_model.py              # Offline training script
The training pipeline implements:
- Data validation and preprocessing
- Fantasy points calculation from historical data
- Feature engineering (batting, bowling, form, consistency, venue, opposition)
- Multi-model training with cross-validation
- Automatic best model selection based on R² score
- Model persistence with metadata
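To make the selection step concrete, the sketch below computes R² from its definition (1 minus the ratio of residual to total sum of squares) and picks the candidate with the highest mean score across folds. It is written in plain Python for clarity; in practice the per-fold scores would more likely come from scikit-learn, for example `cross_val_score` with `scoring="r2"`. The function names and all numbers are made up for illustration and are not results from this project.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


def select_best_model(fold_scores):
    """Return the model name with the highest mean cross-validation R²."""
    mean_scores = {name: mean(scores) for name, scores in fold_scores.items()}
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores


# Made-up fantasy points and predictions for a single validation fold.
actual = [20, 40, 60, 80]
print(r2_score(actual, [25, 35, 65, 75]))  # close fit, high R²
print(r2_score(actual, [50, 50, 50, 50]))  # predicting the mean gives R² = 0

# Made-up per-fold R² scores for the three algorithm families.
fold_scores = {
    "random_forest": [0.41, 0.45, 0.43],
    "xgboost": [0.48, 0.46, 0.50],
    "gradient_boosting": [0.44, 0.47, 0.45],
}
best_name, mean_scores = select_best_model(fold_scores)
print(best_name)
```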
Per player, the model considers:
- Batting: average, strike rate, boundaries, high scores
- Bowling: wickets, economy, maidens, consistency
- Form: weighted recent performance (last 5 matches)
- Consistency: standard deviation, coefficient of variation
- Venue: historical performance at selected ground
- Opposition: matchup-specific statistics
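The form and consistency features can be sketched as follows. The README states only that form is a weighted measure over the last 5 matches; the linear weighting used here (most recent match weighted highest) and the use of the population standard deviation are assumptions, and the function names are hypothetical rather than taken from feature_engineering.py.

```python
from statistics import mean, pstdev


def weighted_form(points_by_match, window=5):
    """Weighted average of the most recent matches, newest weighted highest.

    points_by_match: fantasy points in chronological order (oldest first).
    Assumption: linear weights 1..n over the last `window` matches.
    """
    recent = points_by_match[-window:]
    weights = range(1, len(recent) + 1)
    return sum(w * p for w, p in zip(weights, recent)) / sum(weights)


def consistency(points_by_match):
    """Return (standard deviation, coefficient of variation)."""
    avg = mean(points_by_match)
    sd = pstdev(points_by_match)
    # A zero average makes the ratio undefined; report it as infinite.
    cv = sd / avg if avg else float("inf")
    return sd, cv


# Made-up match history for one player, oldest match first.
history = [10, 20, 30, 40, 50, 60]
print(weighted_form(history))  # uses only the last five values
print(consistency([40, 60]))
```

A lower coefficient of variation indicates a player whose scores cluster around their average, which is what makes it usable as a consistency signal alongside the raw standard deviation.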
Full implementation of official Dream11 point rules:
- Batting: runs (+1/run), boundaries (+1 per four, +2 per six), milestones (30/50/100 runs), duck penalty (-2)
- Bowling: wickets (+25), wicket hauls (+4/8/16), maidens (+12)
- Fielding: catches (+8), stumpings (+12), run-outs (+6/12)
- Bonuses/Penalties: economy rate bonuses/penalties, strike rate bonuses/penalties
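The rules above translate into a small scoring sketch. Several details are not spelled out in this README and are assumptions here: the milestone bonus amounts (+4/+8/+16 at 30/50/100 runs, with only the highest milestone awarded), the wicket-haul thresholds (3/4/5 wickets), and the split of run-out points (+12 for a direct hit, +6 otherwise). Economy rate and strike rate adjustments are left out because their values are not given. The function names are hypothetical and are not the API of points_calculator.py.

```python
def batting_points(runs, fours, sixes, dismissed):
    """Batting points: runs, boundary bonuses, milestone bonus, duck penalty."""
    points = runs * 1 + fours * 1 + sixes * 2
    # Assumption: only the highest milestone reached is awarded.
    if runs >= 100:
        points += 16
    elif runs >= 50:
        points += 8
    elif runs >= 30:
        points += 4
    # Duck: dismissed without scoring.
    if dismissed and runs == 0:
        points -= 2
    return points


def bowling_points(wickets, maidens):
    """Bowling points: wickets, maidens, and a wicket-haul bonus."""
    points = wickets * 25 + maidens * 12
    # Assumption: haul bonuses apply at 3, 4, and 5 wickets.
    if wickets >= 5:
        points += 16
    elif wickets >= 4:
        points += 8
    elif wickets >= 3:
        points += 4
    return points


def fielding_points(catches, stumpings, direct_run_outs, assisted_run_outs):
    """Fielding points; assumes +12 for direct-hit and +6 for other run-outs."""
    return (catches * 8 + stumpings * 12
            + direct_run_outs * 12 + assisted_run_outs * 6)


# Made-up innings: 54 runs with 6 fours and 2 sixes, dismissed.
print(batting_points(runs=54, fours=6, sixes=2, dismissed=True))
# Made-up spell: 3 wickets and 1 maiden.
print(bowling_points(wickets=3, maidens=1))
# Made-up fielding: 2 catches and 1 direct-hit run-out.
print(fielding_points(catches=2, stumpings=0, direct_run_outs=1, assisted_run_outs=0))
```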
- Python 3.8+
- Streamlit (web framework)
- pandas (data processing)
- scikit-learn (ML models, preprocessing)
- XGBoost (gradient boosting)
- joblib (model serialization)
The system does not account for:
- Player injuries or unavailability
- Real-time team changes or announcements
- Weather conditions or pitch reports
- Match context (knockout vs league stage)
- In-play match dynamics
Predictions are statistical estimates based on historical data only.
MIT License
Dataset compatibility: Cricsheet format (https://cricsheet.org/)
Created by Saurav