DanielDemoz/ttc-predict

TTC Predict

A machine learning project that predicts major delays (more than 5 minutes) on the Toronto Transit Commission (TTC) subway and offers delay-aware route optimization with an interactive map.

Features

Core Functionality

  • Delay Prediction: Predicts major delays using machine learning
  • Route Optimization: Finds the best routes considering delay probabilities
  • Interactive Map: Visualizes stations with delay risk indicators
  • Real-time Predictions: Get instant delay probability for any station

Map Visualization

  • Station Map: Interactive map showing all TTC stations
  • Risk Indicators: Color-coded markers (green/orange/red) based on delay probability
  • Station Details: Click markers to see delay probability and station info
  • Route Visualization: See optimized routes on the map

Route Optimization

  • Multi-route Planning: Compare different route options
  • Delay Risk Assessment: Routes ranked by total delay risk
  • Time Preferences: Optimize for rush hour, off-peak, or any time
  • Transfer Optimization: Smart transfer point recommendations

Technical Features

  • FastAPI Backend: High-performance REST API
  • Machine Learning: RandomForestClassifier with 85% overall accuracy (31% recall on major delays; see Model Performance)
  • Interactive Web UI: Modern, responsive interface
  • GitHub Pages Ready: Static deployment support

Project Structure

ttc-predict/
├── main.py                          # FastAPI application with web interface
├── train_model.py                   # Model training script
├── requirements.txt                 # Python dependencies
├── model_training.ipynb            # Jupyter notebook with full ML pipeline
├── random_forest_model_new_task.pkl # Trained ML model
├── label_encoders_new_task.pkl     # Feature encoders
├── docs/index.html                 # Web interface
├── .github/workflows/deploy.yml    # GitHub Actions deployment
├── README.md                       # This file
└── LICENSE                         # MIT License

Data & Model

  • Dataset: Toronto Open Data – TTC Subway Delay Data

  • Target variable: MajorDelay

    • 0 = No major delay (≤5 minutes)
    • 1 = Major delay (>5 minutes)
  • Input features:

    • Line – Subway line (e.g., YU, BD)
    • Station – Station name
    • Code – Delay cause code
    • DayOfWeek – Numeric day of week (0 = Monday, 6 = Sunday)
  • Model used: RandomForestClassifier

  • Feature importance (sample result):

    • Code – 41.5%
    • Station – 38.6%
    • DayOfWeek – 17.2%
    • Line – 2.5%
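
As a rough illustration of the target and feature definitions above, the following sketch derives `MajorDelay` and `DayOfWeek` from a single raw delay record. The input keys `date` and `min_delay` and the function name `to_training_row` are assumptions made for this example; the project's actual preprocessing lives in `train_model.py` and `model_training.ipynb` and may differ.

```python
from datetime import date

MAJOR_DELAY_THRESHOLD_MIN = 5  # from the README: a major delay is > 5 minutes


def to_training_row(record: dict) -> dict:
    """Turn one raw delay record into model features plus the MajorDelay label.

    The keys "date" (ISO string) and "min_delay" (minutes) are assumed names
    used for illustration; Line, Station and Code mirror the feature list.
    """
    day = date.fromisoformat(record["date"])
    return {
        "Line": record["Line"],
        "Station": record["Station"],
        "Code": record["Code"],
        # date.weekday() uses 0 = Monday ... 6 = Sunday, matching the README.
        "DayOfWeek": day.weekday(),
        "MajorDelay": int(record["min_delay"] > MAJOR_DELAY_THRESHOLD_MIN),
    }
```

A delay of exactly 5 minutes is labelled 0 here, consistent with the "≤5 minutes" definition of the negative class.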

Quick Start

Option 1: Run Locally

  1. Clone the Repository
git clone https://github.com/DanielDemoz/ttc-predict.git
cd ttc-predict
  2. Install Dependencies
py -m pip install -r requirements.txt
  3. Train the Model (if needed)
py train_model.py
  4. Run the Application
py -m uvicorn main:app --reload
  5. Access the Web Interface
Open http://127.0.0.1:8000 in a browser (uvicorn's default host and port).

Option 2: GitHub Pages (Static)

  1. Generate Static Files
py deploy.py
  2. Deploy to GitHub Pages
  • Push to main branch
  • GitHub Actions will automatically deploy
  • Access at: https://danieldemoz.github.io/ttc-predict

API Endpoints

Delay Prediction

POST /predict

{
  "Line": "YU",
  "Station": "UNION STATION", 
  "Code": "MUIS",
  "DayOfWeek": 0
}

Response:

{
  "prediction": 1,
  "probability": 0.75,
  "input": { ... }
}
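
A minimal client sketch for this endpoint, using only the standard library. The base URL assumes the API is running locally on uvicorn's default address, and the helper names `build_predict_request` and `predict` are made up for this example.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumed: uvicorn's default local address


def build_predict_request(line, station, code, day_of_week, base_url=BASE_URL):
    """Build (but do not send) a POST /predict request with the documented body."""
    payload = {
        "Line": line,
        "Station": station,
        "Code": code,
        "DayOfWeek": day_of_week,
    }
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(line, station, code, day_of_week, base_url=BASE_URL):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_predict_request(line, station, code, day_of_week, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With the server running, `predict("YU", "UNION STATION", "MUIS", 0)["probability"]` would return the delay probability from the response shown above.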

Route Optimization

POST /route/optimize

{
  "start_station": "UNION STATION",
  "end_station": "FINCH",
  "day_of_week": 0,
  "time_preference": "rush_hour"
}

Response:

{
  "routes": [
    {
      "stations": ["UNION STATION", "FINCH"],
      "total_delay_risk": 0.15,
      "estimated_time": 25
    }
  ]
}
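
The README does not state how `total_delay_risk` is computed. One plausible reading, sketched here as an assumption, is the chance of at least one major delay along the route when stations are treated as independent. The function names and the sample probabilities are illustrative and not taken from the project.

```python
from math import prod


def total_delay_risk(station_probabilities):
    """Chance of at least one major delay, assuming stations are independent."""
    return 1.0 - prod(1.0 - p for p in station_probabilities)


def rank_routes(routes, delay_probability):
    """Return route dicts shaped like the response above, lowest risk first.

    routes: list of station-name lists.
    delay_probability: mapping of station name -> major-delay probability.
    """
    scored = [
        {
            "stations": stations,
            "total_delay_risk": round(
                total_delay_risk(delay_probability[s] for s in stations), 4
            ),
        }
        for stations in routes
    ]
    return sorted(scored, key=lambda route: route["total_delay_risk"])
```

For example, with probabilities `{"A": 0.1, "B": 0.2, "C": 0.5}`, the route `["A", "B"]` scores 0.28 and ranks ahead of `["A", "C"]` at 0.55.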

Station Predictions

GET /stations/predictions

Returns delay probabilities for all stations, along with their coordinates.

Health Check

GET /health

Returns API status and model loading information.


Dependencies

fastapi
uvicorn[standard]
pandas
scikit-learn
joblib
numpy
requests
matplotlib
seaborn

Install with:

py -m pip install -r requirements.txt

Key Features Explained

Interactive Map

  • Real-time Visualization: See all TTC stations on an interactive map
  • Risk Indicators: Color-coded markers show delay probability
    • Green: Low risk (< 10%)
    • Orange: Medium risk (10-30%)
    • Red: High risk (> 30%)
  • Station Details: Click any marker for detailed information
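
The colour thresholds above can be expressed as a small helper. This is a sketch: the function name `risk_color` is hypothetical, and the web interface may implement the same mapping in JavaScript instead.

```python
def risk_color(probability: float) -> str:
    """Map a delay probability in [0, 1] to the marker colour listed above."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.10:
        return "green"   # low risk: below 10%
    if probability <= 0.30:
        return "orange"  # medium risk: 10% to 30% inclusive
    return "red"         # high risk: above 30%
```

The boundary values 10% and 30% both fall in the orange band, following the ranges as written.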

Route Optimization

  • Smart Routing: Considers delay probabilities when planning routes
  • Multiple Options: Compare different route alternatives
  • Time Preferences: Optimize for rush hour, off-peak, or any time
  • Transfer Points: Intelligent transfer station recommendations
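
One way delay-aware routing could work is a shortest-path search where each hop costs its travel time plus a penalty proportional to the delay probability of the station being entered. The sketch below uses Dijkstra's algorithm under that assumption. The graph format, the `penalty_minutes` weight, and the function name `safest_route` are all invented for illustration and are not taken from `main.py`.

```python
import heapq


def safest_route(graph, delay_probability, start, end, penalty_minutes=10.0):
    """Dijkstra's algorithm over travel time plus an expected-delay penalty.

    graph: station -> {neighbour: travel minutes}.
    delay_probability: station -> major-delay probability (missing means 0).
    penalty_minutes: assumed cost of one major delay, in minutes.
    Returns (cost, [stations]), or (inf, []) if end is unreachable.
    """
    queue = [(0.0, start, [start])]
    best = {start: 0.0}
    while queue:
        cost, station, path = heapq.heappop(queue)
        if station == end:
            return cost, path
        if cost > best.get(station, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbour, minutes in graph.get(station, {}).items():
            penalty = penalty_minutes * delay_probability.get(neighbour, 0.0)
            new_cost = cost + minutes + penalty
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour, path + [neighbour]))
    return float("inf"), []
```

Setting `penalty_minutes=0` reduces this to plain shortest travel time, which makes it easy to compare the fastest route against the lower-risk one.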

Machine Learning

  • Model: RandomForestClassifier with 85% overall accuracy (31% recall on major delays)
  • Features: Line, Station, Code, DayOfWeek
  • Prediction: Major delay probability (>5 minutes)
  • Real-time: Instant predictions for any station/condition

Development Workflow

  1. Data Collection: Toronto Open Data API
  2. Data Processing: Cleaning, feature engineering, encoding
  3. Model Training: RandomForestClassifier with cross-validation
  4. API Development: FastAPI with interactive web interface
  5. Deployment: GitHub Pages with automated CI/CD

Model Performance

  • Accuracy: 85%
  • Precision: 89% (No delay), 44% (Major delay)
  • Recall: 94% (No delay), 31% (Major delay)
  • Feature Importance:
    • Code: 41.6%
    • Station: 38.7%
    • DayOfWeek: 17.2%
    • Line: 2.5%
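
The per-class precision and recall figures above can be condensed into F1 scores, which show the gap between the two classes more directly than overall accuracy does. The snippet uses only the numbers quoted in this section.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class figures quoted in the Model Performance list.
no_delay_f1 = f1_score(0.89, 0.94)
major_delay_f1 = f1_score(0.44, 0.31)

print(f"No delay F1:    {no_delay_f1:.2f}")
print(f"Major delay F1: {major_delay_f1:.2f}")
```

This gives roughly 0.91 for the no-delay class and 0.36 for the major-delay class, so the 85% accuracy mostly reflects the majority class.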

Deployment Options

Local Development

py -m uvicorn main:app --reload

GitHub Pages (Static)

py deploy.py
git add .
git commit -m "Deploy to GitHub Pages"
git push origin main

Docker (Optional)

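FROM python:3.9-slim
WORKDIR /app
# Install dependencies first so this layer is cached between code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]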

License

This project is open-source and available under the MIT License for educational and research purposes.

About

Predict TTC subway delays using ML and FastAPI deployment with real-time Toronto Open Data.
