
Study project: The goal was to forecast daily waste volumes (tonnage) for the Berliner Stadtreinigung (BSR) based on a four-year historical dataset.


nicdriebe/BSR_AI


BSR Waste Tonnage Prediction

Project Overview

This project was developed during an intensive 4-day sprint at HTW Berlin. The goal was to forecast daily waste volumes (tonnage) for the Berliner Stadtreinigung (BSR) based on a four-year historical dataset.

The core research question was whether integrating external urban data sources could significantly improve the prediction accuracy of Machine Learning models compared to using historical waste data alone.


Key Features & Methodology

  • Data Fusion: Integrated the primary BSR dataset with five external sources:
    • Weather data (DWD - German Weather Service)
    • School holiday calendars
    • Public holidays
    • Election surveys ("Sonntagsfrage")
    • Temporal features (weekdays, months, seasons)
  • Feature Engineering: Built time-series features, including Tonnage_lag_2 (a lagged tonnage value) and rolling averages, to capture seasonal and weekly trends.
  • Model Benchmarking: Implemented and compared four models, grouped by increasing complexity:
    • Linear Regression (serving as the baseline)
    • Decision Tree
    • Random Forest & XGBoost (the top-performing models)
  • Evaluation: Models were evaluated using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the R² score.
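
The data fusion and feature engineering steps above can be sketched with pandas. This is a minimal illustration, not the project's actual code: the tables, the column names (`date`, `tour_id`, `tonnage`, `is_holiday`), the helper `build_features`, the rolling window of 3, and the choice to compute lags per tour are all assumptions. Only the feature name `Tonnage_lag_2` comes from the project description.

```python
import pandas as pd

dates = ["2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05", "2023-01-06"]

# Hypothetical daily tonnage records (made-up values, two tours).
waste = pd.DataFrame({
    "date": pd.to_datetime(dates * 2),
    "tour_id": ["A"] * 5 + ["B"] * 5,
    "tonnage": [10.0, 12.0, 11.0, 13.0, 14.0, 20.0, 22.0, 21.0, 23.0, 24.0],
})

# Hypothetical external table with one row per calendar day (made-up flags).
calendar = pd.DataFrame({
    "date": pd.to_datetime(dates),
    "is_holiday": [True, False, False, False, False],
})


def build_features(waste: pd.DataFrame, calendar: pd.DataFrame) -> pd.DataFrame:
    """Join external data by date and add temporal, lag, and rolling features."""
    # Left join keeps every tonnage record; validate guards against
    # duplicate dates in the external table silently multiplying rows.
    df = waste.merge(calendar, on="date", how="left", validate="many_to_one")
    df = df.sort_values(["tour_id", "date"]).reset_index(drop=True)

    # Temporal features derived from the date itself.
    df["weekday"] = df["date"].dt.dayofweek  # Monday = 0
    df["month"] = df["date"].dt.month

    # Lag and rolling features are computed per tour (an assumption).
    # shift() counts rows, so this assumes one row per tour and day.
    grouped = df.groupby("tour_id")["tonnage"]
    df["Tonnage_lag_2"] = grouped.shift(2)
    # Shift by one before rolling so the window only sees past values.
    df["tonnage_roll_mean_3"] = grouped.transform(
        lambda s: s.shift(1).rolling(3).mean()
    )
    return df


features = build_features(waste, calendar)
print(features)
```

The first rows of each tour have no history, so their lag and rolling columns are `NaN`. Such rows would need to be dropped or imputed before training.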

Technologies & Tools

  • Language: Python
  • Libraries: Pandas, NumPy, Scikit-Learn, XGBoost, Matplotlib, Seaborn.
  • Environment: Jupyter Notebooks.

Key Insights

  • External vs. Internal Data: The analysis showed that internal features, specifically historical lag data and the "Tour ID", had much higher predictive power than external factors such as weather or holiday status.

  • Model Performance: The XGBoost model outperformed all others, achieving an R² score of approximately 0.64. In other words, it explained roughly 64% of the variance in daily tonnage, which suggests it handles this kind of tabular data well.

  • Operational Value: The findings suggest that while external data adds context, highly localized grouping (by specific depots and days) is the most effective path toward optimizing daily resource planning.

  • The results are summarized in the documentation.
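
A benchmark of this kind, scored with MSE, RMSE, and R², might look like the sketch below. It is an illustration under stated assumptions and does not reproduce the project's results: the data is synthetic, the two features (weekday and a two-day lag) are stand-ins, and the chronological 80/20 split is an assumed validation scheme. XGBoost is left out to keep dependencies small. `xgboost.XGBRegressor` could be added to the dictionary, since it exposes the same `fit`/`predict` interface.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic daily tonnage: a weekly pattern plus noise (not BSR data).
rng = np.random.default_rng(0)
n_days = 400
weekday = np.arange(n_days) % 7
tonnage = 50 + 10 * np.sin(2 * np.pi * weekday / 7) + rng.normal(0, 3, n_days)

# Features: weekday and the tonnage two days earlier.
# The first two days have no lag value and are dropped.
X = np.column_stack([weekday[2:], tonnage[:-2]])
y = tonnage[2:]

# Chronological split (assumed): train on the first 80%, test on the rest,
# so the models never see data from the future of the test period.
split = int(len(y) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    mse = mean_squared_error(y_test, pred)
    results[name] = {
        "mse": float(mse),
        "rmse": float(np.sqrt(mse)),  # same units as tonnage
        "r2": float(r2_score(y_test, pred)),
    }

for name, scores in results.items():
    print(name, scores)
```

RMSE is reported alongside MSE because it is in the same units as the target, while R² expresses the share of variance explained relative to always predicting the mean.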


About the Project

  • Type: Study Project (Module: AI Analytics)
  • Collaboration: Developed by a team of four students.
  • Timeline: 4-Day Sprint
  • Institution: HTW Berlin (University of Applied Sciences)
  • Status: Completed (Proof of Concept)
