This repository documents a project to determine whether reliable predictions can be made about the duration of Denver RTD light rail departure delays. The notebooks and PowerPoint contain the analysis and predictive modeling exercise for the Springboard Data Science Boot Camp. This is the first of two capstones submitted for the program.
The PowerPoint provides a simple overview of the process, spanning problem definition, data sourcing, wrangling, modeling, and performance evaluation. The process is separated into 3 notebooks:
- eda: data sourcing, wrangling, and exploratory data analysis
- statistical_analysis: additional inferential statistics for further exploration
- modeling: final data preparation, model tuning, training and prediction, and performance evaluation
Raw data files can be found in a Google Drive folder, linked on the first line of eda and pasted below. Interim data outputs are also provided so that earlier processing steps do not have to be re-run. https://drive.google.com/open?id=1TwRkBqzS53oMC-ZmicCQygKSWiRYSJkW
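
The interim data files act as a cache between processing steps. As a minimal sketch of that pattern (not the notebooks' actual code), the hypothetical helper `load_or_build` below reads an interim CSV if it already exists and otherwise runs the step and saves its output. The file path, column names, and values are invented for illustration.

```python
import csv
import tempfile
from pathlib import Path


def load_or_build(cache_path, build_rows):
    """Return rows from cache_path if present; otherwise build and cache them.

    Both names are hypothetical and used only for this illustration.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Reuse the interim file instead of repeating the expensive step.
        with cache_path.open(newline="") as f:
            return list(csv.DictReader(f))

    rows = build_rows()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return rows


calls = []


def expensive_wrangle():
    # Stand-in for a slow wrangling step; the rows are made-up sample values.
    calls.append(1)
    return [
        {"stop": "A", "delay_seconds": "120"},
        {"stop": "B", "delay_seconds": "45"},
    ]


cache_file = Path(tempfile.mkdtemp()) / "interim" / "wrangled.csv"
first = load_or_build(cache_file, expensive_wrangle)   # builds and writes the file
second = load_or_build(cache_file, expensive_wrangle)  # reads the cached file
```

The second call returns the cached rows without re-running the wrangling function, which is the saving the interim files are meant to provide.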