In this hands-on project, we are tasked with predicting whether the first stage of the SpaceX Falcon 9 rocket will land successfully. SpaceX advertises Falcon 9 launches on its website at a cost of 62 million dollars, while other providers charge upward of 165 million dollars per launch; much of the savings comes from SpaceX's ability to reuse the first stage. Therefore, if we can determine whether the first stage will land, we can estimate the cost of a launch. With the help of data science findings and models, a competing startup can make more informed bids against SpaceX for rocket launches.
The project proceeds through the following steps:
SpaceX launch data was gathered from the SpaceX REST API (https://github.com/r-spacex/SpaceX-API), which exposes several endpoints. Specifically, we worked with the endpoint api.spacexdata.com/v4/launches/past.
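A minimal sketch of the kind of request used to pull the past-launch records; the columns printed at the end are only an illustrative subset of what the endpoint returns.

```python
import requests
import pandas as pd

# Query the SpaceX REST API v4 endpoint for all past launches
response = requests.get("https://api.spacexdata.com/v4/launches/past")
response.raise_for_status()

# Flatten the nested JSON payload into a tabular DataFrame
launches_df = pd.json_normalize(response.json())

# Illustrative subset of the returned fields
print(launches_df[["flight_number", "date_utc", "rocket", "payloads"]].head())
```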
Web scraping of Falcon 9 historical launch records from the Wikipedia page titled “List of Falcon 9 and Falcon Heavy launches” (https://en.wikipedia.org/wiki/List_of_Falcon_9_and_Falcon_Heavy_launches) was performed using the BeautifulSoup library.
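A sketch of how the launch tables on that Wikipedia page can be scraped with BeautifulSoup and handed to pandas; the `wikitable` class selector is an assumption about the page markup.

```python
import io

import requests
import pandas as pd
from bs4 import BeautifulSoup

WIKI_URL = ("https://en.wikipedia.org/wiki/"
            "List_of_Falcon_9_and_Falcon_Heavy_launches")

# Download the page and parse its HTML
page = requests.get(WIKI_URL)
page.raise_for_status()
soup = BeautifulSoup(page.text, "html.parser")

# The launch records live in HTML tables; pandas converts each one to a DataFrame
tables = pd.read_html(io.StringIO(str(soup)), attrs={"class": "wikitable"})
print(f"Found {len(tables)} launch tables")
```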
We used the API again, targeting different endpoints, to gather specific missing data such as Booster Version, Launch Site, and Payload data. The 'mission outcome' column in the raw data was also converted into training labels: 1 if the booster landed successfully, 0 otherwise.
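A sketch of the label conversion, assuming outcome strings of the form "True ASDS" / "False Ocean"; the exact strings, the column names, and the set of unsuccessful outcomes are illustrative.

```python
import pandas as pd

# Illustrative frame; in the project the outcome strings come from the API data
df = pd.DataFrame({"Outcome": ["True ASDS", "False Ocean", "True RTLS", "None None"]})

# Outcomes counted as unsuccessful landings (assumed set of labels)
bad_outcomes = {"False ASDS", "False Ocean", "False RTLS", "None ASDS", "None None"}

# 1 = booster landed successfully, 0 = it did not
df["Class"] = df["Outcome"].apply(lambda o: 0 if o in bad_outcomes else 1)
print(df)
```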
Categorical plots, scatter plots, and bar plots were generated to visualize the relationships between different variables.
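An illustrative sketch of this kind of plotting; the DataFrame and its column names (FlightNumber, PayloadMass, Orbit, Class) are assumptions made for demonstration, not the project's exact schema.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Small illustrative frame standing in for the cleaned launch dataset
df = pd.DataFrame({
    "FlightNumber": [1, 2, 3, 4, 5, 6],
    "PayloadMass": [2500, 4000, 6000, 3500, 9600, 15600],
    "Orbit": ["LEO", "GTO", "GTO", "ISS", "LEO", "GTO"],
    "Class": [0, 0, 1, 1, 1, 1],
})

# Categorical scatter plot: payload mass vs. flight number, colored by landing outcome
sns.catplot(data=df, x="FlightNumber", y="PayloadMass", hue="Class", aspect=2)
plt.show()

# Bar plot: landing success rate per orbit type
success_per_orbit = df.groupby("Orbit", as_index=False)["Class"].mean()
sns.barplot(data=success_per_orbit, x="Orbit", y="Class")
plt.ylabel("Success rate")
plt.show()
```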
We performed SQL queries to further interrogate the dataset and gain insights into which features influence the mission outcome.
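A sketch of how the dataset can be interrogated with SQL by loading it into an in-memory SQLite database; the table name SPACEXTBL and the column names are assumptions.

```python
import sqlite3

import pandas as pd

# Small illustrative frame; in the project this is the full launch dataset
df = pd.DataFrame({
    "LaunchSite": ["CCAFS SLC 40", "KSC LC 39A", "CCAFS SLC 40", "VAFB SLC 4E"],
    "Class": [0, 1, 1, 1],
})

# Load the data into an in-memory SQLite database so it can be queried with SQL
conn = sqlite3.connect(":memory:")
df.to_sql("SPACEXTBL", conn, index=False, if_exists="replace")

# Example query: landing success rate per launch site
query = """
    SELECT LaunchSite, AVG(Class) AS success_rate
    FROM SPACEXTBL
    GROUP BY LaunchSite
    ORDER BY success_rate DESC
"""
print(pd.read_sql_query(query, conn))
```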
We built an interactive map using Folium to visualize the launch sites, as well as a Plotly dashboard. The dashboard has been deployed to Render.com using the free hosting plan and is available at the following URL: https://ibm-applied-data-science-capstone-0esu.onrender.com.
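A minimal Folium sketch of the launch-site map; the site names and coordinates below are approximate values listed only for illustration.

```python
import folium

# Approximate launch-site coordinates (illustrative values)
launch_sites = {
    "CCAFS SLC 40": (28.562, -80.577),
    "KSC LC 39A": (28.608, -80.604),
    "VAFB SLC 4E": (34.632, -120.611),
}

# Interactive map centered roughly on the continental US
site_map = folium.Map(location=[29.5, -95.0], zoom_start=4)

for name, (lat, lon) in launch_sites.items():
    # Highlight each site with a circle and a labeled marker
    folium.Circle((lat, lon), radius=1000, color="#d35400", fill=True).add_child(
        folium.Popup(name)
    ).add_to(site_map)
    folium.Marker((lat, lon), tooltip=name).add_to(site_map)

site_map.save("launch_sites_map.html")
```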
In this phase, we standardized the data and split it into training and test sets. We then created machine learning models using the following algorithms (see the combined sketch after this list):
- Logistic Regression,
- Support Vector Machine (SVM),
- Decision Tree Classifier, and
- K-Nearest Neighbors (KNN).
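A combined sketch of the standardization, split, and model fitting described above; the feature matrix here is random placeholder data, and the default hyperparameters are illustrative rather than the ones used in the project.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

# X: feature matrix, y: binary landing labels (random placeholder data here)
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 6))
y = rng.integers(0, 2, size=90)

# Standardize the features, then hold out a test set
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

models = {
    "Logistic Regression": LogisticRegression(),
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(),
    "KNN": KNeighborsClassifier(),
}

# Fit each classifier and report its accuracy on the held-out test set
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")
```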
Finally, we performed a Grid Search to identify the optimal hyperparameters for each classification model.
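A sketch of the hyperparameter search, shown for the SVM only and reusing X_train, y_train, X_test, and y_test from the previous sketch; the parameter grid and the 10-fold cross-validation setting are assumptions.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Example grid for the SVM; the ranges are illustrative, not the project's exact grid
param_grid = {
    "kernel": ["linear", "rbf", "sigmoid"],
    "C": [0.1, 1, 10],
    "gamma": ["scale", 0.01, 0.1],
}

# Exhaustive search over the grid with 10-fold cross-validation
grid = GridSearchCV(SVC(), param_grid, cv=10, scoring="accuracy")
grid.fit(X_train, y_train)

print("Best hyperparameters:", grid.best_params_)
print(f"Best cross-validated accuracy: {grid.best_score_:.3f}")
print(f"Test accuracy: {grid.score(X_test, y_test):.3f}")
```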
