Falcon 9

This project analyzes SpaceX Falcon 9 launch data and uses machine learning models to predict landing outcomes.

Kickstarter

The work begins with data collection via the SpaceX API and web scraping, followed by data cleaning and preparation. It includes exploratory data analysis (EDA) using SQL and visualization techniques to identify patterns, and a machine learning component for forecasting successful launches. The final stage is an interactive dashboard built with Dash to display the analysis results.

The Details

Data Collection

I collected comprehensive and up-to-date launch data from SpaceX’s API and supplemented it with web-scraped information from reliable sources.
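
As a rough illustration of the API half of this step, the sketch below downloads launch records and flattens each one into a single row. The endpoint URL, the selected fields, and the helper names `fetch_past_launches` and `flatten_launch` are assumptions made for the example; the exact requests used in the project are not shown here.

```python
import json
from urllib.request import urlopen

# Assumed endpoint: the public v4 "past launches" route of the SpaceX API.
SPACEX_PAST_LAUNCHES_URL = "https://api.spacexdata.com/v4/launches/past"


def fetch_past_launches(url=SPACEX_PAST_LAUNCHES_URL, timeout=30):
    """Download the raw list of launch records as Python dicts."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def flatten_launch(record):
    """Keep a few top-level fields plus the first core's landing details."""
    # Some records have an empty or missing "cores" list; fall back to an empty core.
    cores = record.get("cores") or [{}]
    first_core = cores[0]
    return {
        "flight_number": record.get("flight_number"),
        "date_utc": record.get("date_utc"),
        "launchpad": record.get("launchpad"),
        "landing_success": first_core.get("landing_success"),
        "landing_type": first_core.get("landing_type"),
        "reused": first_core.get("reused"),
    }
```

Calling `[flatten_launch(r) for r in fetch_past_launches()]` would give a list of flat rows ready to load into a table.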

Data Wrangling

I cleaned the collected data by handling missing values, correcting inconsistencies, and standardizing formats.
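
A minimal sketch of this kind of cleaning with pandas is shown below. The column names (`PayloadMass`, `LaunchSite`, `Date`) and the choice to fill missing payload masses with the column mean are assumptions for illustration, not necessarily the rules applied in the project.

```python
import pandas as pd


def clean_launches(df):
    """Return a cleaned copy of the launch table (column names are hypothetical)."""
    out = df.copy()
    # Handle missing values: replace missing payload masses with the column mean.
    out["PayloadMass"] = out["PayloadMass"].fillna(out["PayloadMass"].mean())
    # Correct inconsistencies: trim and collapse stray whitespace in site names.
    out["LaunchSite"] = (
        out["LaunchSite"].str.strip().str.replace(r"\s+", " ", regex=True)
    )
    # Standardize formats: parse date strings into datetime values.
    out["Date"] = pd.to_datetime(out["Date"])
    return out
```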

EDA

I used SQL and visualization tools like Matplotlib and Seaborn to explore the data, uncovering patterns and key features influencing launch success.
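
To give a flavor of the SQL side of the EDA, here is a small sketch that computes the success rate per launch site. It uses an in-memory SQLite database and a hypothetical `launches` table with `launch_site` and `landed` columns; the database and schema used in the project may differ.

```python
import sqlite3


def success_rate_by_site(rows):
    """rows: iterable of (launch_site, landed) pairs, where landed is 0 or 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE launches (launch_site TEXT, landed INTEGER)")
    conn.executemany("INSERT INTO launches VALUES (?, ?)", rows)
    query = """
        SELECT launch_site,
               COUNT(*) AS launches,
               ROUND(AVG(landed) * 100, 1) AS success_pct
        FROM launches
        GROUP BY launch_site
        ORDER BY launch_site
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Each returned tuple holds the site name, the number of launches, and the percentage of launches that landed successfully.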

Feature Engineering

Based on the EDA, I derived new variables and transformed existing features to better capture their relationships with the target variable and improve the models' predictive power.
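
The sketch below shows what deriving features could look like in pandas. The two features here (a launch year and a heavy-payload flag at the 10,000 kg mark) and the column names are illustrative assumptions; they are not a list of the features actually engineered in the project.

```python
import pandas as pd


def add_features(df, heavy_threshold_kg=10_000):
    """Return a copy of df with two illustrative derived columns."""
    out = df.copy()
    # Derived variable: calendar year of the launch, taken from the date.
    out["Year"] = pd.to_datetime(out["Date"]).dt.year
    # Transformed feature: 1 if the payload exceeds the threshold, else 0.
    out["HeavyPayload"] = (out["PayloadMass"] > heavy_threshold_kg).astype(int)
    return out
```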

Model Development

I developed machine learning models, including logistic regression, decision trees, and random forests, to predict launch success, using cross-validation and grid search to tune hyperparameters and improve performance.
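
As an illustration of grid search with cross-validation, the sketch below tunes a decision tree with scikit-learn's `GridSearchCV`. It runs on synthetic data from `make_classification` as a stand-in for the launch features, and the parameter grid is an assumption chosen for the example rather than the grid used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; the real launch features are not reproduced here.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Hypothetical grid: every combination is scored with 5-fold cross-validation.
param_grid = {"max_depth": [2, 4, 6], "min_samples_leaf": [1, 5]}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=5, scoring="f1"
)
search.fit(X_train, y_train)

# The best model is refit on the full training split, then scored on held-out data.
test_score = search.score(X_test, y_test)
print(search.best_params_, round(test_score, 3))
```

Keeping a held-out test split separate from the cross-validation folds gives an estimate of performance that the tuning step has not seen.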

Dashboard Creation

I built a Dash dashboard to make the analysis accessible and interactive, allowing users to explore the data, view predictions, and gain insights through various visualizations.

Results and Insights

Flight Number vs. Launch Site

The plot shows that success rates differ by launch site: CCAFS LC-40 has a 60% success rate, while KSC LC-39A and VAFB SLC-4E each have a success rate of 77%.

Payload vs. Launch Site

The Payload vs. Launch Site scatter plot shows that no rockets with a payload mass exceeding 10,000 kg were launched from the VAFB SLC-4E launch site.

Success Rate vs. Orbit Type

The Success Rate vs. Orbit Type plot shows that orbits ES-L1 (the first Lagrangian Point), GEO (Geosynchronous Equatorial Orbit), HEO (Highly Elliptical Orbit), and SSO (Sun-synchronous orbit) have the highest success rates compared to the other orbits.

Success Rate by Year

The line chart illustrates a steady increase in the success rate from 2013, peaking in 2020.

The Model

The candidate models were tuned with cross-validation and grid search, then evaluated on accuracy, precision, recall, and F1 score to select the best-performing model for deployment.

The predictive analysis aimed to forecast the outcome of SpaceX launches using features like Payload Mass, Orbit, Launch Site, and Landing Outcome. After preprocessing the data and converting categorical variables with one-hot encoding, three classification models (Decision Tree, Logistic Regression, and K-Nearest Neighbors) were trained and compared on F1 score. The Decision Tree Classifier, optimized with Grid Search Cross-Validation, emerged as the best-performing model, with an improved F1 score of 0.92.
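
The comparison described above can be sketched roughly as follows. The function name `compare_models`, the column names, the added scaling of numeric columns, and the default hyperparameters are all assumptions for illustration; this is not the project's actual training code and will not reproduce the reported score.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier


def compare_models(df, target="Class", categorical=("Orbit", "LaunchSite"),
                   numeric=("PayloadMass",), random_state=0):
    """Train three classifiers on one-hot encoded data; return an F1 score per model."""
    X = df[list(categorical) + list(numeric)]
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state, stratify=y
    )
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=random_state),
        "logistic_regression": LogisticRegression(max_iter=1000),
        "knn": KNeighborsClassifier(),
    }
    scores = {}
    for name, model in models.items():
        # One-hot encode the categorical columns and scale the numeric ones.
        preprocess = ColumnTransformer([
            ("onehot", OneHotEncoder(handle_unknown="ignore"), list(categorical)),
            ("scale", StandardScaler(), list(numeric)),
        ])
        pipeline = make_pipeline(preprocess, model)
        pipeline.fit(X_train, y_train)
        scores[name] = f1_score(y_test, pipeline.predict(X_test))
    return scores
```

Wrapping the encoder and the model in one pipeline means the encoding is learned from the training split only, which avoids leaking information from the test split.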
