Exploratory and explanatory analysis of global life expectancy patterns using multi-source social, demographic, and economic data from WHO, World Bank, and other international datasets.

Davide011/social_data_project

 
 


🌍 Life Expectancy Data Analysis & Visualization

Python Pandas NumPy Matplotlib Seaborn Plotly Jupyter License


🧭 Overview

This project explores global life expectancy trends using data from the World Bank, WHO, and other international sources.
It demonstrates a complete data analytics pipeline, from data collection and cleaning to exploratory data analysis (EDA), feature engineering, and interactive visualization.

By combining statistical reasoning, data storytelling, and human-centered analysis, this work turns global socio-economic data into clear, interpretable insights.


🔍 Project Highlights

1. Data Collection & Preprocessing

  • Sourced datasets from the World Bank Open Data platform.
  • Performed data wrangling, normalization, and missing value handling using pandas and numpy.
  • Applied feature engineering to create meaningful indicators such as GDP per capita and health expenditure ratios.
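The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the toy data, the column names (`gdp_usd`, `population`, `health_exp_usd`), and the `preprocess` helper are all assumptions made for the example, and per-country linear interpolation is just one possible way to handle missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; values and column names are illustrative only,
# not the project's real schema.
raw = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "year": [2000, 2001, 2002, 2000, 2001, 2002],
    "gdp_usd": [1.0e9, np.nan, 1.2e9, 5.0e9, 5.5e9, 6.0e9],
    "population": [1.0e6, 1.0e6, 1.0e6, 2.0e6, 2.0e6, 2.0e6],
    "health_exp_usd": [5.0e7, 5.5e7, 6.0e7, 4.0e8, 4.4e8, 4.8e8],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing GDP values and derive two per-country indicators."""
    out = df.sort_values(["country", "year"]).copy()
    # Assumed strategy: linear interpolation within each country's series,
    # so one country's values never leak into another's gaps.
    out["gdp_usd"] = out.groupby("country")["gdp_usd"].transform(
        lambda s: s.interpolate(limit_direction="both")
    )
    # Engineered features named in the list above.
    out["gdp_per_capita"] = out["gdp_usd"] / out["population"]
    out["health_exp_ratio"] = out["health_exp_usd"] / out["gdp_usd"]
    return out


clean = preprocess(raw)
print(clean[["country", "year", "gdp_per_capita", "health_exp_ratio"]])
```

Grouping by country before interpolating keeps each time series independent, which matters when countries differ by orders of magnitude in GDP.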

2. Exploratory Data Analysis (EDA)

  • Conducted statistical summaries, correlation analysis, and trend exploration.
  • Compared regional patterns of life expectancy with economic and healthcare variables.
  • Combined explorative and explanatory approaches to identify and communicate key insights.
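As a rough sketch of the EDA steps above, the snippet below computes summary statistics, a Pearson correlation matrix, and per-region means with pandas. The data frame, its values, and the region labels are made up for illustration; the project's real indicators and groupings may differ.

```python
import pandas as pd

# Hypothetical sample; values and region labels are illustrative only.
df = pd.DataFrame({
    "region": ["North", "North", "North", "South", "South", "South"],
    "life_expectancy": [80.0, 82.0, 81.0, 60.0, 65.0, 62.0],
    "gdp_per_capita": [40000.0, 45000.0, 42000.0, 2000.0, 3500.0, 2500.0],
    "health_exp_ratio": [0.10, 0.11, 0.09, 0.04, 0.05, 0.05],
})

numeric_cols = ["life_expectancy", "gdp_per_capita", "health_exp_ratio"]

# Statistical summary: count, mean, std, min, quartiles, max per column.
summary = df[numeric_cols].describe()

# Pairwise Pearson correlations between the numeric indicators.
corr = df[numeric_cols].corr()

# Regional comparison: mean of each indicator per region.
regional = df.groupby("region")[numeric_cols].mean()

print(summary)
print(corr.round(2))
print(regional)
```

Correlation here only describes association in the sample; it does not by itself say that higher GDP or health spending causes longer lives.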

3. Data Visualization & Storytelling

  • Developed interactive and static visualizations using matplotlib, seaborn, and plotly.
  • Focused on clarity, accessibility, and data-driven storytelling for both technical and non-technical audiences.
  • Designed visuals that highlight socio-economic contrasts and long-term global health trends.
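A static trend chart of the kind described above might look like the following sketch. It uses matplotlib with made-up data; the `plot_trends` helper, the region names, and the styling choices are assumptions for illustration rather than the project's actual figures.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical long-format data: one row per (region, year).
trend = pd.DataFrame({
    "region": ["North"] * 3 + ["South"] * 3,
    "year": [2000, 2010, 2020] * 2,
    "life_expectancy": [77.0, 79.5, 81.0, 55.0, 60.0, 64.0],
})


def plot_trends(df: pd.DataFrame):
    """Draw one labelled line per region on shared, clearly named axes."""
    fig, ax = plt.subplots(figsize=(7, 4))
    for region, grp in df.groupby("region"):
        ax.plot(grp["year"], grp["life_expectancy"], marker="o", label=region)
    ax.set_xlabel("Year")
    ax.set_ylabel("Life expectancy (years)")
    ax.set_title("Life expectancy by region (illustrative data)")
    ax.legend(title="Region")
    fig.tight_layout()
    return fig, ax


fig, ax = plot_trends(trend)
# fig.savefig("life_expectancy_trends.png", dpi=150)  # uncomment to write the image
```

Explicit axis labels with units and a titled legend are small choices that help non-technical readers interpret the chart without extra context.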

4. Insights & Interpretation

  • Showcases global inequalities in life expectancy and their relationships to economic and social factors.
  • Demonstrates the power of data visualization for communicating complex insights effectively.
  • Emphasizes evidence-based reasoning and interpretability.

🧠 Tools & Technologies

| Category | Tools & Techniques |
| --- | --- |
| Programming & Data Processing | Python, Pandas, NumPy |
| Visualization & Storytelling | Matplotlib, Seaborn, Plotly |
| Statistical & Exploratory Analysis | Correlation, Regression, Feature Engineering |
| Data Sources | World Bank Open Data, Global Development Indicators |
| Soft Skills | Analytical Thinking, Data Storytelling, Insight Communication |

🚀 Live Demo

🔗 Explore the full analysis and interactive visualizations:
Life Expectancy Analysis


🧩 Future Directions

  • Integrating predictive modeling (e.g., regression) to estimate life expectancy trends, and clustering to group countries with similar profiles.
  • Expanding the dataset to include climate, education, and urbanization factors.
  • Building a Streamlit or Dash dashboard for a fully interactive analytical experience.
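To give a sense of what the regression direction could involve, here is a minimal sketch that fits a straight-line trend with `numpy.polyfit` and extrapolates one step ahead. The years and values are invented for the example, and a simple linear fit is only an assumption; a real model would need validation and likely more predictors.

```python
import numpy as np

# Hypothetical, made-up series for a single country.
years = np.array([2000, 2005, 2010, 2015, 2020], dtype=float)
life_exp = np.array([66.0, 67.5, 69.0, 70.5, 72.0])

# Least-squares fit of life_exp = slope * year + intercept.
slope, intercept = np.polyfit(years, life_exp, deg=1)


def predict(year: float) -> float:
    """Extrapolate the fitted linear trend to a given year."""
    return slope * year + intercept


print(f"Estimated gain per year: {slope:.2f}")
print(f"Projected life expectancy in 2025: {predict(2025):.1f}")
```

Linear extrapolation is only reasonable over short horizons, since life expectancy gains tend to slow down and can reverse after shocks.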

👩‍💻 Authors

- Davide Venuto
- Jakob Boëtius Andersen
- Huayuan Song


© 2025 — Licensed under the MIT License

Languages

  • HTML 76.7%
  • Jupyter Notebook 23.2%
  • Python 0.1%