This week you'll deepen your skills with Pandas, focusing on advanced data cleaning and transformation techniques. You'll learn to combine datasets, work with time series, and perform powerful aggregations to answer complex questions about social issues.
By the end of this week, you will be able to:
- Join, merge, and concatenate DataFrames to combine datasets
- Work with time series data (dates, resampling, rolling windows)
- Group and aggregate data to summarize patterns
- Clean and transform data for analysis (handling duplicates, outliers, reshaping)
- Apply advanced filtering and sorting
- Use Pandas to answer questions about trends and relationships in social impact data
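To make the first few objectives concrete, here is a minimal sketch of merging, concatenating, and de-duplicating DataFrames. The tables, column names, and values are invented for illustration and do not come from the course datasets.

```python
import pandas as pd

# Hypothetical toy data: country labels and values are made up.
education = pd.DataFrame({
    "country": ["A", "B", "C"],
    "literacy_rate": [91.0, 78.5, 85.2],
})
health = pd.DataFrame({
    "country": ["A", "B", "D"],
    "life_expectancy": [74.1, 68.3, 71.0],
})

# Inner merge: keep only countries that appear in both tables.
inner = education.merge(health, on="country", how="inner")

# Outer merge: keep every country; gaps are filled with NaN.
# indicator=True adds a "_merge" column showing where each row came from.
outer = education.merge(health, on="country", how="outer", indicator=True)

# Concatenate: stack rows from a second (hypothetical) education extract.
more_education = pd.DataFrame({
    "country": ["C", "E"],
    "literacy_rate": [85.2, 66.0],
})
stacked = pd.concat([education, more_education], ignore_index=True)

# Country C now appears twice with identical values; drop the exact repeat.
deduped = stacked.drop_duplicates().reset_index(drop=True)

print(inner)
print(outer)
print(deduped)
```

Choosing `how="inner"` or `how="outer"` changes which rows survive, so it is worth checking row counts before and after every merge.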
- Lecture: Working with DataFrames II: Cleaning & Transforming Data
- Tutorial: DataFrames II: Cleaning & Transforming Data
Assignment: Create a notebook that demonstrates joining, merging, and transforming real-world datasets (e.g., education and health data), including time series analysis and groupby operations.
Requirements:
- Load two or more related datasets (e.g., education and health indicators)
- Merge or join DataFrames to combine information
- Clean and transform data (handle duplicates, missing values, reshape as needed)
- Work with time series (parse dates, resample, plot trends)
- Group and aggregate data to answer at least 3 questions
- Document your process and findings with clear explanations
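The time series and aggregation requirements could be approached along these lines. This is only a sketch under assumed column names (`date`, `region`, `enrollment`) with made-up values; your own datasets will differ, and plotting is left out to keep the example dependency-free.

```python
import pandas as pd

# Hypothetical monthly records; every value here is invented.
records = pd.DataFrame({
    "date": ["2020-01-15", "2020-02-15", "2020-03-15", "2020-04-15",
             "2020-01-15", "2020-02-15", "2020-03-15", "2020-04-15"],
    "region": ["North"] * 4 + ["South"] * 4,
    "enrollment": [100, 110, 120, 130, 80, 85, 90, 95],
})

# Parse the date strings into real datetime values.
records["date"] = pd.to_datetime(records["date"])

# Group and aggregate: mean and max enrollment per region.
by_region = records.groupby("region")["enrollment"].agg(["mean", "max"])

# Resampling needs a datetime index.
series = records.set_index("date")["enrollment"]

# Total enrollment per month ("MS" = month start) across both regions.
monthly = series.resample("MS").sum()

# Total enrollment per quarter ("QS" = quarter start).
quarterly = series.resample("QS").sum()

# Two-month rolling average of the monthly totals; the first value is NaN
# because the window is not yet full.
rolling = monthly.rolling(window=2).mean()

print(by_region)
print(monthly)
print(quarterly)
print(rolling)
```

Each printed result can serve as the starting point for one of your three questions, for example "which region has the higher average enrollment?"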
- Merging, Joining, and Concatenating
- Working with Time Series
- GroupBy: Aggregating and Transforming Data
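The difference between aggregating and transforming in a groupby is easy to mix up, so here is a small sketch. The `district` and `score` columns and their values are hypothetical.

```python
import pandas as pd

# Hypothetical test scores for two districts.
scores = pd.DataFrame({
    "district": ["X", "X", "Y", "Y"],
    "score": [60.0, 80.0, 50.0, 70.0],
})

# Aggregation: collapses each group to a single value (one row per district).
district_mean = scores.groupby("district")["score"].mean()

# Transformation: returns a result the same length as the input,
# with each row receiving its own group's mean.
scores["district_mean"] = scores.groupby("district")["score"].transform("mean")

# Useful for comparing each row with its group, e.g. deviation from the mean.
scores["diff_from_mean"] = scores["score"] - scores["district_mean"]

print(district_mean)
print(scores)
```

Aggregation suits summary tables, while transformation suits adding group-level context back onto the original rows.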
Next Week: Week 4: Data Visualization I: Telling Stories with Plotly
Previous Week: Week 2: Working with DataFrames (Pandas Basics)