Explore 75-Year Projection Datasets for Social Security Reform Analysis
Background
The current Social Security taxation reform analysis in this repository is limited to 10-year projections (2026-2035). New PolicyEngine datasets will soon enable 75-year projections, which are critical for:
- Long-term solvency analysis
- Intergenerational impact assessment
- Alignment with Social Security Trustees' 75-year projection window
Objective
Create a data exploration notebook (similar to analysis-notebooks PR #88) to:
- Validate 75-year dataset coverage across the projection period
- Explore Social Security-specific variables relevant to this analysis
- Document data quality and limitations for long-term projections
- Identify any gaps that would affect reform impact calculations
Variables to Explore
Core Social Security Variables
- taxable_social_security - Portion of SS benefits subject to income tax (critical for Options 1-4, 8)
- social_security - Total Social Security benefits received
- social_security_retirement - Retirement benefits specifically
- social_security_disability - SSDI benefits
- social_security_dependents - Dependent/survivor benefits
- age - Age distribution over 75 years (demographic shifts)
- is_senior - Senior population trends
Related Tax Variables
- adjusted_gross_income - AGI trends over 75 years
- taxable_income - Taxable income base
- income_tax - Federal income tax (for revenue calculations)
- standard_deduction - Changes over time
- itemized_deductions - Itemization patterns
Employment/Payroll Variables (for Options 5-6 Roth-Style Swap)
- employment_income - Wage base for employer-side payroll tax
- fica_ss_tax - Current Social Security payroll tax
- employer_social_security_tax - Employer portion (relevant to Options 5-6)
Analysis Structure
Following the RI dataset exploration approach:
1. Dataset Overview
- Load dataset: Microsimulation(dataset="<75-year-dataset-path>")
- Check temporal coverage (2026-2100)
- Validate household/person counts across years
- Document any missing years or data gaps
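The coverage check above can be sketched as a small helper. This is a minimal sketch, assuming the years present in the dataset have already been collected into a list; how the 75-year dataset will expose its years is not yet known, so `available_years` and `find_missing_years` are hypothetical names.

```python
def find_missing_years(available_years, start=2026, end=2100):
    """Return the sorted years in [start, end] that are absent from available_years."""
    expected = set(range(start, end + 1))
    return sorted(expected - set(available_years))


# Hypothetical example: a dataset that skips 2041 and stops at 2099.
available = [year for year in range(2026, 2100) if year != 2041]
missing = find_missing_years(available)
print(missing)  # prints [2041, 2100]
```

An empty result means every year of the 2026-2100 window is present; any years returned belong in the notebook's list of documented gaps.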
2. Social Security Benefit Distribution
# For each decade: 2030, 2040, 2050, 2060, 2070, 2080, 2090, 2100
for year in [2030, 2040, 2050, 2060, 2070, 2080, 2090, 2100]:
    taxable_ss = sim.calculate("taxable_social_security", period=year, map_to="household")
    total_ss = sim.calculate("social_security", period=year, map_to="household")
    # Distribution analysis
    # - Median/percentiles
    # - Share of households with SS income
    # - Taxable vs non-taxable portion
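The three distribution statistics named in the comments could be computed along these lines. This is only a sketch: it assumes the household values and household weights have been extracted into plain Python sequences, and the helper names and sample numbers are hypothetical, not part of PolicyEngine.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 <= q <= 1)."""
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


def weighted_share(flags, weights):
    """Weighted share of records whose flag is true."""
    return sum(w for flag, w in zip(flags, weights) if flag) / sum(weights)


def aggregate_ratio(numerators, denominators, weights):
    """Weighted total of numerators divided by weighted total of denominators."""
    top = sum(n * w for n, w in zip(numerators, weights))
    bottom = sum(d * w for d, w in zip(denominators, weights))
    return top / bottom


# Hypothetical household data for a single year.
total_ss = [0, 0, 20_000, 30_000]
taxable_ss = [0, 0, 5_000, 12_000]
weights = [100, 200, 300, 400]

median_ss = weighted_quantile(total_ss, weights, 0.5)               # 20000
share_with_ss = weighted_share([v > 0 for v in total_ss], weights)  # 0.7
taxable_portion = aggregate_ratio(taxable_ss, total_ss, weights)    # 0.35
```

The quantile here uses a simple step definition without interpolation, which is a design choice made for readability; the notebook may prefer whatever weighted-statistics helpers the dataset tooling already provides.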
3. Age Distribution Trends
- Senior population growth (65+)
- Working-age population (18-64)
- Demographic dependency ratios
- Validate against SSA Trustees projections if possible
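As an illustration of the dependency-ratio check, the sketch below divides the weighted 65+ population by the weighted 18-64 population, using the age bands listed above. It assumes person-level ages and weights are available as plain sequences; the function name and the sample data are hypothetical.

```python
def old_age_dependency_ratio(ages, weights, working_min=18, senior_min=65):
    """Weighted population aged senior_min+ divided by weighted population aged working_min to senior_min-1."""
    seniors = sum(w for age, w in zip(ages, weights) if age >= senior_min)
    working = sum(w for age, w in zip(ages, weights) if working_min <= age < senior_min)
    if working == 0:
        raise ValueError("no working-age population in the sample")
    return seniors / working


# Hypothetical person records: one child, two working-age adults, two seniors.
ages = [10, 30, 50, 70, 80]
weights = [1000, 2000, 2000, 500, 500]
ratio = old_age_dependency_ratio(ages, weights)
print(ratio)  # prints 0.25
```

Computing this ratio for each decade gives a single series that can be compared with the demographic assumptions in the SSA Trustees report.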
4. Income Component Trends
- How AGI composition changes over 75 years
- Growth rates of different income sources
- Real vs nominal values (inflation adjustments)
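For the real-versus-nominal comparison, a nominal amount in year t converts to base-year dollars as nominal × index(base) / index(t), and an annualized growth rate over n years is (end / start)^(1/n) − 1. The sketch below applies both formulas; the price index values are made up for illustration, and the source of the actual index for the 75-year dataset is an open question.

```python
def to_real(nominal, price_index, year, base_year):
    """Convert a nominal amount in `year` to `base_year` dollars."""
    return nominal * price_index[base_year] / price_index[year]


def annualized_growth(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Hypothetical price index (not real data): 25% cumulative inflation by 2036.
price_index = {2026: 100.0, 2036: 125.0}

real_agi = to_real(50_000, price_index, year=2036, base_year=2026)
print(real_agi)  # prints 40000.0

nominal_growth = annualized_growth(40_000, 50_000, years=10)
real_growth = annualized_growth(40_000, real_agi, years=10)
print(real_growth)  # prints 0.0
```

In this made-up case nominal AGI grows by about 2.3% a year while real AGI is flat, which is the kind of divergence that matters over a 75-year horizon.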
5. Reform-Specific Variables
- For Option 4 ($500 tax credit): Eligible population size
- For Options 5-6 (Roth Swap): Employment income base trends
- For Option 7 (Senior deduction): Senior population claiming the standard deduction
6. Data Quality Checks
- Missing values by variable and year
- Outliers and extreme values
- Consistency with known demographic trends
- Comparison with SSA Trustees intermediate assumptions
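The first two checks in this list might look like the following for a single variable in a single year. This is a sketch under stated assumptions: missing values are taken to appear as `None` or NaN, and outliers are flagged with the common 1.5 × IQR rule, which is a choice made here for illustration rather than a requirement of the analysis.

```python
import math
import statistics


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_missing(values):
    """Number of missing entries in values."""
    return sum(1 for value in values if is_missing(value))


def iqr_outliers(values, k=1.5):
    """Non-missing values lying more than k * IQR outside the quartiles."""
    clean = [value for value in values if not is_missing(value)]
    q1, _, q3 = statistics.quantiles(clean, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [value for value in clean if value < low or value > high]


# Hypothetical values with two missing entries and one extreme value.
sample = [10, 12, 11, 13, 12, 11, 500, None, float("nan")]
print(count_missing(sample))  # prints 2
print(iqr_outliers(sample))   # prints [500]
```

These helpers are unweighted; for heavily skewed variables such as benefit amounts, a percentile-based threshold on weighted data may be more informative.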
7. Summary Tables
Export CSV summaries similar to RI exploration:
- 75_year_ss_variables_summary_weighted.csv
- 75_year_ss_variables_summary_unweighted.csv
- Include breakdowns by decade
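A minimal sketch of the weighted and unweighted summary export, using only the standard library. The record layout `(year, variable, value, weight)`, the column names, and the choice of the mean as the summary statistic are assumptions for illustration; the RI exploration notebook may use a different layout.

```python
import csv
import io


def summarize(records, weighted=True):
    """Mean of each variable per year from (year, variable, value, weight) records."""
    totals = {}
    for year, variable, value, weight in records:
        w = weight if weighted else 1.0
        acc = totals.setdefault((year, variable), [0.0, 0.0])
        acc[0] += value * w
        acc[1] += w
    return [
        {"year": year, "variable": variable, "mean": num / den}
        for (year, variable), (num, den) in sorted(totals.items())
    ]


def write_summary(rows, handle):
    """Write summary rows as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=["year", "variable", "mean"])
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical records for two decade years.
records = [
    (2030, "social_security", 20_000.0, 1.0),
    (2030, "social_security", 30_000.0, 3.0),
    (2040, "social_security", 40_000.0, 2.0),
]

weighted_rows = summarize(records, weighted=True)
unweighted_rows = summarize(records, weighted=False)

buffer = io.StringIO()  # stands in for an opened CSV file
write_summary(weighted_rows, buffer)
csv_lines = buffer.getvalue().splitlines()
```

In the notebook, the `StringIO` buffer would be replaced by files opened with `open(path, "w", newline="")` for each of the two file names listed above.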
Deliverable
A Jupyter notebook (analysis/75_year_dataset_exploration.ipynb) that:
- Mirrors the structure of ri_dataset_exploration.ipynb
- Focuses on Social Security and age-related variables
- Produces summary tables for documentation
- Identifies any data limitations for 75-year projections
- Does NOT run during CI (place in analysis/ folder, not jupyterbook/)
Related Work
- Current 10-year analysis: analysis/policy-impacts-dynamic.ipynb
- RI dataset exploration: PolicyEngine/analysis-notebooks#88
- This exploration will enable extending the current analysis to the full 75-year window
Notes
- Timing: Wait for dataset availability announcement
- Location: Create in analysis/ folder (not jupyterbook/) to avoid CI timeouts
- Execution time: May be long (expect similar runtime to 10-year dynamic calculations)
- Commit outputs: Execute notebook and commit with outputs for documentation