mortality-longevity-analysis

Actuarial mortality and longevity analysis project using ONS mortality data, life tables, and a transparent Lee-Carter forecasting framework.

Project Overview

This repository implements an end-to-end mortality analytics workflow:

Ingest ONS mortality and projection datasets from local Excel files.
Normalize data into a consistent long format.
Fit a simple Lee-Carter model by sex.
Generate baseline and stress-tested longevity scenarios.
Build life tables and derive life expectancy metrics (e0, e65).
Backtest model performance on holdout years.
Compare custom projections to ONS principal/high/low variants.
Produce publication-ready tables and figures.

Actuarial Relevance

Mortality forecasting sits at the core of pricing, reserving, and capital for long-duration liabilities. This project is structured to support practical actuarial use cases:

Pension scheme funding projections and de-risking analysis.
Annuity pricing and profitability sensitivity testing.
Longevity trend monitoring and assumption governance.
Stress/scenario analysis for solvency and risk committees.

Data Sources (Placeholders)

Update these placeholders with exact ONS publication names, release dates, and links used in your analysis.

ONS_DATASET_NAME_OBSERVED_QX_SINGLE_AGE
ONS_DATASET_NAME_PRINCIPAL_PROJECTION_QX
ONS_DATASET_NAME_HIGH_LIFE_EXPECTANCY_VARIANT_QX
ONS_DATASET_NAME_LOW_LIFE_EXPECTANCY_VARIANT_QX

Expected raw files are stored in data/raw/ and mapped in configs/default.yaml.

Methodology

1. Data ingestion and normalization

Dataset-specific parsers handle differing ONS sheet names/layouts.
All datasets are normalized to:
- source, scenario, geography, sex, year, age, qx
Validation checks:
- Required columns
- Age bounds and data type checks
- qx bounded in [0, 1]
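
The checks above could be sketched roughly as follows. This is an illustrative helper, not the project's parser code: the function name `validate_normalized_qx` and the default age bounds (0 to 120) are assumptions made for the example.

```python
import pandas as pd

REQUIRED_COLUMNS = ["source", "scenario", "geography", "sex", "year", "age", "qx"]


def validate_normalized_qx(df: pd.DataFrame, min_age: int = 0, max_age: int = 120) -> list[str]:
    """Return a list of validation problems; an empty list means the frame passed."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        # Without the required columns the remaining checks cannot run.
        return [f"missing columns: {missing}"]

    problems = []
    for col in ("year", "age"):
        if not pd.api.types.is_integer_dtype(df[col]):
            problems.append(f"{col} is not an integer column")
    if not pd.api.types.is_numeric_dtype(df["qx"]):
        problems.append("qx is not numeric")
        return problems

    # Age bounds are hypothetical defaults; NaN values fail `between` and are flagged.
    if pd.api.types.is_numeric_dtype(df["age"]) and not df["age"].between(min_age, max_age).all():
        problems.append(f"age outside [{min_age}, {max_age}]")
    if not df["qx"].between(0.0, 1.0).all():
        problems.append("qx outside [0, 1]")
    return problems
```

Returning a list of problems rather than raising on the first failure makes it easier to report every issue in a newly downloaded workbook at once.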

2. Transformations

Conversion between death probability (qx) and central death rate (mx), assuming a constant force of mortality within each year of age:
- mx = -log(1 - qx)
- qx = 1 - exp(-mx)
Mortality improvement calculations by age/year.
Train/holdout split helpers for model validation.
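
A minimal sketch of the two conversions and of an improvement-rate calculation is shown below. The function names are hypothetical and are not the ones exported by `mortality_longevity.transform`. The improvement rate is taken here as 1 - m(x,t) / m(x,t-1), which is one common definition and an assumption for this example; it also assumes the years are consecutive within each sex and age.

```python
import numpy as np
import pandas as pd


def prob_to_rate(qx):
    """mx = -log(1 - qx); log1p keeps precision for very small qx."""
    return -np.log1p(-np.asarray(qx, dtype=float))


def rate_to_prob(mx):
    """qx = 1 - exp(-mx); expm1 keeps precision for very small mx."""
    return -np.expm1(-np.asarray(mx, dtype=float))


def improvement_rates(df: pd.DataFrame) -> pd.DataFrame:
    """Add an 'improvement' column: 1 - m(x,t) / m(x,t-1) by sex and age."""
    out = df.sort_values(["sex", "age", "year"]).copy()
    previous = out.groupby(["sex", "age"])["mx"].shift(1)
    # The first year of each (sex, age) series has no predecessor and stays NaN.
    out["improvement"] = 1.0 - out["mx"] / previous
    return out
```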

3. Lee-Carter model

By sex, fit:

log(m_{x,t}) = a_x + b_x * k_t + error_{x,t}

Estimation uses:

a_x: mean over years of log mortality at each age.
b_x, k_t: leading singular component from an SVD of the centered log mortality matrix (log mortality minus a_x).
Identifiability constraints:
- sum(b_x) = 1
- sum(k_t) = 0
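
The estimation steps above can be sketched with NumPy as follows. This is an illustrative version for a single sex, not the code in `lee_carter.py`; the function name and the ages-by-years matrix layout are assumptions.

```python
import numpy as np


def fit_lee_carter(log_mx: np.ndarray):
    """Fit a_x, b_x, k_t from a matrix of log central death rates (rows = ages, columns = years)."""
    # a_x is the time-average of log mortality at each age.
    a_x = log_mx.mean(axis=1)
    centered = log_mx - a_x[:, None]

    # The leading singular component gives the rank-one approximation b_x * k_t.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    b_x = u[:, 0]
    k_t = s[0] * vt[0, :]

    # Rescale so that sum(b_x) = 1; the product b_x * k_t is unchanged.
    # (Assumes sum(b_x) is not close to zero before rescaling.)
    scale = b_x.sum()
    b_x = b_x / scale
    k_t = k_t * scale
    return a_x, b_x, k_t
```

Because each row of the centered matrix sums to zero, the fitted k_t sums to zero up to floating-point error, so the second constraint needs no separate adjustment in this sketch.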

k_t forecasting:

Random walk with drift (default).
Optional ARIMA(0,1,0) with drift via statsmodels (the same model specification, fitted through statsmodels).
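
As a rough illustration of the default forecast, the sketch below estimates the drift as the mean first difference of k_t and extends the series linearly. The function name and return values are assumptions. The standard error reflects innovation variance only and ignores uncertainty in the drift estimate itself.

```python
import numpy as np


def forecast_rwd(k_t, years_ahead: int):
    """Random walk with drift: central path, innovation-only standard errors, and drift."""
    k_t = np.asarray(k_t, dtype=float)
    # Mean of first differences, which telescopes to (last - first) / (T - 1).
    drift = (k_t[-1] - k_t[0]) / (len(k_t) - 1)
    steps = np.arange(1, years_ahead + 1)
    central = k_t[-1] + drift * steps

    # Innovation standard deviation from the de-drifted differences (needs at least 3 points).
    residuals = np.diff(k_t) - drift
    sigma = residuals.std(ddof=1)
    std_err = sigma * np.sqrt(steps)
    return central, std_err, drift
```

Projected rates then follow from the model equation as exp(a_x + b_x * k_t) for each forecast year.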

4. Life tables and longevity metrics

From projected qx, complete life tables are built (lx, dx, Lx, Tx, ex) with an explicit closing-age assumption.

Outputs include:

Life expectancy at birth (e0)
Life expectancy at age 65 (e65)
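
A minimal life table sketch is given below. It is not the implementation in `life_table.py`; the function name, the mid-year deaths assumption at every age (including infancy), and the particular closing rule are assumptions chosen for the example. The closing rule treats the final age as an open interval with a constant force of mortality derived from the last input qx, which must be below 1.

```python
import numpy as np
import pandas as pd


def build_life_table(qx, radix: float = 100_000.0) -> pd.DataFrame:
    """Build lx, dx, Lx, Tx, ex from single-year qx for ages 0..closing age."""
    q = np.asarray(qx, dtype=float).copy()
    # Constant-force rate at the closing age, taken from the input qx.
    m_close = -np.log1p(-q[-1])
    # Everyone alive at the closing age dies within the open interval.
    q[-1] = 1.0

    lx = radix * np.concatenate(([1.0], np.cumprod(1.0 - q[:-1])))
    dx = lx * q
    # Person-years lived, with deaths assumed to occur mid-year on average.
    Lx = lx - 0.5 * dx
    # Open closing interval: expected remaining lifetime is 1 / m_close.
    Lx[-1] = lx[-1] / m_close
    Tx = Lx[::-1].cumsum()[::-1]
    ex = Tx / lx
    return pd.DataFrame(
        {"qx": q, "lx": lx, "dx": dx, "Lx": Lx, "Tx": Tx, "ex": ex},
        index=pd.RangeIndex(len(q), name="age"),
    )
```

With input covering ages 0 up to the closing age, `table.loc[0, "ex"]` gives e0 and `table.loc[65, "ex"]` gives e65.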

5. Scenario analysis and backtesting

Implemented scenarios:

Baseline
Faster mortality improvement
Slower mortality improvement
Temporary mortality shock
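
One simple way to express these scenarios is to scale the k_t drift for the faster and slower cases and to apply a multiplicative uplift to projected rates in selected years for the shock. The sketch below illustrates that idea only; the function names, the 1.25 and 0.75 speed factors, and the 10% uplift are hypothetical values and are not taken from `scenarios.py`.

```python
import numpy as np


def scenario_k_paths(k_last: float, drift: float, years_ahead: int, speed_factors=None):
    """Return a dict of k_t paths, one per scenario, by scaling the drift."""
    steps = np.arange(1, years_ahead + 1)
    # Illustrative factors: with a negative drift, a factor above 1 means faster improvement.
    factors = speed_factors or {
        "baseline": 1.0,
        "faster_improvement": 1.25,
        "slower_improvement": 0.75,
    }
    return {name: k_last + factor * drift * steps for name, factor in factors.items()}


def apply_temporary_shock(mx, shock_years, uplift: float = 0.10):
    """Scale projected rates (rows = ages, columns = years) in the given column indices."""
    shocked = np.array(mx, dtype=float, copy=True)
    shocked[:, shock_years] *= 1.0 + uplift
    return shocked
```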

Backtest framework:

Fit on a training window.
Evaluate on holdout years versus observed mortality and life expectancy.
Compare custom projections with ONS principal/high/low variants when available.
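
The split-and-evaluate step might look roughly like the sketch below. The helper names and the choice of log-scale error metrics are assumptions for illustration; `run_backtest` in `backtest.py` may compute different diagnostics.

```python
import numpy as np
import pandas as pd


def split_train_holdout(df: pd.DataFrame, train_end_year: int):
    """Split long-format data into years up to and after train_end_year."""
    train = df[df["year"] <= train_end_year]
    holdout = df[df["year"] > train_end_year]
    return train, holdout


def holdout_errors(observed: pd.DataFrame, forecast: pd.DataFrame) -> pd.DataFrame:
    """Mean signed and mean absolute log-rate error by sex and year."""
    merged = observed.merge(forecast, on=["sex", "year", "age"], suffixes=("_obs", "_fc"))
    # Positive values mean the forecast overstates mortality.
    log_err = np.log(merged["mx_fc"]) - np.log(merged["mx_obs"])
    return (
        merged.assign(log_err=log_err, abs_log_err=log_err.abs())
        .groupby(["sex", "year"], as_index=False)
        .agg(mean_log_error=("log_err", "mean"), mae_log=("abs_log_err", "mean"))
    )
```

Errors are measured on the log scale so that young and old ages contribute comparably despite rates differing by orders of magnitude.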

Repository Structure

mortality-longevity-analysis/
├── configs/
│   └── default.yaml
├── data/
│   ├── raw/
│   ├── interim/
│   └── processed/
├── outputs/
│   ├── figures/
│   └── tables/
├── reports/
│   └── project_report.md
├── src/mortality_longevity/
│   ├── backtest.py
│   ├── config.py
│   ├── data_download.py
│   ├── data_parse.py
│   ├── lee_carter.py
│   ├── life_table.py
│   ├── plots.py
│   ├── scenarios.py
│   └── transform.py
├── tests/
├── Makefile
└── pyproject.toml

Setup Instructions

Prerequisites

Python 3.10+
Local ONS Excel files placed in data/raw/

Install

python -m pip install -e ".[dev]"
pre-commit install

Run quality checks

make check

How To Run The Pipeline

1. Ingest and normalize ONS data

python - << 'PY'
from pathlib import Path
from mortality_longevity.data_download import ingest_ons_qx

output_path = ingest_ons_qx(Path("configs/default.yaml"))
print(f"Normalized dataset written to: {output_path}")
PY

2. Generate scenarios and save summary tables

python - << 'PY'
import pandas as pd
from mortality_longevity.transform import qx_to_mx
from mortality_longevity.scenarios import generate_standard_scenarios, save_scenario_summary_tables

normalized = pd.read_csv("data/interim/ons_qx_normalized.csv")
observed = normalized.loc[normalized["scenario"] == "observed", ["sex", "year", "age", "qx"]].copy()
observed["mx"] = qx_to_mx(observed["qx"])

scenario_projection = generate_standard_scenarios(observed[["sex", "year", "age", "mx"]], years_ahead=30)
paths = save_scenario_summary_tables(scenario_projection)
print(paths)
PY

3. Run backtest and write holdout diagnostics

python - << 'PY'
import pandas as pd
from mortality_longevity.transform import qx_to_mx
from mortality_longevity.backtest import run_backtest

normalized = pd.read_csv("data/interim/ons_qx_normalized.csv")
observed = normalized.loc[normalized["scenario"] == "observed", ["sex", "year", "age", "qx"]].copy()
observed["mx"] = qx_to_mx(observed["qx"])

result = run_backtest(observed[["sex", "year", "age", "mx"]], train_end_year=2015)
print(result.saved_tables)
PY

4. (Optional) Compare with ONS variants

python - << 'PY'
import pandas as pd
from mortality_longevity.transform import qx_to_mx
from mortality_longevity.scenarios import generate_standard_scenarios
from mortality_longevity.backtest import compare_custom_projection_with_ons_variants

normalized = pd.read_csv("data/interim/ons_qx_normalized.csv")
observed = normalized.loc[normalized["scenario"] == "observed", ["sex", "year", "age", "qx"]].copy()
observed["mx"] = qx_to_mx(observed["qx"])

custom = generate_standard_scenarios(observed[["sex", "year", "age", "mx"]], years_ahead=30)
variants = normalized.loc[
    normalized["scenario"].isin(["projection_principal", "projection_high_life_expectancy", "projection_low_life_expectancy"])
].copy()

tables = compare_custom_projection_with_ons_variants(
    custom_projection=custom,
    ons_variant_data=variants,
)
print({k: v.shape for k, v in tables.items()})
PY

Key Outputs

Normalized input data:
- data/interim/ons_qx_normalized.csv (or Parquet)
Summary tables (outputs/tables/):
- Scenario life expectancy summary
- Scenario mortality summary
- Backtest mortality/life expectancy summaries
- Custom vs ONS comparison summaries
Figures (outputs/figures/):
- Standardized naming format:
  - mortality_longevity_<plot_name>[_qualifiers].png

Limitations

Lee-Carter assumes a time-invariant age pattern of improvement (b_x) and a smooth, roughly linear period trend (k_t).
One-factor structure may miss cohort effects and cause-of-death shifts.
Extreme-age estimates can be volatile due to sparse exposure.
Short or noisy data windows can destabilize k_t drift estimates.
Temporary shocks (for example, pandemic years) are hard to model under a constant-drift assumption and can bias the drift estimate if they fall inside the fitting window.

Next Steps

Add exposure-weighted fitting and diagnostics.
Introduce cohort-aware and multi-factor mortality models.
Add bootstrap/parameter uncertainty around longevity outputs.
Extend backtests with rolling-origin evaluation.
Package a single CLI command for full pipeline execution.
