euanmacintyre/model-risk-validation-lab
Retail Credit PD Model Risk Validation Lab

Portfolio project focused on junior model risk analyst responsibilities in banking: model comparison, independent-style validation, drift monitoring, and stress testing.

Project Objective

Build a practical model risk validation workflow for a retail credit Probability of Default (PD) model, using a clear end-to-end process:

  • ingest data
  • train champion and challenger models
  • perform independent-style validation
  • monitor drift
  • run stress scenarios
  • produce governance-style outputs

Why This Is Relevant to Model Risk Management

Model Risk Management (MRM) in banking is not only about predictive accuracy. It also requires:

  • transparent model choices
  • independent performance and calibration review
  • ongoing monitoring for population and score drift
  • stress testing to understand behavior under adverse conditions

This repository is structured around that lifecycle rather than only model development.

Results Snapshot

  • Validation (Champion vs Challenger): Validation ROC curve. The challenger Random Forest shows stronger discrimination than the champion Logistic Regression on the holdout ROC.
  • Monitoring (Drift Review): Monitoring PSI chart. The recession-like monitoring sample shows AMBER drift, with PSI increases in both features and model scores.
  • Stress Testing (Scenario Impact): Stress predicted default rates chart. Predicted default rates increase in all scenarios, with the largest uplift under the severe recession case.

Dataset

  • Real public dataset: UCI Default of Credit Card Clients (dataset id 350) via ucimlrepo
  • Raw copy saved to: data/raw/uci_credit_card_default_clean_raw.csv
  • Modelling-ready copy saved to: data/processed/uci_credit_card_default_modelling.csv
  • Monitoring scenario samples saved to: data/monitoring/
  • Target column standardized to: default_flag
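The ingestion step can be sketched as follows. `fetch_ucirepo(id=350)` is the real ucimlrepo call, while the `standardise_target` helper below is a hypothetical illustration of the target-renaming step, not the repo's actual code:

```python
import pandas as pd

def standardise_target(features: pd.DataFrame, targets: pd.DataFrame) -> pd.DataFrame:
    """Join features and target into one modelling frame, renaming the
    (single) target column to the repo's standard name, default_flag."""
    if targets.shape[1] != 1:
        raise ValueError("expected exactly one target column")
    target = targets.rename(columns={targets.columns[0]: "default_flag"})
    return pd.concat(
        [features.reset_index(drop=True), target.reset_index(drop=True)], axis=1
    )

# With network access, the raw data could be fetched and saved like:
#   from ucimlrepo import fetch_ucirepo
#   ds = fetch_ucirepo(id=350)
#   df = standardise_target(ds.data.features, ds.data.targets)
#   df.to_csv("data/raw/uci_credit_card_default_clean_raw.csv", index=False)
```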

Modelling Approach (Champion vs Challenger)

  • Champion: Logistic Regression
    Chosen as a conservative baseline that is easy to explain in governance settings.

  • Challenger: Random Forest
    Used to test whether additional non-linearity improves holdout performance, with explicit trade-off discussion around explainability.

Both models are trained and evaluated on the same holdout split for fair comparison.
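A minimal sketch of the champion/challenger comparison on a shared holdout split, with the validation metrics reported later (AUC, KS, Brier). The synthetic data and hyperparameters here are illustrative, not the repo's actual settings:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the modelling-ready credit data.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.5, size=2000) > 1.2).astype(int)

# One stratified holdout split, shared by both models for a fair comparison.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

champion = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
challenger = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_tr, y_tr)

for name, model in [("champion", champion), ("challenger", challenger)]:
    p = model.predict_proba(X_te)[:, 1]
    fpr, tpr, _ = roc_curve(y_te, p)  # KS is the max gap between TPR and FPR
    print(f"{name}: AUC={roc_auc_score(y_te, p):.3f} "
          f"KS={np.max(tpr - fpr):.3f} Brier={brier_score_loss(y_te, p):.3f}")
```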

Validation, Monitoring, and Stress Testing

  • Validation

    • ROC AUC, KS, Brier score, confusion matrix
    • calibration table and score distribution review
    • basic data integrity checks
    • simple sensitivity analysis with traffic-light outcomes
  • Monitoring

    • PSI on selected input variables and model scores
    • drift flags (GREEN / AMBER / RED)
    • monitoring summary table and diagnostic charts
  • Stress Testing

    • three scenarios: mild recession, severe recession, consumer leverage shock
    • transparent variable shifts (balances, payments, delinquency migration)
    • impact on predicted default rates, score distributions, and key metrics
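The PSI calculation behind the drift flags can be sketched as below. The GREEN/AMBER/RED cut-offs shown (0.10 and 0.25) are a common industry convention assumed for illustration; the repo's thresholds may differ:

```python
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, n_bins: int = 10) -> float:
    """Population Stability Index between a baseline sample and a monitoring
    sample, using bins derived from the baseline's quantiles."""
    edges = np.quantile(expected, np.linspace(0, 1, n_bins + 1))

    def bin_pct(x: np.ndarray) -> np.ndarray:
        # Assign each value to a bin; clip so extremes land in the end bins.
        idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
        return np.bincount(idx, minlength=n_bins) / len(x)

    e_pct, a_pct = bin_pct(expected), bin_pct(actual)
    eps = 1e-6  # avoid log(0) and division by zero for empty bins
    e_pct, a_pct = np.clip(e_pct, eps, None), np.clip(a_pct, eps, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

def drift_flag(value: float) -> str:
    """Conventional PSI traffic-light thresholds."""
    return "GREEN" if value < 0.10 else "AMBER" if value < 0.25 else "RED"

# Example: compare baseline scores against a mildly shifted monitoring sample.
rng = np.random.default_rng(1)
baseline_scores = rng.normal(size=5000)
monitoring_scores = rng.normal(0.3, 1.0, size=5000)
value = psi(baseline_scores, monitoring_scores)
print(value, drift_flag(value))
```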
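The stress step's transparent variable shifts can be sketched as simple multiplicative adjustments applied before re-scoring. The column names and multipliers below are illustrative; the actual scenario definitions live in configs/stress_scenarios.yaml:

```python
import pandas as pd

# Hypothetical scenario definition, not the repo's actual configuration.
SEVERE_RECESSION = {
    "BILL_AMT1": 1.30,  # balances up 30%
    "PAY_AMT1": 0.60,   # payments down 40%
}

def apply_scenario(df: pd.DataFrame, multipliers: dict) -> pd.DataFrame:
    """Return a stressed copy of the portfolio with multiplicative shifts."""
    stressed = df.copy()
    for col, mult in multipliers.items():
        stressed[col] = stressed[col] * mult
    return stressed

portfolio = pd.DataFrame({"BILL_AMT1": [1000.0, 5000.0], "PAY_AMT1": [200.0, 800.0]})
stressed = apply_scenario(portfolio, SEVERE_RECESSION)
# The stressed frame would then be re-scored with the trained model so
# predicted default rates can be compared against the baseline.
```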

Repository Structure

model-risk-validation-lab/
  README.md
  requirements.txt
  configs/
    stress_scenarios.yaml
  data/
    monitoring/
    raw/
    processed/
  outputs/
    metrics/
    trained_models/
  reports/
    figures/
    validation_report.md
  src/
    data_ingestion.py
    train_model.py
    challenger_model.py
    validate_model.py
    monitoring.py
    stress_testing.py
    model_risk_validation_lab/
      champion_model.py
      challenger_model.py
      preprocessing.py
      validation.py
      monitoring.py
      stress_testing.py
      reporting.py
  tests/

How to Run the Project

From repository root:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run the full workflow in order:

python src/data_ingestion.py
python src/train_model.py
python src/challenger_model.py
python src/validate_model.py
python src/monitoring.py
python src/stress_testing.py

Run tests:

pytest -q

Notes:

  • data_ingestion.py requires internet access to fetch the UCI dataset.
  • Running the workflow writes artefacts under outputs/ and reports/.

Artefacts Produced

  • Model metrics: outputs/metrics/champion_metrics.json
  • Champion vs challenger comparison: outputs/metrics/champion_vs_challenger_metrics.json
  • Independent validation outputs:
    • outputs/metrics/independent_validation_summary.json
    • outputs/metrics/independent_calibration_table.csv
    • outputs/metrics/independent_sensitivity_analysis.csv
    • reports/validation_report.md
  • Monitoring outputs:
    • outputs/metrics/monitoring_metrics.json
    • outputs/metrics/monitoring_summary_table.csv
    • reports/figures/monitoring_psi.png
    • reports/figures/monitoring_score_distribution.png
  • Stress testing outputs:
    • outputs/metrics/stress_testing_metrics.json
    • outputs/metrics/stress_testing_summary_table.csv
    • reports/figures/stress_predicted_default_rates.png
    • reports/figures/stress_score_distributions.png

Limitations

  • This is a portfolio project, not production model risk infrastructure.
  • Uses one public dataset; portfolio representativeness is limited.
  • No model deployment controls, approvals workflow, or scheduler orchestration.
  • No fairness, bias, or policy optimization analysis is included.
  • Sensitivity and stress design is intentionally simple for transparency.

Future Improvements

  • Add challenger alternatives with stronger explainability controls (for example, constrained tree models).
  • Expand sensitivity testing to feature-level monotonicity and stability checks.
  • Add temporal backtesting and out-of-time validation.
  • Add automated run scripts/Makefile for one-command execution.
  • Add a concise model documentation pack (assumptions, limitations, governance checklist).
