🏦 Credit Risk Analysis – German Credit Dataset

A complete end-to-end Credit Risk Analysis project using the German Credit Dataset. This repository covers everything from EDA → preprocessing → feature engineering → modeling → evaluation → deployment.

📘 Overview

This project analyzes credit applicant data to understand patterns that lead to good or bad credit outcomes and builds a predictive model to assess credit risk.

It includes:

Clean and documented datasets

Notebooks for each stage

Final model pipeline

Streamlit deployment code

🎯 Project Objectives

✅ Primary Goals

Understand customer-level credit factors

Clean and preprocess raw credit data

Engineer meaningful and interpretable features

Build and evaluate ML models

Implement the best model in a deployable format

🧠 Key Questions Answered

Which customer attributes influence creditworthiness?

What patterns separate defaulters from non-defaulters?

Which model performs best for predicting loan default?

📊 Dataset Description

📂 Dataset: German Credit Risk Dataset

Contains 1,000 applicants described by a mix of categorical and numeric attributes covering:

Personal information

Credit history

Loan purpose & amount

Payment behavior

Financial stability

Many features come with coded values (e.g., A41, A93), which were decoded during preprocessing.
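A minimal sketch of how such codes could be decoded with pandas. The column names (`purpose`, `personal_status`) and the `decode_columns` helper are hypothetical and not taken from this repository; the code-to-label pairs follow the UCI codebook (`german.doc`) and cover only a few codes for illustration.

```python
import pandas as pd

# Partial code tables based on the UCI german.doc codebook.
# Extend them to cover every categorical attribute in the data.
PURPOSE = {
    "A40": "car (new)",
    "A41": "car (used)",
    "A42": "furniture/equipment",
    "A43": "radio/television",
}
PERSONAL_STATUS = {
    "A91": "male: divorced/separated",
    "A92": "female: divorced/separated/married",
    "A93": "male: single",
    "A94": "male: married/widowed",
}


def decode_columns(df, code_maps):
    """Return a copy of df with coded values replaced by readable labels."""
    out = df.copy()
    for col, mapping in code_maps.items():
        # Codes missing from the mapping are kept as-is instead of becoming NaN.
        out[col] = out[col].map(mapping).fillna(out[col])
    return out


# Hypothetical column names, used only for this example.
raw = pd.DataFrame({"purpose": ["A41", "A43"], "personal_status": ["A93", "A92"]})
decoded = decode_columns(raw, {"purpose": PURPOSE, "personal_status": PERSONAL_STATUS})
print(decoded)
```

Keeping the mappings in plain dictionaries makes it easy to reuse the same labels later, for example in the input legends of the app.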

🛠️ Project Workflow

1. 🔍 Exploratory Data Analysis (EDA)

Distribution checks

Correlation visualization

Categorical decoding

Outlier identification
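One common way to do the outlier identification step is the interquartile-range (IQR) rule. The sketch below is an illustration under that assumption, not necessarily the method used in the notebooks; the `iqr_outlier_mask` helper, the `credit_amount` name, and the sample values are made up for the example.

```python
import pandas as pd


def iqr_outlier_mask(series, k=1.5):
    """Flag values lying more than k * IQR outside the first/third quartile."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Toy values; the real column would come from the loaded dataset.
amounts = pd.Series([1200, 1500, 1800, 2100, 2400, 18000], name="credit_amount")
mask = iqr_outlier_mask(amounts)

print(amounts.describe())  # quick distribution check
print(amounts[mask])       # rows flagged as potential outliers
```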

2. 🧹 Data Preprocessing

Handling missing values

Feature type correction

Ordinal & One-Hot Encoding

Scaling numeric variables

Outlier treatment
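The encoding and scaling steps above can be combined in a single scikit-learn `ColumnTransformer`. This is a sketch under assumptions: the column names, the category order for `savings_level`, and the toy rows are hypothetical and only illustrate the technique.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler

# Toy frame with hypothetical column names.
df = pd.DataFrame({
    "duration_months": [6, 12, 24, 48],
    "credit_amount": [1200, 2500, 4000, 9000],
    "savings_level": ["low", "medium", "high", "low"],
    "purpose": ["car", "education", "car", "furniture"],
})

preprocessor = ColumnTransformer(
    [
        # Numeric columns: zero mean, unit variance.
        ("num", StandardScaler(), ["duration_months", "credit_amount"]),
        # Ordered categories: encoded as 0 < 1 < 2 in the order given.
        ("ord", OrdinalEncoder(categories=[["low", "medium", "high"]]), ["savings_level"]),
        # Unordered categories: one indicator column per level.
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["purpose"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

X = preprocessor.fit_transform(df)
print(X.shape)
```

Fitting the transformer on training data only, then reusing it on test data, avoids leaking test-set statistics into the scaler.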

3. ⚙️ Feature Engineering

Creation of ratio-based variables

Credit utilisation features

Binning & transformations

SMOTE for class imbalance
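As an illustration of the ratio and binning ideas, the sketch below derives a monthly repayment burden and an age band with pandas. The column names, the bin edges, and the band labels are assumptions made for this example, not the features actually engineered in the notebooks.

```python
import pandas as pd

# Toy rows with hypothetical column names.
df = pd.DataFrame({
    "credit_amount": [1200, 6000, 2400],
    "duration_months": [12, 24, 6],
    "age": [23, 41, 67],
})

# Ratio feature: approximate amount to repay per month of the loan.
df["amount_per_month"] = df["credit_amount"] / df["duration_months"]

# Binning: illustrative age bands [18, 30), [30, 60), [60, 120).
df["age_band"] = pd.cut(
    df["age"],
    bins=[18, 30, 60, 120],
    labels=["young", "middle", "senior"],
    right=False,
)

print(df)
```

When oversampling with SMOTE (available as `imblearn.over_sampling.SMOTE` in Imbalanced-Learn), it is generally safer to resample only the training folds, for example by placing the sampler inside an `imblearn.pipeline.Pipeline`, so that synthetic rows never reach the validation or test data.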

4. 🤖 Modeling

Models evaluated:

Logistic Regression

Random Forest

XGBoost

LightGBM

Hyperparameters tuned with grid search and cross-validation

Performance evaluated on recall, precision, F1, and ROC-AUC
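The tuning and evaluation loop can be sketched as follows. It uses synthetic data from `make_classification` as a stand-in for the real features, a Logistic Regression only, and an illustrative grid over `C`; the grid, the seeds, and the choice of recall as the tuning metric are assumptions for the example, not the settings used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 1,000 rows with a minority positive ("bad credit") class.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.7, 0.3], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000, class_weight="balanced")),
])
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}  # illustrative grid

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
search = GridSearchCV(pipe, param_grid, scoring="recall", cv=cv)
search.fit(X_train, y_train)

# Evaluate the refit best estimator on the held-out test split.
y_pred = search.predict(X_test)
y_prob = search.predict_proba(X_test)[:, 1]
metrics = {
    "recall": recall_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
print(search.best_params_, metrics)
```

Tuning on recall reflects the goal of catching risky applicants, at the cost of more false alarms; precision and F1 are reported alongside it to keep that trade-off visible.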

5. 🚀 Deployment

Streamlit app created for model prediction

User-friendly UI with input legends/explanations

Final model pipeline saved via joblib
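Saving and reloading a fitted pipeline with joblib can look like the sketch below. The model, the synthetic data, and the temporary file path are placeholders for illustration; the project's own artifact is the one listed under Project Structure.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder pipeline fitted on synthetic data.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
]).fit(X, y)

# Persist preprocessing and model together as one artifact.
model_path = os.path.join(tempfile.mkdtemp(), "pipeline.joblib")
joblib.dump(pipeline, model_path)

# Reload, as a serving app would do at startup, and confirm identical output.
restored = joblib.load(model_path)
same_predictions = (restored.predict(X) == pipeline.predict(X)).all()
print(same_predictions)
```

Saving the whole pipeline rather than the bare model means the app applies exactly the same preprocessing at prediction time. A joblib file should be loaded with library versions compatible with those used to create it, and only from a trusted source.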

📈 Results Summary

Best model achieved strong Recall for identifying risky applicants

Proper feature engineering significantly improved performance

Model generalized well on unseen test data


📂 Project Structure

```
Credit-Risk-Analysis/
│
├── data/
│   ├── gd.csv
│   ├── german.data
│   ├── german.data-numeric
│   ├── german.doc
│   └── Index
│
├── notebooks/
│   ├── data_exploration.ipynb
│   ├── feature_engineering.ipynb
│   ├── modeling.ipynb
│   └── evaluation.ipynb
│
├── app/
│   ├── streamlit_app.py
│   └── best_model/
│       └── xgb_pipeline.joblib
│
└── README.md
```

💻 Technologies Used

Python 🐍

Pandas, NumPy

Scikit-Learn

XGBoost / LightGBM

Imbalanced-Learn

Matplotlib & Seaborn

Streamlit

Joblib

🚧 Future Enhancements

Add SHAP-based interpretability

Add API endpoints for production use

Add monitoring & drift detection

🙌 Acknowledgements

Dataset source: UCI Machine Learning Repository – German Credit Dataset.
