This project demonstrates how a churn prediction model can be transformed into a profit-maximizing decision policy by integrating machine learning outputs with business assumptions. Rather than focusing solely on predictive accuracy, the proposed approach determines which customers should be contacted in order to maximize expected profit.
The framework combines predicted churn probabilities with customer lifetime value (CLV), campaign cost, and retention uplift assumptions to derive an economically rational contact policy.
The analysis is based on the Telco Customer Churn dataset.
To focus on early-stage churn behavior, the dataset is restricted to customers with tenure ≤ 1 month, where churn is driven primarily by the onboarding experience and initial service perception rather than by long-term factors.
Key preprocessing steps include:
- Handling missing values in TotalCharges
- One-hot encoding of categorical variables
- Stratified train/test split to preserve churn distribution
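The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on a toy frame, not the notebook's actual code: the column names follow the public Telco dataset, while the `preprocess` helper, the choice to fill missing `TotalCharges` with 0, the 25% test size, and the random seed are all assumptions made for the example.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the Telco data (values are made up for illustration).
raw = pd.DataFrame({
    "tenure": [0, 1, 1, 1, 0, 1, 1, 1, 5, 12],
    "Contract": ["Month-to-month", "Month-to-month", "One year", "Month-to-month",
                 "Two year", "Month-to-month", "One year", "Month-to-month",
                 "One year", "Two year"],
    "MonthlyCharges": [29.85, 70.70, 45.30, 99.65, 20.25,
                       55.20, 80.85, 35.45, 60.00, 90.00],
    # TotalCharges arrives as text; blanks mark missing values.
    "TotalCharges": [" ", "70.7", "45.3", "99.65", " ",
                     "55.2", "80.85", "35.45", "300.0", "1080.0"],
    "Churn": ["No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"],
})


def preprocess(df, test_size=0.25, seed=42):
    """Hypothetical helper: filter, clean, encode, and split the data."""
    # Keep only early-stage customers.
    early = df[df["tenure"] <= 1].copy()

    # Blank strings become NaN; filling with 0 is an assumption of this sketch.
    early["TotalCharges"] = pd.to_numeric(
        early["TotalCharges"], errors="coerce"
    ).fillna(0.0)

    # Binary target and one-hot encoded features.
    y = (early["Churn"] == "Yes").astype(int)
    X = pd.get_dummies(early.drop(columns=["Churn"]), drop_first=True)

    # Stratify on the target so both splits keep the churn ratio.
    return train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )


X_train, X_test, y_train, y_test = preprocess(raw)
```

Stratifying on `y` matters most when the filtered subset is small, since a purely random split could leave one side with very few churners.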
Feature redundancy is reduced through correlation-based filtering. Highly correlated features are removed to improve model stability and interpretability.
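One common way to implement correlation-based filtering is shown below as a sketch. The `drop_correlated` helper and the 0.9 threshold are assumptions; the threshold and tie-breaking rule actually used in the project are not stated here.

```python
import numpy as np
import pandas as pd


def drop_correlated(X, threshold=0.9):
    """Drop the later column of every pair whose |correlation| exceeds threshold."""
    corr = X.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop), to_drop


# Toy example: "b" is almost a multiple of "a"; "c" is only weakly related.
features = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.1],
    "c": [5.0, 1.0, 4.0, 2.0, 3.0],
})
reduced, dropped = drop_correlated(features)
```

In this toy case only `b` is removed, because it is the second member of the single highly correlated pair.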
A Random Forest model is then trained solely to inspect feature importances. No additional feature elimination is performed at this stage; the model is used as a diagnostic tool to validate that the remaining features carry meaningful signal.
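A diagnostic of this kind might look like the sketch below. The synthetic data, the feature names `signal` and `noise`, and the hyperparameters are illustrative assumptions rather than the project's settings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: one informative feature and one pure-noise feature.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "signal": rng.normal(size=n),
    "noise": rng.normal(size=n),
})
y = (X["signal"] + 0.1 * rng.normal(size=n) > 0).astype(int)

# Fit the forest only to read off impurity-based importances.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = pd.Series(
    forest.feature_importances_, index=X.columns
).sort_values(ascending=False)
```

The importances sum to 1, so they are best read as relative shares. A feature with a near-zero share is a candidate for closer inspection, not automatic removal.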
Two classification models are trained and evaluated:
- Logistic Regression
  - Used as a baseline and as the main probability generator
  - Produces well-calibrated and interpretable churn probabilities
Model evaluation includes ROC-AUC scores and a confusion matrix to assess classification performance.
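As a rough sketch of this step, the snippet below fits a logistic regression on synthetic data and computes the two evaluation artifacts mentioned above. The data-generating process, the 0.5 cutoff used for the confusion matrix, and all variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded feature matrix and churn labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(400) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Column 1 of predict_proba is the probability of the positive (churn) class.
p_churn = model.predict_proba(X_test)[:, 1]

auc = roc_auc_score(y_test, p_churn)
# The 0.5 cutoff is used only for this diagnostic matrix.
cm = confusion_matrix(y_test, (p_churn >= 0.5).astype(int))
```

ROC-AUC is computed from the probabilities themselves, whereas the confusion matrix needs a cutoff. The decision framework does not depend on that cutoff.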
The core contribution of this project is a decision framework that converts churn probabilities into actionable business decisions.
The framework relies on the following business assumptions:
- Retention horizon: 6 months
- Campaign cost: 15 units per contacted customer
- Retention uplift: 15% probability that contacting a customer who would otherwise churn retains them
Customer lifetime value is approximated as:
CLV = MonthlyCharges × horizon_months
For each customer, the expected value (EV) of contacting is computed as:
EV = P(churn) × retention_uplift × CLV − campaign_cost
The decision rule is:
- CONTACT if EV > 0
- NO ACTION otherwise
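This removes the need for an arbitrary probability threshold. The condition EV > 0 is equivalent to P(churn) > campaign_cost / (retention_uplift × CLV), so the effective cutoff varies with each customer's value and every contact is economically justified.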
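The CLV approximation, the EV formula, and the decision rule can be written out in a few lines. The sketch below uses the stated assumptions (6-month horizon, cost of 15, 15% uplift); the function names and the two example customers are hypothetical.

```python
HORIZON_MONTHS = 6        # retention horizon
CAMPAIGN_COST = 15.0      # cost per contacted customer
RETENTION_UPLIFT = 0.15   # probability that contact retains a churner


def expected_value(p_churn, monthly_charges):
    """EV = P(churn) x retention_uplift x CLV - campaign_cost."""
    clv = monthly_charges * HORIZON_MONTHS
    return p_churn * RETENTION_UPLIFT * clv - CAMPAIGN_COST


def decide(p_churn, monthly_charges):
    """Contact only when the expected value is strictly positive."""
    if expected_value(p_churn, monthly_charges) > 0:
        return "CONTACT"
    return "NO ACTION"


def break_even_probability(monthly_charges):
    """Churn probability above which contacting this customer pays off."""
    return CAMPAIGN_COST / (RETENTION_UPLIFT * monthly_charges * HORIZON_MONTHS)


# Two hypothetical customers with similar risk but different value.
high_value = decide(0.60, 70.0)   # CLV = 420
low_value = decide(0.50, 20.0)    # CLV = 120
```

For the first customer, EV = 0.60 × 0.15 × 420 − 15 = 22.8, so the decision is CONTACT. For the second, EV = 0.50 × 0.15 × 120 − 15 = −6, so no action is taken even though the churn risk is similar.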
The model-driven policy:
- Contacts fewer customers than a naive contact-all strategy
- Achieves higher total and per-customer profit
- Effectively prioritizes high-risk, high-value customers
A sensitivity analysis over different campaign costs and retention uplift values shows that the policy remains robust under varying business conditions.
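A sensitivity analysis of this kind can be sketched as a simple grid over campaign cost and retention uplift. The snippet below compares expected profit under the EV-based policy with a contact-all baseline. The customer list, the grid values, and the use of expected profit (not simulated outcomes) are assumptions made for illustration; they are not the project's actual scenarios or results.

```python
from itertools import product

HORIZON_MONTHS = 6

# Hypothetical customers as (P(churn), MonthlyCharges) pairs.
customers = [(0.80, 95.0), (0.60, 70.0), (0.50, 20.0), (0.10, 85.0), (0.30, 45.0)]


def expected_value(p_churn, monthly_charges, cost, uplift):
    """EV of contacting one customer under a given cost/uplift scenario."""
    return p_churn * uplift * monthly_charges * HORIZON_MONTHS - cost


def evaluate(cost, uplift):
    """Summarize one scenario: contact rate and expected profit per strategy."""
    evs = [expected_value(p, m, cost, uplift) for p, m in customers]
    contacted = [ev for ev in evs if ev > 0]
    return {
        "campaign_cost": cost,
        "retention_uplift": uplift,
        "contact_rate": len(contacted) / len(evs),
        "policy_profit": sum(contacted),    # contact only when EV > 0
        "contact_all_profit": sum(evs),     # contact everyone
    }


# Illustrative grid around the baseline (cost = 15, uplift = 0.15).
scenarios = [
    evaluate(cost, uplift)
    for cost, uplift in product([10, 15, 20], [0.10, 0.15, 0.20])
]
```

Because the policy drops every customer with negative EV, its expected profit can never fall below that of the contact-all baseline under the same assumptions. The grid shows how large the gap is and how the contact rate shifts as the assumptions change.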
The project produces the following outputs:
- decision_framework_results.csv: customer-level decisions, EV, and simulated profit
- sensitivity_results.csv: profitability and contact rates under alternative scenarios
This project illustrates how predictive models can be operationalized into profit-oriented decision systems. By integrating churn probabilities with business assumptions, the proposed framework outperforms naive strategies while remaining transparent, interpretable, and adaptable to real-world constraints.
The framework is model-agnostic and can be extended to alternative probability estimators, provided that calibrated churn probabilities are available.
- Clone the repository
- Install dependencies: pip install -r requirements.txt
- Run the notebook in the notebooks/ folder