```bash
brew install python
```
Creating a virtual environment is a good way to ensure our work is not polluted by other Python libraries installed on the same machine. This setup only needs to be done once:
```bash
rm -rf ./pythonvenv
python3 -m venv ./pythonvenv
source pythonvenv/bin/activate
pip install --upgrade pip
pip install pandas scikit-learn m2cgen pypmml sklearn2pmml joblib
```
In every new shell, you will then need to run:
```bash
source pythonvenv/bin/activate
```
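If you want to verify that the virtual environment is actually active, here is a quick check from Python (a small sketch, not part of the pipeline):

```python
import sys

# Inside an active venv, sys.prefix points into the venv directory,
# while sys.base_prefix still points to the system Python installation.
if sys.prefix == sys.base_prefix:
    print("WARNING: no virtual environment is active")
else:
    print(f"Virtual environment active at: {sys.prefix}")
```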
The following Metabase reports can be used to download the data needed to train and evaluate our models:
| Purpose | Reference Date | Link | Expected Location |
|---|---|---|---|
| Full Extract (BD) | 2025-01-01 | Metabase | data/full_extract.csv |
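Once the extract is in place, a quick sanity check that the file is readable can look like the sketch below (it makes no assumptions about the actual columns):

```python
import pandas as pd

# Load the full extract downloaded from Metabase.
df = pd.read_csv("data/full_extract.csv")

# Basic sanity checks: shape, column types, and the most incomplete columns.
print(f"{len(df)} rows, {len(df.columns)} columns")
print(df.dtypes)
print(df.isna().sum().sort_values(ascending=False).head(10))
```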
The following steps should be run:
```bash
bash ./01.split_data_called_notcalled.sh
python ./02.split_data.py ./data/generated.called.patients.csv
python ./02.split_data.py ./data/generated.not_called.patients.csv
python ./13.notcalled.train_models.py
python ./14.notcalled.validate_models.py
python ./15.notcalled.evaluate_model_quality.py
python ./23.called.train_models.py
python ./24.called.validate_models.py
python ./25.called.evaluate_model_quality.py
```
Or, to run the whole pipeline in one go:
```bash
bash 99.clean_all.sh; bash 90.run_all.sh
```
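The numbered training scripts follow the usual scikit-learn train-and-persist pattern. Here is a minimal, hedged sketch of that pattern; the target column, model choice, and output path below are placeholders for illustration, not what the actual scripts use:

```python
import os

import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder target column and feature set: the real ones are defined
# inside the numbered scripts, not here.
df = pd.read_csv("data/generated.not_called.patients.csv")
X = df.drop(columns=["visited"])
y = df["visited"]

# Hold out part of the data so the validation step sees unseen patients.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Placeholder model: the scripts may train and compare several model types.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Persist the fitted model so the validation/evaluation steps can reload it.
os.makedirs("models", exist_ok=True)
joblib.dump(model, "models/example_model.joblib")
```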
This section explains the meaning and interpretation of the numerical metrics used to evaluate the performance of your predictive models (Classifier and Combined Metric) on the validation dataset.
These metrics assess how accurately your model predicts the binary outcome (Visit = 1 or No Visit = 0) based on the raw `chances_to_visit` score.
| Metric | Meaning | Interpretation | Goal |
|---|---|---|---|
| AUC-ROC | Area Under the Receiver Operating Characteristic Curve. | Measures the model's ability to distinguish between the "Visit" class and the "No Visit" class across all possible probability thresholds. This is the best single measure of model separation quality. | Closer to 1.0 (Typically >0.75 is good) |
| Precision | Quality of Positive Predictions. | Of all the patients the model predicted would visit, how many actually did? Crucial for addressing overconfidence (false positives). | Closer to 1.0 |
| Recall (Sensitivity) | Completeness of Positive Predictions. | Of all the patients who actually visited, what percentage did the model correctly identify? | Closer to 1.0 |
| F1 Score | Balance. | The harmonic mean of Precision and Recall. Useful when you need a balanced measure of performance, particularly if minimizing both false positives and false negatives is important. | Closer to 1.0 |
| Accuracy | Overall Correctness. | The total percentage of all predictions (both correct visits and correct no-shows) that were correct. (Note: Can be misleading if classes are highly imbalanced). | Closer to 1.0 |
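For reference, here is a minimal sketch of how these metrics can be computed with scikit-learn; `y_true` and `y_score` below are toy placeholders, not real validation data:

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# y_true: actual outcomes (1 = Visit, 0 = No Visit)
# y_score: raw chances_to_visit probabilities produced by the model
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.92, 0.40, 0.75, 0.65, 0.30, 0.85, 0.55, 0.70]

# Threshold the score at 0.5 to get hard predictions for accuracy,
# precision, recall and F1; AUC-ROC uses the raw score directly.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

print(f"Accuracy  : {accuracy_score(y_true, y_pred):.4f}")
print(f"Precision : {precision_score(y_true, y_pred):.4f}")
print(f"Recall    : {recall_score(y_true, y_pred):.4f}")
print(f"F1 Score  : {f1_score(y_true, y_pred):.4f}")
print(f"AUC-ROC   : {roc_auc_score(y_true, y_score):.4f}")
```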
These metrics specifically evaluate the quality of our goal prediction: the likelihood that a patient will visit AND do so with less than 30 days of delay (`chances_to_visit_under_30d`).
| Metric | Meaning | Interpretation | Goal |
|---|---|---|---|
| Combined Metric AUC-ROC | Goal Ranking Quality. | Measures how well your final combined score ranks the patients who actually met the under-30-day goal against those who did not. | Closer to 1.0 |
| Combined Metric Precision | Goal Confidence. | If you select a cohort based on a high combined score, this is the percentage that truly met the under-30-day goal. | Closer to 1.0 |
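Similarly, a hedged sketch of how these two numbers can be obtained; `met_goal`, `combined_score`, and the 0.5 cut-off are assumptions for illustration, not necessarily what the evaluation scripts use:

```python
from sklearn.metrics import precision_score, roc_auc_score

# met_goal: 1 if the patient visited with less than 30 days of delay
# combined_score: the final combined score assigned to each patient
met_goal = [1, 0, 1, 0, 1, 1, 0, 1]
combined_score = [0.80, 0.35, 0.60, 0.55, 0.90, 0.45, 0.20, 0.70]

# Ranking quality of the combined score against the under-30-day goal.
auc = roc_auc_score(met_goal, combined_score)

# Precision of the cohort selected above an (assumed) 0.5 cut-off.
selected = [1 if s >= 0.5 else 0 for s in combined_score]
precision = precision_score(met_goal, selected)

print(f"Combined Metric AUC-ROC   : {auc:.4f}")
print(f"Combined Metric Precision : {precision:.4f}")
```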
Sample output from the model quality evaluation steps:

```
==================================================
MODEL QUALITY EVALUATION
==================================================
Accuracy : 0.9782
Precision : 0.9832
Recall (Sensitivity) : 0.9949
F1 Score : 0.9890
AUC-ROC : 0.6370
Combined Metric AUC-ROC : 0.6983
Combined Metric Precision : 0.8866
==================================================
```
```
==================================================
MODEL QUALITY EVALUATION
==================================================
Accuracy : 0.8693
Precision : 0.8131
Recall (Sensitivity) : 0.9471
F1 Score : 0.8750
AUC-ROC : 0.9211
Combined Metric AUC-ROC : 0.4811
Combined Metric Precision : 1.0000
==================================================
```