65 commits
a33a60b
Bump pyasn1-modules from 0.3.0 to 0.4.1
dependabot[bot] Oct 7, 2024
537805f
Bump annotated-types from 0.6.0 to 0.7.0
dependabot[bot] Oct 7, 2024
96ae37d
Bump traitlets from 5.11.2 to 5.14.3
dependabot[bot] Oct 7, 2024
55ab966
Bump werkzeug from 3.0.3 to 3.0.6
dependabot[bot] Oct 26, 2024
00ab68c
Interpolation
IbrahimKhan07 Nov 6, 2024
8c49ebf
visualisations
IbrahimKhan07 Nov 7, 2024
a67c7ed
adding sub-icb visualisations
IbrahimKhan07 Nov 8, 2024
ece62d9
Bump werkzeug from 3.0.3 to 3.1.3
dependabot[bot] Nov 11, 2024
fc2ef7f
initial commit
IbrahimKhan07 Nov 11, 2024
6194421
Processed all data sources
IbrahimKhan07 Nov 12, 2024
99aa03a
data pre-processings
IbrahimKhan07 Nov 13, 2024
3c5c4c6
adding working days and appointments for whole year
IbrahimKhan07 Nov 14, 2024
4e3c678
calculating ratios
IbrahimKhan07 Nov 15, 2024
832d39b
patient to GP ratio analysis
IbrahimKhan07 Nov 18, 2024
9d4d0a4
Addind age standardisation + markdown file
IbrahimKhan07 Nov 19, 2024
a4a94b6
standardisation
IbrahimKhan07 Nov 19, 2024
1db3441
Determining age factor
IbrahimKhan07 Nov 20, 2024
4af54e0
age weights using regression
IbrahimKhan07 Nov 21, 2024
745235e
age weight using regression
IbrahimKhan07 Nov 21, 2024
450c726
Modelling and prediction of GP appointments
IbrahimKhan07 Nov 22, 2024
1f147c0
Bump tornado from 6.4.1 to 6.4.2
dependabot[bot] Nov 22, 2024
1e41d8c
Bump dawidd6/action-download-artifact in /.github/workflows
dependabot[bot] Nov 25, 2024
6121168
Markdown and images+ final notebook changes
IbrahimKhan07 Nov 26, 2024
929b150
Bump flatbuffers from 23.5.26 to 24.12.23
dependabot[bot] Dec 30, 2024
47f17fa
Merge pull request #165 from SNEE-ICS/133-dc-notebook-run-forecast
AJarman Jan 8, 2025
d86764f
Merge pull request #167 from SNEE-ICS/164-appointments-vs-staffing
AJarman Jan 8, 2025
5daec8b
Merge pull request #172 from SNEE-ICS/dependabot/pip/flatbuffers-24.1…
AJarman Jan 8, 2025
c08e6bd
Merge pull request #140 from SNEE-ICS/dependabot/pip/annotated-types-…
AJarman Jan 8, 2025
8c0d021
Merge pull request #160 from SNEE-ICS/dependabot/pip/werkzeug-3.0.6
AJarman Jan 8, 2025
8b48610
Merge pull request #141 from SNEE-ICS/dependabot/pip/traitlets-5.14.3
AJarman Jan 8, 2025
5a3df0a
Bump jinja2 from 3.1.4 to 3.1.5
dependabot[bot] Jan 8, 2025
4f37c1d
Merge pull request #173 from SNEE-ICS/dependabot/pip/jinja2-3.1.5
AJarman Jan 8, 2025
cd4a62e
Merge pull request #170 from SNEE-ICS/dependabot/github_actions/dot-g…
AJarman Jan 8, 2025
6847b2e
Bump watchfiles from 0.21.0 to 1.0.3
dependabot[bot] Jan 8, 2025
cf08864
Merge branch 'develop' into dependabot/pip/tornado-6.4.2
AJarman Jan 8, 2025
b9a6dc8
Merge pull request #168 from SNEE-ICS/dependabot/pip/tornado-6.4.2
AJarman Jan 8, 2025
ceef34a
Merge pull request #137 from SNEE-ICS/dependabot/pip/pyasn1-modules-0…
AJarman Jan 8, 2025
5a29904
Merge branch 'develop' into dependabot/pip/werkzeug-3.1.3
AJarman Jan 8, 2025
b1957bd
Merge pull request #166 from SNEE-ICS/dependabot/pip/werkzeug-3.1.3
AJarman Jan 8, 2025
50053dd
Merge pull request #171 from SNEE-ICS/dependabot/pip/watchfiles-1.0.3
AJarman Jan 8, 2025
78c8e1f
parallel processing implementation
AJarman Nov 6, 2024
a1ce6a9
parallel processing
AJarman Nov 20, 2024
eb95d00
fixes to did not attend yaml filename
AJarman Nov 20, 2024
8ee237d
fixes
AJarman Nov 20, 2024
c91ac03
added convenient shell script for running all notebooks
AJarman Jan 8, 2025
d5aaf69
refreshed notebooks
AJarman Jan 8, 2025
d23057e
documentation
AJarman Jan 8, 2025
a7f2683
removed icecream
AJarman Jan 8, 2025
8bdabcd
Interpolation
IbrahimKhan07 Nov 6, 2024
fcb9c38
visualisations
IbrahimKhan07 Nov 7, 2024
6a82afc
adding sub-icb visualisations
IbrahimKhan07 Nov 8, 2024
9e31509
age weights using regression
IbrahimKhan07 Nov 21, 2024
e6a8b23
parallel processing implementation
AJarman Nov 6, 2024
d471fad
parallel processing
AJarman Nov 20, 2024
014615a
fixes to did not attend yaml filename
AJarman Nov 20, 2024
bbe1928
fixes
AJarman Nov 20, 2024
4c3075c
refreshed notebooks
AJarman Jan 8, 2025
07527fd
fixes to did not attend yaml filename
AJarman Nov 20, 2024
17666b3
fixes
AJarman Nov 20, 2024
7922fe7
parallel processing implementation
AJarman Nov 6, 2024
3155425
parallel processing
AJarman Nov 20, 2024
e0ac98b
fixes to did not attend yaml filename
AJarman Nov 20, 2024
29ad758
fixes
AJarman Nov 20, 2024
55035d4
refreshed notebooks
AJarman Jan 8, 2025
ac21647
updating snee style repo
IbrahimKhan07 Jul 24, 2025
2 changes: 1 addition & 1 deletion .github/workflows/github_pages.yml
@@ -34,7 +34,7 @@ jobs:
       - name: Checkout
         uses: actions/checkout@v3
       - name: Download notebooks workflow artifact
-        uses: dawidd6/action-download-artifact@v2.28.0
+        uses: dawidd6/action-download-artifact@v6
         with:
           workflow: notebook_run.yml
           name: notebook-outputs
144 changes: 144 additions & 0 deletions PelicanWebsite/content/capacity-analysis/appointments_vs_staffing.md
@@ -0,0 +1,144 @@
Title: Appointments vs. Staffing
Date: 2024-01-04
Modified: 2024-10-11
Category: Capacity Analysis
Authors: A.Jarman & I.Khan
Summary: Analysis of primary care appointments, staffing and the patient-to-GP ratio.

<br>

## Introduction & Background
This analysis examines the relationship between the number of primary care appointments per head of population and staffing levels per head of population. The goal is to investigate how the patient-to-GP ratio correlates with appointments per head, specifically in SNEE compared with sub-ICBs across England.
<br><br>

## Data Sources
This analysis draws on three sources provided by NHS England: the appointments dataset, the GP patient list and the staffing (workforce) dataset.
<table>
<thead>
<tr>
<th>Dataset used</th>
<th>Website URL</th>
<th>Download zip</th>
</tr>
</thead>
<tbody>
<tr>
<td>Appointments dataset</td>
<td><a href="https://digital.nhs.uk/data-and-information/publications/statistical/appointments-in-general-practice" target="_blank">NHS Digital - Appointments in General Practice</a></td>
<td><a href="https://files.digital.nhs.uk/D5/4B437E/Appointments_GP_Regional_CSV_Aug_24.zip">Actual_Duration_CSV_Aug_24.csv</a></td>
</tr>
<tr>
<td>Registered patients dataset</td>
<td><a href="https://digital.nhs.uk/data-and-information/publications/statistical/patients-registered-at-a-gp-practice" target="_blank">NHS Digital - Patients Registered at a GP practice</a></td>
<td><a href="https://files.digital.nhs.uk/31/AA1C9E/gp-reg-pat-prac-quin-age.zip">gp-reg-pat-prac-quin-age.csv</a></td>
</tr>
<tr>
<td>General Practice workforce</td>
<td><a href="https://digital.nhs.uk/data-and-information/publications/statistical/general-and-personal-medical-services" target="_blank">NHS Digital - General Practice Workforce</a></td>
<td><a href="https://files.digital.nhs.uk/45/72481B/GPWIndividualCSV.082024.zip">General Practice – August 2024 Individual Level.csv</a></td>
</tr>
<tr>
<td>Appointments by Region dataset</td>
<td><a href="https://digital.nhs.uk/data-and-information/publications/statistical/appointments-in-general-practice" target="_blank">NHS Digital - GP Appointments by Region</a></td>
<td><a href="https://files.digital.nhs.uk/D5/4B437E/Appointments_GP_Regional_CSV_Aug_24.zip">Regional_CSV_SuffolkNEEssex.csv</a></td>
</tr>
</tbody>
</table>

- Number of patients is a snapshot of patients in Sep/2024
- Workforce is also a snapshot of staff but in Aug/2024
- Appointments are the sum of latest year appointments (Aug/23 - Aug/24)
<br><br>

## Methodology
The analysis involved a systematic approach to pre-process, integrate, and analyze datasets to explore relationships between key variables and predict outcomes. Below are the steps undertaken:

1. Data Pre-processing and Integration:
- Each dataset was pre-processed based on specific requirements and then merged into a single comprehensive dataset.

2. Calculation of Key Ratios:
Several ratios were computed to facilitate analysis, including:

- All Appointments per Head of Population: *Count of all appointments / Number of Patients*
- GP Appointments per Head of Population: *Count of GP appointments / Number of Patients*
- Staffing per 1,000 Registered Population: *(Combined Staff FTE / Number of Patients) × 1,000*
- Patient-to-GP Ratio: *Number of Patients / GP FTE*

3. Analysis of Staffing Levels and Primary Care Appointments:
- The relationship between staffing levels and all primary care appointments (per head of population) was assessed using Spearman's and Pearson's correlation coefficients.
- A regression model was then employed to quantify the relationship.

4. Analysis of Patient-to-GP Ratio and GP Appointments:
- The relationship between the patient-to-GP ratio and GP appointments (per head of population) was similarly evaluated using correlation analysis (Spearman's and Pearson's).
- A regression model was run to understand and quantify this relationship.

5. Visual Comparison Across Regions:
- Patient-to-GP ratio and GP appointments per head of population were plotted for Integrated Care Boards (ICBs) and Sub-ICBs across England to enable regional comparisons.

6. Analysis of Age and GP Appointments:
- Patients were categorized into three age bands: 0–19, 20–64, and 65+.
- For each age band, correlations (Spearman's/Pearson's) were calculated to assess the relationship between age and GP appointments.
- Multiple regression models were run to predict GP appointment counts, with the best-performing model selected for further predictions.

7. Unmet Demand Estimation:
- Predicted and actual GP appointment counts were compared to calculate the proportion of unmet demand for each ICB and Sub-ICB.
<br><br>
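
As a minimal sketch of step 2, the four ratios could be computed for a single area as shown here. The function name, argument names and input figures are illustrative assumptions; they are not taken from the analysis notebook or the NHS datasets.

```python
def capacity_ratios(patients, all_appts, gp_appts, staff_fte, gp_fte):
    """Return the four ratios defined in the methodology for one area."""
    if patients <= 0 or gp_fte <= 0:
        raise ValueError("patients and gp_fte must be positive")
    return {
        "all_appointments_per_head": all_appts / patients,
        "gp_appointments_per_head": gp_appts / patients,
        "staff_per_1000": staff_fte / patients * 1000,
        "patient_to_gp_ratio": patients / gp_fte,
    }


# Made-up figures for one hypothetical sub-ICB.
example = capacity_ratios(
    patients=250_000,
    all_appts=1_500_000,
    gp_appts=700_000,
    staff_fte=500.0,
    gp_fte=125.0,
)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Guarding against a zero denominator matters here because some areas may report no GP FTE in a given workforce snapshot.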

## Results and Inferences

### Staffing Levels and Primary Care Appointments:
- The correlation coefficient (0.6512) shows a moderate-to-strong positive relationship. This suggests that, as staffing levels per head increase, there is a significant tendency for appointments per head to also increase.
- The P-value (4.08e-14) is far below 0.05, confirming the correlation is statistically significant.
- Regression Results:
- Slope (1.4216): For every one-unit increase in staffing per 1,000 registered population, all appointments per head increase by 1.4216.
- The slope is significant (p-value < 0.05).
- R-squared (0.424) indicates 42.4% of appointment variability is explained by staffing levels.
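
The statistics reported above (Pearson and Spearman coefficients, regression slope and R-squared) can be illustrated with a small standard-library sketch. The data points are made up, the helper names are hypothetical, and p-values are left out; the notebook itself may well use a statistics library instead.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def ols(x, y):
    """Slope, intercept and R-squared of a simple least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return slope, intercept, pearson(x, y) ** 2


# Made-up values: staffing per 1,000 and all appointments per head.
staff_per_1000 = [1.0, 2.0, 3.0, 4.0, 5.0]
appts_per_head = [3.0, 5.0, 4.0, 7.0, 8.0]

slope, intercept, r_squared = ols(staff_per_1000, appts_per_head)
print(f"Pearson:   {pearson(staff_per_1000, appts_per_head):.4f}")
print(f"Spearman:  {spearman(staff_per_1000, appts_per_head):.4f}")
print(f"Slope:     {slope:.4f}")
print(f"R-squared: {r_squared:.4f}")
```

For a single predictor, R-squared equals the square of the Pearson coefficient, which is consistent with the reported values (0.6512 squared is about 0.424).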


### Patient-to-GP Ratio and GP Appointments:
- Pearson (-0.42) and Spearman (-0.41) coefficients show a moderate negative relationship between the patient-to-GP ratio and GP appointments per head.
- The p-value (< 0.05) confirms the relationship is statistically significant.
- Regression Results:
- Coefficient (-0.0006): A one-unit increase in the patient-to-GP ratio is associated with a decrease of 0.0006 GP appointments per head.
- The relationship is statistically significant, but its practical impact is negligible: the low R-squared value (0.177) means the ratio explains only 17.7% of the variation in GP appointments per head.
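
To put the coefficient on a more readable scale, the short sketch below converts it into the change in GP appointments per head for larger shifts in the patient-to-GP ratio. It assumes the fitted relationship is linear over that range; the chosen shifts (100 and 500 patients per GP FTE) are illustrative only.

```python
COEFFICIENT = -0.0006  # reported slope: GP appointments per head per unit of ratio
R_SQUARED = 0.177      # reported R-squared


def change_in_gp_appts_per_head(delta_ratio):
    """Linear change implied by the reported slope (an assumption of linearity)."""
    return COEFFICIENT * delta_ratio


for delta in (100, 500):
    change = change_in_gp_appts_per_head(delta)
    print(f"+{delta} patients per GP FTE -> {change:+.2f} GP appointments per head")

print(f"Share of variation explained by the ratio: {R_SQUARED:.1%}")
```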


### Visual Comparison Across Regions:
Comparisons of Patient-to-GP ratios and appointments per head across ICBs and Sub-ICBs are visualized in the graphs below:
<br>

- ICBs
![ICB Comparison]({attach}/img/Appointments_vs_staffing_3.png)
<br>

- Sub-ICBs
![SUB-ICB Comparison]({attach}/img/Appointments_vs_staffing_6.png)


### Analysis of Age and GP Appointments:
- Pearson and Spearman coefficients (> 0.95) indicate a very strong positive correlation: areas with more patients in any age band record more GP appointments.
- The p-value is extremely small, confirming the correlation is significant.
- Visualising the relation:
<br>

![SUB-ICB Comparison]({attach}/img/Appointments_vs_staffing_7.png)
<br>

Regression Inferences:

- A 1 FTE increase in GPs increases annual appointments by 1,567.
- A one-person increase in the total population raises GP appointments by 2.46.
- A one-person increase in the 65+ group raises GP appointments by 3.635 (2.46 + 1.175).
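
The regression inferences above can be combined into a marginal-effect calculation. This is a sketch only: it assumes the effects are linear and additive, it ignores the intercept (which is not reported here), and the function and constant names are hypothetical.

```python
# Coefficients as reported in the regression inferences.
APPTS_PER_GP_FTE = 1567.0               # annual appointments per extra GP FTE
APPTS_PER_PERSON = 2.46                 # appointments per extra registered person
EXTRA_APPTS_PER_PERSON_65_PLUS = 1.175  # additional appointments if that person is 65+


def expected_change_in_appointments(delta_gp_fte=0.0,
                                    delta_population=0.0,
                                    delta_65_plus=0.0):
    """Change in annual GP appointments implied by the coefficients.

    delta_65_plus is the part of delta_population aged 65 or over,
    so it is counted once in the population term and once in the 65+ term.
    """
    return (APPTS_PER_GP_FTE * delta_gp_fte
            + APPTS_PER_PERSON * delta_population
            + EXTRA_APPTS_PER_PERSON_65_PLUS * delta_65_plus)


# One extra person aged 65+: 2.46 + 1.175 appointments.
print(round(expected_change_in_appointments(delta_population=1, delta_65_plus=1), 3))

# Made-up scenario: 1,000 extra residents, 200 of them aged 65+.
print(round(expected_change_in_appointments(delta_population=1000, delta_65_plus=200), 1))
```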

### Unmet Demand Estimation:
- Comparison of actual and predicted GP appointments:
![SUB-ICB Comparison]({attach}/img/Appointments_vs_staffing_10.png)

- Visualization of unmet demand across Integrated Care Boards (ICBs):
![ICB Comparison]({attach}/img/Appointments_vs_staffing_8.png)

- Visualization of unmet demand across Sub-Integrated Care Boards (Sub-ICBs):
![SUB-ICB Comparison]({attach}/img/Appointments_vs_staffing_9.png)
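
The exact unmet-demand formula is not stated on this page. One plausible reading, sketched below as an assumption, is the shortfall of actual against predicted appointments expressed as a share of the prediction, floored at zero. The area names and counts are made up.

```python
def unmet_demand_proportion(predicted, actual):
    """Shortfall of actual vs. predicted appointments as a share of predicted.

    Assumed definition: max(predicted - actual, 0) / predicted.
    """
    if predicted <= 0:
        raise ValueError("predicted must be positive")
    return max(predicted - actual, 0.0) / predicted


# Hypothetical areas: (predicted, actual) annual GP appointments.
areas = {
    "Area A": (1_000_000, 900_000),
    "Area B": (500_000, 525_000),
}
for name, (predicted, actual) in areas.items():
    print(f"{name}: {unmet_demand_proportion(predicted, actual):.1%}")
```

Flooring at zero treats areas that deliver more appointments than predicted as having no unmet demand; a signed difference would be needed to show over-delivery.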

<br><hr><br>
11 changes: 9 additions & 2 deletions main.py
@@ -6,7 +6,6 @@
from collections import defaultdict
from typing import Dict, Literal, Optional, Union, Tuple
from tqdm import tqdm
-from icecream import ic
import json

from src.various_methods import is_working_day
@@ -23,7 +22,6 @@

 from src.simulation import (SimulationData,
                             DailyRegionalModel)
-from src.simulation_schemas import SimulationOutputs

# Configure logging
log_filename = f"outputs/simulation_log_{datetime.now(tz=pytz.timezone('UTC')).isoformat()}.txt"
@@ -100,10 +98,19 @@ def run_simulation(start_date:dt.date,end_date:dt.date, n_runs:int):

 if __name__ == '__main__':

+
+    start_time = datetime.now()
+    logging.info(f"Simulation started at {start_time}")
+
     simulation_outputs = run_simulation(
         start_date = dt.date(2024,6,1),
         end_date = dt.date(2025,5,31),
         n_runs=2)
     # save the outputs
+
+    end_time = datetime.now()
+    logging.info(f"Simulation ended at {end_time}")
+    logging.info(f"Total time taken for simulation: {end_time - start_time}")
+
     with open('outputs/simulation_outputs.json', 'w') as f:
         json.dump(simulation_outputs, f)
193 changes: 193 additions & 0 deletions main_para.py
@@ -0,0 +1,193 @@
import numpy as np
import pandas as pd
import datetime as dt
import pytz
import random
from collections import defaultdict
from typing import Dict, Literal, Optional, Union, Tuple
from tqdm import tqdm
import json
import multiprocessing as mp
from functools import partial
import logging
from datetime import datetime

from src.various_methods import is_working_day
from src.constants import (SARIMA_FORECAST_OUTPUT_FILENAME,
                           APPOINTMENT_DURATION_OUTPUT_FILENAME,
                           STAFF_TYPE_PROPENSITY_OUTPUT_FILENAME,
                           APPOINTMENT_MODE_PROPENSITY_OUTPUT_FILENAME,
                           POPULATION_PROJECTIONS_OUTPUT_FILENAME,
                           ACUTE_REFERRAL_RATES_OUTPUT_FILENAME,
                           WORKFORCE_CURRENT_STAFF_FTE)

from src.simulation import (SimulationData,
                            DailyRegionalModel)

# Configure logging with process ID
def setup_logging():
    log_filename = f"outputs/simulation_log_{datetime.now(tz=pytz.timezone('UTC')).isoformat()}.txt"
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - Process %(process)d - %(levelname)s - %(message)s',
        handlers=[
            logging.FileHandler(log_filename),
            logging.StreamHandler()
        ]
    )

DEMAND_FORECASTS = [
    'SARIMA',
]

CAPACITY_POLICIES = ['Do Nothing']

def merge_daily_summaries(existing_summary: Dict, new_summary: Dict) -> Dict:
    """
    Merge two daily summaries without requiring DailyRegionalModel instance.
    This function should implement the same logic as DailyRegionalModel.update_summary()
    """
    merged = existing_summary.copy()
    # Add your merging logic here based on your summary structure
    # For example, if your summaries contain counts or averages:
    for key in new_summary:
        if key in merged:
            if isinstance(merged[key], (int, float)):
                merged[key] = (merged[key] + new_summary[key]) / 2  # or sum, depending on your needs
            elif isinstance(merged[key], list):
                merged[key].extend(new_summary[key])
            elif isinstance(merged[key], dict):
                merged[key] = merge_daily_summaries(merged[key], new_summary[key])
        else:
            merged[key] = new_summary[key]
    return merged

def process_single_run(run_params: tuple) -> Dict:
    """
    Process a single simulation run with the given parameters.
    """
    simulation_run, start_date, end_date, simulation_data = run_params
    setup_logging()  # Setup logging for this process

    date_range = pd.date_range(start=start_date, end=end_date).date
    run_output = {}

    logging.info(f"Starting simulation run {simulation_run}")

    for demand_forecast in DEMAND_FORECASTS:
        logging.info(f"Demand Forecast: {demand_forecast}")
        run_output[demand_forecast] = {}

        for capacity_policy in CAPACITY_POLICIES:
            logging.info(f"Capacity Policy: {capacity_policy}")
            run_output[demand_forecast][capacity_policy] = {}

            for region in ['06L', '07K', '06T']:
                logging.info(f"Region: {region}")
                run_output[demand_forecast][capacity_policy][region] = {}

                for day in date_range:
                    if is_working_day(day):
                        logging.info(f"Processing day {day}")
                        daily_model = DailyRegionalModel(
                            sim_data=simulation_data,
                            date=day,
                            run_number=simulation_run,
                            region=region,
                            forecast_model=demand_forecast,
                            capacity_policy=capacity_policy
                        )
                        daily_model.process_day()
                        day_isoformat = day.isoformat()
                        run_output[demand_forecast][capacity_policy][region][day_isoformat] = daily_model.create_initial_summary()

    return run_output

def merge_run_outputs(outputs_list: list) -> Dict:
    """
    Merge the outputs from multiple simulation runs into a single dictionary.
    """
    merged_outputs = {}

    for run_output in outputs_list:
        for demand_forecast in run_output:
            if demand_forecast not in merged_outputs:
                merged_outputs[demand_forecast] = {}

            for capacity_policy in run_output[demand_forecast]:
                if capacity_policy not in merged_outputs[demand_forecast]:
                    merged_outputs[demand_forecast][capacity_policy] = {}

                for region in run_output[demand_forecast][capacity_policy]:
                    if region not in merged_outputs[demand_forecast][capacity_policy]:
                        merged_outputs[demand_forecast][capacity_policy][region] = {}

                    for day, summary in run_output[demand_forecast][capacity_policy][region].items():
                        if day in merged_outputs[demand_forecast][capacity_policy][region]:
                            # Merge summaries directly without creating a new DailyRegionalModel
                            merged_outputs[demand_forecast][capacity_policy][region][day] = merge_daily_summaries(
                                merged_outputs[demand_forecast][capacity_policy][region][day],
                                summary
                            )
                        else:
                            merged_outputs[demand_forecast][capacity_policy][region][day] = summary

    return merged_outputs

def run_simulation(start_date: dt.date, end_date: dt.date, n_runs: int) -> Dict:
    """
    Run the simulation in parallel using multiple processes.
    """
    setup_logging()
    logging.info("Starting parallel simulation")

    # Initialize simulation data (shared between processes)
    simulation_data = SimulationData()

    # Create a pool of workers
    num_processes = mp.cpu_count()  #- 1 # Leave one CPU free for system tasks
    pool = mp.Pool(processes=num_processes)

    # Prepare parameters for each run
    run_params = [(i, start_date, end_date, simulation_data) for i in range(n_runs)]

    # Run simulations in parallel with progress bar
    outputs_list = list(tqdm(
        pool.imap(process_single_run, run_params),
        total=n_runs,
        desc="Processing simulation runs"
    ))

    # Clean up
    pool.close()
    pool.join()

    # Merge results from all runs
    merged_outputs = merge_run_outputs(outputs_list)

    logging.info("Parallel simulation completed")
    return merged_outputs

if __name__ == '__main__':

    import time

    # Start timing the simulation
    start_time = time.time()

    # Run the simulation
    simulation_outputs = run_simulation(
        start_date=dt.date(2024, 6, 1),
        end_date=dt.date(2024, 6, 30),
        n_runs=10
    )

    # End timing the simulation
    end_time = time.time()
    elapsed_time = end_time - start_time

    # Log the elapsed time
    logging.info(f"Simulation completed in {elapsed_time:.2f} seconds")
    # Save the outputs
    with open('outputs/simulation_outputs.json', 'w') as f:
        json.dump(simulation_outputs, f)
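
A note on the numeric branch of `merge_daily_summaries`: averaging the merged value with each new run (`(merged + new) / 2`) only matches the mean across runs when there are two runs; with more, later runs are weighted more heavily. The sketch below contrasts that with a running mean. The helper names and sample values are hypothetical, and whether a mean or a sum is wanted depends on what `DailyRegionalModel.update_summary()` is meant to do, which is not shown in this diff.

```python
def pairwise_average(values):
    """Mimics applying `(merged + new) / 2` as each run's value is merged in."""
    merged = values[0]
    for value in values[1:]:
        merged = (merged + value) / 2
    return merged


def running_mean(values):
    """Incremental mean that weights every run equally."""
    mean, count = 0.0, 0
    for value in values:
        count += 1
        mean += (value - mean) / count
    return mean


# Made-up per-run values for one summary field.
runs = [10.0, 20.0, 60.0]
print("pairwise average:", pairwise_average(runs))
print("running mean:    ", running_mean(runs))
```

Keeping a run count alongside each merged value (or summing and dividing once at the end) would let the merge weight every run equally.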