SNEE-ICS · AJarman · Oct 7, 2024 · Oct 7, 2024 · Oct 7, 2024 · Oct 26, 2024
diff --git a/.github/workflows/github_pages.yml b/.github/workflows/github_pages.yml
@@ -34,7 +34,7 @@ jobs:
       - name: Checkout
         uses: actions/checkout@v3
       - name: Download notebooks workflow artifact
-        uses: dawidd6/action-download-artifact@v2.28.0
+        uses: dawidd6/action-download-artifact@v6
         with:
           workflow: notebook_run.yml
           name: notebook-outputs

diff --git a/PelicanWebsite/content/capacity-analysis/appointments_vs_staffing.md b/PelicanWebsite/content/capacity-analysis/appointments_vs_staffing.md
@@ -0,0 +1,144 @@
+Title: Appointments vs. Staffing
+Date: 2024-01-04
+Modified: 2024-10-11
+Category: Capacity Analysis
+Authors: A.Jarman & I.Khan
+Summary: Analysis of primary care appointments, staffing and the patient-to-GP ratio.
+
+<br>
+
+## Introduction & Background
+This analysis examines the relationship between the number of primary care appointments per head of population and staffing levels per head of population. The goal is to investigate how the patient-to-GP ratio correlates with appointments per head, specifically in SNEE compared with sub-ICBs across England.
+<br><br>
+
+## Data Sources
+This analysis draws on three datasets provided by NHS England: the appointments dataset, the GP patient list and the workforce (staffing) dataset.
+<table>
+    <thead>
+        <tr>
+            <th>Dataset used</th>
+            <th>Website URL</th>
+            <th>Download zip</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td>Appointments dataset</td>
+            <td><a href="https://digital.nhs.uk/data-and-information/publications/statistical/appointments-in-general-practice" target="_blank">NHS Digital - Appointments in General Practice</a></td>
+            <td><a href="https://files.digital.nhs.uk/D5/4B437E/Appointments_GP_Regional_CSV_Aug_24.zip">Actual_Duration_CSV_Aug_24.csv</a></td>
+        </tr>
+        <tr>
+            <td>Registered patients dataset</td>
+            <td><a href="https://digital.nhs.uk/data-and-information/publications/statistical/patients-registered-at-a-gp-practice" target="_blank">NHS Digital - Patients Registered at a GP practice</a></td>
+            <td><a href="https://files.digital.nhs.uk/31/AA1C9E/gp-reg-pat-prac-quin-age.zip">gp-reg-pat-prac-quin-age.csv</a></td>
+        </tr>
+        <tr>
+            <td>General Practice workforce</td>
+            <td><a href="https://digital.nhs.uk/data-and-information/publications/statistical/general-and-personal-medical-services" target="_blank">NHS Digital - General Practice Workforce</a></td>
+            <td><a href="https://files.digital.nhs.uk/45/72481B/GPWIndividualCSV.082024.zip">General Practice – August 2024 Individual Level.csv</a></td>
+        </tr>
+        <tr>
+            <td>Appointments by Region dataset</td>
+            <td><a href="https://digital.nhs.uk/data-and-information/publications/statistical/appointments-in-general-practice" target="_blank">NHS Digital - GP Appointments by Region</a></td>
+            <td><a href="https://files.digital.nhs.uk/D5/4B437E/Appointments_GP_Regional_CSV_Aug_24.zip">Regional_CSV_SuffolkNEEssex.csv</a></td>
+        </tr>
+    </tbody>
+</table>
+
+- Patient numbers are a snapshot taken in Sep/2024.
+- Workforce figures are also a snapshot, taken in Aug/2024.
+- Appointment counts are the sum over the latest year of data (Aug/23 - Aug/24).
+<br><br>
+
+## Methodology
+The analysis involved a systematic approach to pre-process, integrate, and analyze datasets to explore relationships between key variables and predict outcomes. Below are the steps undertaken:
+
+1. Data Pre-processing and Integration:
+    - Each dataset was pre-processed based on specific requirements and then merged into a single comprehensive dataset.
+
+2. Calculation of Key Ratios:
+Several ratios were computed to facilitate analysis, including:
+
+    - All Appointments per Head of Population: *Count of all appointments / Number of Patients*
+    - GP Appointments per Head of Population: *Count of GP appointments / Number of Patients*
+    - Staffing per 1,000 Registered Population: *(Combined Staff FTE / Number of Patients) × 1,000*
+    - Patient-to-GP Ratio: *Number of Patients / GP FTE*
+
+3. Analysis of Staffing Levels and Primary Care Appointments:
+    - The relationship between staffing levels and all primary care appointments (per head of population) was assessed using Spearman's and Pearson's correlation coefficients.
+    - A regression model was then employed to quantify the relationship.
+
+4. Analysis of Patient-to-GP Ratio and GP Appointments:
+    - The relationship between the patient-to-GP ratio and GP appointments (per head of population) was similarly evaluated using correlation analysis (Spearman's and Pearson's).
+    - A regression model was run to understand and quantify this relationship.
+
+5. Visual Comparison Across Regions:
+    - Patient-to-GP ratio and GP appointments per head of population were plotted for Integrated Care Boards (ICBs) and Sub-ICBs across England to enable regional comparisons.
+
+6. Analysis of Age and GP Appointments:
+    - Patients were categorized into three age bands: 0–19, 20–64, and 65+.
+    - For each age band, correlations (Spearman's/Pearson's) were calculated to assess the relationship between age and GP appointments.
+    - Multiple regression models were run to predict GP appointment counts, with the best-performing model selected for further predictions.
+
+7. Unmet Demand Estimation:
+    - Predicted and actual GP appointment counts were compared to calculate the proportion of unmet demand for each ICB and Sub-ICB.
+<br><br>
+
+## Results and Inferences
+
+### Staffing Levels and Primary Care Appointments: 
+- The correlation coefficient (0.6512) shows a moderate-to-strong positive relationship: as staffing levels per head increase, appointments per head tend to increase as well.
+- The P-value (4.08e-14) is far below 0.05, confirming the correlation is statistically significant.
+- Regression Results:
+    - Slope (1.4216): For every one-unit increase in staffing per 1,000 registered population, all appointments per head increase by 1.4216.
+    - The slope is significant (p-value < 0.05).
+- R-squared (0.424) indicates 42.4% of appointment variability is explained by staffing levels.
+
+
+### Patient-to-GP Ratio and GP Appointments:
+- Pearson (-0.42) and Spearman (-0.41) coefficients show a moderate negative relationship between the patient-to-GP ratio and GP appointments per head.
+- The p-value (< 0.05) confirms the relationship is statistically significant.
+- Regression Results:  
+    - Coefficient (-0.0006): A one-unit increase in the patient-to-GP ratio is associated with 0.0006 fewer GP appointments per head.
+    - The coefficient is statistically significant, but its practical impact is negligible.
+- The low R-squared value (0.177) means the patient-to-GP ratio explains only 17.7% of the variability in GP appointments per head.
+
+
+### Visual Comparison Across Regions:
+Comparisons of Patient-to-GP ratios and appointments per head across ICBs and Sub-ICBs are visualized in the graphs below:
+<br>
+
+- ICBs
+![ICB Comparison]({attach}/img/Appointments_vs_staffing_3.png)
+<br>
+
+- Sub-ICBs
+![SUB-ICB Comparison]({attach}/img/Appointments_vs_staffing_6.png)
+
+
+### Analysis of Age and GP Appointments:
+- Pearson and Spearman coefficients (> 0.95) indicate a very strong positive correlation: areas with more patients in any age band record more GP appointments.
+- The p-value is extremely small, confirming the correlation is significant.
+- Visualising the relation:
+<br>
+
+![Age and GP appointments]({attach}/img/Appointments_vs_staffing_7.png)
+<br>
+
+Regression Inferences:
+
+- A 1 FTE increase in GPs increases annual appointments by 1,567.
+- A one-person increase in the total population raises GP appointments by 2.46.
+- A one-person increase in the 65+ group raises GP appointments by 3.635 (2.46 + 1.175).
+
+### Unmet Demand Estimation:
+- Comparison of actual and predicted GP appointments:
+![Actual vs predicted GP appointments]({attach}/img/Appointments_vs_staffing_10.png)
+
+- Visualization of unmet demand across Integrated Care Boards (ICBs):
+![Unmet demand by ICB]({attach}/img/Appointments_vs_staffing_8.png)
+
+- Visualization of unmet demand across Sub-Integrated Care Boards (Sub-ICBs):
+![Unmet demand by Sub-ICB]({attach}/img/Appointments_vs_staffing_9.png)
+
+<br><hr><br>
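
Reviewer note on `appointments_vs_staffing.md`: the ratio definitions and the Pearson/Spearman step in the Methodology section can be sketched as below. This is a minimal illustration under assumptions, not the notebook's code: the function names (`key_ratios`, `pearson`, `ranks`, `spearman`) are hypothetical, and the inputs in the usage note are made-up figures chosen only to exercise the formulas.

```python
from math import sqrt
from typing import Dict, List, Sequence


def key_ratios(all_appts: float, gp_appts: float, patients: float,
               staff_fte: float, gp_fte: float) -> Dict[str, float]:
    """Compute the four ratios listed under 'Calculation of Key Ratios'."""
    return {
        "all_appts_per_head": all_appts / patients,
        "gp_appts_per_head": gp_appts / patients,
        "staff_per_1000": staff_fte / patients * 1000,
        "patients_per_gp": patients / gp_fte,
    }


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))
```

For example, `key_ratios(all_appts=600_000, gp_appts=270_000, patients=100_000, staff_fte=150.0, gp_fte=50.0)` gives 6.0 appointments per head and a patient-to-GP ratio of 2000.0. Applying `pearson` and `spearman` to the per-area `staff_per_1000` and `all_appts_per_head` columns would correspond to step 3 of the methodology.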
diff --git a/PelicanWebsite/content/img/Appointments_vs_staffing_1.png b/PelicanWebsite/content/img/Appointments_vs_staffing_1.png
diff --git a/PelicanWebsite/content/img/Appointments_vs_staffing_10.png b/PelicanWebsite/content/img/Appointments_vs_staffing_10.png
diff --git a/PelicanWebsite/content/img/Appointments_vs_staffing_2.png b/PelicanWebsite/content/img/Appointments_vs_staffing_2.png
diff --git a/PelicanWebsite/content/img/Appointments_vs_staffing_3.png b/PelicanWebsite/content/img/Appointments_vs_staffing_3.png
diff --git a/PelicanWebsite/content/img/Appointments_vs_staffing_4.png b/PelicanWebsite/content/img/Appointments_vs_staffing_4.png
diff --git a/PelicanWebsite/content/img/Appointments_vs_staffing_5.png b/PelicanWebsite/content/img/Appointments_vs_staffing_5.png
diff --git a/PelicanWebsite/content/img/Appointments_vs_staffing_6.png b/PelicanWebsite/content/img/Appointments_vs_staffing_6.png
diff --git a/PelicanWebsite/content/img/Appointments_vs_staffing_7.png b/PelicanWebsite/content/img/Appointments_vs_staffing_7.png
diff --git a/PelicanWebsite/content/img/Appointments_vs_staffing_8.png b/PelicanWebsite/content/img/Appointments_vs_staffing_8.png
diff --git a/PelicanWebsite/content/img/Appointments_vs_staffing_9.png b/PelicanWebsite/content/img/Appointments_vs_staffing_9.png
diff --git a/main.py b/main.py
@@ -6,7 +6,6 @@
 from collections import defaultdict
 from typing import Dict, Literal, Optional, Union, Tuple
 from tqdm import tqdm
-from icecream import ic
 import json
 
 from src.various_methods import is_working_day
@@ -23,7 +22,6 @@
 
 from src.simulation import (SimulationData, 
                             DailyRegionalModel)
-from src.simulation_schemas import SimulationOutputs
 
 # Configure logging
 log_filename = f"outputs/simulation_log_{datetime.now(tz=pytz.timezone('UTC')).isoformat()}.txt"
@@ -100,10 +98,19 @@ def run_simulation(start_date:dt.date,end_date:dt.date, n_runs:int):
 
 if __name__ == '__main__':
 
+
+    start_time = datetime.now()
+    logging.info(f"Simulation started at {start_time}")
+
     simulation_outputs = run_simulation(
         start_date = dt.date(2024,6,1),
         end_date = dt.date(2025,5,31),
         n_runs=2)
     # save the outputs
+
+    end_time = datetime.now()
+    logging.info(f"Simulation ended at {end_time}")
+    logging.info(f"Total time taken for simulation: {end_time - start_time}")
+
     with open('outputs/simulation_outputs.json', 'w') as f:
         json.dump(simulation_outputs, f)
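
Reviewer note on the timing added to `main.py`: `datetime.now()` reads the wall clock, which can shift if the system time is adjusted during a long run. A monotonic clock such as `time.perf_counter()` is the usual choice for measuring elapsed time. The helper below is a hypothetical sketch (the name `timed` does not exist in the repository) and is offered as one option, not a required change.

```python
import logging
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn, log how long it took, and return (result, elapsed seconds)."""
    start = time.perf_counter()  # monotonic, unaffected by clock adjustments
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    logging.info("%s finished in %.2f seconds", fn.__name__, elapsed)
    return result, elapsed
```

In `main.py` this could be used as `simulation_outputs, elapsed = timed(run_simulation, start_date=..., end_date=..., n_runs=2)`, keeping the start and end timestamps in the log only if the absolute times are also of interest.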
diff --git a/main_para.py b/main_para.py
@@ -0,0 +1,193 @@
+import numpy as np
+import pandas as pd
+import datetime as dt
+import pytz
+import random
+from collections import defaultdict
+from typing import Dict, Literal, Optional, Union, Tuple
+from tqdm import tqdm
+import json
+import multiprocessing as mp
+from functools import partial
+import logging
+from datetime import datetime
+
+from src.various_methods import is_working_day
+from src.constants import (SARIMA_FORECAST_OUTPUT_FILENAME, 
+                         APPOINTMENT_DURATION_OUTPUT_FILENAME, 
+                         STAFF_TYPE_PROPENSITY_OUTPUT_FILENAME, 
+                         APPOINTMENT_MODE_PROPENSITY_OUTPUT_FILENAME, 
+                         POPULATION_PROJECTIONS_OUTPUT_FILENAME,
+                         ACUTE_REFERRAL_RATES_OUTPUT_FILENAME,
+                         WORKFORCE_CURRENT_STAFF_FTE)
+
+from src.simulation import (SimulationData, 
+                          DailyRegionalModel)
+
+# Configure logging with process ID
+def setup_logging():
+    log_filename = f"outputs/simulation_log_{datetime.now(tz=pytz.timezone('UTC')).isoformat()}.txt"
+    logging.basicConfig(
+        level=logging.INFO,
+        format='%(asctime)s - Process %(process)d - %(levelname)s - %(message)s',
+        handlers=[
+            logging.FileHandler(log_filename),
+            logging.StreamHandler()
+        ]
+    )
+
+DEMAND_FORECASTS = [
+    'SARIMA',
+]
+
+CAPACITY_POLICIES = ['Do Nothing']
+
+def merge_daily_summaries(existing_summary: Dict, new_summary: Dict) -> Dict:
+    """
+    Merge two daily summaries without requiring DailyRegionalModel instance.
+    This function should implement the same logic as DailyRegionalModel.update_summary()
+    """
+    merged = existing_summary.copy()
+    # Numeric values are combined by pairwise averaging. NOTE: folding more than two
+    # runs this way weights later runs more heavily, so it is not the mean across runs.
+    for key in new_summary:
+        if key in merged:
+            if isinstance(merged[key], (int, float)):
+                merged[key] = (merged[key] + new_summary[key]) / 2
+            elif isinstance(merged[key], list):
+                merged[key].extend(new_summary[key])
+            elif isinstance(merged[key], dict):
+                merged[key] = merge_daily_summaries(merged[key], new_summary[key])
+        else:
+            merged[key] = new_summary[key]
+    return merged
+
+def process_single_run(run_params: tuple) -> Dict:
+    """
+    Process a single simulation run with the given parameters.
+    """
+    simulation_run, start_date, end_date, simulation_data = run_params
+    setup_logging()  # Setup logging for this process
+
+    date_range = pd.date_range(start=start_date, end=end_date).date
+    run_output = {}
+
+    logging.info(f"Starting simulation run {simulation_run}")
+
+    for demand_forecast in DEMAND_FORECASTS:
+        logging.info(f"Demand Forecast: {demand_forecast}")
+        run_output[demand_forecast] = {}
+
+        for capacity_policy in CAPACITY_POLICIES:
+            logging.info(f"Capacity Policy: {capacity_policy}")
+            run_output[demand_forecast][capacity_policy] = {}
+
+            for region in ['06L', '07K', '06T']:
+                logging.info(f"Region: {region}")
+                run_output[demand_forecast][capacity_policy][region] = {}
+
+                for day in date_range:
+                    if is_working_day(day):
+                        logging.info(f"Processing day {day}")
+                        daily_model = DailyRegionalModel(
+                            sim_data=simulation_data,
+                            date=day,
+                            run_number=simulation_run,
+                            region=region,
+                            forecast_model=demand_forecast,
+                            capacity_policy=capacity_policy
+                        )
+                        daily_model.process_day()
+                        day_isoformat = day.isoformat()
+                        run_output[demand_forecast][capacity_policy][region][day_isoformat] = daily_model.create_initial_summary()
+
+    return run_output
+
+def merge_run_outputs(outputs_list: list) -> Dict:
+    """
+    Merge the outputs from multiple simulation runs into a single dictionary.
+    """
+    merged_outputs = {}
+
+    for run_output in outputs_list:
+        for demand_forecast in run_output:
+            if demand_forecast not in merged_outputs:
+                merged_outputs[demand_forecast] = {}
+
+            for capacity_policy in run_output[demand_forecast]:
+                if capacity_policy not in merged_outputs[demand_forecast]:
+                    merged_outputs[demand_forecast][capacity_policy] = {}
+
+                for region in run_output[demand_forecast][capacity_policy]:
+                    if region not in merged_outputs[demand_forecast][capacity_policy]:
+                        merged_outputs[demand_forecast][capacity_policy][region] = {}
+
+                    for day, summary in run_output[demand_forecast][capacity_policy][region].items():
+                        if day in merged_outputs[demand_forecast][capacity_policy][region]:
+                            # Merge summaries directly without creating a new DailyRegionalModel
+                            merged_outputs[demand_forecast][capacity_policy][region][day] = merge_daily_summaries(
+                                merged_outputs[demand_forecast][capacity_policy][region][day],
+                                summary
+                            )
+                        else:
+                            merged_outputs[demand_forecast][capacity_policy][region][day] = summary
+
+    return merged_outputs
+
+def run_simulation(start_date: dt.date, end_date: dt.date, n_runs: int) -> Dict:
+    """
+    Run the simulation in parallel using multiple processes.
+    """
+    setup_logging()
+    logging.info("Starting parallel simulation")
+
+    # Initialize simulation data once; it is pickled and sent to each worker with its run parameters
+    simulation_data = SimulationData()
+
+    # Create a pool of workers
+    num_processes = mp.cpu_count()  # Use every available core; subtract 1 to leave one free for system tasks
+    pool = mp.Pool(processes=num_processes)
+
+    # Prepare parameters for each run
+    run_params = [(i, start_date, end_date, simulation_data) for i in range(n_runs)]
+
+    # Run simulations in parallel with progress bar
+    outputs_list = list(tqdm(
+        pool.imap(process_single_run, run_params),
+        total=n_runs,
+        desc="Processing simulation runs"
+    ))
+
+    # Clean up
+    pool.close()
+    pool.join()
+
+    # Merge results from all runs
+    merged_outputs = merge_run_outputs(outputs_list)
+
+    logging.info("Parallel simulation completed")
+    return merged_outputs
+
+if __name__ == '__main__':
+
+    import time
+
+    # Start timing the simulation
+    start_time = time.time()
+
+    # Run the simulation
+    simulation_outputs = run_simulation(
+        start_date=dt.date(2024, 6, 1),
+        end_date=dt.date(2024, 6, 30),
+        n_runs=10
+    )
+
+    # End timing the simulation
+    end_time = time.time()
+    elapsed_time = end_time - start_time
+
+    # Log the elapsed time
+    logging.info(f"Simulation completed in {elapsed_time:.2f} seconds")
+    # Save the outputs
+    with open('outputs/simulation_outputs.json', 'w') as f:
+        json.dump(simulation_outputs, f)
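
Reviewer note on `merge_daily_summaries` in `main_para.py`: numeric values are merged as `(a + b) / 2`, and `merge_run_outputs` applies that merge once per additional run. With more than two runs the result is not the arithmetic mean; for three runs a, b, c it works out to a/4 + b/4 + c/2. One alternative is to collect the per-run summaries for a day and reduce them in a single pass. The sketch below assumes each daily summary is a nested dict whose leaves are numbers or lists; that structure is an assumption, since `create_initial_summary()` is not shown in this diff, and the name `mean_of_summaries` is hypothetical.

```python
from statistics import fmean
from typing import Any, Dict, List


def mean_of_summaries(summaries: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine the per-run summaries for one day in a single pass.

    Numbers are averaged over the runs that report the key, lists are
    concatenated, and nested dicts are combined recursively.
    """
    merged: Dict[str, Any] = {}
    # Preserve first-seen key order across all runs.
    keys = list(dict.fromkeys(key for summary in summaries for key in summary))
    for key in keys:
        values = [summary[key] for summary in summaries if key in summary]
        first = values[0]
        if isinstance(first, dict):
            merged[key] = mean_of_summaries(values)
        elif isinstance(first, list):
            merged[key] = [item for value in values for item in value]
        elif isinstance(first, (int, float)) and not isinstance(first, bool):
            merged[key] = fmean(values)
        else:
            # Assumption: other types (e.g. labels) are identical across runs.
            merged[key] = first
    return merged
```

For values 10, 20 and 60 from three runs, pairwise folding yields 37.5, whereas `mean_of_summaries` returns 30.0. If a sum across runs is what the summaries need, `fmean` can be swapped for `sum`.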