Skip to content

Develop#44

Merged
KYANJO merged 2 commits intoICESEE-project:mainfrom
KYANJO:develop
Feb 20, 2026
Merged

Develop#44
KYANJO merged 2 commits intoICESEE-project:mainfrom
KYANJO:develop

Conversation

@KYANJO
Copy link
Collaborator

@KYANJO KYANJO commented Feb 20, 2026

This pull request introduces two new SLURM batch scripts for strong and weak scaling tests, and updates the ensemble analysis methods in the EnKF Python code to simplify their outputs. The most significant change is that all analysis methods now return only the updated ensemble, removing the computation and return of the analysis error covariance. This change is reflected in both the analysis method implementations and their usage in the serial data assimilation workflow.

Scaling batch scripts:

  • Added run_job_strong_spack.sbatch and run_job_weak_spack.sbatch scripts for launching strong and weak scaling tests, respectively, with Spack environment setup and SLURM resource configuration. [1] [2]

Ensemble analysis output simplification:

  • Modified EnKF_Analysis, DEnKF_Analysis, EnRSKF_Analysis, and EnTKF_Analysis methods in EnKF.py to return only the updated ensemble (ensemble_analysis), removing the computation and return of the analysis error covariance. [1] [2] [3] [4]
  • Updated icesee_model_data_assimilation_serial in icesee_da_serial.py to match new method signatures, only assigning the updated ensemble and not expecting covariance output.

Debug and experimental code changes:

  • Commented out unused code and debug statements related to noise generation and analysis error covariance computation in EnKF.py. [1] [2] [3] [4] [5]

These changes streamline the ensemble analysis workflow and add infrastructure for performance testing.

Copilot AI review requested due to automatic review settings February 20, 2026 17:52
@KYANJO KYANJO merged commit 67f1341 into ICESEE-project:main Feb 20, 2026
1 of 2 checks passed
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request simplifies the EnKF analysis workflow by removing the computation and return of analysis error covariance from all ensemble analysis methods. It also adds infrastructure for strong and weak scaling performance tests via new SLURM batch scripts.

Changes:

  • Modified all four EnKF analysis methods (EnKF_Analysis, DEnKF_Analysis, EnRSKF_Analysis, EnTKF_Analysis) to return only the updated ensemble instead of both ensemble and error covariance
  • Updated the serial data assimilation workflow to match the new single-return signatures
  • Added SLURM batch scripts for conducting strong and weak scaling tests with Spack environment configuration

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 10 comments.

File Description
src/EnKF/python_enkf/EnKF.py Commented out covariance computation and simplified return signatures for all four analysis methods
src/run_model_da/icesee_da_serial.py Updated method calls to match new single-return signatures (no longer unpacking covariance)
applications/issm_model/examples/ISMIP_Choi/run_job_weak_spack.sbatch New script for weak scaling tests with varying ensemble sizes and fixed model processes
applications/issm_model/examples/ISMIP_Choi/run_job_strong_spack.sbatch New script for strong scaling tests with fixed ensemble size and varying process counts

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

#!/bin/bash
#SBATCH -t250:00:00
#SBATCH -JISMIP
#SBATCH -N 14 # 1080 / 24 = 45 nodes
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 4 is inconsistent with the actual SLURM directive. The comment says "1080 / 24 = 45 nodes" but the directive specifies 14 nodes. Either the comment should be updated to "336 / 24 = 14 nodes" (based on the -n 336 directive on line 5), or if 45 nodes is actually needed, the -N directive should be corrected.

Suggested change
#SBATCH -N 14 # 1080 / 24 = 45 nodes
#SBATCH -N 14 # 336 / 24 = 14 nodes

Copilot uses AI. Check for mistakes.

# Compute total ranks used = np * (MODEL_NPROCS + 1)
local rank_size=$(( np * (MODEL_NPROCS + 1) ))
local data_path="_modelrun_datasets"
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable data_path is defined on line 43 but never used. Either remove this unused variable or add the necessary logic that references this path.

Suggested change
local data_path="_modelrun_datasets"

Copilot uses AI. Check for mistakes.
Comment on lines +129 to +130
# ensemble[indx_map[key],ens] = value + q0[:len(value)]
# ensemble[indx_map[key],ens] = value + np.random.multivariate_normal(np.zeros(len(value)), 0.0*np.eye(len(value)))
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation for these commented-out lines appears inconsistent. Line 129 has correct indentation (matching the comment above it), but line 130 has extra indentation that doesn't align with the surrounding code structure. This should be corrected for consistency, even though the code is commented out.

Copilot uses AI. Check for mistakes.
Comment on lines +413 to +414
# return ensemble_analysis, analysis_error_cov
return ensemble_analysis
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for this method (lines 380-389, outside the changed region) still indicates it returns both ensemble_analysis and analysis_error_cov, but the implementation now only returns ensemble_analysis. While the docstring itself cannot be edited in this PR since it's outside the changed regions, this creates an inconsistency between documentation and implementation that should be addressed.

Copilot uses AI. Check for mistakes.
Comment on lines +41 to +42
# Compute total ranks used = np * (MODEL_NPROCS + 1)
local rank_size=$(( np * (MODEL_NPROCS + 1) ))
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable rank_size is computed on line 42 but never used. This calculation appears intended for resource allocation or validation but is not utilized. Either remove this unused variable or add the necessary logic that uses it for validation or configuration.

Suggested change
# Compute total ranks used = np * (MODEL_NPROCS + 1)
local rank_size=$(( np * (MODEL_NPROCS + 1) ))

Copilot uses AI. Check for mistakes.
#SBATCH --mail-user=bkyanjo3@gatech.edu

# set -euo pipefail
cd $SLURM_SUBMIT_DIR
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent quoting: cd $SLURM_SUBMIT_DIR should use quotes like cd "$SLURM_SUBMIT_DIR" (as done in the weak scaling script on line 15) to handle paths with spaces safely. This is a shell scripting best practice.

Suggested change
cd $SLURM_SUBMIT_DIR
cd "$SLURM_SUBMIT_DIR"

Copilot uses AI. Check for mistakes.
Comment on lines +36 to +38
# "$MPI_DIR/bin/mpirun" -np "$np" \
# srun --mpi=pmix -n "$np" \

Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These commented-out lines (36-37) appear to be leftover debug or alternative implementation code. Since this is production code for scaling tests, consider removing these comments to improve readability, or if they serve as documentation for alternative MPI launch methods, add a clearer explanation of why they're preserved.

Suggested change
# "$MPI_DIR/bin/mpirun" -np "$np" \
# srun --mpi=pmix -n "$np" \

Copilot uses AI. Check for mistakes.
Comment on lines +453 to +454
# return ensemble_analysis, analysis_error_cov
return ensemble_analysis
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for this method (lines 417-426, outside the changed region) still indicates it returns both ensemble_analysis and analysis_error_cov, but the implementation now only returns ensemble_analysis. This creates an inconsistency between documentation and implementation that should be addressed.

Copilot uses AI. Check for mistakes.
# analysis_error_cov = ensemble_analysis + self.Cov_model@(U@np.diag(np.sqrt(S))@U.T)

return ensemble_analysis, analysis_error_cov
return ensemble_analysis
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for this method (lines 457-466, outside the changed region) still indicates it returns both ensemble_analysis and analysis_error_cov, but the implementation now only returns ensemble_analysis. This creates an inconsistency between documentation and implementation that should be addressed.

Copilot uses AI. Check for mistakes.
Comment on lines +526 to +527
# return ensemble_analysis, analysis_error_cov
return ensemble_analysis
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for this method (lines 490-499, outside the changed region) still indicates it returns both ensemble_analysis and analysis_error_cov, but the implementation now only returns ensemble_analysis. This creates an inconsistency between documentation and implementation that should be addressed.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants