Reproducible research artifact for the NEMO spin-up acceleration method. This repository contains the full end-to-end workflow as a citable, versioned snapshot.
| Submodule | Description |
|---|---|
| nemo-spinup-forecast | Dimensionality reduction and forecasting |
| nemo-spinup-restart | Restart file generation |
| nemo-spinup-evaluation | Evaluation and validation |
Reference data (DINO output, restart files, mesh_mask.nc) is archived on Zenodo:
Zenodo DOI: 10.5281/zenodo.19557419
This installs all three nemo-spinup-{forecast, restart, evaluation} packages in a single virtual environment.
1. Clone this repository with submodules:

       git clone --recurse-submodules https://github.com/m2lines/nemo-spinup-bench.git
       cd nemo-spinup-bench

2. Create a virtual environment and install dependencies:

       python3 -m venv venv
       source venv/bin/activate
       pip install ./nemo-spinup-{forecast,restart,evaluation}
This describes the complete end-to-end pipeline to run the benchmark. We omit details like building and compiling NEMO/DINO.

The entire pipeline assumes NEMO 4.2.0 and a completed cold-start NEMO run, i.e. output files, restart files, and a `mesh_mask.nc` are available before starting.

The commands below assume reference data is downloaded to `data/50/`. Substitute this with your own data directory if not using the reference data.

Each step below is also available as a standalone script in `pipeline/`, runnable from the bench root:

    bash pipeline/1-download-data.sh
    bash pipeline/2-evaluate-baseline.sh
    bash pipeline/3-forecast.sh
    bash pipeline/4-restart.sh
    bash pipeline/5-evaluate-projected.sh
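
The standalone scripts can also be chained so that a failing step stops the run before later steps consume incomplete output. The following is a minimal sketch, not part of the repository: `run_pipeline` is a hypothetical helper, and it assumes the scripts keep the single-digit `N-name.sh` naming shown above so that the shell glob returns them in order.

```shell
# Hypothetical helper (not shipped with the repository): run every numbered
# script in a directory in order and stop at the first one that fails.
run_pipeline() {
    dir=${1:-pipeline}
    for script in "$dir"/[0-9]-*.sh; do
        # If the glob matched nothing, $script is the literal pattern.
        [ -f "$script" ] || { echo "no pipeline scripts found in $dir" >&2; return 1; }
        echo "==> $script"
        bash "$script" || { echo "step failed: $script" >&2; return 1; }
    done
}
```

From the bench root this would be invoked as `run_pipeline pipeline`. Stopping on the first failure is a deliberate choice here, since each step reads the files written by the one before it.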
1. Get simulation data

   The entire benchmark will run using sample data hosted on Zenodo. Alternatively, you may run NEMO/DINO yourself; we recommend running for at least 50–100 years. The Zenodo data contains 50 years of DINO output files to train on.

   Download data from Zenodo

   Download `50.zip` from Zenodo record 19557419, unzip it, and place the contents in `data/50/`. The record also provides `200.zip` (200 years of DINO output) and `restart.zip` (200 annual restart files) for extended experiments.

       mkdir -p data/50
       curl -L -o 50.zip https://zenodo.org/records/19557419/files/50.zip
       unzip 50.zip -d data/50/
       rm 50.zip

   Or run `bash pipeline/1-download-data.sh`.

2. (Optional) Combine restart files and mesh mask using REBUILD_NEMO:

   This step is only required if you are using your own NEMO run. The Zenodo reference data already includes combined files. You can use the same module environment used to run NEMO/DINO to compile `rebuild_nemo`. The final argument (36 here) is the number of per-processor files to combine.

       ./rebuild_nemo -n ./nam_rebuild data/50/DINO_00576000_restart 36
       ./rebuild_nemo -n ./nam_rebuild data/50/mesh_mask 36
3. (Optional) Resample data

   This step is not required when using the Zenodo reference data, which already contains the resampled file `DINO_1m_to_1y_grid_T.nc`.

   All data must be temporally aligned before forecasting. If you are bringing your own NEMO output, convert monthly SSH to annual using `cdo`:

       cdo yearmean DINO_1m_grid_T.nc DINO_1m_To_1y_grid_T.nc

   Temperature and salinity (3-D) are already annual (`DINO_1y_grid_T.nc`).

   If more training data is needed, concatenate monthly outputs `*grid_T.nc` with `ncrcat`, part of the NCO (netCDF Operators).
The spin-up acceleration pipeline forecasts the ocean state forward in time using dimensionality reduction and Gaussian process regression, generates updated restart files, and evaluates the result against a reference numerical run. We begin with a baseline evaluation of the reference simulation so that the final evaluation can be compared against it.
4. Establish a baseline evaluation of the cold-start reference simulation:

       nemo-spinup-evaluation \
         --sim-path data/50 \
         --config configs/DINO-evaluation.yaml \
         --results-dir evaluation-output \
         --result-file-prefix baseline \
         --mode both

   Results are written to `evaluation-output/baseline_restart.csv` and `evaluation-output/baseline_grid.csv`.

   Or run `bash pipeline/2-evaluate-baseline.sh`.

5. Create the projected state

   The default technique is PCA for dimensionality reduction with Gaussian process regression for forecasting. The key parameters to adjust are `--start` and `--steps`: `--start` sets the first year used for training, and so how much of the spin-up is used (here years 20 to 50, i.e. 30 years, with the first 20 discarded), and `--steps` controls how many years are skipped forward (here 30). Increasing `--steps` gives a larger acceleration but may reduce accuracy.

       nemo-spinup-forecast \
         --ye True \
         --start 20 \
         --end 50 \
         --comp 1 \
         --steps 30 \
         --data-path data/50 \
         --output-path data/50_projected

   | Argument | Description |
   |---|---|
   | `--ye` | Simulation expressed in years (`True`) or months (`False`) |
   | `--start` | Starting year for training data |
   | `--end` | Ending year (usually the last simulated year) |
   | `--comp` | Number or ratio of components to use |
   | `--steps` | Jump size (years if `--ye True`, months otherwise) |
   | `--data-path` | Directory containing the simulation files |
   | `--output-path` | Directory to write forecast results to; a timestamped run directory is created under `data/50_projected/runs/` and `data/50_projected/latest` is a symlink to it |
   | `--ocean-terms` | Path to `ocean_terms.yaml` mapping logical terms (SSH, Salinity, Temperature) to dataset variable names; uses packaged default if omitted |
   | `--techniques-config` | Path to `techniques_config.yaml` selecting dimensionality reduction (DR) and forecast techniques; uses packaged default if omitted |

   With the example above, the forecast outputs predicted ocean state variables to `data/50_projected/latest/forecast/simu_predicted/`.

   Or run `bash pipeline/3-forecast.sh`.
6. Create the updated restart file

   Using the forecasted ocean state from the previous step, `nemo-spinup-restart` injects the predicted variables (SSH, temperature, salinity, and derived velocities) into the original NEMO restart file. The result is written to a new file with `NEW_` prepended to the filename, ready to initialise NEMO at the projected year; the original file is left intact.

       ln -sf ../50/DINO_00576000_restart.nc data/50_projected/DINO_00576000_restart.nc
       nemo-spinup-restart \
         --restart_path data/50_projected/ \
         --radical DINO_00576000_restart \
         --mask_file data/50/mesh_mask.nc \
         --prediction_path data/50_projected/latest/forecast/simu_predicted/ \
         --ocean_terms configs/ocean_terms.DINO.yaml

   The source restart is symlinked into `data/50_projected/` so that `nemo-spinup-restart` reads the source from, and writes the `NEW_` file into, the same directory.

   - `--radical` is the prefix of the restart file (e.g. `DINO_00576000_restart`)
   - Output files are named as the originals but with `NEW_` prepended

   Or run `bash pipeline/4-restart.sh`.
7. Evaluate the projected restart:

       ln -sf ../50/mesh_mask.nc data/50_projected/mesh_mask.nc
       nemo-spinup-evaluation \
         --sim-path data/50_projected \
         --config configs/DINO-evaluation.yaml \
         --results-dir evaluation-output \
         --result-file-prefix projected \
         --mode restart

   Compare `evaluation-output/projected_restart.csv` against the baseline from step 4 (`evaluation-output/baseline_restart.csv`) to assess the impact of the spin-up acceleration.

   Or run `bash pipeline/5-evaluate-projected.sh`.
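
As a quick sanity check on the forecast arguments used in the example, the year bookkeeping implied by `--start 20 --end 50 --steps 30` can be worked through in the shell. This is only a sketch: the variable names are illustrative, and reading the projected state as year `end + steps` is an assumption based on `--steps` being described as the number of years skipped forward.

```shell
# Values taken from the example nemo-spinup-forecast command.
start=20   # --start
end=50     # --end
steps=30   # --steps

discarded=$start             # years before --start are not used for training
training=$((end - start))    # years of spin-up used for training
target=$((end + steps))      # ASSUMPTION: simulated year the projected state represents

echo "discarded=${discarded} training=${training} target=${target}"
# prints: discarded=20 training=30 target=80
```

Changing `steps` in this sketch shows the trade-off described for `--steps`: a larger jump moves the target year further from the 30 training years.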
1. Copy the experiment directory `EXP00` as a backup; the original will be overwritten in the next step.
2. Copy the updated restart files (`NEW_DINO_<time>_restart_<proc_id>.nc`) back to the original experiment directory.
3. Update `namelist_cfg` under `namrun`:

   | Parameter | Description |
   |---|---|
   | `nn_it000` | First timestep (last timestep + 1) |
   | `nn_itend` | Final timestep |
   | `cn_ocerst_in` | Restart filename (matches latest restart file) |
   | `ln_rstart` | `.true.` to start from a restart file |

4. Restart DINO using the updated restart file.
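
The timestep values for `namelist_cfg` can be derived from the restart prefix. The sketch below rests on stated assumptions rather than on anything confirmed by the repository: `last_timestep` is a hypothetical helper, the number embedded in `DINO_00576000_restart` is taken to be the last completed timestep, that restart is taken to close year 50 of the run, and `extra_years` is an arbitrary illustrative run length.

```shell
# Hypothetical helper: extract the timestep counter from a restart prefix
# such as DINO_00576000_restart (assumes a <name>_<timestep>_restart pattern).
last_timestep() {
    stem=${1%_restart}       # DINO_00576000
    digits=${stem##*_}       # 00576000
    # Strip leading zeros so shell arithmetic does not read the value as octal.
    digits=$(printf '%s' "$digits" | sed 's/^0*//')
    echo "${digits:-0}"
}

last=$(last_timestep DINO_00576000_restart)
steps_per_year=$((last / 50))   # ASSUMPTION: the restart closes year 50 of the run
extra_years=10                  # hypothetical length of the continued run

nn_it000=$((last + 1))                              # first timestep = last timestep + 1
nn_itend=$((last + extra_years * steps_per_year))   # final timestep of the continued run

echo "nn_it000=${nn_it000} nn_itend=${nn_itend}"
# prints: nn_it000=576001 nn_itend=691200
```

Check the resulting values against the timestep settings of your own configuration before editing the namelist.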
Results from running the benchmark with 50 years of DINO data, training on years 20–50 and forecasting 30 years ahead using PCA + Gaussian process regression (`--start 20 --end 50 --steps 30 --comp 1`):
| Metric | Baseline (50 yr) | Projected | Difference | % Change |
|---|---|---|---|---|
| check_density_from_file | 0.000020 | 0.011721 | +0.011701 | — |
| check_density_computed | 0.000032 | 0.011721 | +0.011689 | — |
| temperature_500m_30NS (°C) | 11.508 | 11.441 | −0.067 | −0.58% |
| temperature_BWbox (°C) | 5.197 | 5.203 | +0.005 | +0.10% |
| temperature_DWbox (°C) | 5.329 | 5.318 | −0.011 | −0.21% |
| ACC_Drake (Sv) | 188.69 | −102.75 | −291.44 | −154% |
| ACC_Drake_2 (Sv) | 188.69 | −102.75 | −291.44 | −154% |
| NASTG_BSF_max (Sv) | 35.52 | 16.81 | −18.70 | −52.7% |
Observations:
- Temperature metrics are well preserved (< 1% change), indicating the scalar field forecast is accurate.
- Density monotonicity violations increased from near-zero to ~1.2% of grid points.
- Transport metrics (ACC Drake, NASTG BSF) show large deviations. The geostrophic velocity reconstruction in `nemo-spinup-restart` produces physically unrealistic velocities; this is a known issue under investigation.
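
The Difference and % Change columns can be rechecked from the Baseline and Projected values. A minimal sketch using `awk` follows; `pct_change` is an illustrative helper, not part of the evaluation package, and it takes the rounded values printed in the table as input, so the last digit may differ from a result computed on unrounded data.

```shell
# pct_change BASELINE PROJECTED
# Prints the signed difference and the percent change relative to the baseline.
pct_change() {
    awk -v b="$1" -v p="$2" 'BEGIN {
        d = p - b
        printf "%+.3f %+.2f%%\n", d, 100 * d / b
    }'
}

pct_change 11.508 11.441     # temperature_500m_30NS: prints -0.067 -0.58%
pct_change 188.69 -102.75    # ACC_Drake: prints -291.440 -154.45%
```

A percent change beyond −100%, as for the Drake Passage transport, means the projected value has the opposite sign to the baseline.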
If you use this work, please cite:
Citation to be added upon publication.