This repository accompanies the MSc thesis "TRENDY-Emulator: A Bias-Corrected Deep Learning Emulator of Terrestrial Carbon and Water Dynamics", presented in fulfillment of the Global Forestry Erasmus Mundus MSc at AgroParisTech, Montpellier. It provides the full workflow used to build, train, and benchmark the TRENDY-Emulator.
The pipeline includes:
- preprocessing TRENDY forcing and DGVM outputs
- generating spatial masks
- constructing Zarr datasets for training and inference
- model training (Base → Stable → Transfer-Learned)
- scenario prediction and NetCDF export
- ILAMB benchmarking
- plotting and analysis
Location: scripts/preprocessing/
Processes all TRENDY forcing and ancillary datasets.
Steps:
- Run in any order:
  - climate/ClimateProcessor.sh
  - co2/co2.py
  - jsbach/jsbach_processor.py
  - luh2/LUHProcessor.sh
  - ndep/ndep_processor.sh
  - nfert/nfert_processor.sh
  - potential_radiation/potential_radiation.sh
  - population/population.py
  - avh15c1.sh: processes LAI observations (after downloading via ILAMB).
- rolling_means.sh: computes 30-year trailing means for selected climate variables.
- preindustrial.sh: generates pre-industrial CO₂, land-use, and climate inputs for TRENDY scenarios.
- model_outputs/models.sh: preprocesses all the individual DGVMs.
- model_outputs/ensmean.sh: takes the ensemble mean of the DGVMs.
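As an illustration of the trailing-mean step, here is a minimal Python sketch of a 30-year trailing mean over an annual series. It is not the code in rolling_means.sh. The function name `trailing_mean` is hypothetical, and averaging over the years available so far when fewer than 30 exist is an assumption.

```python
from collections import deque


def trailing_mean(values, window=30):
    """Mean of each value and up to `window - 1` preceding values.

    Assumption: early entries with fewer than `window` predecessors are
    averaged over the values available so far.
    """
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            # The oldest value is about to leave the window.
            total -= buf[0]
        buf.append(v)  # deque(maxlen=...) drops the oldest entry itself
        total += v
        out.append(total / len(buf))
    return out


if __name__ == "__main__":
    # Toy annual series; real inputs would be gridded climate variables.
    print(trailing_mean([1.0, 2.0, 3.0, 4.0], window=2))
```

A trailing window uses only current and past years, so the value for a given year never depends on future forcing.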
Location: scripts/masking/
- nan_mask.sh: creates a mask of pixels where all forcing and output variables are finite.
- land_mask.sh: selects pixels where CLM, ORCHIDEE, ELM and CLASSIC all agree that land fraction > 0.9; the result is combined with the NaN mask.
- tvt_mask.sh: creates the longitudinal-band Train/Validation/Test split.
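The three masks can be sketched in plain Python as follows. This only illustrates the logic described above and is not the repository's implementation. The function names, the 10° band width, and the train/validation/test band pattern are assumptions; the 0.9 land-fraction threshold comes from the description.

```python
import math


def finite_mask(fields):
    """True for pixels where every variable is finite (no NaN or inf).

    `fields` is a list of flat per-variable pixel lists of equal length.
    """
    n = len(fields[0])
    return [all(math.isfinite(f[i]) for f in fields) for i in range(n)]


def land_mask(land_fractions, threshold=0.9):
    """True for pixels where every model reports land fraction > threshold."""
    n = len(land_fractions[0])
    return [all(lf[i] > threshold for lf in land_fractions) for i in range(n)]


def tvt_label(lon, band_width=10.0,
              pattern=("train", "train", "train", "val", "test")):
    """Assign a pixel to a split by longitude band.

    Assumption: band width and pattern are illustrative values only.
    """
    band = int((lon % 360.0) // band_width)
    return pattern[band % len(pattern)]


if __name__ == "__main__":
    finite = finite_mask([[1.0, float("nan")], [2.0, 3.0]])
    land = land_mask([[0.95, 0.95], [0.92, 0.50]])
    combined = [f and l for f, l in zip(finite, land)]
    print(combined, tvt_label(35.0))
```

Splitting by longitude band rather than by random pixel keeps neighbouring, spatially correlated pixels in the same split.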
Location: scripts/make_zarrs/training/main/
- Run make_training_tiles.sh: generates training, validation, and testing Zarrs.
- Run consolidate.sh and finalize.sh.
Location: scripts/make_zarrs/training/other/
- fill_potential_rad_nans.sh: fills missing potential radiation data in the testing Zarr.
- add_avh15c1.sh: adds LAI observations for transfer learning.
Location: scripts/make_zarrs/inference/
- make_inference.sh: creates inference-ready Zarrs for all scenarios.
- add_avh15c1.sh: inserts observed LAI for scenario 3.
Location: scripts/standardisation/
- standardisation.sh: computes global means and standard deviations for each variable.
- merge.py: assembles the full standardisation JSON.
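The sketch below shows one way per-variable statistics could be computed and merged into a single JSON document. It is not the code in standardisation.sh or merge.py. The function names, the JSON layout (`{"variable": {"mean": ..., "std": ...}}`), the use of the population standard deviation, and the variable names in the usage example are all assumptions.

```python
import json
import math


def global_stats(values):
    """Mean and population standard deviation over finite values only."""
    finite = [v for v in values if math.isfinite(v)]
    mean = sum(finite) / len(finite)
    var = sum((v - mean) ** 2 for v in finite) / len(finite)
    return {"mean": mean, "std": math.sqrt(var)}


def merge_stats(per_variable):
    """Serialise {variable: stats} into one JSON string (hypothetical layout)."""
    return json.dumps(per_variable, indent=2, sort_keys=True)


def standardise(value, stats):
    """Apply z-score standardisation with the stored statistics."""
    return (value - stats["mean"]) / stats["std"]


if __name__ == "__main__":
    # Variable names are illustrative only.
    stats = {
        "tas": global_stats([280.0, 290.0, float("nan")]),
        "gpp": global_stats([1.0, 3.0]),
    }
    print(merge_stats(stats))
    print(standardise(3.0, stats["gpp"]))
```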
Location: pipeline/1.train/
- Run train.sh
Three emulator versions are produced:
- Base-Emulator — trained without autoregressive carrying
- Stable-Emulator — autoregressive carrying with progressively longer horizons
- TL-Emulator — transfer-learned on AVH15C1 LAI observations
These versions were produced by manually adjusting the configurations in train.sh.
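To make the difference between the Base and Stable versions concrete, here is a toy Python sketch of one possible reading of "autoregressive carrying": the model's own predicted state is fed back as input for `horizon` consecutive steps before being reset to the reference state. This is an interpretation for illustration only and not the training code behind train.sh. `rollout`, `step_fn`, and the horizon values are hypothetical.

```python
def rollout(step_fn, true_states, forcings, horizon):
    """Predict one state per time step, carrying predictions forward.

    Every `horizon` steps the input state is reset to the reference
    state; in between, the previous prediction is carried forward.
    horizon=1 means no carrying (every step starts from the reference).
    """
    preds = []
    state = true_states[0]
    for t, forcing in enumerate(forcings):
        if t % horizon == 0:
            state = true_states[t]  # reset to the reference state
        state = step_fn(state, forcing)  # otherwise reuse own prediction
        preds.append(state)
    return preds


if __name__ == "__main__":
    def toy_step(state, forcing):
        # Stand-in for the emulator: next state = state + forcing.
        return state + forcing

    reference = [0, 10, 20, 30]
    forcing = [1, 1, 1, 1]
    # Progressively longer horizons (illustrative schedule).
    for h in (1, 2, 4):
        print(h, rollout(toy_step, reference, forcing, horizon=h))
```

With longer horizons, errors in carried state accumulate during training, so the model is penalised for drift it would otherwise only meet at inference time.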
Location: pipeline/2.predict/
- Run predict.sh
Outputs:
- Zarr prediction files
- (optional) NetCDF exports (export_nc=true)
- automatic copying into ILAMB-ready structure
Location: pipeline/3.benchmark/
Steps:
- Download ILAMB benchmark datasets (see https://www.ilamb.org/doc/tutorial.html).
- Place emulator output inside MODELS/.
- Configure build.cfg (regions, confrontations, variables).
- Run submit.sh.
Location: pipeline/4.Analyse/
Main manuscript figures are generated via:
- create_csvs/
- plot_csvs/
Additional analysis scripts are provided in the same directory.
This work is being prepared for publication; please contact the author before use.