In-Situ Cyclone Tracking pipeline #645
Open
mariusaurus wants to merge 79 commits into NVIDIA:main from mariusaurus:mkoch/tc_tracking
79 commits
d3a2245 resolved conflict (mariusaurus)
0189d52 update changelog (dallasfoster)
f11b18b move seed initialization and fix dxwrapper tests (dallasfoster)
d063760 tempest extremes diagnostic model (mariusaurus)
a4d2544 error message (mariusaurus)
c1cdca0 testing if TE is available and works (mariusaurus)
016f16b started working on support for batch sizes >1, currently works for bs 1 (mariusaurus)
68e33b5 halfway to larger batch support (mariusaurus)
7bd60e1 enabling TE for batch sizes of >1. async version seems to work as wel… (mariusaurus)
3b0c00e option to pass file names to TE connector (mariusaurus)
1e9bbe8 array equal test (mariusaurus)
d6be6dd first stable try (mariusaurus)
1e9b275 support for per-member parallel execution and lets user controll max … (mariusaurus)
b5f5c18 precommit hooks (mariusaurus)
af8bc71 vibe-coded some tests, need to be hand-tested and selected (mariusaurus)
a9fd2bc vibe-coded some tests, need to be hand-tested and selected (mariusaurus)
526e6bf passing all pre-commit tests, still need to sub-select tests as there… (mariusaurus)
c3258d9 subselected tests (mariusaurus)
3fd145d install doc (mariusaurus)
c26f453 throwing an error in case cleanup is not called before object goes ou… (mariusaurus)
d2a8e4a custom depenmdency failure message for TE (mariusaurus)
0ab6d67 moved tensor tiling and concatenation to utils (mariusaurus)
8ca3fae enable setting fcn3 random seed (dallasfoster)
e93932e add proper noise handling for fcn3 (dallasfoster)
bc9e3ac fix linting and test issues (dallasfoster)
2685f90 update lockfile (dallasfoster)
e3a4e3d move seed initialization and fix dxwrapper tests (dallasfoster)
1dec990 tc tracking pipeline (mariusaurus)
02945f1 update (mariusaurus)
f89efe3 updated uv.lock (mariusaurus)
92896eb seems to work now (mariusaurus)
9e0e106 wind gust from HRRR analysis (mariusaurus)
5550ad9 minor updates (mariusaurus)
869b8fe stability test (mariusaurus)
343d035 version check for torch-harmonics import (mariusaurus)
037a5a7 addressed greptile comments (mariusaurus)
608315b time import (mariusaurus)
3f55702 comma (mariusaurus)
7fefe5b merged main (mariusaurus)
dc0cd79 updated env (mariusaurus)
70361a7 moved tempest_extremes (mariusaurus)
e172e50 wip (mariusaurus)
edb1978 exploring aifs ensemble capability (mariusaurus)
e3b2ed0 Merge branch 'main' into mkoch/tc_tracking (mariusaurus)
8c3c848 thread issue with writing to netcdf in threads (mariusaurus)
d41351e automated testing of writing TE files and their reproducibility. bug … (mariusaurus)
780e458 README for tc_hunt test (mariusaurus)
4360ea4 second test for extracting historic data (mariusaurus)
8f766f5 added aux data for tests (mariusaurus)
5bd90c6 merged main (mariusaurus)
f3286fb test for reference track extraction (mariusaurus)
3638bcc wip (mariusaurus)
5b7ae87 track plotting notebook (mariusaurus)
3002fd0 field and track notebook (mariusaurus)
0b968b3 plotting tracks and fields notebook (mariusaurus)
67f7fc6 case study notebook (mariusaurus)
2df89f7 Merge branch 'main' into mkoch/tc_tracking (mariusaurus)
0f046a1 REAMEs and markdowns in notebooks (mariusaurus)
fbdbc9f drafted readme (mariusaurus)
b495b31 first iteration over readme (mariusaurus)
ad4717c final touches README (mariusaurus)
34fda29 ...gif (mariusaurus)
78dc346 wip (mariusaurus)
b76a7d3 Merge branch 'main' into mkoch/tc_tracking (mariusaurus)
fb7f207 verified plotting for west-pacific (mariusaurus)
d4926cf final touches (mariusaurus)
f0fcff0 removed some configs (mariusaurus)
2c8eaf3 pyproject project name (mariusaurus)
b4af03e README comment about conainer build time (mariusaurus)
1bf9850 removed TE from models/dx/__init__ (mariusaurus)
99ae8ba moving a bracket around (mariusaurus)
6048522 Merge branch 'main' into mkoch/tc_tracking (mariusaurus)
5da7854 updated base container (mariusaurus)
ff5dcba Merge branch 'main' into mkoch/tc_tracking (mariusaurus)
acd0354 Merge branch 'mkoch/tc_tracking' of github.com:mariusaurus/earth2stud… (mariusaurus)
f067969 merged main, might be broken (mariusaurus)
6c85a95 Merge branch 'mkoch/tc_tracking' of github.com:mariusaurus/earth2stud… (mariusaurus)
2f4b495 fixed some bugs to be in line with new main (mariusaurus)
d240ae2 split plan (mariusaurus)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # Python-generated files | ||
| __pycache__/ | ||
| *.py[oc] | ||
| build/ | ||
| dist/ | ||
| wheels/ | ||
| *.egg-info | ||
| outputs*/ | ||
| *.zarr | ||
| *.nc | ||
| *.gif | ||
| .python-version | ||
| | ||
| # Virtual environments | ||
| .venv | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| FROM nvcr.io/nvidia/physicsnemo/physicsnemo:25.11 | ||
| | ||
| # update repo info | ||
| RUN apt update -y && \ | ||
| apt install -y libibmad5 unixodbc && \ | ||
| apt install -y netcdf-bin libnetcdf-dev | ||
| | ||
| # upgrade cmake | ||
| RUN apt remove cmake -y && \ | ||
| pip install cmake --upgrade | ||
| | ||
| # Install uv | ||
| RUN wget -qO- https://astral.sh/uv/install.sh | sh | ||
| ENV PATH="/root/.local/bin:$PATH" | ||
| ENV CC=/usr/bin/gcc | ||
| ENV CXX=/usr/bin/g++ | ||
| | ||
| # install TempestExtremes | ||
| WORKDIR / | ||
| RUN git clone https://github.com/ClimateGlobalChange/tempestextremes.git && \ | ||
| mkdir -p /tempestextremes/build | ||
| WORKDIR /tempestextremes/build | ||
| RUN cmake .. && \ | ||
| make -j && \ | ||
| cp ./bin/DetectNodes /usr/local/bin && \ | ||
| cp ./bin/StitchNodes /usr/local/bin | ||
| | ||
| # copy source into the container. | ||
| RUN mkdir -p /tc_tracking_src | ||
| COPY . /tc_tracking_src | ||
| WORKDIR /tc_tracking_src | ||
| | ||
| ENV FORCE_CUDA_EXTENSION=1 | ||
| ENV TORCH_CUDA_ARCH_LIST="8.0 8.6 9.0 10.0 12.0 13.0+PTX" | ||
| RUN uv pip install --system --break-system-packages --no-cache-dir . | ||
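Because `TORCH_CUDA_ARCH_LIST` is a free-form string, a typo in it tends to surface only deep into the image build. The following is a minimal sketch of a pre-build sanity check, assuming entries are separated by spaces or semicolons and look like `major.minor` with an optional `a` and `+PTX` suffix. The `check_arch_list` helper and its pattern are hypothetical and not part of this PR; this is not PyTorch's own validation.

```python
import re
from typing import List

# Assumed entry shape: "8.0", "9.0a", or "13.0+PTX" (hypothetical pattern).
_ARCH_ENTRY = re.compile(r"^\d+\.\d+a?(\+PTX)?$")


def check_arch_list(value: str) -> List[str]:
    """Return the entries of an arch list string that do not look valid."""
    # Split on runs of spaces or semicolons and drop empty pieces.
    entries = [entry for entry in re.split(r"[ ;]+", value.strip()) if entry]
    return [entry for entry in entries if not _ARCH_ENTRY.match(entry)]


if __name__ == "__main__":
    # A stray comma stays attached to its entry and is reported as suspicious.
    print(check_arch_list("12.0, 13.0+PTX"))
```

An empty result means every entry matched the assumed shape; anything returned is worth a second look before starting a long container build.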
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,116 @@ | ||
| # Splitting the TC Tracking Recipe into Incremental PRs | ||
| | ||
| Breakdown into 4 PRs, ordered from foundational to supplementary. Each PR is self-contained and reviewable on its own. | ||
| | ||
| --- | ||
| | ||
| ## PR 1 -- Core infrastructure + `generate_ensemble` mode | ||
| | ||
| The bulk of the code. Introduces the full pipeline for running AI weather model ensembles and tracking tropical cyclones with TempestExtremes. | ||
| | ||
| **Files to include:** | ||
| | ||
| - `tc_hunt.py` -- entry point, but only dispatching `generate_ensemble` (no import of `baseline_extraction`, no `reproduce_members` dispatch) | ||
| - `src/__init__.py` | ||
| - `src/tempest_extremes.py` -- the full TempestExtremes + async wrapper (~1250 lines, core of the recipe) | ||
| - `src/utils.py` -- shared helpers | ||
| - `src/data/utils.py` -- `DataSourceManager`, `load_heights` | ||
| - `src/data/file_output.py` -- output setup for Zarr/NetCDF | ||
| - `src/modes/generate_ensembles.py` -- **full file including `reproduce_members`**. Since `reproduce_members` and `generate_ensemble` share `initialise`, `load_model`, `run_inference`, `distribute_runs`, and `configure_runs`, splitting the file would be artificial. The function simply sits unused until PR 2 wires it up. | ||
| - `cfg/helene.yaml`, `cfg/hato.yaml` -- example configs for tracking | ||
| - `pyproject.toml` -- **without `tropycal`** (only needed by baseline extraction in PR 3) | ||
| - `Dockerfile`, `set_envs.sh`, `.gitignore` | ||
| - `README.md` -- documenting only the `generate_ensemble` mode | ||
| - `test/test_tc_hunt.sh`, `test/cfg/baseline_helene.yaml`, `test/README.md`, `test/.gitignore` -- basic test for the generate mode | ||
| | ||
| **Notes:** | ||
| | ||
| - This is the largest PR but it is all one coherent feature: "run ensemble forecasts and track cyclones". | ||
| - `tropycal`, `moviepy`, and plotting-only dependencies can be dropped from `pyproject.toml` for this PR to keep the dependency surface small. | ||
| - The `testsource.py` debug script is not part of the recipe proper; leave it out (it is an untracked file anyway). | ||
| | ||
| --- | ||
| | ||
| ## PR 2 -- Reproduction mode | ||
| | ||
| A very small, easy-to-review PR. Wires up the `reproduce_members` function that already exists in `generate_ensembles.py`. | ||
| | ||
| **Changes:** | ||
| | ||
| - `tc_hunt.py` -- add `reproduce_members` import and dispatch case (~3 lines changed) | ||
| - `cfg/reproduce_helene.yaml` -- example config for reproducing specific ensemble members | ||
| - `test/cfg/reproduce_helene.yaml` -- test config | ||
| - `README.md` -- add documentation for the `reproduce_members` mode | ||
| | ||
| This PR is intentionally tiny. The only new logic is the dispatch wiring and configs; the implementation already landed in PR 1 as part of `generate_ensembles.py`. | ||
| | ||
| --- | ||
| | ||
| ## PR 3 -- Baseline extraction from reanalysis | ||
| | ||
| Adds the `extract_baseline` mode, which fetches ERA5 reanalysis data, runs TempestExtremes on it, and matches the detected tracks against IBTrACS ground truth. | ||
| | ||
| **Files to include:** | ||
| | ||
| - `src/modes/baseline_extraction.py` -- the full extraction pipeline (~208 lines) | ||
| - `tc_hunt.py` -- add `extract_baseline` import and dispatch case | ||
| - `cfg/extract_era5.yaml` -- config for Helene + Hato extraction | ||
| - `aux_data/ibtracs.HATO_HELENE.list.v04r01.csv` -- IBTrACS subset | ||
| - `aux_data/reference_track_hato_2017_west_pacific.csv` -- reference track | ||
| - `aux_data/reference_track_helene_2024_north_atlantic.csv` -- reference track | ||
| - `test/test_historic_tc_extraction.sh`, `test/cfg/extract_era5.yaml` -- extraction test | ||
| - `pyproject.toml` -- add `tropycal>=1.4` dependency | ||
| - `README.md` -- add documentation for `extract_baseline` | ||
| | ||
| **Notes:** | ||
| | ||
| - This is the only PR that adds `tropycal` as a dependency (used for IBTrACS access). | ||
| - The `aux_data/` CSV files are small reference datasets, fine to commit. | ||
| | ||
| --- | ||
| | ||
| ## PR 4 -- Plotting and analysis tools | ||
| | ||
| Adds the visualisation and analysis tooling. Entirely optional for the core pipeline to work; can be merged last or even deferred. | ||
| | ||
| **Files to include:** | ||
| | ||
| - `plotting/analyse_n_plot.py` | ||
| - `plotting/data_handling.py` | ||
| - `plotting/plotting_helpers.py` | ||
| - `plotting/plot_tracks_n_fields.ipynb` | ||
| - `plotting/tracks_slayground.ipynb` | ||
| - `plotting/README.md` | ||
| - `plotting/.gitignore` | ||
| - `pyproject.toml` -- ensure `cartopy`, `matplotlib`, `moviepy` are present (likely already there from PR 1, but verify) | ||
| | ||
| --- | ||
| | ||
| ## Dependency flow between PRs | ||
| | ||
| ```mermaid | ||
| graph LR | ||
| PR1["PR 1: Core + generate_ensemble"] | ||
| PR2["PR 2: reproduce_members"] | ||
| PR3["PR 3: extract_baseline"] | ||
| PR4["PR 4: Plotting"] | ||
| PR1 --> PR2 | ||
| PR1 --> PR3 | ||
| PR1 --> PR4 | ||
| PR3 --> PR4 | ||
| ``` | ||
| | ||
| PR 2, PR 3, and PR 4 all depend on PR 1. PR 3 and PR 4 are independent of PR 2. PR 4 may reference outputs from PR 3 (reference tracks), so ordering PR 3 before PR 4 is ideal but not strictly required. | ||
| | ||
| --- | ||
| | ||
| ## Implementation approach | ||
| | ||
| For each PR, we create a branch off main and stage only the relevant files. Since all files are new (no modifications to existing e2s files), this is straightforward -- each PR is a subset of the current `recipes/tc_tracking/` directory. The main work is: | ||
| | ||
| 1. For PR 1: temporarily strip `tc_hunt.py` to only handle `generate_ensemble`, and trim `pyproject.toml` dependencies. | ||
| 2. For PR 2: minimal diff -- add 3 lines to `tc_hunt.py` + config files. | ||
| 3. For PR 3: add `baseline_extraction.py` + wiring + configs + aux data + tropycal dep. | ||
| 4. For PR 4: add `plotting/` directory. | ||
| | ||
| Each subsequent PR is a clean additive diff on top of the previous one. | ||
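To illustrate why the PR 2 diff can stay at a few lines, here is a minimal sketch of mode dispatch as the plan describes it. The actual `tc_hunt.py` is not shown on this page, so everything except the mode names `generate_ensemble` and `reproduce_members` is an assumption: the registry dictionaries, the `dispatch` function, the `"mode"` config key, and the stub bodies are all hypothetical.

```python
from typing import Any, Callable, Dict

Config = Dict[str, Any]
ModeRegistry = Dict[str, Callable[[Config], str]]


def generate_ensemble(cfg: Config) -> str:
    """Stub for the PR 1 mode (the real one runs inference and tracking)."""
    return "generate_ensemble ran"


def reproduce_members(cfg: Config) -> str:
    """Stub for the function that lands in PR 1 but is only wired up in PR 2."""
    return "reproduce_members ran"


# PR 1: only one mode is registered, even though both functions exist.
MODES_PR1: ModeRegistry = {"generate_ensemble": generate_ensemble}

# PR 2: the additive change is a single new registry entry.
MODES_PR2: ModeRegistry = {**MODES_PR1, "reproduce_members": reproduce_members}


def dispatch(cfg: Config, modes: ModeRegistry) -> str:
    """Look up cfg["mode"] (an assumed key) and run the matching mode."""
    mode = cfg.get("mode")
    if mode not in modes:
        raise ValueError(f"unknown mode {mode!r}; available: {sorted(modes)}")
    return modes[mode](cfg)
```

With this shape, a `reproduce_members` config fails with a clear error against the PR 1 registry and starts working once the PR 2 entry is added, without touching the implementation.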
Duplicated version check logic: the same code appears in lines 38-50 inside the `try` block.
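One way to address this kind of duplication is to move the check into a single helper that both call sites use. The file under review is not shown here, so the following is only a sketch under stated assumptions: the helper names `parse_version`, `version_at_least`, and `meets_minimum` are hypothetical, and the comparison deliberately looks only at the leading numeric components of a version string.

```python
import re
from importlib import metadata
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract leading numeric components, e.g. '0.8.1.dev0' -> (0, 8, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version string {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def version_at_least(found: str, minimum: Tuple[int, ...]) -> bool:
    """Compare a version string against a minimum, padding with zeros."""
    parsed = parse_version(found)
    width = max(len(parsed), len(minimum))
    padded_found = parsed + (0,) * (width - len(parsed))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_found >= padded_minimum


def meets_minimum(package: str, minimum: Tuple[int, ...]) -> bool:
    """Return False if the package is missing or older than the minimum."""
    try:
        found = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return version_at_least(found, minimum)
```

Both the top-level import guard and the code inside the `try` block could then call `meets_minimum` with the same arguments, so the threshold lives in one place.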