Score-based Data Assimilation StormCast #730
Greptile Summary
This PR introduces score-based data assimilation (SDA) for StormCast. Key changes:
| Filename | Overview |
|---|---|
| `earth2studio/models/da/sda_stormcast.py` | Core StormCast SDA implementation: a large, well-structured new file. NaN observation values are not filtered before scatter-add in `_build_obs_tensors`, which can silently corrupt DPS guidance. Most previously flagged issues (debug prints, mutable default args, typos, sorting guard, duplicate obs averaging, `sampler_args` key validation) have been addressed. |
| `earth2studio/data/utils.py` | Adds a legacy mode to `fetch_data` that returns a raw `xr.DataArray` with cupy backing on CUDA. The `device.index or 0` guard is correctly applied in the new non-legacy path. Minor: `interp_method` is silently ignored in the non-legacy path when `interp_to=None`. |
| `earth2studio/models/da/base.py` | Protocol extended to allow `None` observations in `__call__` and `create_generator`, and adds optional `*args` init parameters and an `init_coords()` method. Changes are clean and well-documented. |
| `earth2studio/models/da/interp.py` | Renames `tolerance` to `time_tolerance` and adds `init_coords()` returning `None`. The rename is a breaking API change but accepted per prior thread discussion. Smolyak interpolation logic is unchanged and looks correct. |
| `earth2studio/utils/coords.py` | New `map_coords_xr` function implementing GPU/CPU-aware nearest-neighbor coordinate mapping without calling `xarray.interp()` (avoiding the previous scipy/CPU-only path). Uses sort-based searchsorted with correct ascending-order handling via `np.argsort`. Logic looks correct. |
| `examples/21_stormcast_sda.py` | Well-written end-to-end example. cartopy/matplotlib are imported twice (lines 165–167 and 235–237), which is redundant. The `.get()` calls are CUDA-specific but the example is documented as GPU-only. Other previously flagged issues appear to be addressed. |
| `test/models/da/test_da_sda_stormcast.py` | Comprehensive test suite covering polygon point-in-polygon, observation tensor building (including `None`, out-of-grid, and duplicate-averaging cases), conditioning fetch, `__call__`, `create_generator`, and exception handling. GPU interpolation test validates against scipy reference. Good coverage. |
| `test/data/test_data_utils.py` | New tests for the `legacy=False` mode of `fetch_data` and updated `prep_data_inputs` behaviour. Coverage looks correct and complete. |
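
To illustrate the NaN concern raised for `_build_obs_tensors`, here is a minimal NumPy sketch of dropping NaN observations before a scatter-add. It is not the PR's code: the helper name `average_obs_on_grid`, its arguments, and the use of `np.add.at` in place of a torch scatter are all assumptions made for illustration.

```python
import numpy as np


def average_obs_on_grid(values, rows, cols, shape):
    """Average observations per grid cell, skipping NaN values.

    Hypothetical helper: names and signature are illustrative only.
    """
    values = np.asarray(values, dtype=np.float64)
    rows = np.asarray(rows)
    cols = np.asarray(cols)

    # Drop NaN observations first; a single NaN added into a cell
    # would otherwise turn that cell's sum (and mean) into NaN.
    valid = ~np.isnan(values)
    values, rows, cols = values[valid], rows[valid], cols[valid]

    total = np.zeros(shape, dtype=np.float64)
    count = np.zeros(shape, dtype=np.float64)

    # Unbuffered scatter-add: repeated (row, col) pairs accumulate.
    np.add.at(total, (rows, cols), values)
    np.add.at(count, (rows, cols), 1.0)

    # Cells with at least one valid observation get the mean; others stay 0.
    mask = count > 0
    mean = np.zeros(shape, dtype=np.float64)
    mean[mask] = total[mask] / count[mask]
    return mean, mask
```

Returning the mask alongside the averaged field lets downstream guidance ignore cells that received no valid observation, rather than treating their zero fill as data.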
Last reviewed commit: 45d157f
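
The sort-based nearest-neighbor lookup described for `map_coords_xr` can be sketched in one dimension as follows. This is an assumed illustration, not the function from the PR: `nearest_index` is a hypothetical name, it handles a single coordinate axis with at least two points, and it uses plain NumPy rather than a GPU array library.

```python
import numpy as np


def nearest_index(grid, query):
    """Return, for each query value, the index of the nearest grid point.

    Hypothetical 1-D helper; `grid` may be in any order but needs >= 2 points.
    """
    grid = np.asarray(grid, dtype=np.float64)
    query = np.asarray(query, dtype=np.float64)

    # searchsorted requires ascending input, so sort and remember the order.
    order = np.argsort(grid)
    sorted_grid = grid[order]

    # Index of the first sorted grid point >= query, clipped so that both
    # the left and right neighbours are valid indices.
    right = np.clip(np.searchsorted(sorted_grid, query), 1, len(sorted_grid) - 1)
    left = right - 1

    # Choose whichever neighbour is closer (ties go to the left neighbour).
    pick_right = (sorted_grid[right] - query) < (query - sorted_grid[left])
    nearest_sorted = np.where(pick_right, right, left)

    # Map positions in the sorted array back to positions in the original grid.
    return order[nearest_sorted]
```

Queries outside the grid range resolve to the nearest edge point in this sketch; whether the PR clamps, masks, or rejects such points is not stated in the review.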
@greptile-ai
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@greptile-ai
@greptile-ai
      """Stateless forward pass"""
      input_coords = self.input_coords()
    - (output_coords,) = self.output_coords(input_coords, **x.attrs)
    + (output_coords,) = self.output_coords(input_coords, **obs.attrs)
Should we have a check to make sure obs.attrs contains the required request_time arg? Or is this part of the general "not having super extensive checks/handshakes" status of the DA?
Yeah, deferring this for some later PR focused on these utils
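
For reference, the kind of check discussed in this thread could look like the sketch below. It is only an assumption about what a later PR might add: `validate_obs_attrs` and `REQUIRED_OBS_ATTRS` are hypothetical names, and the only key taken from the thread is `request_time`.

```python
from collections.abc import Mapping
from typing import Any

# Assumption: request_time is the only attr the thread names as required.
REQUIRED_OBS_ATTRS = ("request_time",)


def validate_obs_attrs(
    attrs: Mapping[str, Any], required: tuple[str, ...] = REQUIRED_OBS_ATTRS
) -> None:
    """Raise a clear error if required observation attrs are missing.

    Hypothetical helper; not part of the reviewed PR.
    """
    missing = [key for key in required if key not in attrs]
    if missing:
        raise ValueError(f"Observation attrs missing required key(s): {missing}")
```

Calling such a helper before unpacking `**obs.attrs` would turn a missing `request_time` into an explicit `ValueError` instead of a less obvious failure further down the call.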
/blossom-ci |
Earth2Studio Pull Request
Description
`create_generator` and the first set of parameters in `__call__`
Coverage:
Rendered Example
Closer results
Checklist
Dependencies