Member

@jdbcode jdbcode commented Nov 17, 2025

Major Refactor - Explicit Pixel Grids & CF Dimensions

This PR is a major refactor of the core of Xee's backend. It merges the simplify_pixel_grid_params branch into main. That branch holds months of changes: the significant changes to the Xee API have been through review, while doc changes and CI/CD Python version changes were pushed directly. The most significant changes are adopting explicit pixel grid parameters for opening datasets (see discussion) and updating the default dimension ordering to align with community standards (see discussion). After this PR is merged, we should release v0.1.0 (from v0.0.x).

⚠️ This version contains breaking changes. All users must update their existing xarray.open_dataset calls when upgrading to this version. Please refer to the Migration Guide for detailed instructions.


💥 Breaking Changes & Rationale

1. Explicit Pixel Grid Definition

The previous, heuristic-based grid definition arguments (scale, geometry, and projection) have been removed from xr.open_dataset(..., engine='ee').

  • What Changed: The open_dataset function now requires three explicit parameters to define the pixel grid: crs, crs_transform (a 6-tuple affine matrix), and shape_2d (width/height pixel count).
  • Rationale: This shift forces users to explicitly define the output grid, eliminating ambiguity and ensuring archival reproducibility. This is crucial for precise, repeatable geospatial workflows.
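
To make the three parameters concrete, here is a minimal pure-Python sketch of how `crs_transform` and `shape_2d` together pin down a grid's extent. The helper name `grid_bounds` and the sample values are illustrative assumptions, not part of Xee. It assumes an unrotated grid (zero skew), the `(x_scale, x_skew, x_trans, y_skew, y_scale, y_trans)` ordering used in the docs, and `shape_2d` given as `(width, height)`.

```python
def grid_bounds(crs_transform, shape_2d):
  """Returns (x_min, y_min, x_max, y_max) for an unrotated pixel grid."""
  x_scale, x_skew, x_trans, y_skew, y_scale, y_trans = crs_transform
  if x_skew or y_skew:
    raise ValueError('This sketch only handles grids without rotation/skew.')
  width, height = shape_2d
  # The translation terms locate one outer corner of the grid; stepping
  # width/height pixels along each axis reaches the opposite corner.
  x_other = x_trans + x_scale * width
  y_other = y_trans + y_scale * height
  return (
      min(x_trans, x_other),
      min(y_trans, y_other),
      max(x_trans, x_other),
      max(y_trans, y_other),
  )


# Hypothetical 30 m grid in a projected CRS: 100 pixels wide, 50 pixels tall.
bounds = grid_bounds((30, 0, 500000, 0, -30, 4200000), (100, 50))
```

Because the extent follows entirely from these values, two users passing the same `crs`, `crs_transform`, and `shape_2d` describe the same pixels, which is the reproducibility argument made above.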

2. CF-friendly Dimension Ordering

The default order of spatial dimensions in the resulting Xarray objects has been changed.

  • What Changed: Datasets are now returned in the dimension order [time, y, x] instead of the old [time, x, y].
  • Rationale: This brings Xee into alignment with CF conventions and the expectations of most geospatial libraries (like rioxarray and cartopy), significantly reducing the need for manual .transpose() calls.
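
As a rough illustration of what the reordering means for downstream code, the pure-Python sketch below computes the axis permutation between the old and new layouts and shows why name-based size lookups survive the change. The helper `axis_permutation` and the sample shape are hypothetical and only stand in for what an Xarray `.transpose()` call would do.

```python
OLD_DIMS = ('time', 'x', 'y')
NEW_DIMS = ('time', 'y', 'x')


def axis_permutation(src_dims, dst_dims):
  """Returns the axis indices that reorder src_dims into dst_dims."""
  return tuple(src_dims.index(name) for name in dst_dims)


# Hypothetical array: 5 time steps, 40 pixels along x, 30 pixels along y.
old_shape = (5, 40, 30)
perm = axis_permutation(OLD_DIMS, NEW_DIMS)
new_shape = tuple(old_shape[axis] for axis in perm)

# Looking sizes up by name gives the same answer under either ordering,
# whereas positional lookups such as shape[1] silently change meaning.
old_sizes = dict(zip(OLD_DIMS, old_shape))
new_sizes = dict(zip(NEW_DIMS, new_shape))
```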

✨ New Features & Helper Utilities

To support the new explicit grid workflow, a new xee.helpers module has been added with key utilities:

  • extract_grid_params(ee_obj): Automatically derives the required crs, crs_transform, and shape_2d from an existing ee.Image or ee.ImageCollection.
    • Rationale: This is the primary way to achieve a "match source grid" workflow, making it simple for reviewers to verify that the object's native grid parameters are used.
  • fit_geometry(...): Computes the required grid parameters to cover a specific shapely.geometry (AOI) at either a fixed scale/resolution or a fixed shape/pixel count.
  • Refined Transform Logic: The internal calculation for the affine transform now uses math.floor and math.ceil to precisely snap the grid to the bounding box extents, improving coordinate accuracy and preventing sub-pixel misalignment.
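
The floor/ceil snapping idea can be sketched as follows. This is not the code in `xee.helpers`; the function name `snap_bounds_to_grid`, its arguments, and the returned dictionary are assumptions chosen for illustration, and the sketch assumes a north-up grid whose pixel edges fall on multiples of the pixel size.

```python
import math


def snap_bounds_to_grid(bounds, scale_x, scale_y):
  """Expands a bounding box outward to whole-pixel edges.

  Args:
    bounds: (x_min, y_min, x_max, y_max) in CRS units.
    scale_x: Pixel width in CRS units.
    scale_y: Pixel height in CRS units (the sign is ignored).
  """
  x_min, y_min, x_max, y_max = bounds
  sx, sy = abs(scale_x), abs(scale_y)
  # Floor the minimum edges and ceil the maximum edges so the snapped grid
  # always covers the requested box.
  x0 = math.floor(x_min / sx) * sx
  y0 = math.floor(y_min / sy) * sy
  x1 = math.ceil(x_max / sx) * sx
  y1 = math.ceil(y_max / sy) * sy
  width = round((x1 - x0) / sx)
  height = round((y1 - y0) / sy)
  return {
      # North-up grid: origin at the top-left corner, negative y step.
      'crs_transform': (sx, 0, x0, 0, -sy, y1),
      'shape_2d': (width, height),
  }


# Hypothetical AOI bounds in a metric CRS, snapped to a 30 m grid.
snapped = snap_bounds_to_grid((500012.5, 4198487.0, 502991.0, 4199989.0), 30, -30)
```

Rounding outward rather than to the nearest pixel edge is the design choice that guarantees the AOI is never clipped by a partial pixel.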

📚 Documentation & Infrastructure Updates

  • Migration Guide Added: A new detailed guide, docs/migration-guide-v0.1.0.md, has been created to assist users in updating their code to the new v0.1.0 API.
  • Extensive Documentation Refactor: New Core Concepts (concepts.md) and a User Guide (guide.md) were added to clarify the philosophy behind the pixel grid parameters and collect common workflows. The main README.md and examples have also been fully updated.
  • CI/CD Updates: Python 3.10 support was removed from all CI workflows, and the default publish environment was updated to Python 3.11.

📋 Checklist

  • Implemented new PixelGridParams signature and removed old implicit arguments.
  • Updated default dimension ordering to [time, y, x].
  • Added core grid helper utilities: extract_grid_params, fit_geometry, and set_scale.
  • Updated documentation and added a dedicated Migration Guide.
  • Updated CI/CD infrastructure to target Python 3.11.
  • All tests in ext_test.py and ext_integration_test.py pass.

tylere and others added 30 commits February 3, 2025 09:58
PiperOrigin-RevId: 712905652
Ensure shape_2d is a tuple
@jdbcode jdbcode requested review from naschmitz and schwehr November 17, 2025 21:17
Member Author

jdbcode commented Nov 17, 2025

I see that there are conflicts; I'll work on resolving them.

Collaborator

@schwehr schwehr left a comment

The only blocker: Can we bring back Python 3.10?

matrix:
python-version: [
"3.9",
"3.10",

Collaborator

It's great that 3.9 is out, but 3.10? I see it documented in the PR description, but we still need to support 3.10 in ee and geemap until 2026-10.

https://devguide.python.org/versions/

- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: 3.9

Collaborator

Any reason not to use 3.13? It might make the rules run a touch faster.

Same for the other Python versions used in CI rules.



assert sys.version_info >= (3, 8)
assert sys.version_info >= (3, 9)

Collaborator

assert sys.version_info >= (3, 10)

with self.assertRaises(ValueError):
ext._check_request_limit(chunks, dtype_size, xee.REQUEST_BYTE_LIMIT)

@mock.patch(

Collaborator

Prefer mock.patch.object

@mock.patch.object(
    xee.ext.EarthEngineStore,
    'get_info',
    new_callable=mock.PropertyMock,
)
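
For readers unfamiliar with the pattern, a minimal self-contained sketch of `mock.patch.object` combined with `mock.PropertyMock` follows. The `Store` class and `read_crs` function are hypothetical stand-ins, not Xee code; only the `unittest.mock` calls are real.

```python
from unittest import mock


class Store:
  """Hypothetical class whose property would normally hit a remote service."""

  @property
  def get_info(self):
    raise RuntimeError('would call a remote service')


def read_crs(store):
  return store.get_info['crs']


# patch.object takes the target object and attribute name separately, so a
# renamed class or attribute fails at patch time instead of silently
# patching nothing, as can happen with a mistyped dotted string.
with mock.patch.object(
    Store, 'get_info', new_callable=mock.PropertyMock
) as mock_get_info:
  mock_get_info.return_value = {'crs': 'EPSG:4326'}
  result = read_crs(Store())
```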

xee/ext_test.py Outdated
)
def test_init_with_tuple_transform(self, mock_get_info):
"""Test that a tuple object can be passed for crs_transform."""
# (Setup the mock_get_info.return_value just like in the other test)

Collaborator

No need for the parens

from pyproj import Transformer
import shapely
from shapely.ops import transform
from typing import TypedDict, Union

Collaborator

Should go with the other stdlib import (math)

xee/helpers.py Outdated

def fit_geometry(
geometry: shapely.geometry.base.BaseGeometry,
*,

Collaborator

A lot of folks do not know that this starts the required keyword section, so a comment might be good.
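
A short sketch of what the bare `*` does, for anyone who has not met it: every parameter after it must be passed by keyword. The function `fit` below is a hypothetical stand-in with made-up parameters, not the real `fit_geometry` signature.

```python
def fit(
    geometry,
    *,  # Everything after this marker is keyword-only.
    grid_crs,
    grid_scale=None,
):
  return {'geometry': geometry, 'crs': grid_crs, 'scale': grid_scale}


# Passing the CRS by keyword works.
ok = fit('aoi', grid_crs='EPSG:4326')

# Passing it positionally raises TypeError, which prevents callers from
# accidentally swapping arguments.
try:
  fit('aoi', 'EPSG:4326')
  positional_rejected = False
except TypeError:
  positional_rejected = True
```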

| Parameter | Meaning |
|-----------|---------|
| `crs` | Coordinate Reference System for the output grid (e.g. `EPSG:4326`, `EPSG:32610`). |
| `crs_transform` | Affine transform tuple `(x_scale, x_skew, x_trans, y_skew, y_scale, y_trans)` describing pixel size, rotation/skew, and origin translation in CRS units. |

Collaborator

I think GDAL does something different: trans, scale, skew instead of scale, skew, trans.

Should we mention this is the rasterio/affine convention?
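
To illustrate the difference in ordering, here is a small sketch that converts between the affine-style tuple documented above and the GDAL geotransform order (x origin, pixel width, x skew, y origin, y skew, pixel height). The function names are hypothetical; the reordering itself mirrors what the `affine` package exposes as `Affine.to_gdal()` and `Affine.from_gdal()`.

```python
def affine_to_gdal(crs_transform):
  """Reorders (x_scale, x_skew, x_trans, y_skew, y_scale, y_trans) for GDAL."""
  x_scale, x_skew, x_trans, y_skew, y_scale, y_trans = crs_transform
  return (x_trans, x_scale, x_skew, y_trans, y_skew, y_scale)


def gdal_to_affine(geotransform):
  """Inverse of affine_to_gdal."""
  x_trans, x_scale, x_skew, y_trans, y_skew, y_scale = geotransform
  return (x_scale, x_skew, x_trans, y_skew, y_scale, y_trans)


# Hypothetical 30 m north-up grid.
affine_order = (30, 0, 500000, 0, -30, 4200000)
gdal_order = affine_to_gdal(affine_order)
```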


## Dimension Ordering

Datasets are returned as `[time, y, x]` (v1.0+) aligning with CF conventions and most geospatial libraries. Prior versions used `[time, x, y]`. If code assumed positional indices, update to name-based access: `ds.sizes['x']`, `ds.sizes['y']`.

Collaborator

What is this v1.0+ here?


## CRS Units & Transforms

All scale/translation values are expressed in units of `crs`. Degrees for geographic CRSs; meters (or feet) for projected CRSs. Plate Carrée (`EPSG:4326`) has non-uniform ground size — consider a projected CRS for area/length sensitive analysis.

Collaborator

Is "Plate Carrée" standard technical language for WGS84? Given my experience, I can't tell whether this is a useful name to include here or "AI slop".
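
Whatever name is used, the "non-uniform ground size" point in the quoted text can be shown with a rough calculation. The sketch below uses a spherical-Earth approximation (about 111.32 km per degree at the equator) and a hypothetical helper name; it is meant only to show the trend, not to give survey-grade numbers.

```python
import math

# Approximate length of one degree of longitude at the equator, in meters,
# under a spherical-Earth assumption.
METERS_PER_DEGREE_AT_EQUATOR = 111_320.0


def pixel_width_meters(pixel_size_deg, latitude_deg):
  """Approximate east-west ground width of a pixel sized in degrees."""
  return (
      pixel_size_deg
      * METERS_PER_DEGREE_AT_EQUATOR
      * math.cos(math.radians(latitude_deg))
  )


# The same 0.01 degree pixel is about half as wide at 60 degrees latitude.
width_equator = pixel_width_meters(0.01, 0)
width_60 = pixel_width_meters(0.01, 60)
```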

grid_params = helpers.fit_geometry(
geometry=sf_aoi_shapely,
grid_crs='EPSG:32610', # Target CRS in meters (UTM Zone 10N)
grid_scale=(30, -30) # Use Landsat's 30m resolution

Collaborator

Minor nit: Extra horizontal space here.

- [Core Concepts](concepts.md)
- [Performance & Limits](performance.md)
- [FAQ](faq.md)
- Examples: see `examples/` directory in the repository

Collaborator

Like the other points, consider linking to the examples/ directory.


#### Example 1: Global dataset at fixed scale

**Before (v0.x):**

Collaborator

Technically v0.1.0 fits this expression. It's probably better to use v0.0.x here and elsewhere.

# data as a single chunk.
Chunks = Union[int, dict[Any, Any], Literal['auto'], None]

# Types for type hints

Collaborator

"type annotations". I don't think this comment is necessary though.

xee/ext.py Outdated
'shearX',
'translateX',
'shearY',
'scaleY',

Collaborator

scale, shear, translate? X and Y transform orders are different.

I'm a little surprised the integration tests didn't catch this.
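
One way to make an ordering slip like this visible in a unit test is to use a different value in every slot, as in the sketch below. The key names `scaleX` and `translateY` are assumed to complete the four keys visible in the diff, and `transform_dict_to_tuple` is a hypothetical helper, not the function under review.

```python
# Key order matching (x_scale, x_skew, x_trans, y_skew, y_scale, y_trans).
# 'scaleX' and 'translateY' are assumptions; the diff excerpt only shows
# the middle four keys.
TRANSFORM_KEYS = (
    'scaleX',
    'shearX',
    'translateX',
    'shearY',
    'scaleY',
    'translateY',
)


def transform_dict_to_tuple(transform):
  """Flattens a keyed affine transform into the 6-tuple order above."""
  return tuple(transform[key] for key in TRANSFORM_KEYS)


# Every slot holds a distinct value, so swapping any two keys changes the
# result and a simple equality assertion would catch it.
example = {
    'scaleX': 30,
    'shearX': 1,
    'translateX': 100,
    'shearY': 2,
    'scaleY': -30,
    'translateY': 200,
}
flattened = transform_dict_to_tuple(example)
```

Test fixtures with zero shear in both positions cannot distinguish a swapped `shearX`/`shearY`, which may be one reason such a mix-up slips past existing tests.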

[123, 0, 100, 0, 456, 200]
)



Collaborator

Minor nit: extra vertical space after some of these test cases.
