Support multiple items in DataTree.__getitem__ and improve NodePath (renamed to TreePath) #10854

shoyer · 2025-10-15T00:04:47Z

This PR adds support for indexing with multiple items as a list of paths in DataTree.__getitem__, e.g., tree[['first', 'second']].

It also includes internal improvements to NodePath (now renamed to TreePath):

- Rename `NodePath` to `TreePath` to make its name slightly more obvious
- Automatically normalize paths in the `TreePath` constructor
- Use `joinpath()` and normalized tree paths to simplify implementations of `_get_item` and `_set_item`.

None of these changes are user facing.
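For readers unfamiliar with what normalizing a tree path means, here is a rough sketch using only the standard library. It is not the `TreePath` implementation from this PR: the helper names `normalize_tree_path` and `join_normalized` are hypothetical, and the sketch assumes POSIX-style paths where `.`, `..`, and repeated slashes are collapsed.

```python
import posixpath
from pathlib import PurePosixPath


def normalize_tree_path(path: str) -> PurePosixPath:
    """Collapse '.', '..', and repeated slashes in a POSIX-style tree path.

    Hypothetical helper for illustration; not part of xarray.
    """
    # posixpath.normpath is a pure string operation and never touches the filesystem.
    return PurePosixPath(posixpath.normpath(path))


def join_normalized(base: str, *parts: str) -> PurePosixPath:
    """Join segments onto a base path, then normalize the result.

    Hypothetical helper for illustration; not part of xarray.
    """
    joined = PurePosixPath(base).joinpath(*parts)
    return normalize_tree_path(str(joined))


print(normalize_tree_path("a/./b/../c"))   # a/c
print(join_normalized("/a", "b", "../c"))  # /a/c
```

If paths are normalized once at construction time, later lookups can compare and join them without re-checking for `.` or `..` segments, which is presumably what allows `_get_item` and `_set_item` to be simplified.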

keewis · 2025-10-22T16:31:46Z

looks like there was a similar attempt in #10400, in case it helps

According to our policy, we can drop python=3.11 from 2026-04-04 onwards. You can simulate this by passing --today to minimum_versions.py:

python minimum_versions.py --policy ci/policy.yaml --today 2026-04-04 ci/requirements/min-all-deps.yml

shoyer · 2025-10-29T00:08:27Z

This is ready for review.

The main thing this could use is clear documentation, to explain that in the case of indexing multiple keys, the resulting DataTree is always defined relative to the node being indexed. This is rather different from the API proposed in #10400, which tries to index the selected variables at each node.
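To make the "relative to the node being indexed" behavior concrete, here is a minimal sketch that models a tree as a plain dict of absolute paths. It is not how `DataTree.__getitem__` is implemented; `select_paths` is a hypothetical helper, and the choice to include everything nested below a selected path is an assumption made for illustration.

```python
import posixpath


def select_paths(tree: dict[str, object], node: str, keys: list[str]) -> dict[str, object]:
    """Return entries under the requested keys, keyed relative to `node`.

    `tree` maps absolute paths (e.g. "/a/b") to values. Hypothetical helper
    for illustration; not part of xarray.
    """
    selected: dict[str, object] = {}
    for key in keys:
        # Resolve each key against the node being indexed, then normalize.
        target = posixpath.normpath(posixpath.join(node, key))
        for path, value in tree.items():
            # Assumption: keep the target itself and everything nested below it.
            if path == target or path.startswith(target.rstrip("/") + "/"):
                # Express the result relative to `node`, not the root.
                selected[posixpath.relpath(path, node)] = value
    return selected


tree = {"/a/b/c": 0, "/a/d": 1, "/e": 2}
print(select_paths(tree, "/", ["a/b", "e"]))  # {'a/b/c': 0, 'e': 2}
print(select_paths(tree, "/a", ["b"]))        # {'b/c': 0}
```

The second call shows the point made above: selecting from the `a` node yields paths that start at `a`, so the intermediate `a` level does not appear in the result.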

Ideally we could supply this functionality in a dedicated method (which would also make it easier to document), e.g., DataTree.subset() as we discussed last week at the Xarray meeting. This could be similar to the existing discussion about adding a public API for Dataset._copy_listed(): #3894

cc @eni-awowale

eni-awowale · 2025-10-30T16:01:13Z

xarray/tests/test_datatree.py

+
+    def test_getitem_on_child(self) -> None:
+        data = DataTree.from_dict({"a/b/c": 0, "a/d": 1, "e": 2})
+        child = data.children["a"]


Would this be the distinction between the future .subset method and [['a']] selection?

For example with a datatree like:

dt1 = xr.DataTree.from_dict({'/': xr.Dataset(coords={'x': [1, 2, 3]}), '/a': xr.Dataset({'n': 1}), '/a/b': xr.Dataset({'foo': 1})})

<xarray.DataTree>
Group: /
│   Dimensions:  (x: 3)
│   Coordinates:
│     * x        (x) int64 24B 1 2 3
└── Group: /a
    │   Dimensions:  ()
    │   Data variables:
    │       n        int64 8B 1
    └── Group: /a/b
            Dimensions:  ()
            Data variables:
                foo      int64 8B 1

When we select with dt1[['a/b']], we get a datatree that has an empty "a" group, with the coordinates from the root group:

<xarray.DataTree>
Group: /
└── Group: /a
    └── Group: /a/b
            Dimensions:  (x: 3)
            Coordinates:
              * x        (x) int64 24B 1 2 3
            Data variables:
                foo      int64 8B 1

So .subset would do something like dt1.children['a'][['/a/b']], and we would get a datatree that contains only the b group, with the "x" coordinates from the root group:

<xarray.DataTree 'a'>
Group: /
└── Group: /b
        Dimensions:  (x: 3)
        Coordinates:
          * x        (x) int64 24B 1 2 3
        Data variables:
            foo      int64 8B 1

shoyer added 2 commits October 14, 2025 16:34

simplify

5943f73

shoyer requested a review from TomNicholas October 15, 2025 00:04

github-actions bot added the topic-backends, topic-zarr, topic-DataTree, and io labels Oct 15, 2025

shoyer added 5 commits October 15, 2025 10:44

Fix python 3.11

afdff85

Fix python 3.13 again

849ac5a

Merge branch 'main' into tree-node-improve

03b9e8a

support for multiple keys in __getitem__

ae19f51

whats new

74d3cf6

shoyer changed the title ~~Internal improvements to NodePath (renamed to TreePath)~~ Support multiple items in DataTree.__getitem__ and improve NodePath (renamed to TreePath) Oct 15, 2025

shoyer added 3 commits October 15, 2025 11:58

Fix typing

b5f1a15

Fix py3.13 again

299b0b7

Merge branch 'main' into tree-node-improve

381be8b

shoyer added 2 commits October 26, 2025 16:44

Fix handling of getitem on child nodes

e28d552

Merge branch 'main' into tree-node-improve

73ab794

eni-awowale reviewed Oct 30, 2025
