Description
merge_cubes currently works in "union" mode: the result covers the union of the label ranges for each dimension, which is useful for use cases where you want to stitch together several data subsets containing the same kind of data. The main purpose of the overlap resolver is then to produce a clean seam between them. You have to be careful about what you do in the overlap resolver, because it is only applied where the cubes overlap, which may not be all of the result (depending on the degree of overlap). If you go beyond taking the mean or the first non-null value, the results might be hard to reason about.
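To make the concern concrete, here is a minimal toy sketch of union-mode merging along a single dimension. It is not the openEO implementation: cubes are modelled as plain dicts mapping a label to a value, and `merge_union` is a hypothetical helper name invented for illustration.

```python
def merge_union(cube1, cube2, overlap_resolver):
    """Toy 1-D model of union-mode merging (hypothetical helper, not openEO API).

    Each cube is a dict mapping a dimension label to a value.
    """
    merged = {}
    for label in sorted(set(cube1) | set(cube2)):
        if label in cube1 and label in cube2:
            # Overlap: only here is the resolver applied.
            merged[label] = overlap_resolver(cube1[label], cube2[label])
        elif label in cube1:
            # Label only in cube1: value passes through untouched.
            merged[label] = cube1[label]
        else:
            # Label only in cube2: value passes through untouched.
            merged[label] = cube2[label]
    return merged


a = {"2021-01": 10, "2021-02": 20}
b = {"2021-02": 40, "2021-03": 60}

# A resolver that goes beyond "mean" or "first non-null": a difference.
result = merge_union(a, b, lambda x, y: x - y)
print(result)  # {'2021-01': 10, '2021-02': -20, '2021-03': 60}
```

With a difference as resolver, the result mixes raw values (`2021-01`, `2021-03`) with a difference (`2021-02`), which is the kind of output that is hard to reason about.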
It might also be useful to provide an "intersection" or "overlap-only" mode of merge_cubes (through an additional parameter or as a dedicated process, I'm not sure yet) that only returns the dimension ranges where both cubes exist. It's then easier to reason about the overlap resolver, because it's guaranteed to be applied everywhere. That would also make merge_cubes useful for use cases where you fuse different kinds of data.
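A rough sketch of what such an "intersection" mode could look like, again on a toy 1-D model where cubes are dicts mapping a label to a value. The name `merge_intersection` and the example data are hypothetical; whether this would become a parameter of merge_cubes or a separate process is left open, as stated above.

```python
def merge_intersection(cube1, cube2, overlap_resolver):
    """Toy 1-D model of an intersection-mode merge (hypothetical, not openEO API).

    Only labels present in both cubes are kept, so the resolver
    is applied to every element of the result.
    """
    shared = sorted(set(cube1) & set(cube2))
    return {label: overlap_resolver(cube1[label], cube2[label]) for label in shared}


# Two cubes holding different kinds of data (illustrative values only).
red = {"t1": 0.5, "t2": 0.25}
nir = {"t2": 0.75, "t3": 0.5}

# Fuse them with a normalized difference; every output value is a ratio.
fused = merge_intersection(nir, red, lambda n, r: (n - r) / (n + r))
print(fused)  # {'t2': 0.5}
```

Because labels outside the overlap are dropped, no raw input value leaks into the result, so the output has a single, well-defined meaning.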