`write_dataframe_async` with `increment=True` should aggregate before splitting dataframe

When `write_dataframe_async` is called with `increment=True`, rows should be aggregated before the dataframe is split into smaller chunks for parallel processing.

Currently, the aggregation appears to happen per chunk. If the original dataframe contains duplicate records (rows addressing the same intersection), those duplicates can end up in different chunks. The parallel writes can then overwrite each other instead of incrementing the value correctly.
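
The requested ordering can be sketched in plain pandas. This is only an illustration of the idea, not the library's implementation: the helper name `aggregate_then_split`, the `value_column` and `chunk_size` parameters, and the sample column names are all hypothetical, and it assumes every column other than the value column identifies the intersection.

```python
import pandas as pd


def aggregate_then_split(df, value_column, chunk_size):
    """Sum duplicate intersections first, then cut the result into chunks.

    Hypothetical helper: assumes all columns except `value_column`
    identify the target intersection.
    """
    key_columns = [c for c in df.columns if c != value_column]
    # Collapse rows that address the same intersection into a single row.
    aggregated = df.groupby(key_columns, as_index=False, sort=False)[value_column].sum()
    # Only now split, so no intersection can appear in more than one chunk.
    return [
        aggregated.iloc[start:start + chunk_size]
        for start in range(0, len(aggregated), chunk_size)
    ]


# Made-up data: two rows per intersection.
df = pd.DataFrame({
    "Region": ["DE", "FR", "DE", "FR"],
    "Measure": ["Sales", "Sales", "Sales", "Sales"],
    "Value": [10.0, 5.0, 7.0, 1.0],
})
chunks = aggregate_then_split(df, "Value", chunk_size=1)
```

With this ordering each intersection lands in exactly one chunk, so concurrent incrementing writes should no longer compete for the same cell.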

`write_dataframe_async` with `increment=True` should aggregate before splitting dataframe #1307
