
deps: upgrade to DataFusion 53.0, Arrow to 58.1#3629

Draft
mbutrovich wants to merge 35 commits into apache:main from mbutrovich:df53

Conversation

@mbutrovich
Contributor

@mbutrovich mbutrovich commented Mar 4, 2026

Which issue does this PR close?

Closes #3574.

Rationale for this change

Upgrade dependencies.

What changes are included in this PR?

Dependency changes:

  • datafusion, datafusion-datasource, datafusion-physical-expr-adapter, datafusion-spark → 53.0.0
  • datafusion-spark now with features = ["core"]
  • datafusion-functions-nested (test dep) 52.4.0 → 53.0.0
  • arrow 57.3.0 → 58.1.0
  • parquet 57.3.0 → 58.1.0
  • object_store 0.12.3 → 0.13.1
  • iceberg, iceberg-storage-opendal → git rev 477a1e5 (DF 53 support)
  • opendal, object_store_opendal → git rev 173feb6 (unreleased commit on main with object_store 0.13 support, tracking opendal#7237)

API fixes:

  • ExecutionPlan::properties() returns &Arc<PlanProperties> — wrapped cache fields in Arc across 7 files (expand, iceberg_scan, parquet_writer, scan, shuffle_scan, shuffle_writer, and their properties() return types)
  • Removed ExecutionPlan::statistics() from parquet_writer and shuffle_writer (no longer in trait)
  • HashJoinExec::try_new takes new null_aware: bool param — added false (Spark doesn't use null-aware anti join path)
  • PhysicalExprAdapterFactory::create now returns Result — updated SparkPhysicalExprAdapterFactory and IcebergScanExec
  • EncryptionFactory methods return Result<...> instead of Result<..., DataFusionError>
  • Migrated hdfs ObjectStore impl to object_store 0.13 API — removed trait methods moved to ObjectStoreExt (get, get_range, head, delete, copy, rename, copy_if_not_exists), added new required methods (delete_stream, copy_opts), rewrote get_ranges to open the file once and read all ranges directly
  • RoundFunc now expects Int32 for decimal_places — converted point arg from Int64 to Int32 in spark_round
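
The `get_ranges` rewrite mentioned above (open the file once, then read every range from that handle) can be illustrated with a minimal, standard-library-only sketch. The helper name `read_ranges` and the generic `Read + Seek` reader are assumptions made for illustration; the actual HDFS implementation in this PR uses the `object_store` types and is not reproduced here.

```rust
use std::io::{self, Read, Seek, SeekFrom};
use std::ops::Range;

/// Hypothetical helper: reads every requested byte range from one
/// already-open reader instead of reopening the file per range.
fn read_ranges<R: Read + Seek>(
    reader: &mut R,
    ranges: &[Range<u64>],
) -> io::Result<Vec<Vec<u8>>> {
    let mut out = Vec::with_capacity(ranges.len());
    for range in ranges {
        if range.end < range.start {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "range end precedes range start",
            ));
        }
        let len = (range.end - range.start) as usize;
        let mut buf = vec![0u8; len];
        // Seek within the same handle, then fill the buffer exactly.
        reader.seek(SeekFrom::Start(range.start))?;
        reader.read_exact(&mut buf)?;
        out.push(buf);
    }
    Ok(out)
}
```

A range that runs past the end of the data surfaces as an `io::Error` from `read_exact` rather than a short read, which keeps the caller's error handling in one place.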

Behavioral fixes:

  • Type coercion strategy for UDFs: DF53's fields_with_udf() aggressively promotes types (e.g. Utf8→Utf8View, Int32→Int64). New 3-tier strategy: (1) try coerce_types() for UDFs that implement it, (2) use fields_with_udf() only for "well-supported" signatures (Coercible, String, Numeric, Comparable) that preserve input types, (3) keep original types for all other signatures (Variadic, Exact, etc.)
  • View type casting: DF53 changed some UDFs (e.g. md5) to return Utf8View/BinaryView. Added casts back to Utf8/Binary since Comet does not yet support view types
  • SparkArrayCompact: New Comet UDF replacing array_remove_all(arr, null) — DF53 changed array_remove_all to return NULL when the element arg is NULL, breaking array_compact semantics
  • SparkArrayRepeat not registered: Intentionally skipped because it returns NULL when the element is NULL (e.g. array_repeat(null, 3) → NULL instead of [null, null, null]). Comet's Scala serde wraps the call in a CaseWhen, so DataFusion's built-in ArrayRepeat is sufficient
  • CometFairMemoryPool: DF53 changed timing of reservation atomic updates — reservation.size() now reflects post-shrink/pre-grow values. Switched to tracking via state.used instead
  • Removed CoalesceBatchesExec wrapping of SMJ: CoalesceBatchesExec is deprecated in DF53; removed the special-case wrapping for filtered sort-merge joins
  • Schema adapter column resolution: wrap_all_type_mismatches now resolves logical fields by name instead of column index, and remaps column indices to the physical file schema. Fixes pruned-schema scenarios where filter expressions reference columns at different indices than the full file schema
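
To make the last item concrete, here is a rough sketch of resolving columns by name rather than by position. It is not the code in `wrap_all_type_mismatches`: the function `remap_by_name`, the use of plain string slices in place of Arrow fields, and the exact (case-sensitive) name match are all assumptions chosen to keep the example self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical helper: for each logical (possibly pruned) column, find its
/// index in the physical file schema by name. `None` means the column is
/// absent from the file. Matching is exact and case-sensitive here; if the
/// physical schema repeats a name, the last occurrence wins.
fn remap_by_name(logical: &[&str], physical: &[&str]) -> Vec<Option<usize>> {
    let by_name: HashMap<&str, usize> = physical
        .iter()
        .enumerate()
        .map(|(index, name)| (*name, index))
        .collect();
    logical
        .iter()
        .map(|name| by_name.get(name).copied())
        .collect()
}
```

With a file schema of `id, name, items, ts` and a pruned logical schema of `items, id`, a filter that refers to logical column 0 has to be rewritten to physical column 2; an index-based lookup would have picked `id` instead.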

Feature-gated:

  • hdfs-opendal — cfg-gated HDFS code paths in parquet_writer so it compiles cleanly when the feature is off
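
As a sketch of what cfg-gating on a Cargo feature can look like, the snippet below provides two definitions of one function, only one of which is compiled. The function name `open_hdfs_writer`, its signature, and the choice to return an error in the feature-off build are hypothetical; the PR text only states that the HDFS code paths in parquet_writer are gated so the crate compiles with the feature off.

```rust
/// Hypothetical entry point, compiled only when the feature is enabled.
#[cfg(feature = "hdfs-opendal")]
fn open_hdfs_writer(path: &str) -> Result<String, String> {
    // Real code would construct an HDFS-backed writer here.
    Ok(format!("hdfs writer for {path}"))
}

/// Fallback compiled when the feature is off, so callers still type-check.
#[cfg(not(feature = "hdfs-opendal"))]
fn open_hdfs_writer(path: &str) -> Result<String, String> {
    Err(format!(
        "cannot write {path}: built without the hdfs-opendal feature"
    ))
}
```

Keeping the same signature in both branches means call sites need no `cfg` attributes of their own.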

Tests:

  • Added regression test for nested schema pruning with array-of-struct and filter (found during upgrade)

How are these changes tested?

Existing tests.

@mbutrovich mbutrovich closed this Mar 4, 2026
@mbutrovich mbutrovich reopened this Mar 17, 2026
@mbutrovich mbutrovich changed the title from "deps: test DataFusion 53" to "deps: test DataFusion 53.0" Mar 17, 2026
@mbutrovich
Contributor Author

mbutrovich commented Mar 17, 2026

So the shuffle failures are related to apache/arrow-rs#9506

I opened an upstream bug for hash join: apache/datafusion#20995

@mbutrovich
Contributor Author

Down to array_repeat as the only problematic expression at this point, I think.

@mbutrovich mbutrovich changed the title from "deps: test DataFusion 53.0" to "deps: upgrade to DataFusion 53.0" Mar 31, 2026
# Conflicts:
#	native/Cargo.lock
#	native/Cargo.toml
#	native/core/Cargo.toml
#	native/core/src/execution/planner.rs
#	native/core/src/parquet/parquet_support.rs
#	native/core/src/parquet/schema_adapter.rs
@mbutrovich
Contributor Author

Bumped to released crates. Let's see how CI goes.

@mbutrovich mbutrovich changed the title from "deps: upgrade to DataFusion 53.0" to "deps: upgrade to DataFusion 53.0, Arrow to 58.1" Mar 31, 2026
@comphead
Contributor

comphead commented Apr 2, 2026

Most of the tests fail with the following; checking it:

Comet native panic: panicked at /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-expr-53.0.0/src/simplifier/mod.rs:64:60:
called `Result::unwrap()` on an `Err` value: Internal("Unexpected data type in GetArrayStructFields: Int32")
17:15:49.878 ERROR org.apache.comet.CometExecIterator: Native execution for task 9522 failed
org.apache.comet.CometNativeException: native panic: called `Result::unwrap()` on an `Err` value: Internal("Unexpected data type in GetArrayStructFields: Int32")
	at org.apache.comet.Native.executePlan(Native Method)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:154)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2$adapted(CometExecIterator.scala:153)
	at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:232)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:153)
	at org.apache.comet.Tracing$.withTrace(Tracing.scala:31)
	at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:151)
17:15:49.881 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 1683.0 (TID 9522)
org.apache.comet.CometNativeException: native panic: called `Result::unwrap()` on an `Err` value: Internal("Unexpected data type in GetArrayStructFields: Int32")
	at org.apache.comet.Native.executePlan(Native Method)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:154)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2$adapted(CometExecIterator.scala:153)
	at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:232)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:153)
	at org.apache.comet.Tracing$.withTrace(Tracing.scala:31)
	at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:151)
17:15:49.883 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 1683.0 (TID 9522) (3ef5a585ffb7 executor driver): org.apache.comet.CometNativeException: native panic: called `Result::unwrap()` on an `Err` value: Internal("Unexpected data type in GetArrayStructFields: Int32")
	at org.apache.comet.Native.executePlan(Native Method)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:154)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2$adapted(CometExecIterator.scala:153)
	at org.apache.comet.vector.Na...

@mbutrovich
Contributor Author

Most of the tests fail with the following; checking it:

Comet native panic: panicked at /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-expr-53.0.0/src/simplifier/mod.rs:64:60:
called `Result::unwrap()` on an `Err` value: Internal("Unexpected data type in GetArrayStructFields: Int32")
17:15:49.878 ERROR org.apache.comet.CometExecIterator: Native execution for task 9522 failed
org.apache.comet.CometNativeException: native panic: called `Result::unwrap()` on an `Err` value: Internal("Unexpected data type in GetArrayStructFields: Int32")
	at org.apache.comet.Native.executePlan(Native Method)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:154)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2$adapted(CometExecIterator.scala:153)
	at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:232)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:153)
	at org.apache.comet.Tracing$.withTrace(Tracing.scala:31)
	at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:151)
17:15:49.881 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 1683.0 (TID 9522)
org.apache.comet.CometNativeException: native panic: called `Result::unwrap()` on an `Err` value: Internal("Unexpected data type in GetArrayStructFields: Int32")
	at org.apache.comet.Native.executePlan(Native Method)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:154)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2$adapted(CometExecIterator.scala:153)
	at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:232)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:153)
	at org.apache.comet.Tracing$.withTrace(Tracing.scala:31)
	at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:151)
17:15:49.883 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 1683.0 (TID 9522) (3ef5a585ffb7 executor driver): org.apache.comet.CometNativeException: native panic: called `Result::unwrap()` on an `Err` value: Internal("Unexpected data type in GetArrayStructFields: Int32")
	at org.apache.comet.Native.executePlan(Native Method)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:154)
	at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2$adapted(CometExecIterator.scala:153)
	at org.apache.comet.vector.Na...

What's odd is that I don't think those failed on earlier versions of this branch.

Development

Successfully merging this pull request may close these issues.

chore: DataFusion 53.0.0

2 participants