fix: array to array cast #2897

Open
manuzhang wants to merge 1 commit into apache:main from manuzhang:test-array-cast

@manuzhang
Member

manuzhang commented Dec 13, 2025

Which issue does this PR close?

Part of #2766

Rationale for this change

What changes are included in this PR?

  • Fixed native array-to-array casting by recursively casting list elements in native/spark-expr/src/conversion_funcs/cast.rs.
  • Added native coverage for null-only array casts and safer binary-to-string conversion in native/spark-expr/src/conversion_funcs/cast.rs.
  • Added generated ArrayType -> ArrayType coverage in spark/src/test/scala/org/apache/comet/CometCastSuite.scala, with shared matrix logic for flat and nested arrays.
  • Improved array test data generation in spark/src/test/scala/org/apache/comet/CometCastSuite.scala with edge cases, reused date/timestamp/binary sources, and bounded float/double/long values to avoid timestamp overflow.
  • Stabilized cast and try_cast comparisons in spark/src/test/scala/org/apache/comet/CometCastSuite.scala by ordering on a synthetic row id.
  • Added a temporary JVM-side guard in spark/src/main/scala/org/apache/comet/expressions/CometCast.scala to stop unsupported Array[Date] casts from reaching native.
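
To illustrate the first bullet, here is a minimal, std-only sketch of recursive element casting. It is not the code in cast.rs and does not use the Arrow API: the `DataType` and `Value` enums and the `cast` function are hypothetical stand-ins, and the string-to-int rule (unparsable strings become null) is an assumption chosen only for this example.

```rust
/// Toy data types standing in for Spark/Arrow types (hypothetical, illustration only).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int,
    Str,
    List(Box<DataType>),
}

/// Toy values; `Null` models a null slot at any nesting level.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Int(i64),
    Str(String),
    List(Vec<Value>),
}

/// Cast `value` to the type `to`, recursing into list elements so that
/// nested lists are handled by the same function.
fn cast(value: &Value, to: &DataType) -> Result<Value, String> {
    match (value, to) {
        // Nulls stay null whatever the target type, so null-only lists cast trivially.
        (Value::Null, _) => Ok(Value::Null),
        (Value::Int(i), DataType::Int) => Ok(Value::Int(*i)),
        (Value::Int(i), DataType::Str) => Ok(Value::Str(i.to_string())),
        (Value::Str(s), DataType::Str) => Ok(Value::Str(s.clone())),
        // Assumption for this sketch: unparsable strings become null.
        (Value::Str(s), DataType::Int) => Ok(s
            .trim()
            .parse::<i64>()
            .map(Value::Int)
            .unwrap_or(Value::Null)),
        // The recursive step: cast every element to the target element type.
        (Value::List(items), DataType::List(elem)) => items
            .iter()
            .map(|v| cast(v, elem))
            .collect::<Result<Vec<_>, _>>()
            .map(Value::List),
        // Anything else is reported as unsupported rather than silently passed through.
        (v, t) => Err(format!("unsupported cast: {:?} -> {:?}", v, t)),
    }
}
```

Under this toy model, casting `[[1, null], null]` to a list of lists of strings yields `[["1", null], null]`, while casting a scalar to a list type returns an error.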

How are these changes tested?

Added tests.


Co-authored-by: @codex

manuzhang force-pushed the test-array-cast branch 4 times, most recently from 864a25f to 92a7867 (December 14, 2025 15:23)
@codecov-commenter

codecov-commenter commented Dec 14, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.53%. Comparing base (f09f8af) to head (5f7b017).
⚠️ Report is 815 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2897      +/-   ##
============================================
+ Coverage     56.12%   59.53%   +3.40%     
- Complexity      976     1376     +400     
============================================
  Files           119      167      +48     
  Lines         11743    15496    +3753     
  Branches       2251     2569     +318     
============================================
+ Hits           6591     9225    +2634     
- Misses         4012     4971     +959     
- Partials       1140     1300     +160     

@manuzhang
Member Author

Created #2914 for the failed test

  "...The value -96833550.[]7BD of the type "DEC..." did not equal "...The value -96833550.[0]7BD of the type "DEC..." (CometCastSuite.scala:1328)

@manuzhang
Member Author

Submitted #2916 to fix the failed test.

@github-actions

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

github-actions bot added the Stale label Mar 13, 2026
github-actions bot removed the Stale label Mar 18, 2026
manuzhang force-pushed the test-array-cast branch 2 times, most recently from d9770c7 to fae74f6 (March 18, 2026 07:26)
@coderfender
Contributor

Taking a look

@manuzhang
Member Author

@coderfender Sorry, there are more test failures that I need to fix first.

@manuzhang
Member Author

@coderfender It's ready for review now.

@coderfender
Contributor

@manuzhang, thank you. The code/feature looks good overall to me. It would be great if we could add more comprehensive tests on both the Spark and Rust sides (covering different data types, eval modes, edge cases, etc.) as well as benchmarks (which can be a follow-up PR). Thank you.

Contributor

@coderfender coderfender left a comment

Added a few minor comments, but the PR looks good overall.

Contributor

@parthchandra parthchandra left a comment

+1 for adding the tests requested by @coderfender

@manuzhang
Member Author

I'll add more DataTypes to the test once #3768 is resolved.

@coderfender
Contributor

Sure @manuzhang

Contributor

@parthchandra parthchandra left a comment

This is looking good. Some minor comments.

testArrayCastMatrix(types, ArrayType(_), generateArrays(100, _))
}

ignore("cast nested ArrayType to nested ArrayType") {
Contributor

Is there a GitHub issue for this? Can we reference it here to document why it is being ignored?

(fromType, toType) match {
  case (dt: ArrayType, _: ArrayType) if dt.elementType == NullType => Compatible()
  case (ArrayType(DataTypes.DateType, _), ArrayType(toElementType, _))
      if toElementType != DataTypes.StringType &&
Contributor

Shouldn't DateType -> IntegerType (and maybe others) be possible too?

CometCast.isSupported(fromWrappedType, toWrappedType, None, CometEvalMode.TRY) ==
Compatible()
castTest(
generateInput(fromType),
Contributor

minor: we can probably call generateInput once in the outer loop

val fromWrappedType = wrapType(fromType)
val toWrappedType = wrapType(toType)
if (fromType != toType &&
  testNames.contains(name) &&
Contributor

Nice!

4 participants