fix: Ignore `dy.Any` columns in `Schema.cast` by gab23r · Pull Request #315 · Quantco/dataframely

gab23r · 2026-04-03T10:02:06Z

Motivation

Fixes: #314

Changes

Schema.cast now skips dy.Any columns instead of casting them.

Copilot

Pull request overview

This PR fixes issue #314 by preventing dy.Any columns from being cast during schema validation. The Schema.cast method now skips casting for dy.Any columns since they should accept any data type, rather than attempting to cast to their default pl.Null() type. This resolves the error that occurred when roundtripping collections with parquet files containing dy.Any columns.

Changes:

Modified Schema.cast method to check if a column is of type dy.Any and skip casting for such columns
Added import for the Any column type as AnyColumn in schema.py
Added a test to verify that casting a DataFrame with an Any column preserves the original dtype

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
`dataframely/schema.py`	Added import for `Any` column type and modified `cast` method to skip casting for `dy.Any` columns
`tests/column_types/test_any.py`	Added test verifying that `Schema.cast` preserves dtype for `dy.Any` columns

Comments suppressed due to low confidence (1)

tests/column_types/test_any.py:5

An extra blank line is added after the license header (line 4), creating two blank lines between the license header and imports. This doesn't match the convention in other test files (e.g., test_string.py) which have only one blank line. Consider removing the extra blank line to maintain consistency.


from typing import Any

codecov · 2026-04-03T10:06:13Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (72fb1a6) to head (378f7c1).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##              main      #315   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           56        56           
  Lines         3218      3234   +16     
=========================================
+ Hits          3218      3234   +16


Andreas Albert (AndreasAlbertQC)

Thanks for finding this and providing a fix! I'd just like to refactor it a little.

Andreas Albert (AndreasAlbertQC) · 2026-04-04T10:42:51Z

dataframely/schema.py

        lf = df.lazy().select(
-            pl.col(name).cast(col.dtype) for name, col in cls.columns().items()
+            # Skip casting for Any columns since they accept any type
+            pl.col(name) if isinstance(col, AnyColumn) else pl.col(name).cast(col.dtype)


Instead of building special treatment for Any here, how about we move ownership of casting into the Column itself? I.e. Column gets a cast method, and the default implementation is:

def cast(self, col: pl.Expr) -> pl.Expr: return col.cast(self.dtype)

In Any, we then implement the override:

def cast(self, col: pl.Expr) -> pl.Expr: return col

I think this would be neat because you never have to think about special casting logic outside the column implementations themselves.

gab23r · 2026-04-04

I was thinking about this solution as well, but then I thought about the dy.Integer column. How should we handle that case? Maybe Column.cast should also take the dtype of the input expression, which means we would need to wrap the call with pipe_with_schema.
I am away from my computer right now; I can take a deeper look on Tuesday.

gab23r requested review from Andreas Albert (AndreasAlbertQC), Oliver Borchert (borchero) and Daniel Elsner (delsner) as code owners April 3, 2026 10:02

Copilot AI review requested due to automatic review settings April 3, 2026 10:02

github-actions bot added the fix label Apr 3, 2026

fix: Ignore `dy.Any` columns in `Schema.cast`

378f7c1

Copilot started reviewing on behalf of gab23r April 3, 2026 10:02

gab23r force-pushed the fix-cast-with-any branch from 8114fec to 378f7c1 Compare April 3, 2026 10:02

Copilot AI reviewed Apr 3, 2026


Andreas Albert (AndreasAlbertQC) requested changes Apr 4, 2026


gab23r wants to merge 1 commit into Quantco:main from gab23r:fix-cast-with-any
