@DeflateAwning
Contributor

Motivation

Yesterday, I couldn't find how to inspect the failures returned by filter and had to re-read the quickstart guide.

Today, I still couldn't remember how to do it. Once you have the bad dataclass instance, nothing hints at what to do with it. So this PR shows the example right in the docstring.

Changes

Added an example to the docstring, copied from the quickstart guide. Having this example in both places makes sense: the docstring is where you look once you already have the failure object in hand.

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Oct 23, 2025
@codecov

codecov bot commented Oct 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (a7ea0b1) to head (edb8ba7).

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #188   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           53        53           
  Lines         3005      3005           
=========================================
  Hits          3005      3005           


Member

@delsner delsner left a comment

Thanks, I think it makes sense to have such an example. Would you mind adding it to the Collection class as well? There one receives a dict[str, FailureInfo] in bad.

@DeflateAwning
Contributor Author

Seems reasonable, but I'm not sure what that example should look like.

Can you suggest what the example should look like? Haven't used Collections yet, and the docs don't really have complete enough examples to copy from imo.

@AndreasAlbertQC
Collaborator

One technical aspect to be aware of: I think our sphinx setup will currently not correctly render markdown in docstrings, because those are handled by sphinx.ext.autodoc, which afaict only supports RST (See also here).
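To illustrate the point about RST versus Markdown, here is a minimal sketch of a docstring that uses an RST code directive instead of a Markdown fence, plus two small checks that tell the styles apart. `filter_rows` and both helper functions are hypothetical names made up for this illustration; none of them come from dataframely or Sphinx.

```python
import inspect
import re

# Built programmatically so this example does not itself contain a fence marker.
MARKDOWN_FENCE = "`" * 3


def filter_rows(rows):
    """Keep only the non-negative values (hypothetical stand-in function).

    The usage example below is written as an RST directive, which is the
    markup that autodoc-processed docstrings are expected to contain.

    .. code:: python

        good = filter_rows([1, -2, 3])
        print(good)
    """
    return [r for r in rows if r >= 0]


def uses_markdown_fence(obj) -> bool:
    """Return True if the object's docstring has a line starting with a Markdown fence."""
    doc = inspect.getdoc(obj) or ""
    return any(line.lstrip().startswith(MARKDOWN_FENCE) for line in doc.splitlines())


def uses_rst_code_directive(obj) -> bool:
    """Return True if the object's docstring contains an RST code directive."""
    doc = inspect.getdoc(obj) or ""
    pattern = r"^\s*\.\. (code|code-block|sourcecode)::"
    return re.search(pattern, doc, flags=re.MULTILINE) is not None


print(uses_rst_code_directive(filter_rows), uses_markdown_fence(filter_rows))
```

A check like this only looks at the text of the docstring; whether the docs actually render correctly still depends on the project's Sphinx configuration.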

@delsner
Member

delsner commented Oct 27, 2025

> Can you suggest what the example should look like? Haven't used Collections yet, and the docs don't really have complete enough examples to copy from imo.

The example can be very similar to the schema example:

# Define collection
class HospitalInvoiceData(dy.Collection):
    invoice: dy.LazyFrame[InvoiceSchema]
    ...

# Filter the data and cast columns to expected types
good, failure = HospitalInvoiceData.filter(df, cast=True)

# `failure` is a dict[str, FailureInfo] keyed by member name.
# Inspect the reasons for the failed rows for member `invoice`
print(failure["invoice"].counts())

# Inspect the failed rows
failed_df = failure["invoice"].invalid()
print(failed_df)
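Since the collection case hands back one failure object per member, a natural follow-up is to summarize all members at once. The sketch below shows that pattern with a stub so it runs without dataframely installed. `StubFailureInfo` and `summarize_failures` are hypothetical, and the return shapes of `counts()` and `invalid()` are assumptions made for illustration; only the method names and the dict[str, FailureInfo] shape are taken from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass
class StubFailureInfo:
    """Hypothetical stand-in for dataframely's FailureInfo (not the real class)."""

    # Maps a rule name to the rows that violated it.
    failed_rows: dict[str, list[dict]] = field(default_factory=dict)

    def counts(self) -> dict[str, int]:
        # Assumed shape: rule name -> number of failing rows.
        return {rule: len(rows) for rule, rows in self.failed_rows.items()}

    def invalid(self) -> list[dict]:
        # The real method appears to return a data frame; a list of dicts stands in here.
        return [row for rows in self.failed_rows.values() for row in rows]


def summarize_failures(failure: dict[str, StubFailureInfo]) -> dict[str, dict[str, int]]:
    """Collect per-member failure counts, skipping members without failed rows."""
    return {member: info.counts() for member, info in failure.items() if info.invalid()}


# Hypothetical data: one member with a single failing row, one clean member.
failure = {
    "invoice": StubFailureInfo({"amount_non_negative": [{"amount": -5}]}),
    "patient": StubFailureInfo(),
}
summary = summarize_failures(failure)
print(summary)
```

With the real library, the same loop over `failure.items()` should apply, but the exact return types are best checked against the dataframely API reference.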

@DeflateAwning
Contributor Author

Review comments implemented. Rebased. Thanks.

@delsner
Member

delsner commented Oct 30, 2025

The docs build is still failing with indentation errors. Looking at this docstring, I think you also need to indent the .. code:: python directive:

/home/docs/checkouts/readthedocs.org/user_builds/dataframely/checkouts/188/dataframely/collection/collection.py:docstring of dataframely.collection.collection.Collection.filter:21: ERROR: Unexpected indentation. [docutils]
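For context on this class of error, here is a minimal sketch of a checker for two common RST layout conventions around `.. code::` directives: a blank line after the directive, and a body indented deeper than the directive itself. `check_code_directives` is a hypothetical helper written for this illustration. It does not invoke docutils and does not reproduce the exact failure in this PR; it only shows how relative indentation inside a docstring can be inspected after `inspect.cleandoc` strips the common margin.

```python
import inspect


def check_code_directives(doc: str) -> list[str]:
    """Return layout problems found around ``.. code::`` directives in a docstring."""
    problems = []
    # cleandoc removes the common leading margin, as docstring tooling usually does.
    lines = inspect.cleandoc(doc).splitlines()
    for i, line in enumerate(lines):
        stripped = line.lstrip()
        if not stripped.startswith(".. code::"):
            continue
        directive_indent = len(line) - len(stripped)
        rest = lines[i + 1:]
        if rest and rest[0].strip():
            problems.append(f"line {i + 1}: no blank line after the directive")
        # The first non-blank line after the directive is treated as the start of its body.
        body = next((candidate for candidate in rest if candidate.strip()), None)
        if body is None:
            problems.append(f"line {i + 1}: directive has no body")
        elif len(body) - len(body.lstrip()) <= directive_indent:
            problems.append(f"line {i + 1}: body is not indented deeper than the directive")
    return problems


GOOD_DOC = """Summary line.

    .. code:: python

        x = 1
    """

BAD_DOC = """Summary line.

    .. code:: python

    x = 1
    """

print(check_code_directives(GOOD_DOC))
print(check_code_directives(BAD_DOC))
```

Running the actual Sphinx build remains the authoritative check, since docstring preprocessing extensions can change how sections and directives are nested.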
