Add utilities to convert feature masks to cell masks and vice versa, and add optional cell mask output from segmentation #512

Open
w-k-jones wants to merge 16 commits into tobac-project:RC_v1.6.x from w-k-jones:feature_mask_conversion

Conversation

@w-k-jones
Member

@w-k-jones w-k-jones commented Jul 12, 2025

Adds features requested in #508, along with postprocessing utilities to perform the conversion in both directions. Note that in the presence of stub cells, this operation cannot be performed perfectly.

Some additional features not currently included are conversion to track masks (although the reverse is not possible) and a function for simultaneously filtering features from both a feature dataframe and the corresponding feature mask.
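As a rough illustration of the feature-to-cell conversion described above, the sketch below relabels a feature mask through a lookup table. It is not the code in this PR: the helper name `feature_mask_to_cell_mask` and the toy arrays are hypothetical, and it assumes that stub features all carry the same placeholder cell value (-1 here).

```python
import numpy as np


def feature_mask_to_cell_mask(mask, feature_ids, cell_ids):
    """Relabel a feature mask with cell ids (hypothetical helper, not the tobac API)."""
    feature_ids = np.asarray(feature_ids)
    cell_ids = np.asarray(cell_ids)
    # Lookup table indexed by feature id; index 0 stays 0 so background is kept,
    # and features missing from feature_ids also map to 0.
    size = max(int(mask.max()), int(feature_ids.max())) + 1
    lut = np.zeros(size, dtype=np.int64)
    lut[feature_ids] = cell_ids
    return lut[mask]


# Toy single-timestep mask: feature 1 belongs to cell 5,
# features 2 and 3 are stubs (assumed placeholder cell value -1).
mask = np.array([[0, 1, 1],
                 [2, 3, 3]])
cell_mask = feature_mask_to_cell_mask(mask, [1, 2, 3], [5, -1, -1])
print(cell_mask)
```

In this toy case both stub features end up with the same value in the cell mask, so the original feature labels can no longer be told apart from the cell mask alone. That is one plausible reading of why the reverse conversion cannot be exact when stub cells are present.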

  • Have you followed our guidelines in CONTRIBUTING.md?
  • Have you self-reviewed your code and corrected any misspellings?
  • Have you written documentation that is easy to understand?
  • Have you written descriptive commit messages?
  • Have you added NumPy docstrings for newly added functions?
  • Have you formatted your code using black?
  • If you have introduced a new functionality, have you added adequate unit tests?
  • Have all tests passed in your local clone?
  • If you have introduced a new functionality, have you added an example notebook?
  • Have you kept your pull request small and limited so that it is easy to review?
  • Have the newest changes from this branch been merged?

@w-k-jones w-k-jones self-assigned this Jul 12, 2025
@w-k-jones w-k-jones added the enhancement Addition of new features, or improved functionality of existing features label Jul 12, 2025
@codecov

codecov bot commented Jul 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 65.39%. Comparing base (1433722) to head (e3767f7).
⚠️ Report is 164 commits behind head on RC_v1.6.x.

Additional details and impacted files
@@              Coverage Diff              @@
##           RC_v1.6.x     #512      +/-   ##
=============================================
+ Coverage      63.53%   65.39%   +1.86%     
=============================================
  Files             27       27              
  Lines           3842     4066     +224     
=============================================
+ Hits            2441     2659     +218     
- Misses          1401     1407       +6     
Flag Coverage Δ
unittests 65.39% <100.00%> (+1.86%) ⬆️

@github-actions

github-actions bot commented Jul 12, 2025

Linting results by Pylint:

Your code has been rated at 8.33/10 (previous run: 8.36/10, -0.03)
The linting score is an indicator that reflects how well your code version follows Pylint’s coding standards and quality metrics with respect to the RC_v1.6.x branch.
A decrease usually indicates your new code does not fully meet style guidelines or has potential errors.

Member

@JuliaKukulies JuliaKukulies left a comment

Very useful addition, @w-k-jones! I have tested this with our example notebooks and it works without problems. The design also looks good to me!

I am just wondering whether there is a reason why we are not adding the capability to have either cell or track numbers in the mask (limiting the features-to-track conversion to one direction only). The same is true for filtered features. With the function that you have for converting features to cell numbers, it seems like an easy addition to use tracks or filtered features instead of the cell numbers.

Also, what do you think about modifying one of our example notebooks so that it applies this new function? I think as of now, most notebooks do the segmentation first and then the tracking, but it would be fairly easy to change one example to do the tracking first and then show how the mask returned by the segmentation contains the cell numbers. I am happy to help with such an example!

@JuliaKukulies JuliaKukulies added this to the v1.6.2 milestone Oct 10, 2025
@freemansw1 freemansw1 mentioned this pull request Dec 8, 2025
11 tasks
@w-k-jones
Member Author

I have added functions to convert feature/cell masks to tracks, but no operation in the other direction is possible (individual features get merged together and become inseparable). I have also added an inplace keyword to perform this operation without copying the input mask to save memory (e.g. when performed during segmentation).
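A minimal sketch of how an `inplace` option for such a relabelling could behave is shown below. The helper `relabel_feature_mask` and the toy data are hypothetical and are not taken from this PR; the sketch assumes non-negative integer feature ids, a signed integer mask, and time as the first axis.

```python
import numpy as np


def relabel_feature_mask(mask, feature_ids, new_ids, inplace=False):
    """Replace feature ids in `mask` with `new_ids`, e.g. track ids (hypothetical helper)."""
    out = mask if inplace else mask.copy()
    feature_ids = np.asarray(feature_ids)
    size = max(int(out.max()), int(feature_ids.max())) + 1
    lut = np.zeros(size, dtype=out.dtype)
    lut[feature_ids] = new_ids
    # Relabel one slice along the first (time) axis at a time, so that any
    # temporary array is only as large as a single time step.
    for t in range(out.shape[0]):
        out[t] = lut[out[t]]
    return out


# Two time steps; features 1, 2 and 3 are assumed to share track 7, feature 4 is track 8.
mask = np.array([[[1, 0], [0, 2]],
                 [[3, 3], [0, 4]]])
track_mask = relabel_feature_mask(mask, [1, 2, 3, 4], [7, 7, 7, 8])
print(track_mask)
```

In this toy example features 1 and 2 exist at the same time step but both map to track 7, so they can no longer be distinguished afterwards, which matches the point that the reverse conversion is not possible. With `inplace=True` the input array itself is overwritten and returned instead of a copy.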

Ready for re-review when tests have passed @JuliaKukulies @freemansw1

@freemansw1 freemansw1 removed their request for review December 12, 2025 15:21
Member

@JuliaKukulies JuliaKukulies left a comment

Thanks for the additions @w-k-jones, great job and so useful!! See my specific comments below. The main thing is that we either need to document clearly which extra steps are required to add the track column to the dataframe of tracked features, or change the function so that it works with the original output from the merge/split function. I think the first option would be easier, but maybe the second one cleaner - what do you think?

Parameters
----------
features : pd.DataFrame
    A feature dataframe with cell values provided by tobac.linking_trackpy
Member

Update this docstring to refer to tracks provided by the merge/split module in tobac.

Parameters
----------
features : pd.DataFrame
    A feature dataframe with cell values provided by tobac.linking_trackpy
Member

same as above

timestep that is time_padding off of the feature. Extremely useful when
converting between micro- and nanoseconds, as is common when using Pandas
dataframes.
return_cells: bool, optional (default: False)
Member

Since you added the capability to directly go from features to tracks as well, should that also be an option in the segmentation function?

datetime(2000, 1, 1, 0), datetime(2000, 1, 1, 2), periods=3
),
"cell": [1, 1, 1],
"track": [1, 1, 1],
Member

The functions convert_feature_mask_to_track and convert_cell_mask_to_track seem to require the original Features pandas dataframe with an added track column. As in one of your example notebooks (MCS tracking in ICON simulations), the suggested workflow is:

merge_splits = merge_splits.merge_split_MEST(Track, dxy = dxy)
Track['track'] = merge_splits.feature_parent_track_id

which I think is good, but it needs to be documented somewhere. From a user perspective, it is not clear whether to pass in the output of merge_splits.merge_split_MEST directly or to do that extra step first. I am making this comment here because maybe we should include this in the test as well, to ensure that the output format of merging and splitting can be used with this function.
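One way the extra step could be made explicit, and checked, is sketched below. Both helpers (`with_track_column`, `check_track_input`) are hypothetical and not part of tobac or this PR; a plain list stands in for the per-feature `feature_parent_track_id` values, and the required column names are an assumption based on the test data quoted above.

```python
import pandas as pd


def with_track_column(features, feature_track_ids):
    """Return a copy of `features` with a per-feature `track` column (hypothetical helper)."""
    if len(feature_track_ids) != len(features):
        raise ValueError("expected exactly one track id per feature row")
    out = features.copy()
    # Positional assignment: row i receives feature_track_ids[i].
    out["track"] = list(feature_track_ids)
    return out


def check_track_input(features):
    """Raise early if the dataframe lacks the columns assumed by a track conversion."""
    missing = {"feature", "track"} - set(features.columns)
    if missing:
        raise ValueError(f"feature dataframe is missing columns: {sorted(missing)}")


# Toy tracked-feature table; the track ids are made-up stand-ins for merge/split output.
features = pd.DataFrame({"feature": [1, 2, 3], "cell": [1, 1, 2]})
tracked = with_track_column(features, [1, 1, 1])
check_track_input(tracked)  # passes; check_track_input(features) would raise
print(tracked)
```

A check of this kind in the conversion functions, or in their tests, would surface a missing track column as a clear error message rather than a failure further downstream.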

@freemansw1
Member

@JuliaKukulies @w-k-jones any opposition to pushing this to v1.6.3?

@JuliaKukulies
Member

> @JuliaKukulies @w-k-jones any opposition to pushing this to v1.6.3?

No problem from my side!

But also: sorry that this got delayed because of my comments. The code all works well, so I am also fine with merging as is, and I could add some documentation to the docstrings tomorrow to clarify what the input dataframe for the features-to-track conversion needs to look like. Does that sound reasonable, @w-k-jones?

@w-k-jones
Member Author

@JuliaKukulies Thanks for flagging that issue, I've got so used to appending the tracks output to the feature dataframe that I forgot it isn't like that by default 😅

@freemansw1 Two options I think, first is to roll back to just the feature/cell conversion utilities and merge those for v1.6.2, second is to delay to v1.6.3 and possibly update the merge/split code to be able to optionally return a feature dataframe rather than a dataset

@JuliaKukulies
Member

> @JuliaKukulies Thanks for flagging that issue, I've got so used to appending the tracks output to the feature dataframe that I forgot it isn't like that by default 😅

Haha, makes sense! I also like your workflow and it makes things easier.

> @freemansw1 Two options I think, first is to roll back to just the feature/cell conversion utilities and merge those for v1.6.2, second is to delay to v1.6.3 and possibly update the merge/split code to be able to optionally return a feature dataframe rather than a dataset

No matter which of the two options we go for, I am definitely in favor of updating the merge/split code to optionally return a dataframe, and of making it consistent so that it also contains all columns from the feature detection/segmentation/tracking output. @kelcyno are you OK with that? I think you had a function for this in some example, but I couldn't find it anymore.

@kelcyno
Collaborator

kelcyno commented Dec 17, 2025

I'm fine with updating the merge/split code to add the optional output. I do have code to combine the columns. However, we should make that standalone so we have the combined and compressed dataset outside of merge/split. I can start this PR over the break.

@freemansw1 freemansw1 modified the milestones: v1.6.2, v.1.6.3 Dec 17, 2025
@freemansw1
Member

Given the complexities this has opened, I've updated the milestone to 1.6.3. I'll try to get 1.6.2 released in the next couple of days.

@freemansw1 freemansw1 modified the milestones: v.1.6.3, v.1.6.4 Feb 3, 2026

Labels

enhancement Addition of new features, or improved functionality of existing features

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow users to have segmentation mask the cell number rather than the feature number

4 participants