asmacdo
left a comment
a couple of questions to start off with-- I'll have another look tomorrow.
waxlamp
left a comment
Thank you, @candleindark! One thing to change listed below. And I've asked @jjnesbitt to provide a more comprehensive review on this PR as well.
|
@jjnesbitt I have put in changes requested by @waxlamp regarding the command line interface. With the latest push, two unrelated tests failed. Can they be related to the latest changes in dandi-cli? |
|
I've pushed a commit which aims to simplify the logic in this management command. Namely:
IMO this is ready, maybe @waxlamp should take another look. |
|
As per the meeting of 2025-03-10 (involving @yarikoptic, @candleindark, @jjnesbitt, @kabilar, and @waxlamp), we decided to reduce the scope of this change to just the draft versions; @jjnesbitt will work on adjusting the PR to that narrowed scope. @candleindark, @yarikoptic: are we planning to release a new schema version that has restrictions on extra fields? If we apply the metadata correction to draft versions, then I suppose whether or not such a schema is released, there will not be any further validation errors. |
Yes, some minor changes to the schema will be introduced by this PR
I can't say that, after the corrections to the draft versions, there will be no more validation errors in those versions. I can only say that if you make the corrections to the draft versions, there will not be validation errors in those versions due to the particular change brought by that PR. dandi/dandi-schema#266 (comment) provides an analysis of validation errors in the metadata instances caused by the changes in the PR, but not an analysis of validation errors in the metadata instances in general. |
Sorry, this was what I meant. |
|
Placing this into draft mode until #2224 is merged, at which point this PR can be updated to make use of it. |
|
Do you think you would have time to work out a solution for #2224 (it is an issue), or should someone else try to approach it to facilitate a potentially (not necessarily "likely" ;) ) faster resolution? |
0f84e38 to e79b486
Provide solution to correct the corruption of `Affiliation` JSON objects documented in dandi/dandi-schema#276
This makes the default behavior require the user to specify a particular dandiset version to apply the correction to. Only when the `--all` flag is provided should the command apply the correction to all dandiset versions.
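The interface described in this commit message can be sketched with plain `argparse`. This is only an illustration, not the command as merged: the positional argument names `dandiset` and `version`, the `resolve_target` helper, and the error messages are assumptions; only the `--all` flag comes from the commit message.

```python
import argparse
from typing import Optional, Sequence, Tuple


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical argument names; only `--all` is taken from the commit message.
    parser = argparse.ArgumentParser(prog="sketch")
    parser.add_argument("dandiset", nargs="?", help="dandiset identifier")
    parser.add_argument("version", nargs="?", help="version of that dandiset")
    parser.add_argument(
        "--all",
        action="store_true",
        dest="all_versions",
        help="apply the correction to all dandiset versions",
    )
    return parser


def resolve_target(argv: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Return (dandiset, version) for a targeted run, or None for --all."""
    parser = build_parser()
    args = parser.parse_args(argv)
    has_target = args.dandiset is not None and args.version is not None
    if args.all_versions:
        if args.dandiset is not None or args.version is not None:
            parser.error("--all cannot be combined with a specific version")
        return None
    if not has_target:
        # Default behavior: a specific dandiset version is required.
        parser.error("specify a dandiset and a version, or pass --all")
    return (args.dandiset, args.version)
```

With this shape, running with no arguments is an error rather than a silent bulk update, so touching every version is always an explicit choice.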
- Don't allow correct function configuration
- Remove verbose error handling logic
- Write manifest files synchronously
- Only support correction of the `Affiliation` schema key
- Remove `find_objs` tests against generalized schema key
- Use transactions to couple logic of metadata save and manifest file generation
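For readers unfamiliar with the approach, here is a rough sketch of how a `find_objs`-style helper might locate `Affiliation` objects inside a metadata instance. The signature below is a guess, not the one in this PR. It also assumes that objects carry their type in a `schemaKey` field, that the corruption takes the form of extra keys, and that the allowed key set is the one listed in `ALLOWED_AFFILIATION_KEYS`. All of these are illustrative assumptions; dandi/dandi-schema#276 has the actual details.

```python
from typing import Any, Iterator

# Assumed for illustration only; the real allowed fields come from the schema.
ALLOWED_AFFILIATION_KEYS = {"schemaKey", "name", "identifier"}


def find_objs(instance: Any, schema_key: str) -> Iterator[dict]:
    """Yield every dict in a JSON-like structure whose schemaKey matches."""
    if isinstance(instance, dict):
        if instance.get("schemaKey") == schema_key:
            yield instance
        for value in instance.values():
            yield from find_objs(value, schema_key)
    elif isinstance(instance, list):
        for item in instance:
            yield from find_objs(item, schema_key)


def prune_affiliations(metadata: dict) -> int:
    """Drop unexpected keys from Affiliation objects in place.

    Returns the number of objects that were changed, so a caller can skip
    saving metadata that was not corrupted.
    """
    changed = 0
    # Materialize the matches first so dicts are not mutated mid-traversal.
    for obj in list(find_objs(metadata, "Affiliation")):
        extra = set(obj) - ALLOWED_AFFILIATION_KEYS
        for key in extra:
            del obj[key]
        if extra:
            changed += 1
    return changed
```

Returning a change count fits the stated goal of only correcting corrupted dandisets: when it is zero, the save and manifest regeneration can be skipped.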
e79b486 to b02e2ff
|
@waxlamp @mvandenburgh This is ready to go now. |
mvandenburgh
left a comment
Just some suggested optimizations to memory usage. It's probably not needed due to the number of Versions not being super high, but the heroku run dynos have pretty limited memory so it seems logical to do.
Otherwise, LGTM
Co-Authored-By: Mike VanDenburgh <michael.vandenburgh@kitware.com>
|
So this was merged and now we have the command. Who is allowed to run it? Could I? |
It is only in staging at the moment, not in production. I am planning to run in staging, and then once this is deployed in production (blocked by vue3), run there as well. |
|
🚀 PR was released in |
|
This has been successfully applied in staging and production. |
Metadata correction
This PR provides a solution to correct the corruptions in metadata documented in dandi/dandi-schema#276.
The solution is implemented in two parts.
correct_metadata, to correct dandiset metadata.
Extensive tests are provided for the helper function and its supporting function. However, because I am not familiar with this repo and Django in general, I am not able to provide tests for the command, which interacts with the database. Advice and additional tests are very much appreciated.
The command can be run either on a targeted dandiset at a particular version or on all versions of all dandisets. Running on all dandiset versions will modify only the corrupted ones. If running only on targeted dandiset versions is preferable, please let me know, and I will provide the list of corrupted dandiset versions. (Additionally, would it be better to change the interface of the command to accept a file listing the corrupted dandiset versions?)