Update node serialization/deserialization and other Pydantic issues #6990
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6990      +/-   ##
==========================================
- Coverage   79.60%   79.52%   -0.07%
==========================================
  Files         566      566
  Lines       43538    43700     +162
==========================================
+ Hits        34655    34749      +94
- Misses       8883     8951      +68
@khsrali regarding one of the failed tests... In this PR, I lift some validation to the top of the constructor (of stash data classes - see 91e4ad5), to avoid any operations if the object is bound to fail. However, this seems to introduce failures in testing them. It suggests that perhaps there is some order to the operations, though I don't see it. Can you please comment? |
It was checking for |
@GeigerJ2 okay, this is ready for others to inspect. I did my best to isolate the commits and provided comments on each. Happy to discuss further. Pinging also @agoscinski @superstar54 if interested. Pinging @sphuber for input/feedback, if he has time. |
It may be possible to rely on the post models of aiida-restapi as a reference for defining ORM constructor parameters, as the post models are intended to represent serialized objects passed to the REST API for object construction. Looking into this. |
@danielhollas what is this about? Nevermind. I see that importing |
Nice, the system works. :-) Feel free to improve the error message here to make it more obvious that this is about the |
@danielhollas what do you think about ignoring Ruff N806 - "~variable should be lowercase"? See case below. |
Do you mean ignoring locally (fine) or globally? |
Global. There are many cases where the variable is a class, not an instance. I've pushed this in my last commit just to verify that it works. Okay with removing it in favor of local handling, but would like to hear the reason against a global N806 rule.
Update: Nice. Ignoring N806 globally raised a whole lot of RUF100 due to the codebase being littered with local N806 rules. I'd say that supports my case 🙂
@danielhollas done for tonight. Will revisit this in the morning 😴 |
Yeah, running Seems fine to remove it, or alternatively, use the e.g. something like this in pyproject.toml In any case please open a separate PR for that so we don't pollute this one with bikeshedding discussion and unrelated changes.
Thanks @danielhollas. Then for this one, since there are only a few cases in my PR, I will locally ignore them. Will open a PR for the pattern handling shortly after. |
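For readers unfamiliar with the rule, here is a small illustration of the kind of code where N806 fires on a class-valued local. This is a hypothetical example, not the case from this PR; `Shape`, `Circle`, `REGISTRY` and `build` are made-up names.

```python
class Shape:
    def __init__(self, name: str) -> None:
        self.name = name


class Circle(Shape):
    pass


# Hypothetical lookup table from a string key to a class
REGISTRY = {'circle': Circle}


def build(kind: str, name: str) -> Shape:
    # The local holds a class, not an instance, so CapWords reads naturally,
    # but N806 flags any non-lowercase variable inside a function.
    ShapeClass = REGISTRY[kind]  # noqa: N806
    return ShapeClass(name)
```

If N806 is ignored globally instead, local `# noqa: N806` comments like the one above become unused and are reported as RUF100.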
Thanks @mikibonacci for the assist 🙏
Pydantic provides `ser_json_bytes` and `val_json_bytes` via its model configuration. Here we set both to 'base64', globally stating that `bytes` are to be (de)serialized as 'base64'. This covers `SinglefileData.contents`, `ArrayData.arrays`, and `Node.repository_content`.
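As a rough illustration of what (de)serializing `bytes` as base64 means for a JSON payload, here is a standard-library sketch. It does not use Pydantic and is not the PR's code; the helper names are hypothetical, and the URL-safe alphabet is an assumption made for the example.

```python
import base64
import json


def dump_bytes_field(name: str, value: bytes) -> str:
    """Serialize one bytes field to a JSON string, encoding the bytes as base64 text."""
    encoded = base64.urlsafe_b64encode(value).decode('ascii')
    return json.dumps({name: encoded})


def load_bytes_field(name: str, payload: str) -> bytes:
    """Inverse of dump_bytes_field: decode the base64 text back to raw bytes."""
    return base64.urlsafe_b64decode(json.loads(payload)[name])


raw = b'\x00\xffsome binary content'
payload = dump_bytes_field('contents', raw)
restored = load_bytes_field('contents', payload)
```

Base64 text is roughly a third larger than the bytes it encodes, which matters when the field holds whole files.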
'computer': 'dbcomputer_id',
'user': 'aiidauser_id',
@mikibonacci this is what I meant by adding mappings in the backend.
ALIAS_MAP = {
    'id': 'pk',
    'dbcomputer_id': 'computer',
    'user_id': 'user',
    'dbnode_id': 'node',
}
Used later in generating the QB warning for unrecognized keys.
+ keys = []
+ for key in alias._sa_class_manager.mapper.c.keys():
+     if colalias := ALIAS_MAP.get(key):
+         keys.append(f'{key} (alias: {colalias})')
+     else:
+         keys.append(key)
  raise ValueError(
-     '{} is not a column of {}\nValid columns are:\n{}'.format(
-         colname, alias, '\n'.join(alias._sa_class_manager.mapper.c.keys())
-     )
+     '{} is not a column of {}\nValid columns are:\n{}'.format(colname, alias, '\n'.join(keys))
@mikibonacci happy to modify this. This preserves the old keys, but places the schema keys upfront. For example,
orm.QueryBuilder().append(
    orm.Node,
    project=['unrecognized'],
).first()
yields
ValueError: unrecognized is not a column of aliased(DbNode)
Valid columns are:
pk (alias for id)
uuid
node_type
process_type
label
description
ctime
mtime
attributes
extras
repository_metadata
computer (alias for dbcomputer_id)
user (alias for user_id)
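The message above can be reproduced in isolation with a minimal sketch. The helper names and the plain column list are hypothetical stand-ins for the SQLAlchemy mapper columns; only the alias mapping and the message wording are taken from this thread.

```python
# Backend column name -> schema-facing name (subset of the mapping in the diff)
ALIAS_MAP = {'id': 'pk', 'dbcomputer_id': 'computer', 'user_id': 'user'}


def describe_columns(columns, alias_map):
    """Return one display line per column, putting the schema name first when one exists."""
    lines = []
    for key in columns:
        alias = alias_map.get(key)
        lines.append(f'{alias} (alias for {key})' if alias else key)
    return lines


def unknown_column_message(colname, entity, columns, alias_map=ALIAS_MAP):
    """Build the error text listing the valid columns of an entity."""
    valid = '\n'.join(describe_columns(columns, alias_map))
    return f'{colname} is not a column of {entity}\nValid columns are:\n{valid}'


message = unknown_column_message(
    'unrecognized', 'aliased(DbNode)', ['id', 'uuid', 'dbcomputer_id', 'user_id']
)
```

Keeping the backend key in parentheses means the old names stay discoverable while the schema names are what the reader sees first.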
Some parts of the ORM module do not pass a serialization round trip. This PR addresses this, but further discussion is needed regarding what should actually be (de)serialized. If interested, see aiidateam/AEP#40 and the discussion on #6255. Some discussion of the mismatch between an object's constructor args and its Model fields can be found here.
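To make "passing a serialization round trip" concrete, here is a minimal standard-library sketch. `Entity`, `serialize` and `passes_round_trip` are hypothetical stand-ins, not AiiDA's API (only the name `from_serialized` is borrowed from this PR); the point is just that deserializing a serialized object should reproduce the same serialized form.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Entity:
    label: str
    description: str = ''
    extras: dict = field(default_factory=dict)

    def serialize(self) -> dict:
        """Dump all fields to a plain dictionary."""
        return asdict(self)

    @classmethod
    def from_serialized(cls, data: dict) -> 'Entity':
        # Works only while every serialized key is also a constructor argument
        return cls(**data)


def passes_round_trip(entity: Entity) -> bool:
    """Check that serialize -> from_serialized -> serialize is the identity."""
    serialized = entity.serialize()
    return Entity.from_serialized(serialized).serialize() == serialized
```

In this sketch, a serialized key with no matching constructor argument makes `cls(**data)` raise a `TypeError`, which is one way a mismatch between constructor args and model fields shows up.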
It would also be good to discuss whether the use of pydantic should be extended to ORM instance creation in general, not only by way of `from_serialized`. This is not implemented in this PR (and is out of scope) but can be addressed in a follow-up PR if deemed "correct".
Open questions
- `repository_content`. The current implementation uses `base64` encoding/decoding, but this is clearly not ideal for large files. There was some discussion of switching to links to online storage of the files. TBD
Updates
The PR now also introduces the following:
- `Data` plugins via `attributes`
- `attributes` in `Data` plugins (not allowed in the backend node)
- `InputModel` - a derived view of the defined entity `Model` suitable for `Entity` creation
Note for PR reviewers
There are a few changes that are likely out of scope. These will move to dedicated PRs prior to merge of this PR.