-
Notifications
You must be signed in to change notification settings - Fork 5
Transition value JSON from old to new flat format #539
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add some notes/questions as review comments.
spinedb_api/alembic/versions/a973ab537da2_reencode_parameter_values.py
Outdated
Show resolved
Hide resolved
spinedb_api/alembic/versions/a973ab537da2_reencode_parameter_values.py
Outdated
Show resolved
Hide resolved
TimePattern was implemented as annotated type for schema generation, however this is not distinguishable at runtime, so add an alternate dataclass implementation.
make columns instead of records from old format parameter_value
- update only rows that need changes - batch row updates - convert types to `table` where necessary - add `transition_data` function override for debugging
Remove unnecessary nullable types, and unused union type (ValueTypes)
Do not use pandas as intermediate step, instead transform ourselves from record based to column based - easier for type inspection. TODO: factor out into specific for data transition and generally useful to inserting into spinedb from outside sources when using spinedb_api as a library.
|
I think we need to allow null values in
This raises the question whether we need the index arrays as separate types at all. I also reintroduced custom conversion for some types into |
|
Hi, Strictly speaking, the index column and value column distinction is not needed. But I would like to have them because then downstream code can make useful assumptions. But before I get into that, I think there's a misunderstanding here, mostly because I think I didn't document this anywhere. There is no requirement for all but the last column to be an index column. So this is acceptable:
It would mean The reason this is useful, say, when converting to a dataframe (or something else in a user script), we can treat Of course this is not feasible when working only with the Anyway, I think we should discuss this a bit further. I'll email you. Cheers, |
These are very good points. I agree that the first index column (col1 in the example) should not be nullable. We should indeed keep the index column types. |
- Moved dump_db_value(), from_database_to_dimension_count(), join_value_and_type() and split_value_and_type() to a new module incomplete_values. - Added JSONConverter to importer's convert functions. This replaces the functionality where we tried to convert every parameter value string to parameter value.
parameter_value from old to new flat formatThe value JSON is not compatible with previous versions.
GAMS version check runs GAMS executable to resolve its version. The executable fails if executed in read-only directory, so we create a temporary directory to avoid that.
This reverts commit 31588a0.
This comment was marked as resolved.
This comment was marked as resolved.
|
|
||
|
|
||
| # types | ||
| class TimePeriod(str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this class be replaced by typing.NewType?
TimePeriod = NewType("TimePeriod", str)
Migrate old JSON
parameter_values into the new schema that is more like a flat table (fortime_series,array, andmap) and singularpyarrowcompatible values fordate_time,duration, andtime_pattern.No related issue
Checklist before merging
Authors
Since GH doesn't support setting multiple people as author in a PR, documenting it here
@OleMussmann, @suvayu