temporal: t.rast.aggregate: add -e flag to extend existing STRDS#7223
temporal: t.rast.aggregate: add -e flag to extend existing STRDS#7223Sourish-spc wants to merge 23 commits intoOSGeo:mainfrom
Conversation
In stds_import.py, the finally block used the deprecated 'gisdbase' option when calling g.mapset to switch back to the original location. This caused a deprecation warning on every t.rast.import run with the project parameter. Replaced 'gisdbase' with 'dbase' to match the current g.mapset API. Fixes OSGeo#4231
7ee4adc to
9b2110e
Compare
temporal/t.rast.aggregate/testsuite/test_aggregation_absolute.py
Outdated
Show resolved
Hide resolved
ninsbl
left a comment
There was a problem hiding this comment.
Please also install pre-commit in your local working directory to avoid CI runs for formalities...
| gs.fatal( | ||
| _( | ||
| "Space time raster dataset <%s> does not exist. " | ||
| "Cannot extend a non-existing dataset." | ||
| ) | ||
| % output | ||
| ) |
There was a problem hiding this comment.
Personally, I do not think a fatal error should be thrown here. I would say, a warning at best. In any case, this part should be consistent across all temporal tools that receive the e-flag.
There was a problem hiding this comment.
Updated: changed fatal error to warning and fall back to creating new STRDS when dataset doesn't exist.
Adds support for extending an existing Space Time Raster Dataset by running t.rast.aggregate with the -e flag. This allows recurring workflows to append new aggregated results to an existing STRDS without deleting previous data. See OSGeo#3427
| # Verify STRDS was extended | ||
| lister = SimpleModule("t.rast.list", input="B", columns="name", flags="u") | ||
| self.runModule(lister) | ||
| self.assertIn("b_ext", lister.outputs.stdout) |
There was a problem hiding this comment.
This only checks that some map with "b_ext" in the name exists. It doesn't verify the original maps are still present, that the STRDS wasn't just overwritten, and/or the total map count.
There was a problem hiding this comment.
Thank you for the review. The test has been strengthened to also verify that original maps are still present and the total map count increased after extending.
| if output_exists: | ||
| output_strds.update_command_string(dbif=dbif) |
There was a problem hiding this comment.
Does this make sense when overwriting?
There was a problem hiding this comment.
Good point, iI did not really think about it. It may make sense in any case. I did not check the behavior. Updates of the STDS should be written to the history I`ld say, which is what update_command_string does. The command should not be duplicated in the history though. That should be checked...
There was a problem hiding this comment.
Updated the condition to if not output_exists or extend to avoid duplicating the command string in history when overwriting, while still updating it correctly when extending.
|
|
||
| if __name__ == "__main__": | ||
| options, flags = gs.parser() | ||
|
|
There was a problem hiding this comment.
Please, review your code, including the PR changes.
There was a problem hiding this comment.
Thank you for the review. The extra blank line has been removed in the latest commit.
…ctions for extend support
ninsbl
left a comment
There was a problem hiding this comment.
Please keep in mind that the code in open_stds pre-dates Python 3 and has not been re-factored after since.
We should not do re-factoring as part of this PR, but don`t repeat outdated pattern in stuff you add....
| dbif, connection_state_changed = init_dbif(dbif) | ||
| msgr = get_tgis_message_interface() | ||
|
|
||
| mapset = get_current_mapset() | ||
| id = name + "@" + mapset if name.find("@") < 0 else name | ||
|
|
||
| if type in {"strds", "rast", "raster"}: | ||
| sp = dataset_factory("strds", id) | ||
| elif type in {"str3ds", "raster3d", "rast3d", "raster_3d"}: | ||
| sp = dataset_factory("str3ds", id) | ||
| elif type in {"stvds", "vect", "vector"}: | ||
| sp = dataset_factory("stvds", id) | ||
| else: | ||
| msgr.fatal(_("Unknown type: %s") % (type)) | ||
| if extend and sp.is_in_db(dbif): | ||
| # extend-if-exists: load and return existing dataset | ||
| sp.select(dbif) | ||
| if connection_state_changed: | ||
| dbif.close() | ||
| return sp |
There was a problem hiding this comment.
You duplicate quite a bit of code... You may consider adding a helper function similar to e.g. (untested, so not suitable for copy and paste!):
def get_stds(stds_id: str, stds_type: str) -> SpaceTimeDataset:
"""Return an initialized SpaceTimeDataset (STDS)."""
msgr = get_tgis_message_interface()
supported_stds_types = {
"strds": "strds",
"rast": "strds",
"raster": "strds",
"str3ds": "str3ds",
"raster3d": "str3ds",
"rast3d": "str3ds",
"raster_3d": "str3ds",
"stvds": "stvds",
"vect": "stvds",
"vector": "stvds",
}
if stds_type not in supported_stds_types:
msgr.fatal(_("Unknown type: %s") % (type))
return dataset_factory(supported_stds_types[stds_type], stds_id)]
There was a problem hiding this comment.
Added _get_stds helper function to avoid repeating the type mapping block, and ensure_id helper to avoid repeating the id construction pattern. Kept them private ( prefix) since they're only used internally in open_stds.py happy to make them public if needed.
ninsbl
left a comment
There was a problem hiding this comment.
Now we are getting somewhere. Almost done. I just had some hopefully final, minor comments.
Once those are addressed, this can be approved as far as I am concerned. I would appreciate a second opinion, especially on having the new helper function public or private...
|
|
||
| dbif, connection_state_changed = init_dbif(dbif) | ||
|
|
||
| if sp.is_in_db(dbif) and overwrite is False: |
There was a problem hiding this comment.
The old syntax is preferable. See: https://docs.astral.sh/ruff/rules/collapsible-if/
| dbif, connection_state_changed = init_dbif(dbif) | ||
| output_exists = sp.is_in_db(dbif) | ||
|
|
||
| if output_exists: |
There was a problem hiding this comment.
See the comment above about collapsible-if...
| """ | ||
| msgr = get_tgis_message_interface() | ||
|
|
||
| if "@" in name: |
There was a problem hiding this comment.
You can skip this check if you _ensure-id before.
If ensure_id is a public function and used in t.rast.aggregate, the new functions you add here could expect an id as input and those checks are not needed at all. Yet, they are insignificant and may be safer to double-check. No strong opinion from my side on this...
| ) | ||
|
|
||
| # Check original maps are still present (STRDS was not overwritten) | ||
| for original_map in original_maps.strip().split(os.linesep): |
There was a problem hiding this comment.
Maybe use all() or a set() based check and only one assertion...
| self.assertIn(original_map, extended_maps) | ||
|
|
||
| # Check total map count increased (original 3 + extended 3 = 6) | ||
| info = SimpleModule("t.info", flags="g", input="B") |
There was a problem hiding this comment.
You get the number of maps from t.rast.list. No need to run t.info in addition for this check. If you want to check that the STRDS history is updated properly (command added to comments) you need to.info...
Description
This PR adds support for extending an existing Space Time Raster Dataset (STRDS) in
t.rast.aggregateby introducing a new-eflag, following the same pattern implemented in #3798 fort.rast.neighbors.Changes
-eflag (Extend existing space time raster dataset) to module interface-eflag is set instead of always creating a new oneupdate_command_string()when extending existing STRDStest_extend_existing_strdsto existing test suiteUsage
First run — creates new STRDS:
t.rast.aggregate input=daily_rain output=monthly_rain granularity="1 month" method=average basename=result
Second run — extends existing STRDS:
t.rast.aggregate input=daily_rain output=monthly_rain granularity="1 month" method=average basename=result_new -e
Related
Closes part of #3427
See also #3798