-
Notifications
You must be signed in to change notification settings - Fork 91
Reports localization #2452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature/reports-localization
Are you sure you want to change the base?
Reports localization #2452
Changes from all commits
3bda1d6
d4a391f
efcdf19
805a2f1
2442ae9
1e97509
d816eb5
0caf99c
ae40516
24bd476
b883a30
7cb66d5
0364f7c
70f19f7
ecf709f
88ab86b
95da27c
b7c0e5d
62e8ba2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -8,7 +8,7 @@ | |||||
| import pygeoprocessing | ||||||
|
|
||||||
| from natcap.invest import __version__ | ||||||
| from natcap.invest import gettext | ||||||
| from natcap.invest import gettext, get_locale | ||||||
| from natcap.invest.reports import jinja_env, raster_utils, report_constants | ||||||
| from natcap.invest.reports.raster_utils import RasterDatatype, RasterPlotConfig | ||||||
| from natcap.invest.spec import ModelSpec | ||||||
|
|
@@ -131,7 +131,7 @@ def report(file_registry: dict, args_dict: dict, model_spec: ModelSpec, | |||||
| raster_path=file_registry['c_storage_bas'], | ||||||
| datatype=RasterDatatype.continuous, | ||||||
| spec=model_spec.get_output('c_storage_bas'))] | ||||||
|
|
||||||
| intermediate_raster_config_lists = [[ | ||||||
| RasterPlotConfig( | ||||||
| raster_path=file_registry[f'c_{pool_type}_bas'], | ||||||
|
|
@@ -178,7 +178,7 @@ def report(file_registry: dict, args_dict: dict, model_spec: ModelSpec, | |||||
| input_raster_config_list) | ||||||
| input_raster_caption = raster_utils.caption_raster_list( | ||||||
| input_raster_config_list) | ||||||
|
|
||||||
| outputs_img_src = raster_utils.plot_and_base64_encode_rasters( | ||||||
| output_raster_config_list) | ||||||
| output_raster_caption = raster_utils.caption_raster_list( | ||||||
|
|
@@ -214,6 +214,7 @@ def report(file_registry: dict, args_dict: dict, model_spec: ModelSpec, | |||||
|
|
||||||
| with open(target_html_filepath, 'w', encoding='utf-8') as target_file: | ||||||
| target_file.write(TEMPLATE.render( | ||||||
| locale=get_locale(), | ||||||
| report_script=model_spec.reporter, | ||||||
| invest_version=__version__, | ||||||
| report_filepath=target_html_filepath, | ||||||
|
|
@@ -230,10 +231,10 @@ def report(file_registry: dict, args_dict: dict, model_spec: ModelSpec, | |||||
| outputs_img_src=outputs_img_src, | ||||||
| outputs_caption=output_raster_caption, | ||||||
| intermediate_raster_sections=intermediate_raster_sections, | ||||||
| raster_group_caption=report_constants.RASTER_GROUP_CAPTION, | ||||||
| raster_group_caption=report_constants.raster_group_caption(), | ||||||
| output_raster_stats_table=output_raster_stats_table, | ||||||
| input_raster_stats_table=input_raster_stats_table, | ||||||
| stats_table_note=report_constants.STATS_TABLE_NOTE, | ||||||
| stats_table_note=report_constants.stats_table_note(), | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IMO, it would be a bit more intuitive to keep the strings as constants and wrap them directly in
Suggested change
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree—it doesn't seem quite right to define "constants" as functions—but I think the reason I ended up doing that is because I wanted the constants module itself to handle the localization. While your suggestion makes sense (and works), it shifts the responsibility to each reporter module, and as more reports are added I worry that it would be too easy to forget to wrap those constants in I was trying to avoid reloading the
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm, well it is nice to avoid needing to reload the module. Reloading is really a kind of hacky antipattern, and I have wondered before if we should refactor other parts of invest to wrap translated objects in a function, rather than reloading. Since the reference to I'd be curious what others think about the value of the wrapper functions as a "reminder" mechanism vs. directly calling gettext. If we keep the current design, I'd suggest renaming |
||||||
| model_spec_outputs=model_spec.outputs, | ||||||
| )) | ||||||
|
|
||||||
|
|
||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,18 +8,14 @@ | |
| import json | ||
|
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Removed some unused imports. |
||
| import logging | ||
| import multiprocessing | ||
| import os | ||
| import pprint | ||
| import sys | ||
| import textwrap | ||
| import warnings | ||
|
|
||
| import natcap.invest | ||
| from natcap.invest import datastack | ||
| from natcap.invest import set_locale | ||
| from natcap.invest import spec | ||
| from natcap.invest import ui_server | ||
| from natcap.invest import utils | ||
| from natcap.invest import models | ||
| from pygeoprocessing.utils import GDALUseExceptions | ||
|
|
||
|
|
@@ -463,8 +459,15 @@ def main(user_args=None): | |
|
|
||
| target_model = models.model_id_to_pyname[args.model] | ||
| model_module = importlib.import_module(name=target_model) | ||
| LOGGER.info('Imported target %s from %s', | ||
| model_module.__name__, model_module) | ||
|
|
||
| LOGGER.info( | ||
| f'Imported target {model_module.__name__} from {model_module}') | ||
|
|
||
| generate_report = bool(model_module.MODEL_SPEC.reporter) | ||
| if generate_report: | ||
| # Reload model module to ensure text defined in model spec is | ||
| # localized before it is passed to the report. | ||
| importlib.reload(model_module) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So in the past the
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Correct. The report uses strings from But also … your question got me wondering how we manage to localize model_spec-defined strings in the Workbench, and it looks like we are actually missing a piece of the puzzle, since I found some text that is appearing in English even though we have a Spanish translation for it. I'll be digging into that separate from this PR.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The same thing happens in |
||
|
|
||
| # We're deliberately not validating here because the user | ||
| # can just call ``invest validate <datastack>`` to validate. | ||
|
|
@@ -479,7 +482,7 @@ def main(user_args=None): | |
| generate_metadata=True, | ||
| save_file_registry=True, | ||
| check_outputs=False, | ||
| generate_report=bool(model_module.MODEL_SPEC.reporter)) | ||
| generate_report=generate_report) | ||
|
|
||
| if args.subcommand == 'serve': | ||
| ui_server.app.run(port=args.port) | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -7,7 +7,7 @@ | |
| import pandas | ||
|
|
||
| from natcap.invest import __version__ | ||
| from natcap.invest import gettext | ||
| from natcap.invest import gettext, get_locale | ||
| import natcap.invest.spec | ||
| from natcap.invest.reports import jinja_env | ||
|
|
||
|
|
@@ -222,9 +222,17 @@ def report(file_registry, args_dict, model_spec, target_html_filepath): | |
|
|
||
| na_count = exposure_geo.exposure.isna().sum() | ||
| if na_count: | ||
| # There is some redundancy here to ensure the checkbox label is | ||
| # localized properly. Note that passing an f-string to gettext will not | ||
| # work; we must explicitly use .format() to populate variable values. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the issue that "point(s)" might not translate well into other languages?
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep! Some languages use different adjective forms for singular vs. plural nouns; verbs often must change as well (that's true in English, but sometimes we can sidestep that with implied verbs, e.g., "1 point [is] missing" vs. "2 points [are] missing"); and changes in word order are not out of the question. It's typically best to provide the full context for both the singular option and the plural option. (Shorthand forms that rely on the ability to perceive punctuation, like "point(s)" or "is/are", aren't great for screenreader users either, but that's a bit of an aside—the primary motivation here is support for translations.) |
||
| if na_count == 1: | ||
| null_cb_name = gettext( | ||
| '{na_count} point missing the exposure index. Show:') | ||
| else: | ||
| null_cb_name = gettext( | ||
| '{na_count} points missing the exposure index. Show:') | ||
| null_checkbox = altair.binding_checkbox( | ||
| name=(f"{na_count} " | ||
| f"{gettext('point(s) missing the exposure index. Show:')}")) | ||
| name=(null_cb_name.format(na_count=na_count))) | ||
| show_null = altair.param(value=False, bind=null_checkbox) | ||
| null_points_chart = base_points.add_params( | ||
| show_null | ||
|
|
@@ -396,6 +404,7 @@ def report(file_registry, args_dict, model_spec, target_html_filepath): | |
|
|
||
| with open(target_html_filepath, 'w', encoding='utf-8') as target_file: | ||
| target_file.write(TEMPLATE.render( | ||
| locale=get_locale(), | ||
| report_script=model_spec.reporter, | ||
| invest_version=__version__, | ||
| report_filepath=target_html_filepath, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -6,6 +6,9 @@ None of the translations files (.pot, .po, .mo) should be manually edited by us. | |
| ### `messages.pot` | ||
| Message catalog template file. This contains all the strings ("messages") that are translated, without any translations. All the PO files are derived from this. | ||
|
|
||
| ### `babel_config.ini` | ||
| Mappings file that tells pybabel where to look when extracting messages into the message catalog. By default, pybabel will extract messages from Python source files; we need this mappings file to ensure it also extracts messages from the Jinja templates that are used for HTML reports. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for adding this explanation! |
||
|
|
||
| ### `locales/` | ||
| Locale directory. The contents of this directory are organized in a specific structure that `gettext` expects. `locales/` contains one subdirectory for each language for which there are any translations (not including the default English). The subdirectories are named after the corresponding ISO 639-1 language code. Each language subdirectory contains a directory `LC_MESSAGES`, which then contains the message catalog files for that language. | ||
|
|
||
|
|
@@ -22,14 +25,15 @@ No changes are immediately needed when we add, remove, or edit strings that are | |
| When we are ready to get a new batch of translations, here is the process. These instructions assume you have defined the two-letter locale code in an environment variable `$LL`. | ||
|
|
||
|
|
||
| 1. Run the following from the root invest directory, replacing `<LANG>` with the language code: | ||
| 1. Run the following from the root invest directory, replacing `$LL` with the language code: | ||
| ``` | ||
| pybabel extract \ | ||
| --no-wrap \ | ||
| --project InVEST \ | ||
| --version $(python -m setuptools_scm) \ | ||
| --msgid-bugs-address natcap-software@lists.stanford.edu \ | ||
| --copyright-holder "Natural Capital Alliance" \ | ||
| --mapping src/natcap/invest/internationalization/babel_config.ini \ | ||
| --output src/natcap/invest/internationalization/messages.pot \ | ||
| src/ | ||
|
|
||
|
|
@@ -39,7 +43,7 @@ pybabel update \ | |
| --input-file src/natcap/invest/internationalization/messages.pot \ | ||
| --output-file src/natcap/invest/internationalization/locales/$LL/LC_MESSAGES/messages.po | ||
| ``` | ||
| This looks through the source code for strings wrapped in the `gettext(...)` function and writes them to the message catalog template. Then it updates the message catalog for the specificed language. New strings that don't yet have a translation will have an empty `msgstr` value. Previously translated messages that are no longer needed will be commented out but remain in the file. This will save translator time if they're needed again in the future. | ||
| This looks through the source code for strings wrapped in the `gettext(...)` function and writes them to the message catalog template. Then it updates the message catalog for the specified language. New strings that don't yet have a translation will have an empty `msgstr` value. Previously translated messages that are no longer needed will be commented out but remain in the file. This will save translator time if they're needed again in the future. | ||
|
|
||
| 2. Check that the changes look correct, then commit: | ||
| ``` | ||
|
|
@@ -69,7 +73,9 @@ Then follow the "Process to update translations" instructions above, starting fr | |
| * Model titles | ||
| * `MODEL_SPEC` `name` and `about` text | ||
| * Validation messages | ||
| * Strings that appear in the UI, such as button labels and tooltip text | ||
| * Strings that appear in HTML reports, such as section headings, figure captions, and table column headers | ||
|
|
||
| Strings that appear exclusively in the Workbench UI, such as button labels and tooltip text, are also translated, but they are handled separately. See the [Workbench README](../../../../workbench/readme.md#internationalization) for details. | ||
|
|
||
| We are not translating: | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| # Extract messages from Python source files | ||
|
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I named this file |
||
|
|
||
| [python: **.py] | ||
|
|
||
| # Extract messages from Jinja HTML templates | ||
|
|
||
| [jinja2: **/templates/**.html] | ||
| ignore_tags = script,style | ||
| include_attrs = alt | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,21 +1,35 @@ | ||
| from natcap.invest import gettext | ||
|
|
||
| STATS_TABLE_NOTE = gettext( | ||
| '"Valid percent" indicates the percent of pixels that are not ' | ||
| 'nodata. Comparing "valid percent" values across rasters may help ' | ||
| 'you identify cases of unexpected nodata.' | ||
| ) | ||
|
|
||
| RASTER_GROUP_CAPTION = gettext( | ||
| 'If a plot title includes "resampled," that raster was resampled to ' | ||
| 'a lower resolution for rendering in this report. Full resolution ' | ||
| 'rasters are available in the output workspace.' | ||
| ) | ||
|
|
||
| STREAM_CAPTION_APPENDIX = gettext( | ||
| ' The stream network may look incomplete at this resolution, and ' | ||
| 'therefore it may be necessary to view the full-resolution raster ' | ||
| 'in GIS to assess its accuracy.' | ||
| ) | ||
| # All `gettext`-wrapped strings defined in this module should be returned from | ||
| # a function, _not_ defined on the module object itself, since text defined at | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure that the wrapper functions are adding anything here. It should be sufficient if we just make sure that |
||
| # the module level is not localized unless/until the module is reloaded. | ||
| # Wrapping text strings in functions avoids the need to reload the module | ||
| # and ensures text is localized properly when it is needed. | ||
|
|
||
|
|
||
| def stats_table_note(): | ||
| return gettext( | ||
| '"Valid percent" indicates the percent of pixels that are not ' | ||
| 'nodata. Comparing "valid percent" values across rasters may help ' | ||
| 'you identify cases of unexpected nodata.' | ||
| ) | ||
|
|
||
|
|
||
| def raster_group_caption(): | ||
| """Get "pre-caption" note about raster resampling.""" | ||
| return gettext( | ||
| 'If a plot title includes "resampled," that raster was resampled to ' | ||
| 'a lower resolution for rendering in this report. Full resolution ' | ||
| 'rasters are available in the output workspace.' | ||
| ) | ||
|
|
||
|
|
||
| def stream_caption_appendix(): | ||
| return gettext( | ||
| ' The stream network may look incomplete at this resolution, and ' | ||
| 'therefore it may be necessary to view the full-resolution raster ' | ||
| 'in GIS to assess its accuracy.' | ||
| ) | ||
|
|
||
|
|
||
| TABLE_PAGINATION_THRESHOLD = 10 | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This getter is needed so we can pass the current locale code to the base HTML template. If we import and reference
LOCALE_CODEdirectly, it will always be 'en' because that was its initial value—it won't change even if it has been modified byset_locale.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have done this originally, but may I suggest renaming the variable to
_LOCALE_CODEto emphasize that it's not meant to be used outside this module?