WT-854 Support Fallback Languages In Wagtail #1122

Open
dchukhin wants to merge 57 commits into main from WT-854-language-fallback-for-pages
Collaborator

@dchukhin dchukhin commented Mar 5, 2026

One-line summary

This pull request implements support for fallback languages for Wagtail pages, according to the alias locale proposal document.

Significant changes and points to review

URLs (Wagtail and non-Wagtail) should now be served according to the following logic:

  1. if a URL matches a page, that page content is served
  2. if the page doesn't exist, then we try to serve the content for the page from the fallback locale (ex: if an es-AR page doesn't exist, then serve the relevant es-MX page, since es-MX is the fallback locale for es-AR)
  3. the rest of the page serving logic should be the same as before
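
The serving order above can be sketched in plain Python. This is only an illustration, not the code in this pull request: `PAGES`, `FALLBACKS`, and `resolve_content_locale` are hypothetical names, and the real implementation works with Wagtail pages and middleware rather than dictionaries.

```python
# Hypothetical data: which locales have a translation of each page path.
PAGES = {
    "somepage": {"en-US", "es-MX"},
    "otherpage": {"en-US"},
}

# Hypothetical alias -> fallback map (es-MX is the fallback for es-AR).
FALLBACKS = {"es-AR": "es-MX", "es-CL": "es-MX"}


def resolve_content_locale(url_locale, path):
    """Return the locale whose content is served at /<url_locale>/<path>.

    Returns None when neither the requested locale nor its fallback has
    the page, in which case the pre-existing serving logic takes over.
    """
    available = PAGES.get(path, set())
    # Step 1: the URL matches a page in the requested locale.
    if url_locale in available:
        return url_locale
    # Step 2: try the fallback locale; the URL itself does not change.
    fallback = FALLBACKS.get(url_locale)
    if fallback in available:
        return fallback
    # Step 3: fall through to the existing behaviour.
    return None
```

With this data, `resolve_content_locale("es-AR", "somepage")` yields the es-MX content for the /es-AR/ URL, while a page that exists only in en-US is left to the existing logic.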

Major changes:

  • CMSLocaleFallbackMiddleware now catches a 404 response (for example, if a user requests /es-AR/somepage, and that page doesn't exist), and tries to find the page in the fallback locale (the somepage in the es-MX locale). If it's found, then the es-MX somepage content is served at the /es-AR/somepage URL. Note: this is not a redirect.
  • the i18n context processor adds the CANONICAL_LANG context variable, which is used to set the canonical href and indexing data
  • the FALLBACK_LOCALES setting has been updated to correctly define locales and their fallbacks
  • a migration adds any alias (non-fallback) locales to the database
  • a find_fallback_page_for_locale() function finds the translation of a given page in a locale's fallback locale (ex: looking up the fallback for "somepage" in the es-AR locale returns "somepage" in the es-MX locale)
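
To illustrate the fallback lookup described in the last bullet, here is a minimal sketch that assumes translations of a page share a `translation_key` (as Wagtail's translatable models do). `PageStub`, `find_fallback_page`, and the shape of `FALLBACK_LOCALES` are assumptions for illustration; the real `find_fallback_page_for_locale()` queries the database and may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical alias -> fallback map; the real setting may be shaped differently.
FALLBACK_LOCALES = {"es-AR": "es-MX", "es-CL": "es-MX"}


@dataclass(frozen=True)
class PageStub:
    """Stand-in for a Wagtail page: translations share a translation_key."""

    translation_key: str
    locale: str
    slug: str


def find_fallback_page(pages, page, locale_code) -> Optional[PageStub]:
    """Return the translation of `page` in the fallback locale of `locale_code`."""
    fallback_code = FALLBACK_LOCALES.get(locale_code)
    if fallback_code is None:
        # Not an alias locale, so there is nothing to fall back to.
        return None
    for candidate in pages:
        same_page = candidate.translation_key == page.translation_key
        if same_page and candidate.locale == fallback_code:
            return candidate
    return None
```

Given an en-US "somepage" and its es-MX translation, asking for the es-AR fallback returns the es-MX page; asking for a locale with no fallback returns `None`.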

Issue / Bugzilla link

WT-854

Testing

Note: this pull request covers a lot of scenarios, so please look carefully, and consider if anything has been missed.
We need to make sure the following are addressed appropriately (expected content is visible, expected URL appears, expected canonical URL defined, expected indexing code):

  • Wagtail pages:

    • serving a Page at a Locale that has a translation
    • serving a Page at a Locale that has no translation, but has a fallback locale that has a translation
    • serving a Page at a Locale that has no translation and has a fallback locale, but the fallback locale has no translation either
    • serving a Page at a Locale that has no translation, and no fallback locale
  • non-Wagtail pages should have consistent behavior:

    • serving a view at a Locale that has Fluent files
    • serving a view at a Locale that does not have Fluent files
    • serving a view at a Locale that does not have Fluent files, but has a fallback locale that has Fluent files
    • serving a view at a Locale that does not have Fluent files and has a fallback locale, but the fallback locale has no Fluent files either
    • serving a view that has a static view and a Wagtail page - should prefer Wagtail page

dchukhin added 26 commits March 3, 2026 10:10

…ocale when the requested locale page is not found

Note: this logic shouldn't be needed, since all of the locales should be present in the database, but in case one of the locale records is removed from the database, but the locale is still referenced in the code (for example, in the settings), this logic will handle it.

CANONICAL_LANG matches LANG, unless a page is served for a fallback locale. In this case, the template can determine which context variable (LANG or CANONICAL_LANG) to use in each relevant place.

when a user requests a page and is given content for a fallback locale, the page links in the content should match the requested URL's locale.

…ent from alias locale URL when alias locale has no content

… alias Locale does not exist in the database

…when alias Locale does not exist in the database

this fixes an error where translating a page into the new Locales was causing a server error, because wagtail-localize was not able to correctly find the root page in the new Locale.

@dchukhin dchukhin marked this pull request as draft March 5, 2026 16:54
@codecov

codecov bot commented Mar 5, 2026

Codecov Report

❌ Patch coverage is 93.64162% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.38%. Comparing base (701242a) to head (7a1cda1).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| springfield/settings/base.py | 40.00% | 3 Missing ⚠️ |
| springfield/cms/wagtail_hooks.py | 84.61% | 2 Missing ⚠️ |
| springfield/cms/wagtail_urls.py | 85.71% | 2 Missing ⚠️ |
| springfield/cms/blocks.py | 96.15% | 1 Missing ⚠️ |
| springfield/cms/middleware.py | 94.44% | 1 Missing ⚠️ |
| springfield/cms/utils.py | 96.96% | 1 Missing ⚠️ |
| springfield/cms/views.py | 97.22% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1122      +/-   ##
==========================================
+ Coverage   79.16%   79.38%   +0.22%     
==========================================
  Files         134      135       +1     
  Lines        8323     8466     +143     
==========================================
+ Hits         6589     6721     +132     
- Misses       1734     1745      +11     

@dchukhin dchukhin requested a review from stevejalim March 16, 2026 18:29
@stevejalim
Contributor

I would recommend putting this on a demo server; I'll get it on demo-1 now

Is this on fxc-demo-1 already? Happy to try it out there if so

@janbrasna
Collaborator

Not currently; demo1 was taken for a spin yesterday with QR codes. That's shipped now, so this can be deployed to demo1 again, I guess.

@dchukhin
Collaborator Author

Cool, I'll push this to demo-1 again

Contributor

Copilot AI left a comment

Pull request overview

Implements alias-locale fallback support across Wagtail and Fluent-rendered pages so that missing page translations can transparently serve fallback-locale content (without redirect), while ensuring canonical/hreflang/indexing reflect the content locale rather than the URL locale.

Changes:

  • Add Wagtail alias-locale fallback serving via a custom wagtail_serve_with_locale_fallback view + custom Wagtail URLconf, and extend CMSLocaleFallbackMiddleware to serve fallback pages on 404s.
  • Update canonical/hreflang behavior using a new CANONICAL_LANG context var plus explicit alias-locale filtering for alternates/sitemaps.
  • Expand settings/tests/migrations to support alias locales (new locale codes, fallback map, admin UI badges, and locale DB records migration).

Reviewed changes

Copilot reviewed 33 out of 33 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| springfield/urls.py | Switch Wagtail catch-all to custom URLconf that uses alias-locale fallback serve wrapper. |
| springfield/sitemaps/tests/test_utils.py | Add sitemap test ensuring alias-locale URLs aren’t emitted as indexable entries. |
| springfield/settings/base.py | Update FALLBACK_LOCALES, add alias locales to supported langs, add localize dashboard filter options. |
| springfield/cms/wagtail_urls.py | New custom Wagtail URL module that replaces Wagtail’s wagtail_serve route with fallback-aware serve view. |
| springfield/cms/wagtail_hooks.py | Inject alias-locale map into admin + load JS for locale “alias → X” badges. |
| springfield/cms/views.py | Add wagtail_serve_with_locale_fallback and helpers for pre-Wagtail interception + fallback serving. |
| springfield/cms/utils.py | Add find_fallback_page_for_locale and split locale computation into (all vs content) locales. |
| springfield/cms/tests/test_utils.py | Add unit tests for alias locale expansion and fallback page lookup. |
| springfield/cms/tests/test_models.py | Update tests to patch the new compute_cms_page_locales API and assert content locales are set. |
| springfield/cms/tests/test_middleware.py | Add extensive middleware tests for alias-locale serving, view restrictions, and Accept-Language redirect behavior. |
| springfield/cms/tests/test_locale_fallback_rendering.py | New integration tests for canonical/noindex/hreflang output across CMS + non-CMS pages. |
| springfield/cms/tests/test_decorators.py | Update prefer_cms tests to account for alias locale expansion in CMS locale annotations. |
| springfield/cms/tests/test_blocks.py | Add link-block tests for alias-locale URL construction when locale DB records may be missing. |
| springfield/cms/tests/test_alias_locale_url_routing.py | New tests ensuring alias-locale fallback respects Django URL routing vs Wagtail catch-all/prefer_cms. |
| springfield/cms/tests/templates/test-hreflang.html | Minimal template used by hreflang alternate rendering tests. |
| springfield/cms/tests/conftest.py | Add autouse fixture to reset Django translation state between tests. |
| springfield/cms/models/base.py | Patch requests with both all locales and content locales for CMS pages. |
| springfield/cms/migrations/0059_create_alias_locale_records.py | New migration to create alias Locale DB records and non-live locale root pages. |
| springfield/cms/middleware.py | Extend 404 middleware to transparently serve fallback-locale CMS pages for alias locales before redirect logic. |
| springfield/cms/decorators.py | Update prefer_cms to use fallback-aware Wagtail serve wrapper. |
| springfield/cms/blocks.py | Update link block URL generation and validation to use fallback-aware Wagtail serve wrapper; handle alias URL prefixes. |
| springfield/base/tests/test_helpers.py | Add tests for expanding alias locales into locale options for Fluent-only pages. |
| springfield/base/tests/test_context_processors.py | Add tests for CANONICAL_LANG behavior in the i18n context processor. |
| springfield/base/templatetags/helpers.py | Expand locale options with alias locales when fallback canonical locale is available (Fluent-only). |
| springfield/base/templates/includes/canonical-url.html | Use CANONICAL_LANG for canonical URL, add noindex for alias-served content, and skip non-content alias locales in hreflang. |
| springfield/base/context_processors.py | Add CANONICAL_LANG derived from request.content_locale when present. |
| requirements/prod.txt | Bump wagtail-localize-dashboard to 0.3.0 (hashed). |
| requirements/prod.in | Bump wagtail-localize-dashboard to 0.3.0. |
| requirements/dev.txt | Bump wagtail-localize-dashboard to 0.3.0 (hashed). |
| media/static-bundles.json | Register new admin JS bundle for locale alias badges. |
| media/js/cms/wagtailadmin-locale-badges.js | New JS that decorates Wagtail locale list rows with “alias → fallback” badges. |
| lib/l10n_utils/tests/test_base.py | Add tests for alias-locale behavior in Fluent rendering and for locale preference order. |
| lib/l10n_utils/__init__.py | Add non-CMS alias-locale transparent serving and track URL-locale separately to avoid incorrect redirects; prioritize content_locale in get_locale(). |

(False, "", False, "", "en-US", "en-US"),
),
)
def test_get_locale_prefernce_order(locale_is_set, locale_value, content_locale_is_set, content_locale_value, language_code_settting, expected):
Copilot AI Mar 23, 2026

Typo in the test name: test_get_locale_prefernce_order should be test_get_locale_preference_order for clarity and easier searching/grepping.

Suggested change
- def test_get_locale_prefernce_order(locale_is_set, locale_value, content_locale_is_set, content_locale_value, language_code_settting, expected):
+ def test_get_locale_preference_order(locale_is_set, locale_value, content_locale_is_set, content_locale_value, language_code_settting, expected):

assert sorted(Page.objects.filter(translation_key=en_us_test_page.translation_key).values_list("locale__language_code", flat=True)) == sorted(
translation_locales
)
# Since "en-MX" is a fallback locale for "es-AR" and "es-CL", we expect that
Copilot AI Mar 23, 2026

Typo in the comment: this scenario is about es-MX being the fallback locale, not en-MX. This is confusing given the surrounding assertions are for Spanish locales.

Suggested change
- # Since "en-MX" is a fallback locale for "es-AR" and "es-CL", we expect that
+ # Since "es-MX" is a fallback locale for "es-AR" and "es-CL", we expect that

middleware = CMSLocaleFallbackMiddleware(get_response=get_404_response)
response = middleware(request)

# The user is served the pt_br_page content at the URL for the pt-PR locale.
Copilot AI Mar 23, 2026

Typo in this comment: it refers to the "pt-PR" locale, but the test (and the settings override) are for the "pt-PT" alias locale.

Suggested change
- # The user is served the pt_br_page content at the URL for the pt-PR locale.
+ # The user is served the pt_br_page content at the URL for the pt-PT locale.

if locale_is_set:
request.locale = locale_value
if content_locale_is_set:
request.locale = content_locale_value
Copilot AI Mar 23, 2026

In this parametrized test, content_locale_is_set branch assigns request.locale = content_locale_value, which overwrites the URL locale and never sets request.content_locale. That means the test is not actually validating the precedence logic added to get_locale() (content_locale should win over locale). Update the test to set request.content_locale when content_locale_is_set is true so the assertions match the intended behavior.

Suggested change
- request.locale = content_locale_value
+ request.content_locale = content_locale_value
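
As a rough sketch of what the corrected test is meant to exercise, the snippet below builds request stand-ins with `locale` and `content_locale` set independently and checks the precedence. `get_locale_sketch` is a hypothetical stand-in for the real `get_locale()`, and every case except the last row (which mirrors the tuple quoted above) is invented for illustration.

```python
from types import SimpleNamespace


def get_locale_sketch(request, language_code):
    """Stand-in for get_locale(): content_locale, then locale, then the default."""
    return (
        getattr(request, "content_locale", None)
        or getattr(request, "locale", None)
        or language_code
    )


# (locale_is_set, locale_value, content_locale_is_set, content_locale_value,
#  language_code, expected). Only the last row comes from the quoted test.
CASES = [
    (True, "es-AR", True, "es-MX", "en-US", "es-MX"),
    (True, "es-AR", False, "", "en-US", "es-AR"),
    (False, "", True, "es-MX", "en-US", "es-MX"),
    (False, "", False, "", "en-US", "en-US"),
]


def build_request(locale_is_set, locale_value, content_locale_is_set, content_locale_value):
    request = SimpleNamespace()
    if locale_is_set:
        request.locale = locale_value
    if content_locale_is_set:
        # Set content_locale itself, so the URL locale is left untouched.
        request.content_locale = content_locale_value
    return request


def run_cases():
    """Return one boolean per case: did the sketch return the expected locale?"""
    results = []
    for loc_set, loc, content_set, content, language_code, expected in CASES:
        request = build_request(loc_set, loc, content_set, content)
        results.append(get_locale_sketch(request, language_code) == expected)
    return results
```

The first case is the one the original test could not distinguish: with both attributes set, the content locale should win while `request.locale` keeps the URL locale.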

Contributor

@stevejalim stevejalim left a comment

This is looking great. Are there any pages on demo1 where I can see this in action?

I've left a bunch of comments, and have a general question: my read of this is that we're kind of relying on the idea of an unpublished site root as an indicator of whether a locale is an alias or not. That's a bit brittle and risks being broken by someone publishing the site root. Could we look at settings.FALLBACK_LOCALES instead, for something enforced in code? Wondering what you think @dchukhin
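
One way to read this suggestion is a helper that decides alias status from the fallback map alone, ignoring page state. The sketch below assumes the setting maps each alias locale code to its fallback code; the actual shape of settings.FALLBACK_LOCALES is not shown in this conversation, and `is_alias_locale` and `fallback_for` are hypothetical names.

```python
# Assumed shape: alias locale code -> fallback locale code.
FALLBACK_LOCALES = {"es-AR": "es-MX", "es-CL": "es-MX", "pt-PT": "pt-BR"}


def is_alias_locale(language_code, fallback_locales=FALLBACK_LOCALES):
    """A locale is an alias if, and only if, the settings give it a fallback."""
    return language_code in fallback_locales


def fallback_for(language_code, fallback_locales=FALLBACK_LOCALES):
    """Return the fallback locale code, or None for a non-alias locale."""
    return fallback_locales.get(language_code)
```

Because neither function consults the database, publishing or unpublishing a locale's root page cannot change the answer.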


def create_alias_locales(apps, schema_editor):
# Skip in test environments — test fixtures create the locale records they need.
if "pytest" in sys.modules:
Contributor

can we also add a way to disable this when doing the DB export? eg

from bedrock.base.config_manager import config

...

    if "pytest" in sys.modules or config('SQLITE_EXPORT_MODE', parser=bool, default="false"):

and then update export-db-to-sqlite.sh at around ~L213 to pass this in - e.g.:

PROD_DETAILS_STORAGE=product_details.storage.PDFileStorage \
SQLITE_EXPORT_MODE=True \
    python manage.py migrate || all_well=false

# and serve the canonical locale's content with the correct canonical
# link. The non-live root still gives the alias locale its own page
# tree so that Wagtail routes within it and produces a genuine 404 for
# unknown paths, rather than silently falling back to the en-US tree.
Contributor

When this merges, can we ensure we document the need for this special "shadow locale" in https://github.com/mozmeao/platform-docs/tree/main/docs/cms ?

# pt-PT child page does not exist, so it must not appear — even though the
# middleware would serve pt-BR content at the pt-PT URL.
assert "pt-PT" not in urls["/test-page/child-page/"]

Contributor

Not reading this in order, so I may answer my own question in a moment, but at a guess this doesn't need a change to sitemap.utils because it's actually Page.get_url that's doing the work and that never needs to return an alias locale, because the pages pulled in for Sitemap generation all definitely exist in the DB. Correct?

"Expected to remove exactly 1 pattern ('wagtail_serve') from Wagtail's urlpatterns, but removed %d. Wagtail patterns: %r",
_removed_count,
_wagtail_urlpatterns,
)
Contributor

I appreciate the caution



def _alias_needs_prewagtail_intercept(lang_prefix):
"""Return True if the alias locale requires pre-interception.
Contributor

Are there two interception stages? If not, let's refer to this one as just interception (and pre-Wagtail interception)

Collaborator Author

Good 👁️ . I need to update the comments after the refactoring. There should be only 1 interception, which happens before Wagtail.

#
# We build full Wagtail url_paths by looking up the site root page's
# translations for each candidate locale. This avoids hard-coding
# the root slug.
Contributor

🙌 - thank you!

wagtail_response = wagtail_serve(request, path)
# Does Wagtail have a route that matches this? If so, show that page.
# wagtail_serve_with_locale_fallback handles alias-locale
# pre-interception before deferring to Wagtail's serve().
Contributor

Suggested change
- # pre-interception before deferring to Wagtail's serve().
+ # interception before deferring to Wagtail's serve().

# a page from the fallback es-MX locale), the user-facing locale is the
# alias (es-AR), but the content locale is the fallback locale (es-MX).
# Use locale_in_url for URL prefix comparisons to avoid spurious redirects.
locale_in_url = getattr(request, "locale", locale) or locale
Contributor

Question: could this get tripped up by middleware changes in the future? I wonder if extracting the locale from the URL at this point would be more robust?

Collaborator Author

Good idea

request.content_locale = fallback_locale
locale = normalize_language(fallback_locale)
# Reload Fluent with the fallback locale so templates render the
# correct translations instead of falling back to en-US.
Contributor

Smart. Good catch

Collaborator Author

@dchukhin dchukhin left a comment

Thanks for taking a look and for the suggestions!

The root page's existence or publication status shouldn't determine if a locale is a fallback locale. I wasn't positive the code did this correctly, so I added some unit tests, and it looks like it does correctly handle the case that a root page is live, draft, or doesn't exist (a8f811a).

There is some code that was rewritten due to issues with serving homepages for alias locales (8ea18af), but this should allow us to serve the correct locale content whether there is a root page in a locale or not.

Please take a look and test it out, though! There's a lot, so it's very possible that something is not working correctly.



what we want to happen for alias locales. Instead, we would want to serve
the fallback locale's Page for the request's URL.
"""
alias_locale = WagtailLocale.objects.filter(language_code=lang_prefix).first()
Collaborator Author

This line happens after the if lang_prefix in fallback_locales check, so by this point we know that the lang_prefix is for an alias locale.
However, it's possible that it has a live page that matches the request's URL (that check happens later).


@slightlyoffbeat
Contributor

QA Testing Results — PR #1122

Tested locally against synced production database on WT-854-language-fallback-for-pages branch.


Core Fallback Behavior

| Test | URL | Expected | Result |
| --- | --- | --- | --- |
| Canonical locale (no fallback) | /es-MX/features/control/ | es-MX content, self-canonical, indexable | ✅ Pass |
| Alias locale fallback | /es-AR/features/control/ | es-MX content at /es-AR/ URL, no redirect | ✅ Pass |
| Second alias, same fallback | /es-CL/features/control/ | es-MX content at /es-CL/ URL, no redirect | ✅ Pass |
| Homepage fallback (es-AR) | /es-AR/ | es-MX homepage at /es-AR/ URL | ✅ Pass |
| Homepage fallback (es-CL) | /es-CL/ | es-MX homepage at /es-CL/ URL | ✅ Pass |
| Alias, fallback also missing | /es-AR/features/test-title/ (en-US only page) | Redirects to en-US | ✅ Pass |
| Non-alias locale 404 | /fr/features/test-title/ (en-US only page) | Redirects to en-US (existing behavior) | ✅ Pass |

SEO Verification (View Source)

| Test | URL | Expected | Result |
| --- | --- | --- | --- |
| Alias canonical URL | /es-AR/features/control/ | canonical → es-MX/features/control/ | ✅ Pass |
| Alias noindex | /es-AR/features/control/ | <meta name="robots" content="noindex,follow"> | ✅ Pass |
| Alias homepage canonical | /es-AR/ | canonical → es-MX/ | ✅ Pass |
| Alias homepage noindex | /es-AR/ | noindex,follow present | ✅ Pass |
| Canonical page, no noindex | /es-MX/features/control/ | No noindex meta tag | ✅ Pass |
| Alias excluded from hreflang | /es-AR/features/control/ | No hreflang="es-AR" tag | ✅ Pass |
| Canonical in hreflang | /es-AR/features/control/ | hreflang="es-MX" present | ✅ Pass |

Language Switcher

| Test | URL | Expected | Result |
| --- | --- | --- | --- |
| Alias locales in dropdown | /en-US/features/control/ | es-AR, es-CL, pt-PT, en-GB, en-CA all visible | ✅ Pass |
| Switch to alias locale | Select "Español (de Argentina)" | Navigates to /es-AR/features/control/, shows es-MX content | ✅ Pass |
| Current locale highlighted | /es-AR/features/control/ | "Español (de Argentina)" selected in dropdown | ✅ Pass |

Wagtail Admin

| Test | Where | Expected | Result |
| --- | --- | --- | --- |
| Locale badges | Settings → Locales | Alias locales show "alias → en-US" badge | ⚠️ Partial: en-CA and en-GB show badges. es-AR, es-CL, pt-PT missing (see note below). |

Notes & Discussion Items

1. Missing Locale records for es-AR, es-CL, pt-PT

These three locales don't have Wagtail Locale DB records in the synced production database. en-GB and en-CA already existed (they were set up pre-PR). Migration 0059 is intended to create these, but didn't populate them against the synced prod data. Need to verify this migration runs correctly on a fresh production deploy — this is the critical path for es-AR, es-CL, and pt-PT to work in production.

2. Hreflang excludes alias locales — intentional improvement over spec

The spec (section 4.1) originally said alias locales should appear in hreflang. The implementation intentionally excludes them. This is the better approach: alias pages are noindex,follow with canonical pointing to the fallback locale. Including them in hreflang would create a contradictory signal (noindex says "don't index" while hreflang says "this is the correct page for this region"). Excluding them gives Google a clean, unambiguous signal. The canonical tag already consolidates ranking to the fallback locale. Recommend updating the spec to match. If an alias locale is later promoted (gets its own translations), it automatically appears in hreflang since it would have real content.
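
The combination of signals described in this note can be sketched as a small pure function. This is an illustration of the behavior verified in the SEO table above, not the template logic from the PR; `seo_signals` and its parameters are hypothetical, and `translated_locales` is assumed to contain only locales that have real content for the page.

```python
def seo_signals(url_locale, content_locale, path, translated_locales):
    """Return the canonical path, robots directive and hreflang locales.

    url_locale is the locale in the requested URL; content_locale is the
    locale whose content is actually served (they differ for alias pages).
    """
    served_as_alias = url_locale != content_locale
    return {
        # The canonical always points at the locale that owns the content.
        "canonical": f"/{content_locale}/{path}",
        # Alias-served pages are kept out of the index but still crawled.
        "robots": "noindex,follow" if served_as_alias else None,
        # Only locales with real content are advertised as alternates.
        "hreflang": sorted(translated_locales),
    }
```

For /es-AR/features/control/ served with es-MX content, this gives a canonical under /es-MX/, a `noindex,follow` directive, and no es-AR entry in hreflang. If es-AR later gets its own translation, it would join `translated_locales` and appear in hreflang without further changes.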


Tests Not Covered Locally

  • Sitemap exclusion: /sitemap.xml returns 404 locally. Needs verification on dev server.
  • pt-PT fallback — pt-PT has no Locale record locally (see note 1), so couldn't fully test pt-PT → pt-BR fallback.
  • Draft-only fallback page — Did not test scenario where fallback locale page exists only as draft.
  • View-restricted pages — Did not test alias fallback for pages behind view restrictions.

6 participants