SCRUM-5933 SCRUM-5966 fix speciesOrder sorting and add species to variant search #1561

Merged
oblodgett merged 2 commits into stage from SCRUM-5933
Apr 10, 2026

@oblodgett
Member

@oblodgett oblodgett commented Apr 10, 2026

Summary

  • Fix buildSpeciesOrder in disease and expression indexers to set non-self species keys to the subject's own phylogenetic order instead of copying each species' individual order
  • This gives each species a unique speciesOrder.<taxonId> value, preventing interleaving when sorting annotations on gene page tables (e.g., rat annotations appearing in the middle of human annotations)
  • Same fix applied to TestSpeciesOrder utility class
  • Set species.fullName on variant search result documents so the species facet has values (SCRUM-5966)

Context

On gene page disease/expression tables, sorting by speciesOrder.<focusTaxonId> should group the focus species first (value 0), then other species in phylogenetic order. The old code copied the full lookup map, so all non-self documents had the same value for a given focus taxon key (e.g., both human and rat docs had speciesOrder.10090 = 30), causing them to interleave by disease name.
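
For illustration only, a minimal sketch of the old and new map-building logic as described above. This is not the actual indexer code: the class name, method names, and bare taxon-ID keys are hypothetical, and the order values (human 10, mouse 30, rat 40) are taken from the example in this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SpeciesOrderSketch {
    // Assumed phylogenetic order lookup keyed by taxon ID (values from the PR example).
    static final Map<String, Integer> LOOKUP = new LinkedHashMap<>();
    static {
        LOOKUP.put("9606", 10);   // human
        LOOKUP.put("10090", 30);  // mouse
        LOOKUP.put("10116", 40);  // rat
    }

    // Old behavior: copy the whole lookup, then set the subject's own key to 0.
    static Map<String, Integer> buildOld(String subjectTaxon) {
        Map<String, Integer> order = new LinkedHashMap<>(LOOKUP);
        order.put(subjectTaxon, 0);
        return order;
    }

    // New behavior: every non-self key carries the subject's own phylogenetic order.
    static Map<String, Integer> buildNew(String subjectTaxon) {
        int subjectOrder = LOOKUP.get(subjectTaxon);
        Map<String, Integer> order = new LinkedHashMap<>();
        for (String taxon : LOOKUP.keySet()) {
            order.put(taxon, taxon.equals(subjectTaxon) ? 0 : subjectOrder);
        }
        return order;
    }

    public static void main(String[] args) {
        // Compare the value stored under the mouse key (10090) on human and rat documents.
        System.out.println("old human: " + buildOld("9606").get("10090"));
        System.out.println("old rat:   " + buildOld("10116").get("10090"));
        System.out.println("new human: " + buildNew("9606").get("10090"));
        System.out.println("new rat:   " + buildNew("10116").get("10090"));
    }
}
```

Under these assumptions the old maps tie human and rat documents on the mouse key, while the new maps give them distinct values, so a sort on that key can separate the two species.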

The variant search category was missing a species facet because VariantSummaryConverter never set fullName on the Species object, so VariantSearchResultConverter serialized species: null for all ~50M variant documents.
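
A rough sketch of the variant-side problem, again with hypothetical stand-in classes rather than the real Species, VariantSummaryConverter, or VariantSearchResultConverter: a species object built with only an abbreviation yields a null facet value, and setting the full name fixes it. The abbreviation and full name used here are example values.

```java
class SpeciesSketch {
    private String abbreviation;
    private String fullName;

    String getAbbreviation() { return abbreviation; }
    void setAbbreviation(String abbreviation) { this.abbreviation = abbreviation; }
    String getFullName() { return fullName; }
    void setFullName(String fullName) { this.fullName = fullName; }
}

class VariantSpeciesSketch {
    // Before the fix (as described above): only the abbreviation is populated.
    static SpeciesSketch buildBefore(String abbreviation) {
        SpeciesSketch species = new SpeciesSketch();
        species.setAbbreviation(abbreviation);
        return species;
    }

    // After the fix: the full name is populated as well.
    static SpeciesSketch buildAfter(String abbreviation, String fullName) {
        SpeciesSketch species = buildBefore(abbreviation);
        species.setFullName(fullName);
        return species;
    }

    // Stand-in for the search result converter, which reads the full name for the species field.
    static String speciesFacetValue(SpeciesSketch species) {
        return species.getFullName();
    }

    public static void main(String[] args) {
        System.out.println(speciesFacetValue(buildBefore("Mmu")));
        System.out.println(speciesFacetValue(buildAfter("Mmu", "Mus musculus")));
    }
}
```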

Jira: https://agr-jira.atlassian.net/browse/SCRUM-5933
Jira: https://agr-jira.atlassian.net/browse/SCRUM-5966

Test plan

  • Reindex disease and expression annotations on stage
  • Verify gene page disease table for MGI:109583 (Pten) — mouse annotations should appear first, then human grouped together, then rat grouped together
  • Verify gene page expression table species grouping
  • Re-run variant indexer on stage
  • Verify variant search category has a working species facet

@oblodgett oblodgett requested a review from a team as a code owner April 10, 2026 08:05
@claude

claude bot commented Apr 10, 2026

Code Review

Changes reviewed: buildSpeciesOrder logic fix in DiseaseAnnotationCurationIndexer, GeneExpressionAnnotationIndexer, and TestSpeciesOrder (3 files, identical change).

Verdict: Looks correct.

The old code copied every entry from speciesOrderLookup as-is into the per-document order map. When sorting a gene page by speciesOrder.<focusTaxonId>, all non-focus species annotations had the same value (the focus taxon's phylogenetic order), causing interleaving.

The fix sets each key's value to the subject's own phylogenetic order, so when sorted by the focus taxon key, annotations group by their species' phylogenetic position — which is the intended behavior.

Traced through the logic:

  • Old: human annotation gets speciesOrder.10090 = 30, rat annotation also gets speciesOrder.10090 = 30 → tied, interleaved
  • New: human annotation gets speciesOrder.10090 = 10 (human's order), rat gets speciesOrder.10090 = 40 (rat's order) → correctly grouped
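
The effect of the traced values on table ordering can be sketched as follows. This is an illustration under assumptions, not the real query: it assumes the table sorts by the speciesOrder.10090 value and breaks ties by disease name, and the Doc class and disease names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortSketch {
    static final class Doc {
        final String species;
        final String disease;
        final int orderForMouseKey; // value stored under speciesOrder.10090

        Doc(String species, String disease, int orderForMouseKey) {
            this.species = species;
            this.disease = disease;
            this.orderForMouseKey = orderForMouseKey;
        }
    }

    // Sort by the focus-taxon order value, then by disease name; return the species sequence.
    static List<String> sortedSpecies(List<Doc> docs) {
        List<Doc> copy = new ArrayList<>(docs);
        copy.sort(Comparator.comparingInt((Doc d) -> d.orderForMouseKey)
                .thenComparing(d -> d.disease));
        List<String> result = new ArrayList<>();
        for (Doc d : copy) {
            result.add(d.species);
        }
        return result;
    }

    // Old values: human and rat both carry 30 under the mouse key.
    static List<Doc> oldDocs() {
        return Arrays.asList(
                new Doc("human", "disease A", 30),
                new Doc("rat", "disease B", 30),
                new Doc("human", "disease C", 30),
                new Doc("mouse", "disease B", 0));
    }

    // New values: human carries 10 and rat carries 40 under the mouse key.
    static List<Doc> newDocs() {
        return Arrays.asList(
                new Doc("human", "disease A", 10),
                new Doc("rat", "disease B", 40),
                new Doc("human", "disease C", 10),
                new Doc("mouse", "disease B", 0));
    }

    public static void main(String[] args) {
        System.out.println(sortedSpecies(oldDocs())); // [mouse, human, rat, human]
        System.out.println(sortedSpecies(newDocs())); // [mouse, human, human, rat]
    }
}
```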

No bugs, no performance issues, no security concerns. Clean fix applied consistently across all three copies of the method.

The buildSpeciesOrder method was setting each non-self species key to
that key species' own phylogenetic order from the lookup map. This meant
all non-self documents shared the same speciesOrder value for a given
focus taxon key, causing species to interleave on gene page
disease/expression tables.

Now all non-self keys are set to the subject species' phylogenetic order,
giving each species a unique value when sorted by speciesOrder.<taxonId>.

Also consolidates the duplicated SpeciesInterface proxy, species lookup,
and buildSpeciesOrder from both DiseaseAnnotationCurationIndexer and
GeneExpressionAnnotationIndexer into the shared Indexer base class with
lazy initialization on first use.

The Species object created in VariantSummaryConverter only had
abbreviation set. VariantSearchResultConverter calls
getSpecies().getFullName() to populate the species field on
variant_search_result documents, which returned null because
fullName was never set. This caused the species facet to show
no values in the variant search category.
@oblodgett oblodgett changed the title from "SCRUM-5933 fix speciesOrder for gene page species sorting" to "SCRUM-5933 SCRUM-5966 fix speciesOrder sorting and add species to variant search" Apr 10, 2026
@oblodgett oblodgett merged commit 3a50833 into stage Apr 10, 2026
5 checks passed
@oblodgett oblodgett deleted the SCRUM-5933 branch April 10, 2026 09:10