SCRUM-5933 SCRUM-5966 fix speciesOrder sorting and add species to variant search #1561
Code Review
Verdict: Looks correct. The old code copied every entry from the lookup map. The fix sets each key's value to the subject's own phylogenetic order, so when sorted by the focus taxon key, annotations group by their species' phylogenetic position, which is the intended behavior.
Traced through the logic: no bugs, no performance issues, no security concerns. Clean fix applied consistently across all three copies of the method.
The buildSpeciesOrder method was setting non-self species keys to each species' own phylogenetic order from the lookup map. This meant all non-self documents shared the same speciesOrder value for a given focus taxon key, causing species to interleave on gene page disease/expression tables. Now all non-self keys are set to the subject species' phylogenetic order, giving each species a unique value when sorted by speciesOrder.<taxonId>. Also consolidates the duplicated SpeciesInterface proxy, species lookup, and buildSpeciesOrder from both DiseaseAnnotationCurationIndexer and GeneExpressionAnnotationIndexer into the shared Indexer base class with lazy initialization on first use.
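A minimal sketch of the before/after behavior described in this commit message, not the project's actual code. The class name, method names, and the order values for human and rat are assumptions made for illustration; only the mouse value of 30 comes from the PR's own example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SpeciesOrderSketch {

    // Hypothetical lookup: taxon ID -> phylogenetic order.
    static final Map<String, Integer> PHYLO_ORDER = new LinkedHashMap<>();
    static {
        PHYLO_ORDER.put("9606", 10);   // human (order value assumed)
        PHYLO_ORDER.put("10116", 20);  // rat (order value assumed)
        PHYLO_ORDER.put("10090", 30);  // mouse (30 matches the PR's example)
    }

    // Old behavior: copy the whole lookup, then zero out the subject's own key.
    // Every non-self key keeps that key's order, regardless of the subject.
    static Map<String, Integer> buildSpeciesOrderOld(String subjectTaxonId) {
        Map<String, Integer> order = new LinkedHashMap<>(PHYLO_ORDER);
        order.put(subjectTaxonId, 0);
        return order;
    }

    // Fixed behavior: the subject's own key is 0, and every other key
    // carries the subject's phylogenetic order.
    static Map<String, Integer> buildSpeciesOrderFixed(String subjectTaxonId) {
        Integer subjectOrder = PHYLO_ORDER.get(subjectTaxonId);
        if (subjectOrder == null) {
            throw new IllegalArgumentException("Unknown taxon: " + subjectTaxonId);
        }
        Map<String, Integer> order = new LinkedHashMap<>();
        for (String focusTaxonId : PHYLO_ORDER.keySet()) {
            int value = focusTaxonId.equals(subjectTaxonId) ? 0 : subjectOrder;
            order.put(focusTaxonId, value);
        }
        return order;
    }
}
```

Under these assumed values, the old method gives both a human and a rat document the value 30 under key 10090, while the fixed method gives them 10 and 20 respectively, so a sort on that key can tell them apart.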
The Species object created in VariantSummaryConverter only had abbreviation set. VariantSearchResultConverter calls getSpecies().getFullName() to populate the species field on variant_search_result documents, which returned null because fullName was never set. This caused the species facet to show no values in the variant search category.
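The null facet value can be reproduced with a simplified stand-in for the Species class. Everything below is hypothetical: the real Species class has more fields, the abbreviation and full-name strings are placeholder values, and the commit message does not say where the actual fix obtains the full name, so the sketch simply takes it as a parameter.

```java
class SpeciesFullNameSketch {

    // Simplified stand-in for Species with only the two fields discussed.
    static class Species {
        private String abbreviation;
        private String fullName;

        String getAbbreviation() { return abbreviation; }
        void setAbbreviation(String abbreviation) { this.abbreviation = abbreviation; }
        String getFullName() { return fullName; }
        void setFullName(String fullName) { this.fullName = fullName; }
    }

    // Before: only the abbreviation is set, so fullName stays null.
    static Species speciesBefore(String abbreviation) {
        Species species = new Species();
        species.setAbbreviation(abbreviation);
        return species;
    }

    // After: fullName is set as well (its source is an assumption here).
    static Species speciesAfter(String abbreviation, String fullName) {
        Species species = speciesBefore(abbreviation);
        species.setFullName(fullName);
        return species;
    }

    // Stand-in for the value the search result converter reads for the facet.
    static String speciesFacetValue(Species species) {
        return species.getFullName();
    }
}
```

With the "before" construction the facet value is null for every document, which is consistent with a facet that shows no values.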
Summary
- Fix buildSpeciesOrder in disease and expression indexers to set non-self species keys to the subject's own phylogenetic order instead of copying each species' individual order
- Each species now gets a unique speciesOrder.<taxonId> value, preventing interleaving when sorting annotations on gene page tables (e.g., rat annotations appearing in the middle of human annotations)
- Apply the same fix to the TestSpeciesOrder utility class
- Set species.fullName on variant search result documents so the species facet has values (SCRUM-5966)
Context
On gene page disease/expression tables, sorting by speciesOrder.<focusTaxonId> should group the focus species first (value 0), then other species in phylogenetic order. The old code copied the full lookup map, so all non-self documents had the same value for a given focus taxon key (e.g., both human and rat docs had speciesOrder.10090 = 30), causing them to interleave by disease name.
The variant search category was missing a species facet because VariantSummaryConverter never set fullName on the Species object, so VariantSearchResultConverter serialized species: null for all ~50M variant documents.
Jira: https://agr-jira.atlassian.net/browse/SCRUM-5933
Jira: https://agr-jira.atlassian.net/browse/SCRUM-5966
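The interleaving described in the Context can be illustrated with a small sort, assuming the table orders rows by speciesOrder.<focusTaxonId> first and disease name second. The Doc class, the disease names, and the order values 10 and 20 are hypothetical; the tied value 30 is taken from the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SpeciesOrderSortSketch {

    // Hypothetical document: species label, disease name, and the value
    // stored under speciesOrder.10090 (mouse as the focus species).
    static final class Doc {
        final String species;
        final String disease;
        final int focusOrder;

        Doc(String species, String disease, int focusOrder) {
            this.species = species;
            this.disease = disease;
            this.focusOrder = focusOrder;
        }
    }

    // Builds four sample documents; the mouse document is the focus (0).
    static List<Doc> docs(int humanOrder, int ratOrder) {
        return Arrays.asList(
            new Doc("human", "Alzheimer's disease", humanOrder),
            new Doc("rat", "diabetes mellitus", ratOrder),
            new Doc("human", "obesity", humanOrder),
            new Doc("mouse", "asthma", 0));
    }

    // Sorts by the focus-key value, then by disease name, and returns
    // the species labels in the resulting row order.
    static List<String> sortedSpecies(List<Doc> docs) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingInt((Doc d) -> d.focusOrder)
                .thenComparing(d -> d.disease));
        List<String> species = new ArrayList<>();
        for (Doc d : sorted) {
            species.add(d.species);
        }
        return species;
    }
}
```

Calling sortedSpecies(docs(30, 30)) models the old values, where human and rat tie and the disease name decides the order; sortedSpecies(docs(10, 20)) models the fixed values, where each species forms its own block after the focus species.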
Test plan