Pretrained models from Gnomix used in the LARGE-PD paper.
We recommend using the pretrained models only when your dataset has 100% SNP coverage relative to the pretrained reference panel, that is, when every variant in the panel's variant list is also present in your dataset. Tests performed within GP2 indicate that even small differences in SNP coverage can substantially affect the results and make the model outputs unstable.
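One way to verify the coverage requirement before running a pretrained model is to compare variant identifiers directly. The following is a minimal sketch, not part of Gnomix or of the LARGE-PD pipeline. It assumes that both the panel's variant list and your dataset's variants are available as plain identifiers in the same naming scheme, one per line; the function names `snp_coverage` and `read_ids` are hypothetical.

```python
from typing import Iterable, Set, Tuple


def snp_coverage(reference_ids: Iterable[str],
                 dataset_ids: Iterable[str]) -> Tuple[float, Set[str]]:
    """Return the fraction of reference variants found in the dataset,
    together with the set of reference variants that are missing."""
    # Strip whitespace and drop empty entries (assumed: one ID per entry).
    reference = {v.strip() for v in reference_ids if v.strip()}
    dataset = {v.strip() for v in dataset_ids if v.strip()}
    if not reference:
        raise ValueError("reference variant list is empty")
    missing = reference - dataset
    coverage = (len(reference) - len(missing)) / len(reference)
    return coverage, missing


def read_ids(path: str) -> Set[str]:
    """Read one variant identifier per line from a text file (assumed layout)."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


if __name__ == "__main__":
    # Toy identifiers for illustration only; these are not real variants.
    panel = ["chr1:100:A:G", "chr1:200:C:T", "chr1:300:G:A", "chr1:400:T:C"]
    data = ["chr1:100:A:G", "chr1:200:C:T", "chr1:300:G:A"]
    coverage, missing = snp_coverage(panel, data)
    print(f"coverage = {coverage:.2%}, missing = {sorted(missing)}")
```

If the reported coverage is below 100%, the recommendation above suggests not using the pretrained model as-is. Note that this check only compares identifiers; it does not detect allele or strand mismatches.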
Pretrained model panel and list of variants for the LARGE-PD Phase 2 data. This panel is composed of samples from the 1000 Genomes 30x dataset, using the references from Shriner et al. 2023 (PMCID: PMC10507155, doi:10.1016/j.xhgg.2023.100235), and non-admixed Native American samples from LARGE-PD Phase 2, genotyped using the Illumina NeuroBooster array.
Pretrained model panel and list of variants for the LARGE-PD Phase 1 data. This panel is composed of samples from the 1000 Genomes 30x dataset, using the references from Shriner et al. 2023 (PMCID: PMC10507155, doi:10.1016/j.xhgg.2023.100235), and non-admixed Native American samples from LARGE-PD Phase 1, genotyped using the Illumina Infinium Global Diversity Array.
List of samples for each parental meta-population based on Shriner et al. 2023 (PMCID: PMC10507155, doi:10.1016/j.xhgg.2023.100235). We provide one file per ancestry: African (AFR.txt), European (EUR.txt), East Asian (EAS.txt), South Asian (SAS.txt), and Native American (ARM.txt). We also provide a file mapping each sample ID to its population ID (ID_POP_1KGP_HGDP.txt).
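If you want to combine the per-ancestry lists into a single sample-to-population table, a sketch along the following lines may help. It assumes that each ancestry file holds one sample ID per line and that a two-column, tab-separated layout (sample, population) is what your downstream tool expects; both are assumptions and should be checked against the actual files and the tool's documentation. The function name `build_sample_map` and the sample IDs in the example are hypothetical.

```python
from typing import Dict, Iterable, List


def build_sample_map(samples_by_population: Dict[str, Iterable[str]]) -> List[str]:
    """Return 'sample<TAB>population' rows, one per unique sample.

    Raises ValueError if a sample is listed under two different populations.
    """
    assigned: Dict[str, str] = {}
    rows: List[str] = []
    for population, samples in samples_by_population.items():
        for raw in samples:
            sample = raw.strip()
            if not sample:
                continue  # skip blank lines
            previous = assigned.get(sample)
            if previous == population:
                continue  # ignore exact duplicates within one population
            if previous is not None:
                raise ValueError(
                    f"sample {sample!r} listed under both {previous} and {population}"
                )
            assigned[sample] = population
            rows.append(f"{sample}\t{population}")
    return rows


if __name__ == "__main__":
    # Hypothetical sample IDs; in practice, read each list from its file,
    # e.g. open("AFR.txt") yields one line per sample.
    example = {"AFR": ["S1", "S2"], "EUR": ["S3"]}
    print("\n".join(build_sample_map(example)))
```

Rejecting samples that appear under two populations is a deliberate choice in this sketch: an ambiguous reference label would otherwise pass silently into the reference panel.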
This work is supported by NIH Grant R01 1R01NS112499-01A1, MJFF Grant ID: 18298, ASAP-GP2, and the Parkinson’s Foundation. We also thank Dr. Daniel N. Shriner for providing the list of reference samples.