Massive-scale single-nucleus multi-omics identifies novel rare noncoding drivers of Parkinson’s disease
Last updated: Feb 2026
This repository contains the code required to reproduce the analyses and figures reported in "Massive-scale single-nucleus multi-omics identifies novel rare noncoding drivers of Parkinson’s disease" (Menon et al., 2026).
Some data used in the preparation of this article were obtained from the Global Parkinson’s Genetics Program (GP2; https://gp2.org). In this analysis we used Tier 2 GP2 Release 8 data (DOI: 10.5281/zenodo.13755496).
All GP2 data are hosted in collaboration with the Accelerating Medicines Partnership in Parkinson’s Disease (AMP®-PD) and are available via application on the AMP®-PD website (https://amp-pd.org/register-for-amp-pd). For up-to-date information on GP2 data acquisition, access, and policies, visit https://gp2.org/. Tier 1 data can be accessed by completing a form on the AMP®-PD website. Tier 2 data access requires approval and a Data Use Agreement signed by your institution.
(pending publication)
- The `analysis/` directory includes all analyses discussed in the manuscript.
├── ChromatinAccessibility
│ ├── 001_mapPeaks2genes.R
│ ├── 002_CorcesLab_ATAC_bamToFragsPeaksBW.R
│ ├── 003_Fig2_DARs.R
│ ├── 004_Fig4_EG_QC.R
│ ├── 005_BiasModel_Train.sh
│ ├── 006_TF_Train.sh
│ ├── 007_Fig4_RareVariant_Permutation.R
│ ├── 008_ Fig4_ML_Variant.R
│ └── 009_Fig4_plotModel.R
├── FigurePlotting
│ ├── 001_Colors_Mapping.R
│ └── 002_plotting_helper_corces.R
├── GeneExpression
│ ├── 001_Fig2_DEGs.R
│ ├── 002_trajectory_analysis.R
│ ├── 003_trajectory_analysis_peaks.R
│ └── 004_Fig2_DEGs_Trajectory_Pathways.R
├── MicroC
│ ├── 001_run_align.sh
│ ├── 002_combine.sh
│ ├── 003_qc_script.sh
│ └── 004_run_cooler.sh
├── Preprocessing
│ ├── 001_runCellRanger.sh
│ ├── 002_runCellBender.sh
│ ├── 003_runSCDS.sh
│ ├── 004_runDemuxlet.sh
│ ├── 005_QC_Script.R
│ ├── 006_Preprocess_Data_forIGVF.R
│ └── complete_sample.txt
├── QC_CelltypeAnnotation
│ ├── 001_Fig1_Sex_Age.R
│ ├── 002_Fig1_UMAPs.R
│ ├── 003_Fig1_QC.R
│ ├── 004_Fig2_RegionHeatmap.R
│ ├── 005_Fig1_markerGeneHeatmap.R
│ ├── 006_Fig2_PropChanges.R
│ ├── 007_Fig2_OligoModules.R
│ ├── 008_Fig1_CellTypeComposition.R
│ ├── 009_Fig1_DopaSubcluster.R
│ └── 010_Fig1_RegionalSubclustering.R
├── QTLs
│ ├── 001_rasqual_process.R
│ ├── 002_runRasqual.sh
│ ├── 003_runTensoqtl_sge.sh
│ ├── 004_Fig3_eQTL.R
│ └── 005_Fig3_rs33_allelicimbalance.R
├── README.md
└── RVAT
    ├── 001_runSKAT_gnomad_gene.R
    ├── 002_skat_peak_gnomad_maf.sh
    └── 003_mirrored_lollipop.R
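
The three-digit prefixes on the script names suggest an intended run order within each subdirectory. This README does not state that order explicitly, so treat it as an assumption. The sketch below is a minimal illustration of listing scripts in prefix order; `order_by_prefix` is a hypothetical helper and is not part of this repository.

```python
import re

# Three digits followed by an underscore, e.g. "003_runSCDS.sh".
PREFIX = re.compile(r"^(\d{3})_")


def order_by_prefix(filenames):
    """Sort file names by numeric prefix; names without a prefix are skipped."""
    numbered = []
    for name in filenames:
        match = PREFIX.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


# File names copied from the Preprocessing listing above, deliberately shuffled.
preprocessing = [
    "complete_sample.txt",
    "004_runDemuxlet.sh",
    "001_runCellRanger.sh",
    "006_Preprocess_Data_forIGVF.R",
    "003_runSCDS.sh",
    "002_runCellBender.sh",
    "005_QC_Script.R",
]

for step in order_by_prefix(preprocessing):
    print(step)
```

To apply the same helper to a local checkout, pass it the file names from a directory, for example `order_by_prefix(p.name for p in pathlib.Path("analysis/Preprocessing").iterdir())`. Check each script's header for dependencies on other subdirectories before running anything.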
| Software | Version(s) | Resource URL | RRID | Notes |
|---|---|---|---|---|
| ANNOVAR | d.06.08.2020 | http://www.openbioinformatics.org/annovar/ | RRID:SCR_012821 | Used for variant annotation. |
| BCFtools | v.1.17+ | http://samtools.sourceforge.net/mpileup.shtml | RRID:SCR_005227 | Used for genomic file manipulation. |
| BWA | v.0.7.17 | http://bio-bwa.sourceforge.net/ | RRID:SCR_010910 | Used to align sequencing reads. |
| GATK | v.4.3.0.0 | https://gatk.broadinstitute.org/ | RRID:SCR_001876 | Used for variant calling and genotyping. |
| gnomAD | v.4.1 | http://gnomad.broadinstitute.org/ | RRID:SCR_014964 | Used to retrieve population allele frequency data. |
| ggplot2 | v.3.4.4 | https://ggplot2.tidyverse.org/ | RRID:SCR_014601 | Used for data visualization in R. |
| propeller | v.1.0.0 | https://bioconductor.org/packages/speckle/ | NA | Used for differential cell type proportion analysis. |
| Seurat | v.5.0.1 | https://satijalab.org/seurat/ | RRID:SCR_007322 | Used for single-cell and single-nucleus RNA-seq analysis. |
| ArchR | v.1.0.2 | https://www.archrproject.com/ | RRID:SCR_022282 | Used for single-cell chromatin accessibility analysis. |
| ChromBPNet | v.0.1 | https://github.com/kundajelab/chrombpnet | NA | Used to model chromatin accessibility and predict variant effects. |
| VCFtools | v.0.1.16 | https://vcftools.github.io/index.html | RRID:SCR_001235 | Used for processing and filtering VCF files. |
| Mustache | v.1.0 | https://github.com/ay-lab/mustache | NA | Used to identify chromatin loops from Hi-C data. |
| Cooler | v.0.9.3 | https://github.com/open2c/cooler | RRID:SCR_017328 | Used to store, process, and analyze Hi-C contact matrices. |
| Juicer | v.1.6 | https://github.com/aidenlab/juicer | RRID:SCR_017226 | Used for processing and visualizing Hi-C data. |
| R Project for Statistical Computing | v.4.2.2 | http://www.r-project.org/ | RRID:SCR_001905 | Used for statistical computing and data analysis. |
| Python Programming Language | v.3.8–3.11 | http://www.python.org/ | RRID:SCR_008394 | Used for general data processing, analysis, and visualization. |
| PLINK | v.1.9, v.2.0 | http://www.nitrc.org/projects/plink | RRID:SCR_001757 | Used for genetic and association analyses. |
| SKAT | v.2.0 | https://cran.r-project.org/package=SKAT | RRID:SCR_014442 | Used for rare variant association testing. |
| ABC (Activity-By-Contact) Model | NA | https://github.com/broadinstitute/ABC-Enhancer-Gene-Prediction | NA | Used to map regulatory elements to target genes. |
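
Before running the shell scripts, it can help to confirm that the command-line tools from the table are installed. The sketch below only checks whether an executable is on `PATH`; it does not verify versions. The executable names are assumptions based on each tool's usual command name, and `missing_tools` is a hypothetical helper that is not part of this repository.

```python
import shutil

# Assumed executable names for the command-line tools in the table above.
# Adjust them to match your installation (PLINK 1.9, for example, may be
# installed under a different name).
COMMAND_LINE_TOOLS = [
    "bcftools",
    "bwa",
    "gatk",
    "vcftools",
    "plink",
    "plink2",
    "cooler",
]


def missing_tools(names):
    """Return the names that shutil.which cannot locate on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(COMMAND_LINE_TOOLS)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All listed tools were found on PATH.")
```

R packages such as Seurat, ArchR, and SKAT are not covered by this check; their versions can be inspected from within R.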