goodarzilab/GENEVA

GENEVA

This is an analysis pipeline for GENEVA datasets. Follow the steps below to process your dataset.

  • The dataset consists of the gene expression of each single cell, identified by a cell barcode.
  • Each cell barcode is also linked to an HTO barcode, which is sequenced in a separate FASTQ file.
  • Once you process the HTO FASTQ files, you use the cell barcodes to map the cells to their correct drug labels.
  • Lastly, we perform 3' end mRNA sequencing of the cells to collect their genotypes.
  • We then use the genotypes to map each cell barcode to its corresponding cell line.
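
The overview above amounts to joining three per-barcode tables: expression, drug label (from HTOs), and cell line (from genotypes). The following is a minimal sketch of that join, not code from this repository. The function names and the example labels are hypothetical, and it assumes that barcodes may carry a trailing GEM-well suffix such as `-1` in some tables but not in others.

```python
from typing import Dict, Iterable, Optional, Tuple


def strip_gem_suffix(barcode: str) -> str:
    """Drop a trailing numeric suffix such as '-1' so barcodes compare equal."""
    base, sep, suffix = barcode.rpartition("-")
    return base if sep and suffix.isdigit() else barcode


def annotate_cells(
    expression_barcodes: Iterable[str],
    drug_by_barcode: Dict[str, str],
    cell_line_by_barcode: Dict[str, str],
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each expression barcode to (drug label, cell line); None if unknown."""
    # Normalize the keys of both lookup tables once, up front.
    drugs = {strip_gem_suffix(bc): d for bc, d in drug_by_barcode.items()}
    lines = {strip_gem_suffix(bc): c for bc, c in cell_line_by_barcode.items()}
    annotations = {}
    for barcode in expression_barcodes:
        key = strip_gem_suffix(barcode)
        annotations[barcode] = (drugs.get(key), lines.get(key))
    return annotations


# Hypothetical example: the HTO table lacks the '-1' suffix, the others have it.
example = annotate_cells(
    ["AAAC-1", "TTTG-1"],
    {"AAAC": "drugA"},
    {"AAAC-1": "lineX", "TTTG-1": "lineY"},
)
```

Cells missing from either table keep a `None` entry rather than being dropped, so you can decide later whether to filter them.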

10x single-cell expression

  1. Follow the script called cellranger_processing.sh in the scripts folder to convert FASTQ files into count matrices.
  2. Follow the notebook called 10x_processing.ipynb in the notebooks folder to process your raw table into a processed matrix.

HTO Barcodes Demultiplexing (Cell-Hashing)

  1. Follow the notebook called pymulti_processing.ipynb in the notebooks folder to process your HTO FASTQ files.
  2. Follow the notebook called 10x_pymulti_merging.ipynb in the notebooks folder to merge your processed matrix with the demultiplexed data.
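
As a rough illustration of the idea behind cell hashing, each cell can be assigned to the HTO that clearly dominates its HTO counts. The sketch below is a simplified stand-in and not the method used in pymulti_processing.ipynb. The function name, the thresholds `min_count` and `min_ratio`, and the labels "negative" and "ambiguous" are all assumptions made for this example.

```python
from typing import Dict


def assign_hto(counts: Dict[str, int], min_count: int = 10, min_ratio: float = 3.0) -> str:
    """Return the dominant HTO name, or 'negative' / 'ambiguous'."""
    # Rank HTOs from highest to lowest count for this cell.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if not ranked or ranked[0][1] < min_count:
        return "negative"  # too few HTO reads to call anything
    top_name, top_count = ranked[0]
    second_count = ranked[1][1] if len(ranked) > 1 else 0
    # Require the top HTO to exceed the runner-up by a fixed ratio.
    if top_count < min_ratio * max(second_count, 1):
        return "ambiguous"  # possibly a doublet
    return top_name


# Hypothetical per-cell HTO counts keyed by cell barcode.
hto_counts = {
    "AAAC-1": {"HTO1": 120, "HTO2": 4},
    "TTTG-1": {"HTO1": 50, "HTO2": 40},
}
labels = {barcode: assign_hto(c) for barcode, c in hto_counts.items()}
```

Only cells with a single confident HTO label would then be merged with the processed expression matrix.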

Cell-Line Genotype Demultiplexing

  1. Follow the script called 3_end_mrna_processing.sh in the scripts folder to process all your mRNA data into deduplicated BAM files.

  2. Follow the script called process_3_end_vcfs.sh in the scripts folder to process all your BAM files into a final concatenated VCF file ready for demuxlet.

  3. Follow the script called demuxlet_processing.sh in the scripts folder to process the merged VCF files along with the 10x outputs into demultiplexed VCF files.

  4. Follow the script called freemux_processing.sh in the scripts folder to process the demux outputs into clustered VCF outputs that sceasymode can analyze.

  5. Follow the notebook called assign_genotypes_sceasymode.ipynb in the notebooks folder to process the final VCF files into dictionaries with the genotyping module of sceasymode.
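
The end product of these steps is a dictionary from cell barcode to cell line. The sketch below shows one way such a dictionary could be built from a tab-separated per-cell assignment table. It does not use sceasymode, and the function name and the column names `BARCODE`, `DROPLET_TYPE`, and `SAMPLE` are placeholders assumed for this example; adapt them to the columns your demultiplexing output actually contains.

```python
import csv
import io
from typing import Dict


def singlet_cell_lines(tsv_text: str) -> Dict[str, str]:
    """Build {cell barcode: cell line}, keeping only rows marked as singlets."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["BARCODE"]: row["SAMPLE"]
        for row in reader
        if row["DROPLET_TYPE"] == "SNG"  # placeholder code for a singlet call
    }


# Hypothetical table with two singlets and one doublet.
table = (
    "BARCODE\tDROPLET_TYPE\tSAMPLE\n"
    "AAAC-1\tSNG\tlineX\n"
    "TTTG-1\tDBL\tlineX,lineY\n"
    "GGGA-1\tSNG\tlineY\n"
)
cell_line_by_barcode = singlet_cell_lines(table)
```

Doublets and ambiguous droplets are left out of the dictionary, so a later lookup that misses simply means the cell has no confident cell-line call.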

GENEVA Analyses

Follow the notebooks in the analysis_notebooks directory.
