Required Software

To run the Stop_codon_readthrough pipeline you will need the following software and associated packages:

R (dplyr, stringr, stringi, GGally, ggpubr, ggplot2, viridis, tidyverse, seqinr, matrixStats, data.table, rtracklayer, openxlsx, reshape2, caret, hexbin, png, grid, gridExtra, MuMIn, tidyr, rstatix, ggridges, hrbrthemes, glmnet, spgs, ggtext, devtools, ggdendroplot, UpSetR)
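
Before running the pipeline, it can help to confirm which of these packages R can already see. The following is a minimal sketch and not part of the repository: the helper names `r_vector` and `check_r_packages` are hypothetical, and it assumes `Rscript` is on your PATH.

```shell
# Package names copied from the list above.
pkgs="dplyr stringr stringi GGally ggpubr ggplot2 viridis tidyverse seqinr matrixStats data.table rtracklayer openxlsx reshape2 caret hexbin png grid gridExtra MuMIn tidyr rstatix ggridges hrbrthemes glmnet spgs ggtext devtools ggdendroplot UpSetR"

# Hypothetical helper: print its arguments as an R character vector literal.
r_vector() {
    printf 'c('
    sep=''
    for p in "$@"; do
        printf '%s"%s"' "$sep" "$p"
        sep=', '
    done
    printf ')\n'
}

# Hypothetical helper: list the packages that are absent from the R library.
# Assumes Rscript is available; installed.packages() is base R.
check_r_packages() {
    Rscript -e "absent <- setdiff($(r_vector "$@"), rownames(installed.packages())); if (length(absent)) cat('not installed:', absent, '\n') else cat('all packages found\n')"
}

# Usage (word splitting of $pkgs is intentional):
# check_r_packages $pkgs
```

The check only reports whether a package is installed, not whether its version matches the one used by the authors.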

Required Data

Download the read counts (DiMSum output), readthrough efficiencies, and required miscellaneous files from Main dataset and other files, Fig.1, Fig.2, Fig.3, Fig.4, Fig.5 and Fig.6 (files are organised by the figure they are used in) into your project directory (referred to as 'base_dir'). This is also the directory where output files will be written.

Installation Instructions

Make sure you have git and conda installed, then run the following (expected install time: <10 min):

# Install dependencies (preferably in a fresh conda environment)
conda install -c conda-forge r-dplyr r-stringr r-stringi r-ggally r-ggpubr r-ggplot2 r-viridis r-tidyverse r-seqinr r-matrixstats r-data.table r-openxlsx r-reshape2 r-caret r-hexbin r-png r-gridextra r-mumin r-tidyr r-rstatix r-ggridges r-hrbrthemes r-glmnet r-spgs r-ggtext r-devtools r-upsetr r-biocmanager
conda install -c bioconda bioconductor-biomart
conda install -c bioconda bioconductor-rtracklayer
conda install conda-forge::r-gridgraphics

Alternatively, create the environment from 'RT_diseasePTCs.yml', which describes the complete conda environment. It was set up on Linux (Scientific Linux 7.2).
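
Creating the environment from the YAML file could look like the sketch below. It assumes conda is on your PATH and that the file sits in the current directory. `create_env_from_yml` is a hypothetical wrapper, and the environment name is whatever the YAML file declares.

```shell
# Hypothetical helper: build the conda environment described by a YAML file.
# Defaults to the file name shipped with the repository.
create_env_from_yml() {
    env_file="${1:-RT_diseasePTCs.yml}"
    if [ ! -f "$env_file" ]; then
        echo "environment file not found: $env_file" >&2
        return 1
    fi
    conda env create -f "$env_file"
}

# Usage:
# create_env_from_yml
```

Afterwards, `conda env list` shows the name of the new environment, which can then be passed to `conda activate`.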

Usage

The 7 R Markdown files contain the code to reproduce the figures and results of the computational analyses described in the following publication: Genome-scale quantification and prediction of drug-induced readthrough of pathogenic premature termination codons (Toledano I, Supek F & Lehner B, 2023). See Required Data for instructions on how to obtain all required data and miscellaneous files before running the pipeline.

If you download the files from Required Data and only plot the figures, the expected run time is <10 min. If you instead generate all the files (i.e. the in silico PTC saturation dataset of the human genome) and models needed for all main and supplementary figures, the expected run time is ~2 days (without data parallelisation). Every step where you can choose between generating a file/model and downloading it from Required Data is indicated.

Run the R Markdown files in the following order:

1. Generate_treated_samples.Rmd
2. Fig1_EDFig1.Rmd
3. Fig2_EDFig2.Rmd
4. Fig3_EDFig3.Rmd
5. Fig4_extdataFig4.Rmd
6. Fig5_extdataFig5.Rmd
7. Fig6.Rmd
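
The run order above can also be scripted. The sketch below is one possible way and is not something the repository provides. It assumes the `rmarkdown` package is available in the environment, that the .Rmd files are in the current directory, and that any paths inside them (such as base_dir) have already been set. It matches files by prefix so it does not depend on the exact file-name suffix. `render_in_order` and the `DRY_RUN` variable are hypothetical names.

```shell
# Hypothetical helper: render each R Markdown file in the documented order.
# With DRY_RUN=1 it only prints what it would render.
render_in_order() {
    for prefix in Generate_treated_samples Fig1 Fig2 Fig3 Fig4 Fig5 Fig6; do
        for file in "$prefix"*.Rmd; do
            # An unmatched glob stays literal, so check that the file exists.
            if [ ! -e "$file" ]; then
                echo "no file matching ${prefix}*.Rmd" >&2
                return 1
            fi
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "would render: $file"
            else
                # Stop at the first file that fails to render.
                Rscript -e "rmarkdown::render('$file')" || return 1
            fi
        done
    done
}

# Usage:
# DRY_RUN=1 render_in_order   # preview the order
# render_in_order             # render everything
```

Stopping at the first failure matters here because later files may depend on outputs written by earlier ones.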

Additional scripts and software

The following software package is required for pre-processing of raw FASTQ files:

DiMSum v1.2.9 (pipeline for pre-processing deep mutational scanning data, i.e. FASTQ to fitness). Download the FASTQ files from the Sequence Read Archive (SRA), accession number PRJNA996618: http://www.ncbi.nlm.nih.gov/bioproject/996618, to your base directory (base_dir). Store the Clitocine, DAP and SRI FASTQ files in one folder (named 'round_A_fastq') and the CC90009, FUr, Gentamicin, G418, SJ6986 and untreated FASTQ files in another (named 'round_B_fastq'). The two sets were assayed in separate rounds (named 'A' and 'B'), and DiMSum was run separately for each. Shell scripts to run both DiMSum rounds can be found in Required Data.
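
The split into two rounds can be expressed as a small lookup. This is a sketch only: `round_for_condition` and `make_round_dirs` are hypothetical helpers, and how each downloaded FASTQ file maps to a condition is not stated here, so moving the files themselves is left to you.

```shell
# Hypothetical helper: print the folder a condition's FASTQ files belong in.
round_for_condition() {
    case "$1" in
        Clitocine|DAP|SRI)
            echo "round_A_fastq" ;;
        CC90009|FUr|Gentamicin|G418|SJ6986|untreated)
            echo "round_B_fastq" ;;
        *)
            echo "unknown condition: $1" >&2
            return 1 ;;
    esac
}

# Hypothetical helper: create both folders under the base directory given as $1.
make_round_dirs() {
    mkdir -p "$1/round_A_fastq" "$1/round_B_fastq"
}

# Usage:
# make_round_dirs "$base_dir"
# round_for_condition G418
```

Keeping the mapping in one function makes it easy to check a condition name before moving any files.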

Configuration files and additional scripts for running DiMSum are available in the "DiMSum" folder here.

