swanViewer is a Swan-based viewer for DOGME annotation output from the reconcileBams.py script, which matches novel transcripts across multiple BAM files.
The viewer supports differential gene and transcript expression analysis (DEG, DTE) using pyDeseq2, Swan's DIE analysis, and viewing the isoforms and expression levels of individual genes.
To install the swanViewer application, you'll need Python 3.7 or higher and the following dependencies:
- streamlit
- swan-vis
- pyDeseq2
- pandas
- matplotlib
- anndata
- scanpy
- numpy
- plotly
You can install these dependencies using pip.
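For example, you can install everything in one step (the PyPI package names below are the commonly used ones and may differ slightly from the names listed above):

```
pip install streamlit swan_vis pydeseq2 pandas matplotlib anndata scanpy numpy plotly
```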
The swanViewer can be run using the streamlit command with the required file paths:
```
streamlit run swanViewer/swanview.py -- --abundance dogme_reconciled_abundance.tsv --ref_gtf reference.gtf --transcriptome_gtf dogme_reconciled.gtf --metadata metadata.csv
```

Required arguments:

- `--abundance`: Path to the DOGME abundance file (TSV format)
- `--ref_gtf`: Path to the unaltered reference GTF file
- `--transcriptome_gtf`: Path to the transcriptome GTF file to be processed
Optional argument:
- `--metadata`: Path to a metadata CSV/TSV file containing `dataset`, `condition`, and `replicate` columns
Loading and processing the files will take a while. Once processing is complete, you can use the application and save the Swan files via the 'Save Session' button in the QC sidebar.
- Abundance file: the abundance file from running reconcileBams.py
- Reference GTF file: the reference annotations used for running reconcileBams.py
- Transcriptome GTF file: the DOGME GTF file from running reconcileBams.py
The metadata file should be a CSV or TSV file with the following columns:
- `dataset`: Dataset identifier, which must match the sample column names in the abundance file
- `condition`: Condition identifier (alphanumeric characters, underscores, hyphens)
- `replicate`: Replicate number (positive integer)
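For example, a minimal metadata.csv for two conditions with two replicates each might look like this (the dataset values here are placeholders and must match the sample columns in your abundance file):

```
dataset,condition,replicate
ctrl_rep1,control,1
ctrl_rep2,control,2
treat_rep1,treated,1
treat_rep2,treated,2
```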
You can run swanViewer using a prebuilt container image or build the image locally. The container runs Streamlit on port 8501 by default.
Pull the latest image from GitHub Container Registry:
```
docker pull ghcr.io/mortazavilab/swanviewer/swan-app:latest
```

Start the container and map port 8501 to the host:

```
docker run -p 8501:8501 --rm ghcr.io/mortazavilab/swanviewer/swan-app:latest
```

Open http://localhost:8501 in your browser.
If you have local data files (abundance, GTFs, metadata), mount a directory into the container and pass the paths to the app. Example (run from the folder that contains your data):
```
docker run -p 8501:8501 --rm -v "$(pwd):/app/data" ghcr.io/mortazavilab/swanviewer/swan-app:latest \
  streamlit run swanview.py --server.port=8501 --server.address=0.0.0.0 -- \
  --abundance /app/data/dogme_reconciled_abundance.tsv --ref_gtf /app/data/reference.gtf \
  --transcriptome_gtf /app/data/dogme_reconciled.gtf --metadata /app/data/metadata.csv
```

Notes:

- The double dash (`--`) separates Streamlit's options from the application arguments. Adjust the paths after the `--` to match where you mounted your files inside the container (above we mount into `/app/data`).
To build from the included Dockerfile and run locally:
```
docker build -t swan-app:local .
docker run -p 8501:8501 --rm swan-app:local
```

- If Streamlit reports missing files, ensure you mounted your local folder into the container and specified the correct internal paths (e.g. `/app/data/...`).
- To run on a different host port, change both the `-p` mapping (host:container) and the Streamlit `--server.port` argument.
- If you see Python dependency errors, ensure `requirements.txt` is up to date before building the image.
processSwan.py is a helper script included in this repository to pre-process the raw DOGME abundance file and GTFs into pickled objects that the Streamlit app can load quickly.
Important notes (script behavior):
- `--abundance`, `--ref_gtf`, `--transcriptome_gtf`, and `--metadata` are required arguments to the script.
- The script writes two files into `--out_dir` (defaults to `save`):
  - `<out_name>.p`: the pickled SwanGraph object
  - `<out_name>.metadata.p`: the pickled metadata DataFrame
From your project directory run:
```
python processSwan.py \
  --abundance /path/to/dogme_reconciled_abundance.tsv \
  --ref_gtf /path/to/reference.gtf \
  --transcriptome_gtf /path/to/dogme_reconciled.gtf \
  --metadata /path/to/metadata.csv \
  --out_dir /path/to/output_folder \
  --out_name my_session
```

This will create /path/to/output_folder/my_session.p and /path/to/output_folder/my_session.metadata.p.
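If you want to sanity-check the outputs before launching the app, you can open them directly in Python. This is a minimal sketch, assuming processSwan.py writes standard Python pickles; swan_vis must be installed so the SwanGraph object can be reconstructed:

```python
# Minimal sketch: inspect the session files written by processSwan.py.
# Assumes standard Python pickles; swan_vis must be importable so pickle
# can rebuild the SwanGraph object.
import pickle

with open("/path/to/output_folder/my_session.p", "rb") as f:
    sg = pickle.load(f)  # the pickled SwanGraph object

with open("/path/to/output_folder/my_session.metadata.p", "rb") as f:
    metadata = pickle.load(f)  # the metadata DataFrame (dataset, condition, replicate)

print(type(sg))
print(metadata.head())
```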
If you'd rather run the pre-processing inside the container (so dependencies from the image are used), mount your data directory and override the container command. Example (run from your data folder):
```
docker run --rm -v "$(pwd):/app/data" -w /app ghcr.io/mortazavilab/swanviewer/swan-app:latest \
  python processSwan.py \
  --abundance /app/data/dogme_reconciled_abundance.tsv \
  --ref_gtf /app/data/reference.gtf \
  --transcriptome_gtf /app/data/dogme_reconciled.gtf \
  --metadata /app/data/metadata.csv \
  --out_dir /app/data/out --out_name my_session
```

After this runs, you will have out/my_session.p and out/my_session.metadata.p in your local folder (because the container wrote them to /app/data/out).
Once you have the pickled session file, you can launch the app and point it at the saved SwanGraph pickle using the --loadfrom CLI option supported by swanview.py.
Run locally:
```
streamlit run swanview.py -- --loadfrom /path/to/output_folder/my_session.p
```

Or run with the Docker image and mount the output folder so the container can read it:

```
docker run -p 8501:8501 --rm -v "$(pwd):/app/data" ghcr.io/mortazavilab/swanviewer/swan-app:latest \
  streamlit run swanview.py --server.port=8501 --server.address=0.0.0.0 -- \
  --loadfrom /app/data/out/my_session.p
```

Notes on metadata loading:
- When swanview.py loads a session via `--loadfrom`, it will attempt to find a metadata pickle next to the provided path by looking for `<base>.metadata<ext>`, falling back to `<path>.metadata.p`. If processSwan.py produced the companion metadata pickle in the same folder, the app will load it automatically.
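For reference, that lookup can be approximated as follows; this is an illustrative sketch, not necessarily the exact code in swanview.py:

```python
# Illustrative sketch of the companion-metadata lookup described above
# (not necessarily the exact logic used by swanview.py).
import os
from typing import Optional

def find_metadata_pickle(session_path: str) -> Optional[str]:
    base, ext = os.path.splitext(session_path)
    candidates = [
        f"{base}.metadata{ext}",       # e.g. my_session.metadata.p next to my_session.p
        f"{session_path}.metadata.p",  # fallback: <path>.metadata.p
    ]
    for candidate in candidates:
        if os.path.exists(candidate):
            return candidate
    return None

# find_metadata_pickle("/app/data/out/my_session.p") returns
# "/app/data/out/my_session.metadata.p" if processSwan.py wrote it alongside.
```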