Support Parallel computing #71
Open
geertvandeweyer wants to merge 19 commits into kircherlab:master from geertvandeweyer:master
19 commits
da3af9f support for threading (geertvandeweyer)
cdd125b tuning the threading load (geertvandeweyer)
1419ef1 tuning the threading load (geertvandeweyer)
50cadc1 Print Parallelization summary on launch (geertvandeweyer)
a9c4987 Provide options to specify the temp folder to use (geertvandeweyer)
cc558a6 correction to tmp setup (geertvandeweyer)
7dcf09a numpy 2+ gave errors in esm env (geertvandeweyer)
8e089ff assign scaled threads to high esm and mms rules to enable multi-cpu u… (geertvandeweyer)
7399d98 working on dockerfile (wings-public)
c7fc589 expose memory (wings-public)
fe3f90b fixed missing arg (wings-public)
da98ffc more fixes (wings-public)
77b56b0 fix for empty files in MMsplice (wings-public)
1a4a8f8 revise slots for cpu/gpu separately (wings-public)
38572b5 fix for input file with only invalid variants (wings-public)
7b49812 fix error when no valid varaints in infile (wings-public)
cb68b1e fixes to no or bad input (wings-public)
d99f5a0 Merge pull request #2 from geertvandeweyer/Fix/max_memory (geertvandeweyer)
6365061 Merge branch 'Fix/max_memory' (wings-public)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,29 +1,67 @@ | ||
| FROM condaforge/mambaforge:latest | ||
| LABEL io.github.snakemake.containerized="true" | ||
| LABEL io.github.snakemake.conda_env_hash="cb2c51dd0ad3df620c4914840c5ef6f5570a5ffd8cfd54cec57d2ffef0a76b08" | ||
| ###################### | ||
| # aws output handler # | ||
| ###################### | ||
| | ||
| # Step 1: Retrieve conda environments | ||
| # includes: | ||
| # - the cmg modules | ||
| # - dependencies | ||
| | ||
| RUN mkdir -p /conda-envs/a4fcaaffb623ea8aef412c66280bd623 | ||
| COPY envs/environment_minimal.yml /conda-envs/a4fcaaffb623ea8aef412c66280bd623/environment.yaml | ||
| FROM ubuntu:24.04 | ||
| | ||
| RUN mkdir -p /conda-envs/ef25c8d726aebbe9e0ee64fee6c3caa9 | ||
| COPY envs/esm.yml /conda-envs/ef25c8d726aebbe9e0ee64fee6c3caa9/environment.yaml | ||
| ## needed apt packages | ||
| ARG BUILD_PACKAGES="wget git ssh bzip2 curl axel" | ||
| # needed conda packages (only packages not in the requirements of cmg-package) | ||
| ARG CONDA_PACKAGES="python==3.12.3 snakemake==8.16.0" | ||
| ENV MAMBA_ROOT_PREFIX=/opt/conda/ | ||
| ENV PATH /opt/micromamba/bin:/opt/conda/bin:$PATH | ||
| # ADD credentials on build | ||
| ARG SSH_PRIVATE_KEY | ||
| ## ENV SETTINGS during runtime | ||
| ENV LANG=C.UTF-8 LC_ALL=C.UTF-8 | ||
| ENV PATH=/opt/conda/bin:/opt/CADD-scripts/:$PATH | ||
| ENV DEBIAN_FRONTEND noninteractive | ||
| ENV CADD=/opt/CADD-scripts | ||
| SHELL ["/bin/bash", "-l", "-c"] | ||
| | ||
| RUN mkdir -p /conda-envs/7f88b844a05ae487b7bb6530b5e6a90c | ||
| COPY envs/mmsplice.yml /conda-envs/7f88b844a05ae487b7bb6530b5e6a90c/environment.yaml | ||
| # install base packages | ||
| RUN echo "Acquire::http::Pipeline-Depth 0;" > /etc/apt/apt.conf.d/99fixbadproxy && \ | ||
| echo "Acquire::http::No-Cache true;" >> /etc/apt/apt.conf.d/99fixbadproxy && \ | ||
| echo "Acquire::BrokenProxy true;" >> /etc/apt/apt.conf.d/99fixbadproxy && \ | ||
| apt-get -y update && \ | ||
| apt-get -y upgrade && \ | ||
| apt-get install -y $BUILD_PACKAGES && \ | ||
| apt-get clean && \ | ||
| rm -rf /var/lib/apt/lists/* | ||
| | ||
| RUN mkdir -p /conda-envs/dfc51ced08aaeb4cbd3dcd509dec0fc5 | ||
| COPY envs/regulatorySequence.yml /conda-envs/dfc51ced08aaeb4cbd3dcd509dec0fc5/environment.yaml | ||
| # Install conda/miniforge3 | ||
| RUN curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh" && \ | ||
| /bin/bash Miniforge3-$(uname)-$(uname -m).sh -b -p /opt/conda && \ | ||
| rm Miniforge3-$(uname)-$(uname -m).sh && \ | ||
| mamba install -y -c conda-forge -c bioconda $CONDA_PACKAGES && \ | ||
| conda clean --tarballs --index-cache --packages --yes && \ | ||
| conda config --set channel_priority strict && \ | ||
| echo ". /opt/conda/etc/profile.d/conda.sh && conda activate base" >> /etc/skel/.bashrc && \ | ||
| echo ". /opt/conda/etc/profile.d/conda.sh && conda activate base" >> ~/.bashrc | ||
| | ||
| RUN mkdir -p /conda-envs/89fe1049cc18768b984c476c399b7989 | ||
| COPY envs/vep.yml /conda-envs/89fe1049cc18768b984c476c399b7989/environment.yaml | ||
| # install cadd & run test file to generate all envs | ||
| RUN cd /opt && \ | ||
| git clone --branch Fix/max_memory https://github.com/geertvandeweyer/CADD-scripts.git | ||
| #cd CADD-scripts && \ | ||
| #snakemake test/input.vcf \ | ||
| # --software-deployment-method conda \ | ||
| ## --conda-create-envs-only \ | ||
| # --conda-prefix envs/conda \ | ||
| # --configfile config/config_GRCh38_v1.7.yml \ | ||
| # --snakefile Snakefile -c 1 | ||
| | ||
| # Step 2: Generate conda environments | ||
| #COPY Install_Annotations.sh /opt/CADD-scripts/Install_Annotations.sh | ||
| RUN chmod a+x /opt/CADD-scripts/Install_Annotations.sh | ||
| | ||
| RUN mamba env create --prefix /conda-envs/a4fcaaffb623ea8aef412c66280bd623 --file /conda-envs/a4fcaaffb623ea8aef412c66280bd623/environment.yaml && \ | ||
| mamba env create --prefix /conda-envs/ef25c8d726aebbe9e0ee64fee6c3caa9 --file /conda-envs/ef25c8d726aebbe9e0ee64fee6c3caa9/environment.yaml && \ | ||
| mamba env create --prefix /conda-envs/7f88b844a05ae487b7bb6530b5e6a90c --file /conda-envs/7f88b844a05ae487b7bb6530b5e6a90c/environment.yaml && \ | ||
| mamba env create --prefix /conda-envs/dfc51ced08aaeb4cbd3dcd509dec0fc5 --file /conda-envs/dfc51ced08aaeb4cbd3dcd509dec0fc5/environment.yaml && \ | ||
| mamba env create --prefix /conda-envs/89fe1049cc18768b984c476c399b7989 --file /conda-envs/89fe1049cc18768b984c476c399b7989/environment.yaml && \ | ||
| mamba clean --all -y | ||
| ## some follow up instructions are needed: | ||
| RUN echo "WARNING: CADD-scripts installed. To use the container, the following commands are needed: " | ||
| RUN echo "# download the annotations sources" | ||
| RUN echo "docker run -v /mnt/CADD_data:/opt/CADD-scripts/data my-cadd-scripts:my_version /opt/CADD-scripts/Install_Annotations.sh /opt/CADD-scripts/data GRCh38" | ||
| RUN echo "# run the script on the test data to prepare all conda envs" | ||
| RUN echo "docker run --name prep-container -w /opt/CADD-scripts -v /mnt/CADD_data/annotations:/opt/CADD-scripts/data/annotations -v /mnt/CADD_data/prescored:/opt/CADD-scripts/data/prescored my-cadd-scripts:my_version bash -c 'snakemake test/input.tsv.gz --resources load=100 --sdm conda --conda-prefix /opt/CADD-scripts/envs/conda --configfile /opt/CADD-scripts/config/config_GRCh38_v1.7_noanno.yml --snakefile /opt/CADD-scripts/Snakefile -c 1 ; rm -Rf /opt/CADD-scripts/test/input_splits /opt/CADD-scripts/test/input.chunk* /opt/CADD-scripts/test/input.*.log /opt/conda/pkgs/*' " | ||
| RUN echo "# commit the changes to the image" | ||
| RUN echo "docker commit prep-container my-cadd-scripts:my_version" |
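The trailing `RUN echo` lines in the Dockerfile above are only printed while `docker build` runs, so the follow-up instructions are easy to miss afterwards. As a minimal sketch, the same three steps could be kept in a small helper that prints (but does not run) the commands for a given image tag and host data folder. The function name `print_cadd_setup_steps` and its two parameters are hypothetical and not part of this PR; the docker and snakemake commands are copied from the echo lines, with the script path written as `/opt/CADD-scripts/Install_Annotations.sh` to match the `chmod` line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): prints the post-build steps
# described by the Dockerfile's trailing `RUN echo` lines. Nothing is executed.

print_cadd_setup_steps() {
    local image="$1"      # image tag, e.g. my-cadd-scripts:my_version
    local data_dir="$2"   # host folder for CADD data, e.g. /mnt/CADD_data
    local cadd=/opt/CADD-scripts

    # snakemake test run and cleanup, as given in the Dockerfile echo line
    local snake="snakemake test/input.tsv.gz --resources load=100 --sdm conda --conda-prefix ${cadd}/envs/conda --configfile ${cadd}/config/config_GRCh38_v1.7_noanno.yml --snakefile ${cadd}/Snakefile -c 1"
    local cleanup="rm -Rf ${cadd}/test/input_splits ${cadd}/test/input.chunk* ${cadd}/test/input.*.log /opt/conda/pkgs/*"

    echo "# 1. download the annotation sources"
    echo "docker run -v ${data_dir}:${cadd}/data ${image} ${cadd}/Install_Annotations.sh ${cadd}/data GRCh38"

    echo "# 2. run the script on the test data to prepare all conda envs"
    echo "docker run --name prep-container -w ${cadd} -v ${data_dir}/annotations:${cadd}/data/annotations -v ${data_dir}/prescored:${cadd}/data/prescored ${image} bash -c '${snake} ; ${cleanup}'"

    echo "# 3. commit the changes to the image"
    echo "docker commit prep-container ${image}"
}

# Usage: falls back to the placeholder names used in the Dockerfile.
print_cadd_setup_steps "${1:-my-cadd-scripts:my_version}" "${2:-/mnt/CADD_data}"
```

Printing rather than executing keeps the sketch safe to run anywhere; the annotation download in step 1 is large and needs the mounted host folder to exist.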
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| #!/usr/bin/env bash | ||
| | ||
| set -euo pipefail | ||
| | ||
| # need two arguments: | ||
| # 1. target | ||
| # 2. build | ||
| if [ "$#" -ne 2 ]; then | ||
| echo "Usage: $0 <target_folder> <build_version>" | ||
| exit 1 | ||
| fi | ||
| TARGET=$1 | ||
| BUILD=$2 | ||
| | ||
| # LOCATIONS: | ||
| DOWNLOAD_LOCATION=https://krishna.gs.washington.edu/download/CADD | ||
| | ||
| | ||
| | ||
| # supported builds: GRCh37, GRCh38 | ||
| if [ "$BUILD" != "GRCh37" ] && [ "$BUILD" != "GRCh38" ]; then | ||
| echo "Usage: $0 <target_folder> <build_version>" | ||
| echo "Supported builds: GRCh37, GRCh38" | ||
| exit 1 | ||
| fi | ||
| | ||
| ## ANNOTATIONS | ||
| echo "1. ANNOTATIONS" | ||
| mkdir -p $TARGET/annotations/ | ||
| cd $TARGET/annotations/ | ||
| URL="$DOWNLOAD_LOCATION/v1.7/$BUILD/${BUILD}_v1.7.tar.gz" | ||
| echo " - download" | ||
| axel -a "$URL" | ||
| axel -a "$URL.md5" | ||
| echo " - md5sum" | ||
| md5sum -c ${BUILD}_v1.7.tar.gz.md5 | ||
| echo " - untar" | ||
| tar -xzvf ${BUILD}_v1.7.tar.gz | ||
| rm ${BUILD}_v1.7.tar.gz | ||
| rm ${BUILD}_v1.7.tar.gz.md5 | ||
| | ||
| ## PRESCORED | ||
| echo "2. PRESCORED" | ||
| mkdir -p $TARGET/prescored/${BUILD}_v1.7/noanno/ | ||
| cd $TARGET/prescored/${BUILD}_v1.7/noanno/ | ||
| URL="$DOWNLOAD_LOCATION/v1.7/$BUILD/whole_genome_SNVs.tsv.gz" | ||
| echo " - download" | ||
| axel -a "$URL" | ||
| axel -a "$URL.md5" | ||
| axel -a "$URL.tbi" | ||
| axel -a "$URL.tbi.md5" | ||
| URL="$DOWNLOAD_LOCATION/v1.7/$BUILD/gnomad.genomes.r4.0.indel.tsv.gz" | ||
| axel -a "$URL" | ||
| axel -a "$URL.md5" | ||
| axel -a "$URL.tbi" | ||
| axel -a "$URL.tbi.md5" | ||
| echo " - md5sum" | ||
| md5sum -c *.md5 | ||
| rm *.md5 | ||
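The download script above (apparently `Install_Annotations.sh`, judging from the Dockerfile's `chmod` line) repeats the same download-then-verify pattern for each file and leaves `$TARGET` and `$BUILD` unquoted, which would break on a target folder containing spaces. A minimal sketch of how that pattern might be factored out follows. The function names `validate_build`, `cadd_url` and `fetch_verified` are hypothetical and not part of this PR; the base URL, the `v1.7/<build>/<file>` layout and the `axel -a` and `md5sum -c` calls are taken from the script itself.

```shell
#!/usr/bin/env bash
# Hypothetical refactoring sketch; these function names are not in the PR.

DOWNLOAD_LOCATION=https://krishna.gs.washington.edu/download/CADD

# Succeed only for the two builds the script supports.
validate_build() {
    case "$1" in
        GRCh37|GRCh38) return 0 ;;
        *) echo "Supported builds: GRCh37, GRCh38" >&2; return 1 ;;
    esac
}

# Compose the v1.7 download URL for a build and a file name.
cadd_url() {
    local build="$1" file="$2"
    echo "${DOWNLOAD_LOCATION}/v1.7/${build}/${file}"
}

# Download a file and its .md5 into the current directory, then verify it.
# Needs axel and md5sum, and accesses the network when called.
fetch_verified() {
    local url="$1" name
    name="$(basename "$url")"
    axel -a "$url"
    axel -a "${url}.md5"
    md5sum -c "${name}.md5"
}

# Usage: only prints the URL, nothing is downloaded here.
validate_build GRCh38 && cadd_url GRCh38 "GRCh38_v1.7.tar.gz"
```

With helpers like these, each download in the script would reduce to one `fetch_verified "$(cadd_url "$BUILD" <file>)"` call, and every expansion stays quoted.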