Selection of mutated protein fragments for therapeutic personalized cancer vaccines.
vaxrank \
--vcf tests/data/b16.f10/b16.vcf \
--bam tests/data/b16.f10/b16.combined.bam \
--vaccine-peptide-length 25 \
--mhc-predictor netmhc \
--mhc-alleles H2-Kb,H2-Db \
--padding-around-mutation 5 \
--output-ascii-report vaccine-peptides.txt \
--output-pdf-report vaccine-peptides.pdf \
--output-html-report vaccine-peptides.html
You can specify common parameters in a YAML configuration file to avoid repeating them on every run:
vaxrank --config my_config.yaml --vcf variants.vcf --bam tumor.bam
Example my_config.yaml:
epitope_config:
  min_epitope_score: 0.001
  logistic_epitope_score_midpoint: 350.0
  logistic_epitope_score_width: 150.0
vaccine_config:
  vaccine_peptide_length: 25
  padding_around_mutation: 5
  max_vaccine_peptides_per_variant: 1
  num_mutant_epitopes_to_keep: 1000  # set to 0 to keep all
CLI arguments override values from the config file.
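The precedence rule (command-line values win over config-file values, which win over built-in defaults) can be illustrated with a minimal sketch. This is not Vaxrank's actual code: the `resolve_settings` helper and the `DEFAULTS` dictionary are hypothetical names introduced for illustration, and the sketch assumes an unset CLI option is represented as `None`.

```python
from typing import Any, Dict

# Hypothetical defaults, mirroring the vaccine_config keys in the example file.
DEFAULTS: Dict[str, Any] = {
    "vaccine_peptide_length": 25,
    "padding_around_mutation": 5,
    "max_vaccine_peptides_per_variant": 1,
}


def resolve_settings(
    defaults: Dict[str, Any],
    config_values: Dict[str, Any],
    cli_values: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge settings with precedence: CLI > config file > defaults.

    Assumption: a CLI value of None means the flag was not given,
    so it must not override the config file or the defaults.
    """
    resolved = dict(defaults)
    resolved.update(config_values)
    resolved.update({k: v for k, v in cli_values.items() if v is not None})
    return resolved


settings = resolve_settings(
    DEFAULTS,
    config_values={"vaccine_peptide_length": 21},
    cli_values={"vaccine_peptide_length": None, "padding_around_mutation": 3},
)
print(settings)
```

Here the config file sets the peptide length, the command line sets the padding, and the remaining key falls back to its default.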
Vaxrank can be installed using pip:
pip install vaxrank
Requirements: Python 3.9+
Note: to generate PDF reports, you first need to install wkhtmltopdf, which you can do (on macOS) like so:
brew install --cask wkhtmltopdf
Vaxrank uses PyEnsembl for accessing information about the reference genome. You must install an Ensembl release corresponding to the reference genome associated with the mutations provided to Vaxrank.
Example for GRCh38 (adjust release to match your reference):
pyensembl install --release 113 --species human
Example for GRCh37 (legacy):
pyensembl install --release 75 --species human
If your variants were called from alignments against hg19, you can still use GRCh37, but you should ignore mitochondrial variants.
Vaxrank filters out peptides that exist in the reference proteome to focus on truly novel mutant sequences. This uses a set-based kmer index for O(1) membership testing. The index is built once and cached locally for subsequent runs.
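The idea behind a set-based kmer index can be sketched in a few lines of Python. This is an illustration of the technique rather than the code in reference_proteome.py: the function names are hypothetical, the toy reference sequence is made up for the example, and the on-disk caching step is omitted.

```python
from typing import Iterable, Set


def build_kmer_index(protein_sequences: Iterable[str], k: int) -> Set[str]:
    """Collect every length-k substring of every reference protein."""
    index: Set[str] = set()
    for seq in protein_sequences:
        for i in range(len(seq) - k + 1):
            index.add(seq[i:i + k])
    return index


def is_novel(peptide: str, index: Set[str]) -> bool:
    """A peptide is novel if it is absent from the reference index.

    Set membership is an average-case O(1) hash lookup.
    """
    return peptide not in index


# Toy reference proteome (hypothetical sequence, for illustration only).
reference = ["MKTAYIAKQR"]
index = build_kmer_index(reference, k=8)

print(is_novel("MKTAYIAK", index))  # kmer taken directly from the reference
print(is_novel("MKTAYIAR", index))  # last residue differs from the reference
```

Building the index costs time proportional to the total proteome length, which is why caching it between runs pays off.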
Vaxrank annotates variants that occur at known cancer mutation hotspots using bundled data from cancerhotspots.org (Chang et al. 2016, 2017). This helps identify clinically relevant mutations. The hotspot data includes ~2,700 recurrently mutated positions across cancer types.
Vaxrank integrates with multiple MHC binding predictors via mhctools, including:
- NetMHC / NetMHCpan
- MHCflurry (open source, installed by default)
Vaxrank is described in a bioRxiv preprint, "Vaxrank: A Computational Tool For Designing Personalized Cancer Vaccines", which can be cited as:
@article {Rubinsteyn142919,
author = {Rubinsteyn, Alex and Hodes, Isaac and Kodysh, Julia and Hammerbacher, Jeffrey},
title = {Vaxrank: A Computational Tool For Designing Personalized Cancer Vaccines},
year = {2017},
doi = {10.1101/142919},
publisher = {Cold Spring Harbor Laboratory},
abstract = {Therapeutic vaccines targeting mutant tumor antigens ({\textquotedblleft}neoantigens{\textquotedblright}) are an increasingly popular form of personalized cancer immunotherapy. Vaxrank is a computational tool for selecting neoantigen vaccine peptides from tumor mutations, tumor RNA data, and patient HLA type. Vaxrank is freely available at www.github.com/hammerlab/vaxrank under the Apache 2.0 open source license and can also be installed from the Python Package Index.},
URL = {https://www.biorxiv.org/content/early/2017/05/27/142919},
eprint = {https://www.biorxiv.org/content/early/2017/05/27/142919.full.pdf},
journal = {bioRxiv}
}
To install Vaxrank for local development:
git clone git@github.com:openvax/vaxrank.git
cd vaxrank
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -e .
# Examples; adjust release to match your reference
pyensembl install --release 113 --species human
pyensembl install --release 113 --species mouse
Run linting and tests:
./lint.sh && ./test.sh
The first run of the tests may take a while to build the reference proteome kmer index, but subsequent runs will use the cached index.
Vaxrank uses msgspec Struct objects for configuration:
- EpitopeConfig: Parameters for epitope scoring and filtering
  - logistic_epitope_score_midpoint: IC50 value at which epitope score is 0.5 (default: 350 nM)
  - logistic_epitope_score_width: Width parameter for logistic scoring function (default: 150)
  - min_epitope_score: Minimum normalized score threshold (default: 0.00001)
  - binding_affinity_cutoff: Maximum IC50 to consider (default: 5000 nM)
- VaccineConfig: Parameters for vaccine peptide assembly
  - vaccine_peptide_length: Length of vaccine peptides (default: 25 aa)
  - padding_around_mutation: Off-center windows to consider (default: 5)
  - max_vaccine_peptides_per_variant: Max peptides per variant (default: 1)
  - num_mutant_epitopes_to_keep: Epitopes to keep per variant (default: 1000, set to 0 to keep all)
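To make the EpitopeConfig parameters concrete, here is a minimal sketch of a logistic scoring function driven by a midpoint, a width, and an affinity cutoff. It assumes a plain logistic curve that equals exactly 0.5 at the midpoint, as described above; the actual function in epitope_logic.py may differ (for example, by applying an extra normalization), and the function names below are hypothetical.

```python
import math


def logistic_epitope_score(
    ic50: float,
    midpoint: float = 350.0,
    width: float = 150.0,
    binding_affinity_cutoff: float = 5000.0,
) -> float:
    """Map a predicted IC50 (nM) to a score in [0, 1).

    Lower IC50 means stronger binding and a higher score.
    Predictions at or above the cutoff score zero.
    """
    if ic50 >= binding_affinity_cutoff:
        return 0.0
    return 1.0 / (1.0 + math.exp((ic50 - midpoint) / width))


def passes_min_score(score: float, min_epitope_score: float = 0.00001) -> bool:
    """Keep only epitopes whose score reaches the minimum threshold."""
    return score >= min_epitope_score


for ic50 in (50.0, 350.0, 1000.0, 4999.0):
    score = logistic_epitope_score(ic50)
    print(ic50, score, passes_min_score(score))
```

A smaller width makes the transition around the midpoint sharper, while a larger width gives weak binders more residual credit.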
Key modules:
- reference_proteome.py: Set-based kmer index for checking if peptides exist in the reference proteome
- cancer_hotspots.py: Lookup for known cancer mutation hotspots
- epitope_logic.py: Epitope scoring and filtering logic
- core_logic.py: Main vaccine peptide selection algorithm
- report.py: Report generation (ASCII, HTML, PDF, XLSX)
Key dependencies:
- pyensembl: Reference genome annotation
- varcode: Variant effect prediction
- isovar: RNA-based variant calling
- mhctools: MHC binding prediction
- msgspec: Configuration serialization (YAML/JSON)
- pandas, numpy: Data processing
- jinja2, pdfkit: Report generation
Helper scripts included in the repo:
- develop.sh: installs the package in editable mode and sets PYTHONPATH to the repo root.
- lint.sh: runs ruff on vaxrank and tests.
- test.sh: runs pytest with coverage.
- deploy.sh: runs lint/tests, builds a distribution with build, uploads via twine, and tags the release (vX.Y.Z). Deploy is restricted to the main/master branch.