Allow classification and reclassification to run as modules #109

Open
rmcolq wants to merge 5 commits into artic-network:main from rmcolq:add_reclassification_module

rmcolq (Collaborator) commented Sep 15, 2025

Changes include:

  • A concatenated fastq is now generated by default when running a module
  • The kraken_classification module includes optional viral reclassification
  • A separate kraken_reclassification module lets you run only the reclassification step

To support this, the reclassify part of the classify workflow has been pulled out so that it can run on its own, and classify now calls it.

Example commands:

Run basic kraken classification with the default db

nextflow run main.nf --module kraken_classification --unique_id "test" --fastq test/test_data/barcode01/barcode01.fq.gz -profile docker,local

Run reclassification, passing in the assignments/report from the first classification

nextflow run main.nf --module kraken_reclassification --kraken_assignments output/test/classifications/PlusPF-8.kraken_assignments.tsv --kraken_report output/test/classifications/PlusPF-8.kraken_report.txt --unique_id "test" --fastq test/test_data/barcode01/barcode01.fq.gz -profile docker,local
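The two separate steps above could be wrapped in a small script, sketched here. It reuses only the flags and paths from the example commands; the `run` helper, the variable names, and the assumption that step 1 writes to `output/<unique_id>/classifications` are illustrative and not part of the PR. By default it only prints the commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

UNIQUE_ID="test"
FASTQ="test/test_data/barcode01/barcode01.fq.gz"
# Assumption: step 1 writes its results here, as the example paths above suggest.
CLASS_DIR="output/${UNIQUE_ID}/classifications"

# Hypothetical helper: print the command by default, execute it when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Step 1: basic classification with the default db.
run nextflow run main.nf --module kraken_classification \
  --unique_id "$UNIQUE_ID" --fastq "$FASTQ" -profile docker,local

# Step 2: reclassification, passing in the assignments/report from step 1.
run nextflow run main.nf --module kraken_reclassification \
  --kraken_assignments "$CLASS_DIR/PlusPF-8.kraken_assignments.tsv" \
  --kraken_report "$CLASS_DIR/PlusPF-8.kraken_report.txt" \
  --unique_id "$UNIQUE_ID" --fastq "$FASTQ" -profile docker,local
```

Run it with `RUN=1` set to actually launch both steps in order; `set -e` stops the script if step 1 fails.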

Chain these steps together in a single run, extracting the viral and unclassified fraction and passing only that to the second step. NB this differs slightly from running the two commands above separately, because those pass the entire fastq to the second step.

nextflow run main.nf --module kraken_classification --unique_id "test" --fastq test/test_data/barcode01/barcode01.fq.gz --run_viral_reclassification -profile docker,local

NB this commit includes small bugfixes to merge.py and report.py. These fix errors that occurred when reclassification was run on the whole file rather than just the viral+unclassified subset of the file. The scripts are mirrored in https://github.com/rmcolq/krakenpy, where new tests have been added to cover these changes.

NB2 to run on climb you will probably want to specify --kraken_database.default.host, --kraken_database.default.port and --kraken_database.default.path, or the equivalent flags with viral in place of default when there is an active kraken2 viral server.
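As a rough sketch of NB2, the three database flags could be collected in an array and appended to the basic classification command. The host, port and path values below are placeholders rather than real server details, and the profile is simply copied from the examples above, so it may need changing on climb.

```shell
#!/usr/bin/env bash

# Placeholder values (hypothetical): replace with the details of your kraken2 server.
KRAKEN_HOST="kraken.example.org"
KRAKEN_PORT="8080"
KRAKEN_DB_PATH="/path/to/kraken_db"

# Flags named in NB2; swap "default" for "viral" to target a viral server instead.
db_args=(
  --kraken_database.default.host "$KRAKEN_HOST"
  --kraken_database.default.port "$KRAKEN_PORT"
  --kraken_database.default.path "$KRAKEN_DB_PATH"
)

# Print the full command for inspection; remove `echo` to run it.
echo nextflow run main.nf --module kraken_classification \
  --unique_id "test" \
  --fastq test/test_data/barcode01/barcode01.fq.gz \
  "${db_args[@]}" \
  -profile docker,local
```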

rmcolq marked this pull request as ready for review September 16, 2025 14:00