Hello,
Are you able to give some guidance on how to interpret the output?
For example:
INFO:splitStrain.py has started.
INFO:sample name: SAMEA1100847.ERR2509676.recal.bam
INFO:reference name: Chromosome, reference length: 4411532
INFO:regionStart: 100, regionEnd: 4000000
INFO:depth threshold percent: 75
INFO:entropy threshold: 0.0
INFO:using gff: tuberculosis.filtered-intervals.gff
INFO:Likelihood Ratio Statistic: -2*log(LR) = 12495, treshold: 1920
INFO:using the model:GMM
file alpha min_LR_thresh LR_statistic log-p-value p-value proportions
SAMEA1100847.ERR2509676.recal.bam 0.05 1920 12495 -14.367 0.000 0.83 0.17
How should I interpret this? I note that the p-value is 0. Does this mean that multiple strains are confidently detected?
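To make the numbers concrete, here is a minimal sketch of how the summary row above can be read. It assumes the columns are whitespace-separated, that everything after the p-value column is a strain proportion, and that the log-p-value column is a logarithm of the p-value. The base of that logarithm is not stated in the output, so both natural log and log10 are tried. The helper name `parse_summary` is hypothetical and is not part of splitStrain.

```python
import math

# Header and row copied verbatim from the output above.
header = "file alpha min_LR_thresh LR_statistic log-p-value p-value proportions"
row = "SAMEA1100847.ERR2509676.recal.bam 0.05 1920 12495 -14.367 0.000 0.83 0.17"


def parse_summary(header_line, row_line):
    """Hypothetical helper: map header names to values.

    Assumption: the trailing values all belong to 'proportions'.
    """
    names = header_line.split()
    values = row_line.split()
    fixed = names[:-1]  # every column except 'proportions'
    record = dict(zip(fixed, values[:len(fixed)]))
    for key in fixed[1:]:  # all fixed columns except 'file' are numeric
        record[key] = float(record[key])
    record["proportions"] = [float(v) for v in values[len(fixed):]]
    return record


rec = parse_summary(header, row)

# Compare the statistic with the threshold reported on the same row.
# Whether exceeding it implies a mixed sample is an assumption here.
exceeds = rec["LR_statistic"] > rec["min_LR_thresh"]

# The base of log-p-value is unknown; under either base the implied
# p-value is tiny but nonzero, so it prints as 0.000 at three decimals.
p_if_natural = math.exp(rec["log-p-value"])
p_if_log10 = 10 ** rec["log-p-value"]

print("LR_statistic exceeds min_LR_thresh:", exceeds)
print("p-value if natural log:", p_if_natural)
print("p-value if log10:", p_if_log10)
print("proportions:", rec["proportions"])
```

Under either base, the printed p-value of 0.000 looks like a rounding effect rather than an exact zero, so the log-p-value column is probably the more informative of the two.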
In the manuscript 10.1099/mgen.0.000607 it is mentioned that the ROC curves are generated using the likelihood ratio. Is that equivalent to the LR_statistic above? Is there a recommended threshold for the LR_statistic to discriminate between pure and mixed infections?
Thank you.