Skip to content

Running the run_calls wrapper with different numbers of samples produces different SNPs #17

@bioinfo11

Description

@bioinfo11

Hi Zam,

I have three bacterial samples that I want to use to look for SNPs against an assembled reference. To do this, I have tried two methods using the run_calls wrapper. The first was to run each sample against the reference individually. The second, was to run all three samples together against the reference.

The command I used was:

/path/to/CORTEX_release_v1.0.5.17/scripts/calling/run_calls.pl --first_kmer 31 --fastaq_index /path/to/index --auto_cleaning yes --bc yes --pd no --outdir output --outvcf outputvcf --ploidy 1 --stampy_hash stampy_refhash --stampy_bin /path/to/stampy-1.0.31/stampy.py --list_ref_fasta list_ref_fasta --refbindir /path/to/working_directory --genome_size 2800000 --qthresh 5 --mem_height 17 --mem_width 1000 --vcftools_dir /path/to/vcftools_0.1.11 --do_union yes --ref CoordinatesOnly --workflow joint --logfile logfile_output log_1.txt

When I compared the two methods using the 'wk_flow_J_RefCO_FINALcombined_BC_calls_at_all_k.decomp.vcf' I was expecting to see the same SNPs/positions called for each sample, regardless of the method used, however this was not the case. The only difference between the two methods was the index files used:

For running samples individually (e.g. for sample 1):

cat index
sample1 . /path /to/sample1_R1 /path /to/sample1_R2

For running samples together:

cat index
sample1 . /path /to/sample1_R1 /path /to/sample1_R2
sample2 . /path /to/sample2_R1 /path /to/sample2_R2
sample3 . /path /to/sample3_R1 /path /to/sample3_R2

Am I doing something wrong?

Thanks

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions