Parallelization #37

@lskatz


I was wondering whether you would be interested in parallelizing the simulation. Maybe there are some Python packages that could help. I am simulating the genomes of a 1700-taxon tree and it is taking a very long time, but it wouldn't be so bad if I could simulate one genome per processor. I tried an xargs statement for the ART step; I'm not sure whether it would be helpful to you.

\ls *.fasta | xargs -P 12 -n 1 bash -c '
  # $1 is the FASTA file handed over by xargs; the trailing "_" fills $0.
  b=$(basename "$1" .fasta);
  dir="tmp/$b";
  prefix="$dir/$b";
  mkdir -p "$dir";
  art_illumina -1 /scicomp/home/gzu2/bin/ART/Illumina_profiles/EmpMiSeq250R1.txt -2 /scicomp/home/gzu2/bin/ART/Illumina_profiles/EmpMiSeq250R2.txt -na -sam -p -i "$1" -l 150 -f 40 -m 380 -s 10 -o "$prefix" && \
  gzip -v "$dir"/*.fq && \
  samtools view -bS -o "$prefix.bam" "$prefix.sam" && \
  samtools sort "$prefix.bam" "$prefix.sorted.bam" && \
  rm -v "$prefix.bam" "$prefix.sam"
' _
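
In case a Python version is easier to fold in, here is a rough sketch of the same one-genome-per-worker idea using only the standard library's `concurrent.futures`. It is an illustration under assumptions, not how the tool currently works: the function names (`output_prefix`, `art_command`, `simulate_one`, `simulate_all`) and the `extra_args` parameter are hypothetical, the ART flags are copied from the xargs example above, the site-specific `-1`/`-2` profile paths are left to `extra_args`, and the gzip/samtools steps are omitted.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def output_prefix(fasta, out_root="tmp"):
    """Mirror the shell layout: tmp/<name>/<name> for <name>.fasta."""
    name = Path(fasta).stem
    return Path(out_root) / name / name


def art_command(fasta, prefix, extra_args=()):
    """Build the art_illumina call for one genome (flags from the xargs example).

    extra_args is a hypothetical hook for site-specific options such as
    the -1/-2 profile files.
    """
    return [
        "art_illumina", *extra_args,
        "-na", "-sam", "-p",
        "-i", str(fasta),
        "-l", "150", "-f", "40", "-m", "380", "-s", "10",
        "-o", str(prefix),
    ]


def simulate_one(fasta, out_root="tmp", extra_args=()):
    """Run ART for a single genome and return its output prefix."""
    prefix = output_prefix(fasta, out_root)
    prefix.parent.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if art_illumina exits non-zero.
    subprocess.run(art_command(fasta, prefix, extra_args), check=True)
    return prefix


def simulate_all(fastas, workers=12, run_one=simulate_one):
    """Run run_one over all genomes, at most `workers` at a time.

    Threads are enough here because each worker only waits on an
    external process; results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, fastas))
```

Calling `simulate_all(sorted(Path(".").glob("*.fasta")))` would then play the role of the `xargs -P 12` line, with `workers` set to the number of processors available.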
