Draft: Update feat/dnase-2.7 with changes from main #58

jemma-nelson · 2022-11-28T20:42:37Z

This is a draft PR to see all the changes that would be pulled in. The primary motivation is commit c86a544.


jemma-nelson · 2022-11-28T20:53:05Z

There may not be a path forward on merging this, and that's fine. We would not want to introduce significant changes to the DNase pipeline, as it is frozen here for a reprocessing effort.


Basically this was bash glob syntax vs. grep regex confusion. This regex was incorrectly looking for `collatefq*`, which is `collatef` followed by any number of `q`s. Adjusted to properly look for `collatefq` followed by (any number of anything) before the flowcell string.
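The difference between the two syntaxes can be reproduced in a few lines of Python. This is only an illustration of the confusion described above, not the pipeline's actual check: the job names and the flowcell string `FCXYZ` are made up, and the original code used grep rather than Python's `re`.

```python
import fnmatch
import re

# Hypothetical job names; "FCXYZ" stands in for the flowcell string.
collation_job = "collatefq-LP123-FCXYZ"
lookalike_job = "collatefFCXYZ"

# Glob syntax: "*" means "any run of characters".
glob_hit = fnmatch.fnmatchcase(collation_job, "collatefq*FCXYZ")

# Regex syntax: "q*" means "zero or more q characters", so the same
# pattern text misses the real job and matches the lookalike instead.
buggy = re.compile(r"collatefq*FCXYZ")
buggy_hit = buggy.search(collation_job) is not None
buggy_lookalike_hit = buggy.search(lookalike_job) is not None

# Corrected regex: ".*" is the regex spelling of "any run of characters".
fixed = re.compile(r"collatefq.*FCXYZ")
fixed_hit = fixed.search(collation_job) is not None
```

Here the glob and the corrected regex both find the collation job, while the buggy regex does not.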


jemma-nelson and others added 27 commits April 18, 2022 16:07

laneprocess.py uses correct SAMPLE_NAME

4052204

Logic now matches that seen in the rest of our pipeline - prefer using the alignment's sample_name, and fall back to constructing it manually only when necessary. This should resolve the collation issues that have been dogging us this year.
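The preference order described here could be sketched roughly as follows. This is not the code in laneprocess.py: the dictionary keys (`sample_name`, `samplesheet_name`, `barcode`, `lane`) and the format of the fallback name are assumptions made for illustration.

```python
def resolve_sample_name(alignment, lane_info):
    """Prefer the alignment's own sample_name; construct one only if needed."""
    name = (alignment or {}).get("sample_name")
    if name:
        return name
    # Fallback: build the name manually (hypothetical naming scheme).
    return "%s_%s_L%03d" % (
        lane_info["samplesheet_name"],
        lane_info["barcode"],
        lane_info["lane"],
    )
```

Keeping the fallback in one helper means every script that needs a sample name resolves it the same way, which is the consistency the commit message is after.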

apply fix to right file, mark old file deprecated

621b345

Use CopyComplete.txt to start processing

58603ec

CopyComplete.txt is a better signal that a flowcell is ready for processing than RTAComplete.txt. Older sequencers did not create CopyComplete.txt, I believe.
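A readiness check along these lines might look like the sketch below. The function name and the `legacy_sequencer` flag are hypothetical; the flag only reflects the (hedged) remark that older sequencers may not write CopyComplete.txt, and is not something the commit is known to implement.

```python
import os


def flowcell_ready(flowcell_dir, legacy_sequencer=False):
    """Return True once the sequencer's completion marker file exists."""
    # Assumption: fall back to RTAComplete.txt only for older instruments.
    marker = "RTAComplete.txt" if legacy_sequencer else "CopyComplete.txt"
    return os.path.exists(os.path.join(flowcell_dir, marker))
```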

Switch initial flowcell processing to hpcz-1

536ed63

hpcz-2 was decommissioned, switching default queue for this.

chore: update default queue names for Altius

3b99fd3

Add module for bcl2fastq - contains samplesheet generation

4c82c7c

Add test of alt-seq pipeline

ca3a64d

Refine altseq.nf and add process_altseq.bash

3858ef6

Connect altseq with LIMS

b636f55

altseq script optimizations - better caching

e1c3ae7

Altseq - version 1.0.0

27d61d6

Don't use scratch space for bcl2fastq & merge_fq

2a3b1ec

Use production LIMS instead of staging

24c9b4a

Altseq - skip running alignment for now

212a21b

We will re-enable this once we have hit the fastq deadline.

Altseq - handle pools with same pool barcodes

bf322cb

setup.sh uses processing_information endpoint again

1341024

Merge pull request #56 from StamLab/alt-seq

ec1df14

Alt-seq

fix: encode_cram_no_ref now works again

cd3c06a

Accidentally duplicated the input specification during a git merge

fix for altseq setup.sh processing

d20c2e1

nextflow_clean script proceeds w/o output symlinks

c273b87

Now that we're regularly copying data over rather than symlinking it, it makes sense to remove the work directory in these cases.

!fixup c273b87 - missed a simple bug.

69ebe3c

Actually tested this time.

Improve alignprocess.py error logging

fb8bd40

Make it more obvious when and where an alignment cannot be set up

Fix alignprocess.py when library_kit_method=null

85b565f

With this fix, we should correctly emit `unset LIBRARY_KIT_METHOD` in our bash scripts.
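One way to picture the behavior: when the library kit is null, the generated bash script should clear the variable rather than export a bogus value. The helper below is a hypothetical sketch, not the code in alignprocess.py; in particular, the `export` form used for the non-null case is an assumption.

```python
import shlex


def library_kit_line(library_kit_method):
    """Return the bash line for LIBRARY_KIT_METHOD; None means 'no kit'."""
    if library_kit_method is None:
        return "unset LIBRARY_KIT_METHOD"
    # Quote the value so kit names containing spaces survive the shell.
    return "export LIBRARY_KIT_METHOD=%s" % shlex.quote(library_kit_method)
```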

fixup: Can't use f-strings in current python ver

b2cdab7
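For context, f-strings require Python 3.6 or newer, so on an older interpreter the same message has to be built with `str.format` or `%` formatting. The message text and the alignment id below are made up for illustration.

```python
alignment_id = 12345  # hypothetical id

# f-string form (Python 3.6+ only):
#   msg = f"Could not set up alignment {alignment_id}"

# Equivalent forms that also work on older interpreters:
msg_format = "Could not set up alignment {}".format(alignment_id)
msg_percent = "Could not set up alignment %d" % alignment_id
```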

Merge pull request #57 from StamLab/fix/align_process_library_kits

c86a544

Fix: alignprocess.py: library kits are optional

Config: Add 137 to retry-with-more-mem exit codes

0248da8

fix/rna-agg: two typos in anaquin processing

4760b6d

jemma-nelson added 2 commits December 4, 2022 16:44

altseq - use better publishing strategy

4cbfafc

Add basic analysis

d70fdc3

jemma-nelson and others added 30 commits July 9, 2024 15:35

Merge pull request #76 from StamLab/fix/lp_collation_timing

5104779

Fix/lp collation timing

style: Format python codebase with ruff

92f6059

style: Fix lint errors from ruff

a42333f

style: order imports in python scripts with ruff

94e3c64

style: fix some possible bugs identified by ruff

109d348

style: Make sure to use raw strings for regex

daee133
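The reason raw strings matter for regexes is that Python interprets some backslash escapes before the regex engine ever sees them. A small example, using a made-up pattern rather than one from this codebase:

```python
import re

plain = "\bread\b"   # "\b" is a backspace character in a normal string
raw = r"\bread\b"    # in a raw string, "\b" reaches re as a word boundary

text = "read 1 of 2"
plain_hit = re.search(plain, text) is not None
raw_hit = re.search(raw, text) is not None
```

Only the raw-string pattern matches the word `read`; the plain string looks for literal backspace characters around it.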

style: use pythonic capitalization consistently

2738b49

style: executables are chmod +x and have shebang

1cbe3d6

style: logging-related changes

58d776c

The main change here is to prefer the form `logging.info("msg %s %s", arg1, arg2)`. This allows the logger to do the interpolation, which can save time if the message is not printed because it is below the current log level.
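The saving can be demonstrated with an argument that counts how often it is converted to a string. The logger name and the `CountingArg` class are just for this demonstration.

```python
import logging


class CountingArg:
    """Counts how many times it is converted to a string."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.WARNING)  # INFO messages are filtered out

arg = CountingArg()

# Logger-side interpolation: INFO is disabled, so the argument is never formatted.
logger.info("msg %s", arg)
lazy_calls = arg.calls

# Eager interpolation: the string is built first, then the message is discarded.
logger.info("msg %s" % arg)
eager_calls = arg.calls - lazy_calls
```

With the level set to WARNING, the first call does no formatting work at all, while the second still pays for the string conversion.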

First draft of fastq pipeline container

9eeb4b1

Add biopython to fastq container

41ce205

Update script to use apptainer when able

21ee6a2

collation/fastq updates for container

8cb0901

rework for run-time cluster loc detection

de4d687

Previously, the generated code depended solely on where setup.sh was run, so if the execution location didn't match, you would get surprising errors.

bcl2fastq actually runs in apptainer now

6255a8e

Final tweaks

d8ef59d

Fix occasional OOM error, clean up run_pools.sh properly, and make sure run_pool.sh and run_alignments.sh are always properly submitted.

Fix fastqc script to run on new cluster

718d54e

Merge pull request #78 from StamLab/feat/fastq_container

8f2059c

Feat/fastq container

Merge pull request #77 from StamLab/style/add_precommit

899290b

Style/add precommit

Merge pull request #79 from StamLab/fix/wait_for_collation

f94689c

Fix: fastqc/alignments/pools wait for collation

Alignprocess.py skips library pools

2638c0d

This was the cause of those pesky "Project_Lab/Sample_LP.../" directories that were causing us to duplicate work.

Merge pull request #80 from StamLab/fix/skip_lp_alignments

2d1dd24

Alignprocess.py skips library pools

aggregateprocess.py: fix typo-induced bug

857cdc8

fix: handle missing project_share_directory

30e533b

If this is missing, use the default analysis dir.
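The fallback could be as simple as the sketch below. The default path and the shape of the `processing_info` dictionary are assumptions; only the `project_share_directory` key comes from the commit title.

```python
DEFAULT_ANALYSIS_DIR = "/path/to/analysis"  # hypothetical default


def analysis_dir(processing_info, default=DEFAULT_ANALYSIS_DIR):
    """Use project_share_directory when present and non-empty, else the default."""
    share_dir = processing_info.get("project_share_directory")
    return share_dir if share_dir else default
```

Treating both a missing key and a null or empty value as "missing" avoids producing paths like `None/...` further down the pipeline.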

Fix setup.sh for miniseq on new cluster

8fec6d1

link_nextseq.py supports R3 & R4 fastq files

f44ddc0

Fix collate/fastq/upload for up to 4 reads

3245526

Fixup for fastqc.bash

d28784a

Disable pool processing

d6a371e

We don't use the output from this anymore, preferring to run the megamap pipeline or other analyses as appropriate.


5 participants
