@jemma-nelson
Contributor

This is a draft PR to see all the changes that would be pulled in. The primary motivation is commit c86a544.

jemma-nelson and others added 27 commits April 18, 2022 16:07
Logic now matches that seen in the rest of our pipeline - prefer using
the alignment's sample_name, and fall back to constructing it manually
only when necessary. This should resolve the collation issues that have
been dogging us this year.
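
A minimal sketch of the "prefer, then fall back" logic described above. The dictionary keys (`sample_name`, `sample`, `barcode`, `lane`) and the fallback format are assumptions made for illustration; the real alignment fields and naming scheme are not shown in this PR.

```python
def sample_name_for(alignment: dict) -> str:
    """Prefer the alignment's own sample_name; construct one only if it is absent."""
    name = alignment.get("sample_name")
    if name:
        return name
    # Hypothetical fallback format, used only when no sample_name is provided.
    return "{}_{}_L{:03d}".format(
        alignment["sample"], alignment["barcode"], alignment["lane"]
    )
```

Routing every caller through one helper like this keeps naming consistent, which is the property collation would depend on.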
CopyComplete.txt is a better signal that a flowcell is ready for
processing than RTAComplete.txt.
Older sequencers did not create CopyComplete.txt, I believe.
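
One way such a readiness check could look, as a sketch only. The function name and the `has_copy_complete` flag are hypothetical; falling back to `RTAComplete.txt` for older sequencers is an assumption based on the note above, not something confirmed by the commits.

```python
from pathlib import Path


def flowcell_ready(flowcell_dir, has_copy_complete=True):
    """Return True if the marker file for this flowcell directory exists.

    has_copy_complete is a hypothetical flag: set it to False for older
    sequencers that are assumed to write only RTAComplete.txt.
    """
    marker = "CopyComplete.txt" if has_copy_complete else "RTAComplete.txt"
    return (Path(flowcell_dir) / marker).is_file()
```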
hpcz-2 was decommissioned, switching default queue for this.
We will re-enable this once we hit the fastq deadline.
Accidentally duplicated the input specification during a git merge
Now that we're regularly copying data over rather than symlinking it, it
makes sense to remove the work directory in these cases.
Actually tested this time.
Make it more obvious when and where an alignment cannot be set up
With this fix, we should correctly emit `unset LIBRARY_KIT_METHOD` in our
bash scripts when no library kit is set.
Fix: alignprocess.py: library kits are optional
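
A rough sketch of how an optional library kit might be turned into a line of a generated bash script. The function name and the `export` form are assumptions for illustration; only the `unset LIBRARY_KIT_METHOD` line comes from the commit message, and the actual code in alignprocess.py is not shown here.

```python
import shlex


def library_kit_line(kit_method=None):
    """Return the bash line for LIBRARY_KIT_METHOD (hypothetical helper)."""
    if not kit_method:
        # No kit recorded: make sure a stale value is not inherited.
        return "unset LIBRARY_KIT_METHOD"
    # Quote the value so spaces or shell metacharacters are safe in bash.
    return "export LIBRARY_KIT_METHOD={}".format(shlex.quote(kit_method))
```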
@jemma-nelson
Contributor Author

There may not be a path forward on merging this, and that's fine. We would not want to introduce significant changes to the DNase pipeline, as it is frozen here for a reprocessing effort.

jemma-nelson and others added 30 commits July 9, 2024 15:35
The main thing changed here is to prefer the form
`logging.info("msg %s %s", arg1, arg2)`

This allows the logger to do the interpolation, which can save time
if the message is not printed because it is below the current log level.
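
A small illustration of the difference, using only the standard `logging` module. The `Expensive` class and the logger name are made up for this sketch and do not come from the pipeline code.

```python
import logging

log = logging.getLogger("example")  # hypothetical logger name
log.setLevel(logging.WARNING)       # INFO messages will be suppressed


class Expensive:
    """Hypothetical object that counts how often it is converted to a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive-value"


eager = Expensive()
lazy = Expensive()

# Eager: the f-string is built before the logger checks the level.
log.info(f"msg {eager}")

# Lazy: the logger interpolates only if the record is actually emitted.
log.info("msg %s", lazy)
```

After both calls, `eager.str_calls` is 1 and `lazy.str_calls` is 0: with the `%s` form, the suppressed message never pays for string conversion.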
Previously, the generated code depended solely on where setup.sh was
run, so if the execution location didn't match, you would get surprising errors.
Fix occasional OOM error, clean up run_pools.sh properly, and make sure
run_pool.sh and run_alignments.sh are always properly submitted.
Basically this was bash glob syntax vs. grep regex confusion.

This regex was incorrectly looking for `collatefq*`, which is `collatef`
followed by any number of `q`s. Adjusted to properly look for `collatefq`
followed by (any number of anything) before the flowcell string.
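
The same confusion can be reproduced in Python, where `re` uses regex semantics and `fnmatch` uses glob semantics. The job names and flowcell label below are invented for illustration; the real naming scheme is not shown in this PR.

```python
import fnmatch
import re

# Hypothetical job names and flowcell label.
jobs = ["collatefq-sample1-FC123", "collatefFC123", "alignfq-FC123"]
flowcell = "FC123"

# Regex: `q*` means "zero or more q", so this misses the real collation job.
wrong = re.compile(r"collatefq*" + re.escape(flowcell))
# Regex: `.*` means "any number of anything", which is what the glob intended.
right = re.compile(r"collatefq.*" + re.escape(flowcell))

wrong_hits = [j for j in jobs if wrong.search(j)]
right_hits = [j for j in jobs if right.search(j)]
# Glob: here `*` alone already means "any number of anything".
glob_hits = fnmatch.filter(jobs, "collatefq*" + flowcell)
```

Here `wrong_hits` contains only `collatefFC123`, while `right_hits` and `glob_hits` both contain only `collatefq-sample1-FC123`.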
Fix: fastqc/alignments/pools wait for collation
This was the cause of those pesky "Project_Lab/Sample_LP.../"
directories that were causing us to duplicate work.
Alignprocess.py skips library pools
If this is missing, use the default analysis dir.
We don't use the output from this anymore, preferring to run the megamap
pipeline or other analyses as appropriate.
5 participants