svohr commented Sep 19, 2025

This PR fixes the split file output by introducing a class that manages the PrintWriters for each proxy key and file type ("IBD" and "HBD"). It includes some suggested changes from a previous review. I tested this branch to confirm that the split files match the output from a single-threaded run and that using multiple threads decreases the wall-clock runtime.
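For readers who have not opened the diff, here is a minimal sketch of the idea, not the code in this branch. It assumes one writer per (proxy key, file type) pair, created lazily and stored in a concurrent map. The class name `SplitFileWriterManagerSketch`, the `writeLine` helper, and the `<proxyKey>.<fileType>` file naming are hypothetical; only `getWriterForProxyKey` is a name taken from this PR, and its real signature may differ.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: shares one PrintWriter per output file across threads.
final class SplitFileWriterManagerSketch implements AutoCloseable {
    private final Path outDir;
    private final Map<String, PrintWriter> writers = new ConcurrentHashMap<>();

    SplitFileWriterManagerSketch(Path outDir) {
        this.outDir = outDir;
    }

    // Returns the shared writer for this key and file type ("IBD" or "HBD"),
    // opening the file the first time it is requested.
    PrintWriter getWriterForProxyKey(String proxyKey, String fileType) {
        String name = proxyKey + "." + fileType; // assumed naming scheme
        return writers.computeIfAbsent(name, n -> {
            try {
                return new PrintWriter(Files.newBufferedWriter(outDir.resolve(n)));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    // Writes one complete line; the lock keeps lines from interleaving.
    void writeLine(String proxyKey, String fileType, String line) {
        PrintWriter w = getWriterForProxyKey(proxyKey, fileType);
        synchronized (w) {
            w.println(line);
        }
    }

    // Called once, after all worker threads have finished.
    @Override
    public void close() {
        writers.values().forEach(PrintWriter::close);
        writers.clear();
    }
}
```

In this sketch `computeIfAbsent` guarantees that two threads asking for the same key at the same time get the same writer, and the single `close()` call after detection finishes is the only place files are closed.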

Some notes:

  • Threads that call getWriterForProxyKey do not close the PrintWriters they receive, but nothing prevents them from doing so. A thread that closed one would break every other thread writing to the same file.
  • This PR does not include the temporary fix that ensures all possible output files are created. I don't know if it is needed beyond the ad hoc pipeline.
  • The homozygous output files contain the IDs and chromosome phase indexes for both haplotypes, which may not be necessary.
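On the first note, one possible guard (not part of this PR) would be to hand worker threads a PrintWriter subclass whose `close()` only flushes, and to keep the real close on a separate method that the manager calls. The names `NonClosingPrintWriter` and `closeForReal` below are hypothetical and only illustrate the idea.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical wrapper: close() is made harmless for worker threads.
final class NonClosingPrintWriter extends PrintWriter {
    NonClosingPrintWriter(Writer out) {
        super(out);
    }

    // A worker calling close() only flushes; the underlying stream stays open.
    @Override
    public void close() {
        flush();
    }

    // Package-private: intended to be called once by the managing class.
    void closeForReal() {
        super.close();
    }
}
```

With this wrapper, an accidental `close()` or a try-with-resources block in a worker would no longer affect other threads. Whether the extra class is worth it depends on how many call sites use getWriterForProxyKey.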

svohr added 4 commits August 28, 2025 16:30
Introduces a class SplitFileWriterManager that wraps the PrintWriters
required for the split file output so they can be accessed concurrently
and closed once after IBD detection is complete.