Merged
287 commits
6b71f0e
flip some defaults to dataproc
ch-kr Jul 28, 2025
27dc30a
add rwb
ch-kr Jul 28, 2025
10d69a4
update field selection
ch-kr Jul 28, 2025
78c465d
remove [0] because sample count is already an int
ch-kr Jul 28, 2025
505e8aa
transmute>annotate
ch-kr Jul 28, 2025
e695dd9
annotate back to transmute but move row fields definition
ch-kr Jul 28, 2025
aae780c
v4_COVERAGE_RELEASE > v4_AN_RELEASE
ch-kr Jul 28, 2025
d3a4018
switch the constant in the correct spot
ch-kr Jul 28, 2025
17595e8
remove reformat aou cov HT step (not really necessary)
ch-kr Jul 28, 2025
9768fe1
add missing n_partitions found during testing
ch-kr Jul 28, 2025
31a9ff6
update HT reads
ch-kr Jul 28, 2025
a59824d
add missing partition filter
ch-kr Jul 29, 2025
d4f0c6e
5>2
ch-kr Jul 29, 2025
83c8346
add todos noting that we will need to remove data from v4 samples
ch-kr Aug 7, 2025
47953af
Apply suggestions from code review
ch-kr Aug 21, 2025
d8a8bc8
remove meta in get_genomes_group_membership_ht call
ch-kr Aug 21, 2025
9ec017a
rename meta>meta_ht
ch-kr Aug 21, 2025
22a5d03
add annotations constants
ch-kr Aug 21, 2025
fbc285c
add annotations resource file
ch-kr Aug 21, 2025
f9adb5c
add downsampling resource
ch-kr Aug 21, 2025
2e38c1e
add downsampling HT write
ch-kr Aug 21, 2025
9f84f70
add downsamplings to get_genomes_group_membership_ht
ch-kr Aug 21, 2025
d19af22
remove indent to see if it fixes docs
ch-kr Aug 21, 2025
a0b4b89
remove parentheses
ch-kr Aug 21, 2025
e9861f1
outer > left
ch-kr Aug 21, 2025
cc0141f
add get_gnomad_v4_genomes_vds to basics
ch-kr Aug 21, 2025
34a705c
update get_gnomad_v4_genomes_vds import to be from v5 basics
ch-kr Aug 21, 2025
cf16085
make resource for group membership ht
ch-kr Aug 21, 2025
7eb7e93
add missing overwrites to check resources
ch-kr Aug 21, 2025
42e15d2
fix typo
ch-kr Aug 22, 2025
9911741
make meta and ds ht args
ch-kr Aug 22, 2025
b6d6708
update hard filtered samples removal from v4 vds
ch-kr Aug 22, 2025
9382320
add downsampling to build_freq_stratification_list
ch-kr Aug 22, 2025
024a870
add missed s
ch-kr Aug 22, 2025
a285a50
read row not global
ch-kr Aug 22, 2025
518fb00
add ds gen anc
ch-kr Aug 22, 2025
8b60a5b
add gen anc annotation
ch-kr Aug 22, 2025
ca293ac
update ds to include all gen anc groups
ch-kr Aug 22, 2025
ea34565
add ds_gen_anc_counts
ch-kr Aug 22, 2025
2945bc8
add transmute for v4
ch-kr Aug 22, 2025
19d454e
change downsampling test to be v4
ch-kr Aug 22, 2025
e7729a1
move downsampling HT read
ch-kr Aug 22, 2025
6184418
remove release filter
ch-kr Aug 22, 2025
3858b54
remove select
ch-kr Aug 22, 2025
daf3b14
add genomes filter
ch-kr Aug 22, 2025
d3dff28
add parens
ch-kr Aug 22, 2025
bc6d27b
remove pop idx transmute
ch-kr Aug 22, 2025
cc17b97
use project meta
ch-kr Aug 22, 2025
4cf8c88
fix project meta read
ch-kr Aug 22, 2025
dd83ebe
fix downsampling call
ch-kr Aug 22, 2025
2234f66
add release filter back
ch-kr Aug 22, 2025
6a26582
add release filter to ds
ch-kr Aug 22, 2025
bc17cfa
update sex karyotype annotation
ch-kr Aug 22, 2025
8d1208b
add note with TODOs
ch-kr Sep 10, 2025
81dc8c3
add note to todo about subtraction
ch-kr Sep 19, 2025
684aa34
rename downsampling
ch-kr Sep 29, 2025
f209f67
update todo
ch-kr Sep 29, 2025
c52856c
rename v4 vds function
ch-kr Sep 29, 2025
50661fe
Apply suggestions from code review
ch-kr Sep 29, 2025
2e420d3
Update gnomad_qc/v5/annotations/compute_coverage.py
ch-kr Sep 29, 2025
7a403e9
Merge branch 'kc/coverage' of https://github.com/broadinstitute/gnoma…
ch-kr Sep 29, 2025
a3ee5f4
remove unused resources from release resources
ch-kr Sep 29, 2025
c0450d6
remove unused imports
ch-kr Sep 29, 2025
475e871
add filter centromeres and telomeres
ch-kr Sep 29, 2025
ce4e6ed
update comment
ch-kr Sep 29, 2025
0f8f32f
add test for consistency
ch-kr Sep 29, 2025
a97c5d8
rename group_membership_ht to add aou
ch-kr Sep 29, 2025
dac1249
remove get gen anc ht import
ch-kr Sep 29, 2025
8aa622f
update import
ch-kr Sep 29, 2025
98770cb
fix get_downsampling call
ch-kr Sep 29, 2025
c39757b
Merge remote-tracking branch 'origin/main' into kc/coverage
ch-kr Sep 29, 2025
5fbd495
add comment to fix pylint
ch-kr Sep 29, 2025
7f656f5
another attempt to fix pylint
ch-kr Sep 29, 2025
ea02f3d
try noqa
ch-kr Sep 29, 2025
414de6f
cast to list because pylint
ch-kr Sep 29, 2025
5386b54
Merge remote-tracking branch 'origin/main' into kc/coverage
ch-kr Oct 8, 2025
6101ac7
add qual hist annotation resource, also update group_membership to ta…
ch-kr Oct 10, 2025
b8234ac
add s
ch-kr Oct 10, 2025
c8948ac
add code merging qual hists (untested)
ch-kr Oct 10, 2025
7f07b06
add note
ch-kr Oct 10, 2025
e61ecce
add another s
ch-kr Oct 10, 2025
10b18dd
rename coverage script and add new one for gnomad
ch-kr Oct 10, 2025
12aa5af
remove old coverage script (forgot to in previous commit)
ch-kr Oct 10, 2025
f6fecf1
update aou logger
ch-kr Oct 10, 2025
7bd50e9
add code to get gnomad group membership HT
ch-kr Oct 10, 2025
0633001
add test for group membership
ch-kr Oct 10, 2025
cf72d3b
fix typo
ch-kr Oct 10, 2025
0a4a138
fix import
ch-kr Oct 10, 2025
f830ebb
update git ignore to ignore python cache files
ch-kr Oct 10, 2025
caade71
import hail
ch-kr Oct 10, 2025
6e95196
update log path to enable local testing
ch-kr Oct 10, 2025
d4e3ff9
add missing parens
ch-kr Oct 10, 2025
67202ca
remove comment about group membership ht since that's been done in ot…
ch-kr Oct 10, 2025
94f0146
fix parenthesis
ch-kr Oct 10, 2025
818c843
finally fix logic
ch-kr Oct 10, 2025
81efb1b
rename aou coverage script back to general name
ch-kr Oct 10, 2025
5edf8ae
move gnomad code back into aou script
ch-kr Oct 10, 2025
cd533f1
rename group membership
ch-kr Oct 10, 2025
b5bb413
add env arg
ch-kr Oct 10, 2025
149857f
fix meta path for gnomad group membership
ch-kr Oct 10, 2025
89f2d26
remove gnomad coverage script (merged back into other script)
ch-kr Oct 10, 2025
c50d1c9
specify downsampling step is aou
ch-kr Oct 10, 2025
f87dbb9
cursor code to make sure gnomad is run in dataproc
ch-kr Oct 10, 2025
f7249e1
make message better
ch-kr Oct 10, 2025
2f89a60
add project arg
ch-kr Oct 10, 2025
60440ec
add comment
ch-kr Oct 10, 2025
6312f0f
move raw coverage and AN HT to annotations resources
ch-kr Oct 10, 2025
5fba029
update group membership code block in main
ch-kr Oct 10, 2025
aaf1370
add logger
ch-kr Oct 14, 2025
6599a58
first pass subtract consent samples in coverage function
ch-kr Oct 14, 2025
cb00701
clean up code slightly
ch-kr Oct 14, 2025
a356a5e
fix call to join_aou_and_gnomad_coverage_ht
ch-kr Oct 14, 2025
7555599
first pass subtract gnomad samples from AN
ch-kr Oct 14, 2025
382a6cf
add checkpoints
ch-kr Oct 14, 2025
c505b21
update function call
ch-kr Oct 14, 2025
526cd0a
small fixes
ch-kr Oct 14, 2025
526f49b
remove todo
ch-kr Oct 14, 2025
747c995
move _rename_cov_annotaitons and _merge_coverage_fields out so can ru…
ch-kr Oct 20, 2025
68c8877
create arg for merge gnomad cov step
ch-kr Oct 20, 2025
6667333
add aou filters for test
ch-kr Oct 20, 2025
8f5c5d4
add arg to merge gnomad AN
ch-kr Oct 20, 2025
b8620ab
cursor prompted changes to separate gnomAD AN merging from gnomAD+AoU…
ch-kr Oct 20, 2025
b70a10b
add testing code block to qual hists code
ch-kr Oct 20, 2025
b952ee9
project > project_name
ch-kr Oct 20, 2025
4d5cd95
remove redundant environment arg
ch-kr Oct 20, 2025
e5d24f2
fix typo
ch-kr Oct 20, 2025
05196ff
update metadata annotation to use v4 meta rather than v5 project meta
ch-kr Oct 20, 2025
be87fe2
add note
ch-kr Oct 20, 2025
5710646
move release annotation
ch-kr Oct 20, 2025
b50bc51
remove relateds to drop reference for dataproc
ch-kr Oct 20, 2025
f3f77c1
remove group membership from check
ch-kr Oct 20, 2025
b3f8a8d
add read for group membership
ch-kr Oct 20, 2025
67daf68
add sex karyotype field param
ch-kr Oct 20, 2025
a4e47ca
re-annotate sex
ch-kr Oct 20, 2025
91a91bb
update qual hist compute
ch-kr Oct 20, 2025
3db7e8b
fix sex karyotype annotation
ch-kr Oct 20, 2025
3f43980
fix missed qual hist annotation
ch-kr Oct 20, 2025
fbfdf16
add missing .path
ch-kr Oct 20, 2025
4c6051d
update read
ch-kr Oct 20, 2025
029a248
remove qual hists drop from gnomad table
ch-kr Oct 21, 2025
c8e1746
what the heck is rename_globals
ch-kr Oct 22, 2025
68996e5
remove asterisks
ch-kr Oct 22, 2025
894f84a
fix globals
ch-kr Oct 22, 2025
6e0259a
subtract>diff
ch-kr Oct 22, 2025
3a6f5e7
flip the direction of the join
ch-kr Oct 22, 2025
00a7d97
also flip other join
ch-kr Oct 22, 2025
9524c92
fix calls to qc_temp_prefix
ch-kr Oct 22, 2025
83330d3
remove unnecessary slashes
ch-kr Oct 22, 2025
2be5e13
update get_gnomad_v5_genomes_vds function
ch-kr Oct 22, 2025
ae4100b
add missing if block
ch-kr Oct 22, 2025
58b6402
remove unused import and no longer necessary hard filter block
ch-kr Oct 22, 2025
346071e
update get_gnomad_v5_vds call
ch-kr Oct 22, 2025
5eb1ea2
make test args mutually exclusive
ch-kr Oct 22, 2025
28ad718
reorder args slightly
ch-kr Oct 22, 2025
a491bb9
remove redundant code
ch-kr Oct 22, 2025
926d933
update logic again in get_gnomad_v5_genomes_vds
ch-kr Oct 22, 2025
6ebd58a
switch from v4 to v5 downsamplings
ch-kr Oct 22, 2025
29dc8df
fix genomes vds again
ch-kr Oct 22, 2025
7584624
add missing release_only back
ch-kr Oct 22, 2025
0f2e6a0
actually fix genomes vds logic this time
ch-kr Oct 22, 2025
fb66641
update genomes vds call
ch-kr Oct 22, 2025
856e22b
remove test from genomes vds call because 0 consent drop samples are …
ch-kr Oct 22, 2025
7ffbbb2
fix group membership for gnomad genomes
ch-kr Oct 22, 2025
19e5ac9
! -> =
ch-kr Oct 22, 2025
517cad1
actually fix group membership for gnomad genomes
ch-kr Oct 22, 2025
e998e3f
also filter gnomad genomes group membership to release
ch-kr Oct 22, 2025
14d54bd
gnomad > project
ch-kr Oct 22, 2025
b4ad18b
remove extra line
ch-kr Oct 22, 2025
84cde7b
fix group membership path issue
ch-kr Oct 22, 2025
900999c
remove unused table
ch-kr Oct 22, 2025
684c392
hard code test v5 meta
ch-kr Oct 22, 2025
841f1a7
update group membership again
ch-kr Oct 22, 2025
98d24f2
add release_only and annotate_meta to v5 vds read
ch-kr Oct 22, 2025
d6b6750
uncomment release_only and annotate_meta for v5 aou vds read
ch-kr Oct 22, 2025
f6ab5dc
update downsamplings
ch-kr Oct 23, 2025
6fc3dc1
update field name
ch-kr Oct 23, 2025
68a507b
rename fields again
ch-kr Oct 23, 2025
e28efa2
add filtering code back
ch-kr Oct 23, 2025
3ffb0cd
update meta ht for group membership
ch-kr Oct 23, 2025
0b2b04e
remove test arg from get_aou_vds
ch-kr Oct 23, 2025
77a6a7e
vds>vmt
ch-kr Oct 23, 2025
31c328e
vmt is variant data
ch-kr Oct 23, 2025
dd4257a
update sex_karyotype_field
ch-kr Oct 23, 2025
77ac5af
add DP
ch-kr Oct 23, 2025
c6dfef4
move release sample filter
ch-kr Oct 23, 2025
b1e273d
add VDS validation code
ch-kr Oct 23, 2025
61d7677
add sample to make test smaller
ch-kr Oct 23, 2025
0373daf
actually cut the sample size
ch-kr Oct 23, 2025
bfc493a
Merge remote-tracking branch 'origin/main' into kc/coverage
ch-kr Oct 24, 2025
82c5ff5
add annotate_adj to aou vds
ch-kr Oct 24, 2025
a34ffac
re-add DP
ch-kr Oct 24, 2025
4504d41
drop number of samples
ch-kr Oct 24, 2025
1e4f6ef
add missing parentheses
ch-kr Oct 24, 2025
fb8056c
revert merged gnomad coverage fields
ch-kr Oct 24, 2025
7d12e97
nest meta ht read
ch-kr Oct 24, 2025
c3bb9a1
add meta_ht = None
ch-kr Oct 24, 2025
34f4994
add missing clause to if
ch-kr Oct 24, 2025
5beb3ce
update total DP
ch-kr Oct 24, 2025
84ee7e4
add type
ch-kr Oct 24, 2025
7edad4c
fix syntax
ch-kr Oct 24, 2025
a49d65e
move transmute
ch-kr Oct 24, 2025
5d60cb9
move type change again
ch-kr Oct 24, 2025
5cee3e7
update globals
ch-kr Oct 24, 2025
265cf80
update globals again
ch-kr Oct 24, 2025
00c2291
update globals for AN table
ch-kr Oct 24, 2025
f2d6888
update globals again
ch-kr Oct 24, 2025
34ce1c3
update field renaming
ch-kr Oct 24, 2025
49b3c57
add missing n
ch-kr Oct 24, 2025
461b1e7
update strata meta annotation on gnomad an ht
ch-kr Oct 24, 2025
651dc14
update checkpoint path
ch-kr Oct 24, 2025
8094ecb
remove unused reference
ch-kr Oct 24, 2025
8d51779
update TODO
ch-kr Oct 24, 2025
ffb3622
add missing .ht()
ch-kr Oct 24, 2025
2685acf
add rekey
ch-kr Oct 24, 2025
b466be7
add another select
ch-kr Oct 24, 2025
d40ddf1
update histograms merge
ch-kr Oct 24, 2025
d2d6930
update hist merge
ch-kr Oct 24, 2025
8cf288c
update to only merge hist_alls, not hist_alts
ch-kr Oct 24, 2025
703afb9
remove restriction on dataproc-only reading of gnomad data resource
ch-kr Oct 24, 2025
7f2eb30
add missing _gnomad suffix
ch-kr Oct 24, 2025
676cff2
add another TODO
ch-kr Oct 24, 2025
9616d11
fix docstring
ch-kr Nov 6, 2025
e36895d
another docstring update
ch-kr Nov 6, 2025
fba384c
remove environment from coverage_and_an_path
ch-kr Nov 6, 2025
d1bf8e6
another docstring fix
ch-kr Nov 6, 2025
970401d
remove 'all' from file path
ch-kr Nov 6, 2025
947081b
release>release_only and consent_drop>consent_drop_only
ch-kr Nov 6, 2025
54ebefd
add note
ch-kr Nov 6, 2025
98fc89e
update docstring
ch-kr Nov 6, 2025
122ae8b
update to reflect unnested sex karyotype
ch-kr Nov 6, 2025
dda30c0
update call to get_gnomad_v5_genomes_vds
ch-kr Nov 6, 2025
c470f73
update call to coverage_and_an_path
ch-kr Nov 6, 2025
2debd46
remove is defined
ch-kr Nov 7, 2025
c5cfb3c
move resource check
ch-kr Nov 7, 2025
831c2e5
add checks for project name
ch-kr Nov 7, 2025
8e5dfd4
add quick note to docstring
ch-kr Nov 7, 2025
e3eaab0
remove merge_gnomad from docstring
ch-kr Nov 7, 2025
1112d23
move merged gnomad coverage ht write
ch-kr Nov 7, 2025
fc8f4b7
add resource check
ch-kr Nov 7, 2025
fd6e152
move gnomad an ht write to main and add resource check
ch-kr Nov 7, 2025
01d3984
add project checks
ch-kr Nov 7, 2025
c012521
move arg checks
ch-kr Nov 7, 2025
0888e6f
replace fold with sum
ch-kr Nov 7, 2025
8e0b3a2
add project prefix
ch-kr Nov 7, 2025
0ad7400
fix silly error
ch-kr Nov 7, 2025
262fa1f
Merge remote-tracking branch 'origin/main' into kc/coverage
ch-kr Nov 7, 2025
e502f5b
add project prefix to reference data
ch-kr Nov 12, 2025
2e23940
Merge remote-tracking branch 'origin/main' into kc/coverage
ch-kr Nov 13, 2025
101daa3
update meta import in coverage script
ch-kr Nov 13, 2025
d2bb122
add meta import to basics
ch-kr Nov 13, 2025
5 changes: 5 additions & 0 deletions .gitignore
@@ -1,2 +1,7 @@
 hail-*.log
 .DS_Store
+
+# Python cache files
+*.pyc
+*.pyo
+__pycache__/
1 change: 1 addition & 0 deletions gnomad_qc/v5/annotations/__init__.py
@@ -0,0 +1 @@
+# noqa: D104