QTLformer

Nextflow pipeline that transforms QTL fine-mapping results into the gentropy format. It processes the eQTL Catalogue fine-mapping results into the gentropy StudyIndex and StudyLocus formats.

Dependencies

Nextflow
Google Cloud SDK
Docker (testing)
nf-test (testing)

Flow

The pipeline has the following steps:

Build manifest of fine-mapping results to be processed.
Transform the fine-mapping results into gentropy format.

Usage

Check the execution with the test profile. This configuration uses small test datasets stored in the testdata/susie and testdata/sumstats folders.

make test-qtlformer # Run nf-test command

Note

Make sure Java 17 or later is installed and set as the default Java version, as Nextflow depends on it.
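The Java requirement above can be checked before launching anything. The following is a minimal sketch, not part of the repository: the helper name `java_major_version` is hypothetical, and it assumes the usual `java -version` banner in which the version appears in double quotes.

```python
import re


def java_major_version(version_output: str) -> int:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major
```

`java -version` writes its banner to stderr, so one way to feed this helper is `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr`.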

A single configuration, for Google Cloud, is implemented for running the full pipeline:

nextflow run main.nf -profile googleCloud

To test on a sample dataset in Google Cloud:

nextflow run main.nf -profile googleCloudTest

NOTE: Make sure to set up the dataset in the referenced cloud buckets before running the cloud test.

Inputs

The pipeline expects the datasets to be in the following structure:

testdata/susie
└── QTS000001
    ├── QTD000001
    │   ├── QTD000001.credible_sets.parquet
    │   └── QTD000001.lbf_variable.parquet
    └── QTD000002
        ├── QTD000002.credible_sets.parquet
        └── QTD000002.lbf_variable.parquet

testdata/sumstats
└── QTS000001
    ├── QTD000001
    │   └── QTD000001.cc.parquet
    └── QTD000002
        └── QTD000002.cc.parquet

Note

The pipeline expects the input datasets to be organized in the structure shown above. The testdata/susie folder is used for the transformations. The summary statistics under testdata/sumstats are used only to extract the path to the summary statistics file for each dataset, which is later stored in the StudyIndex dataset.
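To illustrate how this layout can be turned into one record per dataset, here is a minimal sketch. It is not the pipeline's manifest step: the function name `build_manifest_rows` and the row keys (`study_id`, `dataset_id`, `credible_sets`, `lbf_variable`, `sumstats`) are assumptions made for illustration, and the real manifest.tsv columns may differ.

```python
from pathlib import Path


def build_manifest_rows(susie_root: Path, sumstats_root: Path) -> list[dict[str, str]]:
    """Collect one row per dataset found under <susie_root>/<study>/<dataset>/."""
    rows = []
    for dataset_dir in sorted(susie_root.glob("*/*")):
        if not dataset_dir.is_dir():
            continue
        study_id, dataset_id = dataset_dir.parent.name, dataset_dir.name
        credible_sets = dataset_dir / f"{dataset_id}.credible_sets.parquet"
        lbf_variable = dataset_dir / f"{dataset_id}.lbf_variable.parquet"
        sumstats = sumstats_root / study_id / dataset_id / f"{dataset_id}.cc.parquet"
        # Skip datasets that lack either fine-mapping file (an assumption of this sketch).
        if not (credible_sets.is_file() and lbf_variable.is_file()):
            continue
        rows.append(
            {
                "study_id": study_id,
                "dataset_id": dataset_id,
                "credible_sets": str(credible_sets),
                "lbf_variable": str(lbf_variable),
                # Only the path is recorded; the summary statistics are never read.
                "sumstats": str(sumstats) if sumstats.is_file() else "",
            }
        )
    return rows
```

Calling `build_manifest_rows(Path("testdata/susie"), Path("testdata/sumstats"))` on the test data shown above would return two rows, one for QTD000001 and one for QTD000002.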

Outputs

testdata/output
├── manifest.tsv
├── pipeline_info
│   ├── execution_report
│   │   └── execution_report_2025-11-25_23-04-39.html
│   ├── execution_timeline
│   │   └── execution_timeline_2025-11-25_23-04-39.html
│   ├── execution_trace
│   │   └── execution_trace_2025-11-25_23-04-39.txt
│   └── pipeline_dag
│       └── pipeline_dag_2025-11-25_23-04-39.html
├── study_index
│   ├── QTD000001
│   │   ├── part-00000-3b01f93a-d726-4503-9562-3f7a9e0fe512-c000.snappy.parquet
│   │   └── _SUCCESS
│   └── QTD000002
│       ├── part-00000-d0c6c8e6-28e1-4745-a90b-c62576b33b0f-c000.snappy.parquet
│       └── _SUCCESS
└── study_locus
    ├── QTD000001
    │   ├── part-00000-d94cc26a-436c-4ca0-974d-0891bd5ab774-c000.snappy.parquet
    │   └── _SUCCESS
    └── QTD000002
        ├── part-00000-4fd76204-ca41-45b6-ba75-abee2c1c7ae0-c000.snappy.parquet
        └── _SUCCESS

StudyIndex and StudyLocus datasets

The study_index and study_locus folders contain parquet files, saved in one folder per dataset_id (QTD000001, QTD000002).
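Because each dataset folder holds Spark-style output (part files plus a _SUCCESS marker), a quick completeness check is possible with the standard library alone. The sketch below is an illustration rather than part of the pipeline, and the helper name `completed_datasets` is hypothetical.

```python
from pathlib import Path


def completed_datasets(output_root: Path, table: str) -> list[str]:
    """Return dataset IDs under <output_root>/<table>/ that look fully written."""
    table_dir = output_root / table
    done = []
    for dataset_dir in sorted(p for p in table_dir.iterdir() if p.is_dir()):
        # The _SUCCESS marker is written once the write job has finished.
        has_marker = (dataset_dir / "_SUCCESS").is_file()
        has_parts = any(dataset_dir.glob("part-*.parquet"))
        if has_marker and has_parts:
            done.append(dataset_dir.name)
    return done
```

For example, `completed_datasets(Path("testdata/output"), "study_locus")` lists the dataset folders that contain both a marker and at least one part file.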

Manifest

The manifest.tsv file contains the list of processed datasets.
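The manifest can be inspected with a few lines of Python. This sketch assumes that manifest.tsv is tab-separated with a header row; the column names are not documented here, so the code treats them generically, and `read_manifest` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def read_manifest(path: Path) -> list[dict[str, str]]:
    """Read a tab-separated manifest into one dict per row, keyed by the header row."""
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

`len(read_manifest(Path("testdata/output/manifest.tsv")))` then gives the number of processed datasets, provided the header-row assumption holds.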

Pipeline_info

Metrics and reports from the pipeline run.

Testing qtlformer tools package

To run the tests for the qtlformer tools package, navigate to the tools directory and execute the following command:

uv sync --all-groups && uv run pytest

Note

Make sure Java 11 is installed and set as the default Java version, as PySpark depends on it.
