NUBE is an Ultrafast Bioinformatics calculation Engine. It is the ECDC genomic analysis calculation system.

EU-ECDC/ecdc_nube

NUBE

Introduction

NUBE (NUBE is an Ultrafast Bioinformatic analyses calculation Engine) is the ECDC's calculation engine for genomic analyses. It relies on containerization and version control to ensure reproducibility, and on a workflow management system to guarantee portability, orchestration, parallelization, and robustness. The system can leverage cloud infrastructure for scalability, although this is not a strict requirement. NUBE is also modular by design, allowing it to be adapted to different analytical needs.

The current implementation relies on Docker for containerization, Git for version control, Nextflow as the workflow management system and Microsoft Azure as the cloud provider.

A modular system

NUBE follows a modular design. Currently there is a main module (main.nf), which handles genome assembly and cgMLST allele calling. Other specialised modules cover other types of analyses, including QC, AMR and advanced typing; these can be found in the modules/ folder. The idea is that users should be able to strip out part of the workflow and use it for individual tasks or as a base for ad-hoc analyses.

Signals

Analyses are triggered through signals: .json files that follow a defined specification. NUBE constantly checks for signals appearing in data/signals/ and triggers the appropriate analyses based on the type of input data and the expected outputs, which are defined in the "experiment_list" portion of the signal. Here is an example of a signal:

{
    "C834F03B-3894-5360-8C00-EABC8F3064D8": {
        "sequencing_technology": [
            "ILLUMINA"
        ],
        "project": "LEGIISO",
        "organism": "LEGIISO",
        "experiment_list": [
            "assembly",
            "allele_call",
            "qc"
        ],
        "reads": [
            [
                "S3|ecdc-epc-prod/LEGIISO/FR/025171512401_S11_R1.fastq.gz",
                "S3|ecdc-epc-prod/LEGIISO/FR/025171512401_S11_R2.fastq.gz"
            ]
        ]
    }
}
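
To make the signal structure concrete, here is a minimal Python sketch of how a signal like the one above could be read and checked before any analysis is dispatched. It is not NUBE's actual implementation (which runs on Nextflow); the function names `parse_signal` and `scan_signals` are hypothetical, and the list of required keys is an assumption taken from the example rather than from the signal specification.

```python
import json
from pathlib import Path
from typing import Iterator

# Assumption: these are simply the keys that appear in the example signal.
# The real specification may require more, fewer, or different fields.
REQUIRED_KEYS = (
    "sequencing_technology",
    "project",
    "organism",
    "experiment_list",
    "reads",
)


def parse_signal(text: str) -> list[dict]:
    """Turn one signal document into a flat list of per-sample records."""
    signal = json.loads(text)
    records = []
    # The top-level keys of a signal are sample identifiers.
    for sample_id, entry in signal.items():
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"{sample_id}: missing keys {missing}")
        records.append(
            {
                "sample_id": sample_id,
                "organism": entry["organism"],
                # The requested outputs, e.g. ["assembly", "allele_call", "qc"].
                "experiments": list(entry["experiment_list"]),
                # "reads" is a list of lists of file locations.
                "reads": [list(group) for group in entry["reads"]],
            }
        )
    return records


def scan_signals(signal_dir: Path) -> Iterator[dict]:
    """Yield records from every .json file found in signal_dir (one pass)."""
    for path in sorted(signal_dir.glob("*.json")):
        yield from parse_signal(path.read_text(encoding="utf-8"))
```

A caller could run `for record in scan_signals(Path("data/signals")):` and choose which analyses to start from `record["experiments"]`. A continuously running watcher would repeat this scan and keep track of signals it has already handled, which this sketch leaves out.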
