✨ New documentation: https://cnag-biomedical-informatics.github.io/beacon2-cbi-tools
🐳 Docker Hub Image: https://hub.docker.com/r/manuelrueda/beacon2-cbi-tools/tags
🚫 Legacy B2RI Documentation: https://b2ri-documentation.readthedocs.io/
Actively maintained by CNAG Biomedical Informatics
Note: This repository was formerly known as beacon2-ri-tools (Beacon v2 Reference Implementation). It has been renamed to beacon2-cbi-tools (CNAG Biomedical Informatics) to better reflect its identity under CNAG.
beacon2-cbi-tools is a suite of tools originally developed as part of the ELIXIR–Beacon v2 Reference Implementation and now maintained by CNAG Biomedical Informatics. It provides essential functionality around the Beacon Friendly Format (BFF), a data exchange format, including:
- Validating XLSX/JSON files against Beacon v2 schemas
- Converting VCF and microarray files into BFF (genomicVariations)
- Loading BFF data (metadata and genomic variations) into MongoDB
This toolkit streamlines data preparation, validation, and ingestion for federated genomic and phenotypic data sharing under Beacon v2. The resulting BFF-formatted data can be used with any implementation of the Beacon v2 API specification that operates on MongoDB.
BFF-Tools script (bin/bff-tools):
A command-line tool for converting VCF and SNP microarray data into BFF, validating metadata, and loading the resulting BFF data into a MongoDB instance.
The tool offers five modes:
- vcf: Convert a VCF.gz file into BFF format.
- 🆕 tsv: Convert a SNP microarray file (e.g., from 23andMe) into BFF format.
- load: Load BFF-formatted data into a MongoDB instance.
- full: Perform both TSV/VCF conversion and MongoDB loading.
- validate: Validate XLSX or JSON metadata against Beacon v2 schemas and serialize into BFF. An Excel template is provided to help structure your metadata.
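As a rough illustration of how the modes map onto input types, the sketch below picks a mode name from a file name. It is not part of bff-tools: the function `suggest_mode`, its `load_into_mongodb` flag, and the extension mapping are all hypothetical, and only the mode names (vcf, tsv, full, validate) come from the list above. The load mode is left out because it takes BFF data that has already been produced.

```python
from pathlib import Path


def suggest_mode(filename: str, load_into_mongodb: bool = False) -> str:
    """Return the bff-tools mode name that fits an input file.

    Hypothetical helper: the extension mapping is an illustrative
    assumption, not the tool's own detection logic.
    """
    name = Path(filename).name.lower()

    # XLSX/JSON metadata is validated and serialized into BFF.
    if name.endswith((".xlsx", ".json")):
        return "validate"

    if name.endswith(".vcf.gz"):
        conversion = "vcf"
    elif name.endswith(".tsv"):
        conversion = "tsv"
    else:
        raise ValueError(f"no bff-tools mode matches {filename!r}")

    # 'full' chains the conversion with MongoDB loading in a single run.
    return "full" if load_into_mongodb else conversion


if __name__ == "__main__":
    for example in ("cohort.vcf.gz", "genome.tsv", "metadata.xlsx"):
        print(example, "->", suggest_mode(example))
```

The helper returns only a mode name. The options each mode accepts are described in the project documentation and are deliberately not reproduced here.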
A collection of support tools to aid in data ingestion. Key among them:
- A web application for interactive visualization of BFF data, particularly genomicVariations and individuals.
- A simple API and web application to query BFF data via MongoDB.
A synthetic dataset for testing and demonstration purposes.
                    * Beacon v2 - CBI Tools *

                    ____________
            XLSX    |          |
             or     | Metadata | (incl. Phenotypic data)
            JSON    |__________|
 _________                |
 |       |                |
 |  TSV  |                | bff-tools validate
 |_______|                |
     |                    |                               Beacon v2
     | bff-tools tsv      |
 ____v____            ____v____           ____________          _______
 |       |            |       |           |          |          |     | <---- Request
 |  VCF  | ---------> |  BFF  | --------> | Database | <------> | API |
 |_______|            |_______|           |__________|          |_____| ----> Response
                          |                 MongoDB
         bff-tools vcf    |  bff-tools load
                          |
                          |
                  Optional (utils)
                          |
                     _____v_____
                     |         |
                     | utils/  |
                     |  bff-   |
                     | browser |  Visualization
                     | (beta)  |
                     |_________|

 --------------------------------------------------------|||--------------------
 beacon2-cbi-tools                                           e.g. beacon2-ri-api
                                                                  beacon2-pi-api
                                                                  java-beacon-v2.api
                                                                  ...
Latest Update: May-2025
This repository has been widely adopted in Beacon v2 implementations and is also used internally at CNAG. As a result, we plan to continue its development. Some of our upcoming plans include:
- Implement Beacon 2.x specification changes
  - For VCF: Adopt VRS nomenclature and transition away from LegacyVariation. Support for structural variants may be added.
  - For other entities: Align with the latest schema used in the BFF Validator and the Excel metadata template.
- Update the CINECA Synthetic Cohort dataset.
You can install beacon2-cbi-tools using one of two methods:
- Docker (containerized): follow the Docker guide for a streamlined setup.
- Manual (non-containerized): follow the manual installation instructions.
The author requests that any published work that uses these tools include a citation to the following reference:
Rueda M, Ariosa R. "Beacon v2 Reference Implementation: a toolkit to enable federated sharing of genomic and phenotypic data". Bioinformatics, btac568, https://doi.org/10.1093/bioinformatics/btac568
Written by Manuel Rueda, PhD. Info about CNAG Biomedical Informatics can be found at https://www.cnag.eu
The software in this repository is copyrighted. See the LICENSE file included in this distribution.