FALCON2 is an ultra-fast method to infer the metagenomic composition of sequenced reads. It measures the similarity between any FASTQ (or FASTA) file, regardless of its size, and any multi-FASTA database, such as the entire set of complete genomes from the NCBI. FALCON2 supports single reads, paired-end reads, and combinations of both. It has been tested on many platforms, such as Illumina MiSeq, HiSeq, NovaSeq, and Ion Torrent.
FALCON2 efficiently detects the presence of a given species in the FASTQ reads and authenticates it. The core of the method is based on relative data compression. FALCON2 uses variable multi-threading without multiplying the memory for each thread, so it can run efficiently on a common laptop.
The tool is also able to identify locally where, in each reference sequence, the similarity occurs. FALCON2 provides subcommands to filter the local results (filter), visualize the results (fvisual), perform database inter-similarity analysis (inter), and visualize inter-similarities (ivisual).
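To give a feel for the relative-compression idea, the sketch below uses gzip as a stand-in compressor. This is not FALCON2's own model, only an illustration: it estimates how many extra compressed bytes a target sequence costs once a reference has already been seen. The helper names (gen_dna, csize, rel_cost) and the random toy sequences are assumptions made for this example.

```bash
#!/bin/sh
# Illustration only: gzip stands in for FALCON2's compression models.

# gen_dna SEED LENGTH: pseudo-random DNA string (toy data, not real genomes).
gen_dna() {
  awk -v seed="$1" -v n="$2" 'BEGIN {
    srand(seed)
    for (i = 0; i < n; i++)
      printf "%s", substr("ACGT", int(rand() * 4) + 1, 1)
    print ""
  }'
}

# csize: number of bytes after gzip-compressing standard input.
csize() { gzip -c | wc -c | tr -d ' '; }

# rel_cost REF TARGET: extra compressed bytes needed for TARGET once REF is known.
rel_cost() {
  ref_only=$(printf '%s' "$1" | csize)
  both=$(printf '%s%s' "$1" "$2" | csize)
  echo $((both - ref_only))
}

ref=$(gen_dna 1 2000)
similar=$ref                  # target identical to the reference
unrelated=$(gen_dna 2 2000)   # target generated independently of the reference

echo "similar:   $(rel_cost "$ref" "$similar") extra bytes"
echo "unrelated: $(rel_cost "$ref" "$unrelated") extra bytes"
```

A target that resembles the reference needs far fewer extra bytes than an unrelated one, which is the intuition behind ranking database sequences by how well the reads explain them.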
git clone https://github.com/cobilab/FALCON2.git
cd FALCON2/src/
cmake .
make
cp FALCON2 ../
cd ../
CMake is needed for installation.
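Since the build depends on CMake, a small pre-flight check can save a failed build. The sketch below is one possible way to do it; the helper name require_tools is an assumption, and the tool list (git, cmake, make) is simply taken from the commands above.

```bash
#!/bin/sh
# require_tools TOOL...: report each missing tool; return non-zero if any is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Tools used by the installation steps above.
if require_tools git cmake make; then
  echo "all build tools found"
else
  echo "install the missing tools before running cmake" >&2
fi
```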
Search for the 15 most similar viruses in the sample reads provided in the test folder:
cd test
./FALCON2 meta -v -F -t 15 -l 47 -x top.txt test/reads.fq.gz test/VDB.fa.gz
It will identify Zaire Ebolavirus in the samples (top.txt), as shown in the following image.
An example of building a reference database from NCBI:
# Download reference genomes from NCBI (append <organism> as an argument; defaults to "viruses" if none is provided)
https://raw.githubusercontent.com/cobilab/FALCON2/main/utils/download_references_ncbi.sh
# Use process_gz_files.sh for compressed files (It will concatenate all .gz files)
https://raw.githubusercontent.com/cobilab/FALCON2/main/utils/process_gz_files.sh
# Alternative: Manual concatenation from decompressed files
cat /path/to/reference_fastas/*.fna > input-sequences.fna
To build reference databases for multiple domains/kingdoms (bacteria, fungi, protozoa, plants, etc.), use:
https://raw.githubusercontent.com/cobilab/gto/master/scripts/gto_build_dbs.sh
A ready-made viral reference database is available here. With this example, you don't need to decompress it; use it directly in FALCON2 along with the FASTQ reads.
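Concatenating compressed files directly, as mentioned above for the .gz case, works because a gzip stream may consist of several members placed back to back. The following sketch shows this with two tiny made-up FASTA records in a temporary directory; the file names and sequences are placeholders, not part of FALCON2.

```bash
#!/bin/sh
# Toy demonstration: concatenated gzip files decompress as one stream.
workdir=$(mktemp -d)

# Two tiny, made-up FASTA records, each compressed separately.
printf '>seqA\nACGTACGT\n' | gzip -c > "$workdir/a.fna.gz"
printf '>seqB\nTTGGCCAA\n' | gzip -c > "$workdir/b.fna.gz"

# Concatenate the compressed files without decompressing them.
cat "$workdir"/*.fna.gz > "$workdir/db.fa.gz"

# Count the FASTA headers in the combined database.
n_records=$(gzip -dc "$workdir/db.fa.gz" | grep -c '^>')
echo "records in combined database: $n_records"
```

The same pattern should apply to a directory of downloaded genomes, with the real paths substituted for the temporary ones.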
FALCON2 is a unified tool with multiple subcommands:
- FALCON2 meta: metagenomic composition analysis (main FALCON functionality);
- FALCON2 filter: local interactions - localization;
- FALCON2 fvisual: visualization of global and local similarities;
- FALCON2 inter: inter-similarity between database genomes;
- FALCON2 ivisual: visualization of inter-similarities.
To see all available commands:
./FALCON2
or
./FALCON2 -h
This will display:
COMMANDS
meta - Infer metagenomic sample composition
(Main FALCON functionality)
filter - Filter and segment regions identified by FALCON
fvisual - Create visualization of filtered regions
inter - Evaluate similarity of genomes
ivisual - Create heatmap visualization of genome similarities
Use 'FALCON2 <command> -h' for help with a specific command.
To see the possible options of FALCON2 meta:
./FALCON2 meta -h
This will print the following options:
Non-mandatory arguments:
-h, --help show this help message
-F, --force overwrite output files
-V, --version display version and exit
-v, --verbose verbose mode (more information)
-Z, --local database local similarity
-s, --show show compression levels
-l, --level <level> compression level [1;47]
-p, --sample <rate> subsampling (default: 1)
-t, --top <num> top of similarity (default: 20)
-n, --nThreads <num> number of threads (default: 2)
-x, --output <file> similarity top filename
-y, --profile <file> profile filename (-Z must be on)
-S, --save-model save models after learning
-L, --load-model load models previously saved model
-M, --model-file <file> model filename
-I, --model-info model info
-T, --train-model train model only (no inference)
(Attention!) Is expected to only receive
the first file group (FASTQ)
Mandatory arguments:
[FILE1]:[FILE2]:... metagenomic filename (FASTQ),
Use ":" for splitting files.
[FILE1]:[FILE2]:... database filename (Multi-FASTA).
Use ":" for splitting files.
MAGNET integration:
-mg, --magnet enable MAGNET filtering
Mandatory arguments:
-mf, --magnet-filter <file> FASTA reference for filtering
Non-mandatory arguments:
-mv, --magnet-verbose verbose mode (more information)
-mt <val> similarity threshold [0.0;1.0] (default: 0.9)
-ml <val> sensitivity level [1;44] (default: 36)
-mi, --magnet-invert invert filter
-mp <val> portion of acceptance (default: 1)
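Both mandatory arguments accept several files joined with ":", which is how paired-end reads are passed. As a small convenience, the sketch below joins file names into that form; the helper name join_files is an assumption, and the command is only printed, not executed.

```bash
#!/bin/sh
# join_files FILE...: join file names with ":" as expected by FALCON2 meta.
# The function body runs in a subshell, so the IFS change does not leak out.
join_files() (
  IFS=:
  printf '%s\n' "$*"
)

# Example file names; replace them with real paths.
reads=$(join_files reads1.fq reads2.fq)
echo "./FALCON2 meta -v -F -l 47 -Z -y profile.com $reads VDB.fa"
```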
Example usage:
./FALCON2 meta -v -F -l 47 -Z -y profile.com reads1.fq:reads2.fq VDB.fa
For local interaction detection and visualization, FALCON2 provides the filter and fvisual subcommands.
To see the possible options of FALCON2 filter:
./FALCON2 filter -h
This will print the following options:
Non-mandatory arguments:
-h give this help
-F force mode (overwrites top file)
-V display version number
-v verbose mode (more information)
-s <size> filter window size
-w <type> filter window type
-x <sampling> filter window sampling
-sl <lower> similarity lower bound
-su <upper> similarity upper bound
-dl <lower> size lower bound
-du <upper> size upper bound
-t <threshold> threshold [0;2.0]
-o <FILE> output segmented filename
Mandatory arguments:
[FILE] profile filename (from FALCON2 meta).
Example usage:
./FALCON2 filter -v -F -t 0.5 -o positions.pos profile.com
To see the possible options of FALCON2 fvisual:
./FALCON2 fvisual -h
This will print the following options:
Non-mandatory arguments:
-h give this help
-F force mode (overwrites top file)
-V display version number
-v verbose mode (more information)
-w <width> square width (for each value)
-s <ispace> square inter-space (between each value)
-i <indexs> color index start
-r <indexr> color index rotations
-u <hue> color hue
-sl <lower> similarity lower bound
-su <upper> similarity upper bound
-dl <lower> size lower bound
-du <upper> size upper bound
-g <color> color gamma
-e <size> enlarge painted regions
-bg show only the best of group
-ss do NOT show global scale
-sn do NOT show names
-o <FILE> output image (SVG) filename
Mandatory arguments:
[FILE] segmented filename (from FALCON2 filter).
Example usage:
./FALCON2 fvisual -v -F -o map.svg positions.pos
To see the possible options of FALCON2 inter:
./FALCON2 inter -h
This will print the following options:
Non-mandatory arguments:
-h give this help
-V display version number
-v verbose mode (more information)
-s show compression levels
-l <level> compression level [1;30]
-n <nThreads> number of threads
-x <FILE> similarity matrix filename
-o <FILE> labels filename
Mandatory arguments:
[FILE]:[FILE]:[...] input files (last arguments).
Use ":" for file splitting.
Example usage:
./FALCON2 inter -v file1.fa:file2.fa:file3.fa
To see the possible options of FALCON2 ivisual:
./FALCON2 ivisual -h
This will print the following options:
Non-mandatory arguments:
-h give this help
-V display version number
-v verbose mode (more information)
-w square width (for each value)
-a square inter-space (between each value)
-s index color start
-r index color rotations
-u color hue
-g color gamma
-l <FILE> labels filename
-x <FILE> heatmap filename
Mandatory arguments:
[FILE] input matrix file (from FALCON2 inter).
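Since ivisual consumes the matrix produced by inter, the two steps can be chained. The sketch below uses only option names from the two help listings above (-x and -o for inter, -l and -x for ivisual); the output file names, the run helper, and the DRY_RUN switch are assumptions for illustration. By default it only prints the commands.

```bash
#!/bin/sh
# Sketch: chain 'inter' and 'ivisual' using options listed in their help texts.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# inter_heatmap GENOMES: GENOMES is a ":"-separated list of FASTA files.
inter_heatmap() {
  run ./FALCON2 inter -v -x matrix.txt -o labels.txt "$1" &&
  run ./FALCON2 ivisual -v -l labels.txt -x heatmap.svg matrix.txt
}

inter_heatmap file1.fa:file2.fa:file3.fa
```

Setting DRY_RUN=0 would execute the commands, assuming the FALCON2 binary is in the current directory.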
Example usage:
./FALCON2 ivisual -F -l labels.txt -o heatmap.svg matrix.txt
Create the following bash script:
#!/bin/bash
# $1: reads (use ":" to join several files), $2: multi-FASTA database
./FALCON2 meta -v -n 4 -t 200 -F -Z -l 47 -y complexity.com "$1" "$2"
# Segment the local similarity profile
./FALCON2 filter -v -F -t 0.5 -o positions.pos complexity.com
# Draw the segmented regions as an SVG image
./FALCON2 fvisual -v -F -o draw.svg positions.pos
Name it FALCON2-meta.sh and make it executable:
chmod +x FALCON2-meta.sh
Then, run FALCON2:
./FALCON2-meta.sh reads1.fastq:reads2.fastq VDB.fa
reads1.fastq, reads2.fastq, and VDB.fa are only examples.
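Because the script passes its arguments straight to FALCON2, a mistyped path only shows up once the run starts. One option is to check the inputs first; the sketch below splits each ":"-separated argument and verifies that every file exists. The function name check_inputs is an assumption and not part of FALCON2.

```bash
#!/bin/sh
# check_inputs LIST...: each LIST is a ":"-separated group of files.
# Prints every missing file and returns non-zero if any is missing.
# The body runs in a subshell, so IFS and 'set -f' do not affect the caller.
check_inputs() (
  set -f          # no globbing while splitting on ":"
  IFS=:
  status=0
  for list in "$@"; do
    for f in $list; do
      [ -f "$f" ] || { echo "missing input: $f" >&2; status=1; }
    done
  done
  exit "$status"
)

# Example call with the placeholder names used above.
check_inputs reads1.fastq:reads2.fastq VDB.fa ||
  echo "fix the paths listed above before running the pipeline"
```

In FALCON2-meta.sh, a line such as check_inputs "$1" "$2" || exit 1 before the first FALCON2 call would stop the script early.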
FALCON2 introduces the ability to save and load trained models for faster subsequent analyses:
# Train and save a model
./FALCON2 meta -v -l 47 -S -M mymodel.fcm -T reads.fq
# Load a previously trained model
./FALCON2 meta -v -l 47 -L -M mymodel.fcm reads.fq VDB.fa
Options:
- -S, --save-model: save models after learning;
- -L, --load-model: load a previously saved model;
- -M, --model-file <file>: specify the model filename;
- -I, --model-info: display model information;
- -T, --train-model: train the model only (no inference).
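A typical pattern is to train once and reuse the model afterwards. The sketch below picks between the two command forms shown above depending on whether the model file already exists; the function name model_command is an assumption, and the command is printed rather than executed.

```bash
#!/bin/sh
# model_command MODEL READS DB: print the FALCON2 meta command to use.
# If MODEL exists it is loaded (-L); otherwise it is trained and saved (-S ... -T).
model_command() {
  if [ -f "$1" ]; then
    echo "./FALCON2 meta -v -l 47 -L -M $1 $2 $3"
  else
    echo "./FALCON2 meta -v -l 47 -S -M $1 -T $2"
  fi
}

model_command mymodel.fcm reads.fq VDB.fa
```

Note that the training form performs no inference, so a second call is needed once the model file has been written.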
FALCON2 now integrates MAGNET filtering for enhanced read processing:
./FALCON2 meta -v -l 47 -mg -mf reference.fa -mt 0.9 -ml 36 reads.fq VDB.fa
Options:
- -mg, --magnet: enable MAGNET filtering;
- -mf, --magnet-filter <file>: FASTA reference for filtering (mandatory with -mg);
- -mv, --magnet-verbose: verbose mode for MAGNET;
- -mt <val>: similarity threshold [0.0;1.0] (default: 0.9);
- -ml <val>: sensitivity level [1;44] (default: 36);
- -mi, --magnet-invert: invert filter;
- -mp <val>: portion of acceptance (default: 1).
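The ranges listed for -mt and -ml can be checked before a long run is launched. The sketch below is a hypothetical helper (valid_magnet is not part of FALCON2) that accepts a threshold in [0.0;1.0] and an integer level in [1;44]; how FALCON2 itself reacts to out-of-range values is not covered here.

```bash
#!/bin/sh
# valid_magnet THRESHOLD LEVEL: succeed only if THRESHOLD is a number in
# [0.0;1.0] and LEVEL is an integer in [1;44], as listed in the options above.
valid_magnet() {
  awk -v t="$1" -v l="$2" 'BEGIN {
    ok = (t ~ /^[0-9]*\.?[0-9]+$/) && (t + 0 >= 0) && (t + 0 <= 1) &&
         (l ~ /^[0-9]+$/) && (l + 0 >= 1) && (l + 0 <= 44)
    exit ok ? 0 : 1
  }'
}

# Print the command only when the values are within range.
if valid_magnet 0.9 36; then
  echo "./FALCON2 meta -v -l 47 -mg -mf reference.fa -mt 0.9 -ml 36 reads.fq VDB.fa"
else
  echo "MAGNET parameters out of range" >&2
fi
```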
For any issue, let us know through the issues link.
GPL v3.
For more information see LICENSE file or visit
http://www.gnu.org/licenses/gpl-3.0.html
Copyright (C) 2014-2025, IEETA, University of Aveiro.

