
Architecture


Logically, the NERD system consists of the following parts:

  • Backend (NERDd) - a set of programs which do most of the data processing. Described in more detail below.
  • Frontend (NERDweb) - a web server written in Flask, providing a simple web portal and an API to access all the data (a minimal API sketch follows this list).
  • Entity database - the main database storing records of IP addresses and other entities.
  • Event database - (optional) a database storing a copy of each alert (IDEA message) received from Warden (the main NERD instance currently uses the database of the Mentat project instead).
  • Configuration, various data files, etc.
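
For illustration, a minimal API endpoint in the spirit of NERDweb could look like the sketch below. This is only a sketch under assumptions: the MongoDB connection, database name (nerd) and collection name (ip) are placeholders, not the actual NERDweb code.

```python
# Minimal sketch of a NERDweb-like API endpoint. The database and
# collection names are assumptions, not the actual NERDweb implementation.
from flask import Flask, abort, jsonify
from pymongo import MongoClient

app = Flask(__name__)
db = MongoClient("mongodb://localhost:27017/")["nerd"]   # assumed DB name

@app.route("/api/ip/<ip_addr>")
def get_ip_record(ip_addr):
    """Return the entity record of the given IP address as JSON."""
    rec = db["ip"].find_one({"_id": ip_addr})   # assumed collection and key layout
    if rec is None:
        abort(404)
    return jsonify(rec)

if __name__ == "__main__":
    app.run()
```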

The architecture is depicted in the following figure:

[Figure: Architecture scheme]

The backend part is a scalable system consisting of several components, some of which can run in multiple instances in parallel:

  • Worker processes - The core of the system, where all record processing is done. Each worker reads tasks from the main queue, performs the requested updates to entity records and calls plug-in modules to fetch or compute additional data (see Task processing for more details; a simplified worker/plug-in sketch follows this list).
    • All secondary data are processed by plug-in modules in the workers.
  • Primary data receivers - Modules fetching data about malicious IP addresses from primary sources. They create tasks in the main queue to create or update IP records.
  • Updater - A simple module that, once per day for each record in the database, issues a task to perform a periodic update (each record is updated at a different time, so the updates are spread across the whole day to achieve a constant system load; see the scheduling sketch after this list).
  • Main task queue - A system of queues to distribute tasks to workers. See Task distribution for details.
  • Redis - An in-memory key-value database used for caching various data (e.g. blacklists) and for storing processing statistics.
  • Data download scripts - Various scripts that periodically download and cache external data, such as blacklists, the geolocation database, etc. Some of the data are stored as local files and loaded by worker plug-in modules; others are stored in Redis, where the modules can read them.
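
To give a rough idea of how a worker processes a task and how plug-in modules add secondary data, here is a simplified sketch. The task format, the module interface and the Redis key name are assumptions for illustration only, not the actual NERDd code.

```python
# Simplified sketch of a worker handling one task and calling a plug-in.
# The task format, module interface and Redis key are assumptions,
# not the actual NERDd implementation.
import redis

cache = redis.Redis()      # shared with the data download scripts
entity_db = {}             # stands in for the real entity database

class BlacklistModule:
    """Plug-in example: tag IPs present on a blacklist cached in Redis."""
    def process(self, record):
        if cache.sismember("bl:example-blacklist", record["_id"]):   # assumed key
            record.setdefault("bl", []).append("example-blacklist")

MODULES = [BlacklistModule()]

def handle_task(task):
    """Apply the updates requested by a task and run all plug-in modules."""
    record = entity_db.get(task["ekey"]) or {"_id": task["ekey"]}
    for attr, value in task.get("updates", []):
        record[attr] = value           # attribute updates requested by the task
    for module in MODULES:
        module.process(record)         # secondary data computed by plug-ins
    entity_db[task["ekey"]] = record

# A task as a primary data receiver might enqueue it:
handle_task({"etype": "ip", "ekey": "192.0.2.1", "updates": [("events_count", 5)]})
```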
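
The Updater's spreading of daily updates can be pictured as deriving a fixed time-of-day offset from each entity key, e.g. by hashing; the concrete scheme below is an assumption for illustration, not necessarily what the real Updater does.

```python
# Rough illustration of spreading daily updates evenly over 24 hours.
# The hashing scheme is an assumption, not necessarily what NERDd uses.
import hashlib

SECONDS_PER_DAY = 24 * 3600

def update_offset(entity_key: str) -> int:
    """Map an entity key to a fixed second of the day for its periodic update."""
    digest = hashlib.sha1(entity_key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % SECONDS_PER_DAY

# Each key always maps to the same second, so the load stays roughly constant:
print(update_offset("192.0.2.1"))
```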

All the processes are run and monitored by a supervisord daemon.
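
For illustration, a program section in the supervisord configuration could look like the snippet below; the command path, program name and number of processes are placeholders, not the actual NERD configuration.

```ini
; Illustrative placeholder only -- command, program name and numprocs
; are assumptions, not the actual NERD supervisord configuration.
[program:nerd-worker]
command=/nerd/venv/bin/python -m worker %(process_num)d
process_name=%(program_name)s_%(process_num)02d
numprocs=2
autostart=true
autorestart=true
```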
