FRODO (backronymed as Frequency Registry Of Dictionary Objects) is a frequency database for corpus metadata and word forms, exposed as an HTTP JSON API service.
- Imports NoSkE and SketchEngine vertical corpus files
- Uses MariaDB (MySQL) as the data backend
- Provides absolute, IPM, and ARF frequency information for each word form
- Supports two-level lemmatization (see CNC Wiki)
- Enables fast exploration of corpus structure (e.g., "Show all media types and authors when only fiction is selected")
- Searches for all (sub)lemmas containing a given word form, plus all their other forms
- Supports general n-grams, not limited to words
At the CNC, FRODO is integrated with several applications:
- KonText
  - Query suggestions
  - Interactive subcorpus text type selection
- Word at a Glance
  - Fast word overview
  - Finding words with similar frequency (ARF)
  - Retrieving a lemma's word forms
For more information, see API.md 🚧
- Clone the repository: `git clone --depth 1 https://github.com/czcorpus/frodo.git`
- Install dependencies: `go mod tidy`
- Build: `make`
