HatEval Trainer

A clean, reproducible training pipeline for Spanish datasets, featuring a classic Bidirectional GRU baseline and an hybrid quantum head powered by PennyLane.

Repository Structure

hateval-trainer/
├── LICENSE
├── README.md
├── .gitignore
├── pyproject.toml          # Project metadata & dependencies
├── requirements.txt        # Runtime dependencies
├── Makefile                # Handy shortcuts (train, lint, test…)
│
├── scripts/                # Utility scripts
│   ├── download_nltk.py    # Pre-download NLTK resources
│   └── examples.sh         # Example training commands
│
├── src/
│   └── hateval_trainer/    # Main package
│       ├── __init__.py
│       ├── config.py       # Config dataclass
│       ├── data.py         # Data loading & preprocessing
│       ├── models.py       # GRU & hybrid quantum models
│       ├── train.py        # Training & evaluation loop
│       └── cli.py          # Command-line interface (hateval-train)
│
└── tests/                  # Pytest-based unit tests
    └── test_smoke.py       # Minimal smoke test

Features

Text preprocessing: lowercasing, punctuation stripping, Spanish stopwords, basic lemmatization.
Class balancing via upsampling.
TextVectorization + stacked BiGRU baseline.
Hybrid quantum-classical head (VQC) with PennyLane.
Confusion matrix plot (optional) and classification report.
CLI (hateval-train) and modular source code (src/hateval_trainer).

Installation

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
pip install -e .
# optional dev tooling
pip install black ruff pytest

Usage

After installation, you can run the training pipeline via the CLI:

HatEval (official split)

We follow the official HatEval train/test split. To train the classical backbone and then the hybrid head:

PYTHONPATH=src python -m hateval_trainer.cli \\
  --csv HatEvalES.csv \\
  --output modelo_clasico_base.keras \\
  --epochs 20 \\
  --hybrid-epochs 15 \\
  --vocab 10000 \\
  --batch-size 32 \\
  --seq-len 120 \\
  --device cpu \\
  --verbose --no-plots \\
  --no-balance \\
  --show-classic-logs

This will:

Preprocess and split HatEvalES.csv.
Train the BiGRU backbone (logs optional).
Train the hybrid quantum head for 15 epochs.
Save the backbone to modelo_clasico_base.keras.

HaterNet (10-fold cross-validation)

For HaterNet we use stratified cross-validation without class balancing, as recommended in prior work and reviewer comments:

PYTHONPATH=src python -m hateval_trainer.cli \\
  --csv HaterNet.csv \\
  --epochs 20 \\
  --hybrid-epochs 15 \\
  --vocab 10000 \\
  --batch-size 32 \\
  --seq-len 120 \\
  --device cpu \\
  --verbose --no-plots \\
  --no-balance \\
  --crossval 10 \\
  --cv-save --cv-dir outputs/cv_hybrid \\
  --cv-use-hybrid

This will:

Perform 10-fold stratified CV.
Train backbone + hybrid head in each fold.
Save per-fold reports and confusion matrices under outputs/cv_hybrid/.
Print a summary with mean ± std accuracy and macro-F1.

Notes

Use --device gpu to enable GPU if available.
Use --tune-threshold to tune decision thresholds per fold (binary tasks).
Disable balancing with --no-balance to avoid data leakage.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HatEval Trainer

Repository Structure

Features

Installation

Usage

HatEval (official split)

HaterNet (10-fold cross-validation)

Notes

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 42 Commits
scripts		scripts
src/hateval_trainer		src/hateval_trainer
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
run_ablation.py		run_ablation.py

Folders and files

Latest commit

History

Repository files navigation

HatEval Trainer

Repository Structure

Features

Installation

Usage

HatEval (official split)

HaterNet (10-fold cross-validation)

Notes

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages