A modular PyTorch project for detecting AI-generated vs natural content in images and videos.
- TTI: temporal token interleaving (no thumbnails).
- LoTS: low-rank temporal sketching + difference token to compress across time.
- CAA: content-aware attention guided by per-frame cues (entropy, high-frequency energy, jitter*, I-frame).
- L-GRB: tiny graph reasoning over pooled region tokens.
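To make the token layout behind TTI and LoTS more concrete, here is a minimal sketch. It is not the project's implementation: the function names, the `(T, N, D)` token shape, the use of a truncated SVD as the low-rank step, and the mean absolute frame difference as the "difference token" are all assumptions made for illustration. NumPy is used for brevity; the same operations exist in PyTorch.

```python
import numpy as np

def interleave_tokens(tokens):
    """Hypothetical TTI layout: (T, N, D) -> (N*T, D).

    The T tokens that share a spatial position end up adjacent in the
    sequence, so no per-frame thumbnail is needed to align them.
    """
    T, N, D = tokens.shape
    return tokens.transpose(1, 0, 2).reshape(N * T, D)

def low_rank_temporal_sketch(tokens, rank):
    """Hypothetical LoTS stand-in: compress T frames to `rank` sketch frames.

    Uses a truncated SVD along the time axis (an assumption; a learned
    projection would be another option) and returns one extra difference
    token per spatial position.
    """
    T, N, D = tokens.shape
    flat = tokens.reshape(T, N * D)
    # Rows of Vt are orthonormal temporal components; scale them by S.
    _, S, Vt = np.linalg.svd(flat, full_matrices=False)
    rank = min(rank, S.shape[0])
    sketch = (S[:rank, None] * Vt[:rank]).reshape(rank, N, D)
    # Difference token: mean absolute change between consecutive frames.
    if T < 2:
        diff_token = np.zeros((N, D), dtype=flat.dtype)
    else:
        diff_token = np.abs(np.diff(tokens, axis=0)).mean(axis=0)
    return sketch, diff_token

# Example: 8 frames, 5 region tokens per frame, 4 channels.
rng = np.random.default_rng(0)
video_tokens = rng.normal(size=(8, 5, 4))
sequence = interleave_tokens(video_tokens)                    # shape (40, 4)
sketch, diff_token = low_rank_temporal_sketch(video_tokens, rank=3)
```

With these assumptions, a perfectly static clip yields an all-zero difference token and only one non-zero sketch frame, which is the kind of temporal redundancy the compression is meant to exploit.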
*Jitter requires face landmarks; when they are unavailable, the jitter cue is disabled automatically.
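The per-frame cues that guide CAA can be illustrated with a short sketch. This is an assumption-laden example, not the repository's code: the function names, the histogram-based entropy, the FFT-based high-frequency ratio with a `cutoff` of 0.25 cycles/pixel, and the `(T, K, 2)` landmark layout are all hypothetical. The I-frame cue is left out because it depends on codec metadata. NumPy is used for brevity.

```python
import numpy as np

def frame_entropy(gray, bins=256):
    """Shannon entropy (bits) of the intensity histogram of one 8-bit frame."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def high_freq_energy(gray, cutoff=0.25):
    """Fraction of spectral energy above `cutoff` (radial, cycles/pixel)."""
    spec = np.abs(np.fft.fft2(gray.astype(np.float64))) ** 2
    fy = np.fft.fftfreq(gray.shape[0])[:, None]
    fx = np.fft.fftfreq(gray.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    total = spec.sum()
    return float(spec[radius > cutoff].sum() / total) if total > 0 else 0.0

def landmark_jitter(landmarks):
    """Mean frame-to-frame landmark displacement, or None without landmarks."""
    if landmarks is None or len(landmarks) < 2:
        return None
    pts = np.asarray(landmarks, dtype=np.float64)  # assumed shape (T, K, 2)
    return float(np.linalg.norm(np.diff(pts, axis=0), axis=-1).mean())

def frame_cues(frames, landmarks=None):
    """Collect cues; the jitter entry is omitted when landmarks are missing."""
    cues = {
        "entropy": [frame_entropy(f) for f in frames],
        "hf_energy": [high_freq_energy(f) for f in frames],
    }
    jitter = landmark_jitter(landmarks)
    if jitter is not None:
        cues["jitter"] = jitter
    return cues

# Example: a flat frame and a checkerboard frame, no face landmarks.
flat = np.full((16, 16), 128, dtype=np.uint8)
checker = ((np.indices((16, 16)).sum(axis=0) % 2) * 255).astype(np.uint8)
cues = frame_cues([flat, checker])
```

Returning `None` from `landmark_jitter` and dropping the key is one simple way to model the auto-disable behavior: downstream code can check for the `"jitter"` key instead of handling a placeholder value.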
# 1) Create environment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
# 2) Train with default config
python scripts/train.py --config configs/default.yaml
# 3) Evaluate
python scripts/evaluate.py --config configs/default.yaml --ckpt path/to/checkpoint.pt