nilomr/perch-cpu-inference

Perch v2 CPU Inference

High-performance acoustic embedding extraction and species classification using Google's Perch v2 model for large-scale bioacoustics inference on CPUs.

Example of 2D projection of resulting embeddings

Features

  • High-performance CPU inference — Up to 750x realtime with ONNX or TFLite
  • Batch processing — Large-scale dataset processing with checkpointing and resume
  • Visualization — Spectrogram overlays and detection time series charts
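To make the "750x realtime" figure concrete: a realtime factor is the number of seconds of audio processed per second of wall-clock time. The sketch below works through that arithmetic. The function names are hypothetical and not part of this repository, and 750x is the stated upper bound, so actual throughput will depend on hardware and settings.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio processed per second of wall-clock time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def processing_time(audio_seconds: float, factor: float) -> float:
    """Wall-clock seconds needed to process audio at a given realtime factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return audio_seconds / factor


# One hour of audio at the stated best-case factor of 750x.
one_hour = 3600.0
print(processing_time(one_hour, 750.0))  # 4.8
```

At that best-case rate, 1,000 hours of recordings would take roughly 80 minutes of compute.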

Quick Start

# Install dependencies
pip install -r requirements.txt

# Download model
wget https://huggingface.co/justinchuby/Perch-onnx/resolve/main/perch_v2.onnx -P models/perch_v2/

# Run inference
python scripts/inference/perch-onnx-inference.py --audio-dir ./data/test-data --output-dir ./output

Documentation

| Guide | Description |
| --- | --- |
| Usage Guide | Installation, running inference, visualization, output formats, benchmarks |
| Batch Processing | Large-scale inference, monitoring, log parsing |
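The checkpointing and resume behavior listed under Features is documented in the Batch Processing guide. As a rough illustration of the general idea only, and not of how this repository implements it, the sketch below records each finished file in a JSON checkpoint and skips those files on the next run. The names `process_with_resume` and `load_done` and the checkpoint format are assumptions made for this example.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def load_done(checkpoint: Path) -> set[str]:
    """Return the set of file names already recorded in the checkpoint."""
    if checkpoint.exists():
        return set(json.loads(checkpoint.read_text()))
    return set()


def process_with_resume(
    files: Iterable[str],
    checkpoint: Path,
    process_one: Callable[[str], None],
) -> list[str]:
    """Process files not yet in the checkpoint; return the ones handled now."""
    done = load_done(checkpoint)
    newly_processed = []
    for name in files:
        if name in done:
            continue  # finished in an earlier run
        process_one(name)
        done.add(name)
        newly_processed.append(name)
        # Write to a temporary file, then rename, so an interruption
        # mid-write does not leave a corrupt checkpoint behind.
        tmp = checkpoint.with_suffix(".tmp")
        tmp.write_text(json.dumps(sorted(done)))
        tmp.replace(checkpoint)
    return newly_processed
```

Updating the checkpoint after every file trades a little I/O for losing at most one file's work when a run is interrupted.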

Model

  • Perch v2: Google's classifier pre-trained on a multi-taxa dataset (Paper)
  • Input: 5-second audio chunks at 32 kHz
  • Output: Species predictions and 1536-dimensional embeddings
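The input and output sizes above imply some simple arithmetic: a 5-second chunk at 32 kHz is 160,000 samples, and each chunk yields one 1536-dimensional embedding. The sketch below computes chunk boundaries and the resulting embedding matrix shape for a recording. It assumes non-overlapping windows and that a trailing partial window is padded to full length; how the repository's scripts handle overlap and the final window is not stated here. The function names are hypothetical.

```python
SAMPLE_RATE = 32_000                          # Hz, from the model description
CHUNK_SECONDS = 5                             # seconds per model input
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS   # 160,000 samples per chunk
EMBEDDING_DIM = 1536                          # embedding size per chunk


def chunk_bounds(n_samples: int) -> list[tuple[int, int]]:
    """Return (start, end) sample indices of consecutive 5-second windows.

    The last window may extend past n_samples; the caller would pad it
    (an assumption of this sketch).
    """
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    n_chunks = -(-n_samples // CHUNK_SAMPLES)  # ceiling division
    return [(i * CHUNK_SAMPLES, (i + 1) * CHUNK_SAMPLES) for i in range(n_chunks)]


def embedding_matrix_shape(n_samples: int) -> tuple[int, int]:
    """Shape of the per-recording embedding matrix: (chunks, EMBEDDING_DIM)."""
    return (len(chunk_bounds(n_samples)), EMBEDDING_DIM)


# A 12-second recording spans three windows; the last holds only 2 s of audio.
print(embedding_matrix_shape(12 * SAMPLE_RATE))  # (3, 1536)
```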

Credits

| Format | Source | Prepared by |
| --- | --- | --- |
| ONNX | Hugging Face | Justin Chu (@justinchuby) |
| TFLite | Bioacoustics Model Zoo | Lapp, S., and Kitzes, J. (2025) |

Project structure

├── scripts/inference/         # Inference scripts
│   ├── perch-onnx-inference.py    # Main ONNX inference script
│   └── perch-tflite-inference.py  # TFLite inference script
├── scripts/                   # Benchmarking and visualization tools
├── tools/                     # Shell scripts for batch processing
├── docs/                      # Documentation
├── models/                    # Model files (downloaded separately)
└── data/                      # Test data

License

See LICENSE file for details.


Developed as part of the PhenoScale project, UKRI Frontiers grant EP/X024520/1 awarded to Ben Sheldon, University of Oxford.
