Speaker diarization for Python — "who spoke when?" CPU-only, no API keys, Apache 2.0. ~10.8% DER on VoxConverse, 8x faster than real-time.
python audio-analysis speech-to-text speaker-recognition speech-processing speaker-diarization spectral-clustering voice-activity-detection onnx speaker-embedding diarization apache-2 rttm cpu-inference meeting-transcription who-spoke-when
Updated Mar 6, 2026 - Python
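
For readers unfamiliar with the two figures in the description, here is a minimal sketch of how a diarization error rate (DER) and a "faster than real-time" multiple are conventionally computed. The function names and all input numbers are hypothetical illustrations. They are not this project's API or its measured error breakdown.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Standard DER: summed error durations over total reference speech.

    All arguments are durations in seconds:
      missed       - reference speech the system labeled as non-speech
      false_alarm  - non-speech the system labeled as speech
      confusion    - speech attributed to the wrong speaker
      total_speech - total reference speech duration
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def speed_vs_real_time(audio_seconds, processing_seconds):
    """How many seconds of audio are processed per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Hypothetical numbers, chosen only to illustrate the arithmetic.
der = diarization_error_rate(missed=30, false_alarm=24, confusion=54,
                             total_speech=1000)
speed = speed_vs_real_time(audio_seconds=480, processing_seconds=60)
print(f"DER: {der:.1%}, speed: {speed:.0f}x real-time")
```

Under this definition, a DER near 10.8% means roughly 108 seconds of error per 1000 seconds of reference speech, and 8x faster than real-time means an 8-minute recording takes about 1 minute to process.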