Speechmatics Academy

Working examples, integrations, and templates for the Speechmatics SDKs.

Comprehensive collection of code examples demonstrating real-world applications, third-party integrations, and best practices.

Examples • Integrations • Use Cases • Copy-Paste Ready

Browse Examples • Quick Start • Contributing • Portal • Documentation

What is Speechmatics?

Speechmatics is a leading Automatic Speech Recognition (ASR) platform providing highly accurate speech-to-text (STT) and text-to-speech (TTS) APIs. Whether you're building real-time voice assistants, conversational voice AI agents, transcription services, or call center tools, Speechmatics provides the foundation for accurate, scalable speech AI.

Flexible Deployment — Cloud SaaS, on-premises, air-gapped environments, or on-device edge deployment.

Advanced Features — Domain-specific models, custom dictionaries, speaker diarization, speaker identification, and speaker focus for multi-speaker scenarios and much more.

⚡ Quick Start

Prerequisites

1. Get your API key from portal.speechmatics.com

2. Install the SDK for your use case:

# Choose the package for your use case:

# Batch transcription
pip install speechmatics-batch

# Real-time streaming
pip install speechmatics-rt

# Voice agents
pip install speechmatics-voice

# Text-to-speech
pip install speechmatics-tts

📦 Package Details

speechmatics-batch - Async batch transcription API

Upload audio files for processing
Get transcripts with highly accurate timestamps, speaker labels, and entities
Supports all audio intelligence features

speechmatics-rt - Real-time WebSocket streaming

Stream audio for live transcription
Ultra-low latency
Partial and final transcripts

speechmatics-voice - Voice agent SDK

Build conversational AI applications
Speaker diarization and turn detection
Optional ML-based smart turn: pip install speechmatics-voice[smart]

speechmatics-tts - Text-to-speech

Convert text to natural-sounding speech
Multiple voices
Streaming and batch modes

SDK Documentation | API Reference

Option 1: Clone and Run

# Clone the repository
git clone https://github.com/speechmatics/speechmatics-academy.git
cd speechmatics-academy

# Navigate to an example
cd basics/01-hello-world/python

# Setup virtual environment
python -m venv venv

# Activate virtual environment (Windows)
venv\Scripts\activate
# On Mac/Linux: source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
cp ../.env.example .env
# Edit .env and add your SPEECHMATICS_API_KEY

# Run the example
python main.py

Caution

Never hardcode API keys in your source code. Always use environment variables (.env files) or secure secret management systems. Never commit .env to version control - only .env.example with placeholder values.
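As a rough illustration of the advice above, the sketch below reads `SPEECHMATICS_API_KEY` from a `.env` file or the process environment using only the standard library. The helper names `load_env_file` and `get_api_key` are hypothetical and are not part of any Speechmatics SDK; the examples in this repository may load their configuration differently.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ.

    Values already present in the environment are left untouched.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip().strip("'\""))


def get_api_key():
    """Return SPEECHMATICS_API_KEY, or raise a clear error if it is missing."""
    api_key = os.environ.get("SPEECHMATICS_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "SPEECHMATICS_API_KEY is not set; add it to .env or export it."
        )
    return api_key
```

Calling `load_env_file()` once at startup and then `get_api_key()` keeps the key out of source code, and a missing key fails early with a readable message instead of an authentication error later on.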

Option 2: Direct Copy

Use degit to copy individual examples:

# Install degit
npm install -g degit

# Copy an example
degit speechmatics/speechmatics-academy/basics/01-hello-world my-project
cd my-project

📖 Theory

New to speech recognition? Start here to understand the core concepts before diving into code.

Topic	Description
Introduction to ASR	How automatic speech recognition converts audio to text using acoustic and language models
Introduction to LLMs	Understanding large language models and their role in voice AI applications
Prompt Engineering	Crafting effective prompts for voice agents and conversational AI
Choosing the Right Model	Comparing model types, capabilities, and when to use each

Note

Theory guides are coming soon. In the meantime, check out the "How It Works" sections in each example.

📚 Example Categories

Fundamentals

Fundamental examples for getting started with the Speechmatics SDK.

Example	Description	Packages	Difficulty
Hello World	The absolute simplest transcription example	`Batch`	Beginner
Batch vs Real-time	Learn the difference between API modes	`Batch` `RT`	Beginner
Configuration Guide	Common configuration options	`Batch`	Beginner
Text-to-Speech	Convert text to natural-sounding speech	`TTS`	Beginner
Channel Diarization	Multi-channel transcription with speaker attribution	`Voice` `RT`	Beginner
Audio Intelligence	Extract insights with sentiment, topics, and summaries	`Batch`	Intermediate
Multilingual & Translation	Transcribe 50+ languages and translate	`RT`	Intermediate
Basic Turn Detection	Silence-based turn detection with Real-Time SDK	`RT`	Intermediate
Intelligent Turn Detection	Smart turn detection with Voice SDK presets	`Voice`	Intermediate
Speaker ID & Speaker Focus	Extract speaker IDs and control which speakers drive conversation	`Voice`	Intermediate

Browse all basics examples
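The Basic Turn Detection example is described above as silence-based. As an SDK-independent illustration of that idea only, the sketch below groups timestamped words into turns whenever the gap between two words reaches a threshold. The function name `split_turns`, the `(start, end, text)` tuple shape, and the 0.8 second default are assumptions made for illustration; the actual example may work differently.

```python
def split_turns(words, max_silence=0.8):
    """Group (start, end, text) words into turns separated by long silences.

    A new turn starts when the gap between the end of the previous word
    and the start of the next one is at least max_silence seconds.
    """
    turns, current, last_end = [], [], None
    for start, end, text in words:
        if last_end is not None and start - last_end >= max_silence:
            turns.append(" ".join(current))
            current = []
        current.append(text)
        last_end = end
    if current:
        turns.append(" ".join(current))
    return turns
```

A fixed silence threshold is simple but cuts speakers off when they pause mid-sentence, which is the kind of case the Intelligent Turn Detection example is listed as addressing.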

Integrations

Third-party framework and service integrations.

Example	Features	Languages
Simple Voice Assistant	WebRTC, VAD, diarization, focus speakers, passive filtering, LLM, TTS	Python
Telephony with Twilio	Phone calls via SIP, LiveKit Agents, Krisp noise cancellation, LLM, TTS	Python
Simple Voice Bot	Local audio, VAD, diarization, focus speakers, passive filtering, LLM, TTS, interruptions	Python
Simple Voice Bot (Web)	Browser-based WebRTC, VAD, diarization, focus speakers, passive filtering, LLM, TTS	Python
Outbound Dialer	REST API, outbound calls, Media Streams, Speechmatics STT, ElevenLabs TTS	Python
Voice Assistant	Voice AI platform, Speechmatics STT, diarization, custom vocabulary, LLM, TTS	Python
Coming Soon	Vercel AI SDK integration	TypeScript

Browse all integrations

Use Cases

Example applications for specific industries.

Industry	Example	Features
Healthcare	Medical Transcription	Real-time, custom medical vocabulary, HIPAA compliance
Media	Video Captioning	SRT generation, timestamp sync, batch processing
Contact Center	Call Analytics	Channel diarization, sentiment analysis, topic detection, summarization
Business	AI Receptionist	LiveKit voice agent, Twilio SIP, Google Calendar booking, function calling
Entertainment	Santa Voice Agent	LiveKit, ElevenLabs TTS, custom vocabulary, Twilio SIP telephony

Browse all use cases
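The Video Captioning row above mentions SRT generation and timestamp sync. As a minimal sketch of the SRT format itself, not of that example's code, the helpers below turn `(start_seconds, end_seconds, text)` tuples into SRT text. The names `srt_timestamp` and `to_srt` and the tuple shape are assumptions for illustration.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        # Each cue is: sequence number, time range, caption text.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

For example, `to_srt([(0, 1.5, "Hello")])` yields a single cue numbered 1 that runs from `00:00:00,000` to `00:00:01,500`.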

🔄 Migration Guides

Switching from another speech-to-text provider? Our migration guides help you transition smoothly with feature mappings, code comparisons, and practical examples.

From	Guide	Features Covered	Status
Deepgram	Migration Guide	Batch, Streaming, Diarization, Custom Vocabulary	Available
AssemblyAI	Migration Guide	Transcription, Audio Intelligence, Real-time	Coming Soon
Google Cloud Speech	Migration Guide	Batch, Streaming, Multi-language	Coming Soon
AWS Transcribe	Migration Guide	Batch Jobs, Streaming, Custom Vocabulary	Coming Soon
Azure Speech	Migration Guide	REST API, WebSocket, Pronunciation	Coming Soon

Note

Each migration guide includes:

Feature Mapping - Direct equivalent features comparison
Code Comparison - Side-by-side before/after examples
Migration Checklist - Step-by-step migration process
Advantages - Benefits of switching to Speechmatics
Working Examples - Complete runnable code

Browse all migration guides

🔍 Finding Examples

Find examples for the SDK package you installed:

By Package

Package	Description	Examples
`speechmatics-batch`	Async transcription of audio files	Hello World, Batch vs Real-time, Configuration Guide, Audio Intelligence, Multilingual & Translation, Video Captioning, Call Analytics
`speechmatics-rt`	Real-time transcription	Batch vs Real-time, Configuration Guide, Multilingual & Translation, Basic Turn Detection, Channel Diarization, Medical Transcription
`speechmatics-voice`	Voice agent with conversation management	Intelligent Turn Detection, Speaker ID & Speaker Focus, Twilio Outbound Dialer
`speechmatics-tts`	Text-to-speech synthesis	Text-to-Speech

By Feature

Feature	Examples
Batch Transcription	Hello World, Batch vs Real-time, Configuration Guide, Audio Intelligence, Video Captioning, Call Analytics
Real-time	Batch vs Real-time, Configuration Guide, Basic Turn Detection, LiveKit Voice Assistant, Medical Transcription
Turn Detection	Basic Turn Detection, Intelligent Turn Detection
Voice Agents	Intelligent Turn Detection, Speaker ID & Speaker Focus, LiveKit Voice Assistant, Pipecat Voice Bot, Pipecat Voice Bot (Web), Twilio Outbound Dialer, VAPI Voice Assistant, AI Receptionist, Santa Voice Agent
Speaker Diarization	Configuration Guide, Speaker ID & Speaker Focus, Channel Diarization, LiveKit Voice Assistant, Call Analytics
Speaker Identification	Speaker ID & Speaker Focus
Sentiment Analysis	Audio Intelligence, Call Analytics
Topic Detection	Audio Intelligence, Call Analytics
Summarization	Audio Intelligence, Call Analytics
Translation	Multilingual & Translation
Text-to-Speech	Text-to-Speech

By Integration

Integration	Examples	Documentation	Status
LiveKit	Simple Voice Assistant, Telephony with Twilio, AI Receptionist, Santa Voice Agent	LiveKit Docs	Available
Pipecat AI	Simple Voice Bot, Simple Voice Bot (Web)	Pipecat Docs	Available
Twilio	Outbound Dialer, Telephony with Twilio, AI Receptionist, Santa Voice Agent	Twilio Media Streams	Available
VAPI	Voice Assistant	docs.vapi.ai	Available

By Language

Language	Examples	Status
Python	Hello World, Batch vs Real-time, Configuration Guide, Audio Intelligence, Multilingual & Translation, Text-to-Speech, Basic Turn Detection, Intelligent Turn Detection, Speaker ID & Speaker Focus, Channel Diarization, LiveKit Voice Assistant, LiveKit Telephony, Pipecat Voice Bot, Pipecat Voice Bot (Web), Twilio Outbound Dialer, VAPI Voice Assistant, Medical Transcription, Video Captioning, Call Analytics, AI Receptionist, Santa Voice Agent	Available
TypeScript	-	Coming Soon
C#	-	Coming Soon

By Difficulty

Difficulty	Examples
Beginner	Hello World, Batch vs Real-time, Configuration Guide, Text-to-Speech, Channel Diarization, VAPI Voice Assistant, Video Captioning, Call Analytics
Intermediate	Audio Intelligence, Multilingual & Translation, Basic Turn Detection, Intelligent Turn Detection, Speaker ID & Speaker Focus, LiveKit Voice Assistant, Pipecat Voice Bot, Pipecat Voice Bot (Web), Medical Transcription
Advanced	LiveKit Telephony, Twilio Outbound Dialer, AI Receptionist, Santa Voice Agent

📁 Example Structure

Every example follows a consistent structure:

example-name/
├── python/
│   ├── main.py             # Primary Python implementation
│   ├── requirements.txt    # Python dependencies
│   └── .gitignore          # Ignore venv/, __pycache__/, .env
├── assets/                 # Sample files, images, etc.
│   ├── sample.wav          # Sample audio (if needed)
│   └── agent.md            # Agent prompt (for voice agents)
├── .env.example            # Environment variables template
└── README.md               # Main documentation (REQUIRED)
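A contributor could check a new example against the layout above with a short script like the sketch below. The function name `missing_paths` is hypothetical, and treating these four paths as required is an assumption: the tree itself only marks README.md as required, and `assets/` is optional.

```python
from pathlib import Path

# Paths taken from the tree above. Which of them are mandatory is an
# assumption; only README.md is explicitly marked REQUIRED.
REQUIRED_PATHS = [
    "README.md",
    ".env.example",
    "python/main.py",
    "python/requirements.txt",
]


def missing_paths(example_dir):
    """Return the expected paths that are absent from example_dir."""
    root = Path(example_dir)
    return [rel for rel in REQUIRED_PATHS if not (root / rel).exists()]
```

Running `missing_paths("basics/01-hello-world")` from the repository root would return an empty list if that example has all four files.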

Note

Each example includes:

What You'll Learn - Key concepts covered
Prerequisites - Required setup
Quick Start - Step-by-step instructions
How It Works - Step-by-step explanation
Key Features - Demonstrated capabilities
Expected Output - Sample results
Next Steps - Related examples
Troubleshooting - Common issues
Resources - Relevant documentation

🤝 Contributing

We welcome contributions! There are many ways to help:

Ways to Contribute

Add New Examples - Share your implementations
Improve Existing Examples - Fix bugs, add features
Add Language Support - Port examples to other languages
Fix Documentation - Improve README files
Report Issues - Help us improve quality

Adding a New Example

1. Choose a category (basics/integrations/use-cases)
2. Follow the structure (see EXAMPLE_TEMPLATE.md)
3. Add metadata to docs/index.yaml
4. Write the README using the template
5. Test thoroughly
6. Submit a PR with a clear description

See CONTRIBUTING.md for detailed guidelines.

Quality Standards

Note

All examples must meet these standards:

Clean, readable, well-commented Python code
Follows SDK best practices
Includes proper error handling
No hardcoded secrets
Complete documentation
Tested end-to-end
Metadata in index.yaml
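The "No hardcoded secrets" standard can be checked roughly before opening a PR. The sketch below is a heuristic only, not a tool used by this repository: it flags lines where a name containing KEY, TOKEN, or SECRET is assigned a string literal. The function name `find_hardcoded_secrets` and the pattern are assumptions, and the pattern will produce false positives (for instance, a constant that holds an environment variable's name).

```python
import re

# Heuristic: a name containing KEY, TOKEN, or SECRET that is assigned
# a non-empty string literal on the same line.
HARDCODED_SECRET = re.compile(
    r"""\b\w*(?:KEY|TOKEN|SECRET)\w*\s*=\s*["'][^"']+["']""",
    re.IGNORECASE,
)


def find_hardcoded_secrets(source):
    """Return (line_number, line) pairs that look like hardcoded secrets."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if HARDCODED_SECRET.search(line)
    ]
```

Reading a key with `os.environ["SPEECHMATICS_API_KEY"]` is not flagged, because the assigned value is not a string literal.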

🆘 Support & Resources

Getting Help

GitHub Issues: Report bugs or request examples
GitHub Community Discussions: Ask questions, share projects
Email Support: devrel@speechmatics.com

Resources

SDK Repository: speechmatics-python-sdk
API Documentation: docs.speechmatics.com
Developer Portal: portal.speechmatics.com
Blog: speechmatics.com/blog

Documentation

Example Template - Template for new examples
Contributing Guide - How to contribute

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Links

SDK: github.com/speechmatics/speechmatics-python-sdk
Docs: docs.speechmatics.com
Portal: portal.speechmatics.com

Built with ❤️ by the Speechmatics Community

Twitter • LinkedIn • YouTube
