Easy Audio Interfaces is a Python library that provides a simple and flexible way to work with audio streams, including recording, playback, network transfer, and processing.
- Socket-based audio streaming
- Local file reading and writing
- Audio resampling and rechunking
- Voice activity detection (VAD) using Silero VAD model
- Network file transfer
Here's a simple example to get you started - record audio from a socket, process it, and save to a file:
from easy_audio_interfaces import SocketReceiver, LocalFileSink, RechunkingBlock, ResamplingBlock
async with SocketReceiver() as receiver, LocalFileSink("output.wav") as sink:
rechunker = RechunkingBlock(chunk_size=512)
resampler = ResamplingBlock(original_sample_rate=receiver.sample_rate, resample_rate=16000)
rechunked_stream = rechunker.rechunk(receiver)
resampled_stream = resampler.resample(rechunked_stream)
await sink.write_from(resampled_stream)For more control over individual audio chunks, you can use process_chunk and process_chunk_last:
from easy_audio_interfaces import ResamplingBlock
from wyoming.audio import AudioChunk
resampler = ResamplingBlock(resample_rate=16000)
await resampler.open()
# Process individual chunks
for chunk in audio_chunks:
async for resampled_chunk in resampler.process_chunk(chunk):
# Handle each resampled chunk
process_audio(resampled_chunk)
# Important: Flush remaining buffered samples
async for final_chunk in resampler.process_chunk_last():
process_audio(final_chunk)
await resampler.close()uv add easy-audio-interfacesuv add "https://github.com/AnkushMalaker/python-audio-interfaces.git"Based on the functionality you require, you should consider installing with the following extras:
# For speech-to-text
uv add "easy-audio-interfaces[stt]"
# For voice activity detection
uv add "easy-audio-interfaces[silero-vad]"
# For Bluetooth audio
uv add "easy-audio-interfaces[bluetooth]"
# For local audio devices
uv add "easy-audio-interfaces[local-audio]"- SocketReceiver: Receives audio data over a WebSocket connection
- LocalFileStreamer: Streams audio data from a local file
- SocketStreamer: Sends audio data over a WebSocket connection
- LocalFileSink: Writes audio data to a local file
- CollectorBlock: Collects audio samples for a specified duration
- ResamplingBlock: Resamples audio to a different sample rate
process_chunk_last(): Flushes remaining buffered samples from the resampler. Call this after processing all chunks to ensure no audio data is lost due to internal buffering.
- RechunkingBlock: Rechunks audio data into fixed-size chunks
- SileroVad: Uses the Silero VAD model for voice activity detection
- VoiceGate: Applies voice activity detection to segment audio
Records voice segments from a network stream using VAD:
python -m easy_audio_interfaces.examples.basic_friend_recorderTransfer audio files over a network:
# Sender
python -m easy_audio_interfaces.examples.file_network_transfer sender input_file.wav --host localhost --port 8080
# Receiver
python -m easy_audio_interfaces.examples.file_network_transfer receiver output_file.wav --host 0.0.0.0 --port 8080For more detailed usage and API documentation, please refer to the docstrings in the source code.