Code and slides for my YouTube series "Audio Signal Processing for Machine Learning".
This repository is a comprehensive collection of resources, code, and explanations for understanding and implementing audio signal processing techniques, with a focus on applications in machine learning. It serves as a learning guide, starting from the fundamentals of sound and waveforms and progressing to advanced feature extraction methods.
This v2 release is fully functional on current environments, but it may differ from the version shown in the course. The codebase has been updated to follow modern best practices and to improve dependency management. As a result, the original course version is deprecated; it remains available in the legacy branch for those who want to follow the video content exactly.
- Overview: Video | Slides
- Sound and waveforms: Video | Slides
- Intensity, loudness, and timbre: Video | Slides | Notebook
- Understanding audio signals: Video | Slides
- Types of audio features for ML: Video | Slides
- How to extract audio features: Video | Slides
- Time-domain audio features: Video | Slides
- Implementing the amplitude envelope: Video | Notebook
- RMS energy and zero-crossing rate: Video | Notebook
- Fourier Transform: The Intuition: Video | Slides
- Complex numbers for audio signal processing: Video | Slides
- Defining the Fourier transform using complex numbers: Video | Slides | Notebook
- Discrete Fourier Transform: Video | Slides
- Extracting the Discrete Fourier Transform: Video | Notebook
- Short-Time Fourier Transform explained easily: Video | Slides
- Extracting Spectrograms from Audio with Python: Video | Notebook
- Mel Spectrogram Explained Easily: Video | Slides
- Extracting Mel Spectrograms with Python: Video | Notebook
- MFCCs Explained Easily: Video | Slides
- Extracting MFCCs with Python: Video | Notebook
- Frequency-Domain Audio Features: Video | Slides
- Implementing Band Energy Ratio from Scratch with Python: Video | Notebook
- Spectral centroid and bandwidth: Video | Notebook
audio_resources/: A collection of .wav files used for the examples in the notebooks.
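
As a quick taste of the time-domain features covered in the notebooks (amplitude envelope, RMS energy, zero-crossing rate), here is a minimal NumPy-only sketch. It is not the code from the notebooks: the helper names (`frame_signal`, `amplitude_envelope`, `rms_energy`, `zero_crossing_rate`), the frame and hop sizes, and the synthetic 440 Hz test tone are all assumptions made for illustration, so that the example runs without any of the .wav files.

```python
import numpy as np


def frame_signal(signal, frame_size, hop_length):
    """Split a 1-D signal into overlapping frames (a trailing partial frame is dropped)."""
    n_frames = 1 + (len(signal) - frame_size) // hop_length
    return np.stack(
        [signal[i * hop_length : i * hop_length + frame_size] for i in range(n_frames)]
    )


def amplitude_envelope(frames):
    """Maximum absolute amplitude of each frame."""
    return np.max(np.abs(frames), axis=1)


def rms_energy(frames):
    """Root-mean-square energy of each frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))


def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    negative = np.signbit(frames)
    crossings = negative[:, 1:] != negative[:, :-1]
    return np.mean(crossings, axis=1)


# Hypothetical test input: one second of a 440 Hz sine at amplitude 0.5.
sr = 22050
t = np.linspace(0, 1, sr, endpoint=False)
signal = 0.5 * np.sin(2 * np.pi * 440 * t)

# Assumed analysis parameters (illustrative values only).
frames = frame_signal(signal, frame_size=1024, hop_length=512)
ae = amplitude_envelope(frames)
rms = rms_energy(frames)
zcr = zero_crossing_rate(frames)
```

For a pure sine, the envelope should sit close to the peak amplitude, the RMS close to the peak divided by the square root of 2, and the zero-crossing rate close to twice the frequency divided by the sample rate. To try this on a real recording, the synthetic `signal` can be replaced with samples loaded from one of the files in audio_resources/ using an audio loading library of your choice.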