AudioSignalProcessingForML

Code and slides for my YouTube series "Audio Signal Processing for Machine Learning"

This repository is a comprehensive collection of resources, code, and explanations for understanding and implementing audio signal processing techniques, with a focus on applications in machine learning. It serves as a learning guide, starting from the fundamentals of sound and waveforms and progressing to advanced feature extraction methods.

Note on Versioning

While this v2 release is fully functional and optimized for current environments, it may differ from the original version shown in the course. The codebase has been updated to reflect modern best practices and improved dependency management. Consequently, the original course version has been deprecated; however, it remains available in the legacy branch for those wishing to follow the video content exactly.

Course Structure

Foundational Concepts

Overview: Video | Slides
Sound and waveforms: Video | Slides
Intensity, loudness, and timbre: Video | Slides | Notebook
Understanding audio signals: Video | Slides

Feature Extraction Theory

Types of audio features for ML: Video | Slides
How to extract audio features: Video | Slides
Time-domain audio features: Video | Slides

Time-Domain Implementation

Implementing the amplitude envelope: Video | Notebook
RMS energy and zero-crossing rate: Video | Notebook
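As a rough orientation for this part, the three time-domain features can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the code from the notebooks; the helper names (`frames`, `amplitude_envelope`, `rms_energy`, `zero_crossing_rate`) and the `frame_size` / `hop_length` parameters are assumptions chosen for this example.

```python
import math


def frames(signal, frame_size, hop_length):
    """Yield consecutive frames of the signal; the last frame may be shorter."""
    for start in range(0, len(signal), hop_length):
        yield signal[start:start + frame_size]


def amplitude_envelope(signal, frame_size, hop_length):
    """Maximum absolute amplitude in each frame."""
    return [max(abs(s) for s in f) for f in frames(signal, frame_size, hop_length)]


def rms_energy(signal, frame_size, hop_length):
    """Root mean square of the samples in each frame."""
    return [
        math.sqrt(sum(s * s for s in f) / len(f))
        for f in frames(signal, frame_size, hop_length)
    ]


def zero_crossing_rate(signal, frame_size, hop_length):
    """Sign changes between adjacent samples, divided by the frame length."""
    rates = []
    for f in frames(signal, frame_size, hop_length):
        crossings = sum(1 for a, b in zip(f, f[1:]) if (a >= 0) != (b >= 0))
        rates.append(crossings / len(f))
    return rates


# Toy signal: one oscillating frame followed by one constant frame.
toy_signal = [0.5, -0.5, 0.5, -0.5, 1.0, 1.0, 1.0, 1.0]
envelope = amplitude_envelope(toy_signal, frame_size=4, hop_length=4)
rms = rms_energy(toy_signal, frame_size=4, hop_length=4)
zcr = zero_crossing_rate(toy_signal, frame_size=4, hop_length=4)
```

With real audio, the signal would typically be loaded from one of the .wav files and processed with overlapping frames (a hop length smaller than the frame size).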

Frequency-Domain Concepts

Fourier Transform: The Intuition: Video | Slides
Complex numbers for audio signal processing: Video | Slides
Defining the Fourier transform using complex numbers: Video | Slides | Notebook
Discrete Fourier Transform: Video | Slides
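For readers who want a concrete reference point before the videos, the standard DFT definition, X[k] = sum over n of x[n] * exp(-i * 2 * pi * k * n / N), can be written directly in Python. This is a naive O(N^2) sketch using only the standard library, meant to mirror the formula rather than to be fast; the function name `dft` and the toy signal are assumptions for this example.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform: one complex coefficient per frequency bin."""
    n_samples = len(signal)
    return [
        sum(
            x * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n, x in enumerate(signal)
        )
        for k in range(n_samples)
    ]


# Toy signal: a cosine that completes exactly one cycle in 8 samples.
toy_signal = [math.cos(2 * math.pi * n / 8) for n in range(8)]
magnitudes = [abs(c) for c in dft(toy_signal)]
```

For this toy signal the energy concentrates in bin 1 and its mirror image, bin 7. In practice a fast Fourier transform implementation would be used instead of this direct sum.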

Frequency-Domain Implementation

Extracting the Discrete Fourier Transform: Video | Notebook
Short-Time Fourier Transform explained easily: Video | Slides
Extracting Spectrograms from Audio with Python: Video | Notebook
Mel Spectrogram Explained Easily: Video | Slides
Extracting Mel Spectrograms with Python: Video | Notebook
MFCCs Explained Easily: Video | Slides
Extracting MFCCs with Python: Video | Notebook
Frequency-Domain Audio Features: Video | Slides
Implementing Band Energy Ratio from Scratch with Python: Video | Notebook
Spectral centroid and bandwidth: Video | Notebook
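Two of the frequency-domain features listed above can be sketched for a single frame as follows. This is a minimal illustration under common textbook definitions (band energy ratio as power below a split bin divided by power at or above it, spectral centroid as the magnitude-weighted mean frequency); it is not the notebook code, and the names `band_energy_ratio`, `spectral_centroid`, and `split_bin` are assumptions for this example.

```python
def band_energy_ratio(magnitudes, split_bin):
    """Power in bins below split_bin divided by power in bins at or above it."""
    power = [m * m for m in magnitudes]
    low = sum(power[:split_bin])
    high = sum(power[split_bin:])
    # Guard against division by zero when the upper band is silent.
    return low / high if high > 0 else float("inf")


def spectral_centroid(magnitudes, frequencies):
    """Magnitude-weighted mean of the bin frequencies (0.0 for a silent frame)."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(frequencies, magnitudes)) / total


# Toy magnitude spectrum for one frame, with hypothetical bin frequencies in Hz.
toy_magnitudes = [2.0, 0.0, 1.0, 1.0]
toy_frequencies = [0.0, 100.0, 200.0, 300.0]
ber = band_energy_ratio(toy_magnitudes, split_bin=2)
centroid = spectral_centroid(toy_magnitudes, toy_frequencies)
```

Applied to a spectrogram, each function would be called once per frame to obtain a feature value over time.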

Audio examples

audio_resources/: A collection of .wav files used for the examples in the notebooks.



musikalkemist/AudioSignalProcessingForML
