Audio-Emotion-Detection

This is an implementation of emotion detection from audio files: it classifies a given audio file into one of eight possible emotions. Audio emotion detection is a challenging task in which language processing plays a part in inferring emotion from audio.

Use Case

It can be deployed in a call center to detect the emotions of callers.

In a virtual therapy session, the overall emotion of the patient can be detected even without facial expressions.

Dataset:

(RAVDESS) https://zenodo.org/record/1188976#.X4sE0tDXKUl

(TESS) https://tspace.library.utoronto.ca/handle/1807/24487
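As a small illustration of how labels can be read from the RAVDESS data, the sketch below parses the emotion out of a file name. RAVDESS file names consist of seven two-digit fields separated by hyphens, and the third field encodes the emotion. The function name `ravdess_emotion` is hypothetical and is not taken from this repository; TESS uses a different naming scheme and is not covered here.

```python
from pathlib import Path

# Emotion codes as documented for RAVDESS file names (third field).
RAVDESS_EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprised",
}


def ravdess_emotion(path):
    """Return the emotion label encoded in a RAVDESS file name.

    Hypothetical helper for illustration; not part of the repository.
    """
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"not a RAVDESS-style file name: {path}")
    return RAVDESS_EMOTIONS[fields[2]]


print(ravdess_emotion("Actor_12/03-01-06-01-02-01-12.wav"))  # fearful
```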

Flow of the project

Cleaning the data

Extracting features from audio using Librosa library

Merging various features into one

Building LSTM model for training

Predicting on test data

Evaluating the predictions, using the emotion scores as the metric
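The feature extraction and merging steps above could look roughly like the sketch below. It is an assumption, not the notebook's actual code: the choice of MFCC, chroma, and mel-spectrogram features, the value `n_mfcc=40`, and the function names `extract_features` and `merge_features` are all hypothetical. Each feature matrix is averaged over time and the results are concatenated into one vector; the original model may instead keep the time axis as the LSTM's sequence dimension.

```python
import numpy as np


def merge_features(feature_matrices):
    """Average each (n_coefficients, n_frames) matrix over time and
    concatenate the results into one 1-D feature vector."""
    return np.concatenate([np.mean(m, axis=1) for m in feature_matrices])


def extract_features(path, n_mfcc=40):
    """Load one audio file and return a merged feature vector.

    Hypothetical helper; the feature set is an assumption.
    """
    import librosa  # imported here so merge_features works without librosa

    y, sr = librosa.load(path, sr=None)  # keep the file's native sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr)
    return merge_features([mfcc, chroma, mel])
```

With librosa's default settings (12 chroma bins, 128 mel bands) and 40 MFCCs, `extract_features` returns a vector of length 180 for any input file.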

Project Pipeline:

Digital signal processing has recently become a hot topic in machine learning. There is a lot of research on emotion detection from video and images, and many pretrained models are available for it. However, there is still a lack of research and work in the field of Speech Emotion Recognition (SER).

Since this is a classification problem, a convolutional neural network seemed the obvious choice. We also built a random forest model and a multilayer perceptron, but these earlier models underperformed, with accuracies too low to reliably predict the right emotions.

Finally, we built a recurrent neural network, specifically an LSTM, and the model is able to predict emotion from an audio file with an accuracy of more than 80%.

File:

main.py and utils.py contain all the code required to create the API. The final model is called here, and the API returns JSON containing the overall probability of each emotion present in the audio file.

UI.html is the user interface, where the API is called and a plot of the overall emotion in the recording is displayed.

Final_Model.ipynb is the notebook containing all the code for the final model and its implementation.
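To make the JSON response described for main.py more concrete, here is a minimal sketch of turning a model's output into such a payload. The label order in `EMOTIONS`, the function name `build_response`, and the `emotions` and `dominant` keys are assumptions for illustration; the actual API may use a different structure.

```python
import json

# Assumed label order; the real order depends on how the model was trained.
EMOTIONS = [
    "neutral", "calm", "happy", "sad",
    "angry", "fearful", "disgust", "surprised",
]


def build_response(probabilities):
    """Turn one probability per emotion into a JSON string.

    Hypothetical helper; not the repository's actual API code.
    """
    if len(probabilities) != len(EMOTIONS):
        raise ValueError("expected one probability per emotion")
    scores = {
        label: round(float(p), 4)
        for label, p in zip(EMOTIONS, probabilities)
    }
    # The label with the highest probability is reported as dominant.
    dominant = max(scores, key=scores.get)
    return json.dumps({"emotions": scores, "dominant": dominant})


print(build_response([0.05, 0.05, 0.6, 0.1, 0.05, 0.05, 0.05, 0.05]))
```

A front end such as UI.html could then read the `emotions` object directly to draw its plot.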

Final LSTM model summary

Model Accuracy

The UI containing final model looks like this:

Please visit my personal portfolio (https://nabinelnino.github.io/nabin.bagale/) for more information about this project.
