diff --git a/.gitignore b/.gitignore index 932cd0e..9fc6de9 100644 --- a/.gitignore +++ b/.gitignore @@ -183,4 +183,6 @@ optuna_model.json models/ vosk-model-en-us-0.22/ vosk-model-small-sv-rhasspy-0.15/ -recognized_text.txt \ No newline at end of file +recognized_text.txt +src/models/*.keras +src/models/*.json \ No newline at end of file diff --git a/README.md b/README.md index ae03851..a210b0b 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,24 @@ [![Linting: Ruff](https://img.shields.io/badge/linting-ruff-yellowgreen)](https://github.com/charliermarsh/ruff) [![Deep Learning](https://img.shields.io/badge/Deep%20Learning-TensorFlow-orange)](https://www.tensorflow.org/) [![Keras](https://img.shields.io/badge/Keras-red)](https://keras.io/) +[![TensorFlow](https://img.shields.io/badge/TensorFlow-2.0%2B-orange)](https://www.tensorflow.org/) [![Python](https://img.shields.io/badge/Python-3.8%2B-blue)](https://www.python.org/) -[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) +[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) # Sentiment Analysis and Translation -This repository contains a sentiment analysis application and an English-to-French translation model. The sentiment analysis application uses TensorFlow and Keras to classify text data into positive or negative sentiments. The translation model implements a Transformer-based architecture for sequence-to-sequence learning. +This repository provides a comprehensive solution for real-time **speech-to-text**, **sentiment analysis**, and **English-to-French translation** using state-of-the-art machine learning techniques. It includes an interactive web application and robust pipelines for text processing, sentiment classification, and language translation. 
+ +--- + +## Overview + +![Application Workflow](docs/images/app_workflow.png) + +*Figure: High-level workflow of the application, including speech-to-text, sentiment analysis, and translation.* + + +--- ## Features @@ -27,21 +39,58 @@ This repository contains a sentiment analysis application and an English-to-Fren --- +## Note on Models + +The sentiment analysis and translation models included in this repository are **toy models** designed for demonstration purposes. They may not achieve production-level accuracy and are intended for educational and exploratory use. + +--- + ## Installation +### Prerequisites +- Python 3.8 or higher +- Poetry for dependency management + ### Install Dependencies +1. Install Poetry: + ```bash + pip install poetry + ``` +2. Install project dependencies: + ```bash + poetry install + ``` -Install Poetry if you haven't already: +### Download the Vosk Model +1. Download the `vosk-model-en-us-0.22` model from the [official Vosk repository](https://alphacephei.com/vosk/models). +2. Extract the `.zip` file into the project directory: + ```bash + unzip vosk-model-en-us-0.22.zip -d vosk-model-en-us-0.22 + ``` +3. Ensure the extracted folder is located in the root directory: + ``` + Sentiment_Analysis/ + ├── vosk-model-en-us-0.22/ + └── ... + ``` -```bash -pip install poetry -``` +--- -Then, install the project dependencies: +## Required Datasets -```bash -poetry install -``` +### 1. Sentiment Analysis Dataset +- **Dataset**: [TripAdvisor Hotel Reviews Dataset](https://www.kaggle.com/datasets/andrewmvd/trip-advisor-hotel-reviews) +- **Description**: This dataset consists of 20,000 reviews crawled from TripAdvisor, allowing you to explore what makes a great hotel and potentially use this model in your travels. +- **Setup**: + - Download the dataset from the link above. + - Place the dataset in the `src/data/` directory. + +### 2. 
English-to-French Translation Dataset +- **Dataset**: [English-French Dataset (Kaggle)](https://www.kaggle.com/datasets/dhruvildave/en-fr-translations) +- **Description**: This dataset contains English sentences paired with their French translations. It is used to train and evaluate the Transformer-based translation model. +- **Setup**: + - Download the dataset from the link above. + - Place the dataset in the `src/data/` directory as `en-fr.parquet`. --- @@ -52,7 +101,6 @@ Sentiment_Analysis/ ├── app/ # Application-specific files │ ├── __init__.py # Makes the folder a Python package │ └── voice_to_text_app.py # Main application script -│ ├── src/ # Source folder │ ├── data/ # Dataset folder │ ├── models/ # Saved models @@ -107,86 +155,44 @@ Sentiment_Analysis/ ## Usage ### Interactive Application - -1. **Run the Application** - - Start the Dash-based web application: - +1. **Run the Application**: ```bash poetry run python app/voice_to_text_app.py ``` - -2. **User Interface** - - The application provides the following features: - - **Start Recording**: Click the "Start Recording" button to begin recording your speech. - - **Stop Recording**: Click the "Stop Recording" button to stop recording. +2. **Features**: + - **Start Recording**: Begin recording your speech. + - **Stop Recording**: Stop recording. - **Recognized Text**: Displays the text recognized from your speech. - **Translated Text**: Displays the English-to-French translation of the recognized text. - **Sentiment Analysis**: Displays the sentiment (positive or negative) of the recognized text. - **Download Recognized Text**: Provides a link to download the recognized text as a `.txt` file. -3. **Example Workflow** - - - Click "Start Recording" and speak into your microphone. - - Click "Stop Recording" when you're done. - - The app will display: - - The recognized text. - - The French translation of the recognized text. - - The sentiment analysis result. - ---- - ### Sentiment Analysis - -1. 
**Train or Load the Model** - - Run the sentiment analysis script to train or load the model: - +1. **Train or Load the Model**: ```bash poetry run python src/sentiment_analysis.py ``` - - If a saved model exists, it will be loaded. - Otherwise, a new model will be trained and saved in the `src/models/` folder. - -2. **Evaluate the Model** - - The script evaluates the model on the test dataset and with an accuracy of: - +2. **Evaluate the Model**: + The script evaluates the model on the test dataset: ``` Test Accuracy: 95.00% ``` -3. **Inference** - - Modify the `raw_text_data` variable in `src/sentiment_analysis.py` to test the model with your own text input. The script will output the predicted sentiment. - ### English-to-French Translation - -1. **Prepare the Dataset** - - Place your English-French dataset in the `src/data/` folder. The dataset should be in a format compatible with the `DatasetProcessor` class. - -2. **Train or Load the Model** - - Run the translation script to train or load the Transformer model: - +1. **Prepare the Dataset**: + Place your English-French dataset in the `src/data/` folder. +2. **Train or Load the Model**: ```bash poetry run python src/translation_french_english.py ``` - - If a saved model exists, it will be loaded. - Otherwise, a new model will be trained and saved in the `src/models/` folder. - -3. **Evaluate the Model** - - The script evaluates the model on the test dataset and calculates the BLEU score. - - Output: - +3. **Evaluate the Model**: + The script evaluates the model on the test dataset and calculates the BLEU score: ``` - Test loss: 1.97, Test accuracy: 67.26% + Test loss: 2.13, Test accuracy: 67.26% BLEU score on the test dataset: 0.52 ``` @@ -201,4 +207,12 @@ Sentiment_Analysis/ ## License -This project is licensed under the MIT License. See the `LICENSE` file for details. \ No newline at end of file +This project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for details. 
+ +--- + +## About + +This repository is designed for researchers, developers, and enthusiasts interested in exploring advanced NLP techniques. It provides a practical implementation of speech-to-text, sentiment analysis, and translation pipelines, along with an interactive web application. + +For questions or feedback, feel free to open an issue or contact the repository maintainers. \ No newline at end of file diff --git a/app/README.md b/app/README.md new file mode 100644 index 0000000..814a1b2 --- /dev/null +++ b/app/README.md @@ -0,0 +1,103 @@ +# Speech-to-Text Translation and Sentiment Analysis App + +This directory contains the Dash web application for real-time speech processing with translation and sentiment analysis capabilities. + +## Overview + +This interactive web application allows users to: + +- **Record and convert spoken English to text** using Vosk speech recognition. +- **Translate the recognized English text to French** using a Transformer model. +- **Analyze the sentiment** of the recognized text (positive or negative). +- **Download the recognized text** as a text file. + +## Features + +- **Real-time Audio Processing**: Records audio directly from the user's microphone. +- **Speech Recognition**: Converts spoken words to text using Vosk. +- **Translation**: Translates English text to French using a Transformer-based model. +- **Sentiment Analysis**: Determines if the speech content is positive or negative. +- **Download Option**: Save recognized text for future reference. + +## Prerequisites + +Before running the application, ensure you have: + +1. **Vosk Model**: + - Download the Vosk model (`vosk-model-en-us-0.22`) from the [official Vosk repository](https://alphacephei.com/vosk/models). + - Place the extracted folder in the project root directory. + +2. **English-French Dataset**: + - Ensure the dataset (`src/data/en-fr.parquet`) is available for training and preprocessing. + +3. 
**Transformer Model**: + - Train or download the Transformer model for translation. + - Ensure the model is saved at the path defined in `ModelPaths.TRANSFORMER_MODEL.value`. + +4. **Inference Model**: + - Ensure the sentiment analysis inference model is available at the path defined in `ModelPaths.INFERENCE_MODEL.value`. + +5. **Dependencies**: + - Install all project dependencies using Poetry: + ```bash + poetry install + ``` + +## How to Run + +From the project root directory: + +```bash +poetry run python app.py +``` + +The application will start and be accessible at: [http://127.0.0.1:8050](http://127.0.0.1:8050) + +## Usage Instructions + +### Start Recording: +- Click the **"Start Recording"** button. +- Speak clearly into your microphone in English. + +### Stop Recording: +- Click the **"Stop Recording"** button when finished speaking. + +### View Results: +- The recognized English text will appear. +- Below that, you'll see the French translation. +- The sentiment analysis (positive or negative) will be displayed. + +### Download Results: +- Click **"Download Recognized Text"** to save the recognized text as a `.txt` file. + +## Code Structure + +The application consists of: + +1. **Initialization**: Sets up the `SpeechToText` module, loads the translation model, and configures the Dash app. +2. **User Interface**: Defines the layout with buttons and display areas. +3. **Callback Function**: Handles the recording process, speech recognition, translation, and sentiment analysis. +4. **Main Entry Point**: Starts the Dash server. + +## Troubleshooting + +If you encounter issues: + +- Verify the Vosk model is correctly installed at `vosk-model-en-us-0.22`. +- Check that the dataset exists at `en-fr.parquet`. +- Ensure the Transformer model is available at the path defined in `ModelPaths.TRANSFORMER_MODEL.value`. +- Ensure the sentiment analysis inference model is available at the path defined in `ModelPaths.INFERENCE_MODEL.value`. 
+- Look for error messages in the console logs. + +## Technical Details + +- **Framework**: Dash web application framework. +- **Speech Recognition**: Vosk speech recognition model. +- **Translation**: Custom Transformer model for English-to-French translation. +- **Sentiment Analysis**: Neural network-based sentiment classifier. +- **State Management**: Uses Dash callbacks and `dcc.Store` for managing application state. + +## Development Notes + +- The app runs in debug mode by default. +- For production deployment, set `debug=False` in the `app.run_server()` method. \ No newline at end of file diff --git a/docs/images/app_workflow.png b/docs/images/app_workflow.png new file mode 100644 index 0000000..2a0adce Binary files /dev/null and b/docs/images/app_workflow.png differ diff --git a/src/modules/transformer_components.py b/src/modules/transformer_components.py index b159e97..81a4d30 100644 --- a/src/modules/transformer_components.py +++ b/src/modules/transformer_components.py @@ -61,6 +61,8 @@ def build(self, input_shape): ) self.dense_proj = tf.keras.Sequential( [ + tf.keras.layers.Dense(self.dense_dim, activation="gelu"), + tf.keras.layers.Dense(self.dense_dim, activation="gelu"), tf.keras.layers.Dense(self.dense_dim, activation="gelu"), tf.keras.layers.Dense(self.embed_dim), ] @@ -104,6 +106,8 @@ def build(self, input_shape): ) self.dense_proj = tf.keras.Sequential( [ + tf.keras.layers.Dense(self.dense_dim, activation="gelu"), + tf.keras.layers.Dense(self.dense_dim, activation="gelu"), tf.keras.layers.Dense(self.dense_dim, activation="gelu"), tf.keras.layers.Dense(self.embed_dim), ]
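Review note on the `transformer_components.py` hunks: both changes deepen `dense_proj` from one hidden `Dense(dense_dim, activation="gelu")` layer to three, each followed by the existing `Dense(embed_dim)` projection. The sketch below estimates how that affects the trainable parameter count of one feed-forward block. It is only an estimate under stated assumptions: the sizes `EMBED_DIM = 256` and `DENSE_DIM = 2048` are hypothetical (the diff does not show the real values), the block input is assumed to have `embed_dim` features, and every layer is assumed to keep the Keras default bias term. The helper names are illustrative and do not exist in the repository.

```python
def dense_params(in_dim: int, out_dim: int) -> int:
    """Weights plus biases of one fully connected layer with a bias term."""
    return in_dim * out_dim + out_dim


def feed_forward_params(embed_dim: int, dense_dim: int, hidden_layers: int) -> int:
    """Parameters of `hidden_layers` Dense(dense_dim) layers followed by Dense(embed_dim)."""
    total = 0
    in_dim = embed_dim  # assumption: the block receives embed_dim-sized vectors
    for _ in range(hidden_layers):
        total += dense_params(in_dim, dense_dim)
        in_dim = dense_dim
    # Final projection back to the embedding size, as in the diff.
    return total + dense_params(in_dim, embed_dim)


# Hypothetical sizes, chosen only for illustration.
EMBED_DIM, DENSE_DIM = 256, 2048

before = feed_forward_params(EMBED_DIM, DENSE_DIM, hidden_layers=1)  # pre-change block
after = feed_forward_params(EMBED_DIM, DENSE_DIM, hidden_layers=3)   # post-change block

print(f"before: {before:,}  after: {after:,}  added: {after - before:,}")
```

Under these assumptions each added hidden layer contributes `dense_dim * dense_dim + dense_dim` parameters, so the growth is dominated by the square of `dense_dim`. Models saved before this change would likely no longer match the new layer structure, so retraining (or keeping the old checkpoints separate) may be needed.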