Cna-Wangdi/ISO503

ISY503 – Intelligent Systems NLP Sentiment Analysis Project (Bidirectional LSTM + Streamlit App)

This repository contains the full implementation for our ISY503 Assessment 3 project, where we developed a Natural Language Processing (NLP) Sentiment Analysis System using a Bidirectional LSTM model trained on the Amazon Review Dataset. The project includes the complete training pipeline, a saved deep learning model, and a Streamlit-based web application for real-time sentiment prediction.

Project Structure:

ISY503_NLP
  ├── sentiment_pipeline.py
  ├── app.py
  ├── sentiment_lstm_model.keras
  ├── tokenizer.pickle
  ├── domain_sentiment_data/
  │   └── sorted_data_acl/
  │       ├── books/
  │       ├── dvd/
  │       ├── electronics/
  │       └── kitchen_&_housewares/
  └── README.md

1. Project Overview

This project implements a binary sentiment classifier that predicts whether a customer review is Positive or Negative.

We used:

* Multi-Domain Sentiment Dataset (Books / DVD / Electronics / Kitchen domains): https://www.cs.jhu.edu/~mdredze/datasets/sentiment/index2.html

* Text preprocessing (cleaning, tokenisation, padding)

* Bidirectional LSTM deep learning model

* Class balancing

* Train/Validation/Test split

* Streamlit web UI for demonstration

The system is lightweight, easy to use, and demonstrates key concepts of Intelligent Systems and modern NLP.
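The preprocessing steps listed above (cleaning, tokenisation, padding) can be illustrated with a minimal plain-Python sketch. This is not the code in sentiment_pipeline.py: the repository ships a tokenizer.pickle, so the real pipeline presumably uses a library tokenizer. The function names, the cleaning rules, the vocabulary size, and the sequence length below are all assumptions chosen for illustration.

```python
import re
from collections import Counter

def clean_text(text):
    """Lowercase, strip HTML-like tags, and keep only letters and spaces."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)      # drop markup remnants
    text = re.sub(r"[^a-z\s]", " ", text)     # drop digits and punctuation
    return re.sub(r"\s+", " ", text).strip()  # collapse whitespace

def build_vocab(texts, max_words=10000):
    """Map the most frequent words to ids; 0 is padding, 1 is unknown (assumed)."""
    counts = Counter(word for t in texts for word in t.split())
    return {w: i + 2 for i, (w, _) in enumerate(counts.most_common(max_words))}

def encode(text, vocab, max_len=200):
    """Convert a cleaned review to a fixed-length id list, truncated or post-padded."""
    ids = [vocab.get(w, 1) for w in text.split()][:max_len]
    return ids + [0] * (max_len - len(ids))

reviews = ["Great book, loved it!", "Terrible DVD... <br/>broke in 2 days."]
cleaned = [clean_text(r) for r in reviews]
vocab = build_vocab(cleaned)
print(cleaned[1])                             # terrible dvd broke in days
print(encode(cleaned[0], vocab, max_len=6))   # four word ids followed by two zeros
```

Padding every review to the same length is what lets the model process reviews in fixed-shape batches.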

2. Model Architecture

The final model uses a Bidirectional LSTM with:

* Embedding layer

* BiLSTM (256 units)

* GlobalMaxPool1D

* Dense (128 → 64) with ReLU

* Dropout regularisation

* Sigmoid output for binary classification

Features:

* Handles long-range dependencies

* Learns both forward and backward context

* Strong performance on sentiment classification
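The layer stack described in this section could be expressed in Keras roughly as follows. This is a hedged sketch, not the exact model saved in sentiment_lstm_model.keras: the vocabulary size, embedding dimension, sequence length, dropout rate, dropout placement, and optimizer are assumptions, and "256 units" is read here as 256 units per direction.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(vocab_size=10000, embed_dim=128, max_len=200, dropout=0.5):
    """BiLSTM sentiment classifier; every default value here is an assumption."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(max_len,)),
        layers.Embedding(vocab_size, embed_dim),
        # return_sequences=True so the pooling layer sees every time step
        layers.Bidirectional(layers.LSTM(256, return_sequences=True)),
        layers.GlobalMaxPool1D(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(dropout),
        layers.Dense(64, activation="relu"),
        layers.Dropout(dropout),
        # a single sigmoid unit outputs P(review is positive)
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_model()
model.summary()
```

Global max pooling keeps the strongest activation of each feature across the whole review, so a single strongly worded phrase can influence the prediction regardless of where it appears.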

3. Installation & Setup

Step 1 — Clone the Repository

git clone https://github.com/USERNAME/ISY503_NLP.git

cd ISY503_NLP

Step 2 — Install Dependencies

pip install -r requirements.txt

Step 3 — Prepare the Dataset

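Download the Multi-Domain Sentiment Dataset from the link in the Project Overview and extract it so that the domain_sentiment_data/ folder matches the layout shown under Project Structure.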

Step 4 — Train the Model

python sentiment_pipeline.py

This will generate:

* sentiment_lstm_model.keras

* tokenizer.pickle

4. Running the Web Application

Start the Streamlit app:
    streamlit run app.py

Your browser will open:
    http://localhost:8501
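Because the model ends in a sigmoid, each prediction is a single probability that the app has to turn into a label and a confidence score. A minimal sketch of that step is shown below; the helper name describe_prediction and the 0.5 threshold are hypothetical, and app.py may implement this differently.

```python
def describe_prediction(probability, threshold=0.5):
    """Turn a sigmoid output in [0, 1] into a label and a confidence score."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= threshold:
        return "Positive", probability
    # for the negative class, confidence is the complement of the output
    return "Negative", 1.0 - probability

label, confidence = describe_prediction(0.91)
print(f"{label} ({confidence:.0%} confident)")  # Positive (91% confident)
```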

5. Dataset Information

We used the Multi-Domain Sentiment Dataset, which contains labelled positive and negative Amazon product reviews across four domains:

* Books

* DVD

* Electronics

* Kitchen & Housewares
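Each of the four domains above corresponds to a folder under domain_sentiment_data/sorted_data_acl/. The sketch below shows one way the reviews could be read into (text, label) pairs. It assumes each folder holds positive.review and negative.review files in which the review body is wrapped in review_text tags; verify this against your download. The helper names are hypothetical.

```python
import re
from pathlib import Path

# Assumed file format: pseudo-XML with one <review_text> element per review.
REVIEW_TEXT = re.compile(r"<review_text>(.*?)</review_text>", re.DOTALL)

def extract_reviews(raw):
    """Return the stripped text of every <review_text> element in raw."""
    return [m.strip() for m in REVIEW_TEXT.findall(raw)]

def load_domain(domain_dir):
    """Read one domain folder into (text, label) pairs; 1 = positive, 0 = negative."""
    samples = []
    for filename, label in (("positive.review", 1), ("negative.review", 0)):
        raw = Path(domain_dir, filename).read_text(encoding="utf-8", errors="ignore")
        samples += [(text, label) for text in extract_reviews(raw)]
    return samples

sample = "<review><rating>5.0</rating><review_text>\nLoved it.\n</review_text></review>"
print(extract_reviews(sample))  # ['Loved it.']
```

A regular expression is used instead of an XML parser because files of this kind are often not well-formed XML.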

A randomised, cleaned version was used to avoid bias and noise.

Source: Dredze, M. (n.d.). Sentiment datasets. Johns Hopkins University. http://www.cs.jhu.edu/~mdredze/datasets/sentiment/
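The class balancing and train/validation/test split listed in the Project Overview, together with the randomisation mentioned above, could be sketched as follows. Downsampling the larger class, the fixed seed, and the 80/10/10 proportions are assumptions; the source does not state which balancing method or split ratios the training script uses.

```python
import random

def balance_classes(samples, seed=42):
    """Downsample the larger class so both labels have the same count, then shuffle."""
    rng = random.Random(seed)
    pos = [s for s in samples if s[1] == 1]
    neg = [s for s in samples if s[1] == 0]
    n = min(len(pos), len(neg))
    balanced = rng.sample(pos, n) + rng.sample(neg, n)
    rng.shuffle(balanced)
    return balanced

def split(samples, val_frac=0.1, test_frac=0.1):
    """Slice already-shuffled samples into train, validation, and test lists."""
    n_val = int(len(samples) * val_frac)
    n_test = int(len(samples) * test_frac)
    n_train = len(samples) - n_val - n_test
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

# Toy data: 60 positive and 40 negative (text, label) pairs.
data = [(f"review {i}", 1) for i in range(60)] + [(f"review {i}", 0) for i in range(40)]
train, val, test = split(balance_classes(data))
print(len(train), len(val), len(test))  # 64 8 8
```

Keeping the test slice separate until the end gives an accuracy estimate on reviews the model never saw during training or tuning.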

6. Ethical Considerations

Our group considered the following ethical issues:

* Bias: model may favour positive reviews due to imbalance

* Misclassification: harmful in real-world scenarios

* Data limitations: only English customer reviews

* Transparency: cleaning steps, limitations, and confidence scores displayed

* Accountability: model should not be used for high-risk decisions

7. Final Notes

This project demonstrates a complete Intelligent Systems pipeline using modern NLP techniques and provides a fully functional sentiment analysis interface. It is suitable for academic demonstration, future extension, and practical learning.
