
Simple LSTM RNN

A minimal, readable implementation of an LSTM (Long Short-Term Memory) recurrent neural network using TensorFlow/Keras, aimed at beginners who want to understand how LSTMs work on sequence modeling problems.


📌 Features

  • ✅ Clean and minimalistic implementation using tensorflow.keras
  • ✅ Easy to extend for custom sequence learning tasks
  • ✅ Includes preprocessing, model building, training, and evaluation steps

🧠 What is an LSTM?

LSTM stands for Long Short-Term Memory. It is a special kind of Recurrent Neural Network (RNN) capable of learning long-term dependencies in sequential data by retaining information over long periods.

Traditional RNNs struggle with long sequences because of the vanishing gradient problem. LSTMs mitigate this with an explicit cell state regulated by three gates (a minimal sketch follows the list below):

  • Input Gate: Controls how much of the new input should be written to memory
  • Forget Gate: Controls how much of the existing memory to forget
  • Output Gate: Controls how much of the memory to expose as output
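As a rough sketch, here is one LSTM time step in NumPy. The parameter names W, U, b and the gate ordering are illustrative only; Keras implements this internally with its own layout and extra details such as recurrent dropout and batching.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W, U, b stack the parameters for the three gates plus the
    # candidate cell update (illustrative shapes: W is (features, 4*units),
    # U is (units, 4*units), b is (4*units,)).
    z = x_t @ W + h_prev @ U + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # gate activations in [0, 1]
    g = np.tanh(g)                                # candidate memory content
    c_t = f * c_prev + i * g                      # forget old memory, write new
    h_t = o * np.tanh(c_t)                        # expose memory as the output
    return h_t, c_t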

These gates make LSTMs highly effective for tasks involving:

  • Time series prediction
  • Natural language processing (e.g. text generation, sentiment analysis)
  • Sequence classification

[Figure: LSTM cell diagram]


🚀 How to Run

🧰 Setup using Conda (Recommended)

conda create -n lstm_env python=3.10 -y
conda activate lstm_env
pip install -r requirements.txt
jupyter notebook LSTM_RNN.ipynb

⚙️ Model Architecture

The model uses the Keras Sequential API:

  • LSTM: Core layer to handle sequential data
  • Dense: Final output layer (sigmoid for binary classification, softmax for multi-class)

Sample architecture:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    LSTM(128, input_shape=(time_steps, features)),
    Dense(1, activation='sigmoid')
])

📊 Dataset

This repo uses synthetic sequential data generated within the notebook. You can replace it with your own dataset (a generator sketch follows the list below):

  • Format: (samples, time_steps, features)
  • You can use numpy, pandas, or any other method for data preprocessing
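A minimal stand-in for the notebook's generator. The real data lives in LSTM_RNN.ipynb; the random-walk rule here is purely illustrative, but the array shapes match the expected (samples, time_steps, features) format.

import numpy as np

samples, time_steps, features = 1000, 20, 1

# Illustrative generator: random walks, labeled 1 when a sequence
# ends above where it started.
X = np.cumsum(np.random.randn(samples, time_steps, features), axis=1)
y = (X[:, -1, 0] > X[:, 0, 0]).astype("float32")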

🔁 How It Works

  1. Data Generation: Synthetic sequences are generated in the notebook for demonstration
  2. Preprocessing: Data is reshaped and normalized
  3. Model Building: An LSTM layer followed by a Dense output layer
  4. Training: The model is trained with binary_crossentropy loss and the Adam optimizer (see the sketch after this list)
  5. Evaluation: Loss and accuracy are monitored during training, and test predictions are plotted
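An end-to-end training sketch, assuming the model from the architecture section and the X, y arrays from the dataset sketch. The epoch count and batch size are illustrative, not the notebook's exact settings.

# Continues from the earlier sketches: `model`, `X`, and `y`.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)
loss, accuracy = model.evaluate(X, y, verbose=0)
print(f"loss={loss:.3f} accuracy={accuracy:.3f}")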

🛠️ Customization

You can modify the model or dataset for use cases like the following (a forecasting variant is sketched after this list):

  • Time series forecasting (e.g., stock prices)
  • Sentiment classification
  • Character-level text generation
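For example, a regression-style forecasting variant might swap the output head and loss. This is a sketch under that assumption, not code from this repo; the layer sizes are illustrative.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

time_steps, features = 20, 1  # illustrative input shape

# Hypothetical forecasting head: one linear unit and MSE loss.
model = Sequential([
    LSTM(64, input_shape=(time_steps, features)),
    Dense(1)  # linear activation for a real-valued forecast
])
model.compile(optimizer='adam', loss='mse')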

🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to fork and submit a PR.


📜 License

MIT License. See LICENSE file for more info.


🙌 Acknowledgments

  • TensorFlow/Keras team for the incredible deep learning framework
  • Stanford CS231n, Coursera Deep Learning Specialization

📬 Contact

Made with ❤️ by Tejas


"Learn it by building it."
