aakashdeepsil/MovieMind

MovieMind: Semantic Movie Search Engine

MovieMind is a FastAPI-based application that combines TMDB movie data with Elasticsearch's powerful search capabilities and Google's Gemini embeddings for semantic search. This allows users to search for movies using natural language queries and find semantically relevant results.

Features

  • Fetch and index popular movies from TMDB
  • Semantic search using Google's Gemini embeddings
  • Elasticsearch-powered search engine
  • RESTful API with FastAPI
  • Docker support for Elasticsearch
  • Async operations for better performance
  • Comprehensive test coverage

Prerequisites

  • Python 3.8+
  • Docker and Docker Compose
  • TMDB API key (get it from TMDB)
  • Google Gemini API key (get it from Google AI Studio)

Setup

  1. Clone the repository:
git clone https://github.com/aakashdeepsil/MovieMind.git
cd MovieMind
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Create a .env file in the project root and add your configuration:
TMDB_API_KEY=your_tmdb_api_key_here
ELASTICSEARCH_HOST=http://localhost:9200
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=your_password_here
ELASTICSEARCH_INDEX=movies
GEMINI_API_KEY=your_gemini_api_key_here
  5. Start Elasticsearch using Docker:
docker-compose up -d
  6. Run the FastAPI application:
uvicorn app.main:app --reload

The application will be available at http://localhost:8000
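
The configuration values from the .env step can be read and validated at startup roughly as follows. This is a minimal sketch, not the project's actual settings code: the `Settings` class and `load_settings` function are hypothetical names, and the sketch assumes the .env values have already been exported into the process environment (for example by a dotenv loader, which the project may or may not use).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# The six keys listed in the .env example above.
REQUIRED_KEYS = (
    "TMDB_API_KEY",
    "ELASTICSEARCH_HOST",
    "ELASTICSEARCH_USERNAME",
    "ELASTICSEARCH_PASSWORD",
    "ELASTICSEARCH_INDEX",
    "GEMINI_API_KEY",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values from .env."""

    tmdb_api_key: str
    elasticsearch_host: str
    elasticsearch_username: str
    elasticsearch_password: str
    elasticsearch_index: str
    gemini_api_key: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ).

    Raises RuntimeError naming every key that is missing or empty.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing configuration: " + ", ".join(missing))
    return Settings(**{key.lower(): env[key] for key in REQUIRED_KEYS})
```

Failing fast with the full list of missing keys tends to be easier to debug than a connection error later on.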

API Endpoints

1. Index Movies

  • Endpoint: POST /index-movies/
  • Query Parameters:
    • limit (optional, default=100): Number of movies to index
  • Description: Fetches popular movies from TMDB and indexes them in Elasticsearch

2. Search Movies

  • Endpoint: GET /search/
  • Query Parameters:
    • query: Search query string
    • size (optional, default=10): Number of results to return
  • Description: Performs semantic search on indexed movies
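
A small client for the two endpoints above might look like the sketch below. It uses only the Python standard library. The helper names (`build_index_request`, `build_search_request`, `send`) are hypothetical, and the assumption that responses are JSON is based on FastAPI's default behaviour, not on anything the README states about the response shape.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # default address from the Setup section


def build_index_request(limit: int = 100, base_url: str = BASE_URL) -> Request:
    """POST /index-movies/ with the optional `limit` query parameter."""
    params = urlencode({"limit": limit})
    return Request(f"{base_url}/index-movies/?{params}", method="POST")


def build_search_request(query: str, size: int = 10, base_url: str = BASE_URL) -> Request:
    """GET /search/ with `query` and the optional `size` query parameter."""
    params = urlencode({"query": query, "size": size})
    return Request(f"{base_url}/search/?{params}", method="GET")


def send(request: Request, timeout: float = 30.0):
    """Send the request and decode the body, assuming it is JSON."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the application to be running locally.
    print(send(build_index_request(limit=25)))
    print(send(build_search_request("a heist that goes wrong in space", size=5)))
```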

How It Works

  1. Movie Indexing:

    • Fetches popular movies from TMDB
    • Generates semantic embeddings using Google's Gemini API
    • Stores movies and their embeddings in Elasticsearch
  2. Semantic Search:

    • Converts search query to embedding using Gemini API
    • Uses cosine similarity to find semantically similar movies
    • Returns ranked results based on similarity score
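
The ranking step described above can be illustrated with a short sketch. The first two functions compute cosine similarity in plain Python over toy vectors; real Gemini embeddings have many more dimensions. The last function builds one possible Elasticsearch request body using the `script_score` query with the built-in `cosineSimilarity` function. The field name `embedding` is an assumption, and the project may use a different query type (for example kNN search) instead.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_movies(
    query_vector: Sequence[float],
    movie_vectors: Dict[str, Sequence[float]],
    size: int = 10,
) -> List[Tuple[str, float]]:
    """Return the top `size` (title, score) pairs, best match first."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in movie_vectors.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:size]


def build_script_score_query(query_vector: Sequence[float], size: int = 10) -> dict:
    """Request body for an Elasticsearch script_score search.

    Assumes documents store their vector in a dense_vector field named
    "embedding" (hypothetical). Adding 1.0 keeps the score non-negative,
    which Elasticsearch requires.
    """
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }
```

In practice the similarity computation runs inside Elasticsearch; the pure-Python version is only meant to show what the score represents.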

Testing

Run the test suite:

pytest tests/

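API Documentation

FastAPI serves interactive documentation by default, so with the application running it should be available at http://localhost:8000/docs (Swagger UI) and http://localhost:8000/redoc (ReDoc), unless the app disables these routes.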

Docker Support

The project uses Docker Compose for running Elasticsearch. The configuration includes:

  • Single-node Elasticsearch cluster
  • Persistent volume for data storage
  • Basic security enabled
  • Memory settings optimized for development

Development

  1. Make sure to activate the virtual environment:
source venv/bin/activate  # On Windows: venv\Scripts\activate
  2. Install development dependencies:
pip install -r requirements.txt
  3. Run tests:
pytest tests/

Troubleshooting

  1. Elasticsearch Connection Issues:

    • Ensure Docker is running
    • Check if Elasticsearch container is up: docker ps
    • Verify correct password in .env file
  2. API Key Issues:

    • Verify TMDB API key is valid
    • Ensure Gemini API key has proper permissions
    • Check if API keys are correctly set in .env file
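
To tell an unreachable container apart from a wrong password, a quick check like the following can help. It is a sketch using only the standard library; `basic_auth_header` and `check_elasticsearch` are hypothetical helper names, and it assumes Elasticsearch accepts HTTP basic authentication on the host given in .env.

```python
import base64
from typing import Tuple
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def basic_auth_header(username: str, password: str) -> str:
    """Value for the Authorization header using HTTP basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def check_elasticsearch(
    host: str, username: str, password: str, timeout: float = 5.0
) -> Tuple[bool, str]:
    """Return (ok, message) describing whether the host answered."""
    request = Request(
        host, headers={"Authorization": basic_auth_header(username, password)}
    )
    try:
        with urlopen(request, timeout=timeout) as response:
            return True, f"reachable (HTTP {response.status})"
    except HTTPError as error:
        # The server answered, so the container is up.
        if error.code == 401:
            return False, "reachable, but credentials were rejected (check .env)"
        return False, f"reachable, but returned HTTP {error.code}"
    except (URLError, OSError) as error:
        # Connection refused, DNS failure, timeout, and similar.
        return False, f"not reachable: {error}"


if __name__ == "__main__":
    # Values mirror the .env example; replace the password with your own.
    print(check_elasticsearch("http://localhost:9200", "elastic", "your_password_here"))
```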

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.
