MovieMind is a FastAPI-based application that combines TMDB movie data with Elasticsearch's powerful search capabilities and Google's Gemini embeddings for semantic search. This allows users to search for movies using natural language queries and find semantically relevant results.
- Fetch and index popular movies from TMDB
- Semantic search using Google's Gemini embeddings
- Elasticsearch-powered search engine
- RESTful API with FastAPI
- Docker support for Elasticsearch
- Async operations for better performance
- Comprehensive test coverage
- Python 3.8+
- Docker and Docker Compose
- TMDB API key (get it from TMDB)
- Google Gemini API key (get it from Google AI Studio)
- Clone the repository:
git clone https://github.com/yourusername/MovieMind.git
cd MovieMind
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows, use: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Create a .env file in the project root and add your configuration:
TMDB_API_KEY=your_tmdb_api_key_here
ELASTICSEARCH_HOST=http://localhost:9200
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=your_password_here
ELASTICSEARCH_INDEX=movies
GEMINI_API_KEY=your_gemini_api_key_here
- Start Elasticsearch using Docker:
docker-compose up -d
- Run the FastAPI application:
uvicorn app.main:app --reload
The application will be available at http://localhost:8000
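The configuration keys listed above could be read and validated at startup along these lines. This is a minimal sketch, not the project's actual settings code: the `Settings` class and `load_settings` function are hypothetical names, and it assumes the `.env` values have already been loaded into the process environment by whatever mechanism the project uses.

```python
import os
from dataclasses import dataclass

# Keys taken from the .env example above.
REQUIRED_KEYS = (
    "TMDB_API_KEY",
    "ELASTICSEARCH_HOST",
    "ELASTICSEARCH_USERNAME",
    "ELASTICSEARCH_PASSWORD",
    "ELASTICSEARCH_INDEX",
    "GEMINI_API_KEY",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values in .env (not the project's own class)."""
    tmdb_api_key: str
    elasticsearch_host: str
    elasticsearch_username: str
    elasticsearch_password: str
    elasticsearch_index: str
    gemini_api_key: str


def load_settings(env=None):
    """Build Settings from a mapping (os.environ by default), failing fast on gaps."""
    env = os.environ if env is None else env
    # Treat unset and empty values the same way: both count as missing.
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing configuration: " + ", ".join(missing))
    # Field names are the lowercase form of the environment variable names.
    return Settings(**{key.lower(): env[key] for key in REQUIRED_KEYS})
```

Failing at startup with the list of missing keys tends to be easier to diagnose than a later authentication error from TMDB, Gemini, or Elasticsearch.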
- Endpoint: POST /index-movies/
- Query Parameters:
  - limit (optional, default=100): Number of movies to index
- Description: Fetches popular movies from TMDB and indexes them in Elasticsearch
- Endpoint: GET /search/
- Query Parameters:
  - query: Search query string
  - size (optional, default=10): Number of results to return
- Description: Performs semantic search on indexed movies
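As an illustration of how the two endpoints above might be called, the sketch below builds the requests with Python's standard library. The helper names `index_movies_request` and `search_request` are hypothetical, the base URL assumes the default local address from the setup steps, and the response format is not described here, so the sketch stops short of parsing it.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumed: default address from the setup steps


def index_movies_request(limit=100):
    """Build a POST /index-movies/ request with the optional `limit` parameter."""
    params = urlencode({"limit": limit})
    return Request(f"{BASE_URL}/index-movies/?{params}", method="POST")


def search_request(query, size=10):
    """Build a GET /search/ request with `query` and the optional `size` parameter."""
    # urlencode escapes spaces and special characters in the natural-language query.
    params = urlencode({"query": query, "size": size})
    return Request(f"{BASE_URL}/search/?{params}", method="GET")
```

With the application running, either request can be sent with `urllib.request.urlopen(request)`.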
- Movie Indexing:
  - Fetches popular movies from TMDB
  - Generates semantic embeddings using Google's Gemini API
  - Stores movies and their embeddings in Elasticsearch
- Semantic Search:
  - Converts search query to embedding using Gemini API
  - Uses cosine similarity to find semantically similar movies
  - Returns ranked results based on similarity score
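The ranking step can be illustrated in plain Python. This is only a sketch of cosine-similarity ranking under simplifying assumptions: in the application the comparison is presumably carried out by Elasticsearch rather than in Python, the vectors below are invented three-dimensional toys rather than real Gemini embeddings, and `rank_movies` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_movies(query_embedding, movie_embeddings, size=10):
    """Return the `size` best (title, score) pairs, highest similarity first."""
    scored = [
        (title, cosine_similarity(query_embedding, embedding))
        for title, embedding in movie_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:size]


# Toy vectors, invented for illustration only.
movies = {
    "Movie A": [1.0, 0.0, 0.0],
    "Movie B": [0.0, 1.0, 0.0],
    "Movie C": [0.7, 0.7, 0.0],
}
top = rank_movies([1.0, 0.1, 0.0], movies, size=2)
print([title for title, _ in top])  # ['Movie A', 'Movie C']
```

Cosine similarity compares direction rather than magnitude, so two vectors pointing the same way score 1.0 regardless of their lengths.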
Run the test suite:
pytest tests/
Once the application is running, interactive API documentation is available at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
The project uses Docker Compose for running Elasticsearch. The configuration includes:
- Single-node Elasticsearch cluster
- Persistent volume for data storage
- Basic security enabled
- Memory settings optimized for development
- Make sure to activate the virtual environment:
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install development dependencies:
pip install -r requirements.txt
- Run tests:
pytest tests/
- Elasticsearch Connection Issues:
  - Ensure Docker is running
  - Check if Elasticsearch container is up: docker ps
  - Verify correct password in .env file
- API Key Issues:
  - Verify TMDB API key is valid
  - Ensure Gemini API key has proper permissions
  - Check if API keys are correctly set in .env file
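When diagnosing the Elasticsearch connection issues listed above, a small script can help tell "container not reachable" apart from "wrong password". The sketch below is a hypothetical helper using only the standard library; it assumes Elasticsearch answers plain HTTP requests at the `ELASTICSEARCH_HOST` address with HTTP Basic authentication, as the `.env` example suggests.

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(username, password):
    """Encode credentials as an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def check_elasticsearch(host, username, password, timeout=5.0):
    """Return a short human-readable status for the Elasticsearch root endpoint."""
    request = urllib.request.Request(
        host, headers={"Authorization": basic_auth_header(username, password)}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return f"OK: HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, e.g. 401 when the password in .env is wrong.
        return f"Reachable but rejected the request: HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # No answer at all, e.g. the container is not running.
        return f"Unreachable: {exc.reason}"
    except OSError as exc:
        # Covers lower-level socket errors such as read timeouts.
        return f"Unreachable: {exc}"
```

For example, `check_elasticsearch("http://localhost:9200", "elastic", "your_password_here")` could be run after `docker-compose up -d` to confirm the credentials before starting the application.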
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License; see the LICENSE file for details.