This project implements a text-based image search system. Users enter a descriptive query, and the system returns the images that most closely match that description.

r-butl/TextBasedImageSearch

🖼️ Text-Based Image Search

This project implements a machine learning system that allows users to retrieve relevant images based on natural language text queries. The system aligns text and image feature embeddings using a trained neural network and retrieves the closest-matching images based on cosine similarity.

🔍 Overview

Given a text query (e.g., "a cat sitting on a couch"), the system:

  1. Encodes the query using a pretrained text embedding model (MiniLM-L6-v2).
  2. Uses a trained feedforward neural network to map the text embedding into image embedding space.
  3. Compares the predicted image embedding against a database of image embeddings (extracted using DINOv2).
  4. Returns the most semantically relevant images using cosine similarity.
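
The retrieval step (steps 3 and 4 above) can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names, the dictionary layout of `image_db`, and the toy 3-dimensional vectors are all assumptions made for the example. In the real system the query vector would come from the trained network and the stored vectors from DINOv2.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)

def top_k_images(query_embedding, image_db, k=3):
    """Return the k (image_id, score) pairs most similar to the query.

    image_db maps an image id to its embedding (hypothetical layout).
    """
    scored = [
        (image_id, cosine_similarity(query_embedding, embedding))
        for image_id, embedding in image_db.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest similarity first
    return scored[:k]

# Toy embeddings standing in for DINOv2 features (illustrative values only).
image_db = {
    "cat_couch.jpg": [0.9, 0.1, 0.0],
    "dog_park.jpg": [0.1, 0.9, 0.0],
    "street_sign.jpg": [0.0, 0.1, 0.9],
}
predicted = [1.0, 0.2, 0.0]  # stands in for the network's predicted image embedding
results = top_k_images(predicted, image_db, k=2)
```

A linear scan like this is fine for a small database. For larger collections, a common alternative is to normalize all embeddings once and use a matrix product or an approximate nearest-neighbor index.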

🧠 Model Architecture

  • Text Embedding: MiniLM-L6-v2
  • Image Embedding: DINOv2
  • Neural Network: 5-layer feedforward network
  • Loss Function: Cosine Similarity Loss
  • Optimization: Adam with a ReduceLROnPlateau learning-rate scheduler
  • Training Dataset: TextCaps (28k images with 140k captions)
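
To make the loss function concrete, here is a small sketch of one common form of cosine similarity loss, `1 - cos(predicted, target)`, averaged over a batch. The README does not state the exact formulation used in training, so this form, the function names, and the `eps` guard are assumptions for illustration.

```python
import math

def cosine_similarity_loss(predicted, target, eps=1e-8):
    """Return 1 - cos(predicted, target).

    The loss is 0 when the vectors point the same way, 1 when they are
    orthogonal, and 2 when they point in opposite directions.
    """
    dot = sum(p * t for p, t in zip(predicted, target))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_t = math.sqrt(sum(t * t for t in target))
    # eps keeps the division defined if either vector is all zeros.
    return 1.0 - dot / max(norm_p * norm_t, eps)

def batch_loss(predicted_batch, target_batch):
    """Mean cosine similarity loss over paired predictions and targets."""
    losses = [
        cosine_similarity_loss(p, t)
        for p, t in zip(predicted_batch, target_batch)
    ]
    return sum(losses) / len(losses)

# Two toy pairs: one perfectly aligned, one orthogonal.
example = batch_loss([[1.0, 0.0], [1.0, 0.0]], [[2.0, 0.0], [0.0, 1.0]])
```

Because the loss depends only on direction, not magnitude, it matches the retrieval step, which also ranks images by cosine similarity.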

📊 Evaluation Metrics

  • MRR (Mean Reciprocal Rank)
  • Precision / Recall / Jaccard Similarity (evaluated on both fine-grained and coarse-grained class sets)
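
The metrics above can be sketched as follows. This is an illustrative version under stated assumptions: each query is assumed to have a single ground-truth image for MRR, and precision, recall, and Jaccard similarity are assumed to compare the set of class labels in the retrieved images with the labels relevant to the query. The README does not spell out these details, and the function names are hypothetical.

```python
def mean_reciprocal_rank(ranked_lists, target_ids):
    """Average of 1/rank of the target image across queries.

    ranked_lists[i] is the ranked list of image ids returned for query i;
    target_ids[i] is that query's ground-truth image id. A query whose
    target is missing from its ranking contributes 0.
    """
    total = 0.0
    for ranking, target in zip(ranked_lists, target_ids):
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)  # ranks start at 1
    return total / len(ranked_lists)

def set_metrics(retrieved, relevant):
    """Return (precision, recall, jaccard) for two collections of labels."""
    retrieved, relevant = set(retrieved), set(relevant)
    overlap = len(retrieved & relevant)
    union = len(retrieved | relevant)
    precision = overlap / len(retrieved) if retrieved else 0.0
    recall = overlap / len(relevant) if relevant else 0.0
    jaccard = overlap / union if union else 0.0
    return precision, recall, jaccard

# Three toy queries whose target image "a" is ranked 1st, 2nd, and not at all.
mrr = mean_reciprocal_rank(
    [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "d"]],
    ["a", "a", "a"],
)
precision, recall, jaccard = set_metrics(
    ["cat", "sofa", "lamp"], ["cat", "sofa", "rug", "tv"]
)
```

Under this reading, coarse-grained evaluation would use broader class labels than fine-grained evaluation, which makes overlap easier to achieve and typically raises recall.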

📈 The model achieved a recall of 0.84 on the coarse-grained class set, indicating a strong ability to retrieve semantically relevant images.


🚀 Getting Started

🔎 Hyperparameter Search

Run the search script to find the best layer sizes and training configuration:

python hyperparameter_search.py

🏋️‍♂️ Train the Model

Train using the best configuration:

python train.py

🧪 Test the Model

Evaluate the trained model using cosine similarity and MRR:

python test.py

📎 References & Docs


👨‍💻 Contributors

  • Lucas Butler
  • Boxi Chen
  • Anthony Pecoraro
  • Hayat White
