
QuARI: Query Adaptive Retrieval Improvement (NeurIPS 2025)


Overview

Current multimodal embedding models are widely used for image-to-image and text-to-image retrieval, but their global embeddings often miss the fine-grained cues needed for challenging retrieval tasks. QuARI tackles this by learning a query-specific linear projection of a frozen backbone embedding space. A transformer hypernetwork maps each query to both an adapted query embedding and a low-rank projection matrix that is applied to all gallery embeddings, making the adaptation cheap enough to run over millions of items. Trained with a symmetric contrastive loss and additional “semi-positive” neighbors, QuARI emphasizes subspaces that are relevant to the current query while down-weighting irrelevant directions. Experiments on ILIAS and INQUIRE show that this simple query-conditioned adaptation consistently outperforms strong baselines, including static task-adapted encoders and heavyweight re-rankers, while remaining highly efficient at inference time.
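
The core efficiency trick is that the hypernetwork outputs a low-rank update, so no full dim × dim matrix multiply is needed per query. Below is a minimal sketch of that idea, assuming a W = I + U Vᵀ parameterization; the module, names, and shapes are illustrative stand-ins for the paper's transformer hypernetwork, not the repository's actual code:

import torch
import torch.nn as nn
import torch.nn.functional as F

class QueryAdapter(nn.Module):
    """Illustrative hypernetwork: maps a query embedding to an adapted
    query plus the factors of a low-rank projection W = I + U @ V.T.
    (The paper uses a transformer; a linear stand-in keeps the sketch short.)"""
    def __init__(self, dim=512, rank=16):
        super().__init__()
        self.query_head = nn.Linear(dim, dim)      # adapted query embedding
        self.u_head = nn.Linear(dim, dim * rank)   # left factor U
        self.v_head = nn.Linear(dim, dim * rank)   # right factor V
        self.dim, self.rank = dim, rank

    def forward(self, q):                          # q: (dim,) frozen-backbone query embedding
        q_adapted = self.query_head(q)
        U = self.u_head(q).view(self.dim, self.rank)
        V = self.v_head(q).view(self.dim, self.rank)
        return q_adapted, U, V

@torch.no_grad()
def adapted_scores(q, gallery, adapter):
    """Rank gallery (N, dim) under the query-specific projection without
    materializing W: G @ W.T = G + (G @ V) @ U.T, i.e. O(N * dim * rank)."""
    q_adapted, U, V = adapter(q)
    projected = gallery + (gallery @ V) @ U.T
    return F.normalize(projected, dim=-1) @ F.normalize(q_adapted, dim=0)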

Setup

conda env create -f env.yml
conda activate vis-lang

Data Setup

Set the appropriate download directory in downloading/setup_download.sh, then run:

bash download_cc12m.sh
bash download_coco.sh
python cocototar.py \
    --images-dir /path/to/coco/images \
    --captions-json /path/to/coco/captions \
    --out-tar /path/to/output/tarfile

Training

Step 1: Precompute embeddings

python precompute_embeddings.py \
    --extractor openai/clip-vit-base-patch32 \
    --output_path ./precomputed/train_chunks \
    --image_dir ./data/images \
    --tar_regex '.*\.tar$' \
    --chunk_size 50000
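
Conceptually, this step runs every image through the frozen extractor and writes the embeddings to disk in fixed-size chunks. A minimal sketch of that pattern using Hugging Face transformers; the function and file layout are illustrative assumptions, not what precompute_embeddings.py actually does:

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

@torch.no_grad()
def embed_in_chunks(paths, chunk_size=50000, out_prefix="./precomputed/train_chunks/chunk"):
    buffer, chunk_id = [], 0
    for path in paths:
        inputs = processor(images=Image.open(path).convert("RGB"), return_tensors="pt")
        buffer.append(model.get_image_features(**inputs).squeeze(0))
        if len(buffer) == chunk_size:              # flush a full chunk to disk
            torch.save(torch.stack(buffer), f"{out_prefix}_{chunk_id}.pt")
            buffer, chunk_id = [], chunk_id + 1
    if buffer:                                     # flush the final partial chunk
        torch.save(torch.stack(buffer), f"{out_prefix}_{chunk_id}.pt")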

Step 2: Mine semi-positives

python mine_semipositives.py \
    --embeddings_path ./precomputed/train_embeds.pt \
    --output_path ./semipositives/train_semipos.pt \
    --k 100 \
    --top_n 2
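
This step builds, for each training item, a small set of near-but-not-identical neighbors. A sketch of what k-nearest-neighbor semi-positive mining can look like; reading --k as the number of candidates searched and --top_n as the number kept is my assumption, and a full-scale run would use chunked or approximate search (e.g., FAISS) rather than an N × N similarity matrix:

import torch
import torch.nn.functional as F

def mine_semipositives(embeds, k=100, top_n=2):
    """For each item, find its k most similar neighbors by cosine similarity
    and keep the top_n (excluding the item itself) as semi-positives."""
    normed = F.normalize(embeds, dim=-1)
    sims = normed @ normed.T                   # (N, N); fine for a sketch, not at scale
    sims.fill_diagonal_(-float("inf"))         # never pair an item with itself
    candidates = sims.topk(k, dim=-1).indices  # k nearest neighbors per item
    return candidates[:, :top_n]               # (N, top_n) semi-positive indices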

Step 3: Train QuARI

python train.py \
    --json_path ./data/train.json \
    --image_dir ./data/images \
    --extractor openai/clip-vit-base-patch32 \
    --use_precomputed \
    --precomputed_dir ./precomputed \
    --train_semipositives_path ./semipositives/train_semipos.pt \
    --batch_size 512 \
    --max_epochs 10 \
    --freeze_extractors \
    --output_dir ./outputs
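
The overview describes training with a symmetric contrastive loss. In its standard form (CLIP-style InfoNCE averaged over both directions) it looks like the sketch below; this is the generic loss, not necessarily the exact objective in train.py, and the semi-positive weighting is omitted:

import torch
import torch.nn.functional as F

def symmetric_contrastive_loss(query_emb, target_emb, temperature=0.07):
    """InfoNCE in both directions; row i of each tensor is a matching pair."""
    q = F.normalize(query_emb, dim=-1)
    t = F.normalize(target_emb, dim=-1)
    logits = q @ t.T / temperature                     # (B, B) similarity matrix
    labels = torch.arange(q.size(0), device=q.device)  # positives on the diagonal
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2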

Evaluation

python eval_retrieval.py \
    --embeddings_dir ./precomputed/val \
    --checkpoint_path ./outputs/checkpoints/best.ckpt \
    --distractor_dirs ./distractors/yfcc \
    --eval_baseline

Using Pretrained Models

Download the pretrained weights by running:

python download_ckpts.py

Citation

@inproceedings{xing2025quari,
  title={QuARI: Query Adaptive Retrieval Improvement},
  author={Xing, Eric and Stylianou, Abby and Pless, Robert and Jacobs, Nathan},
  booktitle={The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS)},
  year={2025}
}
