
LMM-Det: Make Large Multimodal Models Excel in Object Detection

This repository is the official implementation of LMM-Det, a simple yet effective approach that leverages a Large Multimodal Model for vanilla object Detection without relying on specialized detection modules.
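The README does not show LMM-Det's actual input/output format. Purely as an illustration of the general idea behind LMM-based detection — exchanging bounding boxes with the language model as plain text — here is a minimal sketch. The `<box>` tag format, function names, and normalized coordinates are assumptions for illustration, not LMM-Det's interface:

```python
import re

# Illustrative only: serialize/parse detections as text, the way many
# LMM-based detectors exchange boxes with the language model.
# The format and names here are assumptions, NOT LMM-Det's actual interface.

def boxes_to_text(detections):
    """Serialize (label, x1, y1, x2, y2) tuples into a text response.

    Coordinates are normalized to [0, 1] and quantized to 3 decimals,
    a common trick to keep the coordinate vocabulary small.
    """
    parts = []
    for label, x1, y1, x2, y2 in detections:
        parts.append(f"{label}<box>({x1:.3f},{y1:.3f}),({x2:.3f},{y2:.3f})</box>")
    return "; ".join(parts)

def text_to_boxes(text):
    """Parse the text back into (label, x1, y1, x2, y2) tuples."""
    pattern = re.compile(
        r"(\w+)<box>\(([\d.]+),([\d.]+)\),\(([\d.]+),([\d.]+)\)</box>")
    return [(m[0], *(float(v) for v in m[1:]))
            for m in pattern.findall(text)]

dets = [("person", 0.10, 0.20, 0.50, 0.90), ("dog", 0.55, 0.60, 0.80, 0.95)]
assert text_to_boxes(boxes_to_text(dets)) == dets  # lossless round-trip
```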

LMM-Det: Make Large Multimodal Models Excel in Object Detection
Jincheng Li*, Chunyu Xie*, Ji Ao, Dawei Leng†, Yuhui Yin (*Equal Contribution, ✝Corresponding Author)

中文 (Chinese)

Media coverage: 我爱计算机视觉 ("I Love Computer Vision") · 360AI研究院 (360 AI Research Institute)


Install

```shell
# remember to modify Line 7 in deploy.sh
bash deploy.sh
```

Model Zoo

🤗LMM-Det-StageIV

🤗OWLv2-ViT

We also provide the official weights of OWLv2-ViT.

Customized Dataset

We have released the customized dataset used in Stage IV.

For the curation details, please refer to: [Custom Data]

Preparation

Step 1: Download the COCO dataset. You can put COCO into LMM-Det/data or make a soft link using ln -s.

Step 2: Modify the COCO path in Lines 4-5 of LMM-Det/scripts/eval/eval_coco_model_w_sft_data.sh

Step 3: Download the model and put it into LMM-Det/checkpoints
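The steps above can be sketched as a few shell commands. The path `/data/coco` is a placeholder for wherever your COCO download lives — substitute your own:

```shell
# Preparation sketch; /data/coco is an assumed placeholder path.
mkdir -p LMM-Det/data LMM-Det/checkpoints

# Step 1: soft-link an existing COCO download instead of copying it
ln -s /data/coco LMM-Det/data/coco

# Step 2: point the eval script at the dataset
# (edit Lines 4-5 of LMM-Det/scripts/eval/eval_coco_model_w_sft_data.sh)

# Step 3: place the downloaded LMM-Det-StageIV weights under
#         LMM-Det/checkpoints/
```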

Evaluation

```shell
bash evaluate.sh
```
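The evaluation script reports COCO-style detection metrics. As background on what such an evaluation does under the hood, here is a simplified, self-contained sketch of the core IoU-based matching step — plain Python for illustration only, not the repo's actual evaluation code (which presumably relies on the standard COCO tooling and is more involved):

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_detections(preds, gts, thr=0.5):
    """Greedily match predictions to ground truth; return true-positive count.

    preds: list of (score, box); gts: list of boxes. Each ground-truth box
    may be matched at most once, highest-confidence predictions first.
    """
    preds = sorted(preds, key=lambda p: -p[0])
    used, tp = set(), 0
    for score, box in preds:
        best, best_iou = None, thr
        for i, gt in enumerate(gts):
            if i not in used and iou(box, gt) >= best_iou:
                best, best_iou = i, iou(box, gt)
        if best is not None:
            used.add(best)
            tp += 1
    return tp
```

A duplicate detection of an already-matched object counts as a false positive here, which is exactly why COCO-style metrics penalize repeated boxes on the same instance.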

We Are Hiring

We are seeking academic interns in the Multimodal field. If interested, please send your resume to xiechunyu@360.cn.

BibTeX

@misc{li2025lmmdet,
      title={LMM-Det: Make Large Multimodal Models Excel in Object Detection}, 
      author={Jincheng Li and Chunyu Xie and Ji Ao and Dawei Leng and Yuhui Yin},
      year={2025},
      eprint={2507.18300},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2507.18300}, 
}

License

This project is licensed under the Apache License (Version 2.0).

Related Projects

This work wouldn't be possible without the incredible open-source code of these projects. Huge thanks!

LMM-Det is accepted at ICCV 2025.
