Retrieval-Augmented Open-Vocabulary Object Detection
This is the official implementation of the CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection".
Jooyeon Kim*, Eulrang Cho*, Sehyung Kim, Hyunwoo J. Kim.
Department of Computer Science and Engineering, Korea University
Introduction
The RALF code is organized into multiple branches.
- `prerequisite` branch: The code for prerequisites necessary for running RALF.
- `RAF` branch: The code for training RAF.
The other branches integrate RALF with existing OVD models.
- `OADP` branch: The baseline is OADP.
- `Centric` branch: The baseline is Object-Centric-OVD.
- `DetPro` branch: The baseline is DetPro.
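
Because the branch name does not always match the baseline name (Object-Centric-OVD lives on the `Centric` branch), a small lookup can help when scripting against a local clone. The following is a minimal sketch, not part of the official code: the names `BRANCHES`, `BASELINE_TO_BRANCH`, and `checkout_command` are hypothetical, and it only builds a standard `git checkout` command string without running it.

```python
import shlex

# Branch names and their roles, as listed in this README.
BRANCHES = {
    "prerequisite": "prerequisites necessary for running RALF",
    "RAF": "training RAF",
    "OADP": "RALF integrated with the OADP baseline",
    "Centric": "RALF integrated with the Object-Centric-OVD baseline",
    "DetPro": "RALF integrated with the DetPro baseline",
}

# Baseline name -> branch name (hypothetical helper table for illustration).
BASELINE_TO_BRANCH = {
    "OADP": "OADP",
    "Object-Centric-OVD": "Centric",
    "DetPro": "DetPro",
}


def checkout_command(baseline: str) -> str:
    """Return the `git checkout` command for the branch of a given baseline."""
    try:
        branch = BASELINE_TO_BRANCH[baseline]
    except KeyError:
        raise ValueError(f"unknown baseline: {baseline!r}") from None
    # Quote the branch name so the string is safe to paste into a shell.
    return "git checkout " + shlex.quote(branch)


if __name__ == "__main__":
    for name in BASELINE_TO_BRANCH:
        print(f"{name}: {checkout_command(name)}")
```

The `prerequisite` and `RAF` branches are kept out of the baseline table on purpose, since they hold shared setup and RAF training code rather than a detector integration.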
Results
COCO
Model | $\text{AP}^\text{N}_\text{50}$ |
---|---|
RALF + OADP | 33.4 |
RALF + Object-Centric-OVD | 41.3 |
LVIS
Model | $\text{AP}_\text{r}$ |
---|---|
RALF + OADP | 21.9 |
RALF + DetPro | 21.1 |
RALF + Object-Centric-OVD | 18.5 |
Citation
@inproceedings{kim2024retrieval,
  title={Retrieval-Augmented Open-Vocabulary Object Detection},
  author={Kim, Jooyeon and Cho, Eulrang and Kim, Sehyung and Kim, Hyunwoo J},
  booktitle={CVPR},
  year={2024}
}
References
This code is built on CLIP, V3Det, GPT-3, OADP, Object-Centric-OVD and DetPro.