
Retrieval-Augmented Open-Vocabulary Object Detection

This is the official implementation of the CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection".

Jooyeon Kim*, Eulrang Cho*, Sehyung Kim, Hyunwoo J. Kim.

Department of Computer Science and Engineering, Korea University

ralf_figure

Introduction

The RALF repository is organized into multiple branches.

prerequisite branch: The code for the prerequisites needed to run RALF.
RAF branch: The code for training RAF.

The other branches integrate RALF with existing OVD models.

OADP branch: The baseline is OADP.
Centric branch: The baseline is Object-Centric-OVD.
DetPro branch: The baseline is DetPro.

Results

COCO

| Model | $\text{AP}^\text{N}_\text{50}$ |
| --- | --- |
| RALF + OADP | 33.4 |
| RALF + Object-Centric-OVD | 41.3 |

LVIS

| Model | $\text{AP}_\text{r}$ |
| --- | --- |
| RALF + OADP | 21.9 |
| RALF + DetPro | 21.1 |
| RALF + Object-Centric-OVD | 18.5 |

Citation

@inproceedings{kim2024retrieval,
  title={Retrieval-Augmented Open-Vocabulary Object Detection},
  author={Kim, Jooyeon and Cho, Eulrang and Kim, Sehyung and Kim, Hyunwoo J},
  booktitle={CVPR},
  year={2024}
}

References

This code is built on CLIP, V3Det, GPT-3, OADP, Object-Centric-OVD and DetPro.