RVT: Recurrent Vision Transformers for Object Detection with Event Cameras

<p align="center"> <img src="https://rpg.ifi.uzh.ch/img/papers/arxiv22_detection_mgehrig/combo.png" width="750"> </p>

This is the official PyTorch implementation of the CVPR 2023 paper Recurrent Vision Transformers for Object Detection with Event Cameras.

Watch the video for a quick overview. If you use this work in your research, please cite:

@InProceedings{Gehrig_2023_CVPR,
  author  = {Mathias Gehrig and Davide Scaramuzza},
  title   = {Recurrent Vision Transformers for Object Detection with Event Cameras},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year    = {2023},
}

Installation

Conda

We highly recommend using Mambaforge to reduce the installation time.

conda create -y -n rvt python=3.9 pip
conda activate rvt
conda config --set channel_priority flexible

CUDA_VERSION=11.8

conda install -y h5py=3.8.0 blosc-hdf5-plugin=1.0.0 \
hydra-core=1.3.2 einops=0.6.0 torchdata=0.6.0 tqdm numba \
pytorch=2.0.0 torchvision=0.15.0 pytorch-cuda=$CUDA_VERSION \
-c pytorch -c nvidia -c conda-forge

python -m pip install pytorch-lightning==1.8.6 wandb==0.14.0 \
pandas==1.5.3 plotly==5.13.1 opencv-python==4.6.0.66 tabulate==0.9.0 \
pycocotools==2.0.6 bbox-visualizer==0.1.0 StrEnum==0.4.10
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'

Detectron2 is not strictly required but speeds up the evaluation.
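
Regardless of the installation route, it can help to confirm that the installed PyTorch build actually sees your GPU before moving on. This is a generic PyTorch check, not specific to this repository:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"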

Venv

An alternative to the conda installation:

python -m venv rvt
source rvt/bin/activate
python -m pip install -r torch-req.txt --index-url https://download.pytorch.org/whl/cu118
python -m pip install -r requirements.txt

Optionally, install Detectron2 within the activated venv:

python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'

Required Data

To evaluate or train RVT, you will need to download the required pre-processed datasets:

<table><tbody> <th valign="bottom"></th> <th valign="bottom">1 Mpx</th> <th valign="bottom">Gen1</th> <tr><td align="left">pre-processed dataset</td> <td align="center"><a href="https://download.ifi.uzh.ch/rpg/RVT/datasets/preprocessed/gen4.tar">download</a></td> <td align="center"><a href="https://download.ifi.uzh.ch/rpg/RVT/datasets/preprocessed/gen1.tar">download</a></td> </tr> <tr><td align="left">crc32</td> <td align="center"><tt>c5ec7c38</tt></td> <td align="center"><tt>5acab6f3</tt></td> </tr> </tbody></table>
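
To check the downloads against the crc32 values above, you can compute the checksum locally. The snippet below is a minimal sketch (the file names assume the tar archives from the links above sit in the current directory); it streams the files so the multi-GB archives are not loaded into memory at once:

python - <<'EOF'
import zlib

def crc32_of(path, chunk_size=1 << 20):
    # Accumulate the crc32 incrementally, one chunk at a time.
    crc = 0
    with open(path, 'rb') as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return format(crc, '08x')

for name in ('gen4.tar', 'gen1.tar'):  # adjust to whichever files you downloaded
    print(name, crc32_of(name))
EOF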

You may also pre-process the dataset yourself by following the instructions.

Pre-trained Checkpoints

1 Mpx

<table><tbody> <th valign="bottom"></th> <th valign="bottom">RVT-Base</th> <th valign="bottom">RVT-Small</th> <th valign="bottom">RVT-Tiny</th> <tr><td align="left">pre-trained checkpoint</td> <td align="center"><a href="https://download.ifi.uzh.ch/rpg/RVT/checkpoints/1mpx/rvt-b.ckpt">download</a></td> <td align="center"><a href="https://download.ifi.uzh.ch/rpg/RVT/checkpoints/1mpx/rvt-s.ckpt">download</a></td> <td align="center"><a href="https://download.ifi.uzh.ch/rpg/RVT/checkpoints/1mpx/rvt-t.ckpt">download</a></td> </tr> <tr><td align="left">md5</td> <td align="center"><tt>72923a</tt></td> <td align="center"><tt>a94207</tt></td> <td align="center"><tt>5a3c78</tt></td> </tr> </tbody></table>

Gen1

<table><tbody> <th valign="bottom"></th> <th valign="bottom">RVT-Base</th> <th valign="bottom">RVT-Small</th> <th valign="bottom">RVT-Tiny</th> <tr><td align="left">pre-trained checkpoint</td> <td align="center"><a href="https://download.ifi.uzh.ch/rpg/RVT/checkpoints/gen1/rvt-b.ckpt">download</a></td> <td align="center"><a href="https://download.ifi.uzh.ch/rpg/RVT/checkpoints/gen1/rvt-s.ckpt">download</a></td> <td align="center"><a href="https://download.ifi.uzh.ch/rpg/RVT/checkpoints/gen1/rvt-t.ckpt">download</a></td> </tr> <tr><td align="left">md5</td> <td align="center"><tt>839317</tt></td> <td align="center"><tt>840f2b</tt></td> <td align="center"><tt>a770b9</tt></td> </tr> </tbody></table>
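
The md5 values in the tables above look like shortened digests; a plausible check is to compare them against the first six hex characters of the full md5sum output (the file names here match the download links):

md5sum rvt-b.ckpt rvt-s.ckpt rvt-t.ckpt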

Evaluation

1 Mpx

python validation.py dataset=gen4 dataset.path=${DATA_DIR} checkpoint=${CKPT_PATH} \
use_test_set=${USE_TEST} hardware.gpus=${GPU_ID} +experiment/gen4="${MDL_CFG}.yaml" \
batch_size.eval=8 model.postprocess.confidence_threshold=0.001
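
The uppercase placeholders are meant to be filled in by you. A possible assignment, assuming the base model evaluated on the test set (MDL_CFG must name an experiment YAML shipped with the repository, e.g. base, small, or tiny):

DATA_DIR=/path/to/gen4        # root of the pre-processed 1 Mpx dataset
CKPT_PATH=/path/to/rvt-b.ckpt # downloaded or self-trained checkpoint
USE_TEST=1                    # 1: test set, 0: validation set (assumed semantics)
GPU_ID=0                      # GPU to run the evaluation on
MDL_CFG=base                  # experiment config name without the .yaml suffix

The Gen1 command below takes the same placeholders.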

Gen1

python validation.py dataset=gen1 dataset.path=${DATA_DIR} checkpoint=${CKPT_PATH} \
use_test_set=${USE_TEST} hardware.gpus=${GPU_ID} +experiment/gen1="${MDL_CFG}.yaml" \
batch_size.eval=8 model.postprocess.confidence_threshold=0.001

Training

1 Mpx

GPU_IDS=[0,1]
BATCH_SIZE_PER_GPU=12
TRAIN_WORKERS_PER_GPU=6
EVAL_WORKERS_PER_GPU=2
python train.py model=rnndet dataset=gen4 dataset.path=${DATA_DIR} wandb.project_name=RVT \
wandb.group_name=1mpx +experiment/gen4="${MDL_CFG}.yaml" hardware.gpus=${GPU_IDS} \
batch_size.train=${BATCH_SIZE_PER_GPU} batch_size.eval=${BATCH_SIZE_PER_GPU} \
hardware.num_workers.train=${TRAIN_WORKERS_PER_GPU} hardware.num_workers.eval=${EVAL_WORKERS_PER_GPU}

If you instead want to run the training on 4 GPUs, simply adapt GPU_IDS and BATCH_SIZE_PER_GPU accordingly so that the effective batch size (number of GPUs times per-GPU batch size) stays at 24:

GPU_IDS=[0,1,2,3]
BATCH_SIZE_PER_GPU=6

Gen1

GPU_IDS=0
BATCH_SIZE_PER_GPU=8
TRAIN_WORKERS_PER_GPU=6
EVAL_WORKERS_PER_GPU=2
python train.py model=rnndet dataset=gen1 dataset.path=${DATA_DIR} wandb.project_name=RVT \
wandb.group_name=gen1 +experiment/gen1="${MDL_CFG}.yaml" hardware.gpus=${GPU_IDS} \
batch_size.train=${BATCH_SIZE_PER_GPU} batch_size.eval=${BATCH_SIZE_PER_GPU} \
hardware.num_workers.train=${TRAIN_WORKERS_PER_GPU} hardware.num_workers.eval=${EVAL_WORKERS_PER_GPU}
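
Once training has finished, the resulting checkpoint can be plugged back into the evaluation command from above. The checkpoint path below is hypothetical; where checkpoints land depends on how the PyTorch Lightning / W&B logger is configured in your run:

CKPT_PATH=/path/to/saved/checkpoint.ckpt  # hypothetical; adjust to your run's output
python validation.py dataset=gen1 dataset.path=${DATA_DIR} checkpoint=${CKPT_PATH} \
use_test_set=1 hardware.gpus=0 +experiment/gen1="${MDL_CFG}.yaml" \
batch_size.eval=8 model.postprocess.confidence_threshold=0.001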

Works Built on This Project

Open a pull request if you would like to add your project here.

Code Acknowledgments

This project has used code from the following projects: