Home

Awesome

StreamYOLO

Real-time Object Detection for Streaming Perception

<p align='left'> <img src='assets/train.png' width='721'/> </p>

Jinrong Yang, Songtao Liu, Zeming Li, Xiaoping Li, Sun Jian <br> Real-time Object Detection for Streaming Perception, CVPR 2022 (Oral)<br> [Paper] <br />

Benchmark

ModelsizevelocitysAP<br>0.5:0.95sAP50sAP75weightsCOCO pretrained weights
StreamYOLO-s600×9601x29.850.329.8githubgithub
StreamYOLO-m600×9601x33.754.534.0githubgithub
StreamYOLO-l600×9601x36.958.137.5githubgithub
StreamYOLO-l600×9602x34.656.334.7githubgithub
StreamYOLO-l600×960still39.460.040.2githubgithub

Quick Start

<details> <summary>Dataset preparation</summary>

You can download Argoverse-1.1 full dataset and annotation from HERE and unzip it.

The folder structure should be organized as follows before our processing.

StreamYOLO
├── exps
├── tools
├── yolox
├── data
│   ├── Argoverse-1.1
│   │   ├── annotations
│   │       ├── tracking
│   │           ├── train
│   │           ├── val
│   │           ├── test
│   ├── Argoverse-HD
│   │   ├── annotations
│   │       ├── test-meta.json
│   │       ├── train.json
│   │       ├── val.json

The hash strings represent different video sequences in Argoverse, and ring_front_center is one of the sensors for that sequence. Argoverse-HD annotations correspond to images from this sensor. Information from other sensors (other ring cameras or LiDAR) is not used, but our framework can be also extended to these modalities or to a multi-modality setting.

</details> <details> <summary>Installation</summary>
# basic python libraries
conda create --name streamyolo python=3.7

pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

pip3 install yolox==0.3
git clone git@github.com:yancie-yjr/StreamYOLO.git

cd StreamYOLO/

# add StreamYOLO to PYTHONPATH and add this line to ~/.bashrc or ~/.zshrc (change the file accordingly)
ADDPATH=$(pwd)
echo export PYTHONPATH=$PYTHONPATH:$ADDPATH >> ~/.bashrc
source ~/.bashrc

# Installing `mmcv` for the official sAP evaluation:
# Please replace `{cu_version}` and ``{torch_version}`` with the versions you are currently using.
# You will get import or runtime errors if the versions are incorrect.
pip install mmcv-full==1.1.5 -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html

</details> <details> <summary>Reproduce our results on Argoverse-HD</summary>

Step1. Prepare COCO dataset

cd <StreamYOLO_HOME>
ln -s /path/to/your/Argoverse-1.1 ./data/Argoverse-1.1
ln -s /path/to/your/Argoverse-HD ./data/Argoverse-HD

Step2. Reproduce our results on Argoverse:

python tools/train.py -f cfgs/m_s50_onex_dfp_tal_flip.py -d 8 -b 32 -c [/path/to/your/coco_pretrained_path] -o --fp16
</details> <details> <summary>Offline Evaluation</summary>

We support batch testing for fast evaluation:

python tools/eval.py -f  cfgs/l_s50_onex_dfp_tal_flip.py -c [/path/to/your/model_path] -b 64 -d 8 --conf 0.01 [--fp16] [--fuse]
</details> <details> <summary>Online Evaluation</summary>

We modify the online evaluation from sAP

Please use 1 V100 GPU to test the performance since other GPUs with low computing power will trigger non-real-time results!!!!!!!!

cd sAP/streamyolo
bash streamyolo.sh
</details>

Citation

Please cite the following paper if this repo helps your research:

@inproceedings{streamyolo,
  title={Real-time Object Detection for Streaming Perception},
  author={Yang, Jinrong and Liu, Songtao and Li, Zeming and Li, Xiaoping and Sun, Jian},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5385--5395},
  year={2022}
}
@article{yang2022streamyolo,
  title={StreamYOLO: Real-time Object Detection for Streaming Perception},
  author={Yang, Jinrong and Liu, Songtao and Li, Zeming and Li, Xiaoping and Sun, Jian},
  journal={arXiv preprint arXiv:2207.10433},
  year={2022}
}

License

This repo is released under the Apache 2.0 license. Please see the LICENSE file for more information.