MiniROAD: Minimal RNN Framework for Online Action Detection
Introduction
This is a PyTorch implementation of our ICCV 2023 paper "MiniROAD: Minimal RNN Framework for Online Action Detection".
Data Preparation
THUMOS14 and TVSeries
To prepare the features and targets yourself, please refer to LSTR. You can also directly download the pre-extracted features and targets from TeSTra.
FineAction
Download the officially available pre-extracted features from FineAction. As mentioned in the paper, we linearly interpolate these features along the temporal dimension by a factor of four, because the official features are too condensed (one feature per 16 frames).
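The interpolation step can be sketched as follows. This is a minimal illustration, not the preprocessing script used for the paper: the function name `upsample_features` is hypothetical, and the sampling grid (output positions spread evenly between the first and last original feature) is an assumption.

```python
import numpy as np


def upsample_features(features: np.ndarray, factor: int = 4) -> np.ndarray:
    """Linearly interpolate an (L, D) feature array to (L * factor, D) along time.

    Assumption: output positions are spaced evenly from the first to the last
    original feature; the actual preprocessing may align samples differently.
    """
    length = features.shape[0]
    # Fractional positions on the original time axis for each output step.
    dst = np.linspace(0.0, length - 1, num=length * factor)
    lo = np.floor(dst).astype(np.int64)      # left neighbour index
    hi = np.minimum(lo + 1, length - 1)      # right neighbour index, clamped at the end
    w = (dst - lo)[:, None]                  # weight of the right neighbour
    out = (1.0 - w) * features[lo] + w * features[hi]
    return out.astype(features.dtype)


if __name__ == "__main__":
    dummy = np.random.rand(10, 2048).astype(np.float32)  # stand-in for one video
    print(upsample_features(dummy).shape)  # (40, 2048)
```

If the features are interpolated this way, the per-frame targets presumably need to be brought to the same length so that features and targets stay aligned.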
Data Structure
- If you want to use our dataloaders, please arrange the files in the following structure:
  - THUMOS'14 dataset:

        $YOUR_PATH_TO_THUMOS_DATASET
        ├── rgb_FEATURETYPE/
        │   ├── video_validation_0000051.npy
        │   ├── ...
        ├── flow_FEATURETYPE/
        │   ├── video_validation_0000051.npy
        │   ├── ...
        ├── target_perframe/
        │   ├── video_validation_0000051.npy (of size L x 22)
        │   ├── ...
  - TVSeries dataset:

        $YOUR_PATH_TO_TVSERIES_DATASET
        ├── rgb_FEATURETYPE/
        │   ├── Breaking_Bad_ep1.npy
        │   ├── ...
        ├── flow_FEATURETYPE/
        │   ├── Breaking_Bad_ep1.npy
        │   ├── ...
        ├── target_perframe/
        │   ├── Breaking_Bad_ep1.npy (of size L x 31)
        │   ├── ...
  - FineAction dataset:

        $YOUR_PATH_TO_FINEACTION_DATASET
        ├── rgb_kinetics_i3d/
        │   ├── v_00008645.npy (of size L x 2048)
        │   ├── ...
        ├── flow_kinetics_i3d/
        │   ├── v_00008645.npy (of size L x 2048)
        │   ├── ...
        ├── target_perframe/
        │   ├── v_00008645.npy (of size L x 107)
        │   ├── ...

  For the appropriate FEATURETYPE, please refer to datasets/dataset.py.
- Create softlinks of datasets:

      cd MiniROAD
      ln -s $YOUR_PATH_TO_THUMOS_DATASET data/THUMOS
      ln -s $YOUR_PATH_TO_TVSERIES_DATASET data/TVSERIES
      ln -s $YOUR_PATH_TO_FINEACTION_DATASET data/FINEACTION
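Once the files are linked, the layout described above can be sanity-checked before training. The sketch below is not part of the repository: `check_video` is a hypothetical helper, and it assumes that the RGB features, flow features, and per-frame targets of a video all share the same length L.

```python
from pathlib import Path

import numpy as np


def check_video(root, video_id, feature_type, num_classes):
    """Load one video's arrays from the layout above and check their shapes.

    Hypothetical helper. Assumes rgb, flow, and target arrays have equal length L.
    """
    root = Path(root)
    rgb = np.load(root / f"rgb_{feature_type}" / f"{video_id}.npy")
    flow = np.load(root / f"flow_{feature_type}" / f"{video_id}.npy")
    target = np.load(root / "target_perframe" / f"{video_id}.npy")

    # Targets are documented as L x num_classes (22, 31, or 107 depending on the dataset).
    if target.ndim != 2 or target.shape[1] != num_classes:
        raise ValueError(f"{video_id}: unexpected target shape {target.shape}")
    # Assumption: one target row per feature row.
    if not (len(rgb) == len(flow) == len(target)):
        raise ValueError(
            f"{video_id}: length mismatch rgb={len(rgb)} flow={len(flow)} target={len(target)}"
        )
    return rgb.shape, flow.shape, target.shape
```

For example, `check_video("data/FINEACTION", "v_00008645", "kinetics_i3d", 107)` would check the FineAction file shown in the tree above.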
Training
```
cd MiniROAD
python main.py --config $PATH_TO_CONFIG_FILE
```
Inference from checkpoint
```
cd MiniROAD
python main.py --config $PATH_TO_CONFIG_FILE --eval $PATH_TO_CHECKPOINT
```
Main Results and checkpoints
THUMOS14
method | feature | mAP (%) | config | checkpoint |
---|---|---|---|---|
MiniROAD | kinetics | 71.8 | yaml | Download |
MiniROAD | nv_kinetics | 68.4 | yaml | Download |
FINEACTION
method | feature | mAP (%) | config | checkpoint |
---|---|---|---|---|
MiniROAD | kinetics | 37.1 | yaml | Download |
TVSERIES
method | feature | mcAP (%) | config | checkpoint |
---|---|---|---|---|
MiniROAD | kinetics | 89.6 | yaml | Download |
Citations
If you are using the data/code/model provided here in a publication, please cite our paper:
@inproceedings{miniroad,
title={MiniROAD: Minimal RNN Framework for Online Action Detection},
author={An, Joungbin and Kang, Hyolim and Han, Su Ho and Yang, Ming-Hsuan and Kim, Seon Joo},
booktitle={International Conference on Computer Vision (ICCV)},
year={2023}
}
License
This project is licensed under the Apache-2.0 License.
Acknowledgements
Much of the codebase is from LSTR.