TDN: Temporal Difference Networks for Efficient Action Recognition (CVPR 2021)

News

[Mar 24, 2022] We present VideoMAE, a new SOTA on Kinetics, Something-Something, and AVA.
[Dec 1, 2021] We updated TDN-ResNet101 on SSV2 in the model zoo.
[Mar 5, 2021] TDN has been accepted by CVPR 2021.
[Dec 26, 2020] We have released the PyTorch code of TDN.

Overview

We release the PyTorch code of TDN (Temporal Difference Networks). This code is based on the TSN and TSM codebases. The core implementation of the Temporal Difference Module is in ops/base_module.py and ops/tdn_net.py.

TL;DR. We generalize the idea of RGB difference to devise an efficient temporal difference module (TDM) for motion modeling in videos, and offer an alternative to 3D convolutions through a principled, detailed module design.
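To make the starting point concrete, here is a minimal sketch of the plain RGB difference that TDN generalizes: subtracting each frame from the one that follows it. It is not the TDM in ops/base_module.py. The function name `rgb_differences` and the flattened-list frame representation are assumptions chosen to keep the example dependency-free.

```python
from typing import List

# Assumption: each frame is flattened to a list of pixel intensities.
Frame = List[float]


def rgb_differences(frames: List[Frame]) -> List[Frame]:
    """Return frames[t + 1] - frames[t] for every pair of consecutive frames."""
    diffs: List[Frame] = []
    for prev, nxt in zip(frames, frames[1:]):
        if len(prev) != len(nxt):
            raise ValueError("all frames must have the same number of pixels")
        diffs.append([b - a for a, b in zip(prev, nxt)])
    return diffs


# A toy 3-frame clip with 4 pixels per frame: a bright pixel moves left, then stays.
clip = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
motion = rgb_differences(clip)
# The first difference is non-zero where the pixel moved; the second is all zeros.
```

The difference is non-zero only where the content changes between frames, which is why it serves as a cheap motion cue.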

Prerequisites

The code is built with the following libraries:

Data Preparation

We have successfully trained TDN on Kinetics400, UCF101, HMDB51, Something-Something-V1 and V2 with this codebase.

Model Zoo

Here we provide some off-the-shelf pretrained models. The accuracy may differ slightly from the paper, since the raw Kinetics videos downloaded by different users may not be identical.

Something-Something-V1

| Model | Frames x Crops x Clips | Top-1 | Top-5 | checkpoint |
| --- | --- | --- | --- | --- |
| TDN-ResNet50 | 8x1x1 | 52.3% | 80.6% | link |
| TDN-ResNet50 | 16x1x1 | 53.9% | 82.1% | link |

Something-Something-V2

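| Model | Frames x Crops x Clips | Top-1 | Top-5 | checkpoint |
| --- | --- | --- | --- | --- |
| TDN-ResNet50 | 8x1x1 | 64.0% | 88.8% | link |
| TDN-ResNet50 | 16x1x1 | 65.3% | 89.7% | link |
| TDN-ResNet101 | 8x1x1 | 65.8% | 90.2% | link |
| TDN-ResNet101 | 8x3x1 | 67.1% | 90.5% | - |
| TDN-ResNet101 | 16x1x1 | 66.9% | 90.9% | link |
| TDN-ResNet101 | 16x3x1 | 68.2% | 91.6% | - |
| TDN-ResNet101 | (8+16)x1x1 | 68.2% | 91.6% | - |
| TDN-ResNet101 | (8+16)x3x1 | 69.6% | 92.2% | - |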

Kinetics400

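| Model | Frames x Crops x Clips | Top-1 (30 view) | Top-5 (30 view) | checkpoint |
| --- | --- | --- | --- | --- |
| TDN-ResNet50 | 8x3x10 | 76.6% | 92.8% | link |
| TDN-ResNet50 | 16x3x10 | 77.5% | 93.2% | link |
| TDN-ResNet101 | 8x3x10 | 77.5% | 93.6% | link |
| TDN-ResNet101 | 16x3x10 | 78.5% | 93.9% | link |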
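The "Frames x Crops x Clips" column can be read as follows: the number of test views is crops times clips, so 8x3x10 gives the 30 views named in the Kinetics400 header. The sketch below parses that notation and averages per-view class scores. Averaging across views is common practice and is assumed here, not confirmed as this repository's exact evaluation procedure. The helper names (`parse_protocol`, `num_views`, `average_scores`) are hypothetical.

```python
from typing import List, Tuple


def parse_protocol(spec: str) -> Tuple[int, int, int]:
    """Split a plain 'FxCxK' string into (frames, crops, clips).

    Ensemble entries such as '(8+16)x3x1' are not handled by this sketch.
    """
    frames, crops, clips = (int(part) for part in spec.split("x"))
    return frames, crops, clips


def num_views(spec: str) -> int:
    """Number of test views per video: crops * clips."""
    _, crops, clips = parse_protocol(spec)
    return crops * clips


def average_scores(view_scores: List[List[float]]) -> List[float]:
    """Average class scores over views (assumed fusion rule)."""
    n = len(view_scores)
    return [sum(column) / n for column in zip(*view_scores)]


views = num_views("8x3x10")  # 3 crops * 10 clips
fused = average_scores([
    [0.5, 0.25, 0.25],   # scores from view 1
    [0.25, 0.5, 0.25],   # scores from view 2
])
```

Under this reading, the 8x1x1 rows in the Something-Something tables use a single view per video.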

Testing

Training

This implementation supports multi-GPU training with DistributedDataParallel, which is faster and simpler.

Contact

tongzhan@smail.nju.edu.cn

Acknowledgements

We especially thank the contributors of the TSN and TSM codebases for providing helpful code.

License

This repository is released under the Apache-2.0 license as found in the LICENSE file.

Citation

If you find our work useful, please feel free to cite our paper 😆:

@InProceedings{Wang_2021_CVPR,
    author    = {Wang, Limin and Tong, Zhan and Ji, Bin and Wu, Gangshan},
    title     = {TDN: Temporal Difference Networks for Efficient Action Recognition},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {1895-1904}
}