VMZ: Model Zoo for Video Modeling

VMZ is a Caffe2 and PyTorch codebase for video modeling developed by the Computer Vision team at Facebook AI. The aim of this codebase is to help other researchers and industry practitioners:

Currently, this codebase supports the following models:

References

  1. D. Tran, H. Wang, L. Torresani, J. Ray, Y. LeCun and M. Paluri. A Closer Look at Spatiotemporal Convolutions for Action Recognition. CVPR 2018.
  2. D. Tran, H. Wang, L. Torresani and M. Feiszli. Video Classification with Channel-Separated Convolutional Networks. ICCV 2019.
  3. D. Ghadiyaram, M. Feiszli, D. Tran, X. Yan, H. Wang and D. Mahajan. Large-scale Weakly-supervised Pre-training for Video Action Recognition. CVPR 2019.
  4. W. Wang, D. Tran and M. Feiszli. What Makes Training Multi-Modal Classification Networks Hard? CVPR 2020.

Supporting Team

This codebase is actively supported by the Facebook AI computer vision team: @CHJoanna, @weiyaowang, @hengcv, @deeptigp, @dutran, and by community researcher @bjuncek (Quansight, Oxford VGG).