Turning to Video for Transcript Sorting

This repo contains the official implementations of the following two papers:

  1. Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
  2. TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale

News

Introduction

Quickstart

Folder v1 contains the official code of TVTS. See v1-README for details.

Folder v2 contains the official code of TVTSv2, an upgraded version of TVTS that produces powerful video representations for out-of-the-box usage. See v2-README for details.

Citation

If you find our work helpful, please cite our papers.

@InProceedings{Zeng_2023_CVPR,
    author    = {Zeng, Ziyun and Ge, Yuying and Liu, Xihui and Chen, Bin and Luo, Ping and Xia, Shu-Tao and Ge, Yixiao},
    title     = {Learning Transferable Spatiotemporal Representations From Natural Script Knowledge},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {23079-23089}
}

@misc{zeng2023tvtsv2,
    title         = {TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale},
    author        = {Ziyun Zeng and Yixiao Ge and Zhan Tong and Xihui Liu and Shu-Tao Xia and Ying Shan},
    year          = {2023},
    eprint        = {2305.14173},
    archivePrefix = {arXiv},
    primaryClass  = {cs.CV}
}