CoCon: Cooperative Contrastive Learning for Video Representation Learning

This repository contains the implementation of CoCon - Cooperative Contrastive Learning for video representation learning. We utilize multiple views of videos in order to learn better representations capturing semantics suitable for tasks related to video understanding. CoCon was presented at BayLearn 2020 and will be part of Holistic Video Understanding at CVPR '21.

Authors

Installation

Our implementation should work with Python >= 3.6, PyTorch >= 0.4, and torchvision >= 0.2.2. The repo also requires cv2 (conda install -c menpo opencv), tensorboardX >= 1.7 (pip install tensorboardX), and tqdm.

A requirements.txt has been provided which can be used to create the exact environment required.

pip install -r requirements.txt
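
After installing, a quick sanity check can confirm that the interpreter and packages listed above are visible. The snippet below is a minimal sketch and not part of the repository: the helper names (missing_modules, python_is_supported) are hypothetical, and it only checks that each module can be found, not that the installed versions satisfy the minimums stated above.

```python
import importlib.util
import sys

# Import names for the requirements listed above. Note that OpenCV is
# imported as "cv2" and PyTorch as "torch". This list is an assumption
# derived from the README text, not read from requirements.txt.
REQUIRED_MODULES = ["torch", "torchvision", "cv2", "tensorboardX", "tqdm"]
MIN_PYTHON = (3, 6)


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version=None, minimum=MIN_PYTHON):
    """Compare a (major, minor) interpreter version against the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version) >= tuple(minimum)


if __name__ == "__main__":
    if not python_is_supported():
        print("Python >= 3.6 is required, found", sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If any package is reported missing, rerunning pip install -r requirements.txt inside the active environment is usually the first thing to try.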

Prepare data

Follow the instructions here. Instructions to generate multi-view data for custom datasets will be added soon.

Cooperative Contrastive Learning (CoCon)

Training scripts are located in CoCon/train/ (cd CoCon/train/).

Run python model_trainer.py --help for details about the command-line arguments. The most useful ones are --dataset and --modalities, which select the dataset to run experiments on and the input modalities to use.

Our implementation has been tested with RGB, Optical Flow, Segmentation Masks, and Human Keypoints. However, it is easy to extend to custom views; see dataset_3d.py for details.

Evaluation: Video Action Recognition

Testing scripts are located in CoCon/test/ (cd CoCon/test/).

Results

Qualitative Evaluation

Scripts for qualitative evaluation will be added here.

Acknowledgements

Portions of code have been borrowed from DPC. Feel free to refer to their great work as well if you're interested in the field.

Citing

If our paper or the codebase was useful to you, please consider citing it using the entry below.

@InProceedings{Rai_2021_CVPR,
    author    = {Rai, Nishant and Adeli, Ehsan and Lee, Kuan-Hui and Gaidon, Adrien and Niebles, Juan Carlos},
    title     = {CoCon: Cooperative-Contrastive Learning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2021},
    pages     = {3384-3393}
}

Keywords