
Official code implementation of the paper: Video and Text Matching with Conditioned Embeddings (https://arxiv.org/abs/2110.11298)

Datasets:

We employ the following datasets in our work:

ActivityNet Captions: the pre-extracted features can be downloaded by clicking here.
DiDeMo: the pre-extracted features can be downloaded by clicking here.
VATEX: click here.
MSR-VTT: can be downloaded by clicking here.
YouCook2: the pre-extracted features can be downloaded here.
LSMDC: click here.

Example training command on ActivityNet: python train.py anet_precomp --feat_name i3d --img_dim 2048 --norm
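
If you prefer to launch the training command from a script (for example, to loop over feature types), the following is a minimal sketch. The helper names build_train_command and launch are hypothetical and not part of this repository; the sketch only reproduces the arguments shown in the example command above and assumes train.py is in the current working directory and that --norm is a switch that takes no value.

```python
import shlex
import subprocess


def build_train_command(dataset="anet_precomp", feat_name="i3d",
                        img_dim=2048, norm=True, python_exe="python"):
    """Assemble the argument list for the example training command.

    Only the arguments that appear in the README command are used here.
    """
    cmd = [python_exe, "train.py", dataset,
           "--feat_name", feat_name,
           "--img_dim", str(img_dim)]
    if norm:
        # Assumption: --norm is a boolean switch, as in the README command.
        cmd.append("--norm")
    return cmd


def launch(cmd, dry_run=True):
    """Print the command and, unless dry_run is set, execute it."""
    print("Command:", shlex.join(cmd))
    if not dry_run:
        # check=True raises CalledProcessError if training exits with an error.
        subprocess.run(cmd, check=True)
```

Calling launch(build_train_command()) prints the command without running it; pass dry_run=False to start training.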