Awesome

Video Question Answering Using Language-Guided Deep Conpressed-Domain Video Feature

(VQAC)

This is the PyTorch Implementation of

Nayoung Kim, Seong-Jong Ha, and Je-Won Kang. Video Question Answering Using Language-Guided Deep Conpressed-Domain Video Feature. In ICCV, 2021. (to appear)

Download preprocessing data

In this experiment, we use MSVD-QA dataset. Please refer to their website for the detailed statistics of this dataset.

We already upload compressed-domain video features. You don't need to download orinial videos.

cd Model

Preprocessing

If you want to generate features, follow the below step. (Will be)

Video encoding To extract motion vector and residue by HM 16.04, you need to follow this process:

resize the video resolution: 224x224

Feature warping