Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments

by Khoi D. Nguyen, Quoc-Huy Tran, Khoi Nguyen, Binh-Son Hua, and Rang Nguyen

We present a novel method for few-shot video classification that performs both appearance and temporal alignment. Given a pair of query and support videos, we conduct appearance alignment via frame-level feature matching to obtain an appearance similarity score between the videos, and we use temporal order-preserving priors to obtain a temporal similarity score between them. We then use these appearance and temporal similarity scores for prototype refinement in both the inductive and transductive settings. To the best of our knowledge, our work is the first to explore transductive few-shot video classification.
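
To make the two scores more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the formulation used in the paper: the appearance score is assumed to be the average best cosine match per query frame, and the temporal score is assumed to weight the frame similarity matrix by a Gaussian band around the diagonal. The function names and the `sigma` parameter are hypothetical.

```python
import numpy as np

def cosine_matrix(query, support):
    """Pairwise cosine similarity between frame features, shape (Tq, Ts)."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    s = support / np.linalg.norm(support, axis=1, keepdims=True)
    return q @ s.T

def appearance_score(sim):
    """Assumed matching rule: each query frame takes its best support frame,
    and the per-frame maxima are averaged. Frame order is ignored."""
    return float(sim.max(axis=1).mean())

def temporal_score(sim, sigma=0.1):
    """Assumed order-preserving prior: a Gaussian band around the diagonal
    (in normalized time), so matches at similar positions count more."""
    tq, ts = sim.shape
    i = (np.arange(tq) + 0.5) / tq  # normalized time of each query frame
    j = (np.arange(ts) + 0.5) / ts  # normalized time of each support frame
    prior = np.exp(-((i[:, None] - j[None, :]) ** 2) / (2 * sigma ** 2))
    prior /= prior.sum()
    return float((prior * sim).sum())

# Toy example: a video compared with itself and with its reversed copy.
rng = np.random.default_rng(0)
video = rng.normal(size=(8, 64))  # 8 frames, 64-dim features
aligned = cosine_matrix(video, video)
reversed_ = cosine_matrix(video, video[::-1])
```

In this toy example the appearance score is the same for the aligned and reversed pair, because per-frame matching ignores order, while the temporal score drops for the reversed copy. That difference is the motivation for combining the two scores.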

Details of our evaluation framework and benchmark results can be found in our paper:

@inproceedings{khoi2022ata,
    title={Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments},
    author={Khoi D. Nguyen and Quoc-Huy Tran and Khoi Nguyen and Binh-Son Hua and Rang Nguyen},
    booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
    year={2022}
}

Please cite our paper when this repository is used to help produce published results or is incorporated into other software.

Prerequisites

The code is built with the following libraries:

Data Preparation

Check this for details on downloading Something-Something V2.

For data preprocessing, we use vidtools, as in TAM, to extract video frames.

The processing of video data can be summarized as follows:

Training

To train on Something-Something V2 starting from ImageNet-pretrained models, users can run scripts/train_somethingv2_rgb_8f.sh, which contains:

# train on Something-Something V2
python -u main.py somethingv2 RGB --arch resnet50 \
--num_segments 8 --lr 0.001 --lr_steps 10 20 --epochs 25 \
--batch-size 32 --workers 2 --dropout 0.5 \
--root_log ./checkpoints/path --root_model ./checkpoints/path \
--wd 0.0005 --gpus 0 --episodes 600
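
The `--num_segments 8` flag suggests that each video is represented by 8 frames. A common scheme in segment-based video codebases splits the video into equal segments and takes one frame per segment. The sketch below illustrates that scheme; it is an assumption about how the flag is used, not code from this repository, and `sample_segment_indices` is a hypothetical helper.

```python
import numpy as np

def sample_segment_indices(num_frames, num_segments=8, train=False, rng=None):
    """Pick one frame index per equal-length segment (assumed scheme).

    In training mode a random position is drawn inside each segment;
    otherwise the segment center is used, which makes sampling deterministic.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    edges = np.linspace(0, num_frames, num_segments + 1)
    if train:
        if rng is None:
            rng = np.random.default_rng()
        positions = rng.uniform(edges[:-1], edges[1:])
    else:
        positions = (edges[:-1] + edges[1:]) / 2
    # Clamp so that very short videos never index past the last frame.
    return np.minimum(positions.astype(int), num_frames - 1)

center_indices = sample_segment_indices(80, num_segments=8)
short_indices = sample_segment_indices(4, num_segments=8)
```

With fewer frames than segments, as in the second call, some frames are repeated so that the output always has `num_segments` entries.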

Training Arguments

Testing

The pretrained models are available here.

To test the downloaded pretrained models on Something-Something V2, users can modify and run scripts/test_somethingv2_rgb_8f.sh. For example, to test the 5-way 1-shot inductive setting over 10,000 episodes:

# test on Something-Something V2
python -u main.py somethingv2 RGB --arch resnet50 --num_segments 8 --workers 2 \
--root_log ./checkpoints/path --root_model ./checkpoints/path \
--resume ./checkpoints/path/ckpt.best.pth.tar --evaluate --gpus 0 --way 5 --shot 1 --episodes 10000
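
For readers new to episodic evaluation, the sketch below shows what an N-way K-shot episode and an averaged accuracy typically look like. It is a generic illustration, not this repository's evaluation code: `sample_episode`, `mean_and_ci95`, and the `queries_per_class` parameter are hypothetical, and the 95% interval uses the usual normal approximation.

```python
import math
import random
from collections import defaultdict

def sample_episode(labels, way=5, shot=1, queries_per_class=1, rng=random):
    """Draw `way` classes, then `shot` support and `queries_per_class`
    query items per class. Returns two lists of (item_index, label)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    needed = shot + queries_per_class
    eligible = [c for c, items in by_class.items() if len(items) >= needed]
    support, query = [], []
    for c in rng.sample(eligible, way):
        picked = rng.sample(by_class[c], needed)
        support += [(i, c) for i in picked[:shot]]
        query += [(i, c) for i in picked[shot:]]
    return support, query

def mean_and_ci95(accuracies):
    """Mean accuracy over episodes with a 95% confidence half-width."""
    n = len(accuracies)
    mean = sum(accuracies) / n
    var = sum((a - mean) ** 2 for a in accuracies) / (n - 1)
    return mean, 1.96 * math.sqrt(var / n)

# Toy dataset: 10 classes with 4 videos each.
labels = [c for c in range(10) for _ in range(4)]
support, query = sample_episode(labels, way=5, shot=1,
                                queries_per_class=2, rng=random.Random(0))
```

Averaging over many episodes, such as the 10,000 used above, narrows the confidence interval because the half-width shrinks with the square root of the episode count.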

Few-shot Arguments

Acknowledgment

We thank the following repos for providing helpful components and functions used in our work.