Home

Awesome

DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval

This repository contains the PyTorch implementation of the paper DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval. It provides code for the knowledge distillation training of coarse- and fine-grained student networks based on similarities calculated from a teacher and the selector network. Also, the scripts for the training of the selector network are included. Finally, to facilitate the reproduction of the paper's results, the evaluation code, the extracted features for the employed video datasets, and pre-trained networks for the various students and selectors are available.

<img src="https://github.com/mever-team/distill-and-select/blob/main/dns_overview.png?raw=true" width="100%">

Prerequisites

Preparation

Installation

git clone https://github.com/mever-team/distill-and-select
cd distill-and-select
pip install -r requirements.txt

or

conda install --file requirements.txt

Feature files

Distillation

We provide the code for training and evaluation of our student models.

Student training

python train_student.py --student_type fine-grained --experiment_path experiments/DnS_students --trainset_hdf5 /path/to/dns_100k.hdf5

For fine-grained attention students:

python train_student.py --student_type fine-grained --binarization false --attention true --experiment_path /path/to/experiment/ --trainset_hdf5 /path/to/dns_100k.hdf5

For fine-grained binarization students:

python train_student.py --student_type fine-grained --binarization true --attention false --experiment_path /path/to/experiment/ --trainset_hdf5 /path/to/dns_100k.hdf5
python train_student.py --student_type coarse-grained --experiment_path /path/to/experiment/ --trainset_hdf5 /path/to/dns_100k.hdf5 --attention true --learning_rate 1e-5
python train_student.py --teacher fg_att_student_iter2 --experiment_path /path/to/experiment/ --trainset_hdf5 /path/to/dns_100k.hdf5
python train_student.py --student_type coarse-grained --val_hdf5 /path/to/fivr_5k.hdf5 --val_set ISVR --experiment_path /path/to/experiment/ --trainset_hdf5 /path/to/dns_100k.hdf5 --learning_rate 1e-5

Student Evaluation

python evaluation_student.py --student_path experiments/DnS_students/model_fg_att_student.pth --dataset FIVR-5K --dataset_hdf5 /path/to/fivr_200k.hdf5
python evaluation_student.py --student_type fine-grained --attention true --dataset FIVR-5K --dataset_hdf5 /path/to/fivr_200k.hdf5

Selection

We also provide the code for training of the selector network and the evaluation of our overall DnS framework.

Selector training

python train_selector.py --experiment_path experiments/DnS_students --trainset_hdf5 /path/to/dns_100k.hdf5

DnS Evaluation

python evaluation_dns.py --selector_network_path experiments/DnS_students/model_selector_network.pth --dataset FIVR-5K --dataset_hdf5 /path/to/fivr_200k.hdf5
python evaluation_dns.py --attention true --dataset FIVR-5K --dataset_hdf5 /path/to/fivr_200k.hdf5

Use our pretrained models

We also provide our pretrained models trained with the fg_att_student_iter2 teacher.

from model.feature_extractor import FeatureExtractor
from model.students import FineGrainedStudent, CoarseGrainedStudent
from model.selector import SelectorNetwork

# The feature extraction network used in out experiments
feature_extractor = FeatureExtractor(dims=512).eval()

# Our Fine-grained Students
fg_att_student = FineGrainedStudent(pretrained=True, attention=True).eval()
fg_bin_student = FineGrainedStudent(pretrained=True, binarization=True).eval()

# Our Coarse-grained Students
cg_student = CoarseGrainedStudent(pretrained=True).eval()

# Our Selector Networks
selector_att = SelectorNetwork(pretrained=True, attention=True).eval()
selector_bin = SelectorNetwork(pretrained=True, binarization=True).eval()
video_features = feature_extractor(video_tensor)
fg_features = fg_att_student.index_video(video_features)
cg_features = cg_student.index_video(video_features)
sn_features = selector_att.index_video(video_features)
fine_similarity = fg_att_student.calculate_video_similarity(query_fg_features, target_fg_features)
coarse_similarity = cg_student.calculate_video_similarity(query_cg_features, target_cg_features)
selector_features = torch.cat([query_sn_features, target_sn_features, coarse_similarity], 1)
selector_scores = selector_att(selector_features)

Citation

If you use this code for your research, please consider citing our papers:

@article{kordopatis2022dns,
  title={{DnS}: {Distill-and-Select} for Efficient and Accurate Video Indexing and Retrieval},
  author={Kordopatis-Zilos, Giorgos and Tzelepis, Christos and Papadopoulos, Symeon and Kompatsiaris, Ioannis and Patras, Ioannis},
  journal={International Journal of Computer Vision},
  year={2022}
}

@inproceedings{kordopatis2019visil,
  title={{ViSiL}: Fine-grained Spatio-Temporal Video Similarity Learning},
    author={Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Patras, Ioannis and Kompatsiaris, Ioannis},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2019}
}

Related Projects

ViSiL - here you can find our teacher model

FIVR-200K - download our FIVR-200K dataset

Acknowledgements

This work has been supported by the projects WeVerify and MediaVerse, partially funded by the European Commission under contract number 825297 and 957252, respectively, and DECSTER funded by EPSRC under contract number EP/R025290/1.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details

Contact for further details about the project

Giorgos Kordopatis-Zilos (georgekordopatis@iti.gr)