SVHF-Net

SVHF-Net for Cross-modal binary matching

This directory contains code to import and evaluate the static SVHF-Net model trained on the VoxCeleb and VGGFace datasets as described in the paper:

A. Nagrani, S. Albanie, A. Zisserman, Seeing Voices and Hearing Faces: Cross-modal biometric matching, CVPR, 2018

Further details can be found here.

Prerequisites

To use the models, first install the MatConvNet framework. Instructions can be found here.

Installing

To install, follow these steps:

1. Install and compile MatConvNet by following the instructions here.
2. Set up paths:

setup_SVHFNet

You can then run the demo script provided to import and test the model.

test_SVHFNet
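
The steps above can be sketched as a single MATLAB session. This is a minimal sketch under stated assumptions, not a definitive setup: `MATCONVNET_ROOT` is a hypothetical path that you should point at your own MatConvNet checkout, and `setup_SVHFNet` and `test_SVHFNet` are assumed to be in the current folder. `vl_compilenn` and `vl_setupnn` are the compile and path-setup functions that ship with MatConvNet.

```matlab
% Hypothetical location of the MatConvNet checkout; change to match your machine.
MATCONVNET_ROOT = fullfile(getenv('HOME'), 'matconvnet');

% Make the MatConvNet MATLAB functions visible.
addpath(fullfile(MATCONVNET_ROOT, 'matlab'));

% Compile the MEX files (needed only once; CPU-only build with default options).
vl_compilenn;

% Add the remaining MatConvNet folders to the MATLAB path.
vl_setupnn;

% Scripts provided in this directory (assumed to be in the current folder).
setup_SVHFNet;   % set up paths for SVHF-Net
test_SVHFNet;    % import and test the model
```

If MatConvNet is already compiled, the `vl_compilenn` line can be skipped.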

Dataset

This model has been trained on static face images from the VoxCeleb and VGGFace datasets, and audio segments from the VoxCeleb dataset. The VoxCeleb dataset can be downloaded directly from here. Cropped face images can be downloaded from here.

Citation

If you use this code, please cite:

  @InProceedings{Nagrani18a,
    author    = "Nagrani, A. and Albanie, S. and Zisserman, A.",
    title     = "Seeing Voices and Hearing Faces: Cross-modal biometric matching",
    booktitle = "IEEE Conference on Computer Vision and Pattern Recognition",
    year      = "2018",
  }