
Self-Supervised Viewpoint Learning from Image Collections

This repository contains the code for our work on Self-Supervised Viewpoint Learning from Image Collections (SSV), accepted at CVPR 2020. SSV provides a framework for learning object viewpoint estimation using only images of objects, without ground-truth viewpoint annotations.

ssv

Prerequisites

We used PyTorch 1.0 with CUDA 10 and cuDNN 7.4.1 on Ubuntu 16.04. All dependencies are listed in requirements.txt. A similar environment can be created using:
conda create --name ssv --file requirements.txt
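
After creating the environment, it can be useful to confirm which PyTorch, CUDA and cuDNN versions are actually active. The following is a minimal sketch, not part of this repository; the helper name `check_environment` is hypothetical, and it only uses standard PyTorch attributes.

```python
import importlib.util


def check_environment():
    """Collect the versions relevant to the setup described above."""
    info = {"torch": None, "cuda": None, "cudnn": None, "gpu_available": False}
    # Guard the import so the check also works before PyTorch is installed.
    if importlib.util.find_spec("torch") is None:
        return info
    import torch

    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    info["cudnn"] = torch.backends.cudnn.version()  # None if cuDNN is unavailable
    info["gpu_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

Versions that differ from the ones listed above may still work, but they have not been tested here.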

Please download MTCNN-Pytorch from here and install it in the 'data_preprocessing' folder. It is required for preprocessing the datasets.

Datasets

300W-LP

The 300W-LP dataset can be downloaded from here. To preprocess it, run:
python preprocess_data.py --src-dir <path_to_300wlp_dataset> --dst-dir <path_to_processed_300wlp> --dataset 300WLP

Create an lmdb of the preprocessed 300W-LP data by running:
python prepare_lmdb.py <path_to_processed_300wlp> --out <path_to_300wlp_lmdb>
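
The two 300W-LP steps can also be chained from a small driver script. This is a sketch under assumptions: the function names `build_300wlp_commands` and `run_all` are hypothetical, the scripts are assumed to be in the current working directory, and the flag is spelled `--dataset` to match the BIWI command (adjust it if the script expects a different name).

```python
import subprocess
import sys


def build_300wlp_commands(raw_dir, processed_dir, lmdb_dir):
    """Return the preprocessing and LMDB commands as argument lists."""
    preprocess = [
        sys.executable, "preprocess_data.py",
        "--src-dir", raw_dir,
        "--dst-dir", processed_dir,
        "--dataset", "300WLP",  # assumed flag spelling, see note above
    ]
    make_lmdb = [sys.executable, "prepare_lmdb.py", processed_dir, "--out", lmdb_dir]
    return [preprocess, make_lmdb]


def run_all(commands):
    """Run the commands in order and stop at the first failure."""
    for cmd in commands:
        # check=True raises CalledProcessError on a non-zero exit code
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Expects three paths: raw dataset, processed output, LMDB output.
    run_all(build_300wlp_commands(*sys.argv[1:4]))
```

Passing arguments as a list avoids shell quoting problems when dataset paths contain spaces.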

BIWI Head Pose

The BIWI head pose estimation dataset can be obtained by writing to the authors of 'Fanelli, G. and Dantone, M. and Gall, J. and Fossati, A. and van Gool, L., Random Forests for Real Time 3D Face Analysis, International Journal of Computer Vision, 2013.'
To preprocess it, run:
python preprocess_data.py --src-dir <path_to_biwi_dataset> --dst-dir <path_to_processed_biwi> --dataset BIWI

Test

To obtain the pretrained model, please send an email here. Then run the following on the preprocessed BIWI data:
python test_vpnet.py --data_dir <path_to_processed_biwi> --model_path <path_to_pretrained_model>

Demo

Run the following to visualize head pose predictions on some samples of the BIWI dataset:
python ssv_demo.py
The plots are saved in 'demo_images/plots'.

Synthesized samples

The following command produces some sample synthesized images, which are saved in 'synth_images'.
python test_synthesis.py
The resulting GIF should look similar to the one shown below.
synthesis examples

Training

To train SSV from scratch, run the following:
python3 train.py --exp_name SSV --data_path <path_to_300wlp_lmdb> --num_workers 4 --exp_root <path_to_experiments_dir> --save_interval 5000 --sample_interval 500 --batch_size 2 --lr 0.0005 --code_size 64 --z_recn_weight 0.8 --vp_recn_weight 0.8 --img_recn_weight 0.4 --flip_cons_weight 0.4 --flipc_recn_weight_G 0.5 --az_range 1.4 --el_range 1.2 --ct_range 0.75
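
When running several experiments, the long command above can be easier to manage if it is assembled programmatically. The sketch below is not part of this repository: `DEFAULT_ARGS` and `build_train_command` are hypothetical names, and the values simply mirror the flags in the command above.

```python
import shlex

# Values copied from the training command in this README.
DEFAULT_ARGS = {
    "exp_name": "SSV",
    "num_workers": 4,
    "save_interval": 5000,
    "sample_interval": 500,
    "batch_size": 2,
    "lr": 0.0005,
    "code_size": 64,
    "z_recn_weight": 0.8,
    "vp_recn_weight": 0.8,
    "img_recn_weight": 0.4,
    "flip_cons_weight": 0.4,
    "flipc_recn_weight_G": 0.5,
    "az_range": 1.4,
    "el_range": 1.2,
    "ct_range": 0.75,
}


def build_train_command(data_path, exp_root, **overrides):
    """Build the train.py argument list, optionally overriding defaults."""
    unknown = set(overrides) - set(DEFAULT_ARGS)
    if unknown:
        # Reject names that do not appear in the README command.
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    args = {**DEFAULT_ARGS, "data_path": data_path, "exp_root": exp_root, **overrides}
    cmd = ["python3", "train.py"]
    for name, value in args.items():
        cmd += [f"--{name}", str(value)]
    return cmd


if __name__ == "__main__":
    # Print a shell-ready command with a different experiment name.
    print(shlex.join(build_train_command("300wlp_lmdb", "experiments", exp_name="SSV_run2")))
```

The resulting list can be passed to `subprocess.run`, or printed with `shlex.join` (Python 3.8+) and pasted into a shell.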

Citation

Please cite our paper if you find this code useful for your research.

@inproceedings{mustikovelaCVPR20,
	title = {Self-Supervised Viewpoint Learning From Image Collections},
	author = {Mustikovela, Siva Karthik and Jampani, Varun and De Mello, Shalini and Liu, Sifei and Iqbal, Umar and Rother, Carsten and Kautz, Jan},
	booktitle = {IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
	month = jun,
	year = {2020}
}