# Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images
Recovering the 3D structure of an object from a single image is a challenging task due to its ill-posed nature. One approach is to utilize the plentiful photos of the same object category to learn a strong 3D shape prior for the object. We propose a general framework without a symmetry constraint, called LeMul, that effectively Learns from Multi-image datasets for more flexible and reliable unsupervised training of 3D reconstruction networks. It employs loose shape and texture consistency losses based on component swapping across views.
<img src="./image/teaser.png" width="800">

Details of the model architecture and experimental results can be found in the following paper:
```
@inproceedings{ho2021lemul,
  title={Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images},
  author={Long-Nhat Ho and Anh Tran and Quynh Phung and Minh Hoai},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}
```
Please CITE our paper whenever our model implementation is used to help produce published results or incorporated into other software.
## Getting Started

### Datasets
- CelebA face dataset. Please download the original images (`img_celeba.7z`) from their website and run `celeba_crop.py` in `data/` to crop the images.
- Synthetic face dataset generated using the Basel Face Model. This can be downloaded using the script `download_synface.sh` provided in `data/`.
- Cat face dataset composed of the Cat Head Dataset and the Oxford-IIIT Pet Dataset (license). This can be downloaded using the script `download_cat.sh` provided in `data/`.
- CASIA WebFace dataset. You can download the original dataset from backup links such as the Google Drive link on this page. Decompress it, then run `casia_data_split.py` in `data/` to re-organize the images.
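As a rough sketch, the preparation steps above boil down to running the provided scripts from `data/`. Whether the crop and split scripts need extra arguments (e.g. the path to the extracted images) is not shown in this README, so check the scripts themselves; the archive download locations below are assumptions.

```bash
# run from the repo root; paths to the downloaded archives are assumptions
cd data

# CelebA: after extracting img_celeba.7z, crop the images with the provided script
python celeba_crop.py

# synthetic face and cat face datasets: download with the provided scripts
bash download_synface.sh
bash download_cat.sh

# CASIA WebFace: after decompressing the downloaded archive, re-organize the images
python casia_data_split.py

cd ..
```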
Please remember to cite the corresponding papers if you use these datasets.
### Installation

```bash
# clone the repo
git clone https://github.com/VinAIResearch/LeMul.git
cd LeMul

# install dependencies
conda env create -f environment.yml
```
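After creating the environment, remember to activate it. The environment name is defined in `environment.yml`; `lemul` below is an assumption.

```bash
# activate the conda environment defined in environment.yml
# (the name "lemul" is an assumption; check the "name:" field of environment.yml)
conda activate lemul
```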
## Experiments

### Training and Testing
Check the configuration files in `experiments/` and run experiments, e.g.:
```bash
# Training
python run.py --config experiments/train_multi_CASIA.yml --gpu 0 --num_workers 4

# Testing
python run.py --config experiments/test_multi_CASIA.yml --gpu 0 --num_workers 4
```
### Texture fine-tuning

With collection-style datasets such as CASIA, you can fine-tune the texture estimation network after training. Check the configuration file `experiments/finetune_CASIA.yml` as an example. You can run it with the command:
```bash
python run.py --config experiments/finetune_CASIA.yml --gpu 0 --num_workers 4
```
### Pretrained Models

Pretrained models can be found here: Google Drive. Please download them and place them in the `./pretrained` folder.
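For example, assuming the CASIA checkpoint used in the demo below was downloaded to `~/Downloads`, placing it would look like:

```bash
# create the folder expected by the demo commands and move the checkpoint into it
# (~/Downloads is an assumption; use wherever you saved the file)
mkdir -p pretrained
mv ~/Downloads/casia_checkpoint028.pth pretrained/
```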
### Demo

After downloading the pretrained models and preparing an input image folder, you can run the demo, e.g.:
```bash
python demo/demo.py --input demo/human_face_cropped --result demo/human_face_results --checkpoint pretrained/casia_checkpoint028.pth
```
Options:
- `--config path-to-training-config-file.yml`: input the config file used in training (recommended)
- `--detect_human_face`: enable automatic human face detection and cropping using MTCNN. You need to install facenet-pytorch before using this option. This only works on human face images.
- `--gpu`: enable GPU
- `--render_video`: render 3D animations using neural_renderer (a GPU is required)
To replicate the results reported in the paper with the model pretrained on the CASIA dataset, use the `--detect_human_face` option with images in the folder `demo/images/human_face`, and skip that flag with images in `demo/images/human_face_cropped`.
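Putting the options together, a full run on the uncropped demo images might look like the sketch below; the exact flag combination is illustrative and assumes facenet-pytorch and neural_renderer are installed.

```bash
# run the demo on uncropped face images: auto-detect and crop faces, use the GPU,
# and render 3D animations (requires facenet-pytorch and neural_renderer)
python demo/demo.py \
    --input demo/images/human_face \
    --result demo/human_face_results \
    --checkpoint pretrained/casia_checkpoint028.pth \
    --config experiments/train_multi_CASIA.yml \
    --detect_human_face \
    --gpu \
    --render_video
```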