GenRep
<img src='img/teaser.png' width=450>Generative Models as a Data Source for Multiview Representation Learning
Ali Jahanian, Xavier Puig, Yonglong Tian, Phillip Isola
Prerequisites
- Linux
- Python 3
- CPU or NVIDIA GPU + CUDA CuDNN
Table of Contents:<br>
- Setup<br>
- Visualizations - plotting image panels, videos, and distributions<br>
- Training - pipeline for training your encoder<br>
- Testing - pipeline for testing/transfer learning your encoder<br>
- Notebooks - some Jupyter notebooks, a good place to start when generating your own datasets<br>
- Colab Demo - a Colab notebook that demonstrates how the contrastive encoder training works<br>
Setup
- Clone this repo:
git clone https://github.com/ali-design/GenRep
- Install dependencies:
- We provide a Conda environment.yml file listing the dependencies. You can create a Conda environment with the dependencies using:
conda env create -f environment.yml
- Download resources:
- To generate datasets from implicit generative models (IGMs), you need to download those IGMs from their external Git repositories, e.g. BigBiGAN, BigGAN, and StyleGAN2. You can then use our generation scripts in the utils directory to generate your data. For StyleGAN2, for instance, you can see a demo in this notebook and this script.
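As a rough illustration of what a "Gaussian walk" dataset contains, the sketch below samples anchor latents and nearby neighbor latents. It is a minimal sketch and not the repository's generation script: the function name `gaussian_walk_views`, the latent dimension, and the step size `std` are all assumptions made for the example, and the generator that would decode each latent into an image is left out.

```python
import numpy as np

def gaussian_walk_views(num_anchors, dim=128, std=0.2, seed=0):
    """Sample anchor latents z ~ N(0, I) and neighbors z + std * eps.

    Hypothetical helper for illustration; dim and std are assumed values.
    """
    rng = np.random.default_rng(seed)
    anchors = rng.standard_normal((num_anchors, dim))
    # A small Gaussian step away from each anchor gives its positive view.
    neighbors = anchors + std * rng.standard_normal((num_anchors, dim))
    return anchors, neighbors

anchors, neighbors = gaussian_walk_views(num_anchors=4)
# Each pair (anchors[i], neighbors[i]) would be decoded by a pretrained
# generator into two images that form one positive pair.
print(anchors.shape, neighbors.shape)
```

A smaller `std` keeps the two views closer in latent space, so the decoded images would be more alike; a larger one makes the positive pair harder.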
Visualizations
Plotting contrasting images: <br> <img src='img/panels.png'>
- Run simclr_views_paper_figure.ipynb and supcon_views_paper_figure.ipynb to get the anchors and their contrastive pairs shown in the paper.
- To generate more images, run biggan_generate_samples_paper_figure.py.
Training encoders
- The current implementation covers these variants:
- Contrastive (SimCLR and SupCon)
- Inverters
- Classifiers
- Some examples of commands for training contrastive encoders:
# train SimCLR on an unconditional IGM dataset (e.g. a dataset generated by a Gaussian walk, called my_gauss, in a GAN model)
CUDA_VISIBLE_DEVICES=0,1 python main_unified.py --method SimCLR --cosine \
--dataset path_to_your_dataset --walk_method my_gauss \
--cache_folder your_ckpts_path >> log_train_simclr.txt &
# train SupCon on a conditional IGM dataset (e.g. a dataset generated by steering walks, called my_steer, in a GAN model)
CUDA_VISIBLE_DEVICES=0,1 python main_unified.py --method SupCon --cosine \
--dataset path_to_your_dataset --walk_method my_steer \
--cache_folder your_ckpts_path >> log_train_supcon.txt &
- To find out more about the training configurations, see the yml file of each pretrained model in models_pretrained.
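For readers unfamiliar with the SimCLR objective, the following is a minimal NumPy sketch of the NT-Xent contrastive loss over a batch of paired embeddings. It is meant only to show the idea and is not the code in main_unified.py; the function name `nt_xent_loss` and the temperature value are assumptions for the example.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent loss for two batches of embeddings, where z1[i] pairs with z2[i]."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize each row
    sim = z @ z.T / temperature                       # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                    # a sample is never its own negative
    # Row i's positive sits n rows away (first half <-> second half).
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Log-softmax of each row, evaluated at the positive's column.
    row_max = sim.max(axis=1, keepdims=True)
    logsumexp = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob = sim[np.arange(2 * n), pos] - logsumexp
    return -log_prob.mean()

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 16))
loss_aligned = nt_xent_loss(a, a + 0.01 * rng.standard_normal((8, 16)))
loss_random = nt_xent_loss(a, rng.standard_normal((8, 16)))
print(loss_aligned, loss_random)
```

The loss is low when each embedding is closest to its own pair and high when the pairs are unrelated, which is the signal the encoder is trained on. SupCon extends the same idea by treating all samples that share a class label as positives.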
Testing encoders
- You can currently test (i.e. transfer learn) your encoder on:
- ImageNet linear classification
- PASCAL classification
- PASCAL detection
- Here is a diagram illustrating the general pipeline:
ImageNet linear classification
Below is the command to train a linear classifier on top of the learned features:
# test your unconditional or conditional IGM trained model (i.e. the encoder you trained in the previous section) on ImageNet
CUDA_VISIBLE_DEVICES=0,1 python main_linear.py --learning_rate 0.3 \
--ckpt path_to_your_encoder --data_folder path_to_imagenet \
>> log_test_your_model_name.txt &
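Conceptually, linear evaluation keeps the encoder frozen and fits only a linear classifier on its output features. The sketch below shows that idea with softmax regression in NumPy on synthetic features; it does not reproduce main_linear.py, and the function name `train_linear_probe`, the toy data, and the hyperparameters are assumptions for illustration.

```python
import numpy as np

def train_linear_probe(features, labels, num_classes, lr=0.3, epochs=200):
    """Fit softmax regression on fixed features with full-batch gradient descent."""
    n, d = features.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

# Synthetic stand-in for frozen encoder outputs: two well-separated clusters.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
centers = np.array([[2.0, 0.0], [-2.0, 0.0]])
features = centers[labels] + 0.5 * rng.standard_normal((200, 2))

W, b = train_linear_probe(features, labels, num_classes=2)
accuracy = ((features @ W + b).argmax(axis=1) == labels).mean()
print(f"train accuracy: {accuracy:.2f}")
```

Because only `W` and `b` are updated, the resulting accuracy reflects how linearly separable the encoder's features are, which is why it is used as a measure of representation quality.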
Pascal VOC2007 classification
To test classification on Pascal VOC, you extract features from a pretrained model and run an SVM on top of the features. You can do that by running the following code:
cd transfer_classification
./run_svm_voc.sh 0 path_to_your_encoder name_experiment path_to_pascal_voc
The code is based on the FAIR Self-Supervision Benchmark.
Pascal VOC2007 detection
To test transfer in detection experiments do the following:
- Enter the transfer_detection directory.
- Install detectron2, replacing the detectron2 folder.
- Convert the checkpoint path_to_your_encoder to detectron2 format:
python convert_ckpt.py path_to_your_encoder output_ckpt.pth
- Add symlinks to PascalVOC07 and PascalVOC12 in the datasets folder.
- Train the detection model:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train_net.py \
--num-gpus 8 \
--config-file config/pascal_voc_R_50_C4_transfer.yaml \
MODEL.WEIGHTS ckpts/${name}.pth \
OUTPUT_DIR outputs/${name}
<a name="notebooks"/>
Notebooks
- We provide some example Jupyter notebooks illustrating the full training pipeline. See notebooks.
- If you use the provided Conda environment, you'll need to register it as a Jupyter kernel:
source activate genrep_env
python -m ipykernel install --user --name genrep_env
<a name="colab"/>
Colab
- You can find a Google Colab notebook implementation here.
Acknowledgements
We thank the authors of these repositories:
Citation
If you use this code for your research, please cite our paper:
@article{jahanian2021generative,
title={Generative Models as a Data Source for Multiview Representation Learning},
author={Jahanian, Ali and Puig, Xavier and Tian, Yonglong and Isola, Phillip},
journal={arXiv preprint arXiv:2106.05258},
year={2021}
}