Home

Awesome

Factorizable Net (F-Net)

This is pytorch implementation of our ECCV-2018 paper: Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation. This project is based on our previous work: Multi-level Scene Description Network.

Progress

Updates

Project Settings

  1. Install the requirements (you can use pip or Anaconda):

    conda install pip pyyaml sympy h5py cython numpy scipy click
    conda install -c menpo opencv3
    conda install pytorch torchvision cudatoolkit=8.0 -c pytorch 
    pip install easydict
    
  2. Clone the Factorizable Net repository

    git clone git@github.com:yikang-li/FactorizableNet.git
    
  3. Build the Cython modules for nms, roi pooling,roi align modules

    cd lib
    make all
    cd ..
    
  4. Download the three datasets VG-MSDN, VG-DR-Net, VRD to F-Net/data. And extract the folders with tar xzvf ${Dataset}.tgz. We have converted the original annotations to json version.

  5. Download Visual Genome images and VRD images.

  6. Link the image data folder to target folder: ln -s /path/to/images F-Net/data/${Dataset}/images

    • p.s. You can change the default data directory by modifying dir in options/data_xxx.json.
  7. [optional] Download the pretrained RPN for Visual Genome and VRD. Place them into output/.

  8. [optional] Download the pretrained Factorizable Net on VG-MSDN, VG-DR-Net and VG-VRD, and place them to output/trained_models/

Project Organization

There are several subfolders contained:

Evaluation with our Pretrained Models

Pretrained models on VG-MSDN, VG-DR-Net and VG-VRD are provided. --evaluate is provided to enable evaluation mode. Additionally, we also provide --use_gt_boxes to fed the ground-truth object bounding boxes instead of RPN proposals.

CUDA_VISIBLE_DEVICES=0 python train_FN.py --evaluate --dataset_option=normal \
	--path_opt options/models/VG-MSDN.yaml \
	--pretrained_model output/trained_models/Model-VG-MSDN.h5
CUDA_VISIBLE_DEVICES=0 python train_FN.py --evaluate \
	--path_opt options/models/VRD.yaml \
	--pretrained_model output/trained_models/Model-VRD.h5
CUDA_VISIBLE_DEVICES=0 python train_FN.py --evaluate --dataset_option=normal \
	--path_opt options/models/VG-DR-Net.yaml \
	--pretrained_model output/trained_models/Model-VG-DR-Net.h5

Training

Acknowledgement

We thank longcw for his generous release of the PyTorch Implementation of Faster R-CNN.

Reference

If you find our project helpful, your citations are highly appreciated:

@inproceedings{li2018fnet,
author={Li, Yikang and Ouyang, Wanli and Bolei, Zhou and Jianping, Shi and Chao, Zhang and Wang, Xiaogang},
title={Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation},
booktitle = {ECCV},
year = {2018}
}

We also have two papers regarding to scene graph generation / visual relationship detection:

@inproceedings{li2017msdn,
author={Li, Yikang and Ouyang, Wanli and Zhou, Bolei and Wang, Kun and Wang, Xiaogang},
title={Scene graph generation from objects, phrases and region captions},
booktitle = {ICCV},
year = {2017}
}

@inproceedings{li2017vip,
author={Li, Yikang and Ouyang, Wanli and Zhou, Bolei and Wang, Kun and Wang, Xiaogang},
title={ViP-CNN: Visual Phrase Guided Convolutional Neural Network},
booktitle = {CVPR},
year = {2017}
}

License:

The pre-trained models and the Factorizable Network technique are released for uncommercial use.

Contact Yikang LI if you have questions.