Awesome

This implements "Visual Translation Embedding Network for Visual Relation Detection" by Hanwang Zhang, Zawlin Kyaw, Shih-Fu Chang, and Tat-Seng Chua (CVPR 2017).

A TensorFlow adaptation by yangxuntu is now also available, and it obtains a significant improvement on the VG dataset. You can find the code at https://github.com/yangxuntu/vtranse

What's inside?

Download links

The files are hosted on Google Drive. The direct link to the folder is https://drive.google.com/open?id=1BvtjCnlORMg4l92kNgZ2g1YaHYj9Dy3X

Coming soon

Setup

Object Detector

Ensure that the data folder looks like this:

zawlin@zlgpu:~/g/cvpr17_vtranse/data$ tree -l -L 4 -d
.
├── demo
├── scripts
├── sg_vrd_2016 -> /media/zawlin/ssd/data/vrd/vrd/sg
│   ├── Annotations
│   │   ├── sg_test_images
│   │   └── sg_train_images
│   ├── Data
│   │   ├── sg_test_images
│   │   └── sg_train_images
│   ├── devkit
│   │   ├── data
│   │   │   └── ilsvrc_det_sample
│   │   └── evaluation
│   └── ImageSets
└── vg1_2_2016 -> /media/zawlin/ssd/data/vrd/vg_1.2/voc_format
    ├── Annotations
    │   ├── test
    │   │   ├── VG_100K
    │   │   └── VG_100K_2
    │   └── train
    │       ├── VG_100K
    │       └── VG_100K_2
    ├── Data
    │   ├── test
    │   │   ├── VG_100K
    │   │   └── VG_100K_2
    │   └── train
    │       ├── VG_100K
    │       └── VG_100K_2
    ├── devkit
    │   ├── data
    │   │   └── ilsvrc_det_sample
    │   └── evaluation
    └── ImageSets
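
To check the layout above before training, a short script can report which expected folders are missing. This is only a sketch and is not part of the repository: the helper names (expected_dirs, missing_dirs) are hypothetical, the folder names are copied from the tree listing, and the script assumes it is run from the root source folder so that "data" resolves correctly.

```python
from pathlib import Path

# Sub-folders that both dataset roots share in the tree listing.
COMMON_DIRS = [
    "Annotations",
    "Data",
    "devkit/data",
    "devkit/data/ilsvrc_det_sample",
    "devkit/evaluation",
    "ImageSets",
]

# Split folders found under both Annotations/ and Data/ for each dataset,
# copied from the tree listing (assumed to be the complete set).
SPLIT_DIRS = {
    "sg_vrd_2016": ["sg_test_images", "sg_train_images"],
    "vg1_2_2016": [
        "test/VG_100K",
        "test/VG_100K_2",
        "train/VG_100K",
        "train/VG_100K_2",
    ],
}


def expected_dirs(dataset):
    """List the relative folders expected under data/<dataset>."""
    dirs = list(COMMON_DIRS)
    for parent in ("Annotations", "Data"):
        dirs += [f"{parent}/{split}" for split in SPLIT_DIRS[dataset]]
    return dirs


def missing_dirs(data_root, dataset):
    """Return the expected folders that do not exist under data_root/dataset."""
    root = Path(data_root) / dataset
    # Path.is_dir() follows symlinks, so symlinked dataset roots are accepted.
    return [d for d in expected_dirs(dataset) if not (root / d).is_dir()]


if __name__ == "__main__":
    for name in SPLIT_DIRS:
        missing = missing_dirs("data", name)
        print(name, "OK" if not missing else "missing: " + ", ".join(missing))
```

Because the check follows symlinks, it works whether sg_vrd_2016 and vg1_2_2016 are real folders or links to another disk, as in the listing.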

Training And Evaluation Instructions

I am using Ubuntu 16.04 with gcc 5.4. If you run into protobuf errors, recompiling protobuf from source will usually eliminate them. When I refer to folders, the paths are relative to the root GitHub source folder.

The steps below are for the VRD dataset. For VG, the steps are similar; you will just need to change some folder or file paths to point to the VG directory or scripts.

Citation

If you use this code in a scientific publication, please cite:

@inproceedings{Zhang_2017_CVPR,
  author    = {Hanwang Zhang and Zawlin Kyaw and Shih-Fu Chang and Tat-Seng Chua},
  title     = {Visual Translation Embedding Network for Visual Relation Detection},
  booktitle = {CVPR},
  year      = {2017},
}