Single-Hop: A Simple Baseline for the GQA Dataset

This repository contains a simple model for the GQA dataset, which applies a single hop of attention over the visual features, guided by the encoded question, as specified in models_gqa/single_hop.py. It is intended to serve as a baseline for more complicated models on the GQA dataset. This baseline is referred to as "single-hop" in the LCGN paper.

(If you are looking for the code for our LCGN model, please check out this repo.)

It is applicable to three types of features from GQA:

  • spatial features
  • objects features
  • "perfect-sight" object names and attributes (one-hot embeddings of the ground-truth object names and attributes)

It achieves the following performance on the validation (val_balanced), test-dev (testdev_balanced), and test (test_balanced) splits of the GQA dataset:

Visual Feature Type                         | Accuracy on val_balanced | Accuracy on testdev_balanced | Accuracy on test_balanced (obtained on EvalAI Phase: test2019) | Pre-trained model
spatial features                            | 54.97%                   | 48.61%                       | 49.11%                                                         | download
objects features                            | 62.02%                   | 53.78%                       | 54.42%                                                         | download
"perfect-sight" object names and attributes | 86.98%                   | n/a*                         | n/a*                                                           | download

*This setting requires using the GQA ground-truth scene graphs at both training and test time (only the object names and attributes are used; their relations are not used). Hence, it is not applicable to the test or the challenge setting.
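
For a concrete picture of what "single-hop" means here, below is a minimal NumPy sketch of one attention hop over visual features conditioned on an encoded question. It is an illustration only, not the actual TensorFlow implementation in models_gqa/single_hop.py; all dimensions and projection weights are assumed.

    # Minimal sketch of single-hop attention (illustration only; the real model
    # in models_gqa/single_hop.py is written in TensorFlow and differs in detail).
    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def single_hop_attention(question_vec, visual_feats, W_q, W_v, w_att):
        """question_vec: (d_q,) encoded question (e.g. final RNN state).
        visual_feats: (N, d_v) features of N regions (spatial cells or objects).
        W_q, W_v, w_att: assumed projection weights of shapes (d_q, d), (d_v, d), (d,).
        Returns a single attended visual feature of shape (d_v,)."""
        q = question_vec @ W_q           # project the question into a joint space
        v = visual_feats @ W_v           # project each region into the same space
        scores = (v * q) @ w_att         # (N,) question-guided attention logits
        alpha = softmax(scores)          # (N,) attention weights -- one hop, no iteration
        return alpha @ visual_feats      # weighted sum over the N regions

The attended visual feature is then typically fused with the question encoding and fed to a classifier over the GQA answer vocabulary.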

Installation

  1. Install Python 3 (Anaconda recommended: https://www.continuum.io/downloads).
  2. Install TensorFlow (we used TensorFlow 1.12.0 in our experiments):
    pip install tensorflow-gpu (or pip install tensorflow-gpu==1.12.0 to install TensorFlow 1.12.0)
  3. Install a few other dependency packages (NumPy, h5py, PyYAML):
    pip install numpy h5py pyyaml
  4. Download this repository or clone with Git, and then enter the root directory of the repository:
    git clone https://github.com/ronghanghu/gqa_single_hop_baseline.git && cd gqa_single_hop_baseline
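
After completing the steps above, an optional quick check from Python confirms that TensorFlow is installed and can see a GPU:

    # Optional sanity check; the experiments in this repository used TensorFlow 1.12.0.
    import tensorflow as tf
    print(tf.__version__)              # expect a 1.x release, e.g. 1.12.0
    print(tf.test.is_gpu_available())  # True if the GPU build can see a device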

Download the GQA dataset

Download the GQA dataset from https://cs.stanford.edu/people/dorarad/gqa/, and symlink it to exp_gqa/gqa_dataset. After this step, the file structure should look like the following:

exp_gqa/gqa_dataset
    questions/
        train_all_questions/
            train_all_questions_0.json
            ...
            train_all_questions_9.json
        train_balanced_questions.json
        val_all_questions.json
        val_balanced_questions.json
        submission_all_questions.json
        test_all_questions.json
        test_balanced_questions.json
        testdev_all_questions.json
        testdev_balanced_questions.json
    spatial/
        gqa_spatial_info.json
        gqa_spatial_0.h5
        ...
        gqa_spatial_15.h5
    objects/
        gqa_objects_info.json
        gqa_objects_0.h5
        ...
        gqa_objects_15.h5
    sceneGraphs/
        train_sceneGraphs.json
        val_sceneGraphs.json
    images/
        ...

Note that the GQA images themselves are not needed for training or evaluation -- only the questions, the features, and the scene graphs (the latter only if you would like to run on the "perfect-sight" object names and attributes) are needed.
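
To confirm that the downloaded features are readable, you can optionally inspect them with a few lines of Python. This check is not part of the repository; the paths assume the symlinked layout above, and no particular HDF5 dataset names are assumed:

    # Optional sanity check of the GQA feature files (not part of this repository).
    import json
    import h5py

    with open('exp_gqa/gqa_dataset/objects/gqa_objects_info.json') as f:
        info = json.load(f)
    print('entries in gqa_objects_info.json:', len(info))

    # List whatever datasets the first objects HDF5 file contains.
    with h5py.File('exp_gqa/gqa_dataset/objects/gqa_objects_0.h5', 'r') as f:
        for name, dset in f.items():
            print(name, dset.shape, dset.dtype)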

Training on GQA

Note:

Pretrained models:

Training steps:

  1. Add the root of this repository to PYTHONPATH: export PYTHONPATH=.:$PYTHONPATH
  2. Train with spatial features:
    python exp_gqa/train.py --cfg exp_gqa/cfgs/spatial.yaml
  3. Train with objects features:
    python exp_gqa/train.py --cfg exp_gqa/cfgs/objects.yaml
  4. Train with "perfect-sight" object names and attributes (one-hot embeddings):
    python exp_gqa/train.py --cfg exp_gqa/cfgs/scene_graph.yaml
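
Each command above selects its settings through a YAML file passed with --cfg; the testing commands below additionally override individual options by appending KEY VALUE pairs (e.g. TEST.SPLIT_VQA val_balanced). The sketch below illustrates how such a Detectron-style config merge can work; it is a hypothetical example, not the actual models_gqa/config.py, and the default keys shown are made up.

    # Hypothetical sketch of a Detectron-style config: defaults, a YAML file from
    # --cfg, and trailing KEY VALUE overrides (e.g. TEST.SPLIT_VQA testdev_balanced).
    # This is NOT the repository's models_gqa/config.py.
    import argparse
    import copy
    import yaml

    DEFAULTS = {'TEST': {'SPLIT_VQA': 'val_balanced'}}  # made-up default

    def merge_dict(base, override):
        # Recursively merge `override` into `base`.
        for k, v in override.items():
            if isinstance(v, dict) and isinstance(base.get(k), dict):
                merge_dict(base[k], v)
            else:
                base[k] = v

    def set_by_dotted_key(cfg, dotted_key, value):
        # Set cfg['TEST']['SPLIT_VQA'] from the string 'TEST.SPLIT_VQA'.
        keys = dotted_key.split('.')
        d = cfg
        for k in keys[:-1]:
            d = d.setdefault(k, {})
        d[keys[-1]] = value

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument('--cfg', required=True)
        parser.add_argument('opts', nargs='*')  # e.g. TEST.SPLIT_VQA testdev_balanced
        args = parser.parse_args()

        cfg = copy.deepcopy(DEFAULTS)
        with open(args.cfg) as f:
            merge_dict(cfg, yaml.safe_load(f) or {})
        for key, value in zip(args.opts[0::2], args.opts[1::2]):
            set_by_dotted_key(cfg, key, value)
        print(cfg)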

Testing on GQA

Note:

Testing steps:

  1. Add the root of this repository to PYTHONPATH: export PYTHONPATH=.:$PYTHONPATH
  2. Test with spatial features:
    • test locally on the val_balanced split:
      python exp_gqa/test.py --cfg exp_gqa/cfgs/spatial.yaml TEST.SPLIT_VQA val_balanced
    • test locally on the testdev_balanced split:
      python exp_gqa/test.py --cfg exp_gqa/cfgs/spatial.yaml TEST.SPLIT_VQA testdev_balanced
    • generate the submission file on submission_all for EvalAI (this takes a long time):
      python exp_gqa/test.py --cfg exp_gqa/cfgs/spatial.yaml TEST.SPLIT_VQA submission_all
  3. Test with objects features:
    • test locally on the val_balanced split:
      python exp_gqa/test.py --cfg exp_gqa/cfgs/objects.yaml TEST.SPLIT_VQA val_balanced
    • test locally on the testdev_balanced split:
      python exp_gqa/test.py --cfg exp_gqa/cfgs/objects.yaml TEST.SPLIT_VQA testdev_balanced
    • generate the submission file on submission_all for EvalAI (this takes a long time):
      python exp_gqa/test.py --cfg exp_gqa/cfgs/objects.yaml TEST.SPLIT_VQA submission_all
  4. Test with "perfect-sight" object names and attributes (one-hot embeddings):
    • test locally on the val_balanced split:
      python exp_gqa/test.py --cfg exp_gqa/cfgs/scene_graph.yaml TEST.SPLIT_VQA val_balanced
    • test locally on the testdev_balanced split (this will not work unless you have a testdev_sceneGraphs.json file under exp_gqa/gqa_dataset/sceneGraphs/ containing scene graphs for the test-dev images, which GQA does not provide):
      python exp_gqa/test.py --cfg exp_gqa/cfgs/scene_graph.yaml TEST.SPLIT_VQA testdev_balanced
    • generate the submission file on submission_all for EvalAI (this will not work unless you have a submission_sceneGraphs.json file under exp_gqa/gqa_dataset/sceneGraphs/ containing scene graphs for all images, which GQA does not provide):
      python exp_gqa/test.py --cfg exp_gqa/cfgs/scene_graph.yaml TEST.SPLIT_VQA submission_all
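
The "generate the submission file" commands above produce predictions on the submission_all split for upload to EvalAI. To the best of our knowledge, the GQA challenge expects a single JSON file containing a list of {"questionId", "prediction"} entries, one per question; the sketch below only illustrates that format with placeholder values (the exact filename and output location written by exp_gqa/test.py depend on the config and are not shown here).

    # Illustration of the GQA/EvalAI submission format with placeholder values;
    # this is NOT the test script's actual output code or filename.
    import json

    predictions = [
        {"questionId": "<question id 1>", "prediction": "<answer 1>"},
        {"questionId": "<question id 2>", "prediction": "<answer 2>"},
    ]
    with open('my_submission.json', 'w') as f:
        json.dump(predictions, f)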

Acknowledgements

The outline of the configuration code (such as models_gqa/config.py) is modified from the Detectron codebase.