APL

This repo is the official implementation of the paper "APL: Anchor-based Prompt Learning for One-stage Weakly Supervised Referring Expression Comprehension".

Project structure

The directory structure of the project looks like this:

├── README.md            <- The top-level README for developers using this project.
│
├── config               <- configuration 
│
├── data
│   ├── anns            <- note: cat_name.json is for prompt template usage
│
├── datasets               <- dataloader file
│
│
├── models  <- source code of the model components used in this project
│   │
│   ├── language_encoder.py             <- encoder for the images' text descriptions 
│   ├── network_blocks.py               <- essential model building blocks 
│   ├── tag_encoder.py                  <- encoder for extracting prompt embeddings 
│   ├── visual_encoder.py               <- visual backbone; also includes the prompt template encoder
│   │
│   │
│   ├── APL           <- core files of the APL model implementation
│   │   ├── __init__.py
│   │   ├── head.py   <- anchor-prompt contrastive loss (a rough sketch follows the tree)
│   │   ├── net.py    <- main code for the APL model
│   │   ├── sup_head.py <- visual alignment loss
│   │
│   │
├── utils  <- helper functions
├── requirements.txt     <- The requirements file for reproducing the analysis environment
├── train.py   <- script for training the model
├── test.py    <- script for evaluating a trained model
│
└── LICENSE              <- Open-source license if one is chosen
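
As a rough illustration of the anchor-prompt contrastive objective implemented in models/APL/head.py, here is a minimal PyTorch sketch of an InfoNCE-style loss between anchor features and prompt embeddings. The function name, tensor shapes, and temperature value are assumptions made for illustration and do not reflect the repo's actual code:

import torch
import torch.nn.functional as F

def anchor_prompt_contrastive_loss(anchor_feats, prompt_embeds, labels, temperature=0.07):
    # anchor_feats:  (N, D) features of the candidate anchors (hypothetical shapes)
    # prompt_embeds: (C, D) prompt embeddings, one per category
    # labels:        (N,)  category index assigned to each anchor
    anchor_feats = F.normalize(anchor_feats, dim=-1)
    prompt_embeds = F.normalize(prompt_embeds, dim=-1)
    # cosine similarity between every anchor and every prompt, scaled by temperature
    logits = anchor_feats @ prompt_embeds.t() / temperature  # (N, C)
    return F.cross_entropy(logits, labels)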

Installation

Instructions on how to clone and set up the repository:

Clone this repo:

git clone https://github.com/Yaxin9Luo/APL.git
cd APL

Create a conda virtual environment and activate it:

conda create -n apl python=3.7 -y
conda activate apl

Install the required dependencies:

(We run all our experiments on PyTorch 1.11.0 with CUDA 11.3.)
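
For reference, one way to set up a matching PyTorch build is the conda command below (taken from the PyTorch previous-versions instructions; adjust the CUDA toolkit version to your machine):

conda install pytorch==1.11.0 torchvision==0.12.0 cudatoolkit=11.3 -c pytorch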

Install NVIDIA apex (the commands below are copied from its official repo):

git clone https://github.com/NVIDIA/apex
cd apex
# if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key... 
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
# otherwise
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Compile the DCN layer:

cd utils/DCN
./make.sh

Install the remaining dependencies:

pip install -r requirements.txt
wget https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.1.0/en_vectors_web_lg-2.1.0.tar.gz -O en_vectors_web_lg-2.1.0.tar.gz
pip install en_vectors_web_lg-2.1.0.tar.gz

Data Preparation

(We also provide the prepared annotations in the data/anns folder to save you time.)

(We recommend placing the data folder under the APL root directory; otherwise, please modify the paths in the config files.)

├── data
│   ├── anns            
│       ├── refcoco.json            
│       ├── refcoco+.json              
│       ├── refcocog.json                 
│       ├── refclef.json
│       ├── cat_name.json       
│   ├── images 
│       ├── train2014
│           ├── COCO_train2014_000000515716.jpg              
│           ├── ...
│       ├── refclef
│           ├── 99.jpg              
│           ├── ...

... the remaining directories    
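
If you do not already have the MS-COCO train2014 images locally, they can be downloaded from the official COCO server with the standard download link below (the refclef images come from a different source and are not included in this zip):

wget http://images.cocodataset.org/zips/train2014.zip
unzip train2014.zip -d data/images/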

Training

python train.py --config ./config/[DATASET_NAME].yaml
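
For example, to train on RefCOCO (assuming the config file is named after the annotation file, e.g. refcoco.yaml):

python train.py --config ./config/refcoco.yaml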

Evaluation

python test.py --config ./config/[DATASET_NAME].yaml --eval-weights [PATH_TO_CHECKPOINT_FILE]
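
For example (the config name and checkpoint path below are placeholders, not files shipped with the repo):

python test.py --config ./config/refcoco.yaml --eval-weights ./checkpoints/refcoco_best.pth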

Model Zoo

Weakly REC

| Method | RefCOCO val | RefCOCO testA | RefCOCO testB | RefCOCO+ val | RefCOCO+ testA | RefCOCO+ testB | RefCOCOg val-g |
| ------ | ----------- | ------------- | ------------- | ------------ | -------------- | -------------- | -------------- |
| APL    | 64.51       | 61.91         | 63.57         | 42.70        | 42.84          | 39.80          | 50.22          |

Weakly RES

| Method | RefCOCO val | RefCOCO testA | RefCOCO testB | RefCOCO+ val | RefCOCO+ testA | RefCOCO+ testB | RefCOCOg val-g |
| ------ | ----------- | ------------- | ------------- | ------------ | -------------- | -------------- | -------------- |
| APL    | 55.92       | 54.84         | 55.64         | 34.92        | 34.87          | 35.61          | 40.13          |

Pseudo Labels for Training Other Models (Weakly Supervised Training Scheme)

| Method      | RefCOCO val | RefCOCO testA | RefCOCO testB | RefCOCO+ val | RefCOCO+ testA | RefCOCO+ testB | RefCOCOg val-g |
| ----------- | ----------- | ------------- | ------------- | ------------ | -------------- | -------------- | -------------- |
| APL_SimREC  | 63.94       | 64.72         | 61.21         | 42.11        | 44.85          | 38.31          | 48.35          |
| APL_TransVG | 64.86       | 64.89         | 63.87         | 39.28        | 41.08          | 36.45          | 46.11          |

Visualization of Prediction Results (blue box is the ground truth)

Image Description: "No cut piece but 7am of cut piece"

Image Description: "Green apple on the left"

Image Description: "Purple book"

Image Description: "Yellow round fruit with blemish"

Image Description: "From bottom right second up"