APL
This repository is the official implementation of the paper "APL: Anchor-based Prompt Learning for One-stage Weakly Supervised Referring Expression Comprehension".
Project structure
The directory structure of the project looks like this:
├── README.md <- The top-level README for developers using this project.
│
├── config <- configuration
│
├── data
│ ├── anns <- note: cat_name.json is for prompt template usage
│
├── datasets <- dataloader file
│
│
├── models <- Source code for use in this project.
│ │
│ ├── language_encoder.py <- encoder for images' text descriptions
│ ├── network_blocks.py <- essential model building blocks
│ ├── tag_encoder.py <- encoder for extracting prompt embeddings
│ ├── visual_encoder.py <- visual backbone; also includes the prompt template encoder
│ │
│ │
│ ├── APL <- most important files for APL model implementations
│ │ ├── __init__.py
│ │ ├── head.py <- for anchor-prompt contrastive loss
│ │ ├── net.py <- main code for APL model
│ │ ├── sup_head.py <- visual alignment loss
│ │
│ │
├── utils <- helper functions
├── requirements.txt <- The requirements file for reproducing the analysis environment
├── train.py <- script for training the model
├── test.py <- script for evaluating a trained model
│
└── LICENSE <- Open-source license if one is chosen
Installation
Follow these steps to clone the repository and set up the environment.
Clone the repository and navigate to the project directory:
git clone https://github.com/Yaxin9Luo/APL.git
cd APL
Create a conda virtual environment and activate it:
conda create -n apl python=3.7 -y
conda activate apl
Install the required dependencies:
- Install PyTorch following the official installation instructions
(We ran all our experiments with PyTorch 1.11.0 and CUDA 11.3.)
- Install apex following the official installation guide
(or use the following commands, copied from the official apex repository):
git clone https://github.com/NVIDIA/apex
cd apex
# if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key...
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
# otherwise
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./
Compile the DCN layer:
cd utils/DCN
./make.sh
Install the remaining dependencies:
pip install -r requirements.txt
wget https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.1.0/en_vectors_web_lg-2.1.0.tar.gz -O en_vectors_web_lg-2.1.0.tar.gz
pip install en_vectors_web_lg-2.1.0.tar.gz
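After the steps above, a quick check that the main packages can be located may save a failed training run. The following is a minimal sketch, not part of the repository: the helper `find_missing` is hypothetical, and the module names in `REQUIRED_MODULES` are assumptions based on the packages named in the installation steps (it is assumed that the spaCy vectors package is importable as `en_vectors_web_lg`).

```python
import importlib.util
import sys

# Module names assumed from the installation steps; adjust to your setup.
REQUIRED_MODULES = ["torch", "apex", "spacy", "en_vectors_web_lg"]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or half-initialised modules.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

The check only locates the packages without importing them, so it does not confirm that the CUDA extensions of apex or the DCN layer were compiled correctly.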
Data Preparation
- Download the images and generate the annotations according to SimREC
(To save time, we also provide the annotations in the data/anns folder.)
- Download the pretrained weights of YoloV3 from Google Drive
(We recommend placing them in the root directory of APL; otherwise, please modify the path in the config files.)
- The data directory should look like this:
├── data
│ ├── anns
│ │ ├── refcoco.json
│ │ ├── refcoco+.json
│ │ ├── refcocog.json
│ │ ├── refclef.json
│ │ ├── cat_name.json
│ ├── images
│ │ ├── train2014
│ │ │ ├── COCO_train2014_000000515716.jpg
│ │ │ ├── ...
│ │ ├── refclef
│ │ │ ├── 99.jpg
│ │ │ ├── ...
... the remaining directories
- NOTE: our YoloV3 is trained on COCO’s training images, excluding those that appear in the validation and testing sets of RefCOCO, RefCOCO+, and RefCOCOg.
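The expected layout can be verified before training with a short script. This is a minimal sketch under stated assumptions: the helper `missing_entries` is hypothetical and not part of the repository, and it assumes the annotation files sit under `data/anns` and the image folders under `data/images`, as the listing above suggests. It does not check the YoloV3 weights, whose location depends on the config files.

```python
from pathlib import Path

# Relative paths assumed from the data directory listing in this README.
EXPECTED = [
    "anns/refcoco.json",
    "anns/refcoco+.json",
    "anns/refcocog.json",
    "anns/refclef.json",
    "anns/cat_name.json",
    "images/train2014",
    "images/refclef",
]


def missing_entries(data_root):
    """Return the expected entries that do not exist under `data_root`."""
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries("data")
    if missing:
        print("Missing under data/:")
        for rel in missing:
            print("  ", rel)
    else:
        print("Data directory matches the expected layout.")
```

Run it from the project root so that the relative path `data` resolves correctly.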
Training
python train.py --config ./config/[DATASET_NAME].yaml
Evaluation
python test.py --config ./config/[DATASET_NAME].yaml --eval-weights [PATH_TO_CHECKPOINT_FILE]
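When running several datasets in a row, the two commands above can be assembled programmatically. The sketch below only builds and prints the command lines; it does not run them. The helper `build_commands` is hypothetical, the dataset names are assumptions taken from the annotation file names in data/anns, and the config directory name follows the project tree, so check that a matching `.yaml` file actually exists for each name.

```python
import shlex

# Dataset names assumed to match the annotation files in data/anns.
DATASETS = ["refcoco", "refcoco+", "refcocog", "refclef"]


def build_commands(dataset, checkpoint, config_dir="./config"):
    """Return (train_cmd, eval_cmd) as argument lists for one dataset."""
    config = f"{config_dir}/{dataset}.yaml"
    train_cmd = ["python", "train.py", "--config", config]
    eval_cmd = ["python", "test.py", "--config", config,
                "--eval-weights", checkpoint]
    return train_cmd, eval_cmd


if __name__ == "__main__":
    for name in DATASETS:
        # The checkpoint path is a placeholder; replace it with a real file.
        train_cmd, eval_cmd = build_commands(name, "[PATH_TO_CHECKPOINT_FILE]")
        print(" ".join(shlex.quote(arg) for arg in train_cmd))
        print(" ".join(shlex.quote(arg) for arg in eval_cmd))
```

Returning argument lists rather than strings means the same values can be passed directly to `subprocess.run` without shell quoting issues.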
Model Zoo
Weakly REC
Method | RefCOCO val | RefCOCO testA | RefCOCO testB | RefCOCO+ val | RefCOCO+ testA | RefCOCO+ testB | RefCOCOg val-g |
---|---|---|---|---|---|---|---|
APL | 64.51 | 61.91 | 63.57 | 42.70 | 42.84 | 39.80 | 50.22 |
Weakly RES
Method | RefCOCO val | RefCOCO testA | RefCOCO testB | RefCOCO+ val | RefCOCO+ testA | RefCOCO+ testB | RefCOCOg val-g |
---|---|---|---|---|---|---|---|
APL | 55.92 | 54.84 | 55.64 | 34.92 | 34.87 | 35.61 | 40.13 |
Pseudo Labels for Training Other Models (Weakly Supervised Training Scheme)
Method | RefCOCO val | RefCOCO testA | RefCOCO testB | RefCOCO+ val | RefCOCO+ testA | RefCOCO+ testB | RefCOCOg val-g |
---|---|---|---|---|---|---|---|
APL_SimREC | 63.94 | 64.72 | 61.21 | 42.11 | 44.85 | 38.31 | 48.35 |
APL_TransVG | 64.86 | 64.89 | 63.87 | 39.28 | 41.08 | 36.45 | 46.11 |
Visualization of Prediction Results (the blue box is the ground truth)
Image description: "No cut piece but 7am of cut piece"
Image description: "Green apple on the left"
Image description: "Purple book"
Image description: "Yellow round fruit with blemish"
Image description: "From bottom right second up"