# Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model
The PyTorch implementation of "Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model".
<div align="left"> <img src=figures/framework.png width=60% /> </div>

## Environment Preparation
### Create conda environment
```bash
cd SamGOP
conda create --name samgop python=3.8 -y
conda activate samgop
conda install pytorch==1.9.0 torchvision==0.10.0 cudatoolkit=11.1 -c pytorch -c nvidia
pip install -U opencv-python

# under your working directory
cd detectron2
pip install -e .
```
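Optionally, you can sanity-check the environment before moving on. The snippet below simply imports PyTorch and detectron2 and prints the detected versions and CUDA availability:

```bash
# optional sanity check: PyTorch, CUDA, and detectron2 should all be importable
python -c "import torch, detectron2; print(torch.__version__, torch.cuda.is_available(), detectron2.__version__)"
```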
### Install Requirements
```bash
cd ..
pip install -r requirements.txt
```
### CUDA kernel for MSDeformAttn
```bash
cd maskGOP/modeling/pixel_decoder/ops
sh make.sh
```
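As a quick check that the kernel compiled, you can try importing the built extension. The module name below assumes the usual Mask2Former/MaskDINO build naming, which this ops directory appears to follow; adjust it if your build differs:

```bash
# optional: verify the MSDeformAttn CUDA extension was built and installed
# (module name assumes the standard Mask2Former/MaskDINO setup; adjust if it differs)
python -c "import MultiScaleDeformableAttention; print('MSDeformAttn extension OK')"
```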
## Data Preparation
We train our model on the GOO-Real and GOO-Synth datasets separately.
You can download GOO-Synth from OneDrive:

- Train: part1, part2, part3, part4, part5, part6, part7, part8, part9, part10, part11
- Test: GOOsynth-test_data
- Annotation file: GOOsynth-train_data_Annotation

You can download GOO-Real from OneDrive:

- Train: GOOreal-train_data
- Test: GOOreal-test_data

You can download the GOO-Real annotation files from Baidu disk:

- GOOreal-train_data_Annotation (code: 4s36)
- GOOreal-val_data_Annotation (code: mx3c)
If you want to train on the GOO-Real or GOO-Synth dataset, please organize the data as follows:
```
├── datasets
│   └── coco
│       ├── annotations
│       │   ├── cate.txt
│       │   ├── train2017.json
│       │   └── val2017.json
│       ├── train2017
│       │   ├── 0.png
│       │   ├── 1.png
│       │   └── ...
│       └── val2017
│           ├── 3609.png
│           ├── 3610.png
│           └── ...
```
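Since the annotation files follow the COCO naming convention, a quick way to verify the layout is to load them with `pycocotools` and print the image counts. This is only a sketch: it assumes the JSON files are in COCO format and that `pycocotools` is installed.

```bash
# optional: confirm the annotation files load and report how many images they index
# (assumes COCO-format JSON and an installed pycocotools)
python -c "from pycocotools.coco import COCO; print(len(COCO('datasets/coco/annotations/train2017.json').getImgIds()), 'train images')"
python -c "from pycocotools.coco import COCO; print(len(COCO('datasets/coco/annotations/val2017.json').getImgIds()), 'val images')"
```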
## Training & Inference
To train the model, please run the following command:
```bash
python train_net.py --num-gpus 1 --config-file ./configs/coco/instance-segmentation/maskGOP_R50_bs2_75ep_3s.yaml SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0001
```
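The training script follows the standard detectron2 launcher interface, so common overrides can be appended on the command line. The example below is a sketch that assumes `--resume` and the `OUTPUT_DIR` key behave as they do in stock detectron2; the output path is illustrative:

```bash
# example: send checkpoints/logs to a custom directory and resume from the last checkpoint
# (--resume and OUTPUT_DIR are standard detectron2 options; the output path is illustrative)
python train_net.py --num-gpus 1 --resume \
  --config-file ./configs/coco/instance-segmentation/maskGOP_R50_bs2_75ep_3s.yaml \
  SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0001 OUTPUT_DIR ./output/samgop_r50
```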
To evaluate the model, please run the following command:
```bash
python eavl_train_net.py --eval-only --num-gpus 1 --config-file ./configs/coco/instance-segmentation/maskGOP_R50_bs2_75ep_3s.yaml MODEL.WEIGHTS weights_path
```
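For example, replacing `weights_path` with an actual checkpoint produced by training might look like this (the path below is purely illustrative):

```bash
# example invocation; substitute the path to your own trained checkpoint
python eavl_train_net.py --eval-only --num-gpus 1 \
  --config-file ./configs/coco/instance-segmentation/maskGOP_R50_bs2_75ep_3s.yaml \
  MODEL.WEIGHTS ./output/samgop_r50/model_final.pth
```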
## Acknowledgements
Our implementation is based on detectron2 and MaskDINO.