<center>

# Coarse-to-Fine Amodal Segmentation with Shape Prior (C2F-Seg)

Jianxiong Gao, Xuelin Qian†, Yikai Wang, Tianjun Xiao†, Tong He, Zheng Zhang, Yanwei Fu

</center>

This is the official implementation of the ICCV'23 paper *Coarse-to-Fine Amodal Segmentation with Shape Prior*.
## Introduction
<img src='./imgs/C2F-Seg.jpg' width="100%">

C2F-Seg is a framework for amodal segmentation. It first generates a coarse amodal mask from the visible mask and visual features via a mask-and-predict procedure with transformers. This coarse mask is then refined by a convolutional module, guided by human-imitated attention on visual features of the amodal object. Learning the visible mask serves as an auxiliary task during training; at inference time, only the amodal mask is estimated.
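At inference time, the pipeline therefore runs in two stages. Below is a minimal, hypothetical sketch of that flow; the names (`vq_model`, `transformer`, `refiner` and their methods) are illustrative placeholders, not the actual classes in this repo:

```python
def amodal_inference(image_feats, visible_mask, vq_model, transformer, refiner):
    """Illustrative two-stage coarse-to-fine amodal inference."""
    # Stage 1 (coarse): encode the visible mask into discrete shape codes,
    # then let the transformer predict the codes of the full amodal shape
    # conditioned on visual features (the mask-and-predict step).
    visible_codes = vq_model.encode(visible_mask)
    amodal_codes = transformer.predict(visible_codes, image_feats)
    coarse_mask = vq_model.decode(amodal_codes)

    # Stage 2 (fine): a convolutional module refines the coarse mask while
    # attending to visual features of the amodal object.
    return refiner(coarse_mask, image_feats)
```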
## Environment Setup

```bash
git clone https://github.com/amazon-science/c2f-seg.git
cd c2f-seg
conda env create -f environment.yml
```
If conda is too slow, you can instead use:

```bash
conda create --name C2F-Seg python=3.10
conda activate C2F-Seg
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -e .[all]
```
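After installation, you can sanity-check that the CUDA build of PyTorch is working (a quick check we suggest, not part of the repo's scripts):

```python
import torch

# Confirm the PyTorch version and GPU visibility before running experiments.
print(torch.__version__)
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # first visible GPU
else:
    print("CUDA not available -- check the pytorch-cuda install")
```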
## MOViD-Amodal

<img src="./imgs/example.gif" width="100%">

## Dataset and checkpoints
| Dataset | $\text{mIoU}_{full}$ | $\text{mIoU}_{occ}$ | VQ Model | C2F-Seg |
|---|---|---|---|---|
| KINS | 82.22 | 53.60 | weight, config | weight, config |
| COCOA | 80.28 | 27.71 | weight, config | weight, config |
| MOViD-Amodal | 71.67 | 36.13 | weight, config | weight, config |
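Here, $\text{mIoU}_{full}$ averages the IoU over the full amodal masks, while $\text{mIoU}_{occ}$ averages it over the occluded (invisible) regions only. The sketch below shows the two quantities for a single instance, assuming binary NumPy masks; the reported numbers come from the evaluation scripts further down:

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU between two binary masks."""
    union = np.logical_or(pred, gt).sum()
    inter = np.logical_and(pred, gt).sum()
    return float(inter / union) if union > 0 else 1.0

def amodal_ious(pred_amodal, gt_amodal, gt_visible):
    # Full-mask IoU over the entire amodal shape.
    iou_full = iou(pred_amodal, gt_amodal)
    # Occluded-region IoU: restrict both masks to the invisible part,
    # i.e. the ground-truth amodal region minus the visible region.
    invisible = np.logical_not(gt_visible)
    iou_occ = iou(np.logical_and(pred_amodal, invisible),
                  np.logical_and(gt_amodal, invisible))
    return iou_full, iou_occ
```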
Please use the following commands to prepare the dataset and checkpoints:
```bash
# Example with the KINS dataset
bash download.sh KINS
wget https://data.dgl.ai/dataset/C2F-Seg/KINS.tar
tar -xvf KINS.tar
# Important: update the root_path in the config files!
```
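If you would rather patch the configs programmatically, a sketch along these lines works, assuming the configs are YAML files with a top-level `root_path` key (check the key name and file layout in the downloaded configs; the glob pattern below is a placeholder):

```python
import glob
import yaml  # pip install pyyaml

# Hypothetical helper: point every config at your local data directory.
for path in glob.glob("configs/*.yml"):
    with open(path) as f:
        cfg = yaml.safe_load(f)
    cfg["root_path"] = "/path/to/your/data"  # your dataset location
    with open(path, "w") as f:
        yaml.safe_dump(cfg, f)
```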
## Running Experiments

### Evaluate model

```bash
# KINS
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 \
    test_c2f_seg.py --dataset KINS --batch 1 --data_type image --vq_path KINS_vqgan --path KINS_c2f_seg
# MOViD-Amodal
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 \
    test_c2f_seg.py --dataset MOViD_A --batch 1 --data_type video --vq_path MOViD_A_vqgan --path MOViD_A_c2f_seg
```
### Train VQ model

The VQ shape model is trained first; its checkpoint is then passed to C2F-Seg training and evaluation via `--vq_path`.

```bash
# KINS
CUDA_VISIBLE_DEVICES=0 python train_vq.py --dataset KINS --path KINS_vqgan --check_point_path ../check_points
# MOViD-Amodal
CUDA_VISIBLE_DEVICES=0 python train_vq.py --dataset MOViD_A --path MOViD_A_vqgan --check_point_path ../check_points
```
### Train C2F-Seg

```bash
# KINS
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 \
    train_c2f_seg.py --dataset KINS --batch 16 --data_type image --vq_path KINS_vqgan --path KINS_c2f_seg
# MOViD-Amodal
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 \
    train_c2f_seg.py --dataset MOViD_A --batch 1 --data_type video --vq_path MOViD_A_vqgan --path MOViD_A_c2f_seg
```
## Citation
If you find our paper useful for your research and applications, please cite using this BibTeX:
```bibtex
@inproceedings{gao2023coarse,
  title={Coarse-to-Fine Amodal Segmentation with Shape Prior},
  author={Gao, Jianxiong and Qian, Xuelin and Wang, Yikai and Xiao, Tianjun and He, Tong and Zhang, Zheng and Fu, Yanwei},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={1262--1271},
  year={2023}
}
```
## Security

See CONTRIBUTING for more information.

## License

This project is licensed under the Apache-2.0 License.