MUM: Mix Image Tiles and UnMix Feature Tiles for Semi-Supervised Object Detection (CVPR 2022)

This is the PyTorch implementation of our paper: <br> MUM: Mix Image Tiles and UnMix Feature Tiles for Semi-Supervised Object Detection <br> IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022 <br> [arXiv]

<p align="center"> <img src="teaser/mum_phase.png" width="85%"> </p>

Installation & Setup

We follow the installation process of the official Unbiased Teacher repo (https://github.com/facebookresearch/unbiased-teacher).

Download the code
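A minimal sketch for fetching the code, with `<repo-url>` standing in for this repository's clone URL (the project directory name mix-unmix matches the dataset layout shown below):

git clone <repo-url> mix-unmix
cd mix-unmix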

Prerequisites

Build Detectron2 from Source

# get the Detectron2 v0.5 package
wget https://github.com/facebookresearch/detectron2/archive/refs/tags/v0.5.zip

# unzip
unzip v0.5.zip

# install
python -m pip install -e detectron2-0.5

Install other requirements

pip install -r requirements.txt

Dataset download

  1. Download the COCO & VOC datasets (example download commands follow the directory layout below).

  2. Organize the datasets as follows:

mix-unmix/
└── datasets/
    ├── coco/
    │   ├── train2017/
    │   ├── val2017/
    │   └── annotations/
    │       ├── instances_train2017.json
    │       └── instances_val2017.json
    ├── VOC2007
    │   ├── Annotations
    │   ├── ImageSets
    │   └── JPEGImages
    └── VOC2012
        ├── Annotations
        ├── ImageSets
        └── JPEGImages
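
The datasets are not bundled with the repo. As a rough sketch, the commands below pull the standard COCO 2017 splits and the usual VOC tarballs from their official hosts; verify the URLs and adjust paths before running:

# COCO 2017 images and annotations
mkdir -p datasets/coco
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip train2017.zip -d datasets/coco
unzip val2017.zip -d datasets/coco
unzip annotations_trainval2017.zip -d datasets/coco

# PASCAL VOC 2007 and 2012
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xf VOCtrainval_06-Nov-2007.tar
tar -xf VOCtest_06-Nov-2007.tar
tar -xf VOCtrainval_11-May-2012.tar
mv VOCdevkit/VOC2007 VOCdevkit/VOC2012 datasets/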

Evaluation

| Backbone | Protocols | AP50 | AP50:95 | Model Weights |
|----------|-----------|------|---------|---------------|
| R50-FPN  | COCO-Standard 1% | 40.06 | 21.89 | link |
| R50-FPN  | COCO-Additional | 63.30 | 42.11 | link |
| R50-FPN  | VOC07 (VOC12) | 78.94 | 50.22 | link |
| R50-FPN  | VOC07 (VOC12 / COCO20cls) | 80.45 | 52.31 | link |
| Swin     | COCO-Standard 0.5% | 34.25 | 16.52 | link |
Evaluate on COCO:

python train_net.py \
      --eval-only \
      --num-gpus 1 \
      --config configs/mum_configs/coco.yaml \
      MODEL.WEIGHTS weights/<your weight>.pth

Evaluate on VOC:

python train_net.py \
      --eval-only \
      --num-gpus 1 \
      --config configs/mum_configs/voc.yaml \
      MODEL.WEIGHTS weights/<your weight>.pth

Train

To reproduce the paper results, we train with 4 GPUs (A6000 or V100 32GB).

Train on COCO:

python train_net.py \
      --num-gpus 4 \
      --config configs/mum_configs/coco.yaml

Train on VOC:

python train_net.py \
      --num-gpus 4 \
      --config configs/mum_configs/voc.yaml

Swin
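
The Swin backbone starts from ImageNet-pretrained weights. A hedged example, assuming the Swin-Tiny checkpoint published in the official Swin Transformer releases (verify the URL against that repo):

wget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth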

mv swin_tiny_patch4_window7_224.pth weights/

Evaluate on COCO with the Swin backbone:

python train_net.py \
      --eval-only \
      --num-gpus 1 \
      --config configs/mum_configs/coco_swin.yaml \
      MODEL.WEIGHTS weights/<your weight>.pth

Train on COCO with the Swin backbone:

python train_net.py \
      --num-gpus 4 \
      --config configs/mum_configs/coco_swin.yaml

Mix/UnMix code block

Mixing code block

# bs: batch size, ng: group size (images mixed per group), nt: tiles per side
# Sample a random permutation of the ng images at every tile position
mask = torch.argsort(torch.rand(bs // ng, ng, nt, nt), dim=1).cuda()
# Expand the tile-level mask to pixel level (3 channels, h x w pixels)
img_mask = mask.view(bs // ng, ng, 1, nt, nt)
img_mask = img_mask.repeat_interleave(3, dim=2)
img_mask = img_mask.repeat_interleave(h // nt, dim=3)
img_mask = img_mask.repeat_interleave(w // nt, dim=4)
# Shuffle tiles across the group dimension to build the mixed images
img_tiled = images.tensor.view(bs // ng, ng, c, h, w)
img_tiled = torch.gather(img_tiled, dim=1, index=img_mask)
img_tiled = img_tiled.view(bs, c, h, w)

Unmixing code block

# Invert the tile permutation so features return to their original positions
inv_mask = torch.argsort(mask, dim=1).cuda()
feat_mask = inv_mask.view(bs // ng, ng, 1, nt, nt)
feat_mask = feat_mask.repeat_interleave(c, dim=2)
feat_mask = feat_mask.repeat_interleave(h // nt, dim=3)
feat_mask = feat_mask.repeat_interleave(w // nt, dim=4)
# Gather with the inverse permutation to unmix the feature tiles
feat_tiled = feat.view(bs // ng, ng, c, h, w)
feat_tiled = torch.gather(feat_tiled, dim=1, index=feat_mask)
feat_tiled = feat_tiled.view(bs, c, h, w)
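
To sanity-check that unmixing exactly inverts mixing, the sketch below runs both blocks on a random tensor and verifies the round trip. It is a minimal CPU example with illustrative shapes (the values of bs, ng, and nt are assumptions, and the feature map reuses the image resolution for simplicity):

import torch

bs, ng, c, h, w = 4, 4, 3, 8, 8  # batch size, group size, channels, height, width
nt = 2                           # tiles per side

images = torch.randn(bs, c, h, w)

# Mix: random per-tile permutation within each group
mask = torch.argsort(torch.rand(bs // ng, ng, nt, nt), dim=1)
img_mask = mask.view(bs // ng, ng, 1, nt, nt)
img_mask = img_mask.repeat_interleave(c, dim=2)
img_mask = img_mask.repeat_interleave(h // nt, dim=3)
img_mask = img_mask.repeat_interleave(w // nt, dim=4)
mixed = torch.gather(images.view(bs // ng, ng, c, h, w), dim=1, index=img_mask)

# Unmix: argsort of a permutation yields its inverse
inv_mask = torch.argsort(mask, dim=1)
feat_mask = inv_mask.view(bs // ng, ng, 1, nt, nt)
feat_mask = feat_mask.repeat_interleave(c, dim=2)
feat_mask = feat_mask.repeat_interleave(h // nt, dim=3)
feat_mask = feat_mask.repeat_interleave(w // nt, dim=4)
unmixed = torch.gather(mixed, dim=1, index=feat_mask).view(bs, c, h, w)

assert torch.equal(unmixed, images)  # the round trip recovers the input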

Acknowledgements

We use the official Unbiased Teacher code as our baseline, and the timm repository to implement the Swin Transformer.