[ECCV2024] Learning Camouflaged Object Detection from Noisy Pseudo Label (Poster)

This is the open-source repository for our paper Learning Camouflaged Object Detection from Noisy Pseudo Label, accepted at ECCV 2024!

Our paper is available at: Paper

Framework Architecture

Proposed Models

Performance

Comparison

Training Process

Task Definition: Weakly Semi-Supervised Camouflaged Object Detection (WSSCOD)

We introduce a training protocol named Weakly Semi-Supervised Camouflaged Object Detection (WSSCOD). WSSCOD uses box annotations as prompts, complemented by a minimal amount of pixel-level annotations, to generate high-accuracy pseudo labels.

  1. Dataset Division:

    • $\mathcal{D}_m = \{\mathcal{X}_m, \mathcal{F}_m, \mathcal{B}_m\}_{m=1}^M$: Training images $\mathcal{X}_m$ with pixel-level annotations $\mathcal{F}_m$ and box annotations $\mathcal{B}_m$.
    • $\mathcal{D}_n = \{\mathcal{X}_n, \mathcal{B}_n\}_{n=1}^N$: Training images $\mathcal{X}_n$ with box annotations $\mathcal{B}_n$ only, where $M+N$ is the total number of training images.
  2. Training ANet:

    • Train ANet using dataset $\mathcal{D}_m$.
    • Use $\mathcal{B}_m$ as prompts and $\mathcal{F}_m$ for supervision.
  3. Generating Pseudo Labels:

    • Use the trained ANet and dataset $\mathcal{D}_n$ to predict pseudo labels $\mathcal{W}_n$.
  4. Constructing the Weakly Semi-Supervised Dataset:

    • Combine $\{\mathcal{X}_m, \mathcal{F}_m\}_{m=1}^M$ and $\{\mathcal{X}_n, \mathcal{W}_n\}_{n=1}^N$ to form $\mathcal{D}_t$.
  5. Training PNet:

    • Train PNet using the dataset $\mathcal{D}_t$.
    • Evaluate performance with different $M$ and $N$ ratios:
      • PNet$_{F1}$: $M=1\%$, $N=99\%$
      • PNet$_{F5}$: $M=5\%$, $N=95\%$
      • PNet$_{F10}$: $M=10\%$, $N=90\%$
      • PNet$_{F20}$: $M=20\%$, $N=80\%$
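
The dataset division in step 1 can be illustrated with a short sketch. It assumes the split is a random partition of image ids at a given percentage; the function name `split_dataset`, the `seed` parameter, and the example ids are hypothetical and are not taken from the repository, which may divide the data differently.

```python
import random

# Percentages of pixel-level labels used by the four settings listed above.
SETTINGS = {"F1": 1, "F5": 5, "F10": 10, "F20": 20}


def split_dataset(image_ids, m_percent, seed=0):
    """Split image ids into D_m (pixel + box labels) and D_n (box labels only).

    Assumption: a seeded random partition; the repository may use another rule.
    """
    if not 0 < m_percent < 100:
        raise ValueError("m_percent must be strictly between 0 and 100")
    ids = sorted(image_ids)  # fixed order before shuffling, for reproducibility
    random.Random(seed).shuffle(ids)
    m = max(1, round(len(ids) * m_percent / 100))  # size of D_m, at least one image
    return ids[:m], ids[m:]


if __name__ == "__main__":
    all_ids = [f"img_{i:04d}" for i in range(1000)]  # hypothetical image ids
    for name, pct in SETTINGS.items():
        d_m, d_n = split_dataset(all_ids, pct)
        print(f"PNet_{name}: M={len(d_m)}, N={len(d_n)}")
```

Fixing the seed keeps $\mathcal{D}_m$ and $\mathcal{D}_n$ identical across the ANet and PNet stages, so that pseudo labels are only ever predicted for images whose pixel-level masks were not used to train ANet.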

Details: ANet and PNet Training

| Aspect | ANet (Auxiliary Network) | PNet (Primary Network) |
| --- | --- | --- |
| Stage | First | Second |
| Objective | Generate high-accuracy pseudo labels | Main camouflaged object detection |
| Data Input | Subset $\mathcal{D}_m$ with pixel and box annotations | Weakly semi-supervised dataset $\mathcal{D}_t$ |
| Training Dataset | $\mathcal{D}_m = \{\mathcal{X}_m, \mathcal{F}_m, \mathcal{B}_m\}_{m=1}^M$ | $\mathcal{D}_t = \{\mathcal{X}_m, \mathcal{F}_m\}_{m=1}^M \cup \{\mathcal{X}_n, \mathcal{W}_n\}_{n=1}^N$ |
| Annotations | Pixel-level $\mathcal{F}_m$ and box $\mathcal{B}_m$ | Pseudo labels $\mathcal{W}_n$ and pixel-level $\mathcal{F}_m$ |
| Supervision | Pixel-level $\mathcal{F}_m$ for pseudo label generation | Pseudo labels $\mathcal{W}_n$ and pixel-level $\mathcal{F}_m$ |
| Input Prompts | Box annotations $\mathcal{B}_m$ for camouflaged objects | Images $\mathcal{X}_m$ and $\mathcal{X}_n$ |
| Performance Evaluation | - | Different settings: PNet$_{F1}$, PNet$_{F5}$, PNet$_{F10}$, PNet$_{F20}$ |
| Training Goal | Generate high-quality pseudo labels $\mathcal{W}_n$ | Improve detection accuracy with various $M$ and $N$ ratios |
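
As a rough illustration of how the PNet training set $\mathcal{D}_t$ is assembled from the two sources of supervision, the sketch below merges ground-truth masks and ANet pseudo labels into one list. The `Sample` class, the `build_dt` function, and the path-to-path dictionaries are hypothetical and only illustrate the set union; the repository's data loader may be organized differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    image: str       # path to the training image
    label: str       # path to the mask used for supervision
    is_pseudo: bool  # True when the mask is an ANet pseudo label (W_n)


def build_dt(fully, pseudo):
    """Form D_t from {(X_m, F_m)} and {(X_n, W_n)}.

    Both arguments map an image path to a mask path (hypothetical layout).
    """
    overlap = set(fully) & set(pseudo)
    if overlap:
        # D_m and D_n are disjoint by construction, so an overlap is an error.
        raise ValueError(f"images appear in both subsets: {sorted(overlap)[:3]}")
    d_t = [Sample(img, mask, False) for img, mask in fully.items()]
    d_t += [Sample(img, mask, True) for img, mask in pseudo.items()]
    return d_t
```

Keeping an `is_pseudo` flag on each sample is one simple way to let a training loop treat clean and noisy labels differently.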

1. Download the Training and Test Sets

We have made the training and test sets available for download via the following links:

Once downloaded, place data.zip in the code/data directory and unzip it.
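
If you prefer to unpack the archive from Python rather than the command line, a minimal sketch using the standard `zipfile` module follows. The default paths mirror the instruction above; the function name `extract_data` is hypothetical, and the layout inside the archive is not assumed.

```python
import zipfile
from pathlib import Path


def extract_data(zip_path="code/data/data.zip", dest_dir="code/data"):
    """Unzip the dataset archive into dest_dir and return the archive member names."""
    zip_path, dest_dir = Path(zip_path), Path(dest_dir)
    if not zip_path.is_file():
        raise FileNotFoundError(f"{zip_path} not found; download it first")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```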

2. Train ANet

python code/TrainANet/TrainDDP.py --gpu_id 0 --ration 1 
# ration represents the proportion of pixel-level labels
# We find that training on a single GPU works better than on four or eight GPUs
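
To train one ANet per label proportion, the command above can be wrapped in a small driver. This is only a sketch: it assumes `--ration` accepts the values 1, 5, 10, and 20 (the source shows only `--ration 1`), and the helper names `anet_train_command` and `train_all` are hypothetical. It uses only the flags shown above.

```python
import subprocess
import sys

# Assumed values: the percentages of pixel-level labels from the task definition.
RATIONS = (1, 5, 10, 20)


def anet_train_command(ration, gpu_id=0):
    """Build the ANet training command for one pixel-label proportion."""
    return [
        sys.executable, "code/TrainANet/TrainDDP.py",
        "--gpu_id", str(gpu_id),
        "--ration", str(ration),
    ]


def train_all(dry_run=True):
    """Run (or only print, when dry_run is True) one training job per proportion."""
    for ration in RATIONS:
        cmd = anet_train_command(ration)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing job


if __name__ == "__main__":
    train_all(dry_run=True)
```

The jobs run sequentially on a single GPU, in line with the note above that single-GPU training works better.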

3. Generate Pseudo Labels

python code/TrainANet/Test.py --ration 1 
# ration represents the proportion of pixel-level labels

4. Train PNet

python code/TrainPNet/TrainDDP.py --gpu_id 0 --ration 1 --q_epoch 20 --batchsize_fully 6 --batchsize_weakly 24
# ration represents the proportion of pixel-level labels
# q_epoch is the epoch at which q is changed to 1
# batchsize_fully is the number of fully annotated samples in a batch
# batchsize_weakly is the number of weakly annotated samples in a batch
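
The roles of `--batchsize_fully`, `--batchsize_weakly`, and `--q_epoch` can be pictured with the sketch below. It is not the repository's implementation: `mixed_batches` and `q_for_epoch` are hypothetical helpers, the reuse of the smaller fully annotated pool is an assumption, and the starting value of q (`q_initial`) is a placeholder that the caller must supply, since the text above only states that q becomes 1 at `q_epoch`.

```python
import random


def mixed_batches(fully_ids, weakly_ids, n_fully=6, n_weakly=24, seed=0):
    """Yield (fully, weakly) batches with n_fully and n_weakly ids respectively.

    One pass covers the weakly annotated pool once (an incomplete last batch is
    dropped); the smaller fully annotated pool is reshuffled and reused whenever
    it runs out (assumption).
    """
    fully = list(fully_ids)
    if not fully:
        raise ValueError("at least one fully annotated sample is required")
    rng = random.Random(seed)
    weakly = list(weakly_ids)
    rng.shuffle(weakly)
    pool = []
    for start in range(0, len(weakly) - n_weakly + 1, n_weakly):
        while len(pool) < n_fully:
            refill = fully[:]
            rng.shuffle(refill)
            pool.extend(refill)
        batch_fully, pool = pool[:n_fully], pool[n_fully:]
        yield batch_fully, weakly[start:start + n_weakly]


def q_for_epoch(epoch, q_epoch, q_initial):
    """Return q_initial before q_epoch and 1.0 from q_epoch onward.

    Whether the switch happens at or after q_epoch is an assumption.
    """
    return 1.0 if epoch >= q_epoch else q_initial
```

With the defaults from the command above, each batch holds 30 samples, 6 with ground-truth masks and 24 with pseudo labels, so clean supervision is present in every step even when $M$ is as small as 1%.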

5. Testing Process

python code/TrainPNet/Test.py --ration 1 
# ration represents the proportion of pixel-level labels

Pretrained Weights and COD Results

For ANet

We release the weights and prediction maps for $N=99\%$, $N=95\%$, $N=90\%$, and $N=80\%$ at Baidu Link.

For PNet

| Model | Pretrained Weight | Prediction Description |
| --- | --- | --- |
| PNet$_{F1}$ | Google Link | $M=1\%$, $N=99\%$ |
| PNet$_{F5}$ | Google Link | $M=5\%$, $N=95\%$ |
| PNet$_{F10}$ | Google Link | $M=10\%$, $N=90\%$ |
| PNet$_{F20}$ | Google Link | $M=20\%$, $N=80\%$ |

References

@inproceedings{zhang2025learning,
  title={Learning Camouflaged Object Detection from Noisy Pseudo Label},
  author={Zhang, Jin and Zhang, Ruiheng and Shi, Yanjiao and Cao, Zhe and Liu, Nian and Khan, Fahad Shahbaz},
  booktitle={European Conference on Computer Vision},
  pages={158--174},
  year={2025},
  organization={Springer}
}