[ECCV2024] Learning Camouflaged Object Detection from Noisy Pseudo Label (Poster)
This is the open-source repository for our paper Learning Camouflaged Object Detection from Noisy Pseudo Label, accepted at ECCV 2024!
Our paper is available at: Paper
Framework Architecture
Performance
Training Process
Task Definition: Weakly Semi-Supervised Camouflaged Object Detection (WSSCOD)
We introduce a novel training protocol, Weakly Semi-Supervised Camouflaged Object Detection (WSSCOD). WSSCOD uses box annotations as prompts, complemented by a small amount of pixel-level annotations, to generate high-accuracy pseudo labels.
- Dataset Division:
  - $\mathcal{D}_m = \{\mathcal{X}_m, \mathcal{F}_m, \mathcal{B}_m\}_{m=1}^M$: training images $\mathcal{X}_m$ with pixel-level annotations $\mathcal{F}_m$ and box annotations $\mathcal{B}_m$.
  - $\mathcal{D}_n = \{\mathcal{X}_n, \mathcal{B}_n\}_{n=1}^N$: training images $\mathcal{X}_n$ with box annotations $\mathcal{B}_n$ only, where $M+N$ is the total number of training samples.
- Training ANet:
  - Train ANet using dataset $\mathcal{D}_m$.
  - Use $\mathcal{B}_m$ as prompts and $\mathcal{F}_m$ for supervision.
- Generating Pseudo Labels:
  - Use the trained ANet and dataset $\mathcal{D}_n$ to predict pseudo labels $\mathcal{W}_n$.
- Constructing the Weakly Semi-Supervised Dataset:
  - Combine $\{\mathcal{X}_m, \mathcal{F}_m\}_{m=1}^M$ and $\{\mathcal{X}_n, \mathcal{W}_n\}_{n=1}^N$ to form $\mathcal{D}_t$.
- Training PNet:
  - Train PNet using the dataset $\mathcal{D}_t$.
  - Evaluate performance with different $M$ and $N$ ratios:
    - PNet$_{F1}$: $M=1\%$, $N=99\%$
    - PNet$_{F5}$: $M=5\%$, $N=95\%$
    - PNet$_{F10}$: $M=10\%$, $N=90\%$
    - PNet$_{F20}$: $M=20\%$, $N=80\%$
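
The dataset division and the construction of $\mathcal{D}_t$ described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `split_wsscod` and `build_dt`, the random split, and the `seed` parameter are hypothetical and are not taken from the repository, which may use fixed split lists instead.

```python
import random


def split_wsscod(image_ids, pixel_ratio, seed=0):
    """Split image ids into D_m (pixel + box labels) and D_n (box labels only).

    pixel_ratio is the fraction M / (M + N), e.g. 0.01 for the PNet_F1 setting.
    The random split is an assumption made for this sketch.
    """
    if not 0 < pixel_ratio < 1:
        raise ValueError("pixel_ratio must lie strictly between 0 and 1")
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)
    m = max(1, round(len(ids) * pixel_ratio))
    return ids[:m], ids[m:]


def build_dt(d_m, d_n, pixel_labels, pseudo_labels):
    """Form D_t: images in D_m keep F_m, images in D_n receive pseudo labels W_n."""
    d_t = [(image_id, pixel_labels[image_id], "pixel") for image_id in d_m]
    d_t += [(image_id, pseudo_labels[image_id], "pseudo") for image_id in d_n]
    return d_t


if __name__ == "__main__":
    ids = [f"img_{k:04d}" for k in range(1000)]  # placeholder image ids
    for name, ratio in [("F1", 0.01), ("F5", 0.05), ("F10", 0.10), ("F20", 0.20)]:
        d_m, d_n = split_wsscod(ids, ratio)
        print(name, len(d_m), len(d_n))
```

Keeping the label source (`"pixel"` or `"pseudo"`) next to each sample makes it possible to treat clean and noisy supervision differently during PNet training.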
Details: ANet and PNet Training
Aspect | ANet (Auxiliary Network) | PNet (Primary Network) |
---|---|---|
Stage | First | Second |
Objective | Generate high-accuracy pseudo labels | Main camouflaged object detection |
Data Input | Subset $\mathcal{D}_m$ with pixel and box annotations | Weakly semi-supervised dataset $\mathcal{D}_t$ |
Training Dataset | $\mathcal{D}_m = \{\mathcal{X}_m, \mathcal{F}_m, \mathcal{B}_m\}_{m=1}^M$ | $\mathcal{D}_t = \{\mathcal{X}_m, \mathcal{F}_m\}_{m=1}^M \cup \{\mathcal{X}_n, \mathcal{W}_n\}_{n=1}^N$ |
Annotations | Pixel-level $\mathcal{F}_m$ and box $\mathcal{B}_m$ | Pseudo labels $\mathcal{W}_n$ and pixel-level $\mathcal{F}_m$ |
Supervision | Pixel-level $\mathcal{F}_m$ for pseudo label generation | Pseudo labels $\mathcal{W}_n$ and pixel-level $\mathcal{F}_m$ |
Input Prompts | Box annotations $\mathcal{B}_m$ for camouflaged objects | Images $\mathcal{X}_m$ and $\mathcal{X}_n$ |
Performance Evaluation | - | Different settings: PNet$_{F1}$, PNet$_{F5}$, PNet$_{F10}$, PNet$_{F20}$ |
Training Goal | Generate high-quality pseudo labels $\mathcal{W}_n$ | Improve detection accuracy with various $M$ and $N$ ratios |
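
To make the relation between the pixel-level annotations $\mathcal{F}_m$ and the box prompts $\mathcal{B}_m$ in the table concrete, here is a minimal sketch. It assumes that a box is the tight axis-aligned rectangle around the foreground of a binary mask and that a box can be rasterized into a binary prompt map. Both are assumptions for illustration: this README does not specify how boxes were produced or how ANet encodes them, and the helper names `mask_to_box` and `box_to_mask` are hypothetical.

```python
def mask_to_box(mask):
    """Return the tight box (x_min, y_min, x_max, y_max), inclusive, of a binary mask.

    mask is a list of rows containing 0/1 values. Returns None for an empty mask.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return min(cols), rows[0], max(cols), rows[-1]


def box_to_mask(box, height, width):
    """Rasterize a box into a binary map (one possible prompt encoding, assumed here)."""
    x0, y0, x1, y1 = box
    return [
        [1 if x0 <= x <= x1 and y0 <= y <= y1 else 0 for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
    ]
    box = mask_to_box(mask)
    print(box)
    print(box_to_mask(box, 4, 4))
```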
1. Download the Training and Test Sets
We have made the training and test sets available for download via the following links:
- Google Drive
- Baidu Drive (password: ECCV)

Once downloaded, place `data.zip` in the `code/data` directory and unzip it.
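
If you prefer to unpack the archive from Python instead of a shell, the step above might look like the following sketch. The helper name `extract_dataset` is hypothetical, and the default paths simply restate the locations given in this README, relative to the repository root.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive="code/data/data.zip", target_dir="code/data"):
    """Extract the dataset archive into target_dir and return the archive's member names."""
    archive = Path(archive)
    if not archive.is_file():
        raise FileNotFoundError(f"Expected the downloaded archive at {archive}")
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        return sorted(zf.namelist())
```

Calling `extract_dataset()` from the repository root unpacks `data.zip` next to itself. Pass other paths if the archive is stored elsewhere.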
2. Train ANet
python code/TrainANet/TrainDDP.py --gpu_id 0 --ration 1
# --ration sets the proportion of pixel-level labels
# We find that training on a single GPU works better than on four or eight GPUs
3. Generate Pseudo Labels
python code/TrainANet/Test.py --ration 1
# ration represents the proportion of pixel-level labels
4. Train PNet
python code/TrainPNet/TrainDDP.py --gpu_id 0 --ration 1 --q_epoch 20 --batchsize_fully 6 --batchsize_weakly 24
# ration represents the proportion of pixel-level labels
# q_epoch is the epoch at which q is switched to 1
# batchsize_fully means the number of fully annotated samples in a batch
# batchsize_weakly means the number of weakly annotated samples in a batch
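
The batch options above can be pictured with a small sketch. It is not the repository's data loader: the function names `mixed_batches` and `q_for_epoch`, the resampling of the fully annotated pool, and the starting value `q_initial` are assumptions made only for illustration. The README states only that q becomes 1 at `q_epoch` and that each batch contains `batchsize_fully` fully annotated and `batchsize_weakly` weakly annotated samples.

```python
import random


def mixed_batches(fully_ids, weakly_ids, batchsize_fully=6, batchsize_weakly=24, seed=0):
    """Yield (fully, weakly) id lists with fixed counts per batch.

    One pass covers the weakly annotated ids once and drops an incomplete last batch.
    The fully annotated pool is resampled for every batch because it is much smaller.
    """
    rng = random.Random(seed)
    fully = list(fully_ids)
    weakly = list(weakly_ids)
    rng.shuffle(weakly)
    for start in range(0, len(weakly) - batchsize_weakly + 1, batchsize_weakly):
        weak_part = weakly[start:start + batchsize_weakly]
        if len(fully) >= batchsize_fully:
            full_part = rng.sample(fully, batchsize_fully)
        else:
            # Fewer fully annotated samples than requested: sample with replacement.
            full_part = rng.choices(fully, k=batchsize_fully)
        yield full_part, weak_part


def q_for_epoch(epoch, q_epoch=20, q_initial=2.0):
    """Return q for the given epoch; q_initial is a hypothetical starting value."""
    return q_initial if epoch < q_epoch else 1.0
```

With the defaults from the command above, every batch holds 30 samples, of which 6 carry pixel-level labels and 24 carry pseudo labels.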
5. Testing Process
python code/TrainPNet/Test.py --ration 1
# ration represents the proportion of pixel-level labels
Pretrained Weights and COD Results
For ANet
We release the weights and prediction maps for $N=99\%$, $N=95\%$, $N=90\%$, and $N=80\%$ at Baidu Link.
For PNet
Model | Pretrained Weight | Prediction Description |
---|---|---|
PNet$_{F1}$ | Google Link | $M=1\%$, $N=99\%$ |
PNet$_{F5}$ | Google Link | $M=5\%$, $N=95\%$ |
PNet$_{F10}$ | Google Link | $M=10\%$, $N=90\%$ |
PNet$_{F20}$ | Google Link | $M=20\%$, $N=80\%$ |
References
@inproceedings{zhang2025learning,
title={Learning Camouflaged Object Detection from Noisy Pseudo Label},
author={Zhang, Jin and Zhang, Ruiheng and Shi, Yanjiao and Cao, Zhe and Liu, Nian and Khan, Fahad Shahbaz},
booktitle={European Conference on Computer Vision},
pages={158--174},
year={2025},
organization={Springer}
}