BagCAMs

Overview

Official implementation of the paper "Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization" (ECCV'22)

Gap between image-level classifier and pixel-level localizer

WSOL trains a feature extractor and a classifier with a cross-entropy loss between image-level features and image-level annotations. At test time, this classifier is then directly applied as a localizer to pixel-level features to produce pixel-level classification results, i.e., the localization map.

However, the localizer must discern the class of every spatial position from pixel-level features, where discriminative factors may not be well aggregated, i.e., may be insufficient to activate the globally learned classifier. The sketch below illustrates this standard pipeline.
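As a concrete reference, here is a minimal PyTorch sketch of this pipeline; the module names and shapes are illustrative placeholders, not this repository's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of the standard WSOL pipeline described above
# (illustrative modules and shapes, not this repository's API).
backbone = nn.Sequential(nn.Conv2d(3, 512, 3, padding=1), nn.ReLU())  # stand-in feature extractor
classifier = nn.Linear(512, 200)                                      # e.g., 200 classes for CUB-200

def train_step(images, labels, optimizer):
    feats = backbone(images)                 # (B, C, H, W) pixel-level features
    pooled = feats.mean(dim=(2, 3))          # (B, C) image-level features via global average pooling
    loss = F.cross_entropy(classifier(pooled), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def localization_map(images):
    feats = backbone(images)                 # (B, C, H, W)
    # Test time: the image-level classifier is applied at every spatial position,
    # i.e., map[b, k, h, w] = sum_c weight[k, c] * feats[b, c, h, w] (a CAM).
    return torch.einsum("bchw,kc->bkhw", feats, classifier.weight)
```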

To bridge this gap, our work proposes a plug-and-play approach called BagCAMs, which better projects an image-level trained classifier to meet the requirements of the localization task.

<center> <img src="pics/intro.png" width="80%" /> </center>

Our BagCAMs focuses on deriving a set of regional localizers from this well-trained classifier. These regional localizers can discern object-related factors with respect to each spatial position, acting as the base learners in an ensemble. The final localization result is then obtained by integrating their outputs (a conceptual sketch of this bagging step follows the figure below).

<center> <img src="pics/structure.png" width="80%" /> </center>
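The exact derivation of the regional localizers is given in the paper; the snippet below is only a conceptual sketch of the bagging step, assuming each regional localizer has already been reduced to a per-position weight vector derived from the trained classifier. The function and tensor names are hypothetical.

```python
import torch

def bagging_regional_maps(feats, regional_weights):
    """Conceptual sketch of the bagging step (not the paper's exact formulation).

    feats:            (C, H, W) pixel-level features for one image.
    regional_weights: (H*W, C) one weight vector per spatial position, i.e. the
                      regional localizers derived from the globally-trained
                      classifier (e.g., from position-wise gradients of the
                      target-class score).
    Returns a (H, W) localization map for the target class.
    """
    C, H, W = feats.shape
    flat = feats.reshape(C, H * W)                    # (C, HW)
    # Each regional localizer scores every position: (HW, C) @ (C, HW) -> (HW, HW).
    per_localizer_maps = regional_weights @ flat
    # Bagging: average the maps produced by all regional localizers.
    fused = per_localizer_maps.mean(dim=0)            # (HW,)
    return fused.reshape(H, W)

# Example with random tensors standing in for features and derived weights.
feats = torch.randn(512, 28, 28)
regional_weights = torch.randn(28 * 28, 512)          # placeholder for the derived localizers
cam = bagging_regional_maps(feats, regional_weights)  # (28, 28) localization map
```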

By better utilizing multiple regional localizers, our BagCAMs performs better than existing CAM-based methods, especially on intermediate feature maps that have higher spatial resolution.

<center> <img src="pics/result.png" width="80%" /> </center>

Getting Started

Prepare the dataset

Follow DA-WSOL to prepare the datasets.

Training baseline methods

Follow DA-WSOL to train a baseline method (CAM/HAS/CutMix/ADL/DA-WSOL).

Note that `--post_methods` should be set to `CAM` for efficiency during training.

Using Our BagCAMs for Testing

  1. Confirm `$data_root` is set to the folder of datasets that has been arranged as described above.

  2. Download the DA-WSOL checkpoint from our Google Drive (or use the checkpoint output by the training step).

  3. Set `--check_path` to the path of the checkpoint generated by the training process or of our released checkpoint.

  4. Confirm `--architecture` and `--wsol_method` are consistent with the settings of the trained checkpoint.

  5. Set `--post_methods` to `BagCAMs` (or another method, e.g., CAM/GradCAM/GradCAM++/PCS).

  6. Set `--target_layer` to the name of the layer whose output features and gradients are used (e.g., `layer1`/`layer2`/`layer3`/`layer4` for the ResNet backbone).

  7. Run `bash run_test.sh`.

  8. Test log files and test scores are saved in `--save_dir` (an example configuration is sketched below).
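For reference, a hypothetical `run_test.sh` following these steps might look like the sketch below. Only the flag names are taken from the steps above; the entry script name (`main.py`) and all example values are assumptions that should be adapted to your own setup.

```bash
# Hypothetical contents of run_test.sh: only the flag names come from the steps
# above; the entry script name and all example values are placeholders.
data_root=/path/to/datasets    # arranged as described in "Prepare the dataset"

python main.py \
    --data_root ${data_root} \
    --check_path /path/to/checkpoint.pth \
    --architecture resnet50 \
    --wsol_method dawsol \
    --post_methods BagCAMs \
    --target_layer layer3 \
    --save_dir test_log
```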

Performance

ILSVRC Dataset

| Method | Top-1 Loc | GT-known Loc | MaxBoxAccV2 |
| --- | --- | --- | --- |
| DA-WSOL-ResNet-CAM | 43.26 | 70.27 | 68.23 |
| DA-WSOL-ResNet-BagCAMs | 44.24 | 72.08 | 69.97 |
| DA-WSOL-InceptionV3 | 52.70 | 69.11 | 64.75 |
| DA-WSOL-InceptionV3-BagCAMs | 53.87 | 71.02 | 66.93 |

CUB-200 Dataset

| Method | Top-1 Loc | GT-known Loc | MaxBoxAccV2 | pIoU | PxAP |
| --- | --- | --- | --- | --- | --- |
| DA-WSOL-ResNet-CAM | 62.40 | 81.83 | 69.87 | 56.18 | 74.70 |
| DA-WSOL-ResNet-BagCAMs | 69.67 | 94.01 | 84.88 | 74.51 | 90.38 |
| DA-WSOL-InceptionV3-CAM | 56.29 | 80.03 | 68.01 | 51.81 | 71.03 |
| DA-WSOL-InceptionV3-BagCAMs | 60.07 | 89.78 | 76.94 | 58.05 | 72.97 |

OpenImages Dataset

| Method | pIoU | PxAP |
| --- | --- | --- |
| DA-WSOL-ResNet-CAM | 49.68 | 65.42 |
| DA-WSOL-ResNet-BagCAMs | 52.17 | 67.68 |
| DA-WSOL-InceptionV3-CAM | 48.01 | 64.46 |
| DA-WSOL-InceptionV3-BagCAMs | 50.79 | 66.89 |

Citation

    @article{BagCAMs,
      title={Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization},
      author={Zhu, Lei and Chen, Qian and Jin, Lujia and You, Yunfei and Lu, Yanye},
      journal={arXiv preprint arXiv:2207.07818},
      year={2022}
    }

    @inproceedings{DAWSOL,
      title={Weakly Supervised Object Localization as Domain Adaption},
      author={Zhu, Lei and She, Qi and Chen, Qian and You, Yunfei and Wang, Boyu and Lu, Yanye},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
      pages={14637--14646},
      year={2022}
    }

Acknowledgement

This code and our experiments are built upon the released code of gradcam / wsolevaluation / transferlearning. We thank the authors for their remarkable work.