Home

Awesome

Unlocking Attributes' Contribution to Successful Camouflage: A Combined Textual and Visual Analysis Strategy

News

[2024-09-06] Berif introduction to our work is available on YouTube.
[2024-09-04] Poster updated.
[2024-08-23] Early access from arxiv.
[2024-07-01] Our work is accepted by ECCV 2024! Congratulations and see you in Milano! :laughing:
[2024-03-15] First commit.

COD-TAX Overview

We introduced COD-TAX, which provides textual descriptions and attribute contribution information to COD. The textual descriptions focus on characterizing the entire COD scenario, while the attribute contributions primarily reveal the proportions in which various attributes influence the current scene. These components are initially generated by GPT-4 and subsequently fine-tuned by human. COD-TAX could be found in COD-TAX.

<p align="center"> <img src="asset/distribution_of_datasets.png"/> <br/> <em> Visualization of the overall distribution of the COD-TAX dataset. (a) illustrates the 17 attribute classes within three primary categories: Surrounding Factors (SF), Camouflaged Object-Self Factors (COF), and Imaging Quality Factors (IQF). The proportion of each attribute represents its average contribution to the general COD scenario, while 'Max' reveals the maximum proportion it occurred in the training dataset. (b) shows the statistical results of textual description lengths. (c) presents a word cloud map illustrating word frequency. (d) shows two examples from the COD-TAX dataset. </em> </p>

The number in COD-TAX/attribute_contribution contains 17 normalized contribution to current camouflaged scene. The attribute is orderd in:

Surrounding Factors: Environmental Pattern Matching, Clutter Noise Disruption, Environmental Motion Dazzle, Environmental Shading, Environmental Textures.

Camouflaged Object-Self Reasons: Color Matching, Shape Mimicry, Behavior Mimicry, Texture Adapting, Shadow Minimization, Edge Diffusion.

Imaging Quality Reasons: Blur Issue, Low Resolution, Improper Exposure, Compression Artifacts, Object Size Matters, Object Placement.

For detailed explaination and example, please refer to the supplemental material.

ACUMEN Overview

<p align="center"> <img src="asset/pipline.png"/> <br/> <em> Overall structure of the proposed ACUMEN. The model utilizes both a textual branch and a visual branch, with the textual branch active only during training for practical usage. </em> </p>

Experiment Results

<p align="center"> <img src="asset/Qualitative_analysis.png"/> <br/> <em> Qualitative results. </em> </p> <p align="center"> <img src="asset/Quantitative_result.png"/> <br/> <em> Quantitative results. </em> </p>

Experiment Setting

install dependencies in requirements.txt torch1.9 + cuda10.2 is recomanded

ACUMEN reuslts for CAMO, COD10K, and NC4K can be found in ACUMEN results.

Manage the traning and testing dataset like this, Desc is provided by COD-TAX and fixation information can be found from COD-Rank-Localize-and-Segment. For reproduction convenience, we provide the all in one data Data_all_in_one.

ACUMEN
    --dataset
        --TestDataset
            --CAMO
            --CHAMELEON
            --COD10K
            --NC4k
        --TrainDataset
            --Desc
                --overall_description
                --attribute_description
                --attribute_contribution
            --Imgs
            --Fix
            --GT
      

put pretrained ViT-L-14-336px.pt here. pretrained ViT.

ACUMEN
    --pretrain
        ViT-L-14-336px.pt

Training

For the training process, run:

python train_multigpu_noattr.py --config config/codclip_vit_L14@336_noattr_3_1_50.yaml

Testing / Inference

Put the pretrained checkpoint here.

ACUMEN
    --exp/metapara_noattr_3_1_50
        Net_epoch_best.pth

And run:

python test.py --config config/codclip_vit_L14@336_noattr_3_1_50.yaml