Official code for the PLS-LSA algorithm, accepted at ECCV 2024: An accurate detection is not all you need to combat label noise in web-noisy datasets.
Overview
This repository provides the official codebase for our ECCV 2024 paper: "An accurate detection is not all you need to combat label noise in web-noisy datasets."
Architecture
Getting Started
Requirements
All dependencies are listed in the lsa.yml file.
Installation
Create a Conda environment using:
conda env create -f lsa.yml
conda activate lsa
We use Lightning Fabric (formerly LightningLite) so that multi-GPU support can be added, although multi-GPU training is not currently implemented.
Datasets
Downloads
This repository supports the following noisy datasets:
- Webvision: Download the dataset from Webvision 2017 and follow the instructions. For faster training, we use the first 50 classes (mini-Webvision).
- Controlled Noisy Web Labels (CNWL): Download the dataset from the official webpage, or as TFRecords from the FaMUS repository.
- Webly-fg: Download the dataset from the official repository.
- ImageNet2012: Download the test set from ImageNet for evaluation.
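The mini-Webvision subset mentioned above keeps only the first 50 classes. A minimal sketch of that filtering step follows, assuming a file list in which each line holds a relative image path followed by an integer class index. The function name filter_first_classes and the sample entries are hypothetical, and the repository's own data loader may handle this differently.

```python
def filter_first_classes(lines, num_classes=50):
    """Keep only entries whose class index is below num_classes.

    Each line is assumed to look like "<relative/path.jpg> <class_index>".
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last whitespace so paths containing spaces survive.
        path, label = line.rsplit(maxsplit=1)
        label = int(label)
        if label < num_classes:
            kept.append((path, label))
    return kept


# Hypothetical file-list entries, for illustration only.
sample = [
    "google/q0001/img_a.jpg 0",
    "google/q0120/img_b.jpg 49",
    "flickr/q0500/img_c.jpg 50",
    "flickr/q1999/img_d.jpg 999",
]
subset = filter_first_classes(sample)
print(len(subset))
```

Class indices are zero-based in this sketch, so the first 50 classes are indices 0 to 49.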
Dataset Paths
Update the mypath.py file with the paths to the downloaded datasets.
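The layout of mypath.py is not described here. As a rough sketch only, a path-lookup module of this kind often maps a dataset name to its root directory. The dictionary keys, the placeholder paths, and the get_dataset_root helper below are all hypothetical and should be adapted to the actual file in the repository.

```python
from pathlib import Path

# Hypothetical dataset names and placeholder locations; replace with your own.
DATASET_ROOTS = {
    "webvision": "/path/to/webvision",
    "cnwl": "/path/to/cnwl",
    "webly-fg": "/path/to/webly-fg",
    "imagenet": "/path/to/imagenet2012",
}


def get_dataset_root(name):
    """Return the root directory configured for a dataset name."""
    key = name.lower()
    if key not in DATASET_ROOTS:
        known = ", ".join(sorted(DATASET_ROOTS))
        raise KeyError(f"Unknown dataset '{name}'. Known datasets: {known}")
    return Path(DATASET_ROOTS[key])


print(get_dataset_root("Webvision"))
```

Failing loudly on an unknown name makes a misspelled dataset argument easier to spot than a silent fallback would.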
Training
Contrastive Pre-training
Pre-train the network with an unsupervised contrastive algorithm (SimCLR) from the solo-learn codebase.
- Pre-trained weights for the CNWL and Webvision experiments are available on Google Drive.
- Specify the path to the pre-trained weights using the --pretrained argument.
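To illustrate how a --pretrained path might be consumed, the sketch below parses the argument and strips a key prefix from a checkpoint's parameter names so that they match a bare backbone. It assumes the checkpoint stores backbone parameters under a "backbone." prefix, as solo-learn checkpoints commonly do. The helper name extract_backbone_weights, the example path, and the toy dictionary are hypothetical, and the repository's actual loading code may differ.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative argument parsing")
    # Only the --pretrained flag comes from the README; the default is an assumption.
    parser.add_argument("--pretrained", type=str, default=None,
                        help="Path to contrastively pre-trained weights")
    return parser


def extract_backbone_weights(state_dict, prefix="backbone."):
    """Keep entries whose key starts with prefix and drop that prefix."""
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Parse an explicit argument list so the example runs without a command line.
args = build_parser().parse_args(["--pretrained", "weights/simclr.ckpt"])

# Toy stand-in for a loaded checkpoint; real values would be tensors.
toy_state = {
    "backbone.conv1.weight": 1,
    "backbone.fc.bias": 2,
    "projector.0.weight": 3,
}
backbone_state = extract_backbone_weights(toy_state)
print(args.pretrained, sorted(backbone_state))
```

Entries outside the prefix, such as the projection head used only during contrastive pre-training, are dropped in this sketch.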
PLS-LSA and PLS-LSA+
Run experiments using the train.sh file, which includes examples for:
- PLS-LSA and PLS-LSA+ on the CNWL, Webvision, and Webly-fg datasets
- PLS-LSA and PLS-LSA+ for ViTs pre-trained using CLIP
Getting Started
- Download and prepare the datasets.
- Update mypath.py with dataset paths.
- Pretrain using contrastive learning (optional).
- Run experiments using train.sh.
Results
Performance on Benchmark Datasets
Dataset | Result |
---|---|
CNWL | |
Webvision | |
Webly-fg | |
CNWL (CLIP) | |
Reproduction Note
Please note that results reproduced with this codebase may differ slightly from those reported in the paper, because the code was cleaned up and restructured for release.
Citation
Citing Our Work
If you find our work useful for your research, please cite our paper:
@inproceedings{2024_ECCV_LSA,
title={An accurate detection is not all you need to combat label noise in web-noisy datasets},
author={Albert, Paul and Valmadre, Jack and Arazo, Eric and Krishna, Tarun and O'Connor, Noel E and McGuinness, Kevin},
booktitle={European Conference on Computer Vision (ECCV)},
year={2024}
}