
Awesome-Mixup

<p align="center"> <img src="https://github.com/user-attachments/assets/b13d34e3-55e8-4bfa-b592-86922563d372" width=90% height=90% class="center"> </p>


<!-- ![visitors](https://visitor-badge.glitch.me/badge?page_id=Westlake-AI/Awesome-Mixup) -->

Welcome to Awesome-Mixup, a carefully curated collection of mixup algorithms implemented in PyTorch, aiming to meet the varied needs of the research community. Mixup is a family of methods that alleviate model overfitting and poor generalization. As a "data-centric" approach, mixup can be applied to various training paradigms and data modalities.
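The core idea takes only a few lines. Below is a minimal NumPy sketch (an illustration, not code from this repository) of the original input mixup: draw a ratio from a Beta distribution and form convex combinations of two inputs and their one-hot labels.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=np.random.default_rng(0)):
    """Classic input mixup: convex combination of two samples and their labels."""
    lam = rng.beta(alpha, alpha)          # mixing ratio lambda ~ Beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2       # mixed input
    y = lam * y1 + (1.0 - lam) * y2       # mixed (soft) label
    return x, y, lam

# two toy "images" with one-hot labels
x1, y1 = np.ones((3, 4, 4)), np.array([1.0, 0.0])
x2, y2 = np.zeros((3, 4, 4)), np.array([0.0, 1.0])
x, y, lam = mixup(x1, y1, x2, y2)
```

In practice the same recipe is applied per batch by mixing a batch with a shuffled copy of itself.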

If this repository has been helpful to you, please consider giving it a ⭐️ to show your support. Your support helps us reach more researchers and contributes to the growth of this resource. Thank you!

Introduction

We summarize awesome mixup data augmentation methods for visual representation learning in various scenarios from 2018 to 2024.

The list of awesome mixup augmentation methods is summarized in chronological order and is continuously updated. The main branch is organized according to Awesome-Mixup in OpenMixup and Awesome-Mix, and we are working on a comprehensive survey of mixup augmentations. You can read our survey, A Survey on Mixup Augmentations and Beyond, for more detailed information.

<p align="center"> <img src="https://github.com/user-attachments/assets/32dd37e1-0b49-4ddb-a253-fc176a241253" width=90% height=90% class="center"> </p>

Figure of Contents

You can directly view the figure of the mixup augmentation methods that we summarized.

<p align="center"> <img src="https://github.com/user-attachments/assets/366a6def-9193-4c8b-ac3f-b220e3102f4e" width=90% height=90% class="center"> </p>

Table of Contents

<details> <summary>Table of Contents</summary> <ol> <!-- <li><a href="#sample-mixup-policies-in-sl">Sample Mixup Policies in SL</a></li> --> <details> <summary>Sample Mixup Policies in SL</summary> <ol> <li><a href="#static-linear">Static Linear</a></li> <li><a href="#feature-based">Feature-based</a></li> <li><a href="#cutting-based">Cutting-based</a></li> <li><a href="#k-samples-mixup">K Samples Mixup</a></li> <li><a href="#random-policies">Random Policies</a></li> <li><a href="#style-based">Style-based</a></li> <li><a href="#saliency-based">Saliency-based</a></li> <li><a href="#attention-based">Attention-based</a></li> <li><a href="#generating-samples">Generating Samples</a></li> </ol> </details> <details> <summary>Label Mixup Policies in SL</summary> <ol> <li><a href="#optimizing-calibration">Optimizing Calibration</a></li> <li><a href="#area-based">Area-based</a></li> <li><a href="#loss-object">Loss Object</a></li> <li><a href="#random-label-policies">Random Label Policies</a></li> <li><a href="#optimizing-mixing-ratio">Optimizing Mixing Ratio</a></li> <li><a href="#generating-label">Generating Label</a></li> <li><a href="#attention-score">Attention Score</a></li> <li><a href="#saliency-token">Saliency Token</a></li> </ol> </details> <!-- <li><a href="#label-mixup-policies-in-sl">Label Mixup Policies in SL</a></li> --> <!-- <li><a href="#self-supervised-learning">Self-Supervised Learning</a></li> --> <details> <summary>Self-Supervised Learning</summary> <ol> <li><a href="#contrastive-learning">Contrastive Learning</a></li> <li><a href="#masked-image-modeling">Masked Image Modeling</a></li> </ol> </details> <details> <summary>Semi-Supervised Learning</summary> <ol> <li><a href="#semi-supervised-learning">Semi-Supervised Learning</a></li> </ol> </details> <!-- <li><a href="#semi-supervised-learning">Semi-Supervised Learning</a></li> --> <!-- <li><a href="#cv-downstream-tasks">CV Downstream Tasks</a></li> --> <details> <summary>CV Downstream Tasks</summary> <ol> 
<li><a href="#regression">Regression</a></li> <li><a href="#long-tail-distribution">Long tail distribution</a></li> <li><a href="#segmentation">Segmentation</a></li> <li><a href="#object-detection">Object Detection</a></li> </ol> </details> <details> <summary>Training Paradigms</summary> <ol> <li><a href="#federated-learning">Federated Learning</a></li> <li><a href="#adversarial-attack-and-adversarial-training">Adversarial Attack and Adversarial Training</a></li> <li><a href="#domain-adaption">Domain Adaption</a></li> <li><a href="#knowledge-distillation">Knowledge Distillation</a></li> <li><a href="#multi-modal">Multi Modal</a></li> </ol> </details> <details> <summary>Beyond Vision</summary> <ol> <li><a href="#nlp">NLP</a></li> <li><a href="#gnn">GNN</a></li> <li><a href="#3d-point">3D Point</a></li> <li><a href="#other">Other</a></li> </ol> </details> <li><a href="#analysis-and-theorem">Analysis and Theorem</a></li> <li><a href="#survey">Survey</a></li> <li><a href="#benchmark">Benchmark</a></li> <li><a href="#classification-results-on-datasets">Classification Results on Datasets</a></li> <li><a href="#related-datasets-link">Related Datasets Link</a></li> <li><a href="#contribution">Contribution</a></li> <li><a href="#license">License</a></li> <li><a href="#acknowledgement">Acknowledgement</a></li> <li><a href="#related-project">Related Project</a></li> </ol> </details>

Sample Mixup Policies in SL

<p align="center"> <img src="https://github.com/user-attachments/assets/16482af9-90c2-4413-87e5-e0df70a7a0cc" width=100% height=100% class="center"> </p>

Static Linear

<p align="right">(<a href="#top">back to top</a>)</p>

Feature-based
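Feature-based policies such as Manifold Mixup (listed in the results tables below) interpolate *hidden* representations rather than raw inputs: both samples are forwarded to a randomly chosen depth, mixed there, and the mixed state continues through the rest of the network. A minimal sketch, with plain callables standing in for real network modules:

```python
import numpy as np

def manifold_mixup(x1, x2, y1, y2, layers, alpha=2.0, rng=np.random.default_rng(0)):
    """Manifold-mixup-style sketch: mix hidden activations at a random depth."""
    k = rng.integers(len(layers) + 1)     # depth to mix at (k = 0 reduces to input mixup)
    lam = rng.beta(alpha, alpha)
    h1, h2 = x1, x2
    for layer in layers[:k]:              # forward both samples up to depth k
        h1, h2 = layer(h1), layer(h2)
    h = lam * h1 + (1.0 - lam) * h2       # mix in feature space
    for layer in layers[k:]:              # continue forward with the mixed state
        h = layer(h)
    return h, lam * y1 + (1.0 - lam) * y2

# toy "network": two elementwise layers standing in for real modules
layers = [np.tanh, lambda h: h * 2.0]
out, y = manifold_mixup(np.ones(4), -np.ones(4),
                        np.array([1.0, 0.0]), np.array([0.0, 1.0]), layers)
```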

<p align="right">(<a href="#top">back to top</a>)</p>

Cutting-based
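Cutting-based policies such as CutMix (listed in the results tables below) paste a random rectangular region of one image into another and weight the labels by the pasted area. A minimal NumPy sketch, assuming CHW arrays and one-hot labels:

```python
import numpy as np

def cutmix(x1, y1, x2, y2, alpha=1.0, rng=np.random.default_rng(0)):
    """CutMix-style sketch: paste a random box from x2 into x1.

    The label weight equals the area ratio of the pasted box.
    """
    _, H, W = x1.shape
    lam = rng.beta(alpha, alpha)
    # box sides scale with sqrt(1 - lam) so the box area is ~(1 - lam) * H * W
    cut_h, cut_w = int(H * np.sqrt(1.0 - lam)), int(W * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(H), rng.integers(W)
    top, bot = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, H)
    lft, rgt = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, W)
    x = x1.copy()
    x[:, top:bot, lft:rgt] = x2[:, top:bot, lft:rgt]
    # recompute lambda from the actual (clipped) box area
    lam = 1.0 - (bot - top) * (rgt - lft) / (H * W)
    return x, lam * y1 + (1.0 - lam) * y2, lam

xa, ya = np.ones((3, 8, 8)), np.array([1.0, 0.0])
xb, yb = np.zeros((3, 8, 8)), np.array([0.0, 1.0])
xm, ym, lam = cutmix(xa, ya, xb, yb)
```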

<p align="right">(<a href="#top">back to top</a>)</p>

K Samples Mixup
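K-sample policies generalize pairwise mixing to K inputs at once. A common choice (sketched here with NumPy; not any specific paper's exact method) is to draw the K mixing weights from a Dirichlet distribution so they are non-negative and sum to one:

```python
import numpy as np

def k_mixup(xs, ys, alpha=1.0, rng=np.random.default_rng(0)):
    """Mix K samples at once with Dirichlet-distributed weights."""
    k = len(xs)
    w = rng.dirichlet([alpha] * k)                  # weights sum to 1
    x = sum(wi * xi for wi, xi in zip(w, xs))       # mixed input
    y = sum(wi * yi for wi, yi in zip(w, ys))       # mixed soft label
    return x, y, w

# three toy samples with one-hot labels
xs = [np.full((3, 4, 4), float(i)) for i in range(3)]
ys = [np.eye(3)[i] for i in range(3)]
x, y, w = k_mixup(xs, ys)
```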

<p align="right">(<a href="#top">back to top</a>)</p>

Random Policies

<p align="right">(<a href="#top">back to top</a>)</p>

Style-based

<!-- * **Teach me how to Interpolate a Myriad of Embeddings**<br> --> <p align="right">(<a href="#top">back to top</a>)</p>

Saliency-based

<p align="right">(<a href="#top">back to top</a>)</p>

Attention-based

<p align="right">(<a href="#top">back to top</a>)</p>

Generating Samples

<p align="right">(<a href="#top">back to top</a>)</p>

Label Mixup Policies in SL

<p align="center"> <img src="https://github.com/user-attachments/assets/214c6969-6b69-48db-99b6-ffe276eb4ca4" width=100% height=100% class="center"> </p>

Optimizing Calibration

<p align="right">(<a href="#top">back to top</a>)</p>

Area-based

<p align="right">(<a href="#top">back to top</a>)</p>

Loss Object

<p align="right">(<a href="#top">back to top</a>)</p>

Random Label Policies

<p align="right">(<a href="#top">back to top</a>)</p>

Optimizing Mixing Ratio

<p align="right">(<a href="#top">back to top</a>)</p>

Generating Label

<p align="right">(<a href="#top">back to top</a>)</p>

Attention Score

<p align="right">(<a href="#top">back to top</a>)</p>

Saliency Token

<p align="right">(<a href="#top">back to top</a>)</p>

Self-Supervised Learning

Contrastive Learning

<p align="right">(<a href="#top">back to top</a>)</p>

Masked Image Modeling

<p align="right">(<a href="#top">back to top</a>)</p>

Semi-Supervised Learning

<!-- <p align="center"><img width="50%" src="https://github-production-user-asset-6210df.s3.amazonaws.com/44519745/283528510-1f3b643c-0edd-416e-9979-110f3d2be6b6.png" /></p> --> <p align="right">(<a href="#top">back to top</a>)</p>

CV Downstream Tasks

Regression

<p align="right">(<a href="#top">back to top</a>)</p>

Long-tail Distribution

<p align="right">(<a href="#top">back to top</a>)</p>

Segmentation

<p align="right">(<a href="#top">back to top</a>)</p>

Object Detection

<p align="right">(<a href="#top">back to top</a>)</p>

Other Applications

<p align="center"> <img src="https://github.com/user-attachments/assets/06c614aa-86ee-4ac7-a2b9-83c3c9fb2087" width=100% height=100% class="center"> </p>

Training Paradigms

Federated Learning

<p align="right">(<a href="#top">back to top</a>)</p>

Adversarial Attack and Adversarial Training

<p align="right">(<a href="#top">back to top</a>)</p>

Domain Adaption

<p align="right">(<a href="#top">back to top</a>)</p>

Knowledge Distillation

<p align="right">(<a href="#top">back to top</a>)</p>

Multi-Modal

<p align="right">(<a href="#top">back to top</a>)</p>

Beyond Vision

NLP

<p align="right">(<a href="#top">back to top</a>)</p>

GNN

<p align="right">(<a href="#top">back to top</a>)</p>

3D Point

<p align="right">(<a href="#top">back to top</a>)</p>

Other

<p align="right">(<a href="#top">back to top</a>)</p>

Analysis and Theorem

<p align="right">(<a href="#top">back to top</a>)</p>

Survey

Benchmark

<p align="right">(<a href="#top">back to top</a>)</p>

Classification Results on Datasets

Classification results of mixup methods on general datasets: CIFAR10 / CIFAR100, Tiny-ImageNet, and ImageNet-1K. $(\cdot)$ denotes the number of training epochs, based on ResNet18 (R18), ResNet50 (R50), ResNeXt50 (RX50), PreActResNet18 (PreActR18), and Wide-ResNet28 (WRN28-10, WRN28-8).

| Method | Publish | CIFAR10<br/>R18 | CIFAR100<br/>R18 | CIFAR100<br/>RX50 | CIFAR100<br/>PreActR18 | CIFAR100<br/>WRN28-10 | CIFAR100<br/>WRN28-8 | Tiny-ImageNet<br/>R18 | Tiny-ImageNet<br/>RX50 | ImageNet-1K<br/>R18 | ImageNet-1K<br/>R50 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MixUp | ICLR'2018 | 96.62(800) | 79.12(800) | 82.10(800) | 78.90(200) | 82.50(200) | 82.82(400) | 63.86(400) | 66.36(400) | 69.98(100) | 77.12(100) |
| CutMix | ICCV'2019 | 96.68(800) | 78.17(800) | 78.32(800) | 76.80(1200) | 83.40(200) | 84.45(400) | 65.53(400) | 66.47(400) | 68.95(100) | 77.17(100) |
| Manifold Mixup | ICML'2019 | 96.71(800) | 80.35(800) | 82.88(800) | 79.66(1200) | 81.96(1200) | 83.24(400) | 64.15(400) | 67.30(400) | 69.98(100) | 77.01(100) |
| FMix | arXiv'2020 | 96.18(800) | 79.69(800) | 79.02(800) | 79.85(200) | 82.03(200) | 84.21(400) | 63.47(400) | 65.08(400) | 69.96(100) | 77.19(100) |
| SmoothMix | CVPRW'2020 | 96.17(800) | 78.69(800) | 78.95(800) | - | - | 82.09(400) | - | - | - | 77.66(300) |
| GridMix | PR'2020 | 96.56(800) | 78.72(800) | 78.90(800) | - | - | 84.24(400) | 64.79(400) | - | - | - |
| ResizeMix | arXiv'2020 | 96.76(800) | 80.01(800) | 80.35(800) | - | 85.23(200) | 84.87(400) | 63.47(400) | 65.87(400) | 69.50(100) | 77.42(100) |
| SaliencyMix | ICLR'2021 | 96.20(800) | 79.12(800) | 78.77(800) | 80.31(300) | 83.44(200) | 84.35(400) | 64.60(400) | 66.55(400) | 69.16(100) | 77.14(100) |
| Attentive-CutMix | ICASSP'2020 | 96.63(800) | 78.91(800) | 80.54(800) | - | - | 84.34(400) | 64.01(400) | 66.84(400) | - | 77.46(100) |
| Saliency Grafting | AAAI'2022 | - | 80.83(800) | 83.10(800) | - | 84.68(300) | - | 64.84(600) | 67.83(400) | - | 77.65(100) |
| PuzzleMix | ICML'2020 | 97.10(800) | 81.13(800) | 82.85(800) | 80.38(1200) | 84.05(200) | 85.02(400) | 65.81(400) | 67.83(400) | 70.12(100) | 77.54(100) |
| Co-Mix | ICLR'2021 | 97.15(800) | 81.17(800) | 82.91(800) | 80.13(300) | - | 85.05(400) | 65.92(400) | 68.02(400) | - | 77.61(100) |
| SuperMix | CVPR'2021 | - | - | - | 79.07(2000) | 93.60(600) | - | - | - | - | 77.60(600) |
| RecursiveMix | NIPS'2022 | - | 81.36(200) | - | 80.58(2000) | - | - | - | - | - | 79.20(300) |
| AutoMix | ECCV'2022 | 97.34(800) | 82.04(800) | 83.64(800) | - | - | 85.18(400) | 67.33(400) | 70.72(400) | 70.50(100) | 77.91(100) |
| SAMix | arXiv'2021 | 97.50(800) | 82.30(800) | 84.42(800) | - | - | 85.50(400) | 68.89(400) | 72.18(400) | 70.83(100) | 78.06(100) |
| AlignMixup | CVPR'2022 | - | - | - | 81.71(2000) | - | - | - | - | - | 78.00(100) |
| MultiMix | NIPS'2023 | - | - | - | 81.82(2000) | - | - | - | - | - | 78.81(300) |
| GuidedMixup | AAAI'2023 | - | - | - | 81.20(300) | 84.02(200) | - | - | - | - | 77.53(100) |
| Catch-up Mix | AAAI'2023 | - | 82.10(400) | 83.56(400) | 82.24(2000) | - | - | 68.84(400) | - | - | 78.71(300) |
| LGCOAMix | TIP'2024 | - | 82.34(800) | 84.11(800) | - | - | - | 68.27(400) | 73.08(400) | - | - |
| AdAutoMix | ICLR'2024 | 97.55(800) | 82.32(800) | 84.42(800) | - | - | 85.32(400) | 69.19(400) | 72.89(400) | 70.86(100) | 78.04(100) |

Classification results of mixup methods on the ImageNet-1K dataset using ViT-based models: DeiT, Swin Transformer (Swin), Pyramid Vision Transformer (PVT), and ConvNeXt, trained for 300 epochs.

| Method | Publish | DeiT-Tiny | DeiT-Small | DeiT-Base | Swin-Tiny | PVT-Tiny | PVT-Small | ConvNeXt-Tiny |
|---|---|---|---|---|---|---|---|---|
| MixUp | ICLR'2018 | 74.69 | 77.72 | 78.98 | 81.01 | 75.24 | 78.69 | 80.88 |
| CutMix | ICCV'2019 | 74.23 | 80.13 | 81.61 | 81.23 | 75.53 | 79.64 | 81.57 |
| FMix | arXiv'2020 | 74.41 | 77.37 | - | 79.60 | 75.28 | 78.72 | 81.04 |
| ResizeMix | arXiv'2020 | 74.79 | 78.61 | 80.89 | 81.36 | 76.05 | 79.55 | 81.64 |
| SaliencyMix | ICLR'2021 | 74.17 | 79.88 | 80.72 | 81.37 | 75.71 | 79.69 | 81.33 |
| Attentive-CutMix | ICASSP'2020 | 74.07 | 80.32 | 82.42 | 81.29 | 74.98 | 79.84 | 81.14 |
| PuzzleMix | ICML'2020 | 73.85 | 80.45 | 81.63 | 81.47 | 75.48 | 79.70 | 81.48 |
| AutoMix | ECCV'2022 | 75.52 | 80.78 | 82.18 | 81.80 | 76.38 | 80.64 | 82.28 |
| SAMix | arXiv'2021 | 75.83 | 80.94 | 82.85 | 81.87 | 76.60 | 80.78 | 82.35 |
| TransMix | CVPR'2022 | 74.56 | 80.68 | 82.51 | 81.80 | 75.50 | 80.50 | - |
| TokenMix | ECCV'2022 | 75.31 | 80.80 | 82.90 | 81.60 | 75.60 | - | 73.97 |
| TL-Align | ICCV'2023 | 73.20 | 80.60 | 82.30 | 81.40 | 75.50 | 80.40 | - |
| SMMix | ICCV'2023 | 75.56 | 81.10 | 82.90 | 81.80 | 75.60 | 81.03 | - |
| MixPro | ICLR'2023 | 73.80 | 81.30 | 82.90 | 82.80 | 76.70 | 81.20 | - |
| LUMix | ICASSP'2024 | - | 80.60 | 80.20 | 81.70 | - | - | 82.50 |
<p align="right">(<a href="#top">back to top</a>)</p>

Related Datasets Link

Summary of datasets used by mixup methods. Links to the dataset websites are provided.

| Dataset | Type | Label | Task | Total data number | Link |
|---|---|---|---|---|---|
| MNIST | Image | 10 | Classification | 70,000 | MNIST |
| Fashion-MNIST | Image | 10 | Classification | 70,000 | Fashion-MNIST |
| CIFAR10 | Image | 10 | Classification | 60,000 | CIFAR10 |
| CIFAR100 | Image | 100 | Classification | 60,000 | CIFAR100 |
| SVHN | Image | 10 | Classification | 630,420 | SVHN |
| GTSRB | Image | 43 | Classification | 51,839 | GTSRB |
| STL10 | Image | 10 | Classification | 113,000 | STL10 |
| Tiny-ImageNet | Image | 200 | Classification | 100,000 | Tiny-ImageNet |
| ImageNet-1K | Image | 1,000 | Classification | 1,431,167 | ImageNet-1K |
| CUB-200-2011 | Image | 200 | Classification, Object Detection | 11,788 | CUB-200-2011 |
| FGVC-Aircraft | Image | 102 | Classification | 10,200 | FGVC-Aircraft |
| StanfordCars | Image | 196 | Classification | 16,185 | StanfordCars |
| Oxford Flowers | Image | 102 | Classification | 8,189 | Oxford Flowers |
| Caltech101 | Image | 101 | Classification | 9,000 | Caltech101 |
| SOP | Image | 22,634 | Classification | 120,053 | SOP |
| Food-101 | Image | 101 | Classification | 101,000 | Food-101 |
| SUN397 | Image | 899 | Classification | 130,519 | SUN397 |
| iNaturalist | Image | 5,089 | Classification | 675,170 | iNaturalist |
| CIFAR-C | Image | 10,100 | Corruption Classification | 60,000 | CIFAR-C |
| CIFAR-LT | Image | 10,100 | Long-tail Classification | 60,000 | CIFAR-LT |
| ImageNet-1K-C | Image | 1,000 | Corruption Classification | 1,431,167 | ImageNet-1K-C |
| ImageNet-A | Image | 200 | Classification | 7,500 | ImageNet-A |
| Pascal VOC 102 | Image | 20 | Object Detection | 33,043 | Pascal VOC 102 |
| MS-COCO Detection | Image | 91 | Object Detection | 164,062 | MS-COCO Detection |
| DSprites | Image | 737,280*6 | Disentanglement | 737,280 | DSprites |
| Place205 | Image | 205 | Recognition | 2,500,000 | Place205 |
| Pascal Context | Image | 459 | Segmentation | 10,103 | Pascal Context |
| ADE20K | Image | 3,169 | Segmentation | 25,210 | ADE20K |
| Cityscapes | Image | 19 | Segmentation | 5,000 | Cityscapes |
| StreetHazards | Image | 12 | Segmentation | 7,656 | StreetHazards |
| PACS | Image | 7*4 | Domain Classification | 9,991 | PACS |
| BRACS | Medical Image | 7 | Classification | 4,539 | BRACS |
| BACH | Medical Image | 4 | Classification | 400 | BACH |
| CAME-Lyon16 | Medical Image | 2 | Anomaly Detection | 360 | CAME-Lyon16 |
| Chest X-Ray | Medical Image | 2 | Anomaly Detection | 5,856 | Chest X-Ray |
| BCCD | Medical Image | 4,888 | Object Detection | 364 | BCCD |
| TJU600 | Palm-Vein Image | 600 | Classification | 12,000 | TJU600 |
| VERA220 | Palm-Vein Image | 220 | Classification | 2,200 | VERA220 |
| CoNLL2003 | Text | 4 | Classification | 2,302 | CoNLL2003 |
| 20 Newsgroups | Text | 20 | OOD Detection | 20,000 | 20 Newsgroups |
| WOS | Text | 134 | OOD Detection | 46,985 | WOS |
| SST-2 | Text | 2 | Sentiment Understanding | 68,800 | SST-2 |
| Cora | Graph | 7 | Node Classification | 2,708 | Cora |
| Citeseer | Graph | 6 | Node Classification | 3,312 | Citeseer |
| PubMed | Graph | 3 | Node Classification | 19,717 | PubMed |
| BlogCatalog | Graph | 39 | Node Classification | 10,312 | BlogCatalog |
| Google Commands | Speech | 30 | Classification | 65,000 | Google Commands |
| VoxCeleb2 | Speech | 6,112 | Sound Classification | 1,000,000+ | VoxCeleb2 |
| VCTK | Speech | 110 | Enhancement | 44,000 | VCTK |
| ModelNet40 | 3D Point Cloud | 40 | Classification | 12,311 | ModelNet40 |
| ScanObjectNN | 3D Point Cloud | 15 | Classification | 15,000 | ScanObjectNN |
| ShapeNet | 3D Point Cloud | 16 | Recognition, Classification | 16,880 | ShapeNet |
| KITTI360 | 3D Point Cloud | 80,256 | Detection, Segmentation | 14,999 | KITTI360 |
| UCF101 | Video | 101 | Action Recognition | 13,320 | UCF101 |
| Kinetics400 | Video | 400 | Action Recognition | 260,000 | Kinetics400 |
| Airfoil | Tabular | - | Regression | 1,503 | Airfoil |
| NO2 | Tabular | - | Regression | 500 | NO2 |
| Exchange-Rate | Timeseries | - | Regression | 7,409 | Exchange-Rate |
| Electricity | Timeseries | - | Regression | 26,113 | Electricity |
<p align="right">(<a href="#top">back to top</a>)</p>

Contribution

Feel free to send pull requests to add more links with the following Markdown format. Note that the abbreviation, the code link, and the figure link are optional attributes.

* **TITLE**<br>
*AUTHOR*<br>
PUBLISH'YEAR [[Paper](link)] [[Code](link)]
   <details close>
   <summary>ABBREVIATION Framework</summary>
   <p align="center"><img width="90%" src="link_to_image" /></p>
   </details>

Citation

If our work has contributed to your research, please consider citing it. Thank you! 🥰

@article{jin2024survey,
  title={A Survey on Mixup Augmentations and Beyond},
  author={Jin, Xin and Zhu, Hongyu and Li, Siyuan and Wang, Zedong and Liu, Zicheng and Yu, Chang and Qin, Huafeng and Li, Stan Z},
  journal={arXiv preprint arXiv:2409.05202},
  year={2024}
}

Current contributors include: Siyuan Li (@Lupin1998), Xin Jin (@JinXins), Zicheng Liu (@pone7), and Zedong Wang (@Jacky1128). We thank all contributors to Awesome-Mixup!

<p align="right">(<a href="#top">back to top</a>)</p>

License

This project is released under the Apache 2.0 license.

Acknowledgement

This repository is built using the OpenMixup library and the Awesome README repository.

Related Project