Awesome-Mix
This repository collects the papers covered in A Survey of Mix-based Data Augmentation: Taxonomy, Methods, Applications, and Explainability, organized according to our proposed taxonomy.
We will try to keep this list up to date. If you find any errors or missing papers, please don't hesitate to open an issue or a pull request.
A Survey of Mix-based Data Augmentation: Taxonomy, Methods, Applications, and Explainability
Chengtai Cao, Fan Zhou, Yurou Dai, and Jianping Wang
arXiv:2212.10888
Methodology
Mixup-based
Mixup
- [Mixup -- ICLR 2018] Mixup: Beyond Empirical Risk Minimization [code]
- [BC Learning -- ICLR 2018] Learning from Between-class Examples for Deep Sound Recognition [code]
- [BC Learning + -- CVPR 2018] Between-class Learning for Image Classification [code]
- [SamplePairing -- Arxiv 2018] Data Augmentation by Pairing Samples for Images Classification [code*]
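The core recipe shared by the papers above is a convex combination of two training pairs with a Beta-sampled weight. A minimal NumPy sketch of vanilla Mixup (the function name and the `alpha` default are illustrative, not taken from any specific implementation):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two (input, one-hot label) pairs with a Beta-sampled weight.

    lam ~ Beta(alpha, alpha); returns (lam*x1 + (1-lam)*x2,
    lam*y1 + (1-lam)*y2, lam), as in vanilla Mixup.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = float(rng.beta(alpha, alpha))
    x = lam * x1 + (1.0 - lam) * x2   # element-wise (pixel-wise) blend
    y = lam * y1 + (1.0 - lam) * y2   # labels mixed with the same weight
    return x, y, lam
```

In practice the mix is usually applied batch-wise against a shuffled copy of the same mini-batch; the single-pair form above just keeps the idea visible.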
Mixing in Embedding Space
- [Manifold Mixup -- ICML 2019] Manifold Mixup: Better Representations by Interpolating Hidden States [code*]
- [AlignMixup -- CVPR 2022] AlignMixup: Improving Representations by Interpolating Aligned Features [code]
- [NFM -- ICLR 2022] Noisy Feature Mixup [code]
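These methods interpolate hidden representations rather than raw inputs. A minimal sketch in the spirit of Manifold Mixup, assuming the network can be treated as a list of layer callables (the helper name and signature are hypothetical):

```python
import numpy as np

def manifold_mixup(layers, k, x1, x2, y1, y2, alpha=2.0, rng=None):
    """Forward two examples to layer k, interpolate their hidden
    states there, and continue with the single mixed activation."""
    rng = np.random.default_rng() if rng is None else rng
    lam = float(rng.beta(alpha, alpha))
    h1, h2 = x1, x2
    for layer in layers[:k]:            # forward both examples up to layer k
        h1, h2 = layer(h1), layer(h2)
    h = lam * h1 + (1.0 - lam) * h2     # mix hidden states (k = 0 is plain Mixup)
    for layer in layers[k:]:            # continue with the mixed activation
        h = layer(h)
    y = lam * y1 + (1.0 - lam) * y2     # labels mixed with the same weight
    return h, y
```

Manifold Mixup samples `k` uniformly per batch; fixing `k = 0` recovers input-space Mixup, which is why the two are often implemented in one code path.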
Adaptive Mix Strategy
- [AdaMixUp -- AAAI 2019] MixUp as Locally Linear Out-of-manifold Regularization [code*]
- [MetaMixUp -- TNNLS 2021] MetaMixUp: Learning Adaptive Interpolation Policy of MixUp with Meta-Learning
- [AutoMix -- ECCV 2022] AutoMix: Unveiling the Power of Mixup for Stronger Classifiers [code]
- [CAMixup -- ICLR 2021] Combining Ensembles and Data Augmentation Can Harm Your Calibration [code]
- [Nonlinear Mixup -- AAAI 2020] Nonlinear Mixup: Out-of-manifold Data Augmentation for Text Classification
- [AMP -- EMNLP 2021] Adversarial Mixing Policy for Relaxing Locally Linear Constraints in Mixup [code]
- [DM -- Arxiv 2022] Decoupled Mixup for Data-efficient Learning
- [Remix -- ECCV 2022] Remix: Rebalanced Mixup
Sample Selection
- [LADA -- EMNLP 2020] Local Additivity based Data Augmentation for Semi-supervised NER [code]
- [Local Mixup -- Arxiv 2022] Preventing Manifold Intrusion with Locality: Local Mixup [code]
- [Pani -- Arxiv 2019] Patch-level Neighborhood Interpolation: A General and Effective Graph-based Regularization Strategy
- [HypMix -- EMNLP 2021] HypMix: Hyperbolic Interpolative Data Augmentation [code]
- [SAMix -- Arxiv 2021] Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup
- [GenLabel -- ICML 2022] GenLabel: Mixup Relabeling using Generative Models [code]
- [DMix -- ACL 2022] DMix: Adaptive Distance-aware Interpolative Mixup [code]
- [M-Mix -- KDD 2022] M-Mix: Generating Hard Negatives via Multi-sample Mixing for Contrastive Learning [code]
- [CSANMT -- ACL 2022] Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [code]
- [RegMixup -- NIPS 2022] RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out-of-Distribution Robustness [code]
Saliency & Style for Guidance
- [SuperMix -- CVPR 2021] SuperMix: Supervising the Mixing Data Augmentation [code]
- [Superpixel-Mix -- BMVC 2021] Robust Semantic Segmentation with Superpixel-Mix
- [StyleMix -- CVPR 2021] StyleMix: Separating Content and Style for Enhanced Data Augmentation [code]
- [MixStyle -- ICLR 2021] Domain Generalization with MixStyle [code]
- [MoEx -- CVPR 2021] On Feature Normalization and Data Augmentation [code]
- [Mixup-with-AUM-and-SM -- ACL 2022] On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency [code]
- [XAI Mixup -- TKDD 2022] Explainability-based Mixup Approach for Text Data Augmentation
- [TokenMixup -- Arxiv 2022] TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers [code]
- [SciMix -- Arxiv 2022] Swapping Semantic Contents for Mixing Images
Diversity in Mixup
- [BatchMixup -- IJCNLP 2021] BatchMixup: Improving Training by Interpolating Hidden States of the Entire Mini-batch
- [K-Mixup -- Arxiv 2021] K-Mixup Regularization for Deep Learning via Optimal Transport
- [MultiMix -- Arxiv 2022] Teach Me How to Interpolate a Myriad of Embeddings
- [MixMo -- CVPR 2021] MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks [code]
- [DMixup & DCutmix -- Arxiv 2021] Observations on K-image Expansion of Image-mixing Augmentation for Classification
- [PixMix -- CVPR 2022] PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures [code]
Miscellaneous Mixup Methods
- [GIF -- Arxiv 2021] Guided Interpolation for Adversarial Training
- [MWh -- ICIG 2021] Mixup Without Hesitation [code]
- [AutoMix -- ECCV 2020] AutoMix: Mixup Networks for Sample Interpolation via Cooperative Barycenter Learning [code*]
- [RegMix -- Arxiv 2021] RegMix: Data Mixing Augmentation for Regression
Cutmix-based
Cutmix
- [Cutmix -- ICCV 2019] CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features [code]
- [MixedExample -- WACV 2019] Improved Mixed-example Data Augmentation [code]
- [RICAP -- ACML 2018] RICAP: Random Image Cropping and Patching Data Augmentation for Deep CNNs [code*]
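In contrast to Mixup's pixel-wise blending, CutMix pastes a rectangular region from one image into another and mixes the labels by the pasted-area ratio. A hedged NumPy sketch (the box sampling follows the common square-root-area heuristic; names are illustrative):

```python
import numpy as np

def cutmix(img1, lab1, img2, lab2, alpha=1.0, rng=None):
    """Paste a random box from img2 into img1 (CutMix-style).

    Images are HxWxC arrays; the label weight equals the fraction
    of img1's area that survives the paste."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img1.shape[:2]
    lam = float(rng.beta(alpha, alpha))
    cut_h = int(h * np.sqrt(1.0 - lam))          # box area ~ (1 - lam) * H * W
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = int(rng.integers(h)), int(rng.integers(w))  # box center
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = img1.copy()
    mixed[top:bottom, left:right] = img2[top:bottom, left:right]
    lam = 1.0 - (bottom - top) * (right - left) / (h * w)  # actual kept ratio
    return mixed, lam * lab1 + (1.0 - lam) * lab2, lam
```

Recomputing `lam` from the clipped box, rather than reusing the Beta sample, keeps the label weight consistent with the pixels actually kept when the box spills over the image border.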
Integration with Saliency Information
- [Attentive Cutmix -- ICASSP 2020] Attentive CutMix: An Enhanced Data Augmentation Approach for Deep Learning based Image Classification [code*]
- [FocusMix -- ICTC 2020] Where to Cut and Paste: Data Regularization with Selective Features
- [TransMix -- CVPR 2022] TransMix: Attend to Mix for Vision Transformers [code]
- [TL-Align -- Arxiv 2021] Token-Label Alignment for Vision Transformers [code]
- [SaliencyMix -- ICLR 2021] SaliencyMix: A Saliency Guided Data Augmentation Strategy for Better Regularization [code]
- [Puzzle Mix -- ICML 2020] Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup [code]
- [SSMix -- ACL 2021] SSMix: Saliency-based Span Mixup for Text Classification [code]
- [SnapMix -- AAAI 2021] SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data [code]
- [Attribute Mix -- VCIP 2020] Attribute Mix: Semantic Data Augmentation for Fine Grained Recognition
- [ResizeMix -- Arxiv 2020] ResizeMix: Mixing Data with Preserved Object Information and True Labels [code*]
Improved Divergence
- [Saliency Grafting -- AAAI 2022] Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated Label Mixing
- [Co-Mixup -- ICLR 2021] Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity [code]
- [RecursiveMix -- Arxiv 2022] RecursiveMix: Mixed Learning with History [code]
Border Smoothing
- [SmoothMix -- CVPR 2020] SmoothMix: A Simple Yet Effective Data Augmentation to Train Robust Classifiers [code]
- [HMix & GMix -- Arxiv 2022] A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective [code]
Other Cutmix Techniques
- [PatchUp -- AAAI 2022] PatchUp: A Regularization Technique for Convolutional Neural Networks [code]
- [TokenMix -- ECCV 2022] TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers [code]
- [ScoreMix -- CVPR 2022] ScoreNet: Learning Non-uniform Attention and Augmentation for Transformer-based Histopathological Image Classification
- [GridMix -- Pattern Recognition 2021] GridMix: Strong Regularization through Local Context Mapping
- [PatchMix -- BMVC 2021] Evolving Image Compositions for Feature Representation Learning
- [ChessMix -- SIBGRAPI 2021] ChessMix: Spatial Context Data Augmentation for Remote Sensing Semantic Segmentation [code]
- [ICC -- ICPS 2021] Intra-class Cutmix for Unbalanced Data Augmentation
Beyond Mixup & Cutmix
Mixing with Itself
- [DJMix -- Arxiv 2021] DJMix: Unsupervised Task-agnostic Augmentation for Improving Robustness
- [CutBlur -- CVPR 2020] Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Analysis and a New Strategy [code]
Incorporating Multiple MixDA Approaches
- [RandomMix -- Arxiv 2022] RandomMix: A Mixed Sample Data Augmentation Method with Multiple Mixed Modes
- [AugRmixAT -- ICME 2022] AugRmixAT: A Data Processing and Training Method for Improving Multiple Robustness and Generalization Performance
Integrating with Other DA Methods
- [SuperpixelGridMix -- Arxiv 2022] SuperpixelGridCut, SuperpixelGridMean and SuperpixelGridMix Data Augmentation [code]
- [AugMix -- ICLR 2020] AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty [code]
- [StackMix -- UAI 2021] StackMix: A Complementary Mix Algorithm
- [ClassMix -- WACV 2021] ClassMix: Segmentation-based Data Augmentation for Semi-supervised Learning [code]
- [CropMix -- Arxiv 2022] CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping [code]
MixDA Applications
Semi-Supervised Learning
- [ICT -- IJCAI 2019] Interpolation Consistency Training for Semi-Supervised Learning [code]
- [MixMatch -- NIPS 2019] MixMatch: A Holistic Approach to Semi-Supervised Learning [code]
- [ReMixMatch -- Arxiv 2019] ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring [code]
- [DivideMix -- ICLR 2020] DivideMix: Learning with Noisy Labels as Semi-supervised Learning [code]
- [CowMix/CowOut -- Arxiv 2020] Milking CowMask for Semi-Supervised Image Classification [code]
- [MixPUL -- Arxiv 2020] MixPUL: Consistency-based Augmentation for Positive and Unlabeled Learning [code]
- [P<sup>3</sup>Mix -- ICLR 2022] Who is Your Right Mixup Partner in Positive and Unlabeled Learning
Contrastive Learning
- [MixCo -- Arxiv 2022] MixCo: Mix-up Contrastive Learning for Visual Representation [code]
- [Core-tuning -- NIPS 2021] Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-regularized Fine-tuning [code]
- [Feature Transformation -- ICCV 2021] Improving Contrastive Learning by Visualizing Feature Transformation [code]
- [Mochi -- NIPS 2020] Hard Negative Mixing for Contrastive Learning [code]
- [Comix -- NIPS 2021] Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing [code]
- [MixSiam -- Arxiv 2021] MixSiam: A Mixture-based Approach to Self-Supervised Representation Learning
- [Un-Mix -- AAAI 2022] Un-mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning [code]
- [ScaleMix -- CVPR 2022] On the Importance of Asymmetry for Siamese Representation Learning [code]
- [BSIM -- Arxiv 2020] Beyond Single Instance Multi-view Unsupervised Representation Learning
- [i-Mix -- ICLR 2021] i-mix: A Domain-agnostic Strategy for Contrastive Representation Learning [code]
- [MCL -- PRL 2022] Mixing up Contrastive Learning: Self-Supervised Representation Learning for Time Series [code]
- [CLIM -- BMVC 2020] Center-wise Local Image Mixture For Contrastive Representation Learning
- [SDMP -- CVPR 2022] A Simple Data Mixing Prior for Improving Self-Supervised Learning [code]
- [Similarity Mixup -- CVPR 2022] Recall@k Surrogate Loss with Large Batches and Similarity Mixup [code]
- [ProGC-Mix -- ICML 2022] ProGCL: Rethinking Hard Negative Mining in Graph Contrastive Learning [code]
Metric Learning
- [Embedding Expansion -- CVPR 2020] Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning [code]
- [Metrix -- ICLR 2022] It Takes Two to Tango: Mixup for Deep Metric Learning [code]
Adversarial Training
- [IAT -- AISec 2019] Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Too Much Accuracy
- [AVmixup -- CVPR 2020] Adversarial Vertex Mixup: Toward Better Adversarially Robust Generalization [code]
- [AMDA -- ACL/IJCNLP 2021] Better Robustness by More Coverage: Adversarial and Mixup Data Augmentation for Robust Finetuning [code]
- [M-TLAT -- ECCV 2020] Addressing Neural Network Robustness with Mixup and Targeted Labeling Adversarial Training
- [AOM -- Arxiv 2021] Adversarially Optimized Mixup for Robust Classification
- [MixACM -- NIPS 2021] MixACM: Mixup-based Robustness Transfer via Distillation of Activated Channel Maps [code]
- [Mixup-SSAT -- Arxiv 2022] Semi-Supervised Semantics-guided Adversarial Training for Trajectory Prediction
- [MI -- ICLR 2020] Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks [code]
Generative Models
- [Shot VAE -- AAAI 2021] SHOT-VAE: Semi-Supervised Deep Generative Models with Label-aware ELBO Approximations [code]
- [AAE -- ICPR 2018] Data Augmentation via Latent Space Interpolation for Image Classification
- [VarMixup -- Arxiv 2020] VarMixup: Exploiting the Latent Space for Robust Training and Inference
- [ACAI -- ICLR 2019] Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer [code]
- [AMR -- NIPS 2019] On Adversarial Mixup Resynthesis [code]
Domain Adaptation
- [VMT -- Arxiv 2019] Virtual Mixup Training for Unsupervised Domain Adaptation [code]
- [IIMT -- Arxiv 2020] Improve Unsupervised Domain Adaptation with Mixup Training
- [DM-ADA -- AAAI 2020] Adversarial Domain Adaptation with Domain Mixup
- [DMRL -- ECCV 2020] Dual Mixup Regularized Learning for Adversarial Domain Adaptation [code]
- [SLM -- NIPS 2021] Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation [code]
Natural Language Processing
- [WordMixup & SenMixup -- Arxiv 2019] Augmenting Data with Mixup for Sentence Classification: An Empirical Study
- [Mixup-Transformer -- COLING 2020] Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks
- [Calibrated-BERT-Fine-Tuning -- EMNLP 2020] Calibrated Language Model Fine-tuning for In- and Out-of-distribution Data [code]
- [Emix -- COLING 2020] Augmenting NLP Models using Latent Feature Interpolations
- [TreeMix -- NAACL 2022] TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding
- [MixText -- ACL 2020] MixText: Linguistically-informed Interpolation of Hidden Space for Semi-Supervised Text Classification [code]
- [SeqMix -- EMNLP 2020] Sequence-level Mixed Sample Data Augmentation [code]
- [SeqMix -- EMNLP 2020] SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup [code]
- [AdvAug -- ACL 2020] AdvAug: Robust Adversarial Augmentation for Neural Machine Translation
- [STEMM -- ACL 2022] STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation [code]
- [MixDiversity -- EMNLP 2021] Mixup Decoding for Diverse Machine Translation
- [XMixup -- ICLR 2022] Enhancing Cross-lingual Transfer by Manifold Mixup [code]
- [mXEncDec -- ACL 2022] Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation
Graph Neural Networks
- [GraphMix -- AAAI 2021] GraphMix: Improved Training of GNNs for Semi-Supervised Learning [code]
- [PMRGNN -- Symmetry 2022] Graph Mixed Random Network based on PageRank
- [NodeAug -- CDS 2021] Node Augmentation Methods for Graph Neural Network based Object Classification
- [MixGNN -- WWW 2021] Mixup for Node and Graph Classification
- [GraphMixup -- Arxiv 2022] GraphMixup: Improving Class-imbalanced Node Classification on Graphs by Self-Supervised Context Prediction
- [G-Mixup -- ICML 2022] G-Mixup: Graph Data Augmentation for Graph Classification
- [Graph Transplant -- AAAI 2022] Graph Transplant: Node Saliency-guided Graph Mixup with Local Structure Preservation
- [GraphSMOTE -- Arxiv 2022] Synthetic Over-sampling for Imbalanced Node Classification with Graph Neural Networks [code]
Federated Learning
- [Mix2FLD -- CL 2020] Mix2FLD: Downlink Federated Learning after Uplink Federated Distillation with Two-way Mixup
- [XORMixup -- Arxiv 2020] XOR Mixup: Privacy-preserving Data Augmentation for One-shot Federated Learning
- [FedMix -- ICLR 2021] FedMix: Approximation of Mixup under Mean Augmented Federated Learning [code]
Other Applications
Point Cloud
- [PointMix -- ECCV 2020] PointMixup: Augmentation for Point Clouds [code]
- [PA-AUG -- IROS 2021] Part-aware Data Augmentation for 3D Object Detection in Point Cloud [code]
- [RSMix -- CVPR 2021] Regularization Strategy for Point Cloud via Rigidly Mixed Sample [code]
Multi-modal Learning
- [VLMixer -- ICML 2022] VLMixer: Unpaired Vision-language Pre-training via Cross-Modal CutMix [code]
Explainability Analysis of MixDA
Vicinal Risk Minimization
- [VRM -- NeurIPS 2000] Vicinal Risk Minimization
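Mixup admits a direct reading as vicinal risk minimization: instead of the loss at each training point, one minimizes the expected loss under a vicinal distribution around it, with interpolation defining that distribution. A sketch of this view (the symbols are the commonly used ones, not tied to any single paper's notation):

```latex
\hat{R}_\nu(f) \;=\; \frac{1}{n}\sum_{i=1}^{n}
  \mathbb{E}_{(\tilde{x},\tilde{y}) \sim \nu(\cdot \mid x_i, y_i)}
  \big[\, \ell\big(f(\tilde{x}), \tilde{y}\big) \,\big],
\qquad
\nu_{\text{mixup}}:\;
\tilde{x} = \lambda x_i + (1-\lambda)\, x_j,\;\;
\tilde{y} = \lambda y_i + (1-\lambda)\, y_j,\;\;
\lambda \sim \mathrm{Beta}(\alpha, \alpha).
```

Setting $\nu$ to a point mass at $(x_i, y_i)$ recovers empirical risk minimization, which is why Mixup-style methods are often framed as a relaxation of ERM.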
Model Regularization
- [Regularization -- Arxiv 2020] On Mixup Regularization
- [Regularization -- IEEE Access 2018] Understanding Mixup Training Methods [code]
- [Regularization -- ICLR 2021] How does Mixup Help with Robustness and Generalization?
- [Regularization -- Arxiv 2022] A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective [code]
Uncertainty & Calibration
- [Uncertainty & Calibration -- ICML 2022] When and How Mixup Improves Calibration
- [Uncertainty & Calibration -- NIPS 2019] On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks
License
This project is released under the Apache 2.0 license.