<!-- <h1 align="center"> Awesome Imbalanced Learning </h1> --> <!-- ![](https://raw.githubusercontent.com/ZhiningLiu1998/figures/master/awesome-imbalanced-learning/awesome_imbalanced_learning_header.png) -->

<h2 align="center"> Curated imbalanced learning papers, codes, and libraries </h2> <p align="center"> <img src="https://awesome.re/badge.svg"> <!-- <a href="https://github.com/ZhiningLiu1998/awesome-imbalanced-learning"> <img src="https://img.shields.io/badge/Imbalanced-Learning-orange"> </a> --> <img src="https://img.shields.io/github/stars/ZhiningLiu1998/awesome-imbalanced-learning"> <img src="https://img.shields.io/github/forks/ZhiningLiu1998/awesome-imbalanced-learning"> <!-- <img src="https://img.shields.io/github/issues/ZhiningLiu1998/awesome-imbalanced-learning"> --> <!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section --> <a href="https://github.com/ZhiningLiu1998/awesome-imbalanced-learning#contributors-"><img src="https://img.shields.io/badge/all_contributors-4-orange.svg"></a> <!-- ALL-CONTRIBUTORS-BADGE:END --> <!-- <a href="https://github.com/ZhiningLiu1998/awesome-imbalance d-learning/graphs/traffic"> <img src="https://visitor-badge.glitch.me/badge?page_id=ZhiningLiu1998.awesome-imbalanced-learning&left_text=Hi!%20visitors"> </a> --> <img src="https://img.shields.io/github/license/ZhiningLiu1998/awesome-imbalanced-learning"> <a href="https://github.com/ZhiningLiu1998/imbalanced-ensemble"> <img src="https://img.shields.io/badge/Python Toolbox-IMBENS-blueviolet"> </a> </p> <h3 align="center"><b> Language: [<a href="https://github.com/ZhiningLiu1998/awesome-imbalanced-learning">English</a>] [<a href="https://github.com/ZhiningLiu1998/awesome-imbalanced-learning/blob/master/README_CN.md">中文</a>] </b></h3> <!-- **A curated list of imbalanced learning papers, codes, frameworks and libraries.** -->

Class imbalance (also known as the long-tail problem) occurs when the classes in a classification problem are not represented equally, which is quite common in practice. Examples include fraud detection, prediction of rare adverse drug reactions, and prediction of gene families. Failure to account for class imbalance often leads to inaccurate predictions and degraded performance in many classification algorithms. Imbalanced learning aims to tackle the class imbalance problem and learn an unbiased model from imbalanced data.
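As a rough illustration of why ignoring class imbalance is risky, the sketch below builds a toy fraud-detection label set and scores a classifier that always predicts the majority class. The dataset (990 legitimate vs. 10 fraudulent samples) and the helper names `imbalance_ratio`, `majority_baseline`, `accuracy`, and `recall` are assumptions made up for this example; they do not come from any library listed in this repository.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def majority_baseline(train_labels):
    """Return a predictor that ignores its input and always outputs the majority class."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda _x: majority


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of truly positive samples that were predicted as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical toy data: 0 = legitimate transaction, 1 = fraud.
labels = [0] * 990 + [1] * 10

predict = majority_baseline(labels)
preds = [predict(None) for _ in labels]

print(imbalance_ratio(labels))             # 99.0
print(accuracy(labels, preds))             # 0.99
print(recall(labels, preds, positive=1))   # 0.0
```

The baseline reaches 99% accuracy while detecting none of the fraud cases, which is why work on imbalanced learning typically reports minority-class metrics such as recall rather than plain accuracy.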

Inspired by awesome-machine-learning. In this repository:

Note:

<h3> <font color='red'>What's new: </font> </h3> <!-- **Disclosure:** Zhining Liu is an author on the following works: **[imbalanced-ensemble](https://github.com/ZhiningLiu1998/imbalanced-ensemble), [Self-paced Ensemble](https://github.com/ZhiningLiu1998/self-paced-ensemble), [MESA](https://github.com/ZhiningLiu1998/mesa)**. -->

Check out Zhining's other open-source projects!

<table style="font-size:15px;"> <tr> <!-- <td align="center"><a href="http://zhiningliu.com"><img src="https://avatars.githubusercontent.com/u/26108487?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Zhining Liu</b></sub></a></td> --> <td align="center"><a href="https://github.com/ZhiningLiu1998/imbalanced-ensemble"><img src="https://raw.githubusercontent.com/ZhiningLiu1998/figures/master/thumbnails/imbens-thumb.png" height="80px" alt=""/><br /><sub><b>Imbalanced-Ensemble [PythonLib]</b></sub></a><br /> <a href="https://github.com/ZhiningLiu1998/imbalanced-ensemble/stargazers"> <img alt="GitHub stars" src="https://img.shields.io/github/stars/ZhiningLiu1998/imbalanced-ensemble?style=social"> </a> </td> <td align="center"><a href="https://github.com/ZhiningLiu1998/awesome-awesome-machine-learning"><img src="https://raw.githubusercontent.com/ZhiningLiu1998/figures/master/thumbnails/awesomeml-thumb.png" height="80px" alt=""/><br /><sub><b>Machine Learning [Awesome]</b></sub></a><br /> <a href="https://github.com/ZhiningLiu1998/awesome-awesome-machine-learning/stargazers"> <img alt="GitHub stars" src="https://img.shields.io/github/stars/ZhiningLiu1998/awesome-awesome-machine-learning?style=social"> </a> </td> <td align="center"><a href="https://github.com/ZhiningLiu1998/self-paced-ensemble"><img src="https://raw.githubusercontent.com/ZhiningLiu1998/figures/master/thumbnails/spe-thumb-1.png" height="80px" alt=""/><br /><sub><b>Self-paced Ensemble [ICDE]</b></sub></a><br /> <a href="https://github.com/ZhiningLiu1998/self-paced-ensemble/stargazers"> <img alt="GitHub stars" src="https://img.shields.io/github/stars/ZhiningLiu1998/self-paced-ensemble?style=social"> </a> </td> <td align="center"><a href="https://github.com/ZhiningLiu1998/mesa"><img src="https://raw.githubusercontent.com/ZhiningLiu1998/figures/master/thumbnails/mesa-thumb.png" height="80px" alt=""/><br /><sub><b>Meta-Sampler [NeurIPS]</b></sub></a><br /> <a 
href="https://github.com/ZhiningLiu1998/mesa/stargazers"> <img alt="GitHub stars" src="https://img.shields.io/github/stars/ZhiningLiu1998/mesa?style=social"> </a> </td> </tr> </table>

Table of Contents

1. Frameworks and Libraries

1.1 Python

1.2 R

1.3 Java

1.4 Scala

1.5 Julia

2. Research Papers

2.1 Surveys

2.2 Ensemble Learning

2.2.1 General ensemble

2.2.2 Boosting-based

2.2.3 Bagging-based

2.2.4 Cost-sensitive ensemble

2.3 Data resampling

2.3.1 Over-sampling

NOTE: See more over-sampling methods at smote-variants.

2.3.2 Under-sampling

2.3.3 Hybrid-sampling

2.4 Cost-sensitive Learning

2.5 Deep Learning

2.5.1 Surveys

2.5.2 Graph Data Mining

2.5.3 Hard example mining

2.5.4 Loss function engineering

2.5.5 Meta-learning

2.5.6 Representation Learning

2.5.7 Posterior Recalibration

2.5.8 Semi/Self-supervised Learning

2.5.9 Curriculum Learning

2.5.10 Two-phase Training

2.5.11 Network Architecture

2.5.12 Deep Generative Model

2.5.13 Imbalanced Regression

2.5.14 Data Augmentation

3. Miscellaneous

3.1 Datasets

3.2 GitHub Repositories

3.2.1 Algorithms & Utilities & Jupyter Notebooks

3.2.2 Paper list

3.2.3 Slides

Contributors ✨

Thanks go to these wonderful people (emoji key):

<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section --> <!-- prettier-ignore-start --> <!-- markdownlint-disable --> <table> <tbody> <tr> <td align="center" valign="top" width="14.28%"><a href="http://zhiningliu.com"><img src="https://avatars.githubusercontent.com/u/26108487?v=4?s=100" width="100px;" alt="Zhining Liu"/><br /><sub><b>Zhining Liu</b></sub></a><br /><a href="https://github.com/ZhiningLiu1998/awesome-imbalanced-learning/commits?author=ZhiningLiu1998" title="Code">💻</a> <a href="#maintenance-ZhiningLiu1998" title="Maintenance">🚧</a> <a href="#translation-ZhiningLiu1998" title="Translation">🌍</a></td> <td align="center" valign="top" width="14.28%"><a href="https://github.com/AshinZeng"><img src="https://avatars.githubusercontent.com/u/37720752?v=4?s=100" width="100px;" alt="曾阿信"/><br /><sub><b>曾阿信</b></sub></a><br /><a href="#maintenance-AshinZeng" title="Maintenance">🚧</a></td> <td align="center" valign="top" width="14.28%"><a href="https://wjun0830.github.io/"><img src="https://avatars.githubusercontent.com/u/31557552?v=4?s=100" width="100px;" alt="WonJun Moon"/><br /><sub><b>WonJun Moon</b></sub></a><br /><a href="https://github.com/ZhiningLiu1998/awesome-imbalanced-learning/commits?author=wjun0830" title="Code">💻</a></td> <td align="center" valign="top" width="14.28%"><a href="https://github.com/liugangcode"><img src="https://avatars.githubusercontent.com/u/83067064?v=4?s=100" width="100px;" alt="Gang Liu"/><br /><sub><b>Gang Liu</b></sub></a><br /><a href="https://github.com/ZhiningLiu1998/awesome-imbalanced-learning/commits?author=liugangcode" title="Code">💻</a></td> </tr> </tbody> </table> <!-- markdownlint-restore --> <!-- prettier-ignore-end --> <!-- ALL-CONTRIBUTORS-LIST:END -->

This project follows the all-contributors specification. Contributions of any kind are welcome!