Offensive AI Compilation

A curated list of useful resources that cover Offensive AI.

📝 Contents 📝

🚫 Abuse 🚫

Exploiting the vulnerabilities of AI models.

🧠 Adversarial Machine Learning 🧠

Adversarial Machine Learning is responsible for assessing the weaknesses of machine learning models and providing countermeasures.

⚑ Attacks ⚑

Attacks are organized into four types: extraction, inversion, poisoning, and evasion.

Adversarial Machine Learning attacks

🔒 Extraction 🔒

An extraction attack tries to steal the parameters and hyperparameters of a model by sending it queries chosen to maximize the information extracted.

Extraction attack

Depending on the adversary's knowledge of the model, white-box and black-box attacks can be performed.

In the simplest white-box case (when the adversary has full knowledge of the model's form, e.g., a sigmoid function), the adversary can build a system of linear equations from the model's responses and solve it easily.
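As a minimal sketch of this white-box case, assume the victim is a logistic regression model that returns its confidence score and that the adversary knows this form. Applying the inverse of the sigmoid (the logit) to each response turns every query into one linear equation in the unknown weights and bias. The names `make_victim` and `extract_logistic_regression` are hypothetical, and the queries are assumed to be the origin plus the unit vectors, so the system can be solved one unknown at a time.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    # Inverse of the sigmoid: recovers w.x + b from a confidence score.
    return math.log(p / (1.0 - p))


def make_victim(weights, bias):
    """Hypothetical victim: a logistic regression exposing only its confidence."""
    def predict_proba(x):
        return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return predict_proba


def extract_logistic_regression(query, n_features):
    """Recover weights and bias with n_features + 1 queries."""
    # Querying the origin gives logit = b.
    bias = logit(query([0.0] * n_features))
    weights = []
    for i in range(n_features):
        unit = [0.0] * n_features
        unit[i] = 1.0
        # Querying the i-th unit vector gives logit = w_i + b.
        weights.append(logit(query(unit)) - bias)
    return weights, bias


# Illustrative secret parameters (assumed values, not from any real model).
secret_w, secret_b = [0.8, -1.5, 0.3], 0.25
victim = make_victim(secret_w, secret_b)
stolen_w, stolen_b = extract_logistic_regression(victim, n_features=3)
```

With `n` features the sketch needs only `n + 1` queries. It relies on the victim returning unrounded confidence scores; a victim that returns only labels would not leak the logits this way.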

In the generic case, where the adversary lacks sufficient knowledge of the model, a substitute model is used. The substitute is trained on the queries sent to the original model and the responses it returns, so that it imitates the original model's functionality.
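The substitute-model idea can be sketched as follows, under several assumptions: the victim is a black box that returns only a label, its secret rule happens to be linear, and the adversary fits a small logistic regression to the query/response pairs. All names (`victim_predict`, `train_substitute`, `substitute_predict`) and all numeric values are hypothetical.

```python
import math
import random


def victim_predict(x):
    """Black box: the adversary sees only the returned label (hypothetical secret rule)."""
    return 1 if 2.0 * x[0] - 1.0 * x[1] + 0.5 > 0 else 0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_substitute(samples, labels, epochs=500, lr=1.0):
    """Fit a logistic regression to (query, response) pairs with batch gradient descent."""
    w, b = [0.0, 0.0], 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            grad_w[0] += err * x[0]
            grad_w[1] += err * x[1]
            grad_b += err
        w = [w[0] - lr * grad_w[0] / n, w[1] - lr * grad_w[1] / n]
        b -= lr * grad_b / n
    return w, b


def substitute_predict(model, x):
    w, b = model
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


rng = random.Random(0)

# Step 1: query the victim and record its answers.
queries = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(300)]
responses = [victim_predict(q) for q in queries]

# Step 2: train the substitute on those answers.
substitute = train_substitute(queries, responses)

# Step 3: measure how often the substitute agrees with the victim on unseen inputs.
fresh = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(1000)]
agreement = sum(
    substitute_predict(substitute, x) == victim_predict(x) for x in fresh
) / len(fresh)
```

Agreement on fresh inputs is the natural success metric here: the adversary never learns the original parameters, only a model that behaves like the original.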

White-box and black-box extraction attacks

⚠️ Limitations ⚠️
🛡️ Defensive actions 🛡️
🔗 Useful links 🔗
⬅️ Inversion (or inference) ⬅️

Inversion attacks aim to reverse the information flow of a machine learning model.

Inference attack

They give an adversary knowledge about the model that was never intended to be shared.

They can reveal the training data, or information such as statistical properties of the model.
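One way to picture how training data can leak is a confidence-threshold membership test, sketched below. This is only an illustration under strong assumptions: the "model" is a hypothetical, deliberately overfit function whose confidence peaks on points it memorized, and the adversary simply guesses "member" whenever the confidence is very high. The names `make_overfit_model` and `infer_membership` and the threshold value are assumptions.

```python
import math


def make_overfit_model(training_points, sharpness=10.0):
    """Hypothetical model whose confidence peaks on points it memorized during training."""
    def confidence(x):
        nearest_sq = min(
            (x[0] - t[0]) ** 2 + (x[1] - t[1]) ** 2 for t in training_points
        )
        return math.exp(-sharpness * nearest_sq)
    return confidence


def infer_membership(confidence, candidate, threshold=0.9):
    """Guess that `candidate` was in the training set when the model is very confident."""
    return confidence(candidate) >= threshold


training_set = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]
outsiders = [(0.5, 0.5), (1.5, 0.0), (3.0, 3.0)]
model = make_overfit_model(training_set)

member_guesses = [infer_membership(model, p) for p in training_set]
outsider_guesses = [infer_membership(model, p) for p in outsiders]
```

Real models leak far less cleanly than this toy; the sketch only shows why a gap between confidence on seen and unseen data is exploitable.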

Three types are possible:

🛡️ Defensive actions 🛡️
🔗 Useful links 🔗
💉 Poisoning 💉

Poisoning attacks aim to corrupt the training set so that the machine learning model loses accuracy.

Poisoning attack

This attack is difficult to detect when performed on the training data, and it can propagate to every model trained on the same data.

The adversary seeks either to destroy the availability of the model, by modifying the decision boundary so that it produces incorrect predictions, or to create a backdoor in the model. In the latter case, the model behaves correctly (returning the desired predictions) in most cases, except for certain inputs specially crafted by the adversary that produce undesired results. The adversary can thus manipulate the model's predictions and launch future attacks.
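The availability case can be sketched with a toy example: a one-dimensional nearest-centroid classifier whose decision boundary moves when the adversary injects a few mislabeled points into the training set. The classifier, the data, and the poison values are all assumptions chosen so the effect is easy to follow by hand.

```python
def fit_centroids(points, labels):
    """Nearest-centroid classifier: store the mean of each class."""
    centroids = {}
    for label in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == label]
        centroids[label] = sum(members) / len(members)
    return centroids


def predict(centroids, x):
    # Pick the class whose centroid is closest to x.
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids, points, labels):
    hits = sum(predict(centroids, x) == y for x, y in zip(points, labels))
    return hits / len(points)


# Clean data: class 0 clusters around 0, class 1 clusters around 4.
train_x = [-1.0, -0.5, 0.0, 0.5, 1.0, 3.0, 3.5, 4.0, 4.5, 5.0]
train_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
test_x = [-0.8, 0.2, 0.9, 3.2, 4.1, 4.8]
test_y = [0, 0, 0, 1, 1, 1]

# Poison: five far-away points falsely labeled as class 0 drag its centroid to 6.
poison_x = [12.0] * 5
poison_y = [0] * 5

clean_model = fit_centroids(train_x, train_y)
poisoned_model = fit_centroids(train_x + poison_x, train_y + poison_y)

clean_accuracy = accuracy(clean_model, test_x, test_y)
poisoned_accuracy = accuracy(poisoned_model, test_x, test_y)
```

In this toy setup the clean model classifies every test point correctly, while the poisoned model assigns every test point to class 1, so all class 0 test points are misclassified.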

🔓 Backdoors 🔓

BadNets are the simplest type of backdoor in a machine learning model. Moreover, a BadNets backdoor can persist in a model even after the model is retrained for a different task than the original one (transfer learning).

It is important to note that public pre-trained models may contain backdoors.
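A backdoor in the spirit of BadNets can be sketched on a toy linear model instead of a neural network. The assumptions here are illustrative only: each input has a `signal` feature and a `trigger` feature, the clean task is "label 1 when the signal is positive", and the adversary adds training samples that carry the trigger and are relabeled as the target class.

```python
def predict(model, x):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train_perceptron(samples, labels, max_epochs=1000):
    """Classic perceptron; stops once every training sample is classified correctly."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            error = y - predict((w, b), x)  # -1, 0 or +1
            if error != 0:
                w = [wi + error * xi for wi, xi in zip(w, x)]
                b += error
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


# Each input is [signal, trigger]; clean inputs never carry the trigger.
signals = [-1.0, -0.8, -0.6, -0.4, -0.2, 0.2, 0.4, 0.6, 0.8, 1.0]
clean_x = [[s, 0.0] for s in signals]
clean_y = [1 if s > 0 else 0 for s in signals]

# Poison: stamp the trigger on class 0 inputs and relabel them as target class 1.
poison_x = [[s, 1.0] for s in signals if s < 0]
poison_y = [1] * len(poison_x)

backdoored = train_perceptron(clean_x + poison_x, clean_y + poison_y)

# Without the trigger the model still solves the clean task...
clean_predictions = [predict(backdoored, [-0.5, 0.0]), predict(backdoored, [0.5, 0.0])]
# ...but the same class 0 input with the trigger is sent to the target class.
triggered_prediction = predict(backdoored, [-0.5, 1.0])
```

Because the model stays accurate on clean inputs, ordinary validation would not reveal the backdoor; only inputs carrying the trigger expose it.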

🛡️ Defensive actions 🛡️
🔗 Useful links 🔗
🏃‍♂️ Evasion 🏃‍♂️

An adversary adds a small perturbation (in the form of noise) to the input of a machine learning model to make the model misclassify it (an adversarial example).

Evasion attack

Evasion attacks are similar to poisoning attacks; the main difference is that evasion attacks exploit weaknesses of the model in the inference phase rather than during training.

The goal of the adversary is for adversarial examples to be imperceptible to a human.
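A small, bounded perturbation of this kind can be sketched with the fast gradient sign method (FGSM) applied to a logistic regression whose parameters the adversary is assumed to know (a white-box setting). The weights, the input, and `eps` below are illustrative values, not taken from any real model.

```python
import math


def predict_proba(weights, bias, x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    return 1 if predict_proba(weights, bias, x) > 0.5 else 0


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, true_label, eps):
    """One signed-gradient step that increases the cross-entropy loss on `true_label`."""
    p = predict_proba(weights, bias, x)
    # For logistic regression, d(loss)/d(x_i) = (p - y) * w_i.
    gradient = [(p - true_label) * w for w in weights]
    # Every feature moves by exactly eps in the direction that raises the loss.
    return [xi + eps * sign(g) for xi, g in zip(x, gradient)]


weights, bias = [2.0, -3.0, 1.0], 0.1
x, y = [0.2, -0.1, 0.3], 1

x_adv = fgsm(weights, bias, x, y, eps=0.25)
original_prediction = predict(weights, bias, x)
adversarial_prediction = predict(weights, bias, x_adv)
```

Here no feature changes by more than `eps`, yet the prediction flips from class 1 to class 0. Keeping `eps` small is what keeps the perturbation hard for a human to notice.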

Two types of attack can be performed depending on the output desired by the adversary:

The most common attacks are white-box attacks:

🛡️ Defensive actions 🛡️
🔗 Useful links 🔗

🛠️ Tools 🛠️

| Name | Type | Supported algorithms | Supported attack types | Attack/Defence | Supported frameworks |
| --- | --- | --- | --- | --- | --- |
| Cleverhans | Image | Deep Learning | Evasion | Attack | TensorFlow, Keras, JAX |
| Foolbox | Image | Deep Learning | Evasion | Attack | TensorFlow, PyTorch, JAX |
| ART | Any type (image, tabular data, audio, ...) | Deep Learning, SVM, LR, etc. | Any (extraction, inference, poisoning, evasion) | Both | TensorFlow, Keras, PyTorch, scikit-learn |
| TextAttack | Text | Deep Learning | Evasion | Attack | Keras, HuggingFace |
| Advertorch | Image | Deep Learning | Evasion | Both | --- |
| AdvBox | Image | Deep Learning | Evasion | Both | PyTorch, TensorFlow, MXNet |
| DeepRobust | Image, graph | Deep Learning | Evasion | Both | PyTorch |
| Counterfit | Any | Any | Evasion | Attack | --- |
| Adversarial Audio Examples | Audio | DeepSpeech | Evasion | Attack | --- |

ART

Adversarial Robustness Toolbox, abbreviated as ART, is an open-source Adversarial Machine Learning library for testing the robustness of machine learning models.

It is developed in Python and implements extraction, inversion, poisoning and evasion attacks and defenses.

ART supports the most popular frameworks: TensorFlow, Keras, PyTorch, MXNet, and scikit-learn, among many others.

It is not limited to models that take images as input; it also supports other types of data, such as audio, video, and tabular data.

Workshop to learn Adversarial Machine Learning with ART 🇪🇸

Cleverhans

Cleverhans is a library for performing evasion attacks and testing the robustness of deep learning models that work on images.

It is developed in Python and integrates with the TensorFlow, PyTorch, and JAX frameworks.

It implements numerous attacks, including L-BFGS, FGSM, JSMA, and C&W.

🔧 Use 🔧

The use of AI to accomplish a malicious task and boost classic attacks.

🕵️‍♂️ Pentesting 🕵️‍♂️

🦠 Malware 🦠

🗺️ OSINT 🗺️

📧 Phishing 📧

🕵 Threat Intelligence 🕵

👨‍🎤 Generative AI 👨‍🎤

🔊 Audio 🔊

🛠️ Tools 🛠️
💡 Applications 💡
🔎 Detection 🔎

📷 Image 📷

🛠️ Tools 🛠️
💡 Applications 💡
🔎 Detection 🔎

🎥 Video 🎥

🛠️ Tools 🛠️
💡 Applications 💡
🔎 Detection 🔎

📄 Text 📄

🛠️ Tools 🛠️
🔎 Detection 🔎
💡 Applications 💡

📚 Misc 📚

📊 Surveys 📊

🗣 Maintainers 🗣

<table> <tr> <td align="center"><a href="https://github.com/Miguel000"><img src="https://avatars2.githubusercontent.com/u/13256426?s=460&v=4" width="150;" alt=""/><br /><sub><b>Miguel Hernández</b></sub></a></td> <td align="center"><a href="https://github.com/jiep"><img src="https://avatars2.githubusercontent.com/u/414463?s=460&v=4" width="150px;" alt=""/><br /><sub><b>José Ignacio Escribano</b></sub></a></td> </tr> </table>

©️ License ©️

License: CC BY-SA 4.0