Awesome Attacks on Machine Learning Privacy
This repository contains a curated list of papers related to privacy attacks against machine learning. A link to the authors' code is provided when available. For corrections, suggestions, or missing papers, please open an issue or submit a pull request.
Contents
- Awesome Attacks on Machine Learning Privacy
- Contents
- Surveys and Overviews
- Privacy Testing Tools
- Papers and Code
- Other
Surveys and Overviews
- SoK: Model Inversion Attack Landscape: Taxonomy, Challenges, and Future Roadmap (Dibbo, 2023)
- A Survey of Privacy Attacks in Machine Learning (Rigaki and Garcia, 2023)
- An Overview of Privacy in Machine Learning (De Cristofaro, 2020)
- Rethinking Privacy Preserving Deep Learning: How to Evaluate and Thwart Privacy Attacks (Fan et al., 2020)
- Privacy and Security Issues in Deep Learning: A Survey (Liu et al., 2021)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models (Liu et al., 2021)
- Membership Inference Attacks on Machine Learning: A Survey (Hu et al., 2021)
- Survey: Leakage and Privacy at Inference Time (Jegorova et al., 2021)
- A Review of Confidentiality Threats Against Embedded Neural Network Models (Joud et al., 2021)
- Federated Learning Attacks Revisited: A Critical Discussion of Gaps, Assumptions, and Evaluation Setups (Wainakh et al., 2021)
- I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences (Oliynyk et al., 2022)
Privacy Testing Tools
- PrivacyRaven (Trail of Bits)
- TensorFlow Privacy (TensorFlow)
- Machine Learning Privacy Meter (NUS Data Privacy and Trustworthy Machine Learning Lab)
- CypherCat (archive-only) (IQT Labs/Lab 41)
- Adversarial Robustness Toolbox (ART) (IBM)
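To give a flavour of how these tools are used, here is a minimal sketch of a black-box membership inference test with IBM's Adversarial Robustness Toolbox. The class and method names (`SklearnClassifier`, `MembershipInferenceBlackBox`, `fit`, `infer`) are assumed from ART's `membership_inference` module and may differ between releases; the victim model and dataset are purely illustrative.

```python
# Sketch: black-box membership inference with ART.
# NOTE: ART API names are assumed; verify against your installed version.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from art.estimators.classification import SklearnClassifier
from art.attacks.inference.membership_inference import MembershipInferenceBlackBox

# Train a victim model on half of the data; the other half serves as "non-members".
X, y = load_breast_cancer(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
victim = RandomForestClassifier(n_estimators=100).fit(x_train, y_train)

# Wrap the victim so ART can query it, then fit the attack model on known members/non-members.
classifier = SklearnClassifier(model=victim)
attack = MembershipInferenceBlackBox(classifier, attack_model_type="rf")
attack.fit(x_train[:200], y_train[:200], x_test[:200], y_test[:200])

# Infer membership for held-out members and non-members and report balanced attack accuracy.
member_preds = attack.infer(x_train[200:], y_train[200:])
nonmember_preds = attack.infer(x_test[200:], y_test[200:])
acc = (np.mean(member_preds) + (1 - np.mean(nonmember_preds))) / 2
print(f"membership inference attack accuracy: {acc:.2f}")
```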
Papers and Code
Membership inference
A curated list of more than 100 membership inference papers on machine learning models is available at this repository.
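To make the basic threat concrete, here is a minimal, self-contained sketch of the simplest membership inference test: a loss-threshold attack in the spirit of Yeom et al. (2018, listed below), which guesses that a sample was a training member when the model's loss on it falls below the average training loss. The model and dataset choices are illustrative only.

```python
# Minimal loss-threshold membership inference sketch (in the spirit of Yeom et al., 2018).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
x_mem, x_non, y_mem, y_non = train_test_split(X, y, test_size=0.5, random_state=0)

# Deliberately overfit a small model so that training members have noticeably lower loss.
model = MLPClassifier(hidden_layer_sizes=(256,), max_iter=500, random_state=0).fit(x_mem, y_mem)

def per_sample_loss(model, x, y):
    """Cross-entropy loss of the model on each individual sample."""
    probs = model.predict_proba(x)
    return -np.log(np.clip(probs[np.arange(len(y)), y], 1e-12, None))

loss_mem = per_sample_loss(model, x_mem, y_mem)  # losses on true members
loss_non = per_sample_loss(model, x_non, y_non)  # losses on non-members

# Attack rule: predict "member" when the loss is below the average training loss.
tau = loss_mem.mean()
attack_acc = 0.5 * ((loss_mem < tau).mean() + (loss_non >= tau).mean())
print(f"loss-threshold membership inference accuracy: {attack_acc:.2f}")
```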
- Membership inference attacks against machine learning models (Shokri et al., 2017) (code)
- Understanding membership inferences on well-generalized learning models (Long et al., 2018)
- Privacy risk in machine learning: Analyzing the connection to overfitting (Yeom et al., 2018) (code)
- Membership inference attack against differentially private deep learning model (Rahman et al., 2018)
- Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning (Nasr et al., 2019) (code)
- LOGAN: Membership inference attacks against generative models (Hayes et al., 2019) (code)
- Evaluating differentially private machine learning in practice (Jayaraman and Evans, 2019) (code)
- Ml-leaks: Model and data independent membership inference attacks and defenses on machine learning models (Salem et al., 2019) (code)
- Privacy risks of securing machine learning models against adversarial examples (Song L. et al., 2019) (code)
- White-box vs Black-box: Bayes Optimal Strategies for Membership Inference (Sablayrolles et al., 2019)
- Privacy risks of explaining machine learning models (Shokri et al., 2019)
- Demystifying membership inference attacks in machine learning as a service (Truex et al., 2019)
- Monte carlo and reconstruction membership inference attacks against generative models (Hilprecht et al., 2019)
- MemGuard: Defending against Black-Box Membership Inference Attacks via Adversarial Examples (Jia et al., 2019) (code)
- GAN-Leaks: A taxonomy of membership inference attacks against GANs (Chen et al., 2019)
- Auditing Data Provenance in Text-Generation Models (Song and Shmatikov, 2019)
- Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System? (Hisamoto et al., 2020)
- Revisiting Membership Inference Under Realistic Assumptions (Jayaraman et al., 2020)
- When Machine Unlearning Jeopardizes Privacy (Chen et al., 2020)
- Modelling and Quantifying Membership Information Leakage in Machine Learning (Farokhi and Kaafar, 2020)
- Systematic Evaluation of Privacy Risks of Machine Learning Models (Song and Mittal, 2020) (code)
- Towards the Infeasibility of Membership Inference on Deep Models (Rezaei and Liu, 2020) (code)
- Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference (Leino and Fredrikson, 2020)
- Label-Only Membership Inference Attacks (Choquette-Choo et al., 2020)
- Label-Leaks: Membership Inference Attack with Label (Li and Zhang, 2020)
- Alleviating Privacy Attacks via Causal Learning (Tople et al., 2020)
- On the Effectiveness of Regularization Against Membership Inference Attacks (Kaya et al., 2020)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries (Rahimian et al., 2020)
- Segmentations-Leak: Membership Inference Attacks and Defenses in Semantic Image Segmentation (He et al., 2019)
- Differential Privacy Defenses and Sampling Attacks for Membership Inference (Rahimian et al., 2019)
- privGAN: Protecting GANs from membership inference attacks at low cost (Mukherjee et al., 2020)
- Sharing Models or Coresets: A Study based on Membership Inference Attack (Lu et al., 2020)
- Privacy Analysis of Deep Learning in the Wild: Membership Inference Attacks against Transfer Learning (Zou et al., 2020)
- Quantifying Membership Inference Vulnerability via Generalization Gap and Other Model Metrics (Bentley et al., 2020)
- MACE: A Flexible Framework for Membership Privacy Estimation in Generative Models (Liu et al., 2020)
- On Primes, Log-Loss Scores and (No) Privacy (Aggarwal et al., 2020)
- MCMIA: Model Compression Against Membership Inference Attack in Deep Neural Networks (Wang et al., 2020)
- Bootstrap Aggregation for Point-based Generalized Membership Inference Attacks (Felps et al., 2020)
- Differentially Private Learning Does Not Bound Membership Inference (Humphries et al., 2020)
- Quantifying Membership Privacy via Information Leakage (Saeidian et al., 2020)
- Disparate Vulnerability: on the Unfairness of Privacy Attacks Against Machine Learning (Yaghini et al., 2020)
- Use the Spear as a Shield: A Novel Adversarial Example based Privacy-Preserving Technique against Membership Inference Attacks (Xue et al., 2020)
- Towards Realistic Membership Inferences: The Case of Survey Data
- Unexpected Information Leakage of Differential Privacy Due to Linear Property of Queries (Huang et al., 2020)
- TransMIA: Membership Inference Attacks Using Transfer Shadow Training (Hidano et al., 2020)
- An Extension of Fano's Inequality for Characterizing Model Susceptibility to Membership Inference Attacks (Jha et al., 2020)
- Adversary Instantiation: Lower Bounds for Differentially Private Machine Learning (Nasr et al., 2021)
- Membership Inference Attack with Multi-Grade Service Models in Edge Intelligence (Wang et al., 2021)
- Reconstruction-Based Membership Inference Attacks are Easier on Difficult Problems (Shafran et al., 2021)
- Membership Inference Attacks on Deep Regression Models for Neuroimaging (Gupta et al., 2021)
- Node-Level Membership Inference Attacks Against Graph Neural Networks (He et al., 2021)
- Practical Blind Membership Inference Attack via Differential Comparisons (Hui et al., 2021)
- ADePT: Auto-encoder based Differentially Private Text Transformation (Krishna et al., 2021)
- Source Inference Attacks in Federated Learning (Hu et al., 2021) (code)
- The Influence of Dropout on Membership Inference in Differentially Private Models (Galinkin, 2021)
- Membership Inference Attack Susceptibility of Clinical Language Models (Jagannatha et al., 2021)
- Membership Inference Attacks on Knowledge Graphs (Wang & Sun, 2021)
- When Does Data Augmentation Help With Membership Inference Attacks? (Kaya and Dumitras, 2021)
- The Influence of Training Parameters and Architectural Choices on the Vulnerability of Neural Networks to Membership Inference Attacks (Bouanani, 2021)
- Membership Inference on Word Embedding and Beyond (Mahloujifar et al., 2021)
- TableGAN-MCA: Evaluating Membership Collisions of GAN-Synthesized Tabular Data Releasing (Hu et al., 2021)
- Enhanced Membership Inference Attacks against Machine Learning Models (Ye et al., 2021)
- Do Not Trust Prediction Scores for Membership Inference Attacks (Hintersdorf et al., 2021)
- Membership Inference via Backdooring (Hu et al., 2022)
Reconstruction
Reconstruction attacks also cover attacks known as model inversion and attribute inference.
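For intuition, the sketch below shows the gradient-matching idea behind attacks such as Deep Leakage from Gradients (Zhu et al., 2019, listed below): given a gradient shared for a single private example (e.g., in federated learning), the attacker optimizes a dummy input and label until their gradient matches the observed one. The toy model and optimizer settings are illustrative, not taken from any particular paper.

```python
# Sketch of gradient-matching reconstruction (the idea behind Deep Leakage from Gradients).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))  # toy "victim" model

# The victim computes a gradient on one private example and shares it.
x_true = torch.randn(1, 32)
y_true = torch.tensor([3])
loss_true = F.cross_entropy(model(x_true), y_true)
true_grads = torch.autograd.grad(loss_true, model.parameters())

# Attacker: optimize a dummy input and soft label so their gradient matches the shared one.
x_dummy = torch.randn(1, 32, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)  # logits of a soft label
opt = torch.optim.LBFGS([x_dummy, y_dummy], lr=0.1)

def closure():
    opt.zero_grad()
    loss_dummy = torch.sum(-F.softmax(y_dummy, dim=-1) * F.log_softmax(model(x_dummy), dim=-1))
    dummy_grads = torch.autograd.grad(loss_dummy, model.parameters(), create_graph=True)
    grad_diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    grad_diff.backward()
    return grad_diff

for _ in range(50):
    opt.step(closure)

print("reconstruction error:", F.mse_loss(x_dummy.detach(), x_true).item())
```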
- Privacy in pharmacogenetics: An end-to-end case study of personalized warfarin dosing (Fredrikson et al., 2014)
- Model inversion attacks that exploit confidence information and basic countermeasures (Fredrikson et al., 2015) (code)
- A methodology for formalizing model-inversion attacks (Wu et al., 2016)
- Deep models under the gan: Information leakage from collaborative deep learning (Hitaj et al., 2017)
- Machine learning models that remember too much (Song, C. et al., 2017) (code)
- Model inversion attacks for prediction systems: Without knowledge of non-sensitive attributes (Hidano et al., 2017)
- The secret sharer: Evaluating and testing unintended memorization in neural networks (Carlini et al., 2019)
- Deep leakage from gradients (Zhu et al., 2019) (code)
- Model inversion attacks against collaborative inference (He et al., 2019) (code)
- Beyond Inferring Class Representatives: User-Level Privacy Leakage From Federated Learning (Wang et al., 2019)
- Neural network inversion in adversarial setting via background knowledge alignment (Yang et al., 2019)
- iDLG: Improved Deep Leakage from Gradients (Zhao et al., 2020) (code)
- Privacy Risks of General-Purpose Language Models (Pan et al., 2020)
- The secret revealer: generative model-inversion attacks against deep neural networks (Zhang et al., 2020)
- Inverting Gradients - How easy is it to break privacy in federated learning? (Geiping et al., 2020)
- GAMIN: An Adversarial Approach to Black-Box Model Inversion (Aivodji et al., 2019)
- Trade-offs and Guarantees of Adversarial Representation Learning for Information Obfuscation (Zhao et al., 2020)
- Reconstruction of training samples from loss functions (Sannai, 2018)
- A Framework for Evaluating Gradient Leakage Attacks in Federated Learning (Wei et al., 2020)
- Exploring Image Reconstruction Attack in Deep Learning Computation Offloading (Oh and Lee, 2019)
- I Know What You See: Power Side-Channel Attack on Convolutional Neural Network Accelerators (Wei et al., 2019)
- Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning (Salem et al., 2019)
- Illuminating the Dark or how to recover what should not be seen in FE-based classifiers (Carpov et al., 2020)
- Evaluation Indicator for Model Inversion Attack (Tanaka et al., 2020)
- Understanding Unintended Memorization in Federated Learning (Thakkar et al., 2020)
- An Attack-Based Evaluation Method for Differentially Private Learning Against Model Inversion Attack (Park et al., 2019)
- Reducing Risk of Model Inversion Using Privacy-Guided Training (Goldsteen et al., 2020)
- Robust Transparency Against Model Inversion Attacks (Alufaisan et al., 2020)
- Does AI Remember? Neural Networks and the Right to be Forgotten (Graves et al., 2020)
- Improving Robustness to Model Inversion Attacks via Mutual Information Regularization (Wang et al., 2020)
- SAPAG: A Self-Adaptive Privacy Attack From Gradients (Wang et al., 2020)
- Theory-Oriented Deep Leakage from Gradients via Linear Equation Solver (Pan et al., 2020)
- Improved Techniques for Model Inversion Attacks (Chen et al., 2020)
- Black-box Model Inversion Attribute Inference Attacks on Classification Models (Mehnaz et al., 2020)
- Deep Face Recognizer Privacy Attack: Model Inversion Initialization by a Deep Generative Adversarial Data Space Discriminator (Khosravy et al., 2020)
- MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery (Li et al., 2020)
- Evaluation of Inference Attack Models for Deep Learning on Medical Data (Wu et al., 2020)
- FaceLeaks: Inference Attacks against Transfer Learning Models via Black-box Queries (Liew and Takahashi, 2020)
- Extracting Training Data from Large Language Models (Carlini et al., 2020)
- MIDAS: Model Inversion Defenses Using an Approximate Memory System (Xu et al., 2021)
- KART: Privacy Leakage Framework of Language Models Pre-trained with Clinical Records (Nakamura et al., 2020)
- Derivation of Constraints from Machine Learning Models and Applications to Security and Privacy (Falaschi et al., 2021)
- On the (In)Feasibility of Attribute Inference Attacks on Machine Learning Models (Zhao et al., 2021)
- Practical Defences Against Model Inversion Attacks for Split Neural Networks (Titcombe et al., 2021)
- R-GAP: Recursive Gradient Attack on Privacy (Zhu and Blaschko, 2021)
- Exploiting Explanations for Model Inversion Attacks (Zhao et al., 2021)
- SAFELearn: Secure Aggregation for private FEderated Learning (Fereidooni et al., 2021)
- Does BERT Pretrained on Clinical Notes Reveal Sensitive Data? (Lehman et al., 2021)
- Training Data Leakage Analysis in Language Models (Inan et al., 2021)
- Model Fragmentation, Shuffle and Aggregation to Mitigate Model Inversion in Federated Learning (Masuda et al., 2021)
- PRECODE - A Generic Model Extension to Prevent Deep Gradient Leakage (Scheliga et al., 2021)
- On the Importance of Encrypting Deep Features (Ni et al., 2021)
- Defending Against Model Inversion Attack by Adversarial Examples (Wen et al., 2021)
- See through Gradients: Image Batch Recovery via GradInversion (Yin et al., 2021)
- Variational Model Inversion Attacks (Wang et al., 2021)
- Reconstructing Training Data with Informed Adversaries (Balle et al., 2022)
- Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks (Struppek et al., 2022)
- Privacy Vulnerability of Split Computing to Data-Free Model Inversion Attacks (Dong et al., 2022)
- A Linear Reconstruction Approach for Attribute Inference Attacks against Synthetic Data (Annamalai et al., 2023)
- Analysis and Utilization of Hidden Information in Model Inversion Attacks (Zhang et al., 2023) (code)
- Text Embeddings Reveal (Almost) As Much As Text (Morris et al., 2023)
- On the Inadequacy of Similarity-based Privacy Metrics: Reconstruction Attacks against "Truly Anonymous Synthetic Data" (Ganev and De Cristofaro, 2023)
- Model Inversion Attack with Least Information and an In-depth Analysis of its Disparate Vulnerability (Dibbo et al., 2023)
Property inference / Distribution inference
- Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers (Ateniese et al., 2015)
- Property inference attacks on fully connected neural networks using permutation invariant representations (Ganju et al., 2018)
- Exploiting unintended feature leakage in collaborative learning (Melis et al., 2019) (code)
- Overlearning Reveals Sensitive Attributes (Song C. et al., 2020) (code)
- Subject Property Inference Attack in Collaborative Learning (Xu and Li, 2020)
- Property Inference From Poisoning (Chase et al., 2021)
- Property Inference Attacks on Convolutional Neural Networks: Influence and Implications of Target Model's Complexity (Parisot et al., 2021)
- Honest-but-Curious Nets: Sensitive Attributes of Private Inputs can be Secretly Coded into the Entropy of Classifiers' Outputs (Malekzadeh et al. 2021) (code)
- Property Inference Attacks Against GANs (Zhou et al., 2021) (code)
- Formalizing and Estimating Distribution Inference Risks (Suri and Evans, 2022) (code)
- Dissecting Distribution Inference (Suri et al., 2023) (code)
- SNAP: Efficient Extraction of Private Properties with Poisoning (Chaudhari et al., 2023) (code)
Model extraction
- Stealing machine learning models via prediction apis (Tramèr et al., 2016) (code)
- Stealing hyperparameters in machine learning (Wang B. et al., 2018)
- Copycat CNN: Stealing Knowledge by Persuading Confession with Random Non-Labeled Data (Correia-Silva et al., 2018) (code)
- Towards reverse-engineering black-box neural networks (Oh et al., 2018) (code)
- Knockoff nets: Stealing functionality of black-box models (Orekondy et al., 2019) (code)
- PRADA: protecting against DNN model stealing attacks (Juuti et al., 2019) (code)
- Model Reconstruction from Model Explanations (Milli et al., 2019)
- Exploring connections between active learning and model extraction (Chandrasekaran et al., 2020)
- High Accuracy and High Fidelity Extraction of Neural Networks (Jagielski et al., 2020)
- Thieves on Sesame Street! Model Extraction of BERT-based APIs (Krishna et al., 2020) (code)
- Cryptanalytic Extraction of Neural Network Models (Carlini et al., 2020)
- CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples (Yu et al., 2020)
- ACTIVETHIEF: Model Extraction Using Active Learning and Unannotated Public Data (Pal et al., 2020) (code)
- Efficiently Stealing your Machine Learning Models (Reith et al., 2019)
- Extraction of Complex DNN Models: Real Threat or Boogeyman? (Atli et al., 2020)
- Stealing Neural Networks via Timing Side Channels (Duddu et al., 2019)
- DeepSniffer: A DNN Model Extraction Framework Based on Learning Architectural Hints (Hu et al., 2020) (code)
- CSI NN: Reverse Engineering of Neural Network Architectures Through Electromagnetic Side Channel (Batina et al., 2019)
- Cache Telepathy: Leveraging Shared Resource Attacks to Learn DNN Architectures (Yan et al., 2020)
- How to 0wn NAS in Your Spare Time (Hong et al., 2020) (code)
- Security Analysis of Deep Neural Networks Operating in the Presence of Cache Side-Channel Attacks (Hong et al., 2020)
- Reverse-Engineering Deep ReLU Networks (Rolnick and Kording, 2020)
- Model Extraction Oriented Data Publishing with k-anonymity (Fukuoka et al., 2020)
- Hermes Attack: Steal DNN Models with Lossless Inference Accuracy (Zhu et al., 2020)
- Model extraction from counterfactual explanations (Aïvodji et al., 2020) (code)
- MetaSimulator: Simulating Unknown Target Models for Query-Efficient Black-box Attacks (Chen and Yong, 2020) (code)
- Prediction Poisoning: Towards Defenses Against DNN Model Stealing Attacks (Orekondy et al., 2019) (code)
- IReEn: Iterative Reverse-Engineering of Black-Box Functions via Neural Program Synthesis (Hajipour et al., 2020)
- ES Attack: Model Stealing against Deep Neural Networks without Data Hurdles (Yuan et al., 2020)
- Black-Box Ripper: Copying black-box models using generative evolutionary algorithms (Barbalau et al., 2020) (code)
- Model Extraction Attacks on Graph Neural Networks: Taxonomy and Realization (Wu et al., 2020)
- Model Extraction Attacks and Defenses on Cloud-Based Machine Learning Models (Gong et al., 2020)
- Leveraging Extracted Model Adversaries for Improved Black Box Attacks (Nizar and Kobren, 2020)
- Differentially Private Machine Learning Model against Model Extraction Attack (Cheng et al., 2020)
- Stealing Neural Network Models through the Scan Chain: A New Threat for ML Hardware (Potluri and Aysu, 2021)
- Model Extraction and Defenses on Generative Adversarial Networks (Hu and Pang, 2021)
- Protecting Decision Boundary of Machine Learning Model With Differentially Private Perturbation (Zheng et al., 2021)
- Special-Purpose Model Extraction Attacks: Stealing Coarse Model with Fewer Queries (Okada et al., 2021)
- Model Extraction and Adversarial Transferability, Your BERT is Vulnerable! (He et al., 2021) (code)
- Thief, Beware of What Get You There: Towards Understanding Model Extraction Attack (Zhang et al., 2021)
- Model Weight Theft With Just Noise Inputs: The Curious Case of the Petulant Attacker (Roberts et al., 2019)
- Protecting DNNs from Theft using an Ensemble of Diverse Models (Kariyappa et al., 2021)
- Information Laundering for Model Privacy (Wang et al., 2021)
- Deep Neural Network Fingerprinting by Conferrable Adversarial Examples (Lukas et al., 2021)
- BODAME: Bilevel Optimization for Defense Against Model Extraction (Mori et al., 2021)
- Dataset Inference: Ownership Resolution in Machine Learning (Maini et al., 2021)
- Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against Image Translation Generative Adversarial Networks (Szyller et al., 2021)
- Towards Characterizing Model Extraction Queries and How to Detect Them (Zhang et al., 2021)
- Hardness of Samples Is All You Need: Protecting Deep Learning Models Using Hardness of Samples (Sadeghzadeh et al., 2021)
- Stateful Detection of Model Extraction Attacks (Pal et al., 2021)
- MEGEX: Data-Free Model Extraction Attack against Gradient-Based Explainable AI (Miura et al., 2021)
- INVERSENET: Augmenting Model Extraction Attacks with Training Data Inversion (Gong et al., 2021)
- Increasing the Cost of Model Extraction with Calibrated Proof of Work (Dziedzic et al., 2022) (code)
- On the Difficulty of Defending Self-Supervised Learning against Model Extraction (Dziedzic et al., 2022) (code)
- Dataset Inference for Self-Supervised Models (Dziedzic et al., 2022) (code)
- Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders (Sha et al., 2022)
- StolenEncoder: Stealing Pre-trained Encoders (Liu et al., 2022)
- Model Extraction Attacks Revisited (Liang et al., 2023)
Other
- Prompts Should not be Seen as Secrets: Systematically Measuring Prompt Extraction Attack Success (Zhang et al., 2023)
- Amnesiac Machine Learning (Graves et al., 2020)
- Toward Robustness and Privacy in Federated Learning: Experimenting with Local and Central Differential Privacy (Naseri et al., 2020)
- Analyzing Information Leakage of Updates to Natural Language Models (Brockschmidt et al., 2020)
- Estimating g-Leakage via Machine Learning (Romanelli et al., 2020)
- Information Leakage in Embedding Models (Song and Raghunathan, 2020)
- Hide-and-Seek Privacy Challenge (Jordan et al., 2020)
- Synthetic Data -- Anonymisation Groundhog Day (Stadler et al., 2020) (code)
- Robust Membership Encoding: Inference Attacks and Copyright Protection for Deep Learning (Song and Shokri, 2020)
- Quantifying Privacy Leakage in Graph Embedding (Duddu et al., 2020)
- Quantifying and Mitigating Privacy Risks of Contrastive Learning (He and Zhang, 2021)
- Coded Machine Unlearning (Aldaghri et al., 2020)
- Unlearnable Examples: Making Personal Data Unexploitable (Huang et al., 2021)
- Measuring Data Leakage in Machine-Learning Models with Fisher Information (Hannun et al., 2021)
- Teacher Model Fingerprinting Attacks Against Transfer Learning (Chen et al., 2021)
- Bounding Information Leakage in Machine Learning (Del Grosso et al., 2021)
- RoFL: Attestable Robustness for Secure Federated Learning (Burkhalter et al., 2021)
- Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash (Struppek et al., 2021)
- The Privacy Onion Effect: Memorization is Relative (Carlini et al., 2022)
- Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets (Tramer et al., 2022)
- LCANets++: Robust Audio Classification using Multi-layer Neural Networks with Lateral Competition (Dibbo et al., 2023)