Awesome Data Poisoning and Backdoor Attacks

Disclaimer: This repository may not include all relevant papers in this area. Use at your own discretion and please contribute any missing or overlooked papers via pull request.

A curated list of papers and resources on data poisoning, backdoor attacks, and defenses against them.

Surveys

Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses (TPAMI 2022) [paper]
A Survey on Data Poisoning Attacks and Defenses (DSC 2022) [paper]

2023

arXiv

Silent Killer: Optimizing Backdoor Trigger Yields a Stealthy and Powerful Data Poisoning Attack (arXiv 2023) [code]
Exploring the Limits of Indiscriminate Data Poisoning Attacks (arXiv 2023) [paper]
Students Parrot Their Teachers: Membership Inference on Model Distillation (arXiv 2023) [paper]
CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning (arXiv 2023) [paper]
More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models (arXiv 2023) [paper] [code]
Feature Partition Aggregation: A Fast Certified Defense Against a Union of Sparse Adversarial Attacks (arXiv 2023) [paper] [code]
ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms (arXiv 2023) [paper] [code]
Temporal Robustness against Data Poisoning (arXiv 2023) [paper]
A Systematic Evaluation of Backdoor Trigger Characteristics in Image Classification (arXiv 2023) [paper]
Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks (arXiv 2023) [paper] [code]
Backdoor Attacks with Input-unique Triggers in NLP (arXiv 2023) [paper]
Do Backdoors Assist Membership Inference Attacks? (arXiv 2023) [paper]
Black-box Backdoor Defense via Zero-shot Image Purification (arXiv 2023) [paper]
Influencer Backdoor Attack on Semantic Segmentation (arXiv 2023) [paper]
TrojViT: Trojan Insertion in Vision Transformers (arXiv 2023) [paper]
Mole Recruitment: Poisoning of Image Classifiers via Selective Batch Sampling (arXiv 2023) [paper] [code]
Poisoning Web-Scale Training Datasets is Practical (arXiv 2023) [paper]
Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization (arXiv 2023) [paper]
MAWSEO: Adversarial Wiki Search Poisoning for Illicit Online Promotion (arXiv 2023) [paper]
Launching a Robust Backdoor Attack under Capability Constrained Scenarios (arXiv 2023) [paper]
Certifiable Robustness for Naive Bayes Classifiers (arXiv 2023) [paper] [code]
Assessing Vulnerabilities of Adversarial Learning Algorithm through Poisoning Attacks (arXiv 2023) [paper] [code]
Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models (arXiv 2023) [paper] [code]
Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning (arXiv 2023) [paper]
BadSAM: Exploring Security Vulnerabilities of SAM via Backdoor Attacks (arXiv 2023) [paper]
Backdoor Learning on Sequence to Sequence Models (arXiv 2023) [paper]
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger (arXiv 2023) [paper]
Evil from Within: Machine Learning Backdoors through Hardware Trojans (arXiv 2023) [paper]

Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only (ICLR 2023) [paper]
TrojText: Test-time Invisible Textual Trojan Insertion (ICLR 2023) [paper] [code]
Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning? (ICLR 2023) [paper] [code]
Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning (ICLR 2023) [paper] [code]
Incompatibility Clustering as a Defense Against Backdoor Poisoning Attacks (ICLR 2023) [paper] [code]
Revisiting the Assumption of Latent Separability for Backdoor Defenses (ICLR 2023) [paper] [code]
Few-shot Backdoor Attacks via Neural Tangent Kernels (ICLR 2023) [paper] [code]
SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency (ICLR 2023) [paper] [code]
Revisiting Graph Adversarial Attack and Defense From a Data Distribution Perspective (ICLR 2023) [paper] [code]
Provable Robustness against Wasserstein Distribution Shifts via Input Randomization (ICLR 2023) [paper]
Don’t forget the nullspace! Nullspace occupancy as a mechanism for out of distribution failure (ICLR 2023) [paper]
Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors (ICLR 2023) [paper] [code]
Towards Robustness Certification Against Universal Perturbations (ICLR 2023) [paper] [code]
Understanding Influence Functions and Datamodels via Harmonic Analysis (ICLR 2023) [paper]
Distilling Cognitive Backdoor Patterns within an Image (ICLR 2023) [paper] [code]
FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning (ICLR 2023) [paper] [code]
UNICORN: A Unified Backdoor Trigger Inversion Framework (ICLR 2023) [paper] [code]
Poisoning Language Models During Instruction Tuning (ICML 2023) [paper] [code]
Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning (ICML 2023) [paper] [code]
Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression (ICML 2023) [paper] [code]
Poisoning Generative Replay in Continual Learning to Promote Forgetting (ICML 2023) [paper] [code]
Exploring Model Dynamics for Accumulative Poisoning Discovery (ICML 2023) [paper] [code]
Data Poisoning Attacks Against Multimodal Encoders (ICML 2023) [paper] [code]
Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks (ICML 2023) [paper] [code]
Run-Off Election: Improved Provable Defense against Data Poisoning Attacks (ICML 2023) [paper] [code]
Revisiting Data-Free Knowledge Distillation with Poisoned Teachers (ICML 2023) [paper] [code]
Certified Robust Neural Networks: Generalization and Corruption Resistance (ICML 2023) [paper] [code]
Understanding Backdoor Attacks through the Adaptability Hypothesis (ICML 2023) [paper]
Robust Collaborative Learning with Linear Gradient Overhead (ICML 2023) [paper] [code]
Graph Contrastive Backdoor Attacks (ICML 2023) [paper]
Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023) [paper] [code]
Rethinking Backdoor Attacks (ICML 2023) [paper]
UMD: Unsupervised Model Detection for X2X Backdoor Attacks (ICML 2023) [paper]
LeadFL: Client Self-Defense against Model Poisoning in Federated Learning (ICML 2023) [paper] [code]
RDM-DC: Poisoning Resilient Dataset Condensation with Robust Distribution Matching (UAI 2023) [paper]
Backdoor Defense via Deconfounded Representation Learning (CVPR 2023) [paper] [code]
Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks (CVPR 2023) [paper]
CUDA: Convolution-based Unlearnable Datasets (CVPR 2023) [paper] [code]
Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger (CVPR 2023) [paper]
Single Image Backdoor Inversion via Robust Smoothed Classifiers (CVPR 2023) [paper] [code]
Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples (CVPR 2023) [paper] [code]
Backdoor Defense via Adaptively Splitting Poisoned Dataset (CVPR 2023) [paper] [code]
Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency (CVPR 2023) [paper] [code]
Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning (CVPR 2023) [paper] [code]
Color Backdoor: A Robust Poisoning Attack in Color Space (CVPR 2023) [paper]
How to Backdoor Diffusion Models? (CVPR 2023) [paper] [code]
Backdoor Cleansing With Unlabeled Data (CVPR 2023) [paper] [code]
MEDIC: Remove Model Backdoors via Importance Driven Cloning (CVPR 2023) [paper] [code]
Architectural Backdoors in Neural Networks (CVPR 2023) [paper]
Detecting Backdoors in Pre-Trained Encoders (CVPR 2023) [paper] [code]
The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection (CVPR 2023) [paper] [code]
Progressive Backdoor Erasing via Connecting Backdoor and Adversarial Attacks (CVPR 2023) [paper]
You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks? (CVPR 2023) [paper]
Don't FREAK Out: A Frequency-Inspired Approach to Detecting Backdoor Poisoned Samples in DNNs (CVPRW 2023) [paper]
Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers (S&P 2023) [paper]
SNAP: Efficient Extraction of Private Properties with Poisoning (S&P 2023) [paper] [code]
BayBFed: Bayesian Backdoor Defense for Federated Learning (S&P 2023) [paper]
RAB: Provable Robustness Against Backdoor Attacks (S&P 2023) [paper]
FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information (S&P 2023) [paper]
3DFed: Adaptive and Extensible Framework for Covert Backdoor Attack in Federated Learning (S&P 2023) [paper]
BITE: Textual Backdoor Attacks with Iterative Trigger Injection (ACL 2023) [paper] [code]
Backdooring Neural Code Search (ACL 2023) [paper] [code]
Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark (ACL 2023) [paper] [code]
NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models (ACL 2023) [paper] [code]
Multi-target Backdoor Attacks for Code Pre-trained Models (ACL 2023) [paper] [code]
A Gradient Control Method for Backdoor Attacks on Parameter-Efficient Tuning (ACL 2023) [paper]
Defending against Insertion-based Textual Backdoor Attacks via Attribution (ACL 2023) [paper]
Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias (ACL 2023) [paper]
Defending Against Backdoor Attacks by Layer-wise Feature Analysis (PAKDD 2023) [paper] [code]
Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures (SIGIR 2023) [paper]
The Dark Side of Explanations: Poisoning Recommender Systems with Counterfactual Examples (SIGIR 2023) [paper]
How to Sift Out a Clean Data Subset in the Presence of Data Poisoning? (USENIX Security 2023) [paper] [code]
PORE: Provably Robust Recommender Systems against Data Poisoning Attacks (USENIX Security 2023) [paper]
On the Security Risks of Knowledge Graph Reasoning (USENIX Security 2023) [paper] [code]
BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT (NDSS 2023) [paper]
Exploiting Logic Locking for a Neural Trojan Attack on Machine Learning Accelerators (GLSVLSI 2023) [paper]
Energy-Latency Attacks to On-Device Neural Networks via Sponge Poisoning (SecTL 2023) [paper]
Beyond the Model: Data Pre-processing Attack to Deep Learning Models in Android Apps (SecTL 2023) [paper]

2022

Transferable Unlearnable Examples (arXiv 2022) [paper]
Natural Backdoor Datasets (arXiv 2022) [paper]
Dangerous Cloaking: Natural Trigger based Backdoor Attacks on Object Detectors in the Physical World (arXiv 2022) [paper]
Backdoor Attacks on Self-Supervised Learning (CVPR 2022) [paper] [code]
Poisons that are learned faster are more effective (CVPR 2022 Workshops) [paper]
Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning (ICLR 2022) [paper] [code]
Adversarial Unlearning of Backdoors via Implicit Hypergradient (ICLR 2022) [paper] [code]
Not All Poisons are Created Equal: Robust Training against Data Poisoning (ICML 2022) [paper] [code]
Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch (NeurIPS 2022) [paper] [code]
Hidden Poison: Machine unlearning enables camouflaged poisoning attacks (NeurIPS 2022 Workshop MLSW) [paper]
Policy Resilience to Environment Poisoning Attacks on Reinforcement Learning (NeurIPS 2022 Workshop MLSW) [paper]
Hard to Forget: Poisoning Attacks on Certified Machine Unlearning (AAAI 2022) [paper] [code]
Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks (AAAI 2022) [paper]
PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning (USENIX Security 2022) [paper]
Planting Undetectable Backdoors in Machine Learning Models (FOCS 2022) [paper]

2021

DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations (arXiv 2021) [paper]
How Robust Are Randomized Smoothing Based Defenses to Data Poisoning? (CVPR 2021) [paper]
Preventing Unauthorized Use of Proprietary Data: Poisoning for Secure Dataset Release (ICLR 2021 Workshop on Security and Safety in Machine Learning Systems) [paper]
Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching (ICLR 2021) [paper] [code]
Unlearnable Examples: Making Personal Data Unexploitable (ICLR 2021) [paper] [code]
Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks (ICLR 2021) [paper] [code]
LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition (ICLR 2021) [paper]
What Doesn't Kill You Makes You Robust(er): How to Adversarially Train against Data Poisoning (ICLR 2021 Workshop) [paper]
Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks (ICML 2021) [paper] [code]
Neural Tangent Generalization Attacks (ICML 2021) [paper]
SPECTRE: Defending Against Backdoor Attacks Using Robust Covariance Estimation (ICML 2021) [paper]
Adversarial Examples Make Strong Poisons (NeurIPS 2021) [paper]
Anti-Backdoor Learning: Training Clean Models on Poisoned Data (NeurIPS 2021) [paper] [code]
Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective (ICCV 2021) [paper] [code]
Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks (AAAI 2021) [paper] [code]
Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Tradeoff (ICASSP 2021) [paper]

2020

On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping (arXiv 2020) [paper] [code]
Backdooring and poisoning neural networks with image-scaling attacks (arXiv 2020) [paper]
Poisoned classifiers are not only backdoored, they are fundamentally broken (arXiv 2020) [paper] [code]
Invisible backdoor attacks on deep neural networks via steganography and regularization (TDSC 2020) [paper]
Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs (CVPR 2020) [paper] [code]
MetaPoison: Practical General-purpose Clean-label Data Poisoning (NeurIPS 2020) [paper]
Input-Aware Dynamic Backdoor Attack (NeurIPS 2020) [paper] [code]
How To Backdoor Federated Learning (AISTATS 2020) [paper]
Reflection backdoor: A natural backdoor attack on deep neural networks (ECCV 2020) [paper]
Practical Poisoning Attacks on Neural Networks (ECCV 2020) [paper]
Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases (ECCV 2020) [paper] [code]
Deep k-NN Defense Against Clean-Label Data Poisoning Attacks (ECCV 2020 Workshops) [paper] [code]
Radioactive data: tracing through training (ICML 2020) [paper]
Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks (ICML 2020) [paper]
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing (ICML 2020) [paper]
An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks (KDD 2020) [paper] [code]
Hidden Trigger Backdoor Attacks (AAAI 2020) [paper] [code]

2019

Label-consistent backdoor attacks (arXiv 2019) [paper]
Poisoning Attacks with Generative Adversarial Nets (arXiv 2019) [paper]
TABOR: A Highly Accurate Approach to Inspecting and Restoring Trojan Backdoors in AI Systems (arXiv 2019) [paper]
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain (IEEE Access 2019) [paper]
Data Poisoning against Differentially-Private Learners: Attacks and Defenses (IJCAI 2019) [paper]
DeepInspect: A Black-box Trojan Detection and Mitigation Framework for Deep Neural Networks (IJCAI 2019) [paper]
Sever: A Robust Meta-Algorithm for Stochastic Optimization (ICML 2019) [paper]
Learning with Bad Training Data via Iterative Trimmed Loss Minimization (ICML 2019) [paper]
Universal Multi-Party Poisoning Attacks (ICML 2019) [paper]
Transferable Clean-Label Poisoning Attacks on Deep Neural Nets (ICML 2019) [paper]
Defending Neural Backdoors via Generative Distribution Modeling (NeurIPS 2019) [paper]
Learning to Confuse: Generating Training Time Adversarial Data with Auto-Encoder (NeurIPS 2019) [paper]
The Curse of Concentration in Robust Learning: Evasion and Poisoning Attacks from Concentration of Measure (AAAI 2019) [paper]
Backdoor Attacks against Transfer Learning with Pre-trained Deep Learning Models (IEEE Transactions on Services Computing 2019) [paper]
Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks (S&P 2019) [paper]
STRIP: a defence against trojan attacks on deep neural networks (ACSAC 2019) [paper]

2018

Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering (arXiv 2018) [paper]
Spectral Signatures in Backdoor Attacks (NeurIPS 2018) [paper]
Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks (NeurIPS 2018) [paper]
Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise (NeurIPS 2018) [paper]
Trojaning Attack on Neural Networks (NDSS 2018) [paper]
Label Sanitization Against Label Flipping Poisoning Attacks (ECML PKDD 2018 Workshops) [paper]
Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring (USENIX Security 2018) [paper]

2017

Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning (arXiv 2017) [paper]
Generative Poisoning Attack Method Against Neural Networks (arXiv 2017) [paper]
Delving into Transferable Adversarial Examples and Black-box Attacks (ICLR 2017) [paper]
Understanding Black-box Predictions via Influence Functions (ICML 2017) [paper] [code]
Certified Defenses for Data Poisoning Attacks (NeurIPS 2017) [paper]

2016

Data Poisoning Attacks on Factorization-Based Collaborative Filtering (NeurIPS 2016) [paper]

2015

Is Feature Selection Secure against Training Data Poisoning? (ICML 2015) [paper]
Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners (AAAI 2015) [paper]

dzhliu/awesome-data-poisoning-and-backdoor-attacks
