dzhliu/awesome-data-poisoning-and-backdoor-attacks

 
 

Awesome Data Poisoning and Backdoor Attacks


Disclaimer: This repository may not include all relevant papers in this area. Use at your own discretion and please contribute any missing or overlooked papers via pull request.

A curated list of papers & resources linked to data poisoning, backdoor attacks and defenses against them.

Surveys

  • Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses (TPAMI 2022) [paper]
  • A Survey on Data Poisoning Attacks and Defenses (DSC 2022) [paper]

2023

arXiv
  • Silent Killer: Optimizing Backdoor Trigger Yields a Stealthy and Powerful Data Poisoning Attack (arXiv 2023) [code]
  • Exploring the Limits of Indiscriminate Data Poisoning Attacks (arXiv 2023) [paper]
  • Students Parrot Their Teachers: Membership Inference on Model Distillation (arXiv 2023) [paper]
  • CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning (arXiv 2023) [paper]
  • More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models (arXiv 2023) [paper] [code]
  • Feature Partition Aggregation: A Fast Certified Defense Against a Union of Sparse Adversarial Attacks (arXiv 2023) [paper] [code]
  • ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms (arXiv 2023) [paper] [code]
  • Temporal Robustness against Data Poisoning (arXiv 2023) [paper]
  • A Systematic Evaluation of Backdoor Trigger Characteristics in Image Classification (arXiv 2023) [paper]
  • Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks (arXiv 2023) [paper] [code]
  • Backdoor Attacks with Input-unique Triggers in NLP (arXiv 2023) [paper]
  • Do Backdoors Assist Membership Inference Attacks? (arXiv 2023) [paper]
  • Black-box Backdoor Defense via Zero-shot Image Purification (arXiv 2023) [paper]
  • Influencer Backdoor Attack on Semantic Segmentation (arXiv 2023) [paper]
  • TrojViT: Trojan Insertion in Vision Transformers (arXiv 2023) [paper]
  • Mole Recruitment: Poisoning of Image Classifiers via Selective Batch Sampling (arXiv 2023) [paper] [code]
  • Poisoning Web-Scale Training Datasets is Practical (arXiv 2023) [paper]
  • Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization (arXiv 2023) [paper]
  • MAWSEO: Adversarial Wiki Search Poisoning for Illicit Online Promotion (arXiv 2023) [paper]
  • Launching a Robust Backdoor Attack under Capability Constrained Scenarios (arXiv 2023) [paper]
  • Certifiable Robustness for Naive Bayes Classifiers (arXiv 2023) [paper] [code]
  • Assessing Vulnerabilities of Adversarial Learning Algorithm through Poisoning Attacks (arXiv 2023) [paper] [code]
  • Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models (arXiv 2023) [paper] [code]
  • Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning (arXiv 2023) [paper]
  • BadSAM: Exploring Security Vulnerabilities of SAM via Backdoor Attacks (arXiv 2023) [paper]
  • Backdoor Learning on Sequence to Sequence Models (arXiv 2023) [paper]
  • ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger (arXiv 2023) [paper]
  • Evil from Within: Machine Learning Backdoors through Hardware Trojans (arXiv 2023) [paper]
  • Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only (ICLR 2023) [paper]
  • TrojText: Test-time Invisible Textual Trojan Insertion (ICLR 2023) [paper] [code]
  • Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning? (ICLR 2023) [paper] [code]
  • Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning (ICLR 2023) [paper] [code]
  • Incompatibility Clustering as a Defense Against Backdoor Poisoning Attacks (ICLR 2023) [paper] [code]
  • Revisiting the Assumption of Latent Separability for Backdoor Defenses (ICLR 2023) [paper] [code]
  • Few-shot Backdoor Attacks via Neural Tangent Kernels (ICLR 2023) [paper] [code]
  • SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency (ICLR 2023) [paper] [code]
  • Revisiting Graph Adversarial Attack and Defense From a Data Distribution Perspective (ICLR 2023) [paper] [code]
  • Provable Robustness against Wasserstein Distribution Shifts via Input Randomization (ICLR 2023) [paper]
  • Don’t forget the nullspace! Nullspace occupancy as a mechanism for out of distribution failure (ICLR 2023) [paper]
  • Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors (ICLR 2023) [paper] [code]
  • Towards Robustness Certification Against Universal Perturbations (ICLR 2023) [paper] [code]
  • Understanding Influence Functions and Datamodels via Harmonic Analysis (ICLR 2023) [paper]
  • Distilling Cognitive Backdoor Patterns within an Image (ICLR 2023) [paper] [code]
  • FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning (ICLR 2023) [paper] [code]
  • UNICORN: A Unified Backdoor Trigger Inversion Framework (ICLR 2023) [paper] [code]
  • Poisoning Language Models During Instruction Tuning (ICML 2023) [paper] [code]
  • Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning (ICML 2023) [paper] [code]
  • Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression (ICML 2023) [paper] [code]
  • Poisoning Generative Replay in Continual Learning to Promote Forgetting (ICML 2023) [paper] [code]
  • Exploring Model Dynamics for Accumulative Poisoning Discovery (ICML 2023) [paper] [code]
  • Data Poisoning Attacks Against Multimodal Encoders (ICML 2023) [paper] [code]
  • Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks (ICML 2023) [paper] [code]
  • Run-Off Election: Improved Provable Defense against Data Poisoning Attacks (ICML 2023) [paper] [code]
  • Revisiting Data-Free Knowledge Distillation with Poisoned Teachers (ICML 2023) [paper] [code]
  • Certified Robust Neural Networks: Generalization and Corruption Resistance (ICML 2023) [paper] [code]
  • Understanding Backdoor Attacks through the Adaptability Hypothesis (ICML 2023) [paper]
  • Robust Collaborative Learning with Linear Gradient Overhead (ICML 2023) [paper] [code]
  • Graph Contrastive Backdoor Attacks (ICML 2023) [paper]
  • Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023) [paper] [code]
  • Rethinking Backdoor Attacks (ICML 2023) [paper]
  • UMD: Unsupervised Model Detection for X2X Backdoor Attacks (ICML 2023) [paper]
  • LeadFL: Client Self-Defense against Model Poisoning in Federated Learning (ICML 2023) [paper] [code]
  • RDM-DC: Poisoning Resilient Dataset Condensation with Robust Distribution Matching (UAI 2023) [paper]
  • Backdoor Defense via Deconfounded Representation Learning (CVPR 2023) [paper] [code]
  • Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks (CVPR 2023) [paper]
  • CUDA: Convolution-based Unlearnable Datasets (CVPR 2023) [paper] [code]
  • Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger (CVPR 2023) [paper]
  • Single Image Backdoor Inversion via Robust Smoothed Classifiers (CVPR 2023) [paper] [code]
  • Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples (CVPR 2023) [paper] [code]
  • Backdoor Defense via Adaptively Splitting Poisoned Dataset (CVPR 2023) [paper] [code]
  • Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency (CVPR 2023) [paper] [code]
  • Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning (CVPR 2023) [paper] [code]
  • Color Backdoor: A Robust Poisoning Attack in Color Space (CVPR 2023) [paper]
  • How to Backdoor Diffusion Models? (CVPR 2023) [paper] [code]
  • Backdoor Cleansing With Unlabeled Data (CVPR 2023) [paper] [code]
  • MEDIC: Remove Model Backdoors via Importance Driven Cloning (CVPR 2023) [paper] [code]
  • Architectural Backdoors in Neural Networks (CVPR 2023) [paper]
  • Detecting Backdoors in Pre-Trained Encoders (CVPR 2023) [paper] [code]
  • The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection (CVPR 2023) [paper] [code]
  • Progressive Backdoor Erasing via Connecting Backdoor and Adversarial Attacks (CVPR 2023) [paper]
  • You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks? (CVPR 2023) [paper]
  • Don't FREAK Out: A Frequency-Inspired Approach to Detecting Backdoor Poisoned Samples in DNNs (CVPRW 2023) [paper]
  • Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers (S&P 2023) [paper]
  • SNAP: Efficient Extraction of Private Properties with Poisoning (S&P 2023) [paper] [code]
  • BayBFed: Bayesian Backdoor Defense for Federated Learning (S&P 2023) [paper]
  • RAB: Provable Robustness Against Backdoor Attacks (S&P 2023) [paper]
  • FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information (S&P 2023) [paper]
  • 3DFed: Adaptive and Extensible Framework for Covert Backdoor Attack in Federated Learning (S&P 2023) [paper]
  • BITE: Textual Backdoor Attacks with Iterative Trigger Injection (ACL 2023) [paper] [code]
  • Backdooring Neural Code Search (ACL 2023) [paper] [code]
  • Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark (ACL 2023) [paper] [code]
  • NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models (ACL 2023) [paper] [code]
  • Multi-target Backdoor Attacks for Code Pre-trained Models (ACL 2023) [paper] [code]
  • A Gradient Control Method for Backdoor Attacks on Parameter-Efficient Tuning (ACL 2023) [paper]
  • Defending against Insertion-based Textual Backdoor Attacks via Attribution (ACL 2023) [paper]
  • Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias (ACL 2023) [paper]
  • Defending Against Backdoor Attacks by Layer-wise Feature Analysis (PAKDD 2023) [paper] [code]
  • Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures (SIGIR 2023) [paper]
  • The Dark Side of Explanations: Poisoning Recommender Systems with Counterfactual Examples (SIGIR 2023) [paper]
  • How to Sift Out a Clean Data Subset in the Presence of Data Poisoning? (USENIX Security 2023) [paper] [code]
  • PORE: Provably Robust Recommender Systems against Data Poisoning Attacks (USENIX Security 2023) [paper]
  • On the Security Risks of Knowledge Graph Reasoning (USENIX Security 2023) [paper] [code]
  • BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT (NDSS 2023) [paper]
  • Exploiting Logic Locking for a Neural Trojan Attack on Machine Learning Accelerators (GLSVLSI 2023) [paper]
  • Energy-Latency Attacks to On-Device Neural Networks via Sponge Poisoning (SecTL 2023) [paper]
  • Beyond the Model: Data Pre-processing Attack to Deep Learning Models in Android Apps (SecTL 2023) [paper]

2022

  • Transferable Unlearnable Examples (arXiv 2022) [paper]
  • Natural Backdoor Datasets (arXiv 2022) [paper]
  • Dangerous Cloaking: Natural Trigger based Backdoor Attacks on Object Detectors in the Physical World (arXiv 2022) [paper]
  • Backdoor Attacks on Self-Supervised Learning (CVPR 2022) [paper] [code]
  • Poisons that are learned faster are more effective (CVPR 2022 Workshops) [paper]
  • Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning (ICLR 2022) [paper] [code]
  • Adversarial Unlearning of Backdoors via Implicit Hypergradient (ICLR 2022) [paper] [code]
  • Not All Poisons are Created Equal: Robust Training against Data Poisoning (ICML 2022) [paper] [code]
  • Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch (NeurIPS 2022) [paper] [code]
  • Hidden Poison: Machine unlearning enables camouflaged poisoning attacks (NeurIPS 2022 Workshop MLSW) [paper]
  • Policy Resilience to Environment Poisoning Attacks on Reinforcement Learning (NeurIPS 2022 Workshop MLSW) [paper]
  • Hard to Forget: Poisoning Attacks on Certified Machine Unlearning (AAAI 2022) [paper] [code]
  • Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks (AAAI 2022) [paper]
  • PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning (USENIX Security 2022) [paper]
  • Planting Undetectable Backdoors in Machine Learning Models (FOCS 2022) [paper]

2021

  • DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations (arXiv 2021) [paper]
  • How Robust Are Randomized Smoothing Based Defenses to Data Poisoning? (CVPR 2021) [paper]
  • Preventing Unauthorized Use of Proprietary Data: Poisoning for Secure Dataset Release (ICLR 2021 Workshop on Security and Safety in Machine Learning Systems) [paper]
  • Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching (ICLR 2021) [paper] [code]
  • Unlearnable Examples: Making Personal Data Unexploitable (ICLR 2021) [paper] [code]
  • Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks (ICLR 2021) [paper] [code]
  • LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition (ICLR 2021) [paper]
  • What Doesn't Kill You Makes You Robust(er): How to Adversarially Train against Data Poisoning (ICLR 2021 Workshop) [paper]
  • Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks (ICML 2021) [paper] [code]
  • Neural Tangent Generalization Attacks (ICML 2021) [paper]
  • SPECTRE: Defending Against Backdoor Attacks Using Robust Covariance Estimation (ICML 2021) [paper]
  • Adversarial Examples Make Strong Poisons (NeurIPS 2021) [paper]
  • Anti-Backdoor Learning: Training Clean Models on Poisoned Data (NeurIPS 2021) [paper] [code]
  • Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective (ICCV 2021) [paper] [code]
  • Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks (AAAI 2021) [paper] [code]
  • Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Tradeoff (ICASSP 2021) [paper]

2020

  • On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping (arXiv 2020) [paper] [code]
  • Backdooring and poisoning neural networks with image-scaling attacks (arXiv 2020) [paper]
  • Poisoned classifiers are not only backdoored, they are fundamentally broken (arXiv 2020) [paper] [code]
  • Invisible backdoor attacks on deep neural networks via steganography and regularization (TDSC 2020) [paper]
  • Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs (CVPR 2020) [paper] [code]
  • MetaPoison: Practical General-purpose Clean-label Data Poisoning (NeurIPS 2020) [paper]
  • Input-Aware Dynamic Backdoor Attack (NeurIPS 2020) [paper] [code]
  • How To Backdoor Federated Learning (AISTATS 2020) [paper]
  • Reflection backdoor: A natural backdoor attack on deep neural networks (ECCV 2020) [paper]
  • Practical Poisoning Attacks on Neural Networks (ECCV 2020) [paper]
  • Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases (ECCV 2020) [paper] [code]
  • Deep k-NN Defense Against Clean-Label Data Poisoning Attacks (ECCV 2020 Workshops) [paper] [code]
  • Radioactive data: tracing through training (ICML 2020) [paper]
  • Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks (ICML 2020) [paper]
  • Certified Robustness to Label-Flipping Attacks via Randomized Smoothing (ICML 2020) [paper]
  • An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks (KDD 2020) [paper] [code]
  • Hidden Trigger Backdoor Attacks (AAAI 2020) [paper] [code]

2019

  • Label-consistent backdoor attacks (arXiv 2019) [paper]
  • Poisoning Attacks with Generative Adversarial Nets (arXiv 2019) [paper]
  • TABOR: A Highly Accurate Approach to Inspecting and Restoring Trojan Backdoors in AI Systems (arXiv 2019) [paper]
  • BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain (IEEE Access 2019) [paper]
  • Data Poisoning against Differentially-Private Learners: Attacks and Defenses (IJCAI 2019) [paper]
  • DeepInspect: A Black-box Trojan Detection and Mitigation Framework for Deep Neural Networks (IJCAI 2019) [paper]
  • Sever: A Robust Meta-Algorithm for Stochastic Optimization (ICML 2019) [paper]
  • Learning with Bad Training Data via Iterative Trimmed Loss Minimization (ICML 2019) [paper]
  • Universal Multi-Party Poisoning Attacks (ICML 2019) [paper]
  • Transferable Clean-Label Poisoning Attacks on Deep Neural Nets (ICML 2019) [paper]
  • Defending Neural Backdoors via Generative Distribution Modeling (NeurIPS 2019) [paper]
  • Learning to Confuse: Generating Training Time Adversarial Data with Auto-Encoder (NeurIPS 2019) [paper]
  • The Curse of Concentration in Robust Learning: Evasion and Poisoning Attacks from Concentration of Measure (AAAI 2019) [paper]
  • Backdoor Attacks against Transfer Learning with Pre-trained Deep Learning Models (IEEE Transactions on Services Computing 2019) [paper]
  • Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks (IEEE Symposium on Security and Privacy 2019) [paper]
  • STRIP: a defence against trojan attacks on deep neural networks (ACSAC 2019) [paper]

2018

  • Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering (arXiv 2018) [paper]
  • Spectral Signatures in Backdoor Attacks (NeurIPS 2018) [paper]
  • Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks (NeurIPS 2018) [paper]
  • Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise (NeurIPS 2018) [paper]
  • Trojaning Attack on Neural Networks (NDSS 2018) [paper]
  • Label Sanitization Against Label Flipping Poisoning Attacks (ECML PKDD 2018 Workshops) [paper]
  • Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring (USENIX Security 2018) [paper]

2017

  • Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning (arXiv 2017) [paper]
  • Generative Poisoning Attack Method Against Neural Networks (arXiv 2017) [paper]
  • Delving into Transferable Adversarial Examples and Black-box Attacks (ICLR 2017) [paper]
  • Understanding Black-box Predictions via Influence Functions (ICML 2017) [paper] [code]
  • Certified Defenses for Data Poisoning Attacks (NeurIPS 2017) [paper]

2016

  • Data Poisoning Attacks on Factorization-Based Collaborative Filtering (NeurIPS 2016) [paper]

2015

  • Is Feature Selection Secure against Training Data Poisoning? (ICML 2015) [paper]
  • Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners (AAAI 2015) [paper]
