From f78691d66c63543c676f27cbfa26e5f3915cb7cd Mon Sep 17 00:00:00 2001
From: Sizhe Chen <44351170+Sizhe-Chen@users.noreply.github.com>
Date: Thu, 31 Oct 2024 22:16:06 -0700
Subject: [PATCH] Update about.md: spell out instruction names in SecAlign summary

---
 _pages/about.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/_pages/about.md b/_pages/about.md
index b32de25c737d2..5d819bd8a04e0 100644
--- a/_pages/about.md
+++ b/_pages/about.md
@@ -25,7 +25,7 @@ Invited Talks
 
 Selected Publications
 ------
-+ Aligning LLMs to Be Robust Against Prompt Injection <br/> **Sizhe Chen**, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, Chuan Guo <br/> **SecAlign** formulates prompt injection defense as preference optimization, and solves it via existing alignment training. From a SFT dataset, we build our preference dataset, where the "input" contains a benign instruction I, a benign data, and an injected instruction I'; the "desirable response" responds to I; and the "undesirable response" responds to I'. The strong [GCG attack](https://arxiv.org/abs/2307.15043) gets only 2% success rate on SecAlign Mistral-7B. <br/> [ArXiv Preprint](https://arxiv.org/abs/2410.05451) \| [Code](https://github.com/facebookresearch/SecAlign)
++ Aligning LLMs to Be Robust Against Prompt Injection <br/> **Sizhe Chen**, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, Chuan Guo <br/> **SecAlign** formulates prompt injection defense as preference optimization and solves it via existing alignment training. From an SFT dataset, we build our preference dataset, where the "input" contains a benign instruction, benign data, and an injected instruction; the "desirable response" responds to the benign instruction; and the "undesirable response" responds to the injected instruction. The strong [GCG attack](https://arxiv.org/abs/2307.15043) achieves only a 2% success rate on SecAlign Mistral-7B. <br/> [ArXiv Preprint](https://arxiv.org/abs/2410.05451) \| [Code](https://github.com/facebookresearch/SecAlign)
 + StruQ: Defending Against Prompt Injection with Structured Queries <br/> **Sizhe Chen**, Julien Piet, Chawin Sitawarin, David Wagner <br/> [USENIX Security'25](http://arxiv.org/abs/2402.06363) \| [Code](https://github.com/Sizhe-Chen/StruQ)
 + One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks <br/> Shutong Wu\*, **Sizhe Chen\***, Cihang Xie, Xiaolin Huang <br/> [ICLR'23 Spotlight](https://openreview.net/forum?id=p7G8t5FVn2h) \| [Code](https://github.com/cychomatica/One-Pixel-Shotcut)
 + Adversarial Attack on Attackers: Post-Process to Mitigate Black-Box Score-Based Attacks <br/> **Sizhe Chen**, Zhehao Huang, Qinghua Tao, Yingwen Wu, Cihang Xie, Xiaolin Huang <br/> [NeurIPS'22](https://openreview.net/forum?id=7hhH95QKKDX) \| [Code](https://github.com/Sizhe-Chen/AAA)