<!DOCTYPE html>
<html lang="en-US">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1">
<title>Dong Zhang</title>
<meta name="description" content="The homepage of Dong Zhang">
<!-- <link rel="icon" href="logo.jpg"> -->
<link rel="preload" href="assets/css/0.styles.bf6ecb71.css" as="style"><link rel="preload" href="assets/js/app.ef4e7843.js" as="script"><link rel="preload" href="assets/js/2.5b5922e0.js" as="script"><link rel="preload" href="assets/js/8.c1e4c3b9.js" as="script"><link rel="preload" href="assets/js/4.dc64499e.js" as="script"><link rel="preload" href="assets/js/5.d752ec91.js" as="script"><link rel="prefetch" href="assets/js/3.677d4f8f.js"><link rel="prefetch" href="assets/js/6.0a8475de.js"><link rel="prefetch" href="assets/js/7.4f90f5b5.js"><link rel="prefetch" href="assets/js/9.0cc5bf5a.js">
<link rel="stylesheet" href="assets/css/0.styles.bf6ecb71.css">
</head>
<body>
<div id="app" data-server-rendered="true"><div class="theme-container no-sidebar home-page"><header class="navbar"><div class="sidebar-button"></div> <a href="/" class="home-link router-link-exact-active router-link-active"><!----> <span class="site-name">Dong Zhang</span></a> <div class="links"><!----> <!----></div></header> <div class="sidebar-mask"></div> <aside class="sidebar"><!----> <!----> </aside> <main class="page"> <div class="theme-default-content content__default"><div class="profile"><div class="image"><img src="profile.jpeg" alt></div> <div class="info"><div class="name">
Dong Zhang (张栋)
</div>
<div class="bio"><p>Master student @ Fudan University
<a target="_blank" href="projects/cv_zhangdong.pdf" title="Download my CV in PDF"><font size="3.5em" color="">[<u>Resume</u>]</font></a></p><p>[email protected]</div>
<div class="socials">
<div><a href="https://github.com/0nutation" target="_blank">GitHub</a></div>  / 
<div><a href="https://scholar.google.com/citations?user=ScVbeu0AAAAJ" target="_blank">Google Scholar</a></div>  / 
<div><a href="https://twitter.com/dongzha35524835" target="_blank">Twitter</a></div>  / 
<div><a href="https://www.linkedin.com/in/dong-zhang-33481520b/" target="_blank">Linkedin</a></div>  / 
<div><a href="https://www.zhihu.com/people/nutation" target="_blank">Zhihu</a></div>
</div>
<div class="contact"><div title="Contact me" class="email"></div></div></div></div>
<!-- <div><a href="https://github.com/0nutation" target="_blank"><img src="icons/github.svg" alt="GitHub" title="GitHub"></a></div>
<div><a href="https://scholar.google.com/citations?user=ScVbeu0AAAAJ" target="_blank"><img src="icons/google_scholar.svg" alt="Google Scholar" title="Google Scholar"></a></div>
<div><a href="https://twitter.com/dongzha35524835" target="_blank"><img src="icons/twitter.png" alt="Twitter" title="Twitter"></a></div>
<div><a href="https://www.linkedin.com/in/dong-zhang-33481520b/" target="_blank"><img src="icons/linkedin.svg" alt="linkedin" title="linkedin"></a></div>
<div><a href="https://www.zhihu.com/people/nutation" target="_blank"><img src="icons/zhihu.png" alt="Zhihu" title="Zhihu"></a></div>
</div>
<div class="contact"><div title="Contact me" class="email"></div></div></div></div> -->
<h2 id="about-me"><a href="#about-me" class="header-anchor">#</a> About Me</h2>
<p>Hi! I am a final-year M.S. student of <a href="https://nlp.fudan.edu.cn/" target="_blank" rel="noopener noreferrer">FudanNLPLab</a> at <a href="https://www.fudan.edu.cn/en/" target="_blank" rel="noopener noreferrer">Fudan University</a>,
supervised by Prof. <a href="https://cs.fudan.edu.cn/3f/aa/c25909a278442/page.htm" target="_blank" rel="noopener noreferrer">Yaqian Zhou</a> and Prof. <a href="https://xpqiu.github.io/" target="_blank" rel="noopener noreferrer">Xipeng Qiu</a>.
I obtained my B.S. degree from Fudan University in 2022, advised by Prof. <a href="https://www.linkedin.com/in/fuliang-weng-6448158/" target="_blank" rel="noopener noreferrer">Fuliang Weng</a>. Previously, I interned at <a href="https://www.bytedance.com/en/" target="_blank" rel="noopener noreferrer">Bytedance</a> AI Lab, mentored by <a href="https://reneeye.github.io/" target="_blank" rel="noopener noreferrer">Rong Ye</a>.</p>
<p>My research interests focus on <strong>End-to-end Voice Agents, Speech Foundation Models, and Multi-Modal LLMs</strong>.
I have developed several foundation models for speech, including <a href="https://arxiv.org/abs/2305.11000" target="_blank" rel="noopener noreferrer">SpeechGPT</a>, <a href="https://0nutation.github.io/SpeechGPT2.github.io/" target="_blank" rel="noopener noreferrer">SpeechGPT2</a>, <a href="https://arxiv.org/abs/2308.16692" target="_blank" rel="noopener noreferrer">SpeechTokenizer</a>, and <a href="https://arxiv.org/abs/2404.05600" target="_blank" rel="noopener noreferrer">SpeechAlign</a>.</p>
<p><strong>I am expected to graduate in June 2025 and am seeking Ph.D. and job opportunities worldwide. I am also open to academic collaborations</strong>. Please feel free to contact me at <a href="mailto:[email protected]">[email protected]</a> if you are interested!</p>
<h2 id="news"><a href="#news" class="header-anchor">#</a> News</h2>
<ul>
<li><p><strong>[2024.9]</strong> Our SpeechAlign was accepted to NeurIPS 2024 and InferAligner was accepted to EMNLP 2024. </li>
<li><p><strong>[2024.8]</strong> Invited talks at Nvidia, Microsoft, Bytedance, SJTU X-Lance, and Agora.ai. Topic: Towards Human-like Spoken Chatbot: SpeechGPT Series.</li>
<li><p><strong>[2024.7]</strong> We released <a href="https://0nutation.github.io/SpeechGPT2.github.io/" target="_blank" rel="noopener noreferrer"><strong>SpeechGPT2</strong></a>, an emotionally intelligent end-to-end spoken dialogue LLM. </li>
<li><p><strong>[2024.7]</strong> We won first place in <a href="https://dcase.community/challenge2024/task-automated-audio-captioning-results#jung_cmu_t6_2024" target="_blank" rel="noopener noreferrer"><strong>DCASE 2024 Challenge Task 6</strong></a>. </li>
<li><p><strong>[2024.5]</strong> Three papers accepted to the ACL 2024 main conference! </li>
<li><p><strong>[2024.5]</strong> Invited talk at the MIT SLS group about SpeechTokenizer.</li>
<li><p><strong>[2024.4]</strong> We released <a href="https://arxiv.org/pdf/2404.05600.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechAlign</strong></a>, the first to apply RLHF to align speech language models with human preferences! </li>
<li><p><strong>[2024.2]</strong> Invited talk about SpeechGPT series works at <a href="https://www.airmeet.com/e/b2157610-cfe7-11ee-93ec-3b2ce56d50d2" target="_blank" rel="noopener noreferrer"><strong>AGI Leap Summit 2024</strong></a> hosted by <a href="https://superagi.com/" target="_blank" rel="noopener noreferrer"><strong>SuperAGI</strong></a>. </li>
<li><p><strong>[2024.2]</strong> We released <a href="https://arxiv.org/pdf/2402.12226.pdf" target="_blank" rel="noopener noreferrer"><strong>AnyGPT</strong></a>, a unified multi-modal LLM for text, image, speech and music! </li>
<li><p><strong>[2024.1]</strong> We released <a href="https://arxiv.org/pdf/2401.13527.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechGPT-Gen</strong></a>, an 8B speech LLM efficient in semantic and perceptual information modeling. </li>
<li><p><strong>[2024.1]</strong> We proposed <a href="https://arxiv.org/pdf/2401.11206.pdf" target="_blank" rel="noopener noreferrer"><strong>InferAligner</strong></a>, an effective training-free LLM alignment method. </li>
<li><p><strong>[2024.1]</strong> Our SpeechTokenizer was accepted to ICLR 2024! See you in Vienna! </li>
<li><p><strong>[2024.1]</strong> We released <a href="https://arxiv.org/pdf/2401.03945.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechAgents</strong></a>, the first multi-modal multi-agent system. </li>
<li><p><strong>[2023.10]</strong> Two papers accepted to EMNLP 2023! </li>
<li><p><strong>[2023.8]</strong> We released <a href="https://arxiv.org/pdf/2308.16692.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechTokenizer</strong></a>, a speech tokenizer designed for speech language models. </li>
<li><p><strong>[2023.5]</strong> We released <a href="https://arxiv.org/pdf/2305.11000.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechGPT</strong></a>, a conversational speech large language model. </li>
<li><p><strong>[2023.5]</strong> One first-author paper accepted to ACL 2023 (Findings)! </li>
<li><p><strong>[2022.9]</strong> I joined FudanNLPLab as a master's student. </li>
</ul>
<h2 id="Representative Publications"><a href="Representative Publications" class="header-anchor">#</a>Research</h2>
(*: Equal contribution)
<br>
<div class="text-center">
<p align="center">An overview of my research on building multi-modal large language models.</p>
<img id="teaser" width="110%" src="projects/roadmap.png">
</div>
<br>
<div class="md-card">
<div class="card-image"><img src="projects/speechgpt.png" alt></div>
<div class="card-content">
<p><strong>SpeechGPT: Empowering large language models with intrinsic cross-modal conversational abilities</strong></p>
<p><em><strong>Dong Zhang</strong>, Shimin Li, Xin Zhang, Jun Zhan, Pengyu Wang, Yaqian Zhou, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2305.11000.pdf" target="_blank" rel="noopener noreferrer"><strong>EMNLP 2023 Findings</strong></a>]
[<a href="https://github.com/0nutation/SpeechGPT" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/0nutation/SpeechGPT"></a>]
[<a href="https://0nutation.github.io/SpeechGPT.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
This work is a GitHub Trending project and
has been featured by various media outlets and forums, such as <a href="https://mp.weixin.qq.com/s/KpdOUdeYSVzrBtfuqFbjaQ" target="_blank" rel="noopener noreferrer"><strong>Heart of Machine</strong></a>,
<a href="https://twitter.com/_akhaliq/status/1659426578793725953" target="_blank" rel="noopener noreferrer"><strong>Twitter</strong></a>, and <a href="https://www.youtube.com/watch?v=DD1e6FJ-If4" target="_blank" rel="noopener noreferrer">YouTube</a>.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/speechtokenizer.png" alt></div>
<div class="card-content">
<p><strong>SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models</strong></p>
<p><em><strong>Dong Zhang<sup>*</sup></strong>, Xin Zhang<sup>*(order is random)</sup>, Shimin Li, Yaqian Zhou, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2308.16692.pdf" target="_blank" rel="noopener noreferrer"><strong>ICLR 2024</strong></a>]
[<a href="https://github.com/ZhangXInFD/SpeechTokenizer/" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/ZhangXInFD/SpeechTokenizer"></a>]
[<a href="https://0nutation.github.io/SpeechTokenizer.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
SpeechTokenizer unifies semantic tokens and acoustic tokens, and we build USLM (unified speech language model)<img src="https://img.shields.io/github/stars/0nutation/USLM"> on top of it.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/speechalign.png" alt></div>
<div class="card-content">
<p><strong>SpeechAlign: Aligning Speech Generation to Human Preferences</strong></p>
<p><em><strong>Dong Zhang<sup>*</sup></strong>, Zhaowei Li<sup>*</sup>, Shimin Li, Xin Zhang, Pengyu Wang, Yaqian Zhou, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2404.05600.pdf" target="_blank" rel="noopener noreferrer"><strong>NeurIPS 2024</strong></a>]
[<a href="https://github.com/0nutation/SpeechGPT" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/0nutation/SpeechGPT"></a>]
[<a href="https://0nutation.github.io/SpeechAlign.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
SpeechAlign is the first work to apply RLHF to align speech language models with human preferences, and it proposes an effective iterative self-improvement strategy that converts weak speech language models into stronger ones.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/speechgptgen.png" alt></div>
<div class="card-content">
<p><strong>SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation</strong></p>
<p><em><strong>Dong Zhang<sup>*</sup></strong>, Xin Zhang<sup>*</sup>, Jun Zhan, Shimin Li, Yaqian Zhou, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2401.13527.pdf" target="_blank" rel="noopener noreferrer"><strong>Preprint</strong></a>]
[<a href="https://github.com/0nutation/SpeechGPT" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/0nutation/SpeechGPT"></a>]
[<a href="https://0nutation.github.io/SpeechGPT-Gen.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
We propose the Chain-of-Information speech generation method and scale the model size up to 8B to build SpeechGPT-Gen, which can perform speech-to-speech dialogue in any voice you want.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/speechagents.png" alt></div>
<div class="card-content">
<p><strong>SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems</strong></p>
<p><em><strong>Dong Zhang</strong>, Zhaowei Li, Pengyu Wang, Xin Zhang, Yaqian Zhou, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2401.03945.pdf" target="_blank" rel="noopener noreferrer"><strong>Preprint</strong></a>]
[<a href="https://github.com/0nutation/SpeechAgents" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/0nutation/SpeechAgents"></a>]
[<a href="https://0nutation.github.io/SpeechAgents.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
SpeechAgents is the first multi-modal multi-agent system.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/dub.png" alt></div>
<div class="card-content">
<p><strong>DUB: Discrete Unit Back-translation for Speech Translation</strong></p>
<p><em><strong>Dong Zhang</strong>, Rong Ye, Tom Ko, Mingxuan Wang, Yaqian Zhou</em></p>
<p>
[<a href="https://arxiv.org/pdf/2305.11411.pdf" target="_blank" rel="noopener noreferrer"><strong>ACL 2023 Findings</strong></a>]
[<a href="https://github.com/0nutation/DUB" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/0nutation/DUB"></a>]
[<a href="https://aclanthology.org/2023.findings-acl.447.mp4" target="_blank" rel="noopener noreferrer">video</a>]
</p>
<p>
DUB is the first to use discrete speech representations as input for speech translation and to explore NLP techniques such as mBART pretraining and back-translation on top of them.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/anygpt.png" alt></div>
<div class="card-content">
<p><strong>AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling</strong></p>
<p><em>Jun Zhan<sup>*</sup>, Junqi Dai<sup>*</sup>, Jiasheng Ye<sup>*</sup>, Yunhua Zhou, <strong>Dong Zhang</strong>, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yugang Jiang, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2402.12226.pdf" target="_blank" rel="noopener noreferrer"><strong>ACL 2024</strong></a>]
[<a href="https://github.com/OpenMOSS/AnyGPT" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/OpenMOSS/AnyGPT"></a>]
[<a href="https://junzhan2000.github.io/AnyGPT.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
AnyGPT is our new exploration of discrete-representation-based multimodal LLMs, following SpeechGPT. It unifies text, image, speech, and music in one model and can perform any-to-any multimodal conversation.
</p>
</div>
</div>
<h2 id="Full Publications"><a href="Full Publications" class="header-anchor">#</a>Full Publications</h2>
<h3 id="2024"><a href="2024" class="header-anchor">#</a>2024</h3>
<ul>
<li><p><a href="https://arxiv.org/pdf/2404.05600.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechAlign: Aligning Speech Generation to Human Preferences</strong></a> <br><strong>Dong Zhang<sup>*</sup></strong>, Zhaowei Li<sup>*</sup>, Shimin Li, Xin Zhang, Pengyu Wang, Yaqian Zhou, Xipeng Qiu. <br>NeurIPS 2024</li>
<li><p><a href="https://arxiv.org/pdf/2402.12226.pdf" target="_blank" rel="noopener noreferrer"><strong>AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling</strong></a> <br>Jun Zhan<sup>*</sup>, Junqi Dai<sup>*</sup>, Jiasheng Ye<sup>*</sup>, Yunhua Zhou, <strong>Dong Zhang</strong>, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yugang Jiang, Xipeng Qiu. <br>ACL 2024 </li>
<li><p><a href="https://arxiv.org/pdf/2402.06894.pdf" target="_blank" rel="noopener noreferrer"><strong>GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators</strong></a> <br>Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, <strong>Dong Zhang</strong>, Zhehuai Chen, Eng Siong Chng. <br>ACL 2024 </li>
<li><p><a href="https://arxiv.org/pdf/2401.13527.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation</strong></a> <br><strong>Dong Zhang<sup>*</sup></strong>, Xin Zhang<sup>*</sup>, Jun Zhan, Shimin Li, Yaqian Zhou, Xipeng Qiu. <br>Preprint</li>
<li><p><a href="https://arxiv.org/pdf/2401.11206.pdf" target="_blank" rel="noopener noreferrer"><strong>InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance</strong></a> <br>Pengyu Wang, <strong>Dong Zhang</strong>, Linyang Li, Chenkun Tan, Xinghao Wang, Ke Ren, Botian Jiang, Xipeng Qiu. <br>EMNLP 2024 </li>
<li><p><a href="https://arxiv.org/pdf/2401.06071.pdf" target="_blank" rel="noopener noreferrer"><strong>GroundingGPT: Language Enhanced Multi-modal Grounding Model</strong></a> <br>Zhaowei Li, Qi Xu, <strong>Dong Zhang</strong>, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Van Tu Vu, Zhida Huang, Tao Wang. <br>ACL 2024 </li>
<li><p><a href="https://arxiv.org/pdf/2401.03945.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems</strong></a> <br><strong>Dong Zhang</strong>, Zhaowei Li, Pengyu Wang, Xin Zhang, Yaqian Zhou, Xipeng Qiu. <br>Preprint</li>
</ul>
<h3 id="2023"><a href="2024" class="header-anchor">#</a>2023</h3>
<ul>
<li><p><a href="https://arxiv.org/pdf/2310.08903.pdf" target="_blank" rel="noopener noreferrer"><strong>SeqXGPT: Sentence-Level AI-Generated Text Detection</strong></a> <br>Pengyu Wang, Linyang Li, Ke Ren, Botian Jiang, <strong>Dong Zhang</strong>, Xipeng Qiu <br>EMNLP 2023</li>
<li><p><a href="https://arxiv.org/pdf/2308.16692.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models</strong></a> <br><strong>Dong Zhang<sup>*</sup></strong>, Xin Zhang<sup>*(order is random)</sup>, Shimin Li, Yaqian Zhou, Xipeng Qiu. <br>ICLR 2024</li>
<li><p><a href="https://arxiv.org/pdf/2305.11411.pdf" target="_blank" rel="noopener noreferrer"><strong>DUB: Discrete Unit Back-translation for Speech Translation</strong></a> <br><strong>Dong Zhang</strong>, Rong Ye, Tom Ko, Mingxuan Wang, Yaqian Zhou. <br>ACL 2023(findings) </li>
<li><p><a href="https://arxiv.org/pdf/2305.11000.pdf" target="_blank" rel="noopener noreferrer"><strong>Speechgpt: Empowering large language models with intrinsic cross-modal conversational abilities</strong></a> <br><strong>Dong Zhang</strong>, Shimin Li, Xin Zhang, Jun Zhan, Pengyu Wang, Yaqian Zhou, Xipeng Qiu. <br>EMNLP 2023(findings)</li>
</ul>
<h2 id="talks"><a href="#talks" class="header-anchor">#</a> Invited Talks</h2>
<ul>
<li><p><strong>Towards Human-like Spoken Chatbot: SpeechGPT Series</strong> <br>
<a href="" target="_blank" rel="noopener noreferrer"><strong>NTU Singapore</strong></a>(2024/9/17),
<a href="" target="_blank" rel="noopener noreferrer"><strong>NVIDIA</strong></a>(2024/8/15),
<a href="" target="_blank" rel="noopener noreferrer"><strong>Microsoft</strong></a>(2024/8/5),
<a href="https://www.bilibili.com/video/BV1FJ4m137ZB/?spm_id_from=333.337.search-card.all.click" target="_blank" rel="noopener noreferrer"><strong>SJTU X-Lance</strong></a>(2024/6/12),
<a href="" target="_blank" rel="noopener noreferrer"><strong>Bytedance</strong></a>(2024/6/6),
<a href="" target="_blank" rel="noopener noreferrer"><strong>Agora.ai</strong></a>(2024/5/29),
<a href="https://www.airmeet.com/e/b2157610-cfe7-11ee-93ec-3b2ce56d50d2" target="_blank" rel="noopener noreferrer"><strong>AGI Leap Summit 2024</strong></a> hosted by <a href="https://superagi.com/" target="_blank" rel="noopener noreferrer"><strong>SuperAGI</strong></a>(2024/2/29)
<li><p><strong>SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models</strong> <br>
<a href="https://groups.csail.mit.edu/sls/" target="_blank" rel="noopener noreferrer"><strong>MIT CSAIL SLS</strong></a>(2024/5/9)
</ul>
<h2 id="education"><a href="#education" class="header-anchor">#</a> Education</h2>
<ul>
<li><p><strong>Fudan University</strong> <span style="color:gray;float:right;">Sept 2022 - Jun 2025</span> <br>
M.S. in Computer Science</p></li>
<li><p><strong>Fudan University</strong> <span style="color:gray;float:right;">Sept 2018 - Jun 2022</span> <br>
B.S. in Electronic Engineering</p></li>
</ul>
<h2 id="internship"><a href="#internship" class="header-anchor">#</a> Internship</h2>
<ul>
<li><p><strong>Bytedance AI Lab</strong> <span style="color:gray;float:right;">Apr 2022 - Jun 2023</span> <br>
Research on speech translation</p></li>
</ul>
<h2 id="service"><a href="#service" class="header-anchor">#</a> Service</h2>
<ul>
<li><p><strong>Reviewer:</strong> <br>
EMNLP (2023, 2024), ACL (2024), NeurIPS (2024)</p></li>
</ul>
<!-- <table
style="width:100%;border:0px;border-spacing:0px;border-collapse:separate;margin-right:auto;margin-left:auto;">
<tbody>
<tr>
<td style="padding:20px;width:30%;vertical-align:middle">
<script type='text/javascript' id='clustrmaps'
src=''></script>
</td>
</tr>
</tbody>
</table> -->
<footer class="page-edit"><!----> <!----></footer> <!----> </main></div><div class="global-ui"></div></div>
<script src="assets/js/app.ef4e7843.js" defer></script><script src="assets/js/2.5b5922e0.js" defer></script><script src="assets/js/8.c1e4c3b9.js" defer></script><script src="assets/js/4.dc64499e.js" defer></script><script src="assets/js/5.d752ec91.js" defer></script>
</body>
</html>