<!DOCTYPE html>
<html lang="en-US">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1">
<title>Dong Zhang</title>
<meta name="description" content="The homepage of Dong Zhang">
<!-- <link rel="icon" href="logo.jpg"> -->
<link rel="preload" href="assets/css/0.styles.bf6ecb71.css" as="style"><link rel="preload" href="assets/js/app.ef4e7843.js" as="script"><link rel="preload" href="assets/js/2.5b5922e0.js" as="script"><link rel="preload" href="assets/js/8.c1e4c3b9.js" as="script"><link rel="preload" href="assets/js/4.dc64499e.js" as="script"><link rel="preload" href="assets/js/5.d752ec91.js" as="script"><link rel="prefetch" href="assets/js/3.677d4f8f.js"><link rel="prefetch" href="assets/js/6.0a8475de.js"><link rel="prefetch" href="assets/js/7.4f90f5b5.js"><link rel="prefetch" href="assets/js/9.0cc5bf5a.js">
<link rel="stylesheet" href="assets/css/0.styles.bf6ecb71.css">
</head>
<body>
<div id="app" data-server-rendered="true"><div class="theme-container no-sidebar home-page"><header class="navbar"><div class="sidebar-button"></div> <a href="/" class="home-link router-link-exact-active router-link-active"><!----> <span class="site-name">Dong Zhang</span></a> <div class="links"><!----> <!----></div></header> <div class="sidebar-mask"></div> <aside class="sidebar"><!----> <!----> </aside> <main class="page"> <div class="theme-default-content content__default"><div class="profile"><div class="image"><img src="profile.jpeg" alt></div> <div class="info"><div class="name">
Dong Zhang (张栋)
</div>
<div class="bio"><p>Master student @ Fudan University
<a target="_blank" href="projects/cv_zhangdong.pdf" title="Download my CV in PDF"><font size="3.5em" color="">[<u>Resume</u>]</font></a></p><p>[email protected]</div>
<div class="socials">
<div><a href="https://github.com/0nutation" target="_blank">GitHub</a></div>  / 
<div><a href="https://scholar.google.com/citations?user=ScVbeu0AAAAJ" target="_blank">Google Scholar</a></div>  / 
<div><a href="https://twitter.com/dongzha35524835" target="_blank">Twitter</a></div>  / 
<div><a href="https://www.linkedin.com/in/dong-zhang-33481520b/" target="_blank">Linkedin</a></div>  / 
<div><a href="https://www.zhihu.com/people/nutation" target="_blank">Zhihu</a></div>
</div>
<div class="contact"><div title="Contact me" class="email"></div></div></div></div>
<!-- <div><a href="https://github.com/0nutation" target="_blank"><img src="icons/github.svg" alt="GitHub" title="GitHub"></a></div>
<div><a href="https://scholar.google.com/citations?user=ScVbeu0AAAAJ" target="_blank"><img src="icons/google_scholar.svg" alt="Google Scholar" title="Google Scholar"></a></div>
<div><a href="https://twitter.com/dongzha35524835" target="_blank"><img src="icons/twitter.png" alt="Twitter" title="Twitter"></a></div>
<div><a href="https://www.linkedin.com/in/dong-zhang-33481520b/" target="_blank"><img src="icons/linkedin.svg" alt="linkedin" title="linkedin"></a></div>
<div><a href="https://www.zhihu.com/people/nutation" target="_blank"><img src="icons/zhihu.png" alt="Zhihu" title="Zhihu"></a></div>
</div>
<div class="contact"><div title="Contact me" class="email"></div></div></div></div> -->
<h2 id="about-me"><a href="#about-me" class="header-anchor">#</a> About Me</h2>
<p>Hi! I am a final-year M.S. student of <a href="https://nlp.fudan.edu.cn/" target="_blank" rel="noopener noreferrer">FudanNLPLab</a> at <a href="https://www.fudan.edu.cn/en/" target="_blank" rel="noopener noreferrer">Fudan University</a>,
supervised by Prof. <a href="https://cs.fudan.edu.cn/3f/aa/c25909a278442/page.htm" target="_blank" rel="noopener noreferrer">Yaqian Zhou</a> and Prof. <a href="https://xpqiu.github.io/" target="_blank" rel="noopener noreferrer">Xipeng Qiu</a>.
I obtained my B.S. degree from Fudan University in 2022, advised by Prof. <a href="https://www.linkedin.com/in/fuliang-weng-6448158/" target="_blank" rel="noopener noreferrer">Fuliang Weng</a>. Previously, I interned at <a href="https://www.bytedance.com/en/" target="_blank" rel="noopener noreferrer">Bytedance</a> AI Lab, mentored by <a href="https://reneeye.github.io/" target="_blank" rel="noopener noreferrer">Rong Ye</a>.</p>
<p>My research interests focus on <strong>End-to-end Voice Agents, Speech Foundation Models, and Multi-Modal LLMs</strong>.
I have developed several foundation models for speech, including <a href="https://arxiv.org/abs/2305.11000" target="_blank" rel="noopener noreferrer">SpeechGPT</a>, <a href="https://0nutation.github.io/SpeechGPT2.github.io/" target="_blank" rel="noopener noreferrer">SpeechGPT2</a>, <a href="https://arxiv.org/abs/2308.16692" target="_blank" rel="noopener noreferrer">SpeechTokenizer</a>, and <a href="https://arxiv.org/abs/2404.05600" target="_blank" rel="noopener noreferrer">SpeechAlign</a>.</p>
<p><strong>I am expected to graduate in June 2025 and am seeking Ph.D. and job opportunities worldwide. I am also open to academic collaborations</strong>. Please feel free to contact me at <a href="mailto:[email protected]">[email protected]</a> if you are interested!</p>
<h2 id="news"><a href="#news" class="header-anchor">#</a> News</h2>
<ul>
<li><p><strong>[2024.9]</strong> Our SpeechAlign was accepted to NeurIPS 2024 and InferAligner was accepted to EMNLP 2024. </li>
<li><p><strong>[2024.8]</strong> Invited talks at Nvidia, Microsoft, Bytedance, SJTU X-Lance, and Agora.ai. Topic: Towards Human-like Spoken Chatbot: SpeechGPT Series.</li>
<li><p><strong>[2024.7]</strong> We released <a href="https://0nutation.github.io/SpeechGPT2.github.io/" target="_blank" rel="noopener noreferrer"><strong>SpeechGPT2</strong></a>, an emotionally intelligent end-to-end spoken dialogue LLM. </li>
<li><p><strong>[2024.7]</strong> We won first place in <a href="https://dcase.community/challenge2024/task-automated-audio-captioning-results#jung_cmu_t6_2024" target="_blank" rel="noopener noreferrer"><strong>DCASE 2024 Challenge Task 6</strong></a>. </li>
<li><p><strong>[2024.5]</strong> Three papers accepted to the ACL 2024 main conference! </li>
<li><p><strong>[2024.5]</strong> Invited talk at the MIT SLS group about SpeechTokenizer.</li>
<li><p><strong>[2024.4]</strong> We released <a href="https://arxiv.org/pdf/2404.05600.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechAlign</strong></a>, the first to apply RLHF to align speech language models with human preferences! </li>
<li><p><strong>[2024.2]</strong> Invited talk about SpeechGPT series works at <a href="https://www.airmeet.com/e/b2157610-cfe7-11ee-93ec-3b2ce56d50d2" target="_blank" rel="noopener noreferrer"><strong>AGI Leap Summit 2024</strong></a> hosted by <a href="https://superagi.com/" target="_blank" rel="noopener noreferrer"><strong>SuperAGI</strong></a>. </li>
<li><p><strong>[2024.2]</strong> We released <a href="https://arxiv.org/pdf/2402.12226.pdf" target="_blank" rel="noopener noreferrer"><strong>AnyGPT</strong></a>, a unified multi-modal LLM for text, image, speech and music! </li>
<li><p><strong>[2024.1]</strong> We released <a href="https://arxiv.org/pdf/2401.13527.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechGPT-Gen</strong></a>, an 8B speech LLM efficient in semantic and perceptual information modeling. </li>
<li><p><strong>[2024.1]</strong> We proposed <a href="https://arxiv.org/pdf/2401.11206.pdf" target="_blank" rel="noopener noreferrer"><strong>InferAligner</strong></a>, an effective training-free LLM alignment method. </li>
<li><p><strong>[2024.1]</strong> Our SpeechTokenizer was accepted to ICLR 2024! See you in Vienna! </li>
<li><p><strong>[2024.1]</strong> We released <a href="https://arxiv.org/pdf/2401.03945.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechAgents</strong></a>, the first multi-modal multi-agent system. </li>
<li><p><strong>[2023.10]</strong> Two papers accepted to EMNLP 2023! </li>
<li><p><strong>[2023.8]</strong> We released <a href="https://arxiv.org/pdf/2308.16692.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechTokenizer</strong></a>, a speech tokenizer designed for speech language models. </li>
<li><p><strong>[2023.5]</strong> We released <a href="https://arxiv.org/pdf/2305.11000.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechGPT</strong></a>, a conversational speech large language model. </li>
<li><p><strong>[2023.5]</strong> One first-author paper accepted to ACL 2023 (Findings)! </li>
<li><p><strong>[2022.9]</strong> I joined FudanNLPLab as a master's student. </li>
</ul>
<h2 id="Representative Publications"><a href="Representative Publications" class="header-anchor">#</a>Research</h2>
(*: Equal contribution)
<br>
<div class="text-center">
<p align="center">An overview of my research on building multi-modal large language models.</p>
<img id="teaser" width="110%" src="projects/roadmap.png">
</div>
<br>
<div class="md-card">
<div class="card-image"><img src="projects/speechgpt.png" alt></div>
<div class="card-content">
<p><strong>SpeechGPT: Empowering large language models with intrinsic cross-modal conversational abilities</strong></p>
<p><em><strong>Dong Zhang</strong>, Shimin Li, Xin Zhang, Jun Zhan, Pengyu Wang, Yaqian Zhou, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2305.11000.pdf" target="_blank" rel="noopener noreferrer"><strong>EMNLP 2023 Findings</strong></a>]
[<a href="https://github.com/0nutation/SpeechGPT" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/0nutation/SpeechGPT"></a>]
[<a href="https://0nutation.github.io/SpeechGPT.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
This work is a GitHub Trending project and
has been featured by various media outlets and forums, such as <a href="https://mp.weixin.qq.com/s/KpdOUdeYSVzrBtfuqFbjaQ" target="_blank" rel="noopener noreferrer"><strong>Heart of Machine</strong></a>,
<a href="https://twitter.com/_akhaliq/status/1659426578793725953" target="_blank" rel="noopener noreferrer"><strong>Twitter</strong></a>, and <a href="https://www.youtube.com/watch?v=DD1e6FJ-If4" target="_blank" rel="noopener noreferrer">YouTube</a>.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/speechtokenizer.png" alt></div>
<div class="card-content">
<p><strong>SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models</strong></p>
<p><em><strong>Dong Zhang<sup>*</sup></strong>, Xin Zhang<sup>*(order is random)</sup>, Shimin Li, Yaqian Zhou, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2308.16692.pdf" target="_blank" rel="noopener noreferrer"><strong>ICLR 2024</strong></a>]
[<a href="https://github.com/ZhangXInFD/SpeechTokenizer/" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/ZhangXInFD/SpeechTokenizer"></a>]
[<a href="https://0nutation.github.io/SpeechTokenizer.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
SpeechTokenizer unifies semantic tokens and acoustic tokens, and we build USLM (unified speech language model)<img src="https://img.shields.io/github/stars/0nutation/USLM"> on top of it.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/speechalign.png" alt></div>
<div class="card-content">
<p><strong>SpeechAlign: Aligning Speech Generation to Human Preferences</strong></p>
<p><em><strong>Dong Zhang<sup>*</sup></strong>, Zhaowei Li<sup>*</sup>, Shimin Li, Xin Zhang, Pengyu Wang, Yaqian Zhou, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2404.05600.pdf" target="_blank" rel="noopener noreferrer"><strong>NeurIPS 2024</strong></a>]
[<a href="https://github.com/0nutation/SpeechGPT" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/0nutation/SpeechGPT"></a>]
[<a href="https://0nutation.github.io/SpeechAlign.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
SpeechAlign is the first work to apply RLHF to align speech language models with human preferences, and it proposes an effective iterative self-improvement strategy that converts weak speech language models into stronger ones.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/speechgptgen.png" alt></div>
<div class="card-content">
<p><strong>SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation</strong></p>
<p><em><strong>Dong Zhang<sup>*</sup></strong>, Xin Zhang<sup>*</sup>, Jun Zhan, Shimin Li, Yaqian Zhou, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2401.13527.pdf" target="_blank" rel="noopener noreferrer"><strong>Preprint</strong></a>]
[<a href="https://github.com/0nutation/SpeechGPT" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/0nutation/SpeechGPT"></a>]
[<a href="https://0nutation.github.io/SpeechGPT-Gen.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
We propose the Chain-of-Information speech generation method and scale the model size up to 8B to build SpeechGPT-Gen, which can perform speech-to-speech dialogue in any voice you want.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/speechagents.png" alt></div>
<div class="card-content">
<p><strong>SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems</strong></p>
<p><em><strong>Dong Zhang</strong>, Zhaowei Li, Pengyu Wang, Xin Zhang, Yaqian Zhou, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2401.03945.pdf" target="_blank" rel="noopener noreferrer"><strong>Preprint</strong></a>]
[<a href="https://github.com/0nutation/SpeechAgents" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/0nutation/SpeechAgents"></a>]
[<a href="https://0nutation.github.io/SpeechAgents.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
SpeechAgents is the first multi-modal multi-agent system.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/dub.png" alt></div>
<div class="card-content">
<p><strong>DUB: Discrete Unit Back-translation for Speech Translation</strong></p>
<p><em><strong>Dong Zhang</strong>, Rong Ye, Tom Ko, Mingxuan Wang, Yaqian Zhou</em></p>
<p>
[<a href="https://arxiv.org/pdf/2305.11411.pdf" target="_blank" rel="noopener noreferrer"><strong>ACL 2023 Findings</strong></a>]
[<a href="https://github.com/0nutation/DUB" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/0nutation/DUB"></a>]
[<a href="https://aclanthology.org/2023.findings-acl.447.mp4" target="_blank" rel="noopener noreferrer">video</a>]
</p>
<p>
DUB is the first to use discrete speech representations as input for speech translation and to explore NLP techniques such as mBART pretraining and back-translation on top of them.
</p>
</div>
</div>
<div class="md-card">
<div class="card-image"><img src="projects/anygpt.png" alt></div>
<div class="card-content">
<p><strong>AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling</strong></p>
<p><em>Jun Zhan<sup>*</sup>, Junqi Dai<sup>*</sup>, Jiasheng Ye<sup>*</sup>, Yunhua Zhou, <strong>Dong Zhang</strong>, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yugang Jiang, Xipeng Qiu</em></p>
<p>
[<a href="https://arxiv.org/pdf/2402.12226.pdf" target="_blank" rel="noopener noreferrer"><strong>ACL 2024</strong></a>]
[<a href="https://github.com/OpenMOSS/AnyGPT" target="_blank" rel="noopener noreferrer">code <img src="https://img.shields.io/github/stars/OpenMOSS/AnyGPT"></a>]
[<a href="https://junzhan2000.github.io/AnyGPT.github.io/" target="_blank" rel="noopener noreferrer">demo</a>]
</p>
<p>
AnyGPT is our new exploration of discrete-representation-based multimodal LLMs, following SpeechGPT. It unifies text, image, speech, and music in one model and can perform any-to-any multimodal conversation.
</p>
</div>
</div>
<h2 id="Full Publications"><a href="Full Publications" class="header-anchor">#</a>Full Publications</h2>
<h3 id="2024"><a href="2024" class="header-anchor">#</a>2024</h3>
<ul>
<li><p><a href="https://arxiv.org/pdf/2404.05600.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechAlign: Aligning Speech Generation to Human Preferences</strong></a> <br><strong>Dong Zhang<sup>*</sup></strong>, Zhaowei Li<sup>*</sup>, Shimin Li, Xin Zhang, Pengyu Wang, Yaqian Zhou, Xipeng Qiu. <br>NeurIPS 2024</li>
<li><p><a href="https://arxiv.org/pdf/2402.12226.pdf" target="_blank" rel="noopener noreferrer"><strong>AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling</strong></a> <br>Jun Zhan<sup>*</sup>, Junqi Dai<sup>*</sup>, Jiasheng Ye<sup>*</sup>, Yunhua Zhou, <strong>Dong Zhang</strong>, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yugang Jiang, Xipeng Qiu. <br>ACL 2024 </li>
<li><p><a href="https://arxiv.org/pdf/2402.06894.pdf" target="_blank" rel="noopener noreferrer"><strong>GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators</strong></a> <br>Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, <strong>Dong Zhang</strong>, Zhehuai Chen, Eng Siong Chng. <br>ACL 2024 </li>
<li><p><a href="https://arxiv.org/pdf/2401.13527.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation</strong></a> <br><strong>Dong Zhang<sup>*</sup></strong>, Xin Zhang<sup>*</sup>, Jun Zhan, Shimin Li, Yaqian Zhou, Xipeng Qiu. <br>Preprint</li>
<li><p><a href="https://arxiv.org/pdf/2401.11206.pdf" target="_blank" rel="noopener noreferrer"><strong>InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance</strong></a> <br>Pengyu Wang, <strong>Dong Zhang</strong>, Linyang Li, Chenkun Tan, Xinghao Wang, Ke Ren, Botian Jiang, Xipeng Qiu. <br>EMNLP 2024 </li>
<li><p><a href="https://arxiv.org/pdf/2401.06071.pdf" target="_blank" rel="noopener noreferrer"><strong>GroundingGPT: Language Enhanced Multi-modal Grounding Model</strong></a> <br>Zhaowei Li, Qi Xu, <strong>Dong Zhang</strong>, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Van Tu Vu, Zhida Huang, Tao Wang. <br>ACL 2024 </li>
<li><p><a href="https://arxiv.org/pdf/2401.03945.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems</strong></a> <br><strong>Dong Zhang</strong>, Zhaowei Li, Pengyu Wang, Xin Zhang, Yaqian Zhou, Xipeng Qiu. <br>Preprint</li>
</ul>
<h3 id="2023"><a href="2024" class="header-anchor">#</a>2023</h3>
<ul>
<li><p><a href="https://arxiv.org/pdf/2310.08903.pdf" target="_blank" rel="noopener noreferrer"><strong>SeqXGPT: Sentence-Level AI-Generated Text Detection</strong></a> <br>Pengyu Wang, Linyang Li, Ke Ren, Botian Jiang, <strong>Dong Zhang</strong>, Xipeng Qiu <br>EMNLP 2023</li>
<li><p><a href="https://arxiv.org/pdf/2308.16692.pdf" target="_blank" rel="noopener noreferrer"><strong>SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models</strong></a> <br><strong>Dong Zhang<sup>*</sup></strong>, Xin Zhang<sup>*(order is random)</sup>, Shimin Li, Yaqian Zhou, Xipeng Qiu. <br>ICLR 2024</li>
<li><p><a href="https://arxiv.org/pdf/2305.11411.pdf" target="_blank" rel="noopener noreferrer"><strong>DUB: Discrete Unit Back-translation for Speech Translation</strong></a> <br><strong>Dong Zhang</strong>, Rong Ye, Tom Ko, Mingxuan Wang, Yaqian Zhou. <br>ACL 2023(findings) </li>
<li><p><a href="https://arxiv.org/pdf/2305.11000.pdf" target="_blank" rel="noopener noreferrer"><strong>Speechgpt: Empowering large language models with intrinsic cross-modal conversational abilities</strong></a> <br><strong>Dong Zhang</strong>, Shimin Li, Xin Zhang, Jun Zhan, Pengyu Wang, Yaqian Zhou, Xipeng Qiu. <br>EMNLP 2023(findings)</li>
</ul>
<h2 id="talks"><a href="#talks" class="header-anchor">#</a> Invited Talks</h2>
<ul>
<li><p><strong>Towards Human-like Spoken Chatbot: SpeechGPT Series</strong> <br>
<a href="" target="_blank" rel="noopener noreferrer"><strong>NTU Singapore</strong></a>(2024/9/17),
<a href="" target="_blank" rel="noopener noreferrer"><strong>NVIDIA</strong></a>(2024/8/15),
<a href="" target="_blank" rel="noopener noreferrer"><strong>Microsoft</strong></a>(2024/8/5),
<a href="https://www.bilibili.com/video/BV1FJ4m137ZB/?spm_id_from=333.337.search-card.all.click" target="_blank" rel="noopener noreferrer"><strong>SJTU X-Lance</strong></a>(2024/6/12),
<a href="" target="_blank" rel="noopener noreferrer"><strong>Bytedance</strong></a>(2024/6/6),
<a href="" target="_blank" rel="noopener noreferrer"><strong>Agora.ai</strong></a>(2024/5/29),
<a href="https://www.airmeet.com/e/b2157610-cfe7-11ee-93ec-3b2ce56d50d2" target="_blank" rel="noopener noreferrer"><strong>AGI Leap Summit 2024</strong></a> hosted by <a href="https://superagi.com/" target="_blank" rel="noopener noreferrer"><strong>SuperAGI</strong></a>(2024/2/29)
<li><p><strong>SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models</strong> <br>
<a href="https://groups.csail.mit.edu/sls/" target="_blank" rel="noopener noreferrer"><strong>MIT CSAIL SLS</strong></a>(2024/5/9)
</ul>
<h2 id="education"><a href="#education" class="header-anchor">#</a> Education</h2>
<ul>
<li><p><strong>Fudan University</strong> <span style="color:gray;float:right;">Sept 2022 - Jun 2025</span> <br>
M.S. in Computer Science</p></li>
<li><p><strong>Fudan University</strong> <span style="color:gray;float:right;">Sept 2018 - Jun 2022</span> <br>
B.S. in Electronic Engineering</p></li>
</ul>
<h2 id="internship"><a href="#internship" class="header-anchor">#</a> Internship</h2>
<ul>
<li><p><strong>Bytedance AI Lab</strong> <span style="color:gray;float:right;">Apr 2022 - Jun 2023</span> <br>
Research on speech translation</p></li>
</ul>
<h2 id="service"><a href="#service" class="header-anchor">#</a> Service</h2>
<ul>
<li><p><strong>Reviewer:</strong> <br>
EMNLP (2023, 2024), ACL (2024), NeurIPS (2024)</p></li>
</ul>
<!-- <table
style="width:100%;border:0px;border-spacing:0px;border-collapse:separate;margin-right:auto;margin-left:auto;">
<tbody>
<tr>
<td style="padding:20px;width:30%;vertical-align:middle">
<script type='text/javascript' id='clustrmaps'
src=''></script>
</td>
</tr>
</tbody>
</table> -->
<footer class="page-edit"><!----> <!----></footer> <!----> </main></div><div class="global-ui"></div></div>
<script src="assets/js/app.ef4e7843.js" defer></script><script src="assets/js/2.5b5922e0.js" defer></script><script src="assets/js/8.c1e4c3b9.js" defer></script><script src="assets/js/4.dc64499e.js" defer></script><script src="assets/js/5.d752ec91.js" defer></script>
</body>
</html>