[TOC]

Summaries of important papers.

Papers

Base Models

  1. LLaMA: Open and Efficient Foundation Language Models. arxiv 2023 Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample. [pdf]

  2. Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling 2023 Stella Biderman, Hailey Schoelkopf, Quentin Anthony, Herbie Bradley, Kyle O'Brien, Eric Hallahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Prashanth, Edward Raff, Aviya Skowron, Lintang Sutawika, Oskar van der Wal [paper] [project]

    Model Size: 70M, 160M, 410M, 1B, 1.4B, 2.8B, 6.9B, 12B

  3. OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization. arXiv 2022 Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Dániel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, et al. [paper]

  4. GLM: General Language Model Pretraining with Autoregressive Blank Infilling ACL 2022

    Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang [GLM paper] [GLM-130B paper] [GLM project] [GLM-130B project]

    Tasks considered in GLM:

    Task                  Datasets
    NLU                   SuperGLUE
    Summarization         CNN/DailyMail, XSum, Gigaword
    Question Generation   SQuAD question generation
    Text Infilling        Yahoo
    Language Modeling     Books&Wiki Test, LAMBADA
  5. ChatGLM: An Open Bilingual Dialogue Language Model

  6. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

  7. FLAN: Scaling Instruction-Finetuned Language Models

  8. BELLE: Be Everyone's Large Language model Engine

  9. GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo

Methods for Obtaining Training Data

  1. Self-Instruct: Aligning Language Models with Self-Generated Instructions arXiv 2022 Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi [pdf] [project]

Open-Source ChatGPT Alternatives

These models mainly rely on instruction tuning, i.e., constructing training data as an instruction plus N input-output examples. None of Alpaca, Vicuna, or Dolly uses RLHF.
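A minimal sketch of what a single instruction-tuning record might look like, loosely following the Alpaca-style instruction/input/output layout (the field contents below are made up for illustration):

```python
# Hypothetical instruction-tuning records in the Alpaca-style
# {"instruction", "input", "output"} layout; contents are illustrative only.
example_with_input = {
    "instruction": "Classify the sentiment of the sentence as positive or negative.",
    "input": "The movie was a complete waste of time.",
    "output": "negative",
}

# Records with an empty "input" are also common: the instruction alone
# fully specifies the task.
example_without_input = {
    "instruction": "List three prime numbers greater than 10.",
    "input": "",
    "output": "11, 13, 17",
}
```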

  1. Alpaca: A Strong, Replicable Instruction-Following Model 2023 Rohan Taori, Ishaan Gulrajani, Tianyi Zhang, Yann Dubois, Xuechen Li, Carlos Guestrin, Percy Liang, Tatsunori B. Hashimoto [blog] [project] A model fine-tuned from LLaMA; the training data was annotated by text-davinci-003.

  2. Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality 2023 Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph E. Gonzalez, Ion Stoica, Eric P. Xing [blog] [project] A model fine-tuned from LLaMA; the training data comes from ChatGPT conversations shared on ShareGPT.

  3. Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM Mike Conover, Matt Hayes, Ankit Mathur, Xiangrui Meng, Jianwei Xie, Jun Wan, Sam Shah, Ali Ghodsi, Patrick Wendell, Matei Zaharia, Reynold Xin [blog] [project]

    Based on Pythia 12B; the training data, databricks-dolly-15k, was annotated by Databricks employees.

PEFT (Parameter-Efficient Fine-Tuning)

  1. LoRA LoRA: Low-Rank Adaptation of Large Language Models 2021 Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen [pdf] Models: RoBERTa, DeBERTa; tasks: GLUE. Models: GPT-2 M/L; tasks: E2E NLG, WebNLG, DART. Models: GPT-3 175B; tasks: WikiSQL, MNLI-m, SAMSum. Also studies a low-data setting. (A minimal sketch of the low-rank update appears after this list.)

  2. AdaLoRA

    Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning ICLR 2023 Qingru Zhang, Minshuo Chen, Alexander Bukharin, Pengcheng He, Yu Cheng, Weizhu Chen, Tuo Zhao [pdf] ==TODO==

  3. Prefix Tuning

    Prefix-Tuning: Optimizing Continuous Prompts for Generation ACL 2021 Xiang Lisa Li, Percy Liang [pdf] Models: GPT-2 M/L, BART-L; tasks: E2E, WebNLG, DART, XSum (summarization). Adds a set of trainable hidden states (prefix vectors) at every layer.

    P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks ACL 2022
    Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang [pdf] Adds a set of trainable hidden states at every layer. Models: BERT (335M), RoBERTa (355M), GLM (2B, 10B); tasks: SuperGLUE. Models: BERT, RoBERTa, DeBERTa (750M); tasks:

    • NER: CoNLL03, OntoNotes 5.0, CoNLL04
    • Extractive QA: SQuAD 1.1, SQuAD 2.0
    • SRL (Semantic Role Labeling): CoNLL12, CoNLL05 WSJ, CoNLL05 Brown
  4. P-Tuning

    GPT Understands, Too arXiv 2021 Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang [pdf] Models: BERT, RoBERTa, GPT-2 medium, GPT-2 XL, MegatronLM; tasks: LAMA, SuperGLUE. Uses continuous prompts at the word-embedding level, mixing discrete tokens with continuous prompts.

  5. Prompt Tuning

    The Power of Scale for Parameter-Efficient Prompt Tuning EMNLP 2021 Brian Lester, Rami Al-Rfou, Noah Constant [pdf] Models: T5; tasks: SuperGLUE. Uses continuous prompts at the word-embedding level and also considers training on multiple tasks at the same time.

  6. PPT PPT: Pre-trained Prompt Tuning for Few-shot Learning ACL 2022 Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang [pdf] Can be viewed as pre-training + prompt tuning. Models: T5-XXL (11B), mT5-XXL, CPM-2; tasks: mainly multiple-choice classification.

    Language  Category                         Datasets
    English   Single-sentence classification   SST-2, SST-5, YahooAns
    English   Multiple-choice classification   RACE-m, RACE-h
    English   Sentence-pair classification     BoolQ, RTE, CB
    Chinese   Single-sentence classification   ChnSent, Amazon, TNews
    Chinese   Multiple-choice classification   CCPM, C3
    Chinese   Sentence-pair classification     LCQMC, CMNLI, OCNLI
  7. The Adapter family:

    • AdapterH (the original Adapter) Parameter-Efficient Transfer Learning for NLP ICML 2019 Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly [pdf] Models: BERT; tasks: GLUE, 17 additional classification tasks, and SQuAD

    • AdapterL Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning EMNLP 2020 Zhaojiang Lin, Andrea Madotto, Pascale Fung [pdf] In addition to the adapters, a task embedding is trained and added to the input. Models: GPT-2 small; tasks:

      • Chit-chat based Dialogue(DLG): PersonaChat

      • NMT: IWSLT German-English

      • Summarization (SUM): CNN/Daily-Mail

      • Conversational QA (CQA): CoQA

      • NLG: E2E NLG Challenge (2019)

    • AdapterP AdapterFusion: Non-Destructive Task Composition for Transfer Learning EACL 2021 Jonas Pfeiffer, Aishwarya Kamath, Andreas Rücklé, Kyunghyun Cho, Iryna Gurevych [pdf] Models: BERT-base-uncased; tasks:

      • Commonsense Reasoning: Hellaswag, Winogrande, CosmosQA, CSQA, SocialQA
      • Sentiment Analysis: IMDb, SST
      • Natural Language Inference: MNLI, RTE, CB, SciTail, SICK
      • Sentence Relatedness: MRPC, QQP, Argument, BoolQ
    • AdapterD AdapterDrop: On the Efficiency of Adapters in Transformers EMNLP 2021 Andreas Rücklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych [pdf]
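As a rough illustration of the LoRA idea from item 1 above, the sketch below wraps a frozen nn.Linear with a trainable low-rank update y = xWᵀ + (α/r)·xAᵀBᵀ. This is our own minimal PyTorch sketch under those assumptions, not the paper's reference implementation; all names are made up.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA-style wrapper: output = base(x) + (alpha / r) * x A^T B^T.

    The pretrained weight is frozen; only the low-rank factors A and B train.
    Illustrative sketch only.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weight and bias
        # A is small random, B is zero, so the update starts at exactly zero.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```

In the paper this kind of update is applied to the attention projection matrices (e.g., the query and value projections) while the rest of the model stays frozen.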

Unified PEFT Frameworks ==TODO==

  1. Towards a Unified View of Parameter-Efficient Transfer Learning
  2. UNIPELT: A Unified Framework for Parameter-Efficient Language Model Tuning
  3. Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
  4. Sparse Structure Search for Parameter-Efficient Tuning
  5. Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

Plug-And-Play

Evaluating LLMs

  1. Instruction Tuning with GPT-4 arXiv 2023 Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao [blog] [paper] [project]

    SFT is performed on LLaMA 7B; a reward model is trained from OPT 1.3B.

    • Evaluates instruction quality; uses GPT-4 to generate instruction-following data
    • Three evaluation methods:
      • human evaluation on three alignment criteria: Helpfulness (e.g., more correct and relevant answers tend to be more helpful), Honesty (whether the answer contains false information), Harmlessness (whether the answer contains hateful or violent content)
      • automatic evaluation using GPT-4 feedback (GPT-4 scores each response from 1 to 10)
      • ROUGE-L on Unnatural Instructions (a ROUGE-L usage sketch appears after this list)
  2. Holistic Evaluation of Language Models arXiv 2022 Percy Liang, Rishi Bommasani, Tony Lee, et al. [paper] [project]

  3. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models arxiv 2023 Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, et al. [paper] [project]

    The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to probe large language models and extrapolate their future capabilities. BIG-bench includes more than 200 tasks.

  4. PandaLM: Reproducible and Automated Language Model Assessment GitHub 2023 Wang Yidong, Yu Zhuohao, Zeng Zhengran, Yang Linyi, Heng Qiang, Wang Cunxiang, Chen Hao, Jiang Chaoya, Xie Rui, Wang Jindong, Xie Xing, Ye Wei, Zhang Shikun and Zhang Yue. [project]
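For the ROUGE-L metric mentioned in item 1 above, a minimal usage sketch with the rouge-score package (the reference and candidate strings below are made up):

```python
# pip install rouge-score
from rouge_score import rouge_scorer

# Illustrative strings only.
reference = "The model should list three primary colors."
candidate = "The model lists three primary colors: red, yellow, and blue."

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)
# scores["rougeL"] is a Score(precision, recall, fmeasure) namedtuple.
print(scores["rougeL"].fmeasure)
```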

LLM and KG

How to Detect AI-Generated Content?

New Directions Brought by LLMs

Model Explanation

Adding interpretability modules to models

Tricks for Training Large Models

Hardware level

  • Flash Attention (Dao et al., 2022)

Model level

  • rotary embeddings (Su et al., 2021) (see the sketch after this list)

  • parallelized attention and feedforward technique (Wang & Komatsuzaki 2021)

  • untied embedding / unembedding matrices (Belrose et al., 2023)
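As a rough sketch of the rotary-embedding idea in the first bullet above (Su et al., 2021): each pair of feature dimensions is rotated by a position-dependent angle before the attention dot product. The NumPy function below is our own simplified illustration, not the code used by any particular model.

```python
import numpy as np

def rotary_embed(x: np.ndarray, positions: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embeddings to x of shape (seq_len, dim), dim even.

    Each dimension pair (2i, 2i+1) is rotated by the angle
    positions * base**(-2i / dim). Simplified illustration only.
    """
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = base ** (-np.arange(half) * 2.0 / dim)   # (half,)
    angles = positions[:, None] * inv_freq[None, :]      # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)

    x_even, x_odd = x[:, 0::2], x[:, 1::2]               # split into pairs
    rot_even = x_even * cos - x_odd * sin                # 2-D rotation
    rot_odd = x_even * sin + x_odd * cos

    out = np.empty_like(x)
    out[:, 0::2], out[:, 1::2] = rot_even, rot_odd
    return out

# In practice this would be applied to the query and key matrices,
# e.g. q = rotary_embed(q, np.arange(len(q))), before computing attention scores.
```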

Applications of LLMs

  • AutoGPT
  • HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace arxiv 2023 Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang. [paper] [project]

Tutorials

Resources

Data

Online Demo

Models

Tools

Model training:

  • [peft] from Huggingface🤗: implements LoRA, Prefix Tuning, P-Tuning, Prompt Tuning, and AdaLoRA.
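A minimal sketch of the typical peft workflow for attaching a LoRA adapter to a pretrained causal LM; the checkpoint name and hyperparameters below are illustrative choices, not recommendations from this list:

```python
# pip install peft transformers
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Any causal-LM checkpoint works here; the name below is just an example.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,  # which head/loss the adapter is used with
    r=8,                           # rank of the low-rank update
    lora_alpha=16,                 # scaling factor
    lora_dropout=0.05,
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA parameters are trainable
```

The wrapped model can then be trained with the usual transformers Trainer or a plain PyTorch loop.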

Model evaluation:

References

【OpenLLM 007】小参数撬动大模型-万字长文全面解读PEFT参数高效微调技术

TODO

Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning

神奇LLM引擎上线:帮你把GPT-3直接调成ChatGPT

清华唐杰教授:从千亿模型到ChatGPT的一点思考

Prompt-Tuning——深度解读一种新的微调范式

Miscellaneous

Language Models as Knowledge Bases? EMNLP 2019.

Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel. 2019.9 [pdf] [project]

Annotation fields used for each entry:

  • The abbreviation of the work.
  • The key features in terms of prompt learning used in the work.
  • The mainly explored task of the work.
  • The mainly explored property of prompt learning methods in the work.
