T5 For Korean

T5는 text-to-text transformer 모델입니다.

별다른 코드의 수정 없이 transformers 레포지토리에서 제공하는 학습 코드를 사용하여 학습하였습니다.

토크나이저는 MeCab으로 전처리된 코퍼스에서 학습해 형태소가 이상하게 토큰화되는것을 방지하였습니다.

Hyperparameters

Model	Layers	Width	Heads	Batch Size	Learning Rate	Steps
t5 v1.1 small	8	512	6	256	5e-3	2M
t5 v1.1 base	12	768	12	256	5e-3	1M

Usage

from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("hyunwoo3235/t5-v1_1-base-ko")
model = T5ForConditionalGeneration.from_pretrained("hyunwoo3235/t5-v1_1-base-ko")

Dataset

모델의 학습에는 아래 데이터셋뿐만 아니라 기타 비공개 데이터셋을 추가로 사용하였습니다.

OSCAR 22.01
위키피디아 덤프
나무위키 덤프
The Pile

Citation

@article{2020t5,
  author  = {Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu},
  title   = {Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {140},
  pages   = {1-67},
  url     = {http://jmlr.org/papers/v21/20-074.html}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

t5.md

t5.md

T5 For Korean

Hyperparameters

Usage

Dataset

Citation

Files

t5.md

Latest commit

History

t5.md

File metadata and controls

T5 For Korean

Hyperparameters

Usage

Dataset

Citation