What's Changed
- reinstate cuda rmsnorm (much faster in fp16/awq) + ct2 enc/dec config by @vince62s in #167
- [patch] remove dummy_load, move gpu_ranks warning out of TrainingConfig by @francoishernandez in #168
- fix batch inference by @vince62s in #169
- Code clean-ups by @vince62s in #171
- 120 columns makes more sense on modern screens by @vince62s in #176
- refactor transformer decoder and revamp the left padding attention mask by @vince62s in #178
- Major refactoring of convert HF by @francoishernandez in #156
- [patch] handle self_attn_backend edge case by @francoishernandez in #180
- hotfix post #178 by @vince62s in #181
- fix update vocab param loading by @vince62s in #184
- remove verbosity at validation/scoring by @vince62s in #185
- [patch] Add missing `is_train` kwarg in `tokenize_id` by @francoishernandez in #187
- Hugging Face dataset streaming support by @vince62s in #177
- misc fixes by @vince62s in #192
- Gemma2 support by @francoishernandez in #160
- [convert_HF] handle special tokens defined in tokenizer_config.json by @francoishernandez in #196
- patch max_length handling in tokenize_id by @francoishernandez in #197
Full Changelog: 0.0.3...0.1.0