v0.5.0: First open source optimized Post Training Loss, AMD CI, XPU Support
Highlights
- Post Training Loss: Introducing the first open-source optimized post-training losses in Liger Kernel with ~80% memory reduction, featuring DPO, CPO, ORPO, SimPO, JSD, and more. No more OOM nightmares for post-training ML researchers!
- AMD CI: With AMD’s generous sponsorship of MI300s, we’ve integrated them into our CI. Special thanks to Embedded LLM for building the AMD CI infrastructure. #428
- XPU Support: In collaboration with Intel, we now support XPU, with performance gains comparable to those on other vendors' hardware. #407
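For readers new to preference losses, here is a minimal pure-Python sketch of the standard DPO objective, evaluated chunk by chunk. It is an illustration only and not Liger Kernel's implementation: the function names (`log_sigmoid`, `dpo_loss`, `chunked_mean_loss`), the tuple layout of `pairs`, and the default `beta=0.1` are all hypothetical. Plain Python gains no memory from chunking; the sketch only shows that processing the batch in chunks leaves the mean loss unchanged.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (hypothetical helper).

    Inputs are sequence log-probabilities under the policy and the frozen
    reference model. The loss falls as the policy prefers the chosen
    response more strongly than the reference model does.
    """
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    return -log_sigmoid(beta * (policy_margin - ref_margin))


def chunked_mean_loss(pairs, chunk_size=2, beta=0.1):
    """Mean DPO loss accumulated one chunk at a time.

    `pairs` is a list of 4-tuples in the argument order of `dpo_loss`.
    Only one chunk is processed per step; the running sum carries the rest.
    """
    total = 0.0
    for start in range(0, len(pairs), chunk_size):
        chunk = pairs[start:start + chunk_size]
        total += sum(dpo_loss(*pair, beta=beta) for pair in chunk)
    return total / len(pairs)
```

With identical policy and reference log-probabilities the margin difference is zero, so the loss for that pair is log 2, which makes a convenient sanity check.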
What's Changed
- Adds the CPO Alignment Loss Function by @pramodith in #382
- Qwen2-VL Training Example w/ Liger by @tyler-romero in #389
- Support Qwen2-VL's multimodal RoPE implementation by @li-plus in #384
- add xpu device support for `rms_norm` by @faaany in #379
- fix qwen2 import failure in test by @ByronHsu in #394
- Add Chunked SimPO Loss by @pramodith in #386
- Add script to reproducibly run examples on Modal by @tyler-romero in #397
- add nn.module support for chunked loss function by @shivam15s in #402
- Generalize JSD to FKL/RKL by @yundai424 in #393
- Enable keyword arguments for liger functional by @hongpeng-guo in #400
- add reference model logps to chunkedloss interface and fix dpo loss fn by @shivam15s in #405
- Optimize CE Loss by casting dtype to float32 inside kernel by @pramodith in #406
- Xpu support by @mgrabban in #407
- Fix `get_batch_loss_metrics` comments by @austin362667 in #413
- Add rebuild to CI by @ByronHsu in #415
- Fix os env by @ByronHsu in #416
- Adjust QWEN2 VL Loss `rtol` by @austin362667 in #412
- [tiny] Add QwQ to readme (same arch as Qwen2) by @tyler-romero in #424
- Enhance Cross Entropy Softcap Unit Test by @austin362667 in #423
- Add ORPO Trainer + support HF metrics directly from chunked loss functions + fixes to avoid torch compile recompilations by @shivam15s in #429
- Add Build Success/Fail Badge by @hebiao064 in #431
- Switch amd-ci to use MI300X runner. by @saienduri in #428
- [CI] rename ci and add cron job for amd by @ByronHsu in #433
- [CI] shorten ci name by @ByronHsu in #434
- update ci icon on readme by @bboyleonp666 in #440
- Introduce Knowledge Distillation Base by @austin362667 in #432
- [AMD] [CI] Clean up `amd-ci` by @tjtanaa in #436
- Add xpu in env report by @abhilash1910 in #443
- Specify scheduled CI in AMD badge by @ByronHsu in #446
- improve code quality for chunk loss by @ByronHsu in #448
- Add paper link and formula for preference loss by @ByronHsu in #449
- Make kernel doc lean by @ByronHsu in #450
- Fix LigerCrossEntropyLoss Reduction Behavior for "None" Mode by @hebiao064 in #435
- add eng blog by @ByronHsu in #452
- add chunked loss to readme by @shivam15s in #453
- change chunked readme by @shivam15s in #454
- add sponsorship and collab by @ByronHsu in #457
- version bump to 0.5.0 by @shivam15s in #455
- Add HIP (ROCm) and Liger Kernel to env report by @Comet0322 in #456
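As background for the cross entropy reduction fix listed above (#435), the sketch below shows what the three usual reduction modes return, following the convention PyTorch uses for its loss modules. It is a pure-Python illustration under that assumption, not the code of `LigerCrossEntropyLoss`; the function name `cross_entropy` and its list-based inputs are hypothetical.

```python
import math


def cross_entropy(logits, targets, reduction="mean"):
    """Cross entropy over rows of raw logits (hypothetical helper).

    reduction="none" returns one loss per row, "sum" adds them up,
    and "mean" averages them.
    """
    losses = []
    for row, target in zip(logits, targets):
        # Subtract the row maximum before exponentiating for stability.
        row_max = max(row)
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        losses.append(log_sum_exp - row[target])

    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses)
    raise ValueError(f"unknown reduction: {reduction!r}")
```

The point to notice is the return shape: "none" keeps the per-sample losses so the caller can weight or mask them, while "sum" and "mean" collapse them to a single number.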
New Contributors
- @li-plus made their first contribution in #384
- @faaany made their first contribution in #379
- @hongpeng-guo made their first contribution in #400
- @mgrabban made their first contribution in #407
- @hebiao064 made their first contribution in #431
- @saienduri made their first contribution in #428
- @bboyleonp666 made their first contribution in #440
- @abhilash1910 made their first contribution in #443
- @Comet0322 made their first contribution in #456