Fix All `chunked_loss` Benchmark Scripts #438

Conversation
@@ -1,4 +1,5 @@
from test.chunked_loss.test_dpo_loss import HF_DPO_Loss
I believe this fn wouldn't work, because the return value is now a tuple:
https://github.com/linkedin/Liger-Kernel/blob/aeb8ce38b637eeecd92058ad6bbc0fac88fb257f/benchmark/scripts/benchmark_dpo_loss.py#L120C1-L122C21
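For context, a minimal sketch of the failure mode (names and shapes are hypothetical, assuming the loss now returns a `(loss, aux_outputs)` tuple as in the linked lines):

```python
import torch

def dpo_loss(logits: torch.Tensor):
    # Hypothetical stand-in: the chunked losses now return (loss, aux_outputs).
    loss = logits.sum()
    aux_outputs = (logits.detach(),)
    return loss, aux_outputs

x = torch.randn(4, 8, requires_grad=True)
out = dpo_loss(x)

# out.backward()   # AttributeError: 'tuple' object has no attribute 'backward'
out[0].backward()  # works: call .backward() on the loss tensor itself
```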
Also, I believe the other chunked loss benchmarking scripts face the same issue.
Got it! Thanks for pointing that out~
@@ -65,13 +43,19 @@ def forward(self, x, target):
self.lin.weight,
Can we instead just output the loss from Torch[CUSTOM]Loss and Liger[CUSTOM]Loss, i.e. output only the tuple's first value? Since we only need the loss, this would simplify the rest of the code.
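A sketch of this suggestion (the module and helper names are hypothetical stand-ins, not the repo's actual classes), where the wrapper's forward keeps only the tuple's first value:

```python
import torch
import torch.nn.functional as F

def _chunked_dpo_loss(weight, x, target):
    # Hypothetical stand-in for the chunked loss: returns (loss, aux_outputs).
    logits = x @ weight.t()
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), target.view(-1))
    return loss, (logits.detach(),)

class TorchDPOLoss(torch.nn.Module):
    """Hypothetical Torch[CUSTOM]Loss wrapper that returns only the loss."""
    def __init__(self, H: int, V: int):
        super().__init__()
        self.lin = torch.nn.Linear(H, V, bias=False)

    def forward(self, x, target):
        # Keep only the tuple's first value (the loss), per the suggestion.
        return _chunked_dpo_loss(self.lin.weight, x, target)[0]

loss = TorchDPOLoss(H=16, V=32)(torch.randn(2, 4, 16), torch.randint(0, 32, (2, 4)))
loss.backward()  # works: forward already returned a plain loss tensor
```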
Sure, good idea. @shivam15s, thanks for the review!
I'm using a lambda to create a function that accesses the first element only when it's actually called. The reason is that I want to reuse TorchLMHead[CUSTOM] from the tests without modifying them, and the same applies to LigerLMHead[CUSTOM].
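A sketch of that lambda approach (module definition and names are hypothetical stand-ins for the test classes):

```python
import torch

class TorchLMHeadDPO(torch.nn.Module):
    """Hypothetical stand-in for the test module: returns (loss, aux_outputs)."""
    def __init__(self, H: int, V: int):
        super().__init__()
        self.lin = torch.nn.Linear(H, V, bias=False)

    def forward(self, x, target):
        logits = self.lin(x)
        loss = torch.nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), target.view(-1)
        )
        return loss, (logits.detach(),)

model = TorchLMHeadDPO(H=64, V=128)
x = torch.randn(2, 10, 64, requires_grad=True)
target = torch.randint(0, 128, (2, 10))

# Wrap the call in a lambda so the tuple's first element (the loss) is
# extracted only when the benchmark harness actually calls fwd(); the
# test module itself stays unmodified.
fwd = lambda: model(x, target)[0]

fwd().backward()
```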
Commits:
- Fix all `.backward()` and reuse `TorchLMHead` from tests (Signed-off-by: Austin Liu <[email protected]>)
- Reuse `LigerLMHead` (Signed-off-by: Austin Liu <[email protected]>)
Force-pushed from 7776009 to 291f933.
Summary

With recent updates, we need to align the APIs used in the benchmark scripts to ensure consistency.

Code changes:
- Fix the `.backward()` pass in `backward` and `full` mode (a sketch follows the testing checklist below).
- Reuse `TorchLMHead` from the `chunked_loss` tests.

Testing Done
- `make test` to ensure correctness
- `make checkstyle` to ensure code style
- `make test-convergence` to ensure convergence
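A hedged sketch of the mode fix referenced above, assuming the benchmark takes a `fwd` closure that returns a scalar loss tensor (the harness and mode names here are illustrative, not the repo's exact utilities):

```python
import torch

x = torch.randn(8, 16, requires_grad=True)
fwd = lambda: (x * x).sum()  # stand-in for the benchmarked loss forward

def run_mode(mode: str):
    # Each call to fwd() rebuilds the graph, so .backward() can be
    # called directly in both "backward" and "full" modes now that
    # fwd() returns a plain loss tensor rather than a tuple.
    if mode == "forward":
        fwd()
    elif mode == "backward":
        y = fwd()
        y.backward()
    elif mode == "full":  # forward + backward together
        y = fwd()
        y.backward()

for m in ("forward", "backward", "full"):
    run_mode(m)
    x.grad = None  # reset grads between modes
```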