Loss goes to -inf #1

Open
jdeschena opened this issue Aug 23, 2024 · 5 comments

Comments
@jdeschena

jdeschena commented Aug 23, 2024

Hello,

I am trying to run your code to reproduce your results, but with either the lambda_DCE or t_DCE loss, the loss quickly goes to -infinity, so I wonder whether a sign is missing somewhere. Could you take a look at the code? Simply negating the loss returned by get_loss_fn does not solve the issue.
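
In case it helps to localize the problem, here is a generic debugging sketch (not tied to this repo's actual get_loss_fn signature, which I am not assuming) that wraps a scalar-loss callable and flags the first step at which the loss becomes negative or non-finite:

```python
import torch

def watch_loss(loss_fn):
    """Wrap an arbitrary scalar-loss callable and raise on the first
    negative or non-finite value, so the offending step is easy to find.
    The wrapped callable's arguments are passed through unchanged."""
    step = 0

    def wrapped(*args, **kwargs):
        nonlocal step
        loss = loss_fn(*args, **kwargs)
        value = loss.detach().item()
        if not torch.isfinite(loss).all() or value < 0:
            raise RuntimeError(f"loss became {value:.4e} at step {step}")
        step += 1
        return loss

    return wrapped
```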

Thanks in advance.

@jdeschena jdeschena changed the title lambda_DCE loss goes to -inf Loss goes to -inf Aug 23, 2024
@JingyangOu
Collaborator

Thanks for flagging this. I haven't encountered this issue in my runs, so it would help to have more details. Could you please share your training logs and any specific settings or modifications you made? That will help me diagnose the problem more accurately.

@dongzhuoyao

Can you share a screenshot of your loss trend? For me, the training loss curve is not stable. @JingyangOu

I am using 128 tokens, a 10k vocabulary, and a model with 130M parameters.

[screenshot: unstable training loss curve]

@JingyangOu
Collaborator

This is my training loss with the default configuration (1024 tokens, GPT-2 tokenizer with a 50k vocabulary, 130M parameters):

[screenshot: training loss curve, consistently positive]

Can you reproduce my result? The loss should always be positive. I'm not sure whether this bug is related to your code modifications.
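
For reference, a minimal illustration (generic PyTorch, not code from this repo) of why a loss built from non-negatively weighted cross-entropy terms cannot go below zero: each term is -log p with 0 < p ≤ 1, so it is always ≥ 0.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 50257)            # arbitrary model outputs
targets = torch.randint(0, 50257, (4,))   # arbitrary token ids
ce = F.cross_entropy(logits, targets, reduction="none")
assert (ce >= 0).all()                    # -log(p) >= 0 whenever p <= 1
print(ce)
```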

@dongzhuoyao

How large is your batch size? 512×16? For me, the batch size is only 32.

@JingyangOu
Collaborator

The batch size in the config refers to the equivalent batch size after combining all GPUs and applying gradient accumulation. Therefore, the batch size I used is 512. I also tried training with a batch size of 32, and in this case, the resulting curve was similar to the one with a batch size of 512, with no signs of training instability.
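
To make the arithmetic concrete, a small sketch (the variable names are illustrative, not the repo's actual config keys) of how the effective batch size is obtained:

```python
# Illustrative only: these names are not the repo's config keys.
per_gpu_batch_size = 32      # micro-batch processed by each GPU per forward pass
num_gpus = 8                 # data-parallel workers
grad_accum_steps = 2         # micro-batches accumulated before each optimizer step

effective_batch_size = per_gpu_batch_size * num_gpus * grad_accum_steps
print(effective_batch_size)  # 512 with these example numbers
```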
