Loss, entropy, accuracy trends #188
Yes, it becomes more certain what the positive is as it trains.

Is it overfitting? Otherwise a hyperparameter may be problematic, e.g. the learning rate is too big (if you warm up for too long, the learning rate will still be too big after a certain number of epochs, and then training gets worse).
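For context on that last point: a common SimCLR-style schedule is a linear warmup followed by cosine decay. The sketch below is a hypothetical illustration (the step counts, peak rate, and function name are made up, not the repository's implementation) of how a longer warmup leaves the learning rate near its peak further into training:

```python
import math

def lr_at_step(step, total_steps, warmup_steps, peak_lr):
    """Hypothetical linear-warmup + cosine-decay schedule (illustration only)."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps        # linear ramp-up
    # cosine decay from peak_lr toward 0 over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 1000
for warmup_steps in (50, 500):            # short vs. long warmup
    lr = lr_at_step(600, total_steps, warmup_steps, peak_lr=1.0)
    print(f"warmup={warmup_steps}: lr at step 600 = {lr:.3f}")
# A longer warmup delays the decay, so the rate is still near its peak
# late in training -- the "too big after certain epochs" effect above.
```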
Could you explain how the contrastive accuracy is computed? I could understand the potential for overfitting if it's measured on samples different from those used to compute the training loss. From the code, it seems that loss and accuracy are computed over similar quantities, though.
It's the prediction accuracy of positive examples among all candidates within the mini-batch.
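Concretely, that metric can be sketched as follows (plain NumPy with assumed array names, not the repository's code): for each example, take its row of similarity logits over all candidates in the mini-batch and check whether the argmax lands on the positive, then average over the batch.

```python
import numpy as np

def contrastive_accuracy(logits, positive_idx):
    """Fraction of rows whose highest-scoring candidate is the positive one.

    logits: (batch, num_candidates) similarity scores within the mini-batch.
    positive_idx: (batch,) index of the positive candidate for each row.
    """
    predictions = np.argmax(logits, axis=1)
    return float(np.mean(predictions == positive_idx))

# Toy example: 3 rows, 4 candidates each; positives at indices 0, 2, 1.
logits = np.array([[2.0, 0.1, 0.3, 0.2],
                   [0.1, 0.4, 0.2, 0.3],   # argmax is 1, positive is 2 -> wrong
                   [0.2, 1.5, 0.1, 0.0]])
print(contrastive_accuracy(logits, np.array([0, 2, 1])))  # ~0.667
```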
Okay. Then I'm not sure why overfitting would occur, since the accuracy is measured over the same samples as the training dataset.
It is possible that as the training loss declines, the model becomes more confident about the positive while spreading low, roughly equal probability over the negatives; that flatter spread over the many negatives can push the measured entropy of the output distribution up even as the positive's probability rises. Regarding the second observation, one possibility is that the model is too complex for the small training dataset and therefore fails to generalize well. In that case, reducing the model's complexity or collecting more training data could potentially improve performance. I hope this helps clarify these issues.
I'm trying to understand the relationships and trends among these quantities.
From some experiments, I find that:

1. As the training loss declines, the entropy of the distribution increases rather than decreases. This seems plausible because near convergence, the probability scores for all the negative examples are relatively low and roughly equal, while the probability score for the positive example increases (see the sketch after this message).

2. For a small training dataset (10^3 samples), however, I find that the accuracy declines. Why might this occur?
Thanks.
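For reference, the entropy discussed in observation 1 can be monitored with a small sketch like the one below (plain NumPy, assumed names, not the repository's code; the toy logits merely exercise the function):

```python
import numpy as np

def mean_softmax_entropy(logits):
    """Mean entropy (in nats) of the per-example softmax over all candidates.

    logits: (batch, num_candidates) similarity scores (positive + negatives).
    """
    z = logits - logits.max(axis=1, keepdims=True)       # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return float(np.mean(-np.sum(p * np.log(p + 1e-12), axis=1)))

# Toy logits: the positive (column 0) scores highest while the negatives
# are low and roughly equal, as described in observation 1 above.
logits = np.array([[2.0, 0.40, 0.50, 0.45],
                   [1.8, 0.30, 0.35, 0.40]])
print(mean_softmax_entropy(logits))
```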