PET results differ from those reported in the Hugging Face blog "How many data points is a prompt worth?" study #69

Open
luffycodes opened this issue Nov 22, 2021 · 1 comment

Comments


luffycodes commented Nov 22, 2021

For MNLI, the blog (https://huggingface.co/blog/how_many_data_points/) reports an accuracy of 0.83 for 1,000 data samples.

In the PET paper (https://arxiv.org/pdf/2001.07676.pdf, Table 1), the reported MNLI accuracy is 0.85 for 1,000 data samples.

I was wondering how the accuracy is reported in the PET paper.

timoschick (Owner) commented
Hi @luffycodes, the accuracy reported in the PET paper is exactly what you obtain using this library. You can check out the details of the "How many data points is a prompt worth?" study in their paper; one important difference from our experiments is that they

[...] run every experiment 4 times in order to reduce variance,

Also, I would assume that they used a different random selection of 1,000 training examples (but to verify this, you should reach out to the authors directly).
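To illustrate why those two choices can shift the reported number, here is a minimal, hypothetical sketch (not part of the PET codebase): `train_and_evaluate`, `mnli_train`, the seeds, and the subset size are all assumptions. Different seeds pick different 1,000-example training subsets, and averaging over several runs gives a different (and usually more stable) figure than a single run.

```python
# Hypothetical sketch (not part of PET): average accuracy over several runs,
# each trained on a different random 1,000-example subset of the training set.
import random
import statistics

def run_experiment(train_pool, seed, subset_size=1000):
    rng = random.Random(seed)
    # A different seed yields a different 1,000-example training subset.
    subset = rng.sample(train_pool, subset_size)
    # train_and_evaluate is a placeholder for a full PET training + eval run.
    return train_and_evaluate(subset)

# e.g. 4 runs, as in the "How many data points is a prompt worth?" study:
# accuracies = [run_experiment(mnli_train, seed) for seed in range(4)]
# print(statistics.mean(accuracies), statistics.stdev(accuracies))
```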
