Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug which causes score method to ignore BERTScorer object's batch_size attribute #191

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

gpetho
Copy link

@gpetho gpetho commented Jul 30, 2024

There is a bug in the score() method's code: A batch_size attribute can be specified for the BERTScorer object, but this setting is ignored by the score() method currently. In fact this attribute of the BERTScorer object has no effect at all.

The proposed fix is that if a batch_size argument is not specified explicitly for the score() method, then the value of the BERTScorer object's batch_size attribute is used for calculating the embeddings.

Another possibility would be to remove the batch_size attribute and, along with it, the batch_size parameter of the BERTScorer class and its initializer method respectively, just like it was apparently done for verbose and return_hash earlier (I fixed the docstrings and removed these outdated parameters), but this would modify the interface of the class and thus break backward compatibility, so I'd rather not do that. I believe it's better to fix the bug instead.

Note: This bug is no joke. I just lost some 3 or 4 hours of time because of it, trying to understand why BERTScore keeps running out of GPU RAM despite my having set batch size to 1, before it occurred to me that I should check if there's maybe some bug in the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant