Add logprobs functionality #1111
Conversation
Signed-off-by: lilacheden <[email protected]>
Signed-off-by: Ariel Gera <[email protected]>
prepare/processors/processors.py
        InferDictsToBinaryLogprobs(
            binary_class_names=(
                "No",
                "Yes",
            ),
            field="prediction",
            process_every_value=False,
        ),
    ]
),
"processors.infer_logprobs_to_yes_no_probs",
)

add_to_catalog(
    SequentialOperator(
        steps=[
            InferDictsToBinaryLogProbsLastToken(
                binary_class_names=(
                    "No",
                    "Yes",
                ),
                field="prediction",
                process_every_value=False,
Notice that we have a field called "options", assigned by many templates (such as the classification templates and multiple choice templates), which holds the options for completion based on the verbalization of the template. Relying on it is safer, because when you switch templates the options may become "True"/"False" instead of "Yes"/"No", and it is hard to propagate that change. That is why, so far, all the verbalization responsibility has lived in the template; externalizing verbalization to anywhere else in the pipeline can lead to issues. If you need help integrating with the current mechanism of the "options" field, you can contact me or @perlitz, who made extensive use of it in the past, specifically in the context of logprobs.
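To illustrate the point about template-dependent verbalization, here is a minimal sketch. The field names ("options", "prediction") and the logprob format are assumptions for illustration only, not the actual unitxt API: a post-processor that reads the class names from the instance's "options" field keeps working when a template verbalizes the classes as "True"/"False" instead of "Yes"/"No".

```python
import math


def positive_class_prob(instance):
    # Read the verbalized class names from the instance instead of
    # hard-coding "Yes"/"No"; the field names here are illustrative.
    negative, positive = instance["options"]
    # Assumed format: logprobs of candidate strings for the first
    # generated token.
    logprobs = instance["prediction"][0]
    p_pos = math.exp(logprobs[positive])
    p_neg = math.exp(logprobs[negative])
    # Renormalize over just the two class options.
    return p_pos / (p_pos + p_neg)
```

The same processor then handles both a Yes/No template and a True/False template, because the verbalization stays with the template rather than being baked into the processor.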
@elronbandel Thanks, but I am not sure I follow how this post-processor is related to the template fields. We want this to be used on the inference engine predictions, across many templates for which this pattern would be applicable.
BTW, there are some other processors like this that expect specific strings, for example YesToOneElseZero and StanceToProCon.
I agree, it need not use a task "options" field.
A processor is tied to a template and not necessarily to a task (as in the multiple choice work Yotam did).
The "options" field (which is not a task field but an internal unitxt field) was our way of getting logprobs internally. It takes into account template nuances, such as whether the model is expected to output the choice numeral (a., 1., etc.) in multiple choice, and whether there is a target prefix. All I suggest is that you fully understand the current mechanism, as it was developed very carefully.
Yes, I was not too familiar with the options mechanism (we should document it).
Looking at the code, it's not used in unitxt logprob inference but in the internal package inference.
"options" are added to the task data in the MultipleChoiceTemplate (if they are not already in the task).
In any case, it's a different scenario in the LLM as judge, where the user defines a template:
"Is the following sentence correct? Answer with one word ('Yes' or 'No').
'The sky is blue'"
The model returns:
"Yes it is."
The post-processor wants to return the probability of "Yes" vs. "No" at the relevant token (the first), so it converts the text into a float score.
Perhaps if we force the user of the LLM as Judge to use a MultipleChoiceTemplate, it would have the options field - but do we want this for LLM as Judge? It seems cumbersome.
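The scenario above can be sketched as follows. The per-token logprob format and the function name are assumptions for illustration; the actual InferDictsToBinaryLogprobs implementation may differ:

```python
import math


def yes_no_score(token_logprobs, class_names=("No", "Yes")):
    """Turn the first token's logprobs into P("Yes") for a generation
    like "Yes it is.". `token_logprobs` is assumed to be a list of
    dicts, one per generated token, mapping candidate token strings
    to their logprobs."""
    neg, pos = class_names
    first = token_logprobs[0]  # only the first token carries the verdict
    # Use a very low logprob when a class token is missing from the
    # returned candidates.
    lp_pos = first.get(pos, -100.0)
    lp_neg = first.get(neg, -100.0)
    # Renormalize the two class probabilities into a float score.
    return math.exp(lp_pos) / (math.exp(lp_pos) + math.exp(lp_neg))
```

Trailing tokens ("it is.") are ignored; only the first token's distribution over the two class strings determines the score.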
IMHO there is a better existing solution to what this PR does (detailed in the comments), and it should be handled differently, based on the similar solutions for logprobs we implemented in the past.
Signed-off-by: Ariel Gera <[email protected]>
This PR has 4 changes:
@elronbandel - I think your concerns are only around (4). @pawelknes - can you please review (1) + (2)? @elronbandel - do you review (3)? Regarding (4) - I think @elronbandel thinks it is better addressed in an inference engine, but right now these are very localized changes. @arielge - can you say how you are going to use the new post-processor?
Signed-off-by: lilacheden <[email protected]>
Regarding (4): the desired design is described in issue #1128. The benefit of implementing it this way is that the algorithm @arielge is suggesting will be compatible with many tasks, such as multiple choice and classification tasks, and will work out of the box with all the related templates and tasks without requiring any modifications.
Thanks @yoavkatz @elronbandel. The intention was to use the post-processor as part of a flow that includes an inference engine. For example, you can have a judge metric that is initialized with a specific template and task, processes the user inputs with the given template, and sends them to the inference engine. Then the inference engine outputs are fed to the post-processor to get the desired score.
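That flow can be sketched with toy stand-ins. The names (`judge_score`, `fake_engine`) and the logprob format are hypothetical; a real judge metric would use an actual unitxt template, task, and inference engine:

```python
import math


def judge_score(instance, template, engine, post_processor):
    prompt = template.format(**instance)  # 1. verbalize with the judge template
    prediction = engine(prompt)           # 2. inference engine returns logprobs
    return post_processor(prediction)     # 3. post-process into a float score


# Toy stand-ins so the sketch is runnable without a real engine.
template = "Is the following sentence correct? Answer 'Yes' or 'No'.\n{sentence}"
fake_engine = lambda prompt: [{"Yes": -0.2, "No": -1.8}]  # fixed fake logprobs
yes_prob = lambda pred: math.exp(pred[0]["Yes"]) / (
    math.exp(pred[0]["Yes"]) + math.exp(pred[0]["No"])
)

score = judge_score({"sentence": "The sky is blue"}, template, fake_engine, yes_prob)
```

The point is that the post-processor sits at the end of the metric's internal pipeline, after inference, rather than being attached to any particular task.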