-
Well, I think if we really need to sample from the beam_size outputs, this can always be done as a post-processing step: let the user sample or rerank over the returned hypotheses. That being said, if we really want this embedded as an option, we might want to add some form of random selection over the finished beams.
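For illustration, a minimal post-processing sketch (hypothesis strings, scores, and variable names are made up, not the eole API): given the n_best hypotheses and their log probabilities returned for one example, sample one hypothesis instead of always taking the top-scoring one.

```python
import math
import random

# Hypothetical output of the predictor for one example: n_best hypotheses
# together with their log-probabilities (placeholder values).
hypotheses = ["hypothesis A", "hypothesis B", "hypothesis C"]
log_probs = [-1.2, -1.5, -2.3]

# Instead of taking the argmax, sample one hypothesis proportionally to exp(log_prob).
weights = [math.exp(lp) for lp in log_probs]
sampled = random.choices(hypotheses, weights=weights, k=1)[0]
print(sampled)
```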
-
My point is not really relevant, in fact. When using beam_size = 20 with greedy search, it automatically returns 20 hypotheses for each example. The code sorts those according to log prob, but when using an estimator just afterwards we can rerank them.
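A minimal sketch of what that reranking could look like, assuming the hypotheses come back sorted by log prob and `quality_estimator` is a hypothetical scoring callable (not an eole function):

```python
def rerank(hypotheses, quality_estimator):
    """Re-sort hypotheses by an external estimator score instead of log prob."""
    scored = [(quality_estimator(hyp), hyp) for hyp in hypotheses]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hyp for _, hyp in scored]

# Toy usage: here the "estimator" is just the hypothesis length.
reranked = rerank(["a short one", "a somewhat longer hypothesis"], quality_estimator=len)
print(reranked)
```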
-
@francoishernandez @funboarder13920
When we perform topk / nucleus sampling in greedy_search, we do the following:
For a given batch_size / beam_size pair, we start by tiling each example beam_size times.
Then we advance step by step, sampling a "random" token at each step.
At the end, when all beams are finished, we do this:
https://github.com/eole-nlp/eole/blob/main/eole/predict/greedy_search.py#L286-L295
which means we retain the n_best or beam_size beams of each example in the batch according to "scores" (log prob).
The question is:
Should we instead pick n_best / beam_size beams at random from the pool of finished beams, to make sure we keep some diversity?
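To make the two options concrete, here is a small sketch (the data and variable names are made up, not the actual greedy_search.py code):

```python
import random

# Pool of finished beams for one example: (logprob_score, token_ids) pairs.
finished = [(-0.9, [5, 7, 2]), (-1.4, [5, 9, 2]), (-2.1, [6, 3, 2]), (-2.8, [8, 3, 2])]
n_best = 2

# Current behaviour: keep the n_best highest-scoring beams.
top_by_score = sorted(finished, key=lambda beam: beam[0], reverse=True)[:n_best]

# Alternative under discussion: pick n_best beams uniformly at random, so the
# diversity introduced by top-k / nucleus sampling is preserved in the output.
random_pick = random.sample(finished, k=n_best)

print(top_by_score)
print(random_pick)
```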