Skip to content

Latest commit

 

History

History
76 lines (51 loc) · 3.71 KB

beam-search.md

File metadata and controls

76 lines (51 loc) · 3.71 KB

BEAM Search

Idea

The key idea of Beam Search is to sample the most likely values for each time step and use them as input for the further. Similar to greedy search the Beam Search algorithm is commonly used in encoder decoder architectures. However, the main difference is in not only using the most likey word, but the B most likely ones. Based on those different parallel computations will be made to calculate the further best values with respect to the maximum probability of the prediction.

Improvement

In comparison to greedy search the beam search algorithm does not prefer common values since several likely will be selected for further calculations.

Concept

  1. Define the Beam width:
    1. If you choose to define a large B you obtain better but slower results
    2. A small B will result worse but faster
  1. Pick the most likely B values for the first timestep
  1. Based on the first B values, pick the next B most likely ones
  1. (you have B copies of your network)

  2. Repeat picking B values

    1. In each iteration only B values will propagate
    2. B values over all results

BeamSearch - Reference

Refinement:

    1. Prefers short predictions
    2. Rounding errors/ underflow
    1. Insert log -> Sum of log
    2. Numerical more stable
    3. Prefers short predictions
    1. Where (normalization factor)

Evaluation

Error Analysis: vs.

  • where : human prediction
  • and : machine prediction

Case 1:

>

  • Beam search is at faul -> change B width

Case 2:

<=

  • RNN is at fault -> change architecture

Production

References

  1. Dive into DeepLearning
  2. Beam Search — A Search Strategy