GRU - Gated Recurrent Unit

Idea

To address the short-term memory problem common to plain RNNs, GRUs add mechanisms for retaining the desired information. A GRU keeps the overall RNN architecture and only changes the activation block: information is stored in so-called memory cells whose content is regulated by gates, which lets the network carry relevant information across long sequences.

Improvement

  • Capable of storing long-term information
  • Faster to compute than LSTMs (fewer gates, fewer parameters)
  • Fairly easy to implement
  • The sigmoid in the gates is useful against vanishing gradients:
    • close to 0 -> the previous cell value is carried over almost unchanged, so the stored information (and its gradient) survives long sequences; see the sketch below
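
A small numeric sketch of the last point (plain Python, made-up values): the GRU cell update interpolates between the old cell value and a new candidate, so a gate near 0 preserves the stored value over many steps, while a gate near 1 quickly overwrites it.

```python
def run(gamma_u, steps=50):
    """Iterate the GRU cell update c = u * c_tilde + (1 - u) * c."""
    c, c_tilde = 1.0, -0.7   # made-up stored value and candidate
    for _ in range(steps):
        c = gamma_u * c_tilde + (1 - gamma_u) * c
    return c

print(run(gamma_u=0.001))  # ~0.92: gate near 0 keeps the old value
print(run(gamma_u=0.9))    # ~-0.7: gate near 1 overwrites with the candidate
```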

Concept

The GRU consists of two gates and one memory cell.

$a$: activation

$c$: memory cell, equal to the hidden state ($h$ in other notations)

$\tilde{c}$: candidate for replacing $c$ (also written $\tilde{h}$)

$\Gamma_u$: update gate. Decides when to update (most of the time the value will be close to 0 or 1); also written $u$ or $z$

$\Gamma_r$: relevance gate. Decides how relevant $c^{\langle t-1 \rangle}$ is for computing the candidate $\tilde{c}^{\langle t \rangle}$; also written $r$

Calculus
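
In the notation above, the standard GRU equations (as in the referenced paper, with per-gate weight matrices $W$ and biases $b$) are:

$$\Gamma_u = \sigma(W_u[c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$$

$$\Gamma_r = \sigma(W_r[c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_r)$$

$$\tilde{c}^{\langle t \rangle} = \tanh(W_c[\Gamma_r \odot c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$$

$$c^{\langle t \rangle} = \Gamma_u \odot \tilde{c}^{\langle t \rangle} + (1 - \Gamma_u) \odot c^{\langle t-1 \rangle}$$

$$a^{\langle t \rangle} = c^{\langle t \rangle}$$

where $\sigma$ is the sigmoid, $[\cdot,\cdot]$ is concatenation, and $\odot$ is element-wise multiplication.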

Architecture

GRU - reference
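
A minimal NumPy sketch of a single GRU cell implementing the equations above (all names are illustrative, not taken from a specific library):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(c_prev, x, params):
    """One GRU step: returns the new memory cell / hidden state."""
    W_u, b_u, W_r, b_r, W_c, b_c = params
    concat = np.concatenate([c_prev, x])

    u = sigmoid(W_u @ concat + b_u)        # update gate
    r = sigmoid(W_r @ concat + b_r)        # relevance (reset) gate

    # candidate uses only the "relevant" part of the previous cell
    concat_r = np.concatenate([r * c_prev, x])
    c_tilde = np.tanh(W_c @ concat_r + b_c)

    # interpolate: u ~ 0 keeps the old cell, u ~ 1 takes the candidate
    return u * c_tilde + (1 - u) * c_prev

# usage: hidden size 4, input size 3, random parameters
rng = np.random.default_rng(0)
n_h, n_x = 4, 3
params = [rng.standard_normal((n_h, n_h + n_x)), np.zeros(n_h),
          rng.standard_normal((n_h, n_h + n_x)), np.zeros(n_h),
          rng.standard_normal((n_h, n_h + n_x)), np.zeros(n_h)]

c = np.zeros(n_h)
for x in rng.standard_normal((10, n_x)):   # a sequence of 10 inputs
    c = gru_cell(c, x, params)
print(c.shape)  # (4,)
```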

Evaluation

Production
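
In production one would use a framework implementation rather than hand-rolled code; a minimal sketch, assuming PyTorch's built-in `nn.GRU`:

```python
import torch
import torch.nn as nn

# batch of 8 sequences, 20 time steps, 3 input features, 16 hidden units
gru = nn.GRU(input_size=3, hidden_size=16, num_layers=1, batch_first=True)
x = torch.randn(8, 20, 3)

output, h_n = gru(x)       # output: per-step hidden states, h_n: final state
print(output.shape)        # torch.Size([8, 20, 16])
print(h_n.shape)           # torch.Size([1, 8, 16])
```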

References

  1. GRU - Wikipedia: https://en.wikipedia.org/wiki/Gated_recurrent_unit
  2. GRU Paper: Cho et al. (2014), Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, arXiv:1406.1078
  3. Illustrated GRU