# Offline Bilingual Word Vectors, Orthogonal Transformations and the Inverted Softmax

| Title | Offline Bilingual Word Vectors, Orthogonal Transformations and the Inverted Softmax |
|---|---|
| Authors | Samuel L. Smith, David H. P. Turban, Steven Hamblin, Nils Y. Hammerla |
| Year | 2017 |
| URL | https://arxiv.org/abs/1702.03859 |

Many pre-trained monolingual embeddings are available, but very few multilingual ones. Nevertheless, it is possible to align pre-trained monolingual embeddings with a linear transformation trained on a bilingual dictionary.
Smith et al. show that this transformation should be orthogonal: since similarities between source and target words should be symmetric, the matrix that maps source embeddings into the target space should, when transposed, also map the target embeddings back into the source space, which holds exactly when the matrix is orthogonal. Because orthogonal transformations are robust to noise, they can moreover be trained on artificial "pseudo-dictionaries" of identical strings shared between the two languages.
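As a minimal sketch of how such an orthogonal map can be obtained (this is the classical orthogonal Procrustes solution via SVD; the variable names and the dictionary indices below are hypothetical, not taken from the released code):

```python
import numpy as np

def learn_orthogonal_map(X_src, Y_tgt):
    """Learn an orthogonal matrix W such that X_src @ W ~ Y_tgt.

    X_src, Y_tgt: (n_pairs, dim) arrays holding the source and target
    vectors of the word pairs in a bilingual (or identical-strings)
    dictionary. The best orthogonal map in the least-squares sense is
    obtained from the SVD of the cross-covariance matrix.
    """
    M = X_src.T @ Y_tgt            # cross-covariance of paired vectors
    U, _, Vt = np.linalg.svd(M)
    W = U @ Vt                     # orthogonal: W @ W.T = I
    return W

# Hypothetical usage: align the whole source vocabulary into the target space.
# src_emb, tgt_emb, dict_src_ids, dict_tgt_ids are placeholders.
# W = learn_orthogonal_map(src_emb[dict_src_ids], tgt_emb[dict_tgt_ids])
# aligned_src = src_emb @ W
```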

In addition, Smith et al. introduce the "inverted softmax" for identifying translation pairs. Instead of estimating the probability that a source word translates into a candidate target word, the inverted softmax estimates the probability that the candidate target word translates back into the source word. It therefore deals better with "hubs", target words that are nearest neighbours of many different source words. Combining orthogonal transformations with the inverted softmax increases precision from 34% to 43% on English-to-Italian word translation.
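A short sketch of the idea, assuming length-normalised embeddings and normalisation over the full source vocabulary (the paper normalises over a sample of source words; `beta` is an inverse-temperature hyperparameter and all names here are illustrative):

```python
import numpy as np

def inverted_softmax_translations(src_aligned, tgt_emb, beta=10.0):
    """Pick a translation for each source word with the inverted softmax.

    src_aligned: (n_src, dim) source vectors already mapped into the
    target space; tgt_emb: (n_tgt, dim) target vectors. Rows are assumed
    length-normalised, so dot products are cosine similarities.
    """
    sims = src_aligned @ tgt_emb.T                # (n_src, n_tgt) similarities
    logits = beta * sims
    # A standard softmax would normalise each row (over target candidates);
    # the inverted softmax normalises each column (over source words), so a
    # "hub" target close to many source words gets its probability diluted.
    logits -= logits.max(axis=0, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=0, keepdims=True)     # normalise over source words
    return probs.argmax(axis=1)                   # best target index per source word
```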

Smith et al. have released their code and alignment matrices for the fastText embeddings of 78 languages at https://github.com/Babylonpartners/fastText_multilingual.