
Welcome to the bert-vis wiki!

Deep contextualized word representations such as BERT and ELMo have achieved great success across many NLP tasks. Unlike traditional embeddings such as word2vec or GloVe, a BERT vector depends on the surrounding words, so the embedding of the same word may differ from sentence to sentence. The good news is that these embeddings encode contextual information directly in the vectors, and experiments have shown that this genuinely improves results on many tasks. The bad news is that they require more resources and time to compute, and the embeddings are not reusable, since the same word receives different vectors in different sentences.

In this project, we plan to build an interactive UI for visualizing BERT/ELMo embeddings to gain a better understanding of these contextualized embeddings. We also want to compare the vectors of the same word across sentences, in the hope of finding a way to cluster them so that a reduced set of vectors can serve as reusable pre-trained embeddings.
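To make the context-dependence concrete, here is a minimal sketch (not this project's code) that extracts BERT vectors for the same word in two sentences and compares them with cosine similarity. It assumes the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint; the project itself may extract embeddings differently.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual embedding of `word` in `sentence`.

    Assumes `word` maps to a single subtoken (true for common words
    like "bank"); multi-subtoken words would need pooling.
    """
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # Hidden states of the last layer: (seq_len, hidden_size)
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index(word)]

# The same surface form "bank" in two unrelated contexts.
v1 = word_vector("The bank raised interest rates.", "bank")
v2 = word_vector("We walked along the river bank.", "bank")
cos = torch.nn.functional.cosine_similarity(v1, v2, dim=0)
print(f"cosine similarity between the two 'bank' vectors: {cos.item():.3f}")
```

A similarity noticeably below 1.0 here is exactly the phenomenon the visualization targets: if vectors of the same word form a few tight clusters (e.g. one per word sense), cluster centroids could stand in as a small reusable set of pre-trained embeddings.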
