
[Question] Which is better for a Cross-Lingual Classification Task: LASER or XLM? #106

Open
Ayush-iitkgp opened this issue Nov 8, 2019 · 16 comments

Comments

@Ayush-iitkgp

Hello,
I have training data with labels in English. Now I want to use this data to make predictions for other languages. I saw that XLM and LASER both support cross-lingual classification. However, they are not benchmarked on the same dataset, so it is difficult to know which model is better. Can someone help me determine which one (XLM or LASER) is better for cross-lingual classification?

@Ayush-iitkgp
Author

XLM is better than LASER

  1. XLM Benchmark: https://github.com/facebookresearch/XLM#ii-cross-lingual-language-model-pretraining-xlm
  2. LASER Benchmark: https://github.com/facebookresearch/LASER/tree/master/tasks/xnli#results

@PiotrCzapla

You can consider further improving the results for regular document classification by following this approach: http://nlp.fast.ai/classification/2019/09/10/multifit.html. We used LASER because XLM wasn't available when we were testing MultiFiT. I would be very interested to see how it works with XLM.

@MastafaF

XLM does not cover all 100 languages, to the best of my knowledge. Which model/implementation did you use, @Ayush-iitkgp?

@MastafaF

Indeed, XLM with MLM+TLM only covers 15 languages currently...

@Ayush-iitkgp
Author

@MastafaF I am starting with the XLM model with 15 languages. XLM does support 100 languages, see here.

@MastafaF

@Ayush-iitkgp From my reading of their paper a few weeks ago, my understanding is that the MLM+TLM version is the one that achieves the best results in terms of multilingual embeddings and can outperform LASER. Indeed, multilingual BERT already uses MLM for a large number of languages, and the quality of its multilingual embeddings is not optimal.

@hoschwenk
Contributor

hoschwenk commented Nov 20, 2019

Hello,
Which approach is better depends on the classification task, and perhaps on the languages you want to transfer to. Also, you may need a "deeper" classifier for LASER than for XLM.
The best option is to try both approaches :-)
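For the LASER route, here is a minimal sketch of such a "deeper" classifier head, assuming 1024-dimensional sentence embeddings have already been computed with the LASER encoder; the hidden sizes and number of classes are illustrative, not taken from this thread.

```python
import torch
import torch.nn as nn

class LaserClassifier(nn.Module):
    """Small feed-forward classifier on top of precomputed LASER sentence embeddings."""
    def __init__(self, embed_dim=1024, hidden_dim=256, num_classes=3, dropout=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, embeddings):
        # embeddings: (batch, 1024) LASER sentence vectors -> (batch, num_classes) logits
        return self.net(embeddings)

# Random tensors stand in for real LASER embeddings here, just to show the shapes.
model = LaserClassifier(num_classes=3)
dummy_batch = torch.randn(8, 1024)
logits = model(dummy_batch)
```

Since the LASER encoder is language-agnostic, the same head trained on English embeddings can then be applied directly to embeddings of other languages.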

@Bachstelze

@Ayush-iitkgp How is your approach doing?
I will try classification with model freezing and an extra layer on top of XLM-R for low-resource languages. FastText could be interesting if time matters.
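For reference, a minimal sketch of that freeze-and-add-a-layer setup using the Hugging Face transformers implementation of XLM-R (not allennlp); the model name, pooling choice, and number of classes are illustrative assumptions, and a recent transformers version is assumed.

```python
import torch
import torch.nn as nn
from transformers import XLMRobertaModel, XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
encoder = XLMRobertaModel.from_pretrained("xlm-roberta-base")

# Freeze the pretrained multilingual encoder so that only the new head is trained.
for param in encoder.parameters():
    param.requires_grad = False

num_classes = 3  # illustrative
head = nn.Linear(encoder.config.hidden_size, num_classes)

def classify(sentences):
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**batch)
    # Use the representation of the first (<s>) token as the sentence embedding.
    cls_embeddings = outputs.last_hidden_state[:, 0, :]
    return head(cls_embeddings)

logits = classify(["This is a test sentence.", "Das ist ein Testsatz."])
```

Only the new head would be optimized during training, which keeps the multilingual representations intact for zero-shot transfer.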

@Ayush-iitkgp
Author

@Bachstelze My approach involved fine-tuning the XLM model on English data and using zero-shot classification to predict on German and Spanish. However, the model's accuracy on German and Spanish is below 20%, so I am still figuring out what can be done. The problem in my case is that I only have labeled English data and a very small amount of non-English data for performance measurement. Do you have any recommendations?

@Bachstelze

How is the performance of the model on English?
The following two recommendations could help with the German and Spanish accuracy problem:

  • Preserve the multilingual layers and only train an additional layer on top. For example, set the parameter 'trainable' to 'false' in allennlp.
  • Translate the labeled data into the other languages, for example with transformer.wmt19.en-de (see the sketch after this list).
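For the second recommendation, a rough sketch of translating the labeled English data with the fairseq torch.hub interface for transformer.wmt19.en-de; the checkpoint and tokenizer arguments follow the fairseq hub examples, and the example sentences and labels are made up.

```python
import torch

# Load the pretrained WMT'19 English-German translation model from fairseq via torch.hub.
en2de = torch.hub.load(
    "pytorch/fairseq",
    "transformer.wmt19.en-de",
    checkpoint_file="model1.pt",
    tokenizer="moses",
    bpe="fastbpe",
)
en2de.eval()

# Translate the labeled English sentences, carrying the original labels over.
english_examples = [("The service was excellent.", "positive")]  # illustrative data
german_examples = [(en2de.translate(text), label) for text, label in english_examples]
print(german_examples)
```

The translated sentences keep their English labels, so they can be mixed into the training set to reduce the gap between zero-shot and in-language performance.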

Hopefully you give the power of your knowledge back to the people.

@loretoparisi

loretoparisi commented Feb 3, 2020

  1. https://github.com/facebookresearch/XLM#ii-cross-lingual-language-model-pretraining-xlm

Now XLM-R covers 100 languages, so it makes a lot of sense to replace LASER with XLM-R:

XLM-R is the new state-of-the-art XLM model. XLM-R shows the possibility of training one model for many languages while not sacrificing per-language performance. It is trained on 2.5 TB of CommonCrawl data, in 100 languages
https://github.com/facebookresearch/XLM#ii-cross-lingual-language-model-pretraining-xlm

@MastafaF

MastafaF commented Feb 14, 2020

Hi @loretoparisi,
My experiments on the WMT2012 task (similarity search), comparing XLM trained with MLM on 100 languages against LASER, show that LASER clearly outperforms XLM in this case.
I haven't tried XLM-R yet, since on my side it is still quite buggy, but I will be happy to share more about it.

Cheers,
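For anyone who wants to reproduce this kind of comparison, here is a rough sketch of a similarity-search evaluation, assuming sentence embeddings for aligned source and target sentences have already been computed with the encoder under test (LASER, XLM, ...); the metric shown is plain cosine nearest-neighbor retrieval accuracy.

```python
import numpy as np

def similarity_search_accuracy(src_embeddings, tgt_embeddings):
    """Fraction of source sentences whose nearest target sentence (by cosine
    similarity) is the aligned translation at the same index.

    Both inputs are (n_sentences, dim) arrays produced by the encoder under test.
    """
    src = src_embeddings / np.linalg.norm(src_embeddings, axis=1, keepdims=True)
    tgt = tgt_embeddings / np.linalg.norm(tgt_embeddings, axis=1, keepdims=True)
    scores = src @ tgt.T                    # pairwise cosine similarities
    predictions = scores.argmax(axis=1)     # most similar target per source sentence
    return (predictions == np.arange(len(src))).mean()

# Illustrative usage with random vectors standing in for real encoder outputs.
rng = np.random.default_rng(0)
fake_src = rng.normal(size=(100, 1024))
fake_tgt = fake_src + 0.1 * rng.normal(size=(100, 1024))  # noisy "translations"
print(similarity_search_accuracy(fake_src, fake_tgt))
```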

@loretoparisi

loretoparisi commented Feb 15, 2020

@MastafaF Pretty interesting test! We have not tested XLM-R yet. I wonder why the XLM model does not outperform LASER's bi-LSTM encoder, because according to the results they presented it should be the opposite; in any case, we did not replace LASER, for several other reasons.
At this point, a test with XLM-R must be done!

@MastafaF

MastafaF commented Feb 17, 2020

Hi @loretoparisi, XLM-R gives poor results at the moment. Stay tuned for further experiments; I will post the link soon for replication 😄

@MastafaF

MastafaF commented Mar 10, 2020

Hi @loretoparisi, you can check some tests here on WMT2012, reproducing the experiments from LASER and doing a comparative study against other multilingual architectures.
I plan to maintain it as often as possible to compare SOTA solutions.
Feel free to raise an issue or send a PR if need be. Hope this helps! 😃

@loretoparisi

@MastafaF thank you very much! It's a very comprehensive and rigorous analysis 💯
