The GLUE Benchmark is a collection of nine classification tasks on single sentences or pairs of sentences:

  • CoLA (Corpus of Linguistic Acceptability): Determine if a sentence is grammatically correct or not.
  • MNLI (Multi-Genre Natural Language Inference): Determine if a sentence entails, contradicts, or is unrelated to a given hypothesis. (This dataset has two versions: a matched one, where the validation and test sets come from the same distribution as the training data, and a mismatched one, where they use out-of-domain data.)
  • MRPC (Microsoft Research Paraphrase Corpus): Determine if two sentences are paraphrases of one another or not.
  • QNLI (Question-answering Natural Language Inference): Determine if the answer to a question is contained in the second sentence or not. (This dataset is built from the SQuAD dataset.)
  • QQP (Quora Question Pairs): Determine if two questions are semantically equivalent or not.
  • RTE (Recognizing Textual Entailment): Determine if a sentence entails a given hypothesis or not.
  • SST-2 (Stanford Sentiment Treebank): Determine if a sentence has a positive or negative sentiment.
  • STS-B (Semantic Textual Similarity Benchmark): Determine the similarity of two sentences with a score from 0 to 5.
  • WNLI (Winograd Natural Language Inference): Determine if a sentence containing an ambiguous pronoun entails a version of that sentence in which the pronoun has been replaced by a candidate referent. (This dataset is built from the Winograd Schema Challenge dataset.)
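
All nine tasks are available as configurations of the "glue" dataset on the Hugging Face Hub. As a minimal sketch (assuming the `datasets` library is installed), here is how one task, MRPC, might be loaded and inspected:

```python
from datasets import load_dataset

# Each GLUE task is a configuration of the "glue" dataset; the valid
# names are "cola", "mnli", "mrpc", "qnli", "qqp", "rte", "sst2",
# "stsb", and "wnli".
raw_datasets = load_dataset("glue", "mrpc")

# The dataset comes pre-split: MRPC has train/validation/test splits.
# (MNLI instead exposes validation_matched and validation_mismatched,
# matching the two versions described above.)
print(raw_datasets)

# A single MRPC example: two sentences plus a binary paraphrase label.
print(raw_datasets["train"][0])
```

Each task also has its own evaluation metric (accuracy, F1, Matthews correlation, or Pearson/Spearman correlation, depending on the task); with the `evaluate` library, these can be loaded analogously, e.g. `evaluate.load("glue", "mrpc")`.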