This repository has been archived by the owner on Jul 27, 2022. It is now read-only.
---
---
<!DOCTYPE html>
<html lang="en-us">
<head>
{% include meta.html %}
<title>AllenNLP - DROP Dataset</title>
</head>
<body id="top">
<div id="page-content">
{% include header.html %}
<div class="banner banner--interior-hero">
<div class="constrained constrained--sm">
<div class="banner--interior-hero__content">
<h2>DROP: A Reading Comprehension Benchmark Requiring <b>D</b>iscrete <b>R</b>easoning <b>O</b>ver <b>P</b>aragraphs</h2>
<p class="t-sm">Dheeru Dua, Yizhong Wang, Pradeep Dasigi,<br> Gabriel Stanovsky, Sameer Singh and Matt Gardner<br>NAACL 2019.</p>
</div>
</div>
</div>
<div class="constrained constrained--med">
<p>
With system performance on existing reading comprehension benchmarks nearing or surpassing human performance, we need a new, hard
dataset that pushes systems to actually <i>read</i> paragraphs of text. DROP is a crowdsourced,
adversarially-created, 96k-question benchmark, in which a system must resolve references in a question, perhaps to multiple input
positions, and perform discrete operations over them (such as addition, counting, or sorting). These operations require a much
more comprehensive understanding of the content of paragraphs than prior datasets required.
</p>
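<p>
As a rough illustration of the discrete operations mentioned above, the sketch below pulls the numbers out of a made-up passage and
applies addition, counting, and sorting to them. The passage, the regular expression, and the helper name
<code>extract_numbers</code> are assumptions made for this example; they are not taken from the dataset or from NAQANet.
</p>

```python
import re

# Hypothetical passage written for this example (not taken from DROP).
passage = (
    "The home team kicked field goals of 23, 41 and 35 yards, "
    "then scored a 7-yard touchdown run."
)


def extract_numbers(text):
    """Return every integer in the text, in order of appearance."""
    return [int(token) for token in re.findall(r"\d+", text)]


numbers = extract_numbers(passage)

# Assumption for this toy passage: the first three numbers are the field goals.
field_goals = numbers[:3]

total_yards = sum(field_goals)        # addition
num_field_goals = len(field_goals)    # counting
longest = sorted(field_goals)[-1]     # sorting, then taking the largest

print(total_yards, num_field_goals, longest)
```

<p>
A question such as "How many yards of field goals were kicked in total?" needs all three field-goal mentions to be located before
any arithmetic can happen, which is the part a real system has to learn rather than hard-code.
</p>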
<p>
AllenNLP provides an easy way for you to get started with this dataset, with a dataset reader that can be used with any model you
design, and a reference implementation of the NAQANet model that was introduced in the DROP paper. Find more details in the links
below.
</p>
<ul>
<li><a href="https://www.semanticscholar.org/paper/DROP%3A-A-Reading-Comprehension-Benchmark-Requiring-Dua-Wang/248352852f5baa2ef333077c6084a618cb1ea0fd" target="_blank">Paper</a>, describing the dataset and our initial model for it,
Numerically-Augmented QANet (NAQANet), which adds some rudimentary numerical reasoning capability on top of
<a href="https://www.semanticscholar.org/paper/QANet%3A-Combining-Local-Convolution-with-Global-for-Yu-Dohan/8c1b00128e74f1cd92aede3959690615695d5101">QANet</a>.</li>
<li><a href="https://s3-us-west-2.amazonaws.com/allennlp/datasets/drop/drop_dataset.zip">Data</a>, with about 77k questions in the
train set and 9.5k questions in the dev set (the hidden test set is about the same size as the dev set). The data is distributed under the <a href="https://creativecommons.org/licenses/by-sa/4.0/legalcode">CC BY-SA 4.0</a> license.</li>
<li>Code for the NAQANet model lives in AllenNLP:
<a href="https://github.com/allenai/allennlp/blob/master/allennlp/data/dataset_readers/reading_comprehension/drop.py">dataset reader</a>,
<a href="https://github.com/allenai/allennlp/blob/master/allennlp/models/reading_comprehension/naqanet.py">NAQANet model</a>.
Code for the other baselines in the paper may be added to AllenNLP in the future; open an issue on GitHub if there's something
in particular you'd like to see.</li>
<li><a href="https://leaderboard.allenai.org/drop" target="_blank">Leaderboard</a> with an automated Docker-based evaluation on
a hidden test set.
</li>
<li><a href="https://demo.allennlp.org/reading-comprehension/NjI4NjM5">NAQANet demo</a> - see how well current NLP systems
understand paragraphs! The examples in the select box at the top should give you some sense of what kinds of questions are in
DROP, what the system can do well, and a bit of what it can't. Change the paragraphs, input your own, try your own complex
questions, and see what you find. If you find something interesting,
<a href="https://twitter.com/ai2_allennlp">let us know on Twitter</a>!</li>
</ul>
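<p>
For readers who want to inspect the data without AllenNLP, here is a minimal sketch using only the Python standard library. It
assumes a JSON layout in which each passage ID maps to a <code>passage</code> string and a list of <code>qa_pairs</code>, and each
answer has <code>number</code>, <code>spans</code>, and <code>date</code> fields. That layout, the sample record, and the helper
names <code>iter_questions</code> and <code>answer_text</code> are assumptions for illustration; check the downloaded files before
relying on them.
</p>

```python
import json

# Tiny in-memory stand-in for a DROP-style file. The layout is an assumption;
# the record itself is invented for this example.
sample = json.loads("""
{
  "example_passage_1": {
    "passage": "The team scored 14 points in the first half and 10 in the second.",
    "qa_pairs": [
      {
        "query_id": "q1",
        "question": "How many points did the team score in total?",
        "answer": {
          "number": "24",
          "spans": [],
          "date": {"day": "", "month": "", "year": ""}
        }
      }
    ]
  }
}
""")


def iter_questions(dataset):
    """Yield (passage_id, passage, question, answer) for every question."""
    for passage_id, entry in dataset.items():
        for qa in entry["qa_pairs"]:
            yield passage_id, entry["passage"], qa["question"], qa["answer"]


def answer_text(answer):
    """Return whichever answer field is filled in: number, spans, or date."""
    if answer.get("number"):
        return answer["number"]
    if answer.get("spans"):
        return ", ".join(answer["spans"])
    date = answer.get("date", {})
    return " ".join(date[key] for key in ("day", "month", "year") if date.get(key))


rows = [(pid, question, answer_text(ans)) for pid, _, question, ans in iter_questions(sample)]
print(rows)
```

<p>
To try this on a real file, replace the <code>json.loads(...)</code> call with <code>json.load</code> on a file extracted from the
data archive linked above.
</p>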
<p><i>Citation:</i></p>
<pre>
@inproceedings{Dua2019DROP,
author={Dheeru Dua and Yizhong Wang and Pradeep Dasigi and Gabriel Stanovsky and Sameer Singh and Matt Gardner},
title={ {DROP}: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs},
booktitle={Proc. of NAACL},
year={2019}
}
</pre>
</div>
{% include footer.html %}
</div>
{% include svg-sprite.html %}
{% include scripts.html %}
</body>
</html>