intention-clarification

A dataset of clarification requests extracted from the AMI Corpus, annotated for whether they target intentionality.

Content

The data contains 338 dialogues, separated by double linebreaks.
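As a minimal sketch of reading this layout, the snippet below splits raw text into dialogue excerpts. It assumes that "double linebreaks" means one or more blank lines between excerpts; the function name `split_dialogues` is hypothetical and not part of this repository.

```python
import re
from typing import List


def split_dialogues(text: str) -> List[str]:
    """Split raw file content into dialogue excerpts on blank lines."""
    # Normalise Windows line endings so "\r\n\r\n" also counts as a double linebreak.
    text = text.replace("\r\n", "\n")
    # Assumption: excerpts are separated by at least one empty line.
    blocks = re.split(r"\n\s*\n", text)
    # Drop empty fragments and trim stray newlines at the edges of each excerpt.
    return [block.strip("\n") for block in blocks if block.strip()]
```

If the assumption holds, applying this to the full data file should yield 338 excerpts, which makes a quick sanity check on the separator.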

The dialogues were extracted from the AMI Corpus [1] using the heuristic detailed in [2] and annotated by two expert annotators.

Format & Annotation

Each dialogue contains:

10 utterances of prior context
the SOURCE of the clarification request (the utterance that is clarified), marked by "->"
the clarification request (CR) itself, marked by "-->"
the ANSWER to the CR, marked by "->"
the FOLLOW UP by the speaker who asked the CR, marked by "->"; if there is no follow-up, the 10 utterances after the answer are displayed instead.

If fewer than 10 utterances of prior or posterior context are available, all available utterances are displayed.

The last line of each excerpt is the annotation, which takes one of 5 labels:

"not" indicates that the CR is not actually a clarification request (i.e. a false positive of the heuristic).
"low" indicates that the CR is a low-level clarification request (channel, parsing, or resolution).
"int-rec" indicates that the CR is an intention-recognition CR.
"int-tr" indicates that the CR is an intention-adoption CR.
"ambig" indicates that the CR is ambiguous between two or more of the above categories.
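The excerpt layout described above can be sketched as a small parser for a single excerpt. This is an illustration under stated assumptions, not the authors' tooling: it assumes the "->" and "-->" markers appear at the start of a line (possibly after whitespace) and that the annotation line contains only the label. The names `Excerpt`, `parse_excerpt`, and `LABELS` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# The five annotation labels listed in this README.
LABELS = {"not", "low", "int-rec", "int-tr", "ambig"}


@dataclass
class Excerpt:
    lines: List[str]          # utterance lines in order, markers included
    cr_index: Optional[int]   # position of the "-->" (CR) line, if found
    marked: List[int]         # positions of "->" lines (SOURCE, ANSWER, FOLLOW UP)
    label: str                # annotation taken from the last line


def parse_excerpt(block: str) -> Excerpt:
    """Parse one excerpt: utterance lines followed by an annotation line."""
    rows = [row for row in block.splitlines() if row.strip()]
    if not rows:
        raise ValueError("empty excerpt")
    label = rows[-1].strip()
    if label not in LABELS:
        raise ValueError(f"unexpected annotation: {label!r}")
    lines = rows[:-1]
    cr_index = None
    marked = []
    for i, line in enumerate(lines):
        stripped = line.lstrip()
        # Assumption: markers are line prefixes; "-->" flags the CR itself.
        if stripped.startswith("-->"):
            cr_index = i
        elif stripped.startswith("->"):
            marked.append(i)
    return Excerpt(lines=lines, cr_index=cr_index, marked=marked, label=label)
```

Given the description, the "->" line before `cr_index` would be the SOURCE and the "->" lines after it the ANSWER and, where present, the FOLLOW UP.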

Use

The data points towards a significant, but understudied, class of clarification questions. Some discussion can be found in [3].

References

[1] Carletta, J. (2007). Unleashing the killer corpus: experiences in creating the multi-everything AMI meeting corpus. Language Resources and Evaluation 41(2), 181–190.

[2] Schlöder, J. J. and Fernández, R. (2015). Clarifying Intentions in Dialogue: A Corpus Study. Proceedings of the 11th International Conference on Computational Semantics (IWCS 2015).

[3] Schlöder, J. J. and Fernández, R. (2014). Clarification Requests on the Level of Uptake. Proceedings of the 18th Workshop on the Semantics and Pragmatics of Dialogue (SemDial 2014, "DialWatt").
