This repository accompanies the ICSME 2017 NIER track paper "Confusion Detection in Code Reviews". It provides the gold standard dataset of confusion annotations for code review comments from Android, together with the complete list of features of our confusion framework. The dataset has also been used in a follow-up study, "Confusion in Code Reviews: Reasons, Impacts and Coping Strategies", to appear in SANER 2019.
- features.xlsx - this file contains the complete list of features of each category of our confusion framework.
- gold-standard-dataset.xlsx - this file contains the gold standard dataset with the confusion annotation for the hedges category.
- Felipe Ebert (Federal University of Pernambuco - Brazil, Eindhoven University of Technology - The Netherlands)
- Fernando Castor (Federal University of Pernambuco - Brazil)
- Nicole Novielli (University of Bari - Italy)
- Alexander Serebrenik (Eindhoven University of Technology - The Netherlands)
When you use the data, we kindly ask you to cite it as follows:
Felipe Ebert, Fernando Castor, Nicole Novielli, Alexander Serebrenik. Confusion Detection in Code Reviews. 33rd International Conference on Software Maintenance and Evolution (ICSME'2017), New Ideas and Emerging Results. Shanghai, China. September 2017.
@INPROCEEDINGS{8094460,
author={F. Ebert and F. Castor and N. Novielli and A. Serebrenik},
booktitle={2017 IEEE International Conference on Software Maintenance and Evolution (ICSME)},
title={Confusion Detection in Code Reviews},
year={2017},
pages={549-553},
keywords={Androids;Humanoid robots;Labeling;Manuals;Software;Training;Uncertainty;code review;confusion;machine learning},
doi={10.1109/ICSME.2017.40},
month={Sept},}