Latest commit

0249afa · Aug 21, 2022

hotel_reviews

Datasets

This dataset was first introduced and later extended in the following papers:

[1] M. Ott, Y. Choi, C. Cardie, and J.T. Hancock. 2011. Finding Deceptive Opinion Spam by Any Stretch of the Imagination. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.

[2] M. Ott, C. Cardie, and J.T. Hancock. 2013. Negative Deceptive Opinion Spam. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

We include the original version of the hotel reviews dataset, along with the train, valid, and test splits used to train our machine learning models. The folder also contains 13.7 hotel reviews (download_pred_filter.txt) that the authors downloaded from TripAdvisor, following a protocol similar to the data collection process of the original dataset. The train, valid, and test split used to train a linear student model on BERT predictions is provided in the additional_downloads subdirectory.