Multilingual Deception Detection of GPT-generated Hotel Reviews

This repository contains the dataset and code for our paper.

Data

All data is available at all_data. Source label 0 denotes real hotel reviews and label 1 denotes fake (LLM-generated) hotel reviews.
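As a minimal sketch of the label convention, the snippet below reads a small in-memory CSV and attaches a readable name to each source label. The column names `text` and `label`, the sample rows, and the helper `load_reviews` are illustrative assumptions and may not match the actual files in all_data.

```python
import csv
import io

# Label convention stated in this README.
LABEL_NAMES = {0: "real", 1: "LLM-generated"}

# Hypothetical stand-in for a file in all_data. The column names
# "text" and "label" and both rows are made up for illustration.
SAMPLE_CSV = """text,label
"The room was clean and the staff were friendly.",0
"This hotel offers an unparalleled experience for every traveler.",1
"""


def load_reviews(handle):
    """Read rows and attach a readable name for each source label."""
    rows = []
    for row in csv.DictReader(handle):
        label = int(row["label"])
        rows.append(
            {"text": row["text"], "label": label, "source": LABEL_NAMES[label]}
        )
    return rows


reviews = load_reviews(io.StringIO(SAMPLE_CSV))
for review in reviews:
    print(review["source"], "|", review["text"])
```

To use this with a real file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle and adjust the column names to the actual schema.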

Topic modeling features can be explored interactively in topic_analysis.

All generation code is available at LLM_generation.

The XLM-RoBERTa, Random Forest, and Naive Bayes models, together with the interpretable features, are available at Deception Detection Models.