We aim to make the models used for classification explainable and to identify the features of a tweet that lead to its classification as hate speech or offensive language (the third category being neither).
Final paper: "Improving Hate Speech Classification on Twitter", available in this repo.
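
To make the goal above concrete, here is a minimal sketch of one common way to attribute a prediction to individual tweet features: with a linear bag-of-words model, each token's contribution to a class score is its weight times its count. This is an illustration only, not necessarily the method used in the paper. The names `WEIGHTS`, `BIAS`, and `explain`, the weight values, and the placeholder tokens are all hypothetical.

```python
from collections import Counter

CLASSES = ("hate_speech", "offensive", "neither")

# Hypothetical per-class token weights standing in for a trained linear model.
# Placeholder tokens such as "<slur>" are used instead of real vocabulary.
WEIGHTS = {
    "hate_speech": {"<slur>": 2.5, "hate": 1.0},
    "offensive": {"<profanity>": 2.0, "stupid": 1.2, "hate": 0.4},
    "neither": {"love": 1.5, "weather": 1.0},
}

# Hypothetical bias terms; "neither" wins when no known token is present.
BIAS = {"hate_speech": 0.0, "offensive": 0.0, "neither": 0.5}


def explain(tweet):
    """Return the predicted class and the tokens that pushed toward it."""
    counts = Counter(tweet.lower().split())

    # Contribution of a token to a class = weight * count (linear model).
    contributions = {
        label: {tok: WEIGHTS[label].get(tok, 0.0) * n for tok, n in counts.items()}
        for label in CLASSES
    }
    scores = {
        label: BIAS[label] + sum(contributions[label].values())
        for label in CLASSES
    }
    predicted = max(CLASSES, key=lambda label: scores[label])

    # Keep only tokens with a positive contribution, largest first.
    ranked = sorted(
        contributions[predicted].items(), key=lambda kv: kv[1], reverse=True
    )
    return predicted, [(tok, w) for tok, w in ranked if w > 0]


if __name__ == "__main__":
    label, top_tokens = explain("that take is stupid <profanity>")
    print(label, top_tokens)
```

With a real model, the weights would come from training (for example, the coefficients of a logistic regression over token counts), and the same ranking would show which tokens drove a given classification.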