This repository contains the dataset and experimental code for the work described in the paper "Roman Urdu Toxic Comment Classification".
The labeled Roman Urdu Toxic Comment Corpus can be accessed here.
The pre-trained word embeddings used in the experiments can be found here.
The code is implemented in Python 3.6 and requires Keras and TensorFlow.
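Before running the experiments, it may help to confirm that the environment matches these requirements. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and it assumes the dependencies are importable under the top-level names `keras` and `tensorflow`. It only checks that the packages can be located, not which versions are installed.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the names from `names` that cannot be located in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The repository states Python 3.6; print the running version for comparison.
    print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))

    # Assumed import names for the stated dependencies.
    missing = missing_packages(["keras", "tensorflow"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("Keras and TensorFlow are both importable.")
```

If either package is reported missing, install it with your usual package manager before running the experimental code.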
@article{Saeed2021_RUTox,
title={Roman Urdu toxic comment classification},
author={Saeed, Hafiz Hassaan and Ashraf, Muhammad Haseeb and Kamiran, Faisal and Karim, Asim and Calders, Toon},
journal={Language Resources and Evaluation},
pages={1--26},
publisher={Springer},
year={2021},
month={Jan},
day={29},
issn={1574-0218},
doi={10.1007/s10579-021-09530-y},
url={https://doi.org/10.1007/s10579-021-09530-y}
}