Detection of Cyber Grooming in Online Conversation
Patrick Bours, Halvor Kulsrud
December 2019
Why I chose this paper:
It is a 2019 paper that used a variety of algorithms. The authors also tried to detect the predator in the early phases of a conversation, which, as they claim, no one had done before.
Their metrics were the F-0.5 and F-2 scores, which shows they cared about both the precision-oriented and the recall-oriented aspects of the problem.
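For reference, both come from the general F-beta score, which weights recall beta times as much as precision:

F_beta = (1 + beta^2) * (precision * recall) / (beta^2 * precision + recall)

so F-0.5 emphasizes precision (fewer false alarms) and F-2 emphasizes recall (fewer missed predators).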
The main problem:
Finding a predator in an online conversation.
The minor problem:
Finding a predator in an online conversation as soon as possible.
Applications:
Automatic detection of online predators in the early stages of a conversation.
Existing Works:
Villatoro-Tello et al. (winner of PAN2012)
Eriksson and Karlgren (5th place at PAN2012; high recall of 0.8937, precision of 0.8566)
Ebrahimi et al. (2016); used a CNN on PAN2012
Pandey et al.; on the full Perverted Justice dataset ("Detecting predatory behavior from online textual chats")
Gunawan et al. ("Detecting online child grooming conversation")
Method:
They first preprocessed the PAN2012 data, then created Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) feature sets.
They used three approaches and five classification algorithms.
The approaches: Message-based detection (MBD), author-based detection (ABD), conversation-based detection (CBD)
Classification algorithms: Logistic Regression, Ridge, Naïve Bayes, SVM, Neural Network
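For orientation, here is a minimal sketch of those five classifiers in scikit-learn; the paper does not give hyperparameters, so everything below is an assumption:

```python
# Sketch of the five classifiers with scikit-learn. The paper does not
# specify hyperparameters, so defaults/placeholders are used throughout.
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import LinearSVC

classifiers = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "ridge": RidgeClassifier(),
    "naive_bayes": MultinomialNB(),
    "svm": LinearSVC(),
    # The paper gives no network details (see "Gaps" below), so one
    # hidden layer here is purely a placeholder.
    "neural_network": MLPClassifier(hidden_layer_sizes=(100,)),
}
```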
Preprocessing:
Chose conversations with exactly 2 chatters
Filtered out conversations with fewer than 6 messages
Filtered out messages with empty strings
Lowercased all tokens (words)
Replaced special characters with whitespace
Removed characters other than alphanumeric chars, whitespace, #, +, and _
Removed stopwords
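A minimal sketch of that preprocessing in Python; the conversation layout and the stopword list are my assumptions, since the paper does not name a specific list:

```python
import re

# A standard English stopword list is assumed, e.g.
# nltk.corpus.stopwords.words("english"). A tiny subset for illustration:
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "it"}

def preprocess_message(text: str) -> list[str]:
    text = text.lower()                           # lowercase all tokens
    text = re.sub(r"[^a-z0-9\s#+_]", " ", text)   # keep alphanumerics, whitespace, #, +, _
    return [tok for tok in text.split() if tok not in STOPWORDS]

def keep_conversation(conversation: dict) -> bool:
    # conversation = {"messages": [{"author": str, "text": str}, ...]} (assumed layout)
    messages = [m for m in conversation["messages"] if m["text"].strip()]
    authors = {m["author"] for m in messages}
    return len(authors) == 2 and len(messages) >= 6
```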
Bag of Words:
Used the 5,000 most common words (unclear whether most common in the corpus or in general use)
TF-IDF:
Used 18,953 unigram and bigram features, filtered to a maximum document frequency (DF) of 90% and a minimum DF of 5 documents
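A sketch of the two feature sets with scikit-learn vectorizers; which implementation the authors actually used is not stated, so this is an approximation:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Bag of Words: the 5,000 most frequent words (frequency is measured on the
# fitted corpus here; see the open question above about what "common" means).
bow = CountVectorizer(max_features=5000)

# TF-IDF: unigrams and bigrams, dropping terms in more than 90% of documents
# or in fewer than 5 documents; in the paper this yielded 18,953 features.
tfidf = TfidfVectorizer(ngram_range=(1, 2), max_df=0.90, min_df=5)

# X_bow = bow.fit_transform(texts)      # texts: preprocessed chat documents
# X_tfidf = tfidf.fit_transform(texts)
```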
Input and Output:
MBD: a single text message -> does this message belong to a predator or not?
ABD: all messages of a chatter -> is this chatter a predator?
CBD:
Phase 1: a whole text conversation -> is it a suspicious conversation?
Phase 2: (something NOT SPECIFIED) from a suspicious conversation -> is it a predator or not?
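A rough sketch of how the two-phase CBD pipeline could be wired together. Since the paper does not specify the phase-2 input (see "Gaps" below), the assumption here is that phase 2 classifies each author's messages within a flagged conversation; all of the plumbing in this block is hypothetical:

```python
def cbd_pipeline(conversation, phase1_clf, phase2_clf, vectorizer):
    """Two-phase conversation-based detection (hypothetical wiring).

    conversation: {"messages": [{"author": str, "text": str}, ...]}
    Returns the authors flagged as predators (empty list if none).
    """
    # Phase 1: is the whole conversation suspicious? (label 1 = suspicious)
    full_text = " ".join(m["text"] for m in conversation["messages"])
    if phase1_clf.predict(vectorizer.transform([full_text]))[0] != 1:
        return []  # not suspicious: no one to flag

    # Phase 2 (input format NOT specified in the paper; per-author text is
    # an assumption): which chatter in the conversation is the predator?
    predators = []
    for author in {m["author"] for m in conversation["messages"]}:
        author_text = " ".join(
            m["text"] for m in conversation["messages"] if m["author"] == author
        )
        if phase2_clf.predict(vectorizer.transform([author_text]))[0] == 1:
            predators.append(author)
    return predators
```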
Gaps:
The authors did not give any details of their neural network architecture.
In the CBD method, they fed only the phase-one results of a single algorithm (SVM with TF-IDF) into the phase-two classifiers.
Results:
MBD: only applied with Logistic Regression and Ridge.
The better F-0.5 came from Logistic Regression with TF-IDF -> 0.34
The better F-2 came from Ridge with Bag of Words -> 0.46
ABD: all 5 classifiers were used. NN with TF-IDF gave the best F-0.5 score, and NN with BoW gave the best F-2 score.
F-0.5 for NN with TF-IDF -> 0.891
F-2 for NN with BoW -> 0.761
Although NN with BoW scored slightly lower than NN with TF-IDF on the other metrics, it can still be picked as the better combination.
CBD Phase 1: both F-0.5 and F-2 with TF-IDF were significantly better than with BoW. SVM scored highest, at 0.974 (F-0.5) and 0.910 (F-2).
CBD full: used the output of the phase-1 SVM + TF-IDF combination as input to the five classifiers in phase 2.
Naïve Bayes with TF-IDF gave the best F-0.5 score, while Ridge with TF-IDF gave the best F-2 score; NB's F-0.5 was only slightly higher than Ridge's.
Did they answer the question?
The authors concluded that the ABD approach with NN and TF-IDF works well. Overall, the TF-IDF results were consistently stronger.
Early detection:
Almost all the classifiers needed at least 36 messages of a conversation to reach a recall over 0.8, and at least 21 messages to reach a precision over 0.8.
They also used 10 full Perverted Justice conversations for the early-detection step, using CBD with SVM + TF-IDF for phase 1 and Naïve Bayes for phase 2. In most cases, only about 10% of a conversation was needed (the conversations averaged around 3,000 messages).
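The early-detection step amounts to re-running the classifier on a growing prefix of the conversation and raising an alarm as soon as it predicts "predator". A minimal sketch of that loop; the helper name and data layout are my assumptions:

```python
def messages_until_alarm(messages, clf, vectorizer):
    """Return how many messages were needed before the classifier first
    flags the conversation (hypothetical helper; label 1 = predator)."""
    for n in range(1, len(messages) + 1):
        prefix = " ".join(messages[:n])
        if clf.predict(vectorizer.transform([prefix]))[0] == 1:
            return n  # alarm raised after n messages
    return None  # never flagged
```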
@hamedwaezi01
So, did they reproduce Ebrahimi's work, or just report the values?
Also, this now becomes an important baseline for the project. What do you think?
Actually, they only reported Ebrahimi's results and did not mention any reimplementation.
As for the baseline, their best results came from the Ridge and Naïve Bayes classifiers, neither of which is a neural network.
Additionally, I found a 2022 paper by the same author, Bours, titled 'Predatory Conversation Detection Using Transfer Learning Approach', and would like to have a look at it.