Automate detection of different emotions from paragraphs and predict overall emotion.

kanchitank/Text-Emotion-Analysis

Multi-Class Text Emotion Analysis

Text-Emotion-Analysis is a project that develops rule-based and deep learning algorithms. The aim is first to detect the different emotions contained in a collection of English sentences or a large paragraph, and then to predict the overall emotion of the paragraph.

I have two training and validation datasets:

  1. emotion_data.csv, in which basic pre-processing of tweets is done (no lemmatization, no removal of stopwords).
    This dataset comprises 55,774 tweets from Twitter, each labelled with one of five emotion classes: Neutral, Happy, Sad, Love, Anger.

  2. emotion_data_prep.csv, in which deeper pre-processing of tweets is done (lemmatization, removal of stopwords, etc).
    This dataset comprises 62,015 tweets from Twitter, each labelled with one of five emotion classes: Neutral, Happy, Sad, Love, Anger.
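The difference between the two pre-processing levels can be illustrated with a minimal sketch. The cleaning rules, the function names `basic_clean` and `deep_clean`, and the tiny stopword list are all assumptions made for illustration; the exact steps used to build the CSV files are not shown here. Lemmatization is left as a comment because it needs an external resource such as a WordNet-based lemmatizer.

```python
import re

# Tiny illustrative stopword list (an assumption, not the project's actual list).
STOPWORDS = {"a", "an", "the", "is", "am", "are", "to", "of", "and", "i", "my", "so"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def basic_clean(tweet: str) -> str:
    """Lowercase, then strip URLs, @mentions, and non-letter characters."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)
    # Collapse the runs of whitespace left behind by the substitutions.
    return " ".join(text.split())


def deep_clean(tweet: str) -> str:
    """basic_clean plus stopword removal.

    A lemmatizer would be applied to each remaining token at this point;
    it is omitted to keep the sketch dependency-free.
    """
    tokens = basic_clean(tweet).split()
    return " ".join(t for t in tokens if t not in STOPWORDS)


sample = "@bob I am SO happy today!! http://t.co/xyz"
print(basic_clean(sample))
print(deep_clean(sample))
```

The first function corresponds roughly to the lighter treatment described for emotion_data.csv, the second to the heavier treatment described for emotion_data_prep.csv.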

Comparison of DL and ML models:

DL:

  1. The DLModel using emotion_data.csv gave me 64.80% accuracy.

Confusion Matrix:

  2. The DLModel-Prep using emotion_data_prep.csv gave me 63.47% accuracy.

Confusion Matrix (Prep):
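For readers who want to reproduce this kind of evaluation, here is a minimal sketch of how accuracy and a five-class confusion matrix can be computed from true and predicted labels. The label lists are made-up toy data, not results from either model, and the helper names are hypothetical.

```python
from collections import Counter

LABELS = ["Neutral", "Happy", "Sad", "Love", "Anger"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return a square matrix: rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correct predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Toy labels, invented for illustration only.
y_true = ["Happy", "Sad", "Love", "Happy", "Anger", "Neutral"]
y_pred = ["Happy", "Sad", "Happy", "Happy", "Sad", "Neutral"]

cm = confusion_matrix(y_true, y_pred)
for label, row in zip(LABELS, cm):
    print(f"{label:>8}: {row}")
print(f"accuracy = {accuracy(cm):.2%}")
```

Off-diagonal cells show which emotions get mistaken for each other, which a single accuracy number cannot show.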

ML:

The ML algorithms used for prediction are listed below.

Building models using different classifiers (Count vectorizer):

Model 1: Multinomial Naive Bayes Classifier - Accuracy 58.46%
Model 2: Linear SVM - Accuracy 62.00%
Model 3: Logistic Regression - Accuracy 62.47%

Building models using different classifiers (TF-IDF vectorizer):

Model 1: Multinomial Naive Bayes Classifier - Accuracy 38.37%
Model 2: Linear SVM - Accuracy 38.49%
Model 3: Logistic Regression - Accuracy 40.13%
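To make the two feature representations concrete, here is a dependency-free sketch of count features versus TF-IDF features on a toy corpus. It is an illustration only, not the code behind the numbers above. The assumptions are whitespace tokenization, the smoothed IDF `ln((1 + n) / (1 + df)) + 1`, and L2 normalization, which match scikit-learn's documented `TfidfVectorizer` defaults for IDF and norm. The function names are hypothetical.

```python
import math
from collections import Counter


def count_vectors(docs):
    """Bag-of-words counts over a shared, alphabetically sorted vocabulary."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf_vectors(docs):
    """Counts reweighted by smoothed IDF, then L2-normalized per document."""
    vocab, rows = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    weighted = []
    for row in rows:
        vec = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        weighted.append([v / norm for v in vec])
    return vocab, weighted


docs = ["i love this", "i hate this", "love love love"]  # toy corpus
vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)

print(vocab)
print(counts)
print([[round(v, 3) for v in row] for row in tfidf])
```

Count features keep raw term frequencies, while TF-IDF down-weights terms that appear in many documents. Either matrix can be fed to the same classifier, which is what allows a like-for-like comparison between the two vectorizers.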

Prediction of emotions from paragraphs and sentences (DL Model):
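One simple way to go from sentence-level predictions to an overall paragraph emotion is a majority vote, sketched below. The voting rule, the naive sentence splitter, the fallback to Neutral on empty input, and the keyword-based `toy_predict` stand-in for the trained model are all assumptions for illustration; the project's actual aggregation logic may differ.

```python
import re
from collections import Counter
from typing import Callable, List, Tuple


def split_sentences(paragraph: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def overall_emotion(
    paragraph: str, predict_sentence: Callable[[str], str]
) -> Tuple[str, List[str]]:
    """Label each sentence, then return the most common label (majority vote)."""
    labels = [predict_sentence(s) for s in split_sentences(paragraph)]
    if not labels:
        return "Neutral", labels  # assumption: fall back to Neutral on empty input
    return Counter(labels).most_common(1)[0][0], labels


def toy_predict(sentence: str) -> str:
    """Hypothetical keyword stand-in for the trained DL model."""
    s = sentence.lower()
    if "love" in s:
        return "Love"
    if "happy" in s or "great" in s:
        return "Happy"
    if "sad" in s or "miss" in s:
        return "Sad"
    return "Neutral"


text = "I am so happy today! The weather is great. I miss my old friends."
overall, per_sentence = overall_emotion(text, toy_predict)
print(per_sentence)
print(overall)
```

In practice `toy_predict` would be replaced by a call into the trained classifier, and a probability-weighted vote could be used instead of a plain count.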