We aim to predict the sentiment, 'positive' or 'negative', of each review in the IMDB Dataset of 50K Movie Reviews from Kaggle.
We try three approaches on the 20% of the data (10k reviews) that was set aside as the test set:
- Two-shot LLM evaluation with Qwen2.5: we prompt the LLM with two labeled examples, one positive and one negative, and then ask it to label the given text. This gives our second-best F1-Score = 0.86.
- TF-IDF: we use term-frequency-inverse-document-frequency features computed from the training set, train a classifier on them, and then apply it to the test data. This gives our highest F1-Score = 0.89.
- NLTK Sentiment Analysis: finally, we use the off-the-shelf sentiment analyzer from the nltk package in Python to predict the sentiment of each text. This approach has the lowest performance, with an F1-Score of 0.67.
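As a concrete illustration of the two-shot setup, the sketch below builds the prompt and maps the model's free-text reply back onto a label. The example reviews, prompt wording, and parsing rule are our own placeholders, not the ones used in the experiment, and the call to Qwen2.5 itself is omitted:

```python
# Hypothetical two-shot examples; the real experiment's examples are not
# given in the report.
EXAMPLES = [
    ("I loved every minute of this film.", "positive"),
    ("A dull, predictable mess.", "negative"),
]

def build_prompt(review):
    """Assemble a two-shot classification prompt for the LLM."""
    lines = ["Classify each movie review as 'positive' or 'negative'.", ""]
    for text, label in EXAMPLES:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {review}")
    lines.append("Sentiment:")
    return "\n".join(lines)

def parse_label(completion):
    """Map the model's completion onto one of the two labels."""
    return "positive" if "positive" in completion.lower() else "negative"
```

The prompt string would then be sent to Qwen2.5 through whatever inference API is in use, and `parse_label` applied to the returned text.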
Details of the models' performances are provided below:
Classification Report (two-shot Qwen2.5):

              precision    recall  f1-score   support

           0       0.83      0.93      0.88      4961
           1       0.92      0.82      0.86      5039

    accuracy                           0.87     10000
   macro avg       0.87      0.87      0.87     10000
weighted avg       0.88      0.87      0.87     10000

Confusion Matrix:
[[4592  369]
 [ 927 4112]]

F1-Score = 0.86
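As a sanity check, the per-class F1 scores above can be recomputed directly from the confusion matrix (rows are true labels, columns are predicted labels, the usual scikit-learn layout); a small sketch:

```python
def f1_from_confusion(cm, positive=1):
    """Recompute F1 for one class from a 2x2 confusion matrix whose
    rows are true labels and columns are predicted labels."""
    tp = cm[positive][positive]
    fp = sum(cm[r][positive] for r in range(2)) - tp  # predicted pos, wrong
    fn = sum(cm[positive]) - tp                       # true pos, missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

cm = [[4592, 369], [927, 4112]]              # two-shot Qwen2.5 run
f1_pos = f1_from_confusion(cm, positive=1)   # ≈ 0.86, matching the report
```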
Classification Report (TF-IDF):

              precision    recall  f1-score   support

           0       0.90      0.87      0.89      4961
           1       0.88      0.90      0.89      5039

    accuracy                           0.89     10000
   macro avg       0.89      0.89      0.89     10000
weighted avg       0.89      0.89      0.89     10000

Confusion Matrix:
[[4332  629]
 [ 487 4552]]

F1-Score = 0.89
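The TF-IDF features behind this result can be illustrated with a minimal pure-Python sketch. The experiment presumably used a library vectorizer; the classic formulation below (tf = term count / document length, idf = ln(N / document frequency)) is one common variant of the weighting, not necessarily the exact one used:

```python
import math
from collections import Counter

def tfidf(corpus):
    """Return one {term: weight} dict per document, using the classic
    formulation: tf = count / doc length, idf = ln(N / df)."""
    docs = [text.lower().split() for text in corpus]
    n = len(docs)
    df = Counter()                 # number of docs each term occurs in
    for toks in docs:
        df.update(set(toks))
    weights = []
    for toks in docs:
        counts = Counter(toks)
        weights.append({term: (c / len(toks)) * math.log(n / df[term])
                        for term, c in counts.items()})
    return weights

# Terms appearing in every review get weight 0; rarer terms score higher.
scores = tfidf(["good movie", "bad movie", "dull movie"])
```

A classifier (e.g. logistic regression) is then trained on these feature vectors; scikit-learn's `TfidfVectorizer` implements a smoothed variant of the same idea.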
Classification Report (NLTK Sentiment Analysis):

              precision    recall  f1-score   support

           0       0.78      0.01      0.01      4961
           1       0.50      1.00      0.67      5039

    accuracy                           0.51     10000
   macro avg       0.64      0.50      0.34     10000
weighted avg       0.64      0.51      0.34     10000

Confusion Matrix:
[[  29 4932]
 [   8 5031]]

F1-Score = 0.67
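The NLTK approach relies on a pretrained, lexicon-based analyzer (likely `nltk.sentiment.SentimentIntensityAnalyzer`, i.e. VADER); it needs no training, but as the confusion matrix shows, it labels nearly every review positive on this data. A toy pure-Python version of the lexicon-scoring idea, with a made-up four-word lexicon rather than VADER's real one:

```python
# Toy lexicon illustrating the approach; VADER's actual lexicon contains
# thousands of scored entries plus rules for negation and intensifiers.
LEXICON = {"good": 1.0, "great": 1.5, "bad": -1.0, "awful": -1.5}

def predict_sentiment(text):
    """Sum word-level polarity scores and threshold at zero."""
    score = sum(LEXICON.get(w, 0.0) for w in text.lower().split())
    return "positive" if score >= 0 else "negative"
```

Note that texts containing no lexicon words score 0 and default to "positive", which mirrors the strong positive bias observed in the report.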