Sentiment-Analysis-with-Playstore-reviews

Performed sentiment analysis for XYZ company on Google Play Store reviews to categorize customer reviews as 'POSITIVE' or 'NEGATIVE'.

INTRODUCTION:

AIM: To perform sentiment analysis on Google Play Store app reviews, classifying them as either "Positive" or "Negative".
DATASET USED: A sample dataset sourced from Kaggle.
TOOLS AND LIBRARIES: This project is made with Python and uses:

NLTK for text preprocessing
scikit-learn for machine learning models (Logistic Regression, Naive Bayes)
Pandas for data manipulation
Seaborn and Matplotlib for data visualization (making confusion matrices)

DATA UNDERSTANDING:

The dataset had 2 useful columns: one with the user reviews and another with a score for each review on a scale of 1 to 5, where:

1 = Very Negative
2 = Negative
3 = Neutral
4 = Positive
5 = Very Positive

DATA PREPROCESSING:

Text Cleaning: Used NLTK for:

Tokenization
Stop word removal
Lemmatization
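The three cleaning steps above can be sketched as follows. This is a minimal illustration rather than the notebook's actual code: the `clean_review` helper, the regular-expression tokenizer, and the sample sentence are assumptions. It also assumes the NLTK `stopwords` and `wordnet` corpora are already installed (for example via `nltk.download`); if they are missing, the sketch falls back to a tiny hard-coded stop word set and skips lemmatization, which is only a stand-in.

```python
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import RegexpTokenizer


def _identity(token: str) -> str:
    """Fallback 'lemmatizer' that returns the token unchanged."""
    return token


# Assumption: the "stopwords" and "wordnet" corpora were installed beforehand,
# e.g. with nltk.download("stopwords") and nltk.download("wordnet").
try:
    STOP_WORDS = set(stopwords.words("english"))
    _lemmatizer = WordNetLemmatizer()
    _lemmatizer.lemmatize("reviews")  # touch WordNet so a missing corpus fails here
    lemmatize = _lemmatizer.lemmatize
except LookupError:
    # Stand-in used only when the corpora are missing; not equivalent to the real resources.
    STOP_WORDS = {"the", "a", "an", "is", "it", "and", "this", "to", "of", "i"}
    lemmatize = _identity

# Tokenizer that keeps alphabetic runs and drops digits and punctuation.
tokenizer = RegexpTokenizer(r"[a-z]+")


def clean_review(text: str) -> str:
    """Tokenize, remove stop words, and lemmatize a single review (hypothetical helper)."""
    tokens = tokenizer.tokenize(text.lower())
    kept = [lemmatize(token) for token in tokens if token not in STOP_WORDS]
    return " ".join(kept)


print(clean_review("The app crashes every time I open the settings!"))
```

The cleaned string is what a vectorizer would later consume, so the helper returns text rather than a token list.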

Label Assignment:
Scores of 1 and 2 are labeled as negative
Scores of 4 and 5 are labeled as positive
Neutral Scores (3) are removed from the dataset
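The labeling rule above can be sketched with Pandas as follows. The column names `content` and `score`, the `assign_labels` helper, and the sample rows are hypothetical; the real dataset's headers may differ.

```python
import pandas as pd

# Hypothetical sample rows and column names, used only for illustration.
reviews = pd.DataFrame({
    "content": ["Love this app", "Keeps crashing", "It is okay", "Great update", "Worst app ever"],
    "score": [5, 1, 3, 4, 2],
})

SCORE_TO_SENTIMENT = {1: "negative", 2: "negative", 4: "positive", 5: "positive"}


def assign_labels(df: pd.DataFrame) -> pd.DataFrame:
    """Drop neutral (score 3) rows and map the remaining scores to sentiment labels."""
    kept = df[df["score"] != 3].copy()
    kept["sentiment"] = kept["score"].map(SCORE_TO_SENTIMENT)
    return kept.reset_index(drop=True)


labeled = assign_labels(reviews)
print(labeled)
```

Dropping the neutral rows before mapping keeps the task strictly binary, which matches the two-class models used later.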

FEATURE EXTRACTION:

TF-IDF (Term Frequency-Inverse Document Frequency):
Used TfidfVectorizer to convert the cleaned text into numerical features suitable for machine learning models.
Limited the vocabulary to 6000 terms to keep computation efficient and reduce the risk of overfitting.
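A minimal sketch of this step, assuming the 6000-term cap was set through scikit-learn's `max_features` parameter (the README does not name the parameter). The sample reviews are made up and stand in for the cleaned text.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up, already-cleaned reviews used only for illustration.
cleaned_reviews = [
    "love app easy use",
    "app keep crashing login",
    "great update fast smooth",
    "worst app ever waste time",
]

# max_features keeps only the most frequent terms across the corpus.
vectorizer = TfidfVectorizer(max_features=6000)
features = vectorizer.fit_transform(cleaned_reviews)

print(features.shape)                    # (number of reviews, vocabulary size)
print(sorted(vectorizer.vocabulary_)[:5])  # a few of the learned terms
```

On a corpus this small the cap is never reached, so the vocabulary is simply every distinct term; the limit only takes effect once the corpus contains more than 6000 distinct terms.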

MODEL IMPLEMENTATION:

Logistic Regression:

Initially implemented logistic regression
Accuracy achieved: 87%
Pros: Simple and easy to interpret, excellent for binary classification
Cons: Assumes a linear relationship between the features and the log-odds of the class, and works best on small to medium-sized datasets.
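The sketch below shows one way such a model could be trained and inspected with scikit-learn. It is not the notebook's code: the toy reviews, the 75/25 split, `random_state`, and `max_iter` are assumptions, and the accuracy it prints says nothing about the 87% reported above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy stand-in data (already cleaned); not the project's Kaggle dataset.
texts = [
    "love app easy use", "great update fast smooth", "excellent feature helpful support",
    "amazing design work perfectly", "best app highly recommend", "good experience nice interface",
    "app keep crashing login", "worst app ever waste time", "terrible update many bug",
    "slow loading bad experience", "useless feature poor support", "hate ad constant crash",
]
labels = ["positive"] * 6 + ["negative"] * 6

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

# Fit the vectorizer on the training split only, then reuse it for the test split.
vectorizer = TfidfVectorizer(max_features=6000)
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)

model = LogisticRegression(max_iter=1000)
model.fit(X_train_tfidf, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test_tfidf))
print(f"Accuracy on the toy test split: {accuracy:.2f}")

# coef_[0] holds one weight per term for the second class in model.classes_ ("positive").
terms_in_column_order = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
weights = dict(zip(terms_in_column_order, model.coef_[0]))
top_terms = sorted(weights, key=weights.get, reverse=True)[:3]
print("Terms pushing most strongly toward 'positive':", top_terms)
```

Reading the per-term weights is what makes the model easy to interpret: a large positive weight means the term pushes a review toward the positive class.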

Naive Bayes:

Implemented Naive Bayes to compare its accuracy against logistic regression
Accuracy achieved: 85%
Pros: Simple and effective for text classification
Cons: Assumes words are conditionally independent of each other given the class, hence ‘naive’
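A Naive Bayes classifier can be sketched in the same way. The README does not say which variant was used; `MultinomialNB` is assumed here because it is a common choice for TF-IDF and count features. The toy reviews and split settings are also assumptions, so the printed accuracy is unrelated to the 85% reported above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Toy stand-in data (already cleaned); not the project's Kaggle dataset.
texts = [
    "love app easy use", "great update fast smooth", "excellent feature helpful support",
    "amazing design work perfectly", "best app highly recommend", "good experience nice interface",
    "app keep crashing login", "worst app ever waste time", "terrible update many bug",
    "slow loading bad experience", "useless feature poor support", "hate ad constant crash",
]
labels = ["positive"] * 6 + ["negative"] * 6

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

vectorizer = TfidfVectorizer(max_features=6000)
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)

# Assumed variant: multinomial Naive Bayes, which accepts non-negative TF-IDF values.
nb_model = MultinomialNB()
nb_model.fit(X_train_tfidf, y_train)
predictions = nb_model.predict(X_test_tfidf)

accuracy = accuracy_score(y_test, predictions)
# Rows are true classes and columns are predicted classes, in the order given by labels=.
cm = confusion_matrix(y_test, predictions, labels=["negative", "positive"])

print(f"Accuracy on the toy test split: {accuracy:.2f}")
print(cm)
```

The resulting matrix is the kind of array that could be passed to a Seaborn heatmap to produce the confusion-matrix plots mentioned in the tools list.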

CONCLUSION:

Logistic Regression was the best-performing model with an accuracy of 87%. Naive Bayes came close with an accuracy of 85%.

Future Work:

Use advanced models and explore word embeddings.

NishthaSharma-22/Sentiment-Analysis-with-Playstore-reviews