Submission for an in-class NLP sentiment analysis competition held at the Microsoft AI Singapore group. The entry compares the performance of lexicon-based and machine-learning-based models.

KwokHing/SentimentAnalysis-Python-Demo

Exploration of Sentiment Analysis

This repo contains the submission entry for an in-class NLP sentiment analysis competition held at the Microsoft AI Singapore group. It uses techniques learned in class to classify text as expressing positive or negative sentiment.

We recommend installing Anaconda, a pre-packaged Python distribution that contains all of the libraries and software needed for this project. Alternatively, you can use Google Colaboratory, which lets you write and execute Python code in your browser.

Data

Data for this in-class competition comes from the Sentiment140 dataset. The training and test sets are random samples of 10% and 5% of the dataset, respectively.
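The sampling step can be sketched as follows. This is a minimal illustration, not the competition's actual sampling code: the function name `sample_split`, the seed, the synthetic rows, and the choice to keep the two samples disjoint are all assumptions.

```python
import random

def sample_split(rows, train_frac=0.10, test_frac=0.05, seed=42):
    """Draw disjoint random train/test samples from a list of rows."""
    rng = random.Random(seed)
    n_train = int(len(rows) * train_frac)
    n_test = int(len(rows) * test_frac)
    # Sample both sets in one draw so that no row lands in both.
    picked = rng.sample(range(len(rows)), n_train + n_test)
    train = [rows[i] for i in picked[:n_train]]
    test = [rows[i] for i in picked[n_train:]]
    return train, test

# Synthetic stand-in for (label, text) rows; Sentiment140 itself uses
# label 0 for negative and 4 for positive tweets.
rows = [(0 if i % 2 else 4, f"tweet {i}") for i in range(1000)]
train, test = sample_split(rows)
print(len(train), len(test))  # 100 50
```

Fixing the seed makes the split reproducible, which helps when comparing several models on the same data.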

Getting started using Lexicon and Machine Learning (ML) based methods

Open SentimentAnalysis.ipynb in a Jupyter notebook environment, or open it in Google Colab.

  • VADER (Valence Aware Dictionary and sEntiment Reasoner) [67%]
  • Naive Bayes
  • Linear SVM (Support Vector Machine) [80%]
  • Decision Tree
  • Random Forest
  • Extra Trees
  • SVC [80%]
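To make the difference between the two families concrete, here is a small self-contained sketch that puts a lexicon scorer next to a learned classifier. It does not reproduce the notebook: the six-word `LEXICON` is a toy stand-in for VADER's much larger weighted lexicon, the `NaiveBayes` class is a from-scratch multinomial Naive Bayes with add-one smoothing, and the training sentences are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy lexicon (hypothetical); VADER ships a far larger, weighted one.
LEXICON = {"love": 1.0, "great": 1.0, "happy": 1.0,
           "hate": -1.0, "awful": -1.0, "sad": -1.0}

def lexicon_predict(text):
    """Sum word valences; a non-negative total counts as positive."""
    score = sum(LEXICON.get(w, 0.0) for w in text.lower().split())
    return "pos" if score >= 0 else "neg"

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)  # class prior
            for w in text.lower().split():
                if w in self.vocab:  # ignore words never seen in training
                    logp += math.log((counts[w] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

train_texts = ["i love this great phone", "what a happy day",
               "i hate this awful phone", "such a sad day"]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_texts, train_labels)

for text in ["i love this day", "awful sad phone"]:
    print(text, "->", lexicon_predict(text), model.predict(text))
```

The lexicon scorer needs no training data but only knows the words in its dictionary, while the Naive Bayes model learns word weights from labeled examples.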

Exploring using Deep Learning Techniques (LSTM)

Open SentimentAnalysis_RNN.ipynb in a Jupyter notebook environment, or open it in Google Colab.

The LSTM deep learning model [79%] did not outperform the SVC and linear SVM models [80%].
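An LSTM consumes fixed-length sequences of integer word ids, not raw text. The sketch below shows one common way to prepare such input. It is an assumption about the general approach, not the notebook's preprocessing: the helper names `build_vocab` and `encode`, the reserved ids, the sequence length, and right-padding are all illustrative choices.

```python
from collections import Counter

PAD, UNK = 0, 1  # reserved ids for padding and out-of-vocabulary words

def build_vocab(texts, max_words=10000):
    """Map the most frequent words to integer ids starting at 2."""
    counts = Counter(w for t in texts for w in t.lower().split())
    return {w: i + 2 for i, (w, _) in enumerate(counts.most_common(max_words))}

def encode(text, vocab, max_len=6):
    """Convert text to a fixed-length id list, truncating or right-padding."""
    ids = [vocab.get(w, UNK) for w in text.lower().split()][:max_len]
    return ids + [PAD] * (max_len - len(ids))

texts = ["good morning twitter", "i hate mondays so much it hurts today"]
vocab = build_vocab(texts)
batch = [encode(t, vocab) for t in texts]
print(batch)
```

Each row of `batch` has the same length, so the rows can be stacked into a single array and fed to an embedding layer followed by the LSTM.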

How about the BERT Transformers model?

Open SentimentAnalysis_BERT.ipynb in a Jupyter notebook environment, or open it in Google Colab.

The state-of-the-art transformer model performs slightly better, at [82%] accuracy.