TWITTER SPEECH DETECTION - SENTIMENT ANALYSIS

DEMO: https://twitter-speech-detection-04.herokuapp.com/

PROJECT DESCRIPTION

This project classifies real-time Tweets fetched from the Twitter API as Positive or Negative using Machine Learning and Natural Language Processing algorithms.

AIM & OBJECTIVE

The aim of this project is to implement the Logistic Regression algorithm together with NLP techniques for sentiment analysis, and to evaluate the performance of the chosen Machine Learning and NLP algorithms in order to find the most suitable and efficient model for the chosen dataset.

OBJECTIVE

- To understand the efficient use of the Twitter API and the machine learning model.
- To evaluate the performance of the selected models.

WHAT IS SENTIMENT ANALYSIS?

The process of detecting positive or negative sentiment in text is known as sentiment analysis. Businesses mostly use it to detect sentiment in social data, assess brand reputation, and gain a better understanding of their customers.

Sentiment analysis is becoming a crucial tool for monitoring and understanding customer sentiment, as customers share their opinions and feelings more openly than ever before.

Brands can learn what makes customers happy or frustrated by automatically evaluating customer feedback, such as comments in survey replies and social media dialogues. This allows them to customise products and services to match their customers' demands.

The overall benefits of sentiment analysis include:

- Sorting Data at Scale
- Real-Time Analysis
- Consistent Criteria

ABOUT THE DATA

Attribute Information (in order):
    - ID      Tweet ID
    - Tweet   Actual Tweet
    - Label   Class 0 - Positive Tweet
              Class 1 - Negative Tweet

Missing Attribute Values: None

TECHNICAL ASPECTS

MODEL BUILDING

The Twitter Speech Detection model is built in 3 steps.

STEP - 1: Twitter data contains a lot of stop words.

Stop words are a group of words that are frequently employed in a language. Stop words in English include "a," "the," "is," "are," and others.
Stop words are frequently used in Text Mining and Natural Language Processing (NLP) to exclude terms that are so widely used that they contain little meaningful information.

For implementation, I used the TF-IDF Vectorizer to remove the English stop words and keep the 10,000 most frequent words in the text.
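The idea behind this step can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: the stop-word set is a tiny hand-picked subset, the helper names are hypothetical, and the weighting is the classic `tf * log(N / df)` formula. Library vectorizers such as scikit-learn's `TfidfVectorizer` use a smoothed and normalized variant, so their numbers will differ.

```python
import math
import re
from collections import Counter

# Illustrative subset only; real stop-word lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "to", "of", "in", "this"}


def tokenize(text):
    """Lowercase the text and keep word tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOP_WORDS]


def build_vocabulary(docs, max_features):
    """Keep the max_features most frequent terms across all documents."""
    counts = Counter(token for doc in docs for token in tokenize(doc))
    return [term for term, _ in counts.most_common(max_features)]


def tfidf(docs, vocabulary):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    rows = []
    for tokens in tokenized:
        tf = Counter(tokens)
        rows.append({
            term: tf[term] * math.log(n_docs / df[term])
            for term in vocabulary
            if tf[term] > 0
        })
    return rows


tweets = [
    "The service is great and the staff are friendly",
    "This is the worst service",
    "Great food, great staff",
]
vocab = build_vocabulary(tweets, max_features=4)  # the project keeps 10,000
weights = tfidf(tweets, vocab)
```

Terms that appear in every document get a weight of zero under this formula, which is the same intuition that motivates dropping stop words in the first place.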

STEP - 2: Implementation of Logistic Regression

Because the problem is to classify real-time Tweets into two classes, Logistic Regression is used.
```
  Class 0 - Positive Tweets
  Class 1 - Negative Tweets
```
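As a rough illustration of how a logistic regression model turns features into one of these two classes, the sketch below applies the sigmoid function to a weighted sum and thresholds the result. The weights, bias, and function names are hypothetical and hand-picked for the example; a trained model would learn its own values from the data.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(features, weights, bias, threshold=0.5):
    """Return 1 (negative tweet) if P(class 1) >= threshold, else 0 (positive tweet)."""
    score = sum(weights.get(term, 0.0) * value for term, value in features.items()) + bias
    return 1 if sigmoid(score) >= threshold else 0


# Hypothetical weights, chosen by hand for illustration only.
weights = {"worst": 2.0, "hate": 2.5, "great": -1.5, "love": -2.0}
bias = -0.5

print(predict_label({"worst": 1.0, "service": 1.0}, weights, bias))  # 1
print(predict_label({"great": 1.0, "staff": 1.0}, weights, bias))    # 0
```

Terms without a weight (such as "service" here) contribute nothing to the score, so the decision is driven by the terms the model has learned to associate with either class.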

STEP - 3: PIPELINE CREATION

Both of the above steps are needed to classify a Tweet as Positive or Negative, so a Pipeline was created that combines them into a single step.

The text is first vectorized with TF-IDF, and the resulting features are then passed to the classifier.
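A pipeline of this shape might look like the sketch below. It assumes scikit-learn, which this README does not name explicitly, and uses a handful of made-up tweets in place of the real dataset; the step names `tfidf` and `clf` are arbitrary.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy data for illustration only: 0 = positive tweet, 1 = negative tweet.
texts = [
    "I love this phone, great battery",
    "What a wonderful day with friends",
    "Great service and friendly staff",
    "I hate this phone, terrible battery",
    "What an awful day, everything went wrong",
    "Terrible service and rude staff",
]
labels = [0, 0, 0, 1, 1, 1]

pipeline = Pipeline([
    # Step 1: drop English stop words and keep at most 10,000 terms.
    ("tfidf", TfidfVectorizer(stop_words="english", max_features=10000)),
    # Step 2: classify the TF-IDF vectors.
    ("clf", LogisticRegression()),
])
pipeline.fit(texts, labels)

# The fitted pipeline accepts raw text; vectorization happens inside it.
predictions = pipeline.predict(["terrible battery", "great friends"])
```

Bundling both steps means the same vocabulary learned during training is reused automatically at prediction time, so raw Tweets can be passed straight to `predict`.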

CHECKING THE BALANCE OF THE DATASET
```
 % of Class 0 : 92.99%
 % of Class 1 : 7.01%

 Our dataset is imbalanced, as Class 0 is about 13 times the size of Class 1.
 So we need to up-sample Class 1 to balance the dataset.
```
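The percentages above can be reproduced with a few lines of Python. The label list here is synthetic and only mimics the reported split; `class_percentages` is a hypothetical helper name.

```python
from collections import Counter


def class_percentages(labels):
    """Return {label: percentage of rows}, rounded to two decimals."""
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * count / total, 2) for label, count in counts.items()}


# Synthetic labels that mimic the reported split; not the real dataset.
labels = [0] * 9299 + [1] * 701
shares = class_percentages(labels)
ratio = shares[0] / shares[1]  # roughly 13.3
```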
RandomOverSampler was implemented to balance the dataset.
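Random oversampling duplicates randomly chosen rows of the minority class until the classes are the same size. The sketch below is a standard-library stand-in for that idea, not the `RandomOverSampler` implementation itself; the function name and toy data are assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap up to the majority size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


tweets = ["nice day", "love it", "so good", "great job", "awful"]
labels = [0, 0, 0, 0, 1]
balanced_tweets, balanced_labels = random_oversample(tweets, labels)
print(Counter(balanced_labels))  # Counter({0: 4, 1: 4})
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.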

After applying RandomOverSampler, another Pipeline was created for classification on the balanced dataset.

TWITTER API SETUP

To fetch real-time Tweets from the Twitter API, you need to create a Twitter Developer Account.

For the steps to create a Twitter Developer Account, see: https://github.com/khwajaavais/Twitter-Speech-Detection-Sentiment-Analysis/blob/main/Twitter%20Account%20Setup.md

MODEL DEPLOYMENT

LOCALHOST

To run the project on your own system, follow these steps:

- Download the directory.
- Open the Command Prompt (CLI) and change to the project directory.
- Run the command:
```
  python app.py
```
- Open http://127.0.0.1:5000/ in your browser.

WEB APPLICATION

To deploy the project on the Heroku platform, follow Krish Naik's "Deployment of ML models in Heroku using Flask": https://www.youtube.com/watch?v=mrExsjcvF4o

Note: Mandatory files required when deploying an ML model on Heroku using Flask:

- app.py
- Procfile
- model.pkl file (Pickle File)
- request.py
- requirements.txt
- templates/index.html (UI File)
- static/css/style.css

(You can use my Repository to follow the steps.)
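The pickle file in the list above holds the trained model so that the web app can load it without retraining. A minimal sketch of that save and load round trip follows; the dictionary is only a stand-in for a fitted model, and the temporary path is an assumption made so the example can run anywhere.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model; a real project would pickle the fitted estimator.
model = {"weights": {"great": -1.5, "worst": 2.0}, "bias": -0.5}

path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Training side: write the object to disk once.
with open(path, "wb") as f:
    pickle.dump(model, f)

# Serving side (for example when the app starts): load it back.
with open(path, "rb") as f:
    restored = pickle.load(f)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.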

INSTALLATION

The code is written in Python 3.7. If you don't have Python installed, you can download it from the official Python website. If you are using an older version of Python, upgrade it, and make sure you have the latest version of pip. To install the required packages and libraries, run this command in the project directory after cloning the repository:

pip install -r requirements.txt

CONCLUSION AND FUTURE WORK

The Machine Learning Pipeline (Logistic Regression with Natural Language Processing) for classifying real-time Tweets was implemented successfully.

FUTURE WORK

With a larger dataset, a more accurate model can be built, and the Label attribute can be extended with more values (e.g. 'Neutral').

SCREENSHOTS

DEMO: https://twitter-speech-detection-04.herokuapp.com/
