Skip to content

shreyadata804/Sentiment_Analysis_Twitter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 

Repository files navigation

🐦 Twitter Sentiment Analysis

📋 Project Overview

This project aims to build a sentiment analysis pipeline that classifies tweets as positive 😊, negative 😠, or neutral 😐. Currently, the project uses a pre-existing dataset for model training and evaluation. The next stage will enhance the project to analyze real-time tweets fetched using the Twitter Developer API.

The project involves key Natural Language Processing (NLP) techniques like cleaning, tokenization, stopword removal, and stemming, followed by training machine learning models such as Naive Bayes, Support Vector Machines (SVM), and Logistic Regression.


🚀 Project Status

  • ✅ Current Stage:
    • Model training, evaluation, and testing completed using a pre-existing dataset.
    • Preliminary visualizations (e.g., word clouds, confusion matrices) completed.
  • 🔧 Next Steps:
    • Integration with Twitter Developer API for real-time data 🕒.
    • Deploying the project as a web application using Flask or Streamlit for user interaction.
    • Continuous model improvement with live data.

✨ Features

  1. 🔄 Data Preprocessing:
    • Cleaning tweets by removing URLs 🌐, punctuation ❗, numbers 🔢, and special characters.
    • Tokenization and stopword removal for feature extraction 🛠️.
    • Text normalization using stemming 🔍.
  2. 📊 Visualization:
    • Word clouds for positive 💬 and negative 🔴 sentiments.
    • Confusion matrices for model evaluation.
  3. 🧠 Model Training:
    • Supports multiple machine learning models:
      • 🟡 Naive Bayes
      • 🟢 Support Vector Machines (SVM)
      • 🔵 Logistic Regression
    • Evaluation metrics include accuracy ✅, precision 🎯, recall 🔁, F1-score 🏆, and ROC-AUC curves 📈.
  4. 🌟 Future Plans:
    • Real-time tweet collection and analysis.
    • Interactive web-based interface for sentiment prediction.

🛠️ Technologies Used

  • Languages and Libraries:
    • Python 🐍, Numpy, Pandas, Scikit-learn, Matplotlib, Seaborn, NLTK, WordCloud
  • Visualization Tools:
    • Confusion Matrix, WordCloud, ROC Curve 📉
  • Future Integration:
    • Twitter Developer API for live tweet collection 🐦
    • Flask/Streamlit for web deployment 🌐

💻 Getting Started

📋 Prerequisites

  • Install Python 3.x 🐍
  • Install the required dependencies:
    pip install -r requirements.txt
    (The requirements.txt file includes dependencies like scikit-learn, NLTK, WordCloud, etc.)

⚙️ Setup Instructions

  1. 🗂️ Clone the Repository:

    git clone https://github.com/<shreyadata804>/twitter-sentiment-analysis.git
    cd twitter-sentiment-analysis
  2. 📂 Data Preparation:

    • Pre-existing Dataset:
      • Place the dataset in the /data folder.
      • Run the notebook or script to clean, preprocess, and train models.
    • Real-Time Tweets (Future Plan):
      • Obtain API keys from the Twitter Developer Platform.
      • Configure the tweepy library to fetch live tweets.
  3. 🚀 Run the Notebook: Open and execute the Jupyter Notebook to preprocess data, train the model, and evaluate results.


💡 Example Usage

  1. Analyzing Preloaded Dataset:

    • Run the notebook to process the dataset and generate predictions.
  2. Future Use Case with Real-time Tweets:

    • Fetch live tweets using the Twitter Developer API.
    • Pass the tweets through the preprocessing and model pipeline.
    • Obtain sentiment predictions.

📅 Project Roadmap

  1. ✅ Completed:
    • Training and evaluating models using a pre-existing dataset.
    • Initial visualizations and exploratory data analysis.
  2. 🔧 Ongoing:
    • Integration with Twitter Developer API for real-time data.
    • Deploying the model via Flask/Streamlit.
  3. 🚀 Future Enhancements:
    • Model optimization for better accuracy with large-scale real-time data.
    • Addition of advanced NLP techniques like BERT for sentiment classification.

📣 Acknowledgements

  • Sentiment140 Dataset for initial analysis.
  • Python libraries like NLTK and Scikit-learn for NLP and machine learning tasks.
  • Future reliance on Twitter Developer API for live data integration.

🤝 Contributing

Contributions are welcome! Feel free to fork the project and submit pull requests.


📞 Contact

For questions or suggestions, contact:


About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published