HNStaggs/Political-Discourse-NLP-AWS

PoliticPulse

Sentiment Analysis Prediction

Welcome

ADS508 Spring 2024 Team 5:

• Conor Fitzpatrick

• Ravita Kartawinata

• Halee Staggs

Company Name: PoliticPulse

Company Industry: Political Opinion Research / Political Consulting

Company Size: 10

Introduction

The company analyzes public discourse on social media and in comments on news articles, specifically Twitter (now called X) and the New York Times, to understand public sentiment toward major presidential candidates in swing states.

Goal

Using analytics and machine learning techniques, we aim to give political organizations and campaigns insight into public sentiment and guidance as public opinion shifts.

Data Sources

Data are stored in a public AWS S3 bucket (s3://ads508team5/). The project uses the following paths in this bucket:

  • Twitter: s3://ads508team5/tweeter/
  • NYT comments: s3://ads508team5/nyt/nyt-comments-2020.csv
  • US cities: s3://ads508team5/cities/uscities.csv
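
As a minimal sketch of how these paths might be referenced from a notebook, the snippet below collects the URIs listed above and splits each one into a bucket name and an object key. The dictionary keys and the `split_s3_uri` helper are hypothetical names chosen for illustration; they are not taken from the project code.

```python
from urllib.parse import urlparse

# S3 URIs copied from the list above. The dictionary keys are
# illustrative labels, not names used by the project itself.
DATA_SOURCES = {
    "twitter": "s3://ads508team5/tweeter/",
    "nyt_comments": "s3://ads508team5/nyt/nyt-comments-2020.csv",
    "us_cities": "s3://ads508team5/cities/uscities.csv",
}


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3:// URI into (bucket, key). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The URI path starts with "/", which is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    for name, uri in DATA_SOURCES.items():
        bucket, key = split_s3_uri(uri)
        print(f"{name}: bucket={bucket} key={key}")
```

The resulting bucket and key could then be handed to an S3 client, or the CSV URIs could be passed directly to a reader that understands `s3://` paths; which approach the notebooks actually use is not stated here.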

How to start project

  1. Clone the repository: https://github.com/HNStaggs/ADS508_GroupProject.git

  2. Run all the files in the Setup folder

  3. Open EDA.ipynb and select Run All on an ml.m5.large instance

  4. Open Partition_Transform.ipynb and select Run All on an ml.m5.large instance

  5. Open Modeling.ipynb and select Run All on an ml.m5.large instance
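
The notebook steps above are meant to be run interactively with Run All. As a rough alternative sketch, the same order could be scripted with `jupyter nbconvert --execute`. This is an assumption rather than the team's workflow: it presumes the notebooks use the standard `.ipynb` extension, that `jupyter` is on the PATH, and that the Setup folder has already been run. It does not select the ml.m5.large instance; that still has to be chosen in the environment where the script runs. `build_commands` and `run_all` are hypothetical helper names.

```python
import subprocess

# Notebook order taken from steps 3 to 5 above (file names assumed
# to use the standard .ipynb extension).
NOTEBOOKS = ["EDA.ipynb", "Partition_Transform.ipynb", "Modeling.ipynb"]


def build_commands(notebooks):
    """Return one nbconvert command per notebook, in the given order."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", nb]
        for nb in notebooks
    ]


def run_all(notebooks=NOTEBOOKS):
    """Execute each notebook in order, stopping at the first failure."""
    for cmd in build_commands(notebooks):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in build_commands(NOTEBOOKS):
        print(" ".join(cmd))
```

Calling `run_all()` from the cloned repository directory would execute the three notebooks in sequence and overwrite each one with its outputs.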

Tools

  1. Python (JupyterLab notebooks)

  2. AWS (SageMaker, Athena, S3, Data Wrangler, Autopilot)

  3. Google Docs

  4. PowerPoint
