The idea of this project is to monitor Tesla stock price movement through sentiment analysis of tweets over a short-term period.
If we can capture the direction (positive or negative) of today's tweet sentiment about Tesla, meaning what people are saying about Tesla and how they react to its performance, related news, or market trends, that sentiment may affect the way the stock moves tomorrow.
On Twitter, a tweet posted from a single account can be retweeted many times, spreading information and influencing even more people. The number of followers an account has can also be a measure of a tweet's influence: the more followers the account has, the more its tweets can sway the crowd and, potentially, the stock price.
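One way to turn this idea into a number is to weight each tweet's sentiment score by its reach. The sketch below is only an illustration: the weighting formula, the function names, and the field names (`score`, `retweets`, `followers`) are assumptions, not part of the project. Scores are assumed to be precomputed sentiment values in [-1, 1], such as a VADER compound score.

```python
import math

def influence_weight(retweets, followers):
    """Hypothetical weight: retweets count linearly, followers on a log scale."""
    return 1.0 + retweets + math.log10(1 + followers)

def weighted_daily_sentiment(tweets):
    """Influence-weighted average of the sentiment scores for one day."""
    if not tweets:
        return 0.0
    weights = [influence_weight(t["retweets"], t["followers"]) for t in tweets]
    weighted_sum = sum(w * t["score"] for w, t in zip(weights, tweets))
    return weighted_sum / sum(weights)

# Made-up tweets for illustration only.
sample = [
    {"score": 1.0, "retweets": 0, "followers": 0},    # weight 1
    {"score": -1.0, "retweets": 8, "followers": 9},   # weight 1 + 8 + 1 = 10
]
daily_score = weighted_daily_sentiment(sample)
```

The log scale on followers is a design choice to keep a few very large accounts from dominating the daily average; a different weighting may suit the data better.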
The main idea is to explore tweet data over a short-term period with unsupervised learning, such as topic modeling, and to interpret the resulting topics to see how they relate to Tesla stock price moves.
- Resource: collected through the Twitter API using Tweepy.
- Date Range: start from 07/12, covering 07/05 – 07/13 (the current API only lets users collect data from the past 6-9 days, so this is the research period)
- Features including:
- Resource: downloaded from Yahoo Finance
- Date Range: 07/06 – 07/14 (since our hypothesis is that today's tweets affect tomorrow's stock price, we look at the daily stock price on the day after each tweet date to monitor the movement)
- Features including:
  - Variables: Date, Open, High, Low, Close, Adj Close, Volume
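The one-day offset between tweet dates and price dates can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the helper names, the sample values, and the year are placeholders (the source gives only month/day). Two further assumptions: when the next calendar day has no price row (a weekend, for example), the next available trading day is used, and "up" is defined as Close above Open on that day.

```python
from bisect import bisect_right
from datetime import date

# Made-up values for illustration only, not real Tesla quotes.
# Maps trading date -> (Open, Close).
prices = {
    date(2021, 7, 6): (100.0, 102.0),
    date(2021, 7, 7): (102.0, 101.0),
    date(2021, 7, 8): (101.0, 105.0),
    date(2021, 7, 12): (105.0, 104.0),
}
# Hypothetical average tweet sentiment per tweet date.
sentiment = {
    date(2021, 7, 5): 0.30,
    date(2021, 7, 6): -0.10,
    date(2021, 7, 7): 0.25,
    date(2021, 7, 9): 0.05,
}

def next_trading_day(day, trading_days):
    """Return the first trading day strictly after `day`, or None."""
    i = bisect_right(trading_days, day)
    return trading_days[i] if i < len(trading_days) else None

def build_rows(sentiment, prices):
    """Pair each tweet date with the following trading day's up/down move."""
    trading_days = sorted(prices)
    rows = []
    for tweet_day in sorted(sentiment):
        price_day = next_trading_day(tweet_day, trading_days)
        if price_day is None:
            continue  # no later price data for this tweet date
        open_price, close_price = prices[price_day]
        label = "up" if close_price > open_price else "down"
        rows.append((tweet_day, price_day, sentiment[tweet_day], label))
    return rows

rows = build_rows(sentiment, prices)
```

Each row holds the tweet date, the matched price date, the sentiment value, and the up/down label, which is the shape a classifier for stock direction would need.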

- Tweepy (Twitter API)
- Tokenizer
- sklearn
  - Feature Extraction
    - from sklearn.feature_extraction.text import TfidfVectorizer, ENGLISH_STOP_WORDS
  - Topic Modeling
    - from sklearn.decomposition import NMF, TruncatedSVD, LatentDirichletAllocation
- NLTK VADER (sentiment analysis)
- Classification (stock moves up or down)
✨Clean the raw text and build a tweet/topic matrix for topic modeling to show what the topic clusters are.✨