This project attempts to reproduce the results from a 1991 paper:
The paper claims to have found a significant relationship between pessimism, as measured in song lyrics and magazine headlines, and changes in economic activity.
I attempt to recreate this result by scoring the sentiment of popular song lyrics from the past 60 years and comparing it with changes in GDP per capita.
The data used in this project was sourced from:
API calls to the Genius API through a Python library.
Scraping songlyrics.com
The Bureau of Economic Analysis
Before modeling, I use TextBlob to assign each set of lyrics a sentiment value. These sentiment values are then shifted in time to create multiple lagged versions of the series.
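The scoring and lagging step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `lyric_polarity` and `add_lags`, the column naming, and the choice to use TextBlob's polarity score (which ranges from -1 to 1) as "the sentiment value" are all assumptions.

```python
import pandas as pd


def lyric_polarity(text):
    """Return TextBlob's polarity score for one lyric, in [-1, 1].

    Hypothetical helper; TextBlob is imported here so the lag helper
    below can be used without TextBlob installed.
    """
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def add_lags(sentiment, max_lag):
    """Build a frame holding the sentiment series and its lags 1..max_lag.

    `sentiment` is assumed to be a pandas Series ordered by time.
    Rows without a full lag history are dropped.
    """
    frame = pd.DataFrame({"sentiment": sentiment})
    for k in range(1, max_lag + 1):
        # shift(k) moves each value k periods later, so row t holds the value from t-k
        frame[f"sentiment_lag_{k}"] = sentiment.shift(k)
    return frame.dropna()
```

For example, `add_lags(scores, 24)` would produce 24 lagged columns from a time-ordered Series `scores` (a hypothetical name), at the cost of losing the first 24 rows.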
The sentiment series and its lags are fed into a LASSO regression model. Alpha (the regularization strength) is iteratively increased until only one sentiment lag retains a nonzero coefficient. That lag is used as the exogenous variable in the next modeling step.
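One way to sketch this selection loop with scikit-learn is shown below. The starting alpha, the multiplicative step, the standardization of the features, and the function name `select_single_lag` are assumptions made for illustration; the original search schedule is not stated.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def select_single_lag(X, y, names, alpha=1e-3, factor=1.5, max_steps=100):
    """Raise alpha until exactly one coefficient is nonzero.

    Returns (name_of_surviving_feature, alpha), or (None, alpha) if the
    search jumps past the point where a single feature remains.
    The alpha schedule here is an assumption, not the original one.
    """
    # Put all lags on the same scale so the penalty treats them equally
    Xs = StandardScaler().fit_transform(X)
    for _ in range(max_steps):
        coef = Lasso(alpha=alpha, max_iter=10_000).fit(Xs, y).coef_
        active = np.flatnonzero(coef)
        if active.size == 1:
            return names[active[0]], alpha
        if active.size == 0:
            break  # overshot: every coefficient was shrunk to zero
        alpha *= factor
    return None, alpha
```

A smaller `factor` makes the search slower but less likely to skip over the range of alpha values where exactly one lag survives.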
The most helpful visualization is the GDP time series plotted against lyric sentiment. Differenced lyric sentiment is too noisy to inspect visually for patterns.
Also, the PACF shows very little autocorrelation in the differenced (stationary) GDP values, which suggests that the predictive power of a time-series model will be quite limited unless the exogenous variable is highly correlated with GDP.
Using a SARIMAX model, I test the 19-month lagged sentiment of the lyrics as an exogenous variable. A visual inspection of the predictions from this SARIMAX model shows no appreciable difference from a linear model: the shape of the predictions does not reflect the shape of the lagged sentiment.
The results of the paper could not be recreated. In fact, the correlations found do not seem strong enough for any such prediction to be made reliably. Adding more features or using a more powerful model might improve predictive power.
When time permits, I would like to build an LSTM model capable of making similar predictions. It would also be interesting to include factors that economists expect to affect GDP over time, such as monetary and fiscal policy. This could be especially interesting because the model, which attempts to quantify a previously exogenous consumer pessimism, could endogenize that pessimism and thereby help separate the effects of government policy from cultural or time-dependent changes.