Problem Statement: This project analyzes a dataset of tracks on the music streaming platform Spotify. The dataset includes track names, artists, release dates, presence on playlists and charts, stream counts, and musical properties such as beats per minute (bpm), danceability, mood, and energy. The core problem our team is addressing is to explore relationships and patterns within this dataset in order to understand which factors contribute to a song's popularity and success on streaming platforms.

Application of Data Science Algorithms for Business Value
- Exploratory Data Analysis (EDA): The team proposes to use EDA methods to uncover underlying patterns in the dataset. Examples include examining the relationship between a song's tempo (bpm) and its danceability, and checking whether presence on Shazam charts is associated with higher stream counts.
- Regression Analysis: Regression analyses are planned to quantify relationships between variables. For instance, one model would use bpm as the independent variable and danceability as the dependent variable; another would study the effect of a song's release month on its popularity ranking.
- Visualization: Scatter plots, regression plots, and potentially 3D scatter plots will be used to show patterns and relationships in the data. Visual summaries make complex interactions between variables easier to interpret and can support better-informed business decisions in the music industry.
- Predictive Modeling: The team plans to fit regression models that predict stream counts from factors such as playlist inclusions and chart presence. Comparing the coefficient of determination (R²) and mean squared error (MSE) across models will indicate which factors are stronger predictors of streaming success.
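
The model comparison described in the last bullet could look roughly like the sketch below. It is only an illustration under stated assumptions: the column names (`in_playlists`, `in_charts`, `streams`) are hypothetical placeholders rather than the dataset's actual headers, the numbers are invented toy values, and the single-predictor least-squares fit is written out by hand so the example runs without extra libraries. In practice the team would more likely use a statistics or machine-learning library and evaluate on held-out data.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def evaluate(x, y, intercept, slope):
    """Return (R^2, MSE) of the fitted line on the data it was fitted to."""
    preds = [intercept + slope * xi for xi in x]
    residual_ss = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    y_bar = mean(y)
    total_ss = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - residual_ss / total_ss, residual_ss / len(y)


def compare_predictors(data, target, predictors):
    """Fit one single-predictor model per column and collect its scores."""
    results = {}
    for name in predictors:
        intercept, slope = fit_simple_ols(data[name], data[target])
        results[name] = evaluate(data[name], data[target], intercept, slope)
    return results


# Toy values invented for illustration only; NOT taken from the Spotify dataset.
# Column names are hypothetical placeholders.
tracks = {
    "in_playlists": [10, 20, 30, 40, 50],
    "in_charts": [3, 1, 4, 1, 5],
    "streams": [12, 19, 33, 41, 48],
}

scores = compare_predictors(tracks, "streams", ["in_playlists", "in_charts"])
for name, (r2, mse) in scores.items():
    print(f"{name}: R^2={r2:.3f}, MSE={mse:.3f}")
```

A higher R² together with a lower MSE for one predictor suggests it explains more of the variation in streams than the other. Because these scores are computed on the same data used for fitting, they describe fit rather than out-of-sample predictive performance.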