Data Card

Note

Reminder: The song count in the playlists may vary as they are updated daily.

The train dataset is a Spotify playlist named “keep grinding.” that consists of 610 songs. Here is the link to this playlist: «keep grinding.».

The test dataset is a Spotify playlist named “Spotify Most Played All Time 500Mil+” that consists of 721 songs. Here is the link to this playlist: «Spotify Most Played All Time 500Mil+».

With this data, we want to offer users personalized song suggestions drawn from a large space of possible options.

The Spotipy library is used to extract both the songs and their audio features from the Spotify playlists. Here is a brief description of each feature extracted using Spotipy:

Feature	Description
Energy	Intensity and activity level of a track
Liveness	Presence of a live audience in a track
Loudness	Overall volume of a track in decibels (dB)

Additional information on extracting various features is accessible in the Spotify API documentation.
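The extraction step described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the helper names (make_client, playlist_track_ids, track_features) are hypothetical, credentials are assumed to come from the SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET environment variables, and whether the audio-features endpoint is available depends on the access granted to your Spotify app.

```python
from typing import Any, Dict, List

# The three features kept by this project (keys as returned by the Spotify API).
FEATURES = ("energy", "liveness", "loudness")


def make_client() -> Any:
    """Build a Spotipy client; assumes credentials are set in the environment."""
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    return spotipy.Spotify(auth_manager=SpotifyClientCredentials())


def playlist_track_ids(sp: Any, playlist_id: str) -> List[str]:
    """Collect every track ID in a playlist, following pagination."""
    ids: List[str] = []
    page = sp.playlist_items(playlist_id, additional_types=("track",))
    while page:
        for item in page["items"]:
            track = item.get("track")
            # Skip removed or local entries that carry no track ID.
            if track and track.get("id"):
                ids.append(track["id"])
        page = sp.next(page) if page.get("next") else None
    return ids


def track_features(sp: Any, track_ids: List[str]) -> List[Dict[str, Any]]:
    """Fetch audio features in batches of 100 and keep only FEATURES."""
    rows: List[Dict[str, Any]] = []
    for start in range(0, len(track_ids), 100):
        batch = track_ids[start:start + 100]
        for feats in sp.audio_features(batch):
            if feats is None:  # no features available for this track
                continue
            row = {"id": feats["id"]}
            row.update({name: feats[name] for name in FEATURES})
            rows.append(row)
    return rows
```

A call such as track_features(sp, playlist_track_ids(sp, playlist_id)) with sp = make_client() would then yield one row per song, ready for the processing step.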

Dataset Before and After Processing & Clustering

We present screenshots to provide insight into the data used in the overall process.

Raw Data (After Extraction):
Processed Data:
Clustered Data (Used for Recommendations):

Feature Selection

The correlation matrix, used to examine relationships among features, helped us determine whether any features were significantly correlated or whether they were largely independent of each other.

The objective was to select the features with the strongest correlation, with the option to include an additional feature if it showed a small correlation with the high-correlation features already identified.

Following these criteria, all three features (Loudness, Energy, Liveness) were adopted: there is a noticeable correlation between Loudness and Energy, and a minor correlation between each of them and Liveness.
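To illustrate what the correlation matrix measures, here is a small standard-library sketch that computes pairwise Pearson correlations. The five rows of values are invented toy numbers, not data from either playlist, and the function names are hypothetical; in practice a call such as pandas' DataFrame.corr() would likely do the same job.

```python
from math import sqrt
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient between two equally long columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Correlation of every feature with every other feature."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy values for illustration only (NOT taken from the playlists).
tracks = {
    "energy": [0.42, 0.55, 0.71, 0.83, 0.90],
    "loudness": [-11.2, -9.0, -6.5, -5.1, -4.3],
    "liveness": [0.10, 0.32, 0.12, 0.25, 0.18],
}

matrix = correlation_matrix(tracks)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Values near 1 or -1 indicate strongly related features, while values near 0 suggest the features are largely independent, which is the distinction the selection criteria above rely on.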
