The objective of this project was to conduct a cluster analysis of data from the OnSports fantasy sports platform in order to identify patterns in player performance and inform pricing decisions for the start of a new season. This involved determining the potential of each player based on their previous-season performance and identifying the characteristics of the different clusters of players.
This was done by comparing five clustering algorithms: K-Means, K-Medoids, Hierarchical Clustering, Gaussian Mixture Model (GMM) clustering, and DBSCAN. The comparison resulted in K-Means being selected as the final algorithm because of its high silhouette score.
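To make the selection criterion concrete, the following is a minimal sketch of how a silhouette score can be computed and used to choose between candidate clusterings. It is not the project's code: the toy points, the two candidate labelings (`good_split`, `bad_split`), and the helper `silhouette_score` are all illustrative assumptions, and in practice the labelings would come from the fitted algorithms rather than being written by hand.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the distance from the point to itself is 0, so it does not affect the sum)
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(point, q) for q in members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Hypothetical labelings standing in for the output of different algorithms.
candidates = {
    "good_split": [0, 0, 0, 1, 1, 1],
    "bad_split": [0, 1, 0, 1, 0, 1],
}

scores = {name: silhouette_score(points, labels) for name, labels in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

A score near 1 indicates compact, well-separated clusters, while a score near 0 or below indicates overlapping or misassigned points, which is why the labeling with the highest score is preferred here.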
The clustering analysis identified four clusters of players, characterized by their level of influence on the outcome of the game, goals scored, creativity, and other factors.
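One common way to characterize clusters like these is to average each feature within a cluster. The sketch below assumes hypothetical column names (`cluster`, `goals_scored`, `creativity`) and invented toy values; it does not reproduce the OnSports data or the project's actual profiling code.

```python
from collections import defaultdict
from statistics import mean

# Toy rows with invented values, for illustration only.
players = [
    {"cluster": 0, "goals_scored": 12, "creativity": 80.0},
    {"cluster": 0, "goals_scored": 10, "creativity": 70.0},
    {"cluster": 1, "goals_scored": 1, "creativity": 10.0},
    {"cluster": 1, "goals_scored": 3, "creativity": 20.0},
]


def cluster_profiles(rows, features):
    """Return the per-cluster mean of each requested feature."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["cluster"]].append(row)
    return {
        cluster: {f: mean(r[f] for r in members) for f in features}
        for cluster, members in sorted(grouped.items())
    }


profiles = cluster_profiles(players, ["goals_scored", "creativity"])
for cluster, profile in profiles.items():
    print(cluster, profile)
```

Comparing these per-cluster averages side by side is what allows each cluster to be described in terms such as high-scoring or low-creativity.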
Based on the characteristics of each cluster, the following recommendations for player pricing were made:
- Players in cluster 0 were identified as the top players for fantasy and should be priced higher due to their high potential for fetching points and bonus points.
- Players in cluster 1 were identified as substitutes who fetch fewer fantasy points and should be priced lower.
- Players in cluster 2 were identified as influential in their team's play but not necessarily in terms of scoring or assisting, and should be priced based on their influence.
- Players in cluster 3 were identified as consistent performers who fetch a moderate amount of points and should be priced accordingly.