Data analytics project in WS23. The goal of the project is to find out whether encoding data with sentence embeddings is more efficient than one-hot encoding (OHE) for clustering with K-Means.

moshchev/data_analytics_ws23

This repo contains an analysis of three datasets; each can be found in the corresponding subfolder. Each analysis consists of an exploratory data analysis (EDA), K-Means clustering on categorical data encoded the classical way, and K-Means clustering on embeddings.
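To make the classical baseline concrete, here is a minimal, dependency-free sketch of one-hot encoding followed by K-Means (Lloyd's algorithm). It is not the code used in this repo: the toy `rows`, the helper names `one_hot_encode`, `squared_distance`, and `k_means`, and the deterministic seeding are all assumptions made for illustration. The embedding variant would presumably swap the encoding step for vectors produced by a sentence-embedding model and keep the clustering step unchanged.

```python
def one_hot_encode(rows):
    """One-hot encode a list of dicts holding categorical values."""
    # Stable, sorted vocabulary of (column, value) pairs: one dimension each.
    vocab = sorted({(col, val) for row in rows for col, val in row.items()})
    index = {pair: i for i, pair in enumerate(vocab)}
    vectors = []
    for row in rows:
        vec = [0.0] * len(vocab)
        for col, val in row.items():
            vec[index[(col, val)]] = 1.0
        vectors.append(vec)
    return vectors, vocab


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, max_iter=100):
    """Lloyd's algorithm, seeded with the first k distinct vectors.

    The deterministic seeding is an illustrative simplification; real
    analyses usually use random or k-means++ initialization.
    """
    centroids = []
    for v in vectors:
        if v not in centroids:
            centroids.append(list(v))
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("need at least k distinct vectors")

    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data, not taken from the repo's datasets.
rows = [
    {"color": "red", "size": "small"},
    {"color": "red", "size": "small"},
    {"color": "blue", "size": "large"},
    {"color": "blue", "size": "large"},
    {"color": "red", "size": "medium"},
]

vectors, vocab = one_hot_encode(rows)
labels, centroids = k_means(vectors, k=2)
print(vocab)
print(labels)
```

Each distinct (column, value) pair becomes its own dimension, so the width of the one-hot vectors grows with the number of categories, whereas embedding vectors have a fixed size set by the model.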

The micropublication describes the project and summarizes key takeaways.
