music_box is an app for listening to music. The goal of this project is to predict whether a user will churn.
This step unzips the log files, adds the file names, and combines everything into a single log file.
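A minimal sketch of this kind of merge, assuming the logs are gzip-compressed text files and that "adding file names" means prefixing each record with the name of the archive it came from. The function name `combine_logs`, the `*.gz` pattern, and the tab separator are illustrative assumptions, not taken from the project.

```python
import gzip
from pathlib import Path


def combine_logs(src_dir, out_path, pattern="*.gz"):
    """Decompress every matching archive and write its lines to one file.

    Each output line is prefixed with the source file name and a tab, so the
    origin of every record survives the merge. Returns the number of lines
    written.
    """
    src_dir, out_path = Path(src_dir), Path(out_path)
    n_lines = 0
    with out_path.open("w", encoding="utf-8") as out:
        # Sort so the combined file is reproducible across runs.
        for archive in sorted(src_dir.glob(pattern)):
            with gzip.open(archive, "rt", encoding="utf-8") as src:
                for line in src:
                    line = line.rstrip("\n")
                    if line:  # skip blank lines
                        out.write(f"{archive.name}\t{line}\n")
                        n_lines += 1
    return n_lines
```

Calling `combine_logs("logs/", "all.log")` would then produce one file in which the first tab-separated field of every line is the originating archive name; the paths here are placeholders.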
Take a look at what is included in the log data.
- Use PySpark on Google Cloud Platform to load and prepare the data.
- Fit logistic regression, XGBoost, and random forest models with grid search.
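The grid search mentioned above can be sketched in a model-agnostic way: try every combination of candidate hyperparameters and keep the one with the best validation score. This is only an illustration of the technique, not the project's actual code. `grid_search`, `toy_score`, and the parameter names `C` and `max_iter` are hypothetical; in practice `toy_score` would be replaced by a function that fits one of the models and returns a validation metric, or by a library utility such as scikit-learn's `GridSearchCV`.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the best one.

    param_grid: dict mapping parameter name -> list of candidate values.
    score_fn:   callable taking a params dict and returning a validation
                score where higher is better (for example, AUC).
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    # product() enumerates the full Cartesian grid of candidate values.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical stand-in for "fit the model, score it on validation data".
# It peaks at C = 1.0 and max_iter = 100 purely for demonstration.
def toy_score(params):
    return -(params["C"] - 1.0) ** 2 - (params["max_iter"] - 100) ** 2 / 1e4


best_params, best_score = grid_search(
    {"C": [0.1, 1.0, 10.0], "max_iter": [50, 100, 200]},
    toy_score,
)
```

Because the grid grows multiplicatively with each added parameter, it is usually worth keeping the candidate lists short for the more expensive models.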