Introduction

The goal of this classification project was to predict whether a person would submit a claim.

Multiple configurations of several models, including Logistic Regression, Random Forest, and XGBoost, did not produce significant results, so we must conclude that the data, as collected, is not predictive. The iterations were initially split by how the minority target class was upsampled: one set uses imbalanced-learn's SMOTE and the other uses a .resample() call. Each approach has its own notebook, named after the method it uses. A snapshot of each model's score may be viewed in the 'Results_Snapshot.xlsx' file. Of everything tried, Grid Search on XGBoost produced the best results.
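As a rough illustration of the upsample-then-tune workflow described above, here is a minimal sketch. It is not the notebooks' code: it assumes scikit-learn's `sklearn.utils.resample` for the upsampling step, uses synthetic data from `make_classification` in place of the claims data, and swaps in Logistic Regression for XGBoost so the example depends only on scikit-learn. The parameter grid and the F1 scoring choice are also assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.utils import resample

# Synthetic stand-in for the claims data (assumption): about 10% positive class.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=0
)

# Split first so duplicated minority rows never leak into the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Upsample the minority class with replacement to match the majority count.
majority = X_train[y_train == 0]
minority = X_train[y_train == 1]
minority_up = resample(
    minority, replace=True, n_samples=len(majority), random_state=0
)

X_bal = np.vstack([majority, minority_up])
y_bal = np.concatenate([
    np.zeros(len(majority), dtype=int),
    np.ones(len(minority_up), dtype=int),
])

# Small grid search over the regularization strength (hypothetical grid).
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="f1",
    cv=5,
)
search.fit(X_bal, y_bal)

# Evaluate on the untouched, still-imbalanced test split.
test_f1 = f1_score(y_test, search.predict(X_test))
print(search.best_params_, round(test_f1, 3))
```

One caveat with this simple version: because the upsampling happens before cross-validation, copies of the same minority row can land in both the training and validation folds, which tends to inflate the cross-validated score. The held-out test score is the more trustworthy number.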

However, we did gain insight into which features are useful and which are superfluous, and we can begin to focus energy on collecting more data for the useful ones. We can also work on identifying new data points and on feature engineering to produce a data set that is predictive and lets us answer our questions with confidence.
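One common way to separate useful from superfluous features is to rank them by a tree ensemble's impurity-based importances. The sketch below shows the idea under stated assumptions: the data is synthetic, the feature names are hypothetical placeholders, and nothing here is taken from the project's notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (assumption): with shuffle=False, the 3 informative
# features come first and the remaining 5 columns are pure noise.
X, y = make_classification(
    n_samples=1000,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    shuffle=False,
    random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, y)

# Rank features from most to least important.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances can overstate high-cardinality features, so a ranking like this is best treated as a starting point for deciding which data to collect, not as a final verdict.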


mgavish/Logistic-Regression-Random-Forest-Gradient-Boosting
