Implemented and compared Random Forest, Decision Tree, KNN, SVM, and Logistic Regression outcomes with a confusion matrix. Concluded that Random Forest achieved the highest accuracy of 85% to predict the loan status for investors.

kunjan-mhaske/Risky-loan-prediction-using-ML-in-R-Studio

** Open the Project Report for details.

Link to the dataset: https://www.kaggle.com/wendykan/lending-club-loan-data

Introduction:

For this project, we set out to choose a core algorithm for analyzing lending risk by classifying the available loan data into good loans and bad loans. We considered five machine learning models and compared their accuracy and efficiency on the Lending Club loan dataset. We first worked with Decision Tree, KNN, Logistic Regression, and Random Forest classification. After observing the overall performance of these four models, we selected Random Forest as the core algorithm for the system. We then worked on a fifth algorithm, SVM, which was not covered in class, and tried to fine-tune it for a better outcome. Finally, we compared all five algorithms with one another and again selected the Random Forest classifier as the core algorithm.
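The comparison described above rests on a confusion matrix and the accuracy derived from it. The project itself is written in R; the following is only a minimal, language-neutral sketch in Python of how such a comparison could work. The labels, the model names `model_a` and `model_b`, and the helper functions are hypothetical and are not taken from the project code or its results.

```python
from collections import Counter


def confusion_matrix(actual, predicted, positive="bad"):
    """Count TP, FP, TN, FN for a binary good/bad loan classifier.

    Treating "bad" as the positive class is an assumption for this sketch.
    """
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return {key: counts[key] for key in ("tp", "fp", "tn", "fn")}


def accuracy(cm):
    """Share of loans classified correctly: (TP + TN) / total."""
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total if total else 0.0


# Made-up labels for illustration only; not Lending Club data.
actual = ["good", "good", "bad", "good", "bad", "bad", "good", "good"]
predictions = {
    "model_a": ["good", "good", "bad", "good", "good", "bad", "good", "bad"],
    "model_b": ["good", "bad", "bad", "bad", "good", "bad", "good", "bad"],
}

# Score each hypothetical model and keep the one with the highest accuracy.
scores = {
    name: accuracy(confusion_matrix(actual, predicted))
    for name, predicted in predictions.items()
}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: accuracy = {score:.2f}")
print("selected:", best)
```

Accuracy alone can be misleading when bad loans are rare, so the individual cells of the confusion matrix are worth inspecting alongside the single score.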

The Goal:

Investing in the loan lending business is financially risky without a proper system for analyzing whether existing loans are likely to turn out good or bad. Investors should check both the historical and the current statistics of the borrower and use the result to decide whether to invest more money in improving bad loans or in maintaining good loans. For this herculean task, we propose a model based on historical and currently available data that estimates how likely an existing loan is to become a good loan or a bad loan for investors.
