Due Date: 14-07-2017
This assignment will count as the lab exercise for the CSE7405c, CSE7202c, and CSE7321c modules.
Read the Universal Bank dataset.
Ensure that the necessary preprocessing steps are implemented wherever they are needed:
- Type conversion
- Imputation
- Standardization
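The three preprocessing steps above can be sketched with pandas as follows. This is only an illustration: the tiny DataFrame is a made-up stand-in for the Universal Bank file, the column names (`Age`, `Income`, `Education`) are assumptions, and median imputation is one reasonable choice among several.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the real file; column names and values are assumptions.
df = pd.DataFrame({
    "Age": [25, 45, np.nan, 39],
    "Income": [49.0, 34.0, 11.0, np.nan],
    "Education": [1, 2, 1, 3],
})

# Type conversion: treat the coded education level as a categorical attribute.
df["Education"] = df["Education"].astype("category")

# Imputation: fill missing numeric values with the column median.
num_cols = ["Age", "Income"]
df[num_cols] = df[num_cols].fillna(df[num_cols].median())

# Standardization: zero mean and unit variance (population std, ddof=0).
scaled = (df[num_cols] - df[num_cols].mean()) / df[num_cols].std(ddof=0)
print(scaled)
```

In a real modelling workflow the median, mean, and standard deviation would be computed on the training split only and then applied to the test split.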
Understand the spread of the data using the numerical attributes, and examine how the target varies across the categorical attributes.
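One possible way to look at both aspects is a numeric summary plus a grouped target rate. The sketch below uses invented values and assumed column names (`Income`, `Education`, `Personal Loan`); it is not the actual dataset.

```python
import pandas as pd

# Invented rows for illustration; column names are assumptions.
df = pd.DataFrame({
    "Income": [49, 34, 11, 100, 45, 29],
    "Education": [1, 1, 2, 2, 3, 3],
    "Personal Loan": [0, 0, 0, 1, 1, 0],
})

# Spread of a numerical attribute: count, mean, std, min, quartiles, max.
spread = df["Income"].describe()
print(spread)

# How the binary target varies across a categorical attribute:
# the mean of a 0/1 column is the acceptance rate within each group.
loan_rate = df.groupby("Education")["Personal Loan"].mean()
print(loan_rate)
```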
Identify the important patterns using visualizations (not mandatory).
Generate new features.
- Using PCA
- Load the data into H2O and generate non-linear features using autoencoders
- Using business understanding
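As a rough sketch of the PCA option in the list above, the principal-component scores can be appended to the standardized attributes as extra columns. The data here is random and synthetic, and the choice of two components is an arbitrary assumption. The H2O autoencoder step is not shown because it needs a running H2O cluster.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic numeric data standing in for the real attributes (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2 * X[:, 0] + 0.1 * rng.normal(size=200)  # make two columns correlated

# PCA is scale-sensitive, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

# Keep the first two principal components as new linear features.
pca = PCA(n_components=2)
pca_features = pca.fit_transform(X_scaled)

# Append the component scores to the original (scaled) attributes.
X_augmented = np.hstack([X_scaled, pca_features])
print(pca.explained_variance_ratio_)
```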
Select only the important attributes using Random Forest (candidates include the linear, non-linear, and business-domain attributes).
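A minimal sketch of Random Forest based selection, assuming scikit-learn and synthetic data in which only one feature drives the target. The feature names `f0` to `f3` and the "keep the top two" cut-off are hypothetical choices for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: only feature f0 actually influences the target (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 3 * X[:, 0] + 0.1 * rng.normal(size=300)
names = ["f0", "f1", "f2", "f3"]

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Rank attributes by impurity-based importance (the importances sum to 1).
ranking = sorted(zip(names, rf.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")

# Keep the top-k attributes; k = 2 is an arbitrary cut-off for this sketch.
top = [name for name, _ in ranking[:2]]
```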
Build a regression model with income as the target variable. Use the following techniques:
- Linear Regression
- Decision Tree (Regression Tree)
- SVM
- Neural Network
- KNN
- AdaBoost
- Random Forest
- GBM
- Deep Learning
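The regressors listed above could be compared on a common hold-out split along these lines. This is a sketch with scikit-learn defaults on synthetic data from `make_regression`, not the income data; `MLPRegressor` stands in for the neural network, and a separate deep learning framework is not shown.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem standing in for "predict income" (assumption).
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Scale-sensitive models get a StandardScaler in front of them.
models = {
    "Linear Regression": LinearRegression(),
    "Regression Tree": DecisionTreeRegressor(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVR()),
    "Neural Network": make_pipeline(
        StandardScaler(), MLPRegressor(max_iter=500, random_state=0)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsRegressor()),
    "AdaBoost": AdaBoostRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(random_state=0),
    "GBM": GradientBoostingRegressor(random_state=0),
}

# Fit each model and record its root mean squared error on the test split.
rmse = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    rmse[name] = float(np.sqrt(mean_squared_error(y_te, model.predict(X_te))))

for name, score in sorted(rmse.items(), key=lambda pair: pair[1]):
    print(f"{name:18s} RMSE = {score:.2f}")
```

The defaults are untuned, so the ranking printed here says little about how the models would compare after hyperparameter search.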
Build a classification model to predict which customers are likely to accept the offer of a new personal loan, using personal loan as the target. Use the following techniques:
- Logistic regression
- Decision tree (both C5.0 and CART)
- SVM
- Neural Network
- KNN
- AdaBoost
- Random Forest
- GBM
- Deep Learning
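A comparable sketch for the classifiers listed above, scored by cross-validated ROC AUC. Several assumptions are baked in: the data is synthetic, the positive class is made rare (10%) on the guess that loan acceptors are a minority, and scikit-learn has no C5.0, so an entropy-criterion tree is used as a loose approximation next to the CART-style Gini tree.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced binary problem standing in for "accepts loan" (assumption).
X, y = make_classification(n_samples=400, n_features=8, n_informative=3,
                           weights=[0.9, 0.1], random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Tree (Gini, CART-style)": DecisionTreeClassifier(criterion="gini", random_state=0),
    # Entropy splitting is only a rough proxy for C5.0, not the same algorithm.
    "Tree (entropy)": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Neural Network": make_pipeline(
        StandardScaler(), MLPClassifier(max_iter=500, random_state=0)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
}

# 5-fold stratified cross-validation; ROC AUC is less misleading than
# accuracy when one class is rare.
auc = {name: float(cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean())
       for name, model in models.items()}

for name, score in sorted(auc.items(), key=lambda pair: pair[1], reverse=True):
    print(f"{name:24s} AUC = {score:.3f}")
```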
Apply stacking to all the regression models and all the classification models.
Hint: Use linear regression as the meta-learner for regression and logistic regression as the meta-learner for classification.
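The hint can be sketched with scikit-learn's `StackingRegressor` and `StackingClassifier`, which train the meta-learner on out-of-fold predictions of the base models. Only two base models per task are shown to keep the example short (the assignment asks for all of them), and the data is synthetic.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import (RandomForestClassifier, RandomForestRegressor,
                              StackingClassifier, StackingRegressor)
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

# --- Regression: linear regression as the meta-learner ---
Xr, yr = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(Xr, yr, random_state=0)

stack_reg = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
                ("knn", KNeighborsRegressor())],
    final_estimator=LinearRegression(),
    cv=5,  # base-model predictions fed to the meta-learner are out-of-fold
)
stack_reg.fit(Xr_tr, yr_tr)
reg_r2 = stack_reg.score(Xr_te, yr_te)  # R^2 on the held-out split

# --- Classification: logistic regression as the meta-learner ---
Xc, yc = make_classification(n_samples=300, n_features=8, random_state=0)
Xc_tr, Xc_te, yc_tr, yc_te = train_test_split(Xc, yc, stratify=yc, random_state=0)

stack_clf = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack_clf.fit(Xc_tr, yc_tr)
clf_acc = stack_clf.score(Xc_te, yc_te)  # accuracy on the held-out split

print(f"stacked regression R^2: {reg_r2:.3f}")
print(f"stacked classification accuracy: {clf_acc:.3f}")
```

Further base models are added by extending the `estimators` list with more `(name, model)` pairs.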