A pan-India data science competition by ZS Associates, hosted on HackerRank. I ranked in the top 40 using a GBM model.

sauravkaushik8/ZS_YoungDataScientistChallange2016


This is a pan-India data science competition by ZS Associates, hosted on HackerRank.

I ranked in the top 40 using a GBM model. (https://drive.google.com/file/d/0ByPBn4rtMQ5HZl9IN1N3MHhHZEU/view?usp=sharing) (https://cloud.githubusercontent.com/assets/10769039/17398289/a8e123b6-5a59-11e6-895e-9e91b5242aa8.png)

The problem statement was to predict, from historical records, which hospitals would purchase the instruments, and also the revenue from each of those hospitals.

The CODE file contains the final, trimmed code in R. The steps followed are:

  1. Loading the following libraries:

         library('dplyr')
         library('randomForest')
         library('caret')
         library('Metrics')
         library('mice')

  2. Imputation of missing values with MICE.

  3. Creation of all possible combinations of Hospital_ID, District_ID and Instrument_ID with the expand.grid command.

  4. Creation of the training data by a left join between the table created in step 3 and the Hospital_Revenue table, which contains every combination of Hospital_ID, District_ID and Instrument_ID that resulted in a purchase. The remaining rows were treated as not bought.

  5. A GBM model was built with Instrument_ID and District_ID as predictors and Buy_or_not as the outcome.

  6. Regression was performed with a gradient boosted model, and an appropriate confidence level was chosen using 30-fold cross-validation on the training data.

  7. Trends in the Hospital Revenue and Projected_Revenue files were captured.

  8. The final solution was saved and submitted in the required format.
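
The expand.grid and left-join steps above can be sketched as follows. This is only a minimal illustration on made-up toy data, not the code from the CODE file: the ID values and the `Revenue` column name are assumptions, and base R's `merge(..., all.x = TRUE)` is used in place of dplyr's `left_join` so that the sketch runs without extra packages.

```r
# Toy stand-in for the Hospital_Revenue table: one row per combination
# that resulted in a purchase. All values and the Revenue column name
# are hypothetical.
hospital_revenue <- data.frame(
  Hospital_ID   = c("H1", "H2"),
  District_ID   = c("D1", "D2"),
  Instrument_ID = c("I1", "I2"),
  Revenue       = c(1200, 800),
  stringsAsFactors = FALSE
)

# Every possible combination of the three IDs (2 x 2 x 2 = 8 rows).
all_combos <- expand.grid(
  Hospital_ID   = c("H1", "H2"),
  District_ID   = c("D1", "D2"),
  Instrument_ID = c("I1", "I2"),
  stringsAsFactors = FALSE
)

# Left join: keep every combination, attach revenue where a purchase exists.
# With dplyr this would be left_join(all_combos, hospital_revenue, by = keys).
keys <- c("Hospital_ID", "District_ID", "Instrument_ID")
training <- merge(all_combos, hospital_revenue, by = keys, all.x = TRUE)

# Combinations with no matching purchase row are labelled as not bought.
training$Buy_or_not <- as.integer(!is.na(training$Revenue))

print(training)
```

Note that a full cross product like this also generates hospital and district pairings that may never occur in the real data, so the negative class can become much larger than the positive one.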
