Bayesian Optimization using xgboost and sklearn API
Simple test scripts for finding optimal xgboost hyperparameters using Bayesian optimization.
The original Bayesian optimization code is from https://github.com/fmfn/BayesianOptimization, and all credit for that work goes to its original author.
To run the examples below you will need to install this package (it is under constant development):
pip install git+https://github.com/fmfn/BayesianOptimization.git
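With the package installed, the general idea can be sketched as follows. This is a minimal illustration rather than the code of the example scripts: the helper names (to_xgb_params, run_search), the parameter bounds, the number of trees, and the choice of negative log loss with 3-fold cross-validation are all assumptions made for the sketch. It relies on the BayesianOptimization class with its maximize method and max property, and on xgboost's sklearn-style XGBClassifier.

```python
def to_xgb_params(max_depth, learning_rate, subsample, colsample_bytree):
    """Convert the optimizer's float suggestions into valid xgboost arguments."""
    return {
        "max_depth": int(round(max_depth)),  # tree depth must be an integer
        "learning_rate": float(learning_rate),
        "subsample": float(subsample),
        "colsample_bytree": float(colsample_bytree),
    }


def run_search(X, y, init_points=3, n_iter=5, seed=0):
    """Search xgboost hyperparameters with Bayesian optimization (sketch)."""
    # Imported here so the helper above can be used without these packages.
    from bayes_opt import BayesianOptimization
    from sklearn.model_selection import cross_val_score
    from xgboost import XGBClassifier

    def objective(max_depth, learning_rate, subsample, colsample_bytree):
        params = to_xgb_params(max_depth, learning_rate, subsample, colsample_bytree)
        model = XGBClassifier(n_estimators=50, **params)
        # neg_log_loss is "higher is better", which matches a maximizer.
        scores = cross_val_score(model, X, y, scoring="neg_log_loss", cv=3)
        return scores.mean()

    # Illustrative search ranges (assumed, not taken from the example scripts).
    pbounds = {
        "max_depth": (3, 10),
        "learning_rate": (0.01, 0.3),
        "subsample": (0.5, 1.0),
        "colsample_bytree": (0.5, 1.0),
    }

    optimizer = BayesianOptimization(f=objective, pbounds=pbounds, random_state=seed)
    optimizer.maximize(init_points=init_points, n_iter=n_iter)
    # optimizer.max is a dict with the best "target" and its "params".
    return optimizer.max
```

Calling run_search(X, y) on a feature matrix and label vector returns the best score found together with the raw parameter values; passing those values through to_xgb_params gives arguments that can be handed to XGBClassifier directly.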
Example 1 is based on the Otto dataset from Kaggle, which is held in memory (https://www.kaggle.com/c/otto-group-product-classification-challenge).
Example 2 is based on the Avazu click prediction dataset from Kaggle and requires the 'distributed' version of xgboost (https://www.kaggle.com/c/avazu-ctr-prediction).
To get the examples running, create data/otto and data/avazu directories, download each dataset into its respective directory, and unzip / untar the files.
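The directory layout described above can be created with a few lines of Python. This is only a convenience sketch: the function name make_data_dirs and its root argument are hypothetical, and downloading and unpacking the Kaggle files still has to be done separately.

```python
from pathlib import Path


def make_data_dirs(root="."):
    """Create data/otto and data/avazu under root and return their paths."""
    paths = [Path(root) / "data" / name for name in ("otto", "avazu")]
    for path in paths:
        # parents=True creates "data" as well; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

Running make_data_dirs() from the repository root leaves any existing directories and their contents untouched.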
Dependencies:
- BayesianOptimization (https://github.com/fmfn/BayesianOptimization)
- SciPy
- NumPy
- scikit-learn
- xgboost (https://github.com/dmlc/xgboost)
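A quick way to confirm that the dependencies listed above are available is to check whether each one can be located by the import system. The sketch below uses only the standard library; the import names in REQUIRED (for example bayes_opt for BayesianOptimization and sklearn for Scikit-Learn) are the names these projects are normally imported under, and the function name missing_dependencies is hypothetical.

```python
from importlib.util import find_spec

# Display name -> import name (assumed to match the installed packages).
REQUIRED = {
    "BayesianOptimization": "bayes_opt",
    "Scipy": "scipy",
    "Numpy": "numpy",
    "Scikit-Learn": "sklearn",
    "xgboost": "xgboost",
}


def missing_dependencies(modules=REQUIRED):
    """Return the display names of packages that cannot be found."""
    return [name for name, module in modules.items() if find_spec(module) is None]
```

An empty list from missing_dependencies() means every package was found; any names returned still need to be installed.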
References: