A simple implementation of GP-UCB [1]. Its main purpose is to give an overview of the dynamics of the GP-UCB algorithm.
Please refer to the bottom of gpucb.py for sample settings.
- Set up a meshgrid which defines the search space.
    import numpy as np

    x = np.arange(-3, 3, 0.25)
    y = np.arange(-3, 3, 0.25)
    grid = np.meshgrid(x, y)
- Set up an environment class that provides a sample() method.
    class DummyEnvironment(object):
        def sample(self, x):
            return np.sin(x[0]) + np.cos(x[1])
- Create a GPUCB instance by passing the search grid and the environment instance.
    env = DummyEnvironment()
    agent = GPUCB(grid, env)
- Call the learn() method repeatedly to obtain the estimated curve. If the search space is 2D, you can also use the plot() method to visualize the current state.
    for i in range(20):
        agent.learn()
        agent.plot()
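
The steps above call learn() without showing what a single GP-UCB round does. The following is a minimal sketch of the idea, not the code in gpucb.py. It assumes a zero-mean Gaussian process with a unit-variance RBF kernel and a constant exploration weight `beta` (the paper [1] uses a schedule that grows with the round number). The helper names `rbf_kernel`, `gp_posterior`, and `gp_ucb` are hypothetical and chosen only for illustration. Each round picks the grid point that maximizes `mu + sqrt(beta) * sigma`, samples the objective there, and refits the posterior.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at X_test."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance k(x, x) is 1 for this kernel
    return mu, np.sqrt(np.maximum(var, 0.0))


def gp_ucb(candidates, sample, n_iter=20, beta=100.0):
    """Run n_iter GP-UCB rounds; return the queried points and observations."""
    mu = np.zeros(len(candidates))    # prior mean
    sigma = np.ones(len(candidates))  # prior standard deviation
    X, y = [], []
    for _ in range(n_iter):
        # Upper confidence bound: favor high estimates and high uncertainty.
        idx = int(np.argmax(mu + np.sqrt(beta) * sigma))
        X.append(candidates[idx])
        y.append(sample(candidates[idx]))
        mu, sigma = gp_posterior(np.array(X), np.array(y), candidates)
    return np.array(X), np.array(y)


# Same search space and objective as in the steps above.
xs = np.arange(-3, 3, 0.25)
ys = np.arange(-3, 3, 0.25)
grid = np.meshgrid(xs, ys)
candidates = np.column_stack([g.ravel() for g in grid])  # one row per grid point

X, y = gp_ucb(candidates, lambda p: np.sin(p[0]) + np.cos(p[1]))
best = X[np.argmax(y)]  # best point observed so far
```

A large `beta` makes the agent explore uncertain regions for longer, while a small one makes it commit early to the current best estimate.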
[1] Srinivas, Niranjan, et al. "Gaussian process optimization in the bandit setting: No regret and experimental design." arXiv preprint arXiv:0912.3995 (2009).