GP-UCB

A simple implementation of GP-UCB [1]. Its main purpose is to give an overview of the dynamics of the GP-UCB algorithm.

Usage

Please refer to the bottom of gpucb.py for sample settings. First, define the search grid and an environment that exposes a sample() method.

import numpy as np

# Candidate points: a 24 x 24 grid over [-3, 3) in both dimensions
x = np.arange(-3, 3, 0.25)
y = np.arange(-3, 3, 0.25)
grid = np.meshgrid(x, y)

class DummyEnvironment(object):
    def sample(self, x):
        return np.sin(x[0]) + np.cos(x[1])
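
GP-UCB is formulated for the bandit setting, where observations are typically noisy. As a minimal sketch, the environment above could be given Gaussian observation noise as follows. The class name `NoisyEnvironment` and its `noise_std` and `seed` parameters are hypothetical and not part of this repository; the sketch only assumes that the agent needs nothing more than a `sample(x)` method.

```python
import numpy as np


class NoisyEnvironment(object):
    """Hypothetical variant of DummyEnvironment with Gaussian observation noise."""

    def __init__(self, noise_std=0.1, seed=0):
        self.noise_std = noise_std
        # Seeded generator so repeated runs produce the same noise sequence
        self.rng = np.random.default_rng(seed)

    def sample(self, x):
        # Same underlying function as DummyEnvironment, plus zero-mean noise
        return np.sin(x[0]) + np.cos(x[1]) + self.rng.normal(0.0, self.noise_std)
```

Setting `noise_std=0.0` recovers the deterministic behaviour of `DummyEnvironment`.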

Create a GPUCB instance by passing the search grid and the environment instance.

env = DummyEnvironment()
agent = GPUCB(grid, env)

Call the learn() method repeatedly to refine the estimate of the objective. If the search space is 2D, you can also call the plot() method to visualize the current state.

for i in range(20):
    agent.learn()
    agent.plot()
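
To make the dynamics of the loop above concrete, here is a self-contained sketch of what one GP-UCB iteration generally involves: fit a Gaussian process to the points sampled so far, then sample next at the candidate that maximizes the upper confidence bound `mu + sqrt(beta) * sigma`. This is not the code in gpucb.py. The helper names (`rbf_kernel`, `gp_posterior`, `gp_ucb_step`), the squared-exponential kernel with unit length scale, the noise level, and the fixed `beta=4.0` are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior(X_obs, y_obs, X_cand, noise=1e-3):
    """Posterior mean and standard deviation of a zero-mean GP at X_cand."""
    K = rbf_kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
    K_s = rbf_kernel(X_obs, X_cand)
    # Solve K^{-1} [y, K_s] in one call instead of forming the inverse
    sol = np.linalg.solve(K, np.column_stack([y_obs, K_s]))
    mu = K_s.T @ sol[:, 0]
    var = 1.0 - np.sum(K_s * sol[:, 1:], axis=0)  # prior variance is 1
    return mu, np.sqrt(np.maximum(var, 0.0))


def gp_ucb_step(X_obs, y_obs, X_cand, beta=4.0):
    """Return the index of the candidate with the highest upper confidence bound."""
    mu, sigma = gp_posterior(X_obs, y_obs, X_cand)
    return int(np.argmax(mu + np.sqrt(beta) * sigma))


def objective(p):
    # Same function as DummyEnvironment.sample
    return np.sin(p[0]) + np.cos(p[1])


# Flatten the meshgrid into a (576, 2) array of candidate points
x = np.arange(-3, 3, 0.25)
y = np.arange(-3, 3, 0.25)
X_cand = np.array(np.meshgrid(x, y)).reshape(2, -1).T

# Start from one arbitrary observation, then run 20 GP-UCB iterations
X_obs = X_cand[:1]
y_obs = np.array([objective(X_cand[0])])
for _ in range(20):
    idx = gp_ucb_step(X_obs, y_obs, X_cand)
    X_obs = np.vstack([X_obs, X_cand[idx]])
    y_obs = np.append(y_obs, objective(X_cand[idx]))

best_point = X_obs[np.argmax(y_obs)]
```

A larger `beta` weights the uncertainty term more heavily and so favours exploration; a smaller one favours points where the posterior mean is already high. In [1], beta follows a schedule that grows with the iteration count, which this sketch replaces with a constant for brevity.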

[1] Srinivas, Niranjan, et al. "Gaussian process optimization in the bandit setting: No regret and experimental design." arXiv preprint arXiv:0912.3995 (2009).
