This leverages some pretty cool functionality the PyTorch team is playing around with in https://github.com/pytorch/functorch to let us replace our bespoke derivative kernel implementations with a single, fully autograd-based one. You can use it like this:
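Roughly something like the following (a sketch only; the import path for `DerivativeKernel` and the exact block layout are placeholders, assuming it mirrors the existing `*KernelGrad` kernels):

```python
import torch
from gpytorch.kernels import RBFKernel

# Import path is a sketch -- DerivativeKernel is the wrapper added in this PR.
from gpytorch.kernels import DerivativeKernel

x1 = torch.randn(10, 2)
x2 = torch.randn(8, 2)

# Wrap any differentiable base kernel; the value, gradient, and Hessian blocks
# all come out of autograd instead of a hand-derived kernel implementation.
base_kernel = RBFKernel(ard_num_dims=2)
deriv_kernel = DerivativeKernel(base_kernel)

# If this follows the layout of the existing *KernelGrad kernels, the covariance
# is (d + 1) times larger per input point than the base kernel's, i.e. 30 x 24 here.
covar = deriv_kernel(x1, x2)
print(covar.shape)
```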
The only change necessary outside of actually implementing the kernel was that `vmap` currently can't batch over `torch.equal`, so I made `x1_eq_x2` an argument we can specify / set in `__call__` to bypass the comparison here: `gpytorch/gpytorch/kernels/kernel.py`, line 300 at fc2053b.
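In practice that just means stating the symmetry explicitly when calling the kernel, along these lines (sketch; only the `x1_eq_x2` keyword is new here, and the import path is a placeholder as above):

```python
import torch
from gpytorch.kernels import RBFKernel
from gpytorch.kernels import DerivativeKernel  # import path sketched as above

deriv_kernel = DerivativeKernel(RBFKernel(ard_num_dims=2))

x_train = torch.randn(10, 2)
x_test = torch.randn(5, 2)

# vmap has no batching rule for the bool-returning torch.equal, so instead of the
# kernel inferring whether x1 and x2 are the same inputs, the caller says so:
train_train = deriv_kernel(x_train, x_train, x1_eq_x2=True)
test_train = deriv_kernel(x_test, x_train, x1_eq_x2=False)
```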
## Problems
- For some reason, the Hessian block of `DerivativeKernel(MaternKernel(nu=2.5))` specifically is just the negative of what it should be. This is the only kernel this happens for as far as I can tell. I have no idea why, but it's annoying, since Matern would obviously be a fantastic kernel to get a derivative version of "for free." I suspect this has to do with the non-squared distance computations here?
- If you wrap a non-differentiable kernel in `DerivativeKernel`, it will still return a matrix, but a super non-PD one. I don't think there's a good solution, but it's problematic: if we're not using Cholesky (likely with default settings, since derivative kernel matrices get really big really fast), I'm not sure we'd fail loudly anywhere along the way (see the sketch after this list).
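In the meantime, a loud failure could be bolted on by the caller with an explicit eigenvalue check, something like this sketch (not part of this PR; kernel and import path are placeholders as above):

```python
import torch
from gpytorch.kernels import RBFKernel  # stand-in; imagine a non-differentiable kernel here
from gpytorch.kernels import DerivativeKernel  # import path sketched as above

x = torch.randn(20, 2)
deriv_kernel = DerivativeKernel(RBFKernel(ard_num_dims=2))

# Materialize the train/train block and check that it is at least numerically PSD
# before it silently flows into an iterative (non-Cholesky) solve.
covar = deriv_kernel(x, x, x1_eq_x2=True).evaluate()
min_eig = torch.linalg.eigvalsh(covar).min().item()
if min_eig < -1e-6:
    raise RuntimeError(f"derivative kernel matrix is not PSD (min eigenvalue {min_eig:.3e})")
```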