
Poor performance of HPolyhedron::UniformSample #20036

Closed
markisus opened this issue Aug 21, 2023 · 5 comments · Fixed by #20974
Labels: component: planning and control (Optimization-based planning and control, and search- and sampling-based planning), type: bug

Comments

@markisus

What happened?

The hit-and-run algorithm only converges to the uniform distribution after a warm-up period, a.k.a. the mixing time. The paper "Hit-and-Run from a Corner" by Lovász and Vempala proves a convergence result (Corollary 1.2) that relates the number of steps taken to the closeness to the uniform distribution.

[Image: statement of Corollary 1.2 from Lovász and Vempala's "Hit-and-Run from a Corner".]

As it stands, HPolyhedron::UniformSample performs only a single hit-and-run step from the previous sample to produce the next sample. If my understanding is correct, there needs to be an inner loop that runs many steps in order for the result to be close to uniformly distributed.
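To illustrate (the helper name and num_mixing_steps parameter here are hypothetical, not existing Drake API), the inner loop could look roughly like this, built on top of the current UniformSample:

def mixed_uniform_sample(polyhedron, generator, previous_sample, num_mixing_steps=100):
    # Take several hit-and-run steps internally and only return the final
    # point, so that consecutively *returned* samples are less correlated.
    sample = previous_sample
    for _ in range(num_mixing_steps):
        sample = polyhedron.UniformSample(generator, sample)
    return sample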

This is apparent if you imagine sampling from a very long, very thin rod in a very high ambient dimension. For many samples, hit-and-run will make little progress, since it must get very lucky to sample a direction along the long dimension of the rod. In this case, it's clear that consecutive samples are extremely correlated.

For standalone usage, this is not a big problem, since users can just manually warm up the function. But UniformSample is also used internally in IrisInConfigurationSpace during the counter-example search. For long and thin configuration spaces, I believe this will lead to extremely correlated starting seeds.

Version

No response

What operating system are you using?

No response

What installation option are you using?

No response

Relevant log output

No response

@cohnt
Contributor

cohnt commented Aug 21, 2023

Note that IrisInConfigurationSpace attempts to reuse the previous initialization as the starting point for the hit-and-run sampler on the next iteration, so it does perform some mixing (beyond just a single iteration from the Chebyshev center). However, if the previous initialization becomes infeasible, and can't easily be projected back onto the feasible set, then it does revert to whatever the current Chebyshev center is.

Perhaps an appropriate fix would be to include a num_mixing_iterations parameter in IrisOptions, wherein the "uniform" sampler runs that many hit-and-run iterations before starting to return random samples. (It would only have to do this when it is first started, and when it's restarted due to the initialization becoming infeasible.) Then the user can balance the degree of mixing that is required vs the amount of time required to iterate the sampler.
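Roughly, the sampler wrapper I have in mind would look something like the sketch below (the class and the num_mixing_iterations knob are hypothetical, just to illustrate mixing only on (re)start):

from pydrake.all import HPolyhedron, RandomGenerator

class WarmStartedSampler:
    # Hypothetical wrapper: mixes when (re)initialized, then single-steps.
    def __init__(self, polyhedron, generator, num_mixing_iterations=100):
        self._polyhedron = polyhedron
        self._generator = generator
        self._num_mixing_iterations = num_mixing_iterations
        self.restart()

    def restart(self, initial_sample=None):
        # Start from the given point (e.g. the Chebyshev center) and burn in.
        if initial_sample is None:
            initial_sample = self._polyhedron.ChebyshevCenter()
        sample = initial_sample
        for _ in range(self._num_mixing_iterations):
            sample = self._polyhedron.UniformSample(self._generator, sample)
        self._sample = sample

    def next_sample(self):
        # After the burn-in, each call takes a single hit-and-run step.
        self._sample = self._polyhedron.UniformSample(self._generator, self._sample)
        return self._sample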

Also, have you observed any problems caused by insufficient mixing? If you have any good adversarial examples that you've encountered, it would be helpful to share them, so they could potentially be included in the tests or benchmarking (e.g. #19771).

@markisus
Author

markisus commented Aug 22, 2023

Yes, I am able to construct arbitrarily bad counterexamples by making long, thin rectangles in 2D. The skinnier the rectangle, the poorer the performance. Here is a script that generates one million samples from a 10,000 × 0.0001 rectangle.

from pydrake.all import (
    HPolyhedron,
    RandomGenerator,
)
import numpy as np
from matplotlib import pyplot as plt

# A 2D box that is very long in x and very thin in y.
box_length = 1e4
box_height = 1e-4
long_thin_box = HPolyhedron.MakeBox(
    np.array([0.0, 0.0]),
    np.array([box_length, box_height]))

generator = RandomGenerator()

# Draw one million samples, feeding each sample back in as the
# starting point for the next hit-and-run step.
num_samples = 1000000
sample = long_thin_box.UniformSample(generator)
samples = []
for _ in range(num_samples):
    sample = long_thin_box.UniformSample(generator, sample)
    samples.append(sample)

samples = np.array(samples)
xs = samples[:, 0]
ys = samples[:, 1]

# Plot the x coordinate of each sample against its iteration number.
plt.title("HPolyhedron UniformSample on a Long Thin Rectangle")
plt.plot(xs)
plt.xlabel("Iteration Number")
plt.ylabel("Sample X Coordinate")
plt.show()

You can see the poor performance by looking at the x-coordinates of the samples.
[Plot: sample x coordinate vs. iteration number for the one million samples.]

A performant implementation should see the x coordinates ranging from 0 to 10,000. The Drake implementation just barely makes it up to 80.

You can also see that there is high correlation between consecutive samples, no matter how long the sampling has run. You said "It would only have to do this when it is first started, and when it's restarted due to the initialization becoming infeasible." Unfortunately, I think this adversarial example illustrates why that wouldn't work. The mixing must occur between every call to UniformSample.

For IrisInConfigurationSpace, I believe this would be okay, since generating the initial guess takes trivial time compared with solving the subsequent optimization problem.

As for a real-world example, I encountered a case like this with a robot that had to slide itself into a thin corridor.

@cohnt
Contributor

cohnt commented Aug 22, 2023

For IrisInConfigurationSpace, I believe this would be okay, since generating the initial guess takes trivial time compared with solving the subsequent optimization problem.

The time required for this strategy to actually reach a uniform distribution probably isn't trivial. Drawing 1e6 samples takes ~0.8 seconds on my machine and still doesn't achieve much mixing. Even 1e8 samples only covered about a quarter of the space (see below), and that takes ~80 seconds, which is clearly too much per iteration. So I don't think just cranking up the amount of mixing is going to solve the problem you're facing.

[Plot: x coordinates of 1e8 samples; they cover only about a quarter of the box length.]

Note that your runtime is also going to grow a lot if you're facing larger problems. In a 14-dimensional space with 448 hyperplanes, drawing 1e7 samples took ~36 seconds and achieved very little mixing. Getting nice, uniform samples from an HPolyhedron is hard, especially in these adversarial cases. Rejection sampling isn't going to work either, unless you have a narrow outer approximation. (And realistically, you don't have one of those unless it's axis-aligned.) Initializing samples at the vertices might help if you have them, but converting from HPolyhedron to VPolytope requires exponential time.
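For completeness, here is roughly what the rejection-sampling alternative looks like, assuming you first compute an axis-aligned bounding box with one LP per coordinate direction (scipy is used here just for illustration; nothing below is existing Drake API beyond A(), b(), and PointInSet):

import numpy as np
from scipy.optimize import linprog

def axis_aligned_bounding_box(polyhedron):
    # Solve 2 * dim LPs: minimize and maximize x_i subject to A x <= b.
    A, b = polyhedron.A(), polyhedron.b()
    dim = A.shape[1]
    lower, upper = np.empty(dim), np.empty(dim)
    for i in range(dim):
        c = np.zeros(dim)
        c[i] = 1.0
        lower[i] = linprog(c, A_ub=A, b_ub=b, bounds=(None, None)).x[i]
        upper[i] = linprog(-c, A_ub=A, b_ub=b, bounds=(None, None)).x[i]
    return lower, upper

def rejection_sample(polyhedron, rng, lower, upper):
    # Draw from the bounding box until the point lands inside the polyhedron.
    while True:
        candidate = rng.uniform(lower, upper)
        if polyhedron.PointInSet(candidate):
            return candidate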

However, IRIS-NP doesn't truly need uniform samples to succeed, because these are just the initial guesses for a nonlinear program. As long as you're getting samples in the basin of a feasible local optimum, the solver should be able to find its way. In the real-world example you're looking at, what is the failure case that you're observing? Is IRIS failing to find hyperplanes that separate you from an obstacle? Are you able to share the real-world example?

@jwnimmer-tri jwnimmer-tri added the component: planning and control Optimization-based planning and control, and search- and sampling-based planning label Aug 22, 2023
@markisus
Author

markisus commented Aug 22, 2023

I agree that cranking up the mixing does not help much for this particular example, which is pretty pathological. The Lovász paper shows that you can get good mixing when the number of steps scales as $R \ln(R) / r$, where $R$ is the radius of the smallest ball containing the set and $r$ is the radius of the largest ball inside the set. Allowing the user to specify a num_mixing_iterations could improve the sampling for real-world sets, where the inner and outer balls are of similar size.
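Taking that scaling at face value and plugging in the long, thin box from my script above shows why no reasonable amount of mixing rescues the pathological case (just a back-of-the-envelope estimate):

import numpy as np

box_length, box_height = 1e4, 1e-4
R = 0.5 * np.hypot(box_length, box_height)  # radius of the smallest ball containing the box
r = 0.5 * box_height                        # radius of the largest ball inside the box
print(R * np.log(R) / r)  # ~8.5e8 hit-and-run steps, per the R ln(R) / r scaling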

Unfortunately, I can't share my real-world example, but I'm seeing these kinds of counter-examples from IrisInConfigurationSpace when my robot is inside a passageway. In fact, any robot where one joint's range is numerically much smaller than the others' will exhibit this problem. The HPolyhedron box associated with the joint limits will be very flat, and hit-and-run is very likely to collide with the boundary of the small joint's limits many times before making substantial progress in the other joints. You might see this if you have, say, a tweezer on your end effector that can only open 0.001 meters, while your other joints can rotate through $2\pi$.

I agree that IRIS-NP doesn't need uniform samples, but it should at least take care not to try initial guesses that are close to the previous guesses, since they will be attracted to the same local optimum. Unfortunately, the hit-and-run sampling method, without mixing, is likely to return guesses that are very close to one another near thin regions and corners of the set.

Now that I think of it, you might be able to do a good counter-example search with an inner IRIS-like optimization algorithm. To find the next guess, you might try to find the largest ellipse within your current set which avoids all previous guesses. The new guess could be the center of the ellipse.
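A cheaper, discrete variant of that idea is to draw a pool of candidate samples and then greedily pick each new seed to be as far as possible from the seeds already used; a sketch (not anything Drake currently does):

import numpy as np

def spread_out_seeds(candidates, num_seeds):
    # Greedy farthest-point selection: each new seed maximizes its minimum
    # distance to the seeds chosen so far.
    candidates = np.asarray(candidates)
    chosen = [0]  # start from an arbitrary candidate
    min_dist = np.linalg.norm(candidates - candidates[0], axis=1)
    for _ in range(num_seeds - 1):
        next_index = int(np.argmax(min_dist))
        chosen.append(next_index)
        new_dist = np.linalg.norm(candidates - candidates[next_index], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return candidates[chosen]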

@RussTedrake
Contributor

For adversarial examples like this, you might also want to consider using C-IRIS (instead of IRIS-NP). It was not yet available when I did the first draft of the IrisBuilder notebook, but it is available now; see CspaceFreePolytope. I intend to make more examples that bring these together soon.

RussTedrake added a commit to RussTedrake/drake that referenced this issue Feb 19, 2024
RussTedrake added a commit to RussTedrake/drake that referenced this issue Feb 19, 2024
RussTedrake added a commit to RussTedrake/drake that referenced this issue Feb 21, 2024
ggould-tri pushed a commit that referenced this issue Feb 21, 2024
jwnimmer-tri pushed a commit to jwnimmer-tri/drake that referenced this issue Apr 1, 2024
RussTedrake added a commit to RussTedrake/drake that referenced this issue Dec 15, 2024