Infinite loops in BatchGradientDescent with flat constraints #1508
I think your princess is in another castle. Line 353 is not inside a while loop; it is inside a for loop over a fixed-length list of step sizes: https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/optimization/batch_gradient_descent.py#L347 So if you're getting an infinite step size, it is not produced by that particular loop, or it's because someone put an infinite step size into the list. It is not a result of how ties are broken within that loop.
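To make the point concrete, here is a minimal sketch (not pylearn2's actual code; `line_search` and its arguments are hypothetical names) of a line search over a fixed-length candidate list. By itself it always terminates after `len(alpha_list)` iterations; what the `<=` tie-break does is make the last, largest candidate win on a flat objective:

```python
def line_search(obj_fn, alpha_list):
    """Evaluate each candidate step size once; return the best (alpha, index)."""
    best_obj = float('inf')
    best_alpha = None
    best_ind = None
    for ind, alpha in enumerate(alpha_list):
        obj = obj_fn(alpha)
        # '<=' means the larger step size wins ties
        if obj <= best_obj:
            best_obj, best_alpha, best_ind = obj, alpha, ind
    return best_alpha, best_ind

# On a perfectly flat objective the last (largest) alpha always "wins":
alpha, ind = line_search(lambda a: 0.0, [0.1, 0.2, 0.4])
assert alpha == 0.4 and ind == 2
```

The loop itself is finite; the infinite behavior has to come from the code that rebuilds `alpha_list` between calls, as discussed below.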
The problem comes from a side effect of the condition at line 347:

```python
for ind, alpha in enumerate(alpha_list):
    self._goto_alpha(alpha)
    obj = self.obj(*inputs)
    if self.verbose:
        logger.info('\t{0} {1}'.format(alpha, obj))
    # Use <= rather than = so if there are ties
    # the bigger step size wins
    if obj <= best_obj:
        best_obj = obj
        best_alpha = alpha
        best_alpha_ind = ind
```

which allows the step size to keep growing even when there is no improvement, combined with the condition at line 377:

```python
elif best_alpha_ind > len(alpha_list) - 2:
    alpha_list = [alpha * 2. for alpha in alpha_list]
    if self.verbose:
        logger.info('growing the step size')
```

which says that whenever the last step size in the list was selected, every step size in the list should be multiplied by two. On a flat function, nothing ever stops the algorithm. I see at least two solutions:
Hello,

I'm encountering an issue using `BatchGradientDescent` with `param_constrainers`. Here is an example of the issue:

This leads to an infinite loop. Examining the output obtained with `verbose=1` shows that the step size becomes infinite.

In the file batch_gradient_descent.py, at line 353, one reads:

```python
# Use <= rather than = so if there are ties
# the bigger step size wins
if obj <= best_obj:
```

I guess the problem comes from the `<=` instead of `<`. The comment claims it is better to take the larger step size when ties occur, but in this case every step size above some value produces a tie.

A simple correction would be to replace `<=` with `<`, but since there is a deliberate comment about this point, I am wondering whether there is a strong reason not to do so.
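For what it's worth, the proposed change can be checked against the same stripped-down imitation used above (again a hypothetical sketch, not pylearn2's code): with strict `<`, a tie no longer counts as an improvement, so on a flat objective the first candidate keeps winning and the list is never doubled.

```python
alpha_list = [0.1, 0.2, 0.4]
for _ in range(10):  # stand-in for the outer optimization loop
    best_obj, best_ind = float('inf'), None
    for ind, alpha in enumerate(alpha_list):
        obj = 0.0  # flat objective
        if obj < best_obj:  # strict comparison: ties do not switch the winner
            best_obj, best_ind = obj, ind
    if best_ind > len(alpha_list) - 2:  # never fires: the first entry wins
        alpha_list = [a * 2. for a in alpha_list]

print(alpha_list)  # unchanged: [0.1, 0.2, 0.4]
```

The trade-off is the one the original comment hints at: with `<=`, equally good steps resolve to the largest, which helps traverse plateaus that do eventually end; with `<`, the optimizer stops growing the step on any plateau, including ones it could have escaped.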