Fix RL stochastic optimization problem (currently 11.1) #231

hjsuh94 · 2023-03-27T21:24:20Z

The stochastic approximation has

x <- x - eta * [ l(x + w) - l(x) ] w

but the true gradient should be

x <- x - eta * [ l(x + w) - l(x) ] w / sigma^2

so we seem to be off by a scale factor of 1 / sigma^2.
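A minimal numerical sketch of the scale mismatch, assuming w ~ N(0, sigma^2 I) and a made-up smooth test loss (the `loss`, `grad_loss`, and `zeroth_order_estimates` names are hypothetical and not taken from the exercise). Averaged over many samples, the unscaled term should come out roughly sigma^2 times the gradient, while dividing by sigma^2 should recover the gradient up to sampling noise and a small smoothing bias:

```python
import numpy as np


def loss(x):
    # Hypothetical smooth test loss; NOT the loss used in the exercise.
    return 0.5 * np.sum(x**2, axis=-1) + np.sum(np.sin(x), axis=-1)


def grad_loss(x):
    # Analytic gradient of the test loss above, used only for comparison.
    return x + np.cos(x)


def zeroth_order_estimates(x, sigma, num_samples, rng):
    # Assumption: w is drawn from N(0, sigma^2 I).
    w = sigma * rng.standard_normal((num_samples, x.size))
    diff = loss(x + w) - loss(x)  # shape (num_samples,)
    # Sample mean of [l(x + w) - l(x)] w, as in the current update.
    unscaled = np.mean(diff[:, None] * w, axis=0)
    # The same quantity divided by sigma^2, as in the proposed update.
    return unscaled, unscaled / sigma**2


rng = np.random.default_rng(0)
x = np.array([0.5, -1.0])
sigma = 0.1
unscaled, scaled = zeroth_order_estimates(x, sigma, 200_000, rng)

print("true gradient:    ", grad_loss(x))
print("scaled estimate:  ", scaled)
print("unscaled estimate:", unscaled)
```

With a fixed step size eta, the unscaled update therefore behaves like gradient descent with an effective step of eta * sigma^2.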

This exercise is also a bit misleading, since it gives the impression that we escaped local minima because we used a zeroth-order method, when in fact we could have achieved the same effect with a first-order method while injecting stochasticity.
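To illustrate the point, here is a rough sketch assuming Gaussian noise w ~ N(0, sigma^2) and a made-up 1D loss with many local minima (`loss`, `grad_loss`, and `smoothed_grad` are hypothetical names, not from the exercise). Both the zeroth-order estimator and the first-order estimator with injected noise target the gradient of the same Gaussian-smoothed objective, so either one can point past the local wiggles:

```python
import numpy as np


def loss(x):
    # Hypothetical 1D loss: a quadratic bowl plus high-frequency ripples.
    return x**2 + 0.5 * np.cos(10.0 * x)


def grad_loss(x):
    # Exact gradient of the hypothetical loss.
    return 2.0 * x - 5.0 * np.sin(10.0 * x)


def smoothed_grad(x, sigma):
    # Closed-form gradient of E_w[loss(x + w)] for w ~ N(0, sigma^2),
    # using E[cos(k (x + w))] = cos(k x) * exp(-k^2 sigma^2 / 2).
    return 2.0 * x - 5.0 * np.sin(10.0 * x) * np.exp(-50.0 * sigma**2)


rng = np.random.default_rng(0)
x, sigma, num_samples = np.pi / 4, 0.3, 200_000
w = sigma * rng.standard_normal(num_samples)

# Zeroth-order estimate: uses only loss values.
zeroth_order = np.mean((loss(x + w) - loss(x)) * w) / sigma**2
# First-order estimate: exact gradients evaluated at perturbed points.
first_order = np.mean(grad_loss(x + w))

print("exact gradient at x: ", grad_loss(x))
print("smoothed gradient:   ", smoothed_grad(x, sigma))
print("zeroth-order estimate:", zeroth_order)
print("first-order estimate: ", first_order)
```

At this x the exact gradient points away from the global minimum at 0, while both stochastic estimates agree in sign with the smoothed gradient, which points toward it. The escape from local minima comes from the smoothing, not from using only function values.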

I'd love to reimplement this problem using what we've learned about various gradient estimators over the past year.


RussTedrake self-assigned this Jan 1, 2025

RussTedrake changed the title ~~Fix Exercise 11.1~~ Fix RL stochastic optimization problem (currently 11.1) Jan 17, 2025
