I am curious why, in the regress_meta function of the APS agent, reward is the first argument to lstsq rather than rep. According to the torch documentation, lstsq finds the X that minimizes ||AX - B||_F, with A being the first argument and B the second. Since we are trying to find w such that ||rep * w - reward|| is minimized, shouldn't A be rep instead?

The line in question:

task = torch.linalg.lstsq(reward, rep)[0][:rep.size(1), :][0]
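
To make the question concrete, here is a minimal sketch on synthetic data that compares the two argument orders. The sizes N and d, the tensor w_true, and the noise-free reward are assumptions made up for illustration; they are not taken from the APS code, and the sketch says nothing about what regress_meta intends.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes: N samples, d-dimensional representation.
N, d = 64, 5
rep = torch.randn(N, d, dtype=torch.float64)     # plays the role of A, shape (N, d)
w_true = torch.randn(d, 1, dtype=torch.float64)  # weights we hope to recover
reward = rep @ w_true                            # plays the role of B, shape (N, 1)

# A = rep, B = reward: minimizes ||rep @ w - reward||_F over w.
w_hat = torch.linalg.lstsq(rep, reward).solution      # shape (d, 1)

# Swapped order: minimizes ||reward @ X - rep||_F over X instead.
x_swapped = torch.linalg.lstsq(reward, rep).solution  # shape (1, d)

# True when the first ordering recovers the synthetic weights.
recovered = torch.allclose(w_hat, w_true, atol=1e-8)
print(recovered, tuple(w_hat.shape), tuple(x_swapped.shape))
```

On this toy data, the first call solves the problem written above, while the swapped call regresses rep on reward and returns a (1, d) row, which is a different least-squares problem.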