Currently, the proxy is set as an attribute of the environments and the base environment implements the methods proxy2reward() and reward2proxy() that determine the conversion between proxy outputs and reward. The environment also implements the methods reward() and reward_batch(), which call the proxy and the conversion methods. This is probably not ideal for various reasons.
I no longer see a good reason to keep the proxy and these methods within the environment. It seems both possible and desirable to detach the environment from the proxy completely. Some proxies need information from the environment, which is currently provided via the call to Env.setup_proxy(), which in turn calls the proxy's setup() method. But this could just be done elsewhere.
Now, in terms of alternatives, I am not completely settled on what the best option would be. In particular, where should the methods that convert between proxy and reward go?
In the (base) proxy?
In the GFlowNet agent?
It seems that it would be better to have the methods in the base proxy class, so as to make it easier for non-GFN baselines to re-use the GFlowNet code.
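To make the base-proxy option concrete, here is a minimal sketch of what it could look like. It is not the repository's actual code: the class names `Proxy` and `SumProxy`, the `beta` parameter, the `rewards()` helper and the exponential transform are all hypothetical and chosen only for illustration. Only the method names proxy2reward(), reward2proxy() and setup() come from the discussion above.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Optional, Sequence


class Proxy(ABC):
    """Hypothetical base proxy that owns the proxy <-> reward conversion."""

    def __init__(self, beta: float = 1.0):
        # Assumption: a single scale parameter controls the transform.
        self.beta = beta

    def setup(self, env: Optional[object] = None) -> None:
        """Hook for proxies that need information from the environment.

        Assumption: called by whoever builds the agent, not by the env.
        """

    @abstractmethod
    def __call__(self, states: Sequence[Sequence[float]]) -> List[float]:
        """Return the raw proxy values (for example, energies) for a batch."""

    def proxy2reward(self, proxy_values: Sequence[float]) -> List[float]:
        # Assumed transform: lower proxy values give higher rewards.
        return [math.exp(-self.beta * v) for v in proxy_values]

    def reward2proxy(self, rewards: Sequence[float]) -> List[float]:
        # Inverse of the assumed transform above.
        return [-math.log(r) / self.beta for r in rewards]

    def rewards(self, states: Sequence[Sequence[float]]) -> List[float]:
        # Takes over the role of the environment's reward_batch().
        return self.proxy2reward(self(states))


class SumProxy(Proxy):
    """Toy proxy: the proxy value of a state is the sum of its entries."""

    def __call__(self, states: Sequence[Sequence[float]]) -> List[float]:
        return [float(sum(s)) for s in states]


proxy = SumProxy(beta=0.5)
proxy.setup()  # no environment information needed for this toy proxy
states = [[0, 1], [2, 2]]
rewards = proxy.rewards(states)
recovered = proxy.reward2proxy(rewards)
```

With this layout, a non-GFN baseline would only need a proxy instance to compute rewards, and the environment would not need to know about the proxy at all.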