general value/q network for multi-dimension reward #55

PaParaZz1 · 2021-09-13T10:41:42Z

PaParaZz1
Sep 13, 2021
Maintainer

Motivation

Multi-dimension rewards arise in both environments and algorithms:

Complex environments such as autonomous driving, where speed, stability, collisions, and other factors each contribute a separate reward term.
Algorithm designs such as intrinsic rewards in exploration-related algorithms.

However, using multiple value/Q networks to learn the different rewards currently requires non-trivial modifications to the DI-engine policies, so we need a general design, and we should validate whether it is actually necessary for performance.
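To make the idea concrete, here is a minimal PyTorch sketch of one possible design: a Q-network with one output head per reward dimension, trained with a per-dimension one-step TD loss. Everything in it is an assumption for illustration and not part of the current DI-engine API: the names `MultiHeadQNetwork` and `multi_reward_td_loss`, the shared encoder, and the choice to pick the greedy next action from a fixed weighted sum of the heads.

```python
import torch
import torch.nn as nn


class MultiHeadQNetwork(nn.Module):
    """Hypothetical Q-network with one output head per reward dimension."""

    def __init__(self, obs_dim: int, action_dim: int, reward_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.action_dim = action_dim
        self.reward_dim = reward_dim
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        # A single linear layer emits all heads at once; forward() reshapes it.
        self.heads = nn.Linear(hidden_dim, reward_dim * action_dim)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Output shape: (batch, reward_dim, action_dim).
        q = self.heads(self.encoder(obs))
        return q.view(-1, self.reward_dim, self.action_dim)


def multi_reward_td_loss(q_net, target_net, obs, action, reward, next_obs, done, weights, gamma=0.99):
    """One-step TD loss averaged over reward dimensions.

    reward: (batch, reward_dim), done: (batch,) as floats,
    weights: (reward_dim,) used only to scalarize Q for greedy action selection.
    """
    q = q_net(obs)  # (B, R, A)
    idx = action.view(-1, 1, 1).expand(-1, q.shape[1], 1)
    q_taken = q.gather(2, idx).squeeze(2)  # (B, R)
    with torch.no_grad():
        next_q = target_net(next_obs)  # (B, R, A)
        # Assumption: the behaviour policy is greedy w.r.t. a weighted sum of heads.
        scalar_q = (next_q * weights.view(1, -1, 1)).sum(dim=1)  # (B, A)
        next_action = scalar_q.argmax(dim=1)  # (B,)
        nidx = next_action.view(-1, 1, 1).expand(-1, next_q.shape[1], 1)
        next_q_taken = next_q.gather(2, nidx).squeeze(2)  # (B, R)
        target = reward + gamma * (1.0 - done.view(-1, 1)) * next_q_taken
    return ((q_taken - target) ** 2).mean()


# Usage with random data: 4-dim observations, 3 actions, 2 reward dimensions.
torch.manual_seed(0)
q_net = MultiHeadQNetwork(obs_dim=4, action_dim=3, reward_dim=2)
target_net = MultiHeadQNetwork(obs_dim=4, action_dim=3, reward_dim=2)
target_net.load_state_dict(q_net.state_dict())
batch = 8
loss = multi_reward_td_loss(
    q_net,
    target_net,
    obs=torch.randn(batch, 4),
    action=torch.randint(0, 3, (batch,)),
    reward=torch.randn(batch, 2),
    next_obs=torch.randn(batch, 4),
    done=torch.zeros(batch),
    weights=torch.tensor([1.0, 0.5]),
)
loss.backward()
```

A baseline for the performance validation mentioned above could be the same network with `reward_dim=1` trained on the weighted sum of the rewards, which would show whether separate heads help at all.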

Plan

TODO
