
feature(xyy): add HPT model to implement PolicyStem+DuelingHead #841

Open: luodi-7 wants to merge 6 commits into base: main

Conversation


@luodi-7 commented on Nov 27, 2024

Description

Here are some TensorBoard plots from the lunarlander_hpt_example.py run: hpt_episode_return, hpt_train_q_value, hpt_target_q_value, hpt_train_total_loss (images omitted).

Related Issue

TODO

Check List

  • merge the latest version of the source branch/repo and resolve all conflicts
  • pass style check
  • pass all the tests

@PaParaZz1 added the algo (Add new algorithm or improve old one) label on Nov 28, 2024
@@ -24,6 +24,8 @@
from .vae import VanillaVAE
from .decision_transformer import DecisionTransformer
from .procedure_cloning import ProcedureCloningMCTS, ProcedureCloningBFS
from .hpt import HPT

Collaborator review comment:
optimize import order
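If alphabetical ordering within this __init__.py is what the reviewer means (an assumption), the touched block could simply be sorted by module name:

from .decision_transformer import DecisionTransformer
from .hpt import HPT
from .procedure_cloning import ProcedureCloningMCTS, ProcedureCloningBFS
from .vae import VanillaVAE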



class PolicyStem(nn.Module):
"""policy stem

@puyuan1996 (Collaborator) commented on Nov 29, 2024:
reformat the docstring as the DI-engine style
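A sketch of what a DI-engine-style docstring could look like here, using the Overview/Interfaces sections seen in other templates; the description of PolicyStem (encoding observations into latent tokens for the head) and the listed interfaces are assumptions to be matched against the actual implementation:

class PolicyStem(nn.Module):
    """
    Overview:
        Policy stem that encodes raw observations into a fixed number of latent tokens, \
        which are then consumed by the policy head (e.g. DuelingHead).
    Interfaces:
        ``__init__``, ``forward``
    """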

@@ -32,8 +34,13 @@ def main():

set_pkg_seed(cfg.seed, use_cuda=cfg.policy.cuda)

model = DQN(**cfg.policy.model)
# # Migrating models to the GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Collaborator review comment:
format comments
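A possible cleanup of the doubled comment marker, keeping the device-selection logic unchanged (this lives inside the example script, which already imports torch):

# Move the model to the GPU if one is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")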

create_config = lunarlander_hpt_create_config

if __name__ == "__main__":
# or you can enter `ding -m serial -c lunarlander_dqn_config.py -s 0`

Collaborator review comment:
change the comments
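The quoted comment still points at lunarlander_dqn_config.py. A possible replacement, assuming this file is the HPT config and is named lunarlander_hpt_config.py, and that it ends with the usual DI-engine serial_pipeline footer (neither is confirmed by the diff shown here):

if __name__ == "__main__":
    # or you can enter `ding -m serial -c lunarlander_hpt_config.py -s 0`
    from ding.entry import serial_pipeline
    serial_pipeline((main_config, create_config), seed=0)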

import_names=['dizoo.box2d.lunarlander.envs.lunarlander_env'],
),
env_manager=dict(type='subprocess'),
# env_manager=dict(type='base'),

Collaborator review comment:
remove unused comments

@MODEL_REGISTRY.register('hpt')
class HPT(nn.Module):

def __init__(self, state_dim, action_dim):

Collaborator review comment:
add overview and related introduction
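A sketch of the requested overview, following the registry and docstring style of other DI-engine templates; the wording is an assumption based on the PR title (PolicyStem + DuelingHead) and on HPT standing for Heterogeneous Pre-trained Transformers:

@MODEL_REGISTRY.register('hpt')
class HPT(nn.Module):
    """
    Overview:
        HPT-style model for discrete-action DQN: a PolicyStem encodes the observation \
        into latent tokens and a DuelingHead maps them to Q-values.
    Interfaces:
        ``__init__``, ``forward``
    """

    def __init__(self, state_dim: int, action_dim: int):
        ...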

Collaborator review comment:
add a unit test like the other templates in DI-engine
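A minimal sketch of such a test; the import path, the shapes, and the assumption that HPT.forward returns a dict with a 'logit' key (the usual DQN-style output produced by DuelingHead) should be adjusted to the actual implementation:

import pytest
import torch

from ding.model.template import HPT


@pytest.mark.unittest
def test_hpt_forward():
    state_dim, action_dim, batch_size = 8, 4, 4
    model = HPT(state_dim, action_dim)
    obs = torch.randn(batch_size, state_dim)
    outputs = model(obs)
    # Expect DQN-style output: Q-value logits of shape (batch_size, action_dim).
    assert outputs['logit'].shape == (batch_size, action_dim)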

policy=dict(
# Whether to use cuda for network.
cuda=True,
load_path="./lunarlander_hpt_seed0/ckpt/ckpt_best.pth.tar",

Collaborator review comment:
remove unused part


# Migrating models to the GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = HPT(cfg.policy.model.obs_shape, cfg.policy.model.action_shape).to(device)

Collaborator review comment:
add comments explaining how HPT differs from a normal model
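A possible version of the requested comments; the description of the internals is inferred from the PR title (PolicyStem + DuelingHead) rather than from the diff shown here:

# Unlike `model = DQN(**cfg.policy.model)`, which builds its network purely from the
# config dict, HPT is constructed explicitly from the obs/action shapes: a PolicyStem
# encodes the observation into latent tokens and a DuelingHead maps them to Q-values,
# so the model is instantiated and moved to the target device by hand here.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = HPT(cfg.policy.model.obs_shape, cfg.policy.model.action_shape).to(device)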

Labels: algo (Add new algorithm or improve old one)
Projects: None yet
3 participants