Chapter2 Application Demo #4

PaParaZz1 · 2022-12-22T14:33:42Z

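In this issue, we will post all application demo materials related to Lecture 2 of the course.

Training code link

Rocket recovery (discrete action space)

rocket.mp4

Drone attitude control (continuous action space)

drone.mp4

Traffic signal control (multi-dimensional discrete action space)

cityflow_tiny.mp4

Navigation control (hybrid action space: parameterized action space)

out.mp4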

EasonQYS · 2023-01-17T15:36:04Z

Looking forward to the code.

jianzuo · 2023-03-12T19:55:42Z

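Is there a detailed annotated walkthrough for the MultiDiscrete action space? I looked through the annotated code documentation and tutorials, and they seem to cover only the ordinary discrete action case.
Thanks!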

PaParaZz1 · 2023-03-15T02:00:41Z

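Quoting jianzuo: "Is there a detailed annotated walkthrough for the MultiDiscrete action space? I looked through the annotated code documentation and tutorials, and they seem to cover only the ordinary discrete action case. Thanks!"

It is essentially the MultiHead feature implemented in DI-engine. You can start with the source code here; we will update the annotated code documentation in the course repo within this week.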
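
For readers wondering what a multi-head treatment of a MultiDiscrete action space usually means for PPO, here is a minimal sketch. It assumes each action dimension gets its own categorical distribution and that the dimensions are treated as independent, so the joint log-probability used in the PPO ratio is the sum of the per-head log-probabilities. The function name `multi_discrete_log_prob` is hypothetical, and this is not DI-engine's actual MultiHead code.

```python
import torch
from torch.distributions import Categorical


def multi_discrete_log_prob(logits, actions):
    """Joint log-probability and entropy of a multi-discrete action.

    logits:  list of tensors, one per action dimension, each of shape (B, n_i)
    actions: long tensor of shape (B, len(logits)), one index per dimension
    """
    log_prob = torch.zeros(actions.shape[0])
    entropy = torch.zeros(actions.shape[0])
    for i, head_logit in enumerate(logits):
        dist = Categorical(logits=head_logit)
        # Assumed independent heads: joint log-prob is the sum over heads.
        log_prob = log_prob + dist.log_prob(actions[:, i])
        entropy = entropy + dist.entropy()
    return log_prob, entropy


# Uniform logits over 6 and 3 choices, so every joint action has probability 1/18.
logits = [torch.zeros(4, 6), torch.zeros(4, 3)]
actions = torch.tensor([[0, 1], [5, 2], [3, 0], [2, 2]])
log_prob, entropy = multi_discrete_log_prob(logits, actions)
```

With this factorization, the rest of the PPO loss can stay the same as in the single-discrete case: only the log-probability and entropy terms change.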

jianzuo · 2023-03-15T07:35:18Z

Got it, thanks!

jianzuo · 2023-03-25T09:18:41Z

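Hello,
Where can I find the updated code annotations on multi-head that you mentioned in your reply? I have recently been trying to implement multi-dimensional action output with PPO,
but I still have not figured it out. Thanks!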

jianzuo · 2023-03-25T11:09:11Z

Following the explanation, I tried a multi-head version, but it raised an error:

import torch
import torch.nn as nn
import torch.nn.functional as F
class DiscretePolicyNetMultiHead(nn.Module):
    def __init__(self, obs_dim, hidden_dim, action_dim) -> None:
        super(DiscretePolicyNet, self).__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim),
            nn.ReLU(),
        )
        self.heads = nn.ModuleList([nn.Linear(hidden_dim, dim) for dim in action_dim])
        
        
    def forward(self, x: torch.Tensor)->torch.Tensor:
        x = self.encoder(x)
        logit = [self.head(x) for head in self.heads]
        return logits
    
def sample_act(logit: torch.Tensor) -> torch.Tensor:
    probs = torch.softmax(logit, dim=-1)
    dists = [torch.distributions.Categorical(probs=prob) for prob in probs]
    return [dist.sample() for dist in dists]

def test_action_multihead():
    B, obs_shape, hidden_shape, action_shape = 4, 10, 32, [6, 3]
    state = torch.rand(B, obs_shape)
    policy_net = DiscretePolicyNet(obs_shape, hidden_shape, action_shape)
    logit = policy_net(state)
    assert logit.shape == (B, action_shape)
    action = sample_act(logit)
    assert action.shape == (B,)
    return action

test_action_multihead()

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_27/530012604.py in <module>
----> 1 test_action_multihead()

/tmp/ipykernel_27/2493506364.py in test_action_multihead()
      2     B, obs_shape, hidden_shape, action_shape = 4, 10, 32, [6, 3]
      3     state = torch.rand(B, obs_shape)
----> 4     policy_net = DiscretePolicyNet(obs_shape, hidden_shape, action_shape)
      5     logit = policy_net(state)
      6     assert logit.shape == (B, action_shape)

/tmp/ipykernel_27/2688308212.py in __init__(self, obs_dim, hidden_dim, action_dim)
      6             nn.ReLU(),
      7         )
----> 8         self.head = nn.Linear(hidden_dim, action_dim)
      9 
     10     def forward(self, x: torch.Tensor)->torch.Tensor:

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/linear.py in __init__(self, in_features, out_features, bias, device, dtype)
     94         self.in_features = in_features
     95         self.out_features = out_features
---> 96         self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
     97         if bias:
     98             self.bias = Parameter(torch.empty(out_features, **factory_kwargs))

TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=NoneType, device=NoneType), but expected one of:
 * (tuple of ints size, *, tuple of names names, torch.memory_format memory_format, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
 * (tuple of SymInts size, *, torch.memory_format memory_format, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)

PaParaZz1 · 2023-03-28T08:42:41Z

You can now refer to this example: https://github.com/opendilab/PPOxFamily/blob/main/chapter2_action/discrete_tutorial_zh.py#L58
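
For anyone hitting the same traceback: it shows the old single-head `DiscretePolicyNet` being constructed, because the test calls `DiscretePolicyNet` rather than `DiscretePolicyNetMultiHead`, so `nn.Linear` receives the list `[6, 3]` as `out_features`. The posted class also has a few slips (`super()` names the wrong class, `self.head` is used instead of the loop variable, `logit` is returned as `logits`, and softmax is applied to a list). Below is a minimal corrected sketch of the posted snippet. It is not the code from the linked tutorial, and the helper name `sample_action` and the stacked `(B, num_dims)` action layout are choices made here for illustration.

```python
from typing import List

import torch
import torch.nn as nn
from torch.distributions import Categorical


class DiscretePolicyNetMultiHead(nn.Module):
    def __init__(self, obs_dim: int, hidden_dim: int, action_dims: List[int]) -> None:
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        # One linear head per action dimension; each out_features is a plain int.
        self.heads = nn.ModuleList([nn.Linear(hidden_dim, dim) for dim in action_dims])

    def forward(self, x: torch.Tensor) -> List[torch.Tensor]:
        x = self.encoder(x)
        return [head(x) for head in self.heads]


def sample_action(logits: List[torch.Tensor]) -> torch.Tensor:
    # Sample each dimension independently, then stack to shape (B, num_dims).
    return torch.stack([Categorical(logits=l).sample() for l in logits], dim=-1)


def test_action_multihead() -> torch.Tensor:
    B, obs_shape, hidden_shape, action_shape = 4, 10, 32, [6, 3]
    policy_net = DiscretePolicyNetMultiHead(obs_shape, hidden_shape, action_shape)
    logits = policy_net(torch.rand(B, obs_shape))
    # The network returns one logit tensor per action dimension.
    assert [l.shape for l in logits] == [(B, n) for n in action_shape]
    action = sample_action(logits)
    assert action.shape == (B, len(action_shape))
    return action


action = test_action_multihead()
```

Passing `logits=` to `Categorical` avoids the explicit softmax, and returning a list of logits keeps heads with different sizes (here 6 and 3) separate instead of forcing them into one tensor.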

jianzuo · 2023-03-28T13:40:59Z

Thanks! I will rewrite my code based on your example.

lz-8713 · 2023-08-01T02:14:50Z

Could you share the PPO code for the MultiDiscrete and Discrete action spaces, as well as the complete code for the traffic signal control demo?

zhixiongzh · 2023-08-24T08:34:33Z

Hello, I pulled the latest opendilab/ding:nightly-mujoco Docker image and ran pip install git+https://github.com/zjowowen/gym-pybullet-drones@master inside it to try the drones example, but the install failed:

root@BF4-C-008T7:/workspaces/PPOxFamily# pip install git+https://github.com/zjowowen/gym-pybullet-drones@master
Collecting git+https://github.com/zjowowen/gym-pybullet-drones@master
  Cloning https://github.com/zjowowen/gym-pybullet-drones (to revision master) to /tmp/pip-req-build-wy0jagd4
  Running command git clone --filter=blob:none --quiet https://github.com/zjowowen/gym-pybullet-drones /tmp/pip-req-build-wy0jagd4
  Resolved https://github.com/zjowowen/gym-pybullet-drones to commit b35eed32c251cc69c2d7b0de74dd9a66ca1357b1
  Installing build dependencies ... error
  error: subprocess-exited-with-error
  
  × pip subprocess to install build dependencies did not run successfully.
  │ exit code: 1
  ╰─> [20 lines of output]
      Collecting poetry-core@ git+https://github.com/python-poetry/poetry-core.git@master
        Cloning https://github.com/python-poetry/poetry-core.git (to revision master) to /tmp/pip-install-s945w_8c/poetry-core_d952979d432a40669870b5448a5371f8
        Running command git clone --filter=blob:none --quiet https://github.com/python-poetry/poetry-core.git /tmp/pip-install-s945w_8c/poetry-core_d952979d432a40669870b5448a5371f8
        WARNING: Did not find branch or tag 'master', assuming revision or ref.
        Running command git checkout -q master
        error: pathspec 'master' did not match any file(s) known to git.
        error: subprocess-exited-with-error
      
        × git checkout -q master did not run successfully.
        │ exit code: 1
        ╰─> See above for output.
      
        note: This error originates from a subprocess, and is likely not a problem with pip.
      error: subprocess-exited-with-error
      
      × git checkout -q master did not run successfully.
      │ exit code: 1
      ╰─> See above for output.
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Installing poetry-core manually did not help either. I suspect the branch name master needs to be changed to main?
@PaParaZz1 Do you have any suggestions?

zhixiongzh · 2023-08-25T07:25:06Z

Solved. You need to clone the whole drones repo (git clone https://github.com/zjowowen/gym-pybullet-drones.git), change master to main in the line requires = ["poetry-core @ git+https://github.com/python-poetry/poetry-core.git@master"], and then run pip install -e . manually inside that repo.
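
The manual edit described above can also be scripted. The sketch below assumes the `requires = [...]` line lives in the clone's `pyproject.toml` (the thread does not name the file), and the helper names `use_main_branch` and `patch_file` are hypothetical.

```python
import re
from pathlib import Path


def use_main_branch(pyproject_text: str) -> str:
    """Point the poetry-core git requirement at 'main' instead of 'master'."""
    return re.sub(
        r"(poetry-core\.git)@master\b",
        r"\1@main",
        pyproject_text,
    )


def patch_file(path: Path) -> None:
    # Rewrite the file in place; run `pip install -e .` in the repo afterwards.
    path.write_text(use_main_branch(path.read_text()))


# The exact line quoted in the comment above, before and after patching.
line = 'requires = ["poetry-core @ git+https://github.com/python-poetry/poetry-core.git@master"]'
patched = use_main_branch(line)
```

Calling `patch_file(Path("gym-pybullet-drones/pyproject.toml"))` would apply the change to a local clone, under the file-location assumption stated above.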

zjowowen · 2023-08-25T10:42:58Z

Hi,

This repo [https://github.com/zjowowen/gym-pybullet-drones.git] has now been updated to match the original repo [https://github.com/utiasDSL/gym-pybullet-drones].

Thanks for reminding us!

zhixiongzh · 2023-08-29T08:12:54Z

@zjowowen
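After getting the code to run, I still cannot reproduce the drones_fly_demo. After training for 5e6 steps with the default parameters, the return was not very good. I then loaded the best saved model and recorded a video, and found that the drone flies over the top of the gate instead of passing through underneath it. Are there any other settings needed to reach the result shown in your demo?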

rokey0001 · 2023-10-20T14:43:15Z

Hello, I keep running into the following problem when running the demo. Has anyone else encountered the same issue?
Traceback (most recent call last):
File "", line 1, in
File "E:\download\anaconda\envs\DILAB\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "E:\download\anaconda\envs\DILAB\lib\multiprocessing\spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "E:\download\anaconda\envs\DILAB\lib\site-packages\ding\utils\compression_helper.py", line 24, in setstate
self.data = cloudpickle.loads(data)
TypeError: _generator_ctor() takes from 0 to 1 positional arguments but 2 were given

[10-20 22:34:24] WARNING subprocess reset set seed failed, ignore and continue... subprocess_env_manager.py:263
subprocess exception traceback:
Traceback (most recent call last):
File "E:\download\anaconda\envs\DILAB\lib\multiprocessing\connection.py", line 312, in _recv_bytes
nread, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] 管道已结束。

Traceback (most recent call last):
File "E:\download\anaconda\envs\DILAB\lib\site-packages\ding\envs\env_manager\subprocess_env_manager.py", line
259, in reset
ret = self._pipe_parents.recv()
File "E:\download\anaconda\envs\DILAB\lib\multiprocessing\connection.py", line 250, in recv
buf = self._recv_bytes()
File "E:\download\anaconda\envs\DILAB\lib\multiprocessing\connection.py", line 321, in _recv_bytes
raise EOFError
EOFError

wandb: Waiting for W&B process to finish... (failed 1). Press Ctrl-C to abort syncing.
[10-20 22:34:26] ERROR Env 2 reset has exceeded max retries(5) subprocess_env_manager.py:317
[10-20 22:34:26] ERROR Env 1 reset has exceeded max retries(5) subprocess_env_manager.py:317
[10-20 22:34:26] ERROR Env 3 reset has exceeded max retries(5) subprocess_env_manager.py:317
wandb: View run dutiful-pond-1 at: https://wandb.ai/anony-mouse-788424711663011732/bipedalwalker_demo/runs/uomu1uw0?apiKey=dc8282c6be97b578e2fa87aac8b882089ab2adaf
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: .\wandb\run-20231020_223406-uomu1uw0\logs

huang312 · 2024-01-08T08:09:41Z

Does the repo contain the complete code and training procedure for MultiDiscrete PPO?

Billchan9711 · 2024-01-17T11:40:52Z

Requesting the code for MultiDiscrete + PPO traffic light control.

PaParaZz1 · 2024-01-17T13:45:25Z

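Quoting huang312: "Does the repo contain the complete code and training procedure for MultiDiscrete PPO?"

You can refer to the relevant examples in DI-smartcross. Because the cityflow environment is fairly complex, we did not integrate it directly into the course repo, so please see DI-smartcross instead (link).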

JBGZ-XXB · 2024-04-03T14:45:56Z

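Hello, is the environment code for the drone attitude control (continuous action space) example available? I would like to see how reinforcement learning is combined with a PID controller.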

PaParaZz1 · 2024-04-05T09:52:34Z

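Quoting JBGZ-XXB: "Hello, is the environment code for the drone attitude control (continuous action space) example available? I would like to see how reinforcement learning is combined with a PID controller."

All of the code can be found in the code examples in this repo; the drone attitude control code is at this link.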

PaParaZz1 added the application (Application analysis or extension) and discussion (Topic discussion) labels on Dec 22, 2022

PaParaZz1 pinned this issue Dec 22, 2022

PaParaZz1 unpinned this issue Feb 24, 2023

zhixiongzh mentioned this issue Aug 24, 2023

not compatible with gym_pybullet_drones env opendilab/DI-engine#714

Closed
