[Non-performance-impacting update] Use Pytorch DDP in ppo_atari_multigpu #495
base: master
Conversation
Hi @realAsma, thanks for the PR. I am a bit more inclined to keep things as is because it showcases the lower-level implementation of DDP, so it may be more informative to certain users. That said, I like your implementation as well. Maybe you could add a link to this PR in the docs, saying you could have implemented multi-GPU this way?
@vwxyzjn I made a minor update to this PR to update the docs. Now I like the implementation in this PR. In the prior implementation, weight initialization across processes is handled by setting the torch seed to be the same across processes before model initialization. After model initialization, the torch seeds are set back to per-process unique values. Additionally, there is the boilerplate code for gradient synchronization. The prior implementation is more educational overall. However, do we intend to teach distributed training concepts as well? I support whichever works for you :). I will create a different PR which adds a link to this one in the docs if you wish. Thanks a lot for taking your time to review this PR. Let me know what you think.
I still like the existing implementation for being more educational, but I agree the DDP implementation is more standard. If you are motivated, we could also just add another file for it. Alternatively, feel free to create a different PR adding the link to this one in the docs. Either way is fine with me and up to you.
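For context, here is a minimal sketch of the lower-level approach described in the discussion above: a shared seed for weight initialization followed by manual gradient averaging. The module, seeds, and shapes are placeholders for illustration, not the actual code in `cleanrl/ppo_atari_multigpu.py`:

```python
import torch
import torch.distributed as dist

# Sketch only: assumes the script is launched with torchrun so rank, world size,
# MASTER_ADDR, and MASTER_PORT are already set in the environment.
dist.init_process_group(backend="gloo")
rank, world_size = dist.get_rank(), dist.get_world_size()

# Shared seed so every process initializes identical weights...
torch.manual_seed(0)
model = torch.nn.Linear(4, 2)  # stand-in for the PPO agent network
# ...then switch back to a per-process seed for everything else (e.g. sampling).
torch.manual_seed(1234 + rank)

# After backward(), average the gradients across processes by hand.
loss = model(torch.randn(8, 4)).sum()
loss.backward()
for param in model.parameters():
    dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
    param.grad /= world_size
```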
Description
`cleanrl/ppo_atari_multigpu.py` does not use any data-parallel wrappers and performs gradient synchronization explicitly. Since PyTorch distributed data-parallel wrappers have become more ubiquitous, we could use them instead and improve the code readability.
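A minimal sketch of what the DDP-based version looks like; the agent module, learning rate, and data below are placeholders, not the actual `ppo_atari_multigpu.py` code:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Sketch only: assumes launch via torchrun, which sets the rank/world-size env vars.
dist.init_process_group(backend="gloo")

agent = torch.nn.Linear(4, 2)   # stand-in for CleanRL's Agent module
agent = DDP(agent)              # broadcasts initial weights and syncs gradients
optimizer = torch.optim.Adam(agent.parameters(), lr=2.5e-4)

loss = agent(torch.randn(8, 4)).sum()
loss.backward()                 # gradients are all-reduced automatically by DDP
optimizer.step()
```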
Types of changes
Checklist:
- `pre-commit run --all-files` passes (required).
- I have updated the documentation and previewed the changes via `mkdocs serve`.

If you need to run benchmark experiments for a performance-impacting change:
- I have used the benchmark utility to submit the tracked experiments to the openrlbenchmark/cleanrl W&B project, optionally with `--capture_video`.
- I have performed RLops with `python -m openrlbenchmark.rlops`.
- I have added the learning curves generated by the `python -m openrlbenchmark.rlops` utility to the documentation.
- I have added links to the tracked experiments in W&B, generated by `python -m openrlbenchmark.rlops ....your_args... --report`, to the documentation.