ValueError: Default process group has not been initialized, please make sure to call init_process_group. #37

Open
sjdch opened this issue Feb 19, 2025 · 6 comments

Comments

sjdch commented Feb 19, 2025

Has anyone run into this problem? What is going wrong here? Any help would be appreciated, thanks.

@yang2021

Find engine\misc\dist_utils.py

Image

@LCMLoveFlower

It still doesn't run. After making the change shown in the image, I still get: ValueError: Default process group has not been initialized, please make sure to call init_process_group.

yang2021 commented Mar 3, 2025

Could you try changing gloo to nccl?
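For readers weighing the gloo vs. nccl suggestion, here is a minimal sketch of choosing a backend and initializing the default process group for a single-process run. The helper names `choose_backend` and `init_default_group_if_needed` are hypothetical and are not taken from this repository's dist_utils.py; the address and port defaults are also assumptions. NCCL needs CUDA GPUs, so the sketch falls back to Gloo when CUDA or NCCL is not usable.

```python
import os


def choose_backend(cuda_available: bool, nccl_available: bool) -> str:
    """Prefer NCCL only when both CUDA and NCCL are usable, else use Gloo."""
    return "nccl" if (cuda_available and nccl_available) else "gloo"


def init_default_group_if_needed() -> bool:
    """Initialize the default process group once; return True if it is ready."""
    # Imported here so choose_backend() stays usable without PyTorch installed.
    import torch
    import torch.distributed as dist

    if not dist.is_available():
        return False  # this PyTorch build has no distributed support
    if dist.is_initialized():
        return True  # a launcher or earlier code already set it up

    backend = choose_backend(torch.cuda.is_available(), dist.is_nccl_available())

    # Assumed single-process defaults; a launcher such as torchrun
    # normally provides these environment variables itself.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(
        backend=backend,
        rank=int(os.environ.get("RANK", 0)),
        world_size=int(os.environ.get("WORLD_SIZE", 1)),
    )
    return True
```

Calling `init_default_group_if_needed()` once before training starts should make later `torch.distributed.get_rank()` calls valid, though whether this matches the change shown in the image above is not known.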

@ShihuaHuang95
Owner

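Hello, and thank you very much for your interest in our work.
I have never run into this problem in my own runs. Please first try to align your training environment and training data with ours.
If the problem occurs every time, please provide more information so that we can try to reproduce the error locally and then find a fix.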

gysocool commented Mar 4, 2025

In hgnetv2.py, I changed these two

Image

to safe_get_rank(), keeping them consistent with the dfine code. After running it twice, it worked.
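The body of safe_get_rank() is not shown in this thread, so the following is only a sketch of what such a helper might look like, under the assumption that it returns rank 0 whenever no default process group exists. The error in the issue title is what `torch.distributed.get_rank()` raises when it is called before `init_process_group`, which is why a guarded lookup avoids it in single-GPU runs.

```python
try:
    import torch.distributed as dist
except ImportError:
    # Guarded so the sketch also runs where PyTorch is not installed.
    dist = None


def safe_get_rank() -> int:
    """Return this process's rank, or 0 when no process group is initialized.

    Assumed behavior for illustration; not copied from any repository.
    """
    if dist is None or not dist.is_available() or not dist.is_initialized():
        return 0  # treat a non-distributed run as the single main process
    return dist.get_rank()


def is_main_process() -> bool:
    """True on rank 0, including plain single-process runs."""
    return safe_get_rank() == 0
```

A typical use is wrapping rank-dependent steps such as downloading pretrained weights in `if is_main_process():`, so the same code path works with and without a distributed launcher.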

@1043123978

Image
I found that this step is what causes the problem

Image
After deleting it, training works.


6 participants