ValueError: Default process group has not been initialized, please make sure to call init_process_group #55
Comments
you could just remove the lines initializing the |
It didn't help, I am afraid, still getting:
|
Did you also try to remove it from the eval scripts, i.e. line 201 in |
I faced this same issue using a single GPU on one machine. I got it working by changing the port and explicitly defining the rank and world size. For evaluation you can edit line 131 in |
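For anyone landing here, the single-GPU workaround in the comment above (a different port plus an explicit rank and world size) might look roughly like the sketch below. This is not the repository's own code: the helper names `find_free_port`, `single_process_kwargs` and `init_single_process_group` are hypothetical, and the backend choice and the use of a `tcp://` init method are assumptions. The only PyTorch calls used are `torch.distributed.is_available`, `is_initialized` and `init_process_group`.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for an unused TCP port on localhost (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def single_process_kwargs(port: int, backend: str = "gloo") -> dict:
    """Arguments for a one-process group: rank 0 in a world of size 1."""
    return {
        "backend": backend,
        "init_method": f"tcp://127.0.0.1:{port}",
        "rank": 0,
        "world_size": 1,
    }


def init_single_process_group(port=None):
    """Initialize the default process group for a single process, if needed.

    Returns the kwargs that were used, or None when nothing was initialized.
    """
    # Imported lazily so the helpers above work without PyTorch installed.
    import torch
    import torch.distributed as dist

    # Nothing to do if distributed support is missing or already set up.
    if not dist.is_available() or dist.is_initialized():
        return None

    # Assumption: NCCL when a GPU is visible, Gloo otherwise.
    backend = "nccl" if torch.cuda.is_available() else "gloo"
    kwargs = single_process_kwargs(port or find_free_port(), backend)
    dist.init_process_group(**kwargs)
    return kwargs
```

Calling `init_single_process_group()` once before the model and data loaders are built should be enough for code that only needs the default process group to exist. Where exactly to call it in the eval scripts depends on the file and line mentioned above.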
First of all, thanks for providing this code 😄
tl;dr
I am getting a ValueError when trying to run eval on the iNat21 dataset with
python -m evals.main --fname configs/evals/vitl16_inat.yaml --devices cuda:0
and I am running out of ideas how to fix it.

Config values
The iNaturalist-2021 config values in configs\evals\vith16_inat.yaml look like this:

I have tried
- Checking whether torch.distributed is available but not initialized; I haven't been able to pinpoint where this happens.
- Checking whether DistributedDataParallel is the root cause; I haven't found it in the repo.
- Checking whether SyncBatchNorm behaves in an unexpected way when running on a single GPU; this has already been fixed in this PR.
- Making changes in evals.main to avoid using the init_distributed function.

Full stacktrace