[DLRM v2] Using the model for the inference reference implementation #648
Hi Pablo,
Already have this version, but the error persists.
Have you tried to remove fbgemm-gpu as well?
@yuankuns When I try to remove the
I managed to run the CPU version with
@pgmpablo157321 It's interesting, since there is no GPU on our server, and only fbgemm-gpu-cpu==0.3.2 works for our case.
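For anyone else debugging this, here is a minimal sanity check of which fbgemm build Python actually imports; as far as I know, both the GPU wheel and the fbgemm-gpu-cpu wheel install the same `fbgemm_gpu` module, so only one of them should be present at a time:

```python
# Sanity check: confirm which fbgemm wheel is being imported on a CPU-only box.
import torch
import fbgemm_gpu  # both fbgemm-gpu and fbgemm-gpu-cpu install this module

print(fbgemm_gpu.__file__)        # path shows which installed wheel is in use
print(torch.cuda.is_available())  # expected: False on a server without GPUs
```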
@pgmpablo157321 is this still an issue?
I am currently making the reference implementation and am stuck deploying the model on multiple GPUs.
Here is a link to the PR: mlcommons/inference#1373
Here is a link to the file where the model is: https://github.com/mlcommons/inference/blob/7c64689b261f97a4fc3410bff584ac2439453bcc/recommendation/dlrm_v2/pytorch/python/backend_pytorch_native.py
Currently this works for a debugging model and a single GPU, but fails when I try to run it with multiple ones. Here are the issues that I have:
or
This could be because I am trying to load a sharded model on a different number of ranks. Do you know if that could be related?
I have tried PyTorch versions 1.12, 1.13, 2.0.0, and 2.0.1, and fbgemm versions 0.3.2 and 0.4.1.
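For context, here is a minimal sketch (not the reference implementation itself) of the kind of multi-rank load involved, assuming the model is a torchrec module sharded with DistributedModelParallel and checkpointed with torchsnapshot, launched with one process per GPU via torchrun; `build_model` and `snapshot_path` are placeholders:

```python
# Minimal sketch: restoring a sharded torchrec model on several ranks.
# Assumes torchrun launches one process per GPU and sets the env variables below.
import os
import torch
import torch.distributed as dist
import torchsnapshot
from torchrec.distributed.model_parallel import DistributedModelParallel


def load_sharded_model(build_model, snapshot_path):
    rank = int(os.environ["RANK"])
    local_rank = int(os.environ["LOCAL_RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)

    device = torch.device(f"cuda:{local_rank}")
    torch.cuda.set_device(device)

    # The sharding plan chosen here depends on world_size, so a checkpoint
    # written with N ranks may not map cleanly onto M ranks.
    model = DistributedModelParallel(module=build_model(), device=device)

    # torchsnapshot restores each rank's shards in place.
    snapshot = torchsnapshot.Snapshot(path=snapshot_path)
    snapshot.restore(app_state={"model": model})
    return model
```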