Parafold run failing since pulling latest changes #27
Comments
Could you send me your CUDA version and your jax/jaxlib versions? I think you are correct that "jax/jaxlib versions may not be compatible for the CUDA".
CUDA version 11.6
My environment is similar to yours:
Is it possible that this is caused by the difference between cudatoolkit and CUDA? (I'm not sure.)
I see, then it must be something else. I tried again with the latest versions of cudatoolkit (11.8) and cudnn (8.4.1), but it still fails. I am also running the program on an HPC cluster where CUDA and cudnn are loaded as modules and are not in the standard path such as
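For anyone in the same situation with module-loaded CUDA and cudnn, here is a minimal sketch for checking which copies of the libraries are visible on the loader path. It only assumes that the modules export their directories through `LD_LIBRARY_PATH`; the function name `find_shared_libs` and the library prefixes are illustrative, not part of Parafold or JAX.

```python
import os
from pathlib import Path


def find_shared_libs(prefixes=("libcudart", "libcudnn"), search_path=None):
    """Return {prefix: [matching files]} for every directory on a search path.

    search_path defaults to LD_LIBRARY_PATH (an assumption about how the
    HPC modules expose CUDA and cudnn).
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    found = {prefix: [] for prefix in prefixes}
    for entry in search_path.split(os.pathsep):
        directory = Path(entry)
        # Skip empty entries and directories that do not exist.
        if not entry or not directory.is_dir():
            continue
        for prefix in prefixes:
            matches = sorted(str(p) for p in directory.glob(prefix + "*"))
            found[prefix].extend(matches)
    return found


if __name__ == "__main__":
    for prefix, paths in find_shared_libs().items():
        print(prefix, "->", paths or "not found on LD_LIBRARY_PATH")
```

Running this once from the interactive shell and once from inside the job script would show whether both see the same library directories.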
Maybe there is some other problem. The non-standard path should not be the reason. I have a suggestion: can you try running this pipeline on the CPU? Another thing is to double-check whether sufficient memory is available.
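One way to try the CPU suggestion in a standalone Python check is sketched below. It assumes a JAX release that honours the `JAX_PLATFORMS` environment variable, and that the variable is set before `jax` is imported for the first time. The helper names are hypothetical, and this is not how the Parafold scripts select a device.

```python
import os


def force_cpu_backend():
    """Ask JAX to use only the CPU backend.

    Assumption: this runs before the first `import jax` in the process;
    setting the variable afterwards has no effect on an initialised backend.
    """
    os.environ["JAX_PLATFORMS"] = "cpu"


def describe_backend():
    """Return a short description of the backend JAX actually picked."""
    try:
        import jax
    except ImportError:
        return "jax is not installed in this environment"
    return f"backend={jax.default_backend()} devices={jax.devices()}"


if __name__ == "__main__":
    force_cpu_backend()
    print(describe_backend())
```

If the CPU-only run also fails, the CUDA and cudnn setup is less likely to be the cause.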
Unfortunately that does not work either. I get the following error
I think I also have 256GB of memory
I seem to have solved the issue with the versions of jax and jaxlib. Now I do not get the rocm and plugin errors. But the run still gets killed at line 244 of
I was finally only able to run the python script
I don't know where the error is coming from when I run the bash script. The environment variables seem fine. My jax and jaxlib versions are the latest: 0.4.4 and 0.4.4+cuda11.cudnn86 respectively.
Hello.
I pulled the latest Parafold changes and created a new environment with the suggested installation steps. Next I ran the following command to use Alphafold.
Unfortunately I get the following error
I searched for the error elsewhere, and some people suggested that my jax/jaxlib versions may not be compatible with the CUDA and cudnn versions running on my machines. However, I checked this and the versions are all correct since running
jax.devices()
in Python detects my GPU. So I am puzzled why the software no longer runs. Can you please help me with this?
I was able to successfully run alphafold before the latest changes (with version 2.2).
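To make the `jax.devices()` check easier to share in a report like this one, here is a minimal sketch that collects the installed jax/jaxlib versions together with the detected devices. The function names are hypothetical; the sketch reads versions from package metadata, so it still reports them if importing `jax` or initialising the backend fails.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect jax/jaxlib versions plus the backend and devices JAX sees."""
    report = {name: installed_version(name) for name in ("jax", "jaxlib")}
    try:
        import jax

        report["backend"] = jax.default_backend()
        report["devices"] = [str(device) for device in jax.devices()]
    except Exception as exc:  # import or backend initialisation failed
        report["backend"] = None
        report["devices"] = []
        report["error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it in the same environment the bash script activates, rather than only in an interactive shell, shows whether both see the same jax/jaxlib build.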