[BUG] Can't find nccl when building from source #28
Comments
I tried to fix this by adding the following contents: add
to change
@KnowingNothing Does this still apply or not?
@KnowingNothing Does this still apply?
I ran into the same issue and solved it by also checking whether the NCCL path is included. It should be somewhere in For me it is in either a
Describe the bug
The build cannot find libnccl.so when building from source. It seems Flux builds only the static NCCL library, not the shared one, but reduce_scatter requires the shared NCCL library.
To Reproduce
Run:
./build.sh --arch 80
Expected behavior
The build should complete. Instead, linking fails because the linker cannot find -lnccl.
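A quick way to narrow down a "cannot find -lnccl" error is to check which directories actually contain an NCCL library and whether it is a shared (.so) or a static (.a) one. The following is a minimal sketch, not part of Flux: the helper names `find_nccl` and `search_dirs_from_env` are hypothetical, and it assumes the relevant directories are listed in `LIBRARY_PATH` (consulted by GCC at link time) or `LD_LIBRARY_PATH`. It does not inspect the `-L` flags that the Flux build scripts pass to the linker, so a directory reported here may still be missing from the actual link line.

```python
import os
from pathlib import Path


def find_nccl(dirs):
    """Return {directory: sorted libnccl* file names found in that directory}."""
    found = {}
    for d in dirs:
        p = Path(d)
        if not p.is_dir():
            continue  # skip entries that do not exist
        names = sorted(
            f.name for f in p.glob("libnccl*") if f.is_file() or f.is_symlink()
        )
        if names:
            found[str(p)] = names
    return found


def search_dirs_from_env(env=os.environ):
    """Collect directories from LIBRARY_PATH and LD_LIBRARY_PATH, without duplicates."""
    dirs = []
    for var in ("LIBRARY_PATH", "LD_LIBRARY_PATH"):
        for d in env.get(var, "").split(os.pathsep):
            if d and d not in dirs:
                dirs.append(d)
    return dirs


if __name__ == "__main__":
    hits = find_nccl(search_dirs_from_env())
    if not hits:
        print("no libnccl* found on LIBRARY_PATH / LD_LIBRARY_PATH")
    for d, names in hits.items():
        has_shared = any(".so" in n for n in names)
        print(f"{d}: {', '.join(names)} (shared library present: {has_shared})")
```

If only a static archive shows up, that would be consistent with the report above that the shared NCCL library is not being built. To check a specific build output directory, pass it directly, for example `find_nccl(["/path/to/nccl/build/lib"])`, where the path is a placeholder.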
Stack trace/logs
Environment
Linux hina 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Jun__6_02:18:23_PDT_2024
Cuda compilation tools, release 12.5, V12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0
GPUs: 8x A100 80GB PCIe
Proposed fix
Additional context