Trying to use 2 GPUs results in neverending process that can't be killed without restart #100
Comments
I wonder if this is a bad "magic" shape. Can you please try sizes from 69600 to 69700? Also, please post the exact log at the maximum verbosity level. Are you using the Python, R, or native interface? Linux, Windows, or macOS? Which CUDA version? Did you compile it yourself or use the wheel?
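To make the size sweep suggested above less painful, here is a minimal sketch that runs each size in its own child process with a timeout, so a hang only loses that child rather than the whole session. It assumes the Python module is importable as libKMCUDA and exposes kmeans_cuda with the device and verbosity keywords; the cluster count (1024), the seed, and the helper names run_with_timeout and sweep are hypothetical placeholders, not values from this thread. verbosity=3 mirrors the value used later in the thread.

```python
import subprocess
import sys

# Child script run once per size. The cluster count and seed passed in are
# placeholders (assumptions), not settings reported in this issue.
CHILD_TEMPLATE = """
import numpy
from libKMCUDA import kmeans_cuda
samples = numpy.random.rand({n}, 256).astype(numpy.float32)
kmeans_cuda(samples, {clusters}, device={device}, verbosity={verbosity}, seed=3)
"""


def run_with_timeout(code, timeout_s):
    """Run a Python snippet in a child process.

    Returns "ok", "failed" (non-zero exit), or "timeout".
    """
    proc = subprocess.Popen([sys.executable, "-c", code])
    try:
        returncode = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: a process stuck inside the driver may ignore this,
        # so we deliberately do not block waiting for it to exit.
        proc.kill()
        return "timeout"
    return "ok" if returncode == 0 else "failed"


def sweep(sizes, timeout_s=300, clusters=1024, device=3, verbosity=3):
    """Try each sample count in turn and stop at the first hang."""
    results = {}
    for n in sizes:
        code = CHILD_TEMPLATE.format(
            n=n, clusters=clusters, device=device, verbosity=verbosity
        )
        results[n] = run_with_timeout(code, timeout_s)
        if results[n] == "timeout":
            break  # the GPUs are probably unusable until a restart
    return results
```

Calling sweep(range(69600, 69701)) would cover the requested range. Stopping at the first timeout is a deliberate choice: once one run hangs, later runs on the same GPUs are unlikely to tell you anything new.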
I'm using the Python interface, on Ubuntu 16.04 with CUDA 10.2. I compiled following the instructions in the README, then moved the .so to where I'm using it. It fails for me with 60,000 as well. Here's the output (still running):
Here's nvidia-smi
Maybe it's because of Exclusive Process?
With verbose=3
100% GPU means that the code entered an infinite loop. It is at https://github.com/src-d/kmcuda/blob/master/src/transpose.cu#L30. CUDA 10+ has not been tested yet, so this must be a code compatibility problem and not a "real" bug. Let's try three things:
Did 1. with Sorry, I'm not that experienced with this sort of thing. What do you mean by "anything relevant printed in dmesg"? If I run
Note that this is while the program is hanging on the GPUs. Is that what I was supposed to do?
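For anyone unsure what "relevant" means in dmesg output, one rough approach is to filter the kernel log for lines that mention the GPU driver or the IOMMU. The sketch below does that; the keyword list (NVRM, Xid, IOMMU, DMAR) is my assumption about what tends to matter for NVIDIA cards, not a list given in this thread, and the function names are hypothetical.

```python
import subprocess

# Assumed keywords: NVRM and Xid are prefixes used by the NVIDIA kernel driver,
# DMAR and IOMMU show up in IOMMU-related kernel messages.
KEYWORDS = ("nvrm", "xid", "iommu", "dmar")


def relevant_lines(log_text, keywords=KEYWORDS):
    """Return the log lines that contain any keyword, case-insensitively."""
    return [
        line
        for line in log_text.splitlines()
        if any(keyword in line.lower() for keyword in keywords)
    ]


def read_dmesg():
    """Return the kernel log as text (may require root on some systems)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Running relevant_lines(read_dmesg()) while the program is hanging should give a much shorter list to paste into the issue.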
Also, I don't actually know whether calling the Python program with cuda-memcheck will work (my worry is that it only works when calling an actual binary).
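On the cuda-memcheck question: as far as I know, the tool wraps whatever executable it is given, and the Python interpreter is itself a binary, so passing the interpreter plus the script path should work. A minimal sketch of launching it that way follows; the script name cluster.py in the usage note and the helper names are hypothetical.

```python
import shutil
import subprocess
import sys


def memcheck_command(script, *args, tool="cuda-memcheck"):
    """Build the argv that runs a Python script under cuda-memcheck."""
    return [tool, sys.executable, script, *args]


def run_under_memcheck(script, *args):
    """Run the script under cuda-memcheck and return its exit code."""
    command = memcheck_command(script, *args)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(command[0] + " was not found on PATH")
    return subprocess.run(command).returncode
```

For example, run_under_memcheck("cluster.py") is equivalent to typing cuda-memcheck followed by the interpreter path and cluster.py in a shell. Expect the run to be considerably slower than normal.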
The dmesg log is actually very insightful.
Ran Should I still try step 3?
OK, then let's try
Not really getting much output...
So there are no memory errors except that there are memory errors according to Then try disabling the IOMMU.
I have two 1080 Tis. When I try to use kmcuda with both, by setting CUDA_VISIBLE_DEVICES to the GPUs I use for compute (I've tried this with device=0 and device=3), it gets stuck on "transposing the samples...". Utilization shows 100%, but power usage isn't high at all. The data has shape (69673, 256). I've waited a couple of minutes; it usually takes half a minute on 1 GPU.
If I Ctrl-C/Ctrl-Z it or kill the process ID, the process becomes a zombie and I'm forced to restart to kill it (and use the GPUs again).