cmake and autodetect gpu #70

Open
wants to merge 11 commits into
base: master

jkrauska

Change in Makefile to support up to Turing GPUs.
NVIDIA Reference

WARNING: Creates binaries 7X larger and makes the build 7X slower, but supports 6X as many GPU families.

@unzvfu
Collaborator

unzvfu commented May 26, 2019

Thanks for the submission!

Everything looks fine, however I would really like to maintain the option of easily compiling for just one architecture. This is vital for development, because 80 second recompilation times are just too long. Eventually I will write a proper CMake script with different DEBUG/RELEASE flags, but in the meantime could you adapt your Makefile to allow for one architecture? Thanks!
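
One way to keep single-architecture builds easy is to make the architecture list a cache variable. The following is a minimal sketch, assuming a CMake build with native CUDA language support; the project name and the `FIXNUM_CUDA_ARCHS` variable are hypothetical and not part of this PR.

```cmake
cmake_minimum_required(VERSION 3.10)
project(fixnum_arch_sketch LANGUAGES CXX CUDA)

# Hypothetical cache variable: a semicolon-separated list of compute
# capabilities without the dot, e.g. "50" or "50;61;75".
# A value passed with -D on the command line overrides this default.
set(FIXNUM_CUDA_ARCHS "50" CACHE STRING "CUDA architectures to compile for")

# Turn each entry into one -gencode flag for nvcc.
foreach(arch IN LISTS FIXNUM_CUDA_ARCHS)
  string(APPEND CMAKE_CUDA_FLAGS
         " -gencode arch=compute_${arch},code=sm_${arch}")
endforeach()
```

Configuring with `cmake -DFIXNUM_CUDA_ARCHS=50 ..` would then build for a single architecture, while `-DFIXNUM_CUDA_ARCHS="50;61;75"` would produce the slower multi-architecture build. CMake 3.18 and later provide the built-in `CMAKE_CUDA_ARCHITECTURES` variable for the same purpose.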

@jkrauska
Author

@unzvfu Here's a redone attempt that uses cmake. (cmake has lovely cuda support, including gpu detection)

Give it a spin and let me know what you think?

jkrauska changed the title from "cuda 10 support and multi-arch gpu" to "cmake and autodetect gpu" May 30, 2019
@unzvfu
Collaborator

unzvfu commented May 30, 2019

Great, thanks heaps for that! There are still one or two things to add, but you've made an excellent start. As I see it, we still need to fix

  1. CMake support for gtest. Compilation currently fails if libgtest.a is not installed in a common location. This should be picked up at build configuration time, not via linking failure.
  2. The compiler flags have been muddled compared to what I had before. We want NDEBUG defined when compiling bench, but it shouldn't be defined when compiling the test suite (as it now is).

@jkrauska
Author

@unzvfu I think I addressed your concerns.
1 - checks for gtest, only creates the 'check' target if it's there
2 - only uses NDEBUG on the 'check' target
3 - adds -lineinfo (which I had forgotten)
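
The gtest check in item 1 might look roughly like the sketch below. It assumes the stock `FindGTest` module that ships with CMake; the project name is hypothetical, and the source and include paths are taken from the build logs in this thread rather than from the PR's actual `CMakeLists.txt`.

```cmake
cmake_minimum_required(VERSION 3.10)
project(fixnum_gtest_sketch LANGUAGES CXX CUDA)

# Without REQUIRED, a missing library leaves GTEST_FOUND false
# instead of aborting the configure step.
find_package(GTest)

if(GTEST_FOUND)
  # Only generate the test target when gtest was located at configure time.
  add_executable(check tests/test-suite.cu)
  target_include_directories(check PRIVATE src)
  # GTest::GTest is the imported target defined by FindGTest; it carries
  # the library's full path and its include directory.
  target_link_libraries(check PRIVATE GTest::GTest)
else()
  message(WARNING "GTest not found; the 'check' target will not be generated")
endif()
```

With this shape, a missing gtest shows up as a configure-time warning rather than as a linker error at the end of the build.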

@unzvfu
Collaborator

unzvfu commented May 30, 2019

@jkrauska Excellent! Nearly there: I realise my description above of when NDEBUG should be defined was a bit confusing. Recall that specifying -DNDEBUG switches OFF debugging code (in particular assert statements). So it should be specified for bench and omitted for test-suite (since I want assert statements called in the test suite). Make sense?
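
A per-target definition is one way to express this. The sketch below is illustrative only: the project name and the `bench/bench.cu` path are hypothetical, since the thread does not name the benchmark's source file.

```cmake
cmake_minimum_required(VERSION 3.10)
project(fixnum_ndebug_sketch LANGUAGES CXX CUDA)

# Hypothetical source path for the benchmark.
add_executable(bench bench/bench.cu)
add_executable(check tests/test-suite.cu)

# NDEBUG compiles assert() out, so define it for the benchmark only.
target_compile_definitions(bench PRIVATE NDEBUG)

# 'check' deliberately gets no NDEBUG definition, so assert()
# statements stay active in the test suite.
```

One caveat: CMake's default flags for the Release build type typically include `-DNDEBUG` as well, so asserts in `check` only stay active if the chosen build type does not add the definition globally.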

@jkrauska
Author

@unzvfu your description was perfect, I was sloppy. fixed.

@unzvfu unzvfu assigned unzvfu and unassigned unzvfu Jun 2, 2019
@unzvfu
Collaborator

unzvfu commented Jun 2, 2019

@jkrauska Related to what I said above, it seems the configuration of libgtest.a is not correct. When libgtest.a is installed in a non-standard location (as is quite common), then I believe we specify that location to CMake like so:

$ CMAKE_PREFIX_PATH=~/src/googletest-build/googlemock/gtest cmake ..
-- Autodetected CUDA architecture(s): 5.0 
-- Found GTest: /home/hlaw/src/googletest-build/googlemock/gtest/libgtest.a  
-- Configuring done
-- Generating done
-- Build files have been written to: /home/hlaw/src/cuda-fixnum-build-fix/build

So libgtest.a is found. However, when I compile, the non-standard library path is not used:

$ make VERBOSE=1
[...snip...]
make -f CMakeFiles/check.dir/build.make CMakeFiles/check.dir/depend
make[2]: Entering directory '/home/hlaw/src/cuda-fixnum-build-fix/build'
cd /home/hlaw/src/cuda-fixnum-build-fix/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/hlaw/src/cuda-fixnum-build-fix /home/hlaw/src/cuda-fixnum-build-fix /home/hlaw/src/cuda-fixnum-build-fix/build /home/hlaw/src/cuda-fixnum-build-fix/build /home/hlaw/src/cuda-fixnum-build-fix/build/CMakeFiles/check.dir/DependInfo.cmake --color=
Dependee "/home/hlaw/src/cuda-fixnum-build-fix/build/CMakeFiles/check.dir/DependInfo.cmake" is newer than depender "/home/hlaw/src/cuda-fixnum-build-fix/build/CMakeFiles/check.dir/depend.internal".
Dependee "/home/hlaw/src/cuda-fixnum-build-fix/build/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/home/hlaw/src/cuda-fixnum-build-fix/build/CMakeFiles/check.dir/depend.internal".
Scanning dependencies of target check
make[2]: Leaving directory '/home/hlaw/src/cuda-fixnum-build-fix/build'
make -f CMakeFiles/check.dir/build.make CMakeFiles/check.dir/build
make[2]: Entering directory '/home/hlaw/src/cuda-fixnum-build-fix/build'
[ 66%] Building CUDA object CMakeFiles/check.dir/tests/test-suite.cu.o
/usr/bin/nvcc   -I/home/hlaw/src/cuda-fixnum-build-fix/src  -gencode arch=compute_50,code=sm_50 -Xcompiler -Wall,-Wextra   -lineinfo -std=c++11 -x cu -c /home/hlaw/src/cuda-fixnum-build-fix/tests/test-suite.cu -o CMakeFiles/check.dir/tests/test-suite.cu.o
[ 83%] Linking CUDA device code CMakeFiles/check.dir/cmake_device_link.o
/usr/bin/cmake -E cmake_link_script CMakeFiles/check.dir/dlink.txt --verbose=1
/usr/bin/nvcc   -gencode arch=compute_50,code=sm_50 -Xcompiler -Wall,-Wextra  -Xcompiler=-fPIC -Wno-deprecated-gpu-targets -shared -dlink CMakeFiles/check.dir/tests/test-suite.cu.o -o CMakeFiles/check.dir/cmake_device_link.o 
[100%] Linking CUDA executable bin/check
/usr/bin/cmake -E cmake_link_script CMakeFiles/check.dir/link.txt --verbose=1
/usr/lib/nvidia-cuda-toolkit/bin/g++   CMakeFiles/check.dir/tests/test-suite.cu.o CMakeFiles/check.dir/cmake_device_link.o -o bin/check -lstdc++ -lgtest  -L"/usr/lib/x86_64-linux-gnu/stubs" -lcudadevrt -lcudart_static -lrt -lpthread -ldl
/usr/bin/x86_64-linux-gnu-ld: cannot find -lgtest
collect2: error: ld returned 1 exit status
CMakeFiles/check.dir/build.make:113: recipe for target 'bin/check' failed
make[2]: *** [bin/check] Error 1
make[2]: Leaving directory '/home/hlaw/src/cuda-fixnum-build-fix/build'
CMakeFiles/Makefile2:104: recipe for target 'CMakeFiles/check.dir/all' failed
make[1]: *** [CMakeFiles/check.dir/all] Error 2
make[1]: Leaving directory '/home/hlaw/src/cuda-fixnum-build-fix/build'
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

@unzvfu
Collaborator

unzvfu commented Jun 3, 2019

Actually I think you can fix this by just specifying "${GTEST_LIBRARY}" in place of '-lgtest' in CMakeLists.txt.
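
A sketch of that change, assuming the `check` target and source layout seen in the build log above (the project name is hypothetical):

```cmake
cmake_minimum_required(VERSION 3.10)
project(fixnum_link_sketch LANGUAGES CXX CUDA)

find_package(GTest REQUIRED)

add_executable(check tests/test-suite.cu)
target_include_directories(check PRIVATE src ${GTEST_INCLUDE_DIRS})

# GTEST_LIBRARY holds the full path that FindGTest located (for example
# the libgtest.a reported by "Found GTest: ..."), so the linker no longer
# has to search its default directories for -lgtest.
target_link_libraries(check PRIVATE ${GTEST_LIBRARY})
```

Because the library is passed by absolute path, a non-standard location supplied through `CMAKE_PREFIX_PATH` at configure time carries through to the link line.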

@jkrauska
Author

jkrauska commented Jun 3, 2019

@unzvfu yes.

I've included what I think is the correct incantation for ld.

-L GTEST_INCLUDE_DIR and -lgtest

@unzvfu
Collaborator

unzvfu commented Jun 10, 2019

@jkrauska Any idea what to do to fix the following? The problem is that the CUDA SDK installation is at /usr/local/cuda (which CMake sees towards the end with Found CUDA: /usr/local/cuda (found version "10.0")) but it also finds and tries to use the (incompatible) nvcc compiler at /usr/bin (towards the beginning with Check for working CUDA compiler: /usr/bin/nvcc).

$ CUDA_ROOT=/usr/local/cuda CMAKE_PREFIX_PATH=~/googletest/googlemock/gtest cmake ..
-- The CXX compiler identification is GNU 7.4.0
-- The CUDA compiler identification is NVIDIA 9.1.85
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Check for working CUDA compiler: /usr/bin/nvcc
-- Check for working CUDA compiler: /usr/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /usr/local/cuda (found version "10.0") 
-- Autodetected CUDA architecture(s): 7.5 7.5 
-- Found GTest: /home/ive020/googletest/googlemock/gtest/libgtest.a  
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ive020/cuda-fixnum-jkrauska/build

@unzvfu
Collaborator

unzvfu commented Jun 10, 2019

@jkrauska An unrelated thing: previous behaviour was to run the test suite at the end of make check; I would like to restore that if possible. An issue that will arise is that (at the moment unfortunately) the test-suite binary has to be run from the root source directory like tests/test-suite so that the binary can find the test case files.
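
If running the suite from the build is wanted, the working-directory constraint could be handled with a separate custom target. This is a sketch only; the `run-check` target name and the project name are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.10)
project(fixnum_runcheck_sketch LANGUAGES CXX CUDA)

add_executable(check tests/test-suite.cu)
target_include_directories(check PRIVATE src)

# Hypothetical opt-in target: runs the test binary with the source root as
# the working directory, so it can find the test case files there.
add_custom_target(run-check
  COMMAND check
  WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
  COMMENT "Running the test suite from the source root"
  VERBATIM)

# Make sure the binary is built before it is run.
add_dependencies(run-check check)
```

Keeping the run step in its own target means `make check` still only builds the binary, and `make run-check` is needed to execute it.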

@jkrauska
Author

> @jkrauska ...
> finds and tries to use the (incompatible) nvcc compiler at /usr/bin (towards the beginning with Check for working CUDA compiler: /usr/bin/nvcc).

Can you establish how or why a bad /usr/bin/nvcc exists? I'd try to remove it.

CMake is trying to make good decisions and it's getting mixed messages...

@jkrauska
Author

> @jkrauska An unrelated thing: previous behaviour was to run the test suite at the end of make check; I would like to restore that if possible. An issue that will arise is that (at the moment unfortunately) the test-suite binary has to be run from the root source directory like tests/test-suite so that the binary can find the test case files.

@unzvfu Can we omit the auto-run-test behavior? You cannot run the tests without first generating the inputs (or having previous inputs), and currently the inputs take a long time to create.

set(EXECUTABLE_OUTPUT_PATH "${PROJECT_BINARY_DIR}/bin")

# CUDA
find_package(CUDA REQUIRED)

According to https://cmake.org/cmake/help/v3.15/module/FindCUDA.html, since CUDA is already listed as a language in the project() call above, this find_package should be unnecessary. Maybe this is a source of some of CMake's confusion?
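
A minimal sketch of a configuration that relies only on the native CUDA language support, with no `find_package(CUDA)` call. The project name is hypothetical, and the target and paths are taken from the build logs in this thread.

```cmake
cmake_minimum_required(VERSION 3.10)

# Listing CUDA as a language makes CMake locate nvcc and treat .cu files
# as first-class sources.
project(fixnum_lang_sketch LANGUAGES CXX CUDA)

set(EXECUTABLE_OUTPUT_PATH "${PROJECT_BINARY_DIR}/bin")

# No find_package(CUDA): ordinary add_executable() is enough to compile
# and link CUDA sources once the language is enabled.
add_executable(check tests/test-suite.cu)
target_include_directories(check PRIVATE src)
```

The native language support and the `FindCUDA` module locate the toolkit independently, which would be consistent with the log above reporting nvcc 9.1.85 from /usr/bin alongside "Found CUDA: /usr/local/cuda". One caveat: the "Autodetected CUDA architecture(s)" message appears to come from `FindCUDA`'s `cuda_select_nvcc_arch_flags` helper, so dropping the `find_package` call would likely require another way to choose the target architectures.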

@unzvfu
Collaborator

unzvfu commented Jun 10, 2019

> > @jkrauska An unrelated thing: previous behaviour was to run the test suite at the end of make check; I would like to restore that if possible. An issue that will arise is that (at the moment unfortunately) the test-suite binary has to be run from the root source directory like tests/test-suite so that the binary can find the test case files.
>
> @unzvfu can we omit the auto-run-test behavior? you cannot run the tests without first generating the inputs. (or having previous inputs) Currently the inputs take a long time to create.

Good call!

@unzvfu
Collaborator

unzvfu commented Jun 10, 2019

> > @jkrauska ...
> > finds and tries to use the (incompatible) nvcc compiler at /usr/bin (towards the beginning with Check for working CUDA compiler: /usr/bin/nvcc).
>
> Can you establish how or why a bad /usr/bin/nvcc exists? I'd try to remove it.
>
> CMake is trying to make good decisions and it's getting mixed messages...

@jkrauska I don't disagree that removing /usr/bin/nvcc would be a reasonable idea, but I think it should work (having multiple Cuda installations is not unheard of) and in any case I'm not admin on the machine in question so my options are limited. I think @cmr's observation is probably a good start for resolving the problem.
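
When several CUDA installations coexist and the stray one cannot be removed, the compiler can be selected explicitly. The sketch below assumes the wanted nvcc lives at /usr/local/cuda/bin/nvcc (the standard layout under the toolkit root mentioned above, but still an assumption); the project name is hypothetical.

```cmake
cmake_minimum_required(VERSION 3.10)

# Must run before project() enables the CUDA language. It only supplies a
# default: an explicit -DCMAKE_CUDA_COMPILER=... or a CUDACXX environment
# variable still takes precedence.
if(NOT CMAKE_CUDA_COMPILER AND "$ENV{CUDACXX}" STREQUAL "")
  # Assumed location of the intended nvcc.
  set(CMAKE_CUDA_COMPILER "/usr/local/cuda/bin/nvcc")
endif()

project(fixnum_nvcc_sketch LANGUAGES CXX CUDA)

# Report which compiler was actually selected.
message(STATUS "CUDA compiler: ${CMAKE_CUDA_COMPILER} "
               "(version ${CMAKE_CUDA_COMPILER_VERSION})")
```

The same effect is available without editing `CMakeLists.txt`, by running `cmake -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc ..` or `CUDACXX=/usr/local/cuda/bin/nvcc cmake ..` in a fresh build directory. Neither needs admin rights. A fresh directory matters because the compiler choice is cached after the first configure.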
