Fix compatibility with CUDA - Clang - CMake 3.20.0 toolchain #4106

Closed
StrikerRUS opened this issue Mar 24, 2021 · 10 comments · Fixed by #4183

@StrikerRUS
Collaborator

Our CUDA+Clang builds are incompatible with the very recently released CMake 3.20.0:

Found Clang, but using gcc.

-- The C compiler identification is Clang 6.0.0
-- The CXX compiler identification is Clang 6.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/clang++ - skipped

...

gcc: error: unrecognized command line option ‘-fopenmp=libomp’; did you mean ‘-fopenmp-simd’?
-- The C compiler identification is Clang 6.0.0
-- The CXX compiler identification is Clang 6.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The CUDA compiler identification is NVIDIA 10.0.130
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found OpenMP_C: -fopenmp=libomp (found version "3.1") 
-- Found OpenMP_CXX: -fopenmp=libomp (found version "3.1") 
-- Found OpenMP: TRUE (found version "3.1")  
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /usr/local/cuda (found suitable version "10.0", minimum required is "9.0") 
-- CMAKE_CUDA_FLAGS: -Xcompiler=-fopenmp=libomp -Xcompiler=-fPIC -Xcompiler=-Wall -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -O3 -lineinfo
-- ALLFEATS_DEFINES: -DPOWER_FEATURE_WORKGROUPS=12;-DUSE_CONSTANT_BUF=0;-DENABLE_ALL_FEATURES
-- FULLDATA_DEFINES: -DPOWER_FEATURE_WORKGROUPS=12;-DUSE_CONSTANT_BUF=0;-DENABLE_ALL_FEATURES;-DIGNORE_INDICES
-- Performing Test MM_PREFETCH
-- Performing Test MM_PREFETCH - Success
-- Using _mm_prefetch
-- Performing Test MM_MALLOC
-- Performing Test MM_MALLOC - Success
-- Using _mm_malloc
-- Configuring done
CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "histo_16_64_256_sp".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "histo_16_64_256-fulldata_sp".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "histo_16_64_256_sp_const".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "histo_16_64_256-fulldata_sp_const".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "_lightgbm".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "_lightgbm".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "lightgbm".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "lightgbm".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "histo_16_64_256-allfeats_sp_const".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "histo_16_64_256-allfeats_sp".
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Generating done
-- Build files have been written to: /tmp/pip-req-build-83l4zq2o/build_cpp
[  2%] Building CUDA object CMakeFiles/histo_16_64_256_sp.dir/src/treelearner/kernels/histogram_16_64_256.cu.o
[  5%] Building CUDA object CMakeFiles/histo_16_64_256-fulldata_sp.dir/src/treelearner/kernels/histogram_16_64_256.cu.o
[  7%] Building CUDA object CMakeFiles/histo_16_64_256-fulldata_sp_const.dir/src/treelearner/kernels/histogram_16_64_256.cu.o
[ 10%] Building CUDA object CMakeFiles/histo_16_64_256_sp_const.dir/src/treelearner/kernels/histogram_16_64_256.cu.o
gcc: error: unrecognized command line option ‘-fopenmp=libomp’; did you mean ‘-fopenmp-simd’?
CMakeFiles/histo_16_64_256-fulldata_sp_const.dir/build.make:75: recipe for target 'CMakeFiles/histo_16_64_256-fulldata_sp_const.dir/src/treelearner/kernels/histogram_16_64_256.cu.o' failed
make[3]: *** [CMakeFiles/histo_16_64_256-fulldata_sp_const.dir/src/treelearner/kernels/histogram_16_64_256.cu.o] Error 1
make[3]: *** Deleting file 'CMakeFiles/histo_16_64_256-fulldata_sp_const.dir/src/treelearner/kernels/histogram_16_64_256.cu.o'
CMakeFiles/Makefile2:180: recipe for target 'CMakeFiles/histo_16_64_256-fulldata_sp_const.dir/all' failed
make[2]: *** [CMakeFiles/histo_16_64_256-fulldata_sp_const.dir/all] Error 2
make[2]: *** Waiting for unfinished jobs....
gcc: error: unrecognized command line option ‘-fopenmp=libomp’; did you mean ‘-fopenmp-simd’?
CMakeFiles/histo_16_64_256_sp_const.dir/build.make:75: recipe for target 'CMakeFiles/histo_16_64_256_sp_const.dir/src/treelearner/kernels/histogram_16_64_256.cu.o' failed
make[3]: *** [CMakeFiles/histo_16_64_256_sp_const.dir/src/treelearner/kernels/histogram_16_64_256.cu.o] Error 1
make[3]: *** Deleting file 'CMakeFiles/histo_16_64_256_sp_const.dir/src/treelearner/kernels/histogram_16_64_256.cu.o'
CMakeFiles/Makefile2:232: recipe for target 'CMakeFiles/histo_16_64_256_sp_const.dir/all' failed
make[2]: *** [CMakeFiles/histo_16_64_256_sp_const.dir/all] Error 2
gcc: error: unrecognized command line option ‘-fopenmp=libomp’; did you mean ‘-fopenmp-simd’?
CMakeFiles/histo_16_64_256-fulldata_sp.dir/build.make:75: recipe for target 'CMakeFiles/histo_16_64_256-fulldata_sp.dir/src/treelearner/kernels/histogram_16_64_256.cu.o' failed
make[3]: *** [CMakeFiles/histo_16_64_256-fulldata_sp.dir/src/treelearner/kernels/histogram_16_64_256.cu.o] Error 1
make[3]: *** Deleting file 'CMakeFiles/histo_16_64_256-fulldata_sp.dir/src/treelearner/kernels/histogram_16_64_256.cu.o'
CMakeFiles/Makefile2:206: recipe for target 'CMakeFiles/histo_16_64_256-fulldata_sp.dir/all' failed
make[2]: *** [CMakeFiles/histo_16_64_256-fulldata_sp.dir/all] Error 2
gcc: error: unrecognized command line option ‘-fopenmp=libomp’; did you mean ‘-fopenmp-simd’?
CMakeFiles/histo_16_64_256_sp.dir/build.make:75: recipe for target 'CMakeFiles/histo_16_64_256_sp.dir/src/treelearner/kernels/histogram_16_64_256.cu.o' failed
make[3]: *** [CMakeFiles/histo_16_64_256_sp.dir/src/treelearner/kernels/histogram_16_64_256.cu.o] Error 1
make[3]: *** Deleting file 'CMakeFiles/histo_16_64_256_sp.dir/src/treelearner/kernels/histogram_16_64_256.cu.o'
CMakeFiles/Makefile2:125: recipe for target 'CMakeFiles/histo_16_64_256_sp.dir/all' failed
make[2]: *** [CMakeFiles/histo_16_64_256_sp.dir/all] Error 2
CMakeFiles/Makefile2:106: recipe for target 'CMakeFiles/_lightgbm.dir/rule' failed
make[1]: *** [CMakeFiles/_lightgbm.dir/rule] Error 2
Makefile:169: recipe for target '_lightgbm' failed
make: *** [_lightgbm] Error 2
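
The mismatch visible in this log (CMake identifies Clang, yet gcc is the compiler that rejects `-fopenmp=libomp`) could be checked for mechanically in CI logs. The following is only a minimal sketch: the function name `detect_host_compiler_mismatch` is hypothetical and not part of LightGBM's CI scripts, and the patterns are taken from the log lines quoted above.

```shell
# Hypothetical helper: exits 0 when a build log shows that CMake identified
# Clang while gcc rejected the Clang-only OpenMP flag.
detect_host_compiler_mismatch() {
    local log_file="$1"
    # CMake reported Clang as the C/CXX compiler...
    grep -q "compiler identification is Clang" "$log_file" || return 1
    # ...but gcc is the one complaining about -fopenmp=libomp.
    # ".*" skips the quote character, which varies with the locale.
    grep -q "gcc: error: unrecognized command line option.*-fopenmp=libomp" "$log_file" || return 1
    return 0
}
```

It would be called as `detect_host_compiler_mismatch build.log`, where `build.log` is an assumed file holding the captured build output.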

Here is how we support a non-default compiler (Clang):
https://github.com/microsoft/LightGBM/pull/3886/files#diff-1e7de1ae2d059d21e1dd75d5812d5a34b0222cef273b7c3a2af62eb747f9d20a

LightGBM/.ci/test.sh

Lines 6 to 9 in ab474dc

elif [[ $OS_NAME == "linux" ]] && [[ $COMPILER == "clang" ]]; then
    export CXX=clang++
    export CC=clang
fi

For now we are restricting the CMake version:

LightGBM/.ci/setup.sh

Lines 100 to 107 in ab474dc

if [[ $COMPILER == "clang" ]]; then
    apt-get install --no-install-recommends -y \
        cmake="3.19.5-0kitware1" \
        cmake-data="3.19.5-0kitware1"
else
    apt-get install --no-install-recommends -y \
        cmake
fi

Does anybody have a GitLab account to ask the CMake team what changed in version 3.20.0? This looks like a regression on the CMake side.
https://gitlab.kitware.com/cmake/cmake/-/issues

@jameslamb
Collaborator

I have an account and can ask!

@StrikerRUS
Collaborator Author

@jameslamb

I have an account and can ask!

Great! Many thanks!

@StrikerRUS
Collaborator Author

@jameslamb Have they already answered anything?

@jameslamb
Collaborator

Sorry, I haven't asked yet. I want to come to them with a smaller reproducible example since LightGBM is a fairly large project. I think that will improve the eventual quality of the answer we get.

I have access to a machine with an NVIDIA GPU and cuda 10.2 so I think I can make such a smaller reproducible example. If I can't figure that out in the next day or two I'll just ask about LightGBM directly.

@jameslamb
Collaborator

jameslamb commented Apr 7, 2021

Ok I've opened an issue with CMake: https://gitlab.kitware.com/cmake/cmake/-/issues/22037.

I wasn't able to create a non-LightGBM reproducible example, but I at least cut most of CMakeLists.txt out by dropping unnecessary stuff like SWIG, HDFS, BUILD_FOR_R, etc.

See that link for more information on the reproducible example. Sharing it here too:

git clone --recursive --branch cuda-cmake-repro git@github.com:jameslamb/LightGBM.git
cd LightGBM

# this will compile successfully
docker run \
    -v $(pwd):/opt/test \
    -w /opt/test \
    --env DEBIAN_FRONTEND=noninteractive \
    -t nvcr.io/nvidia/cuda:11.2.2-devel \
    ./test-cuda.sh

# this will fail with the error mentioned above
docker run \
    -v $(pwd):/opt/test \
    -w /opt/test \
    --env DEBIAN_FRONTEND=noninteractive \
    -t nvcr.io/nvidia/cuda:10.0-devel \
    ./test-cuda.sh

# this will fail with the error mentioned above
docker run \
    -v $(pwd):/opt/test \
    -w /opt/test \
    --env DEBIAN_FRONTEND=noninteractive \
    -t nvcr.io/nvidia/cuda:9.0-devel \
    ./test-cuda.sh

This error does NOT show up in the CUDA 11.2.2 image used in testing.

| | cuda-9.0 | cuda-10.0 | cuda-11.2.2 |
|---|---|---|---|
| clang | 3.8.0-2 | 6.0.0 | 10.0.0 |
| gcc | 5.4.0 | 7.5.0 | 9.3.0 |
| ld | 2.26.1 | 2.30 | 2.34 |
| nvcc | 9.0.176 | 10.0.130 | 11.2.152 |
| result | FAILURE | FAILURE | SUCCESS |
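
For anyone repeating this comparison, the tool versions compared above could be gathered inside each container with something like the sketch below. This is only an illustration: `report_tool_versions` is a hypothetical helper name, and it assumes each tool accepts `--version`, which holds for clang, gcc, ld, and nvcc.

```shell
# Hypothetical helper: print the --version output of each named tool,
# or a "not found" marker when the tool is not on PATH.
report_tool_versions() {
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "== $tool =="
            "$tool" --version 2>&1
        else
            echo "== $tool: not found =="
        fi
    done
}

# Example, run inside a container:
#   report_tool_versions clang gcc ld nvcc
```

Printing the raw `--version` output avoids guessing at each tool's format, at the cost of some manual reading.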

@StrikerRUS
Collaborator Author

Thank you very much for minimizing the repro and posting the issue!
The most annoying thing right now is that we cannot test any workarounds they may suggest to us because of #4165 😬 .

@jameslamb
Collaborator

Yeah agreed :/

Happy to say the CMake maintainers do see this as a regression, and they've already opened a PR with a bugfix to patch it: https://gitlab.kitware.com/cmake/cmake/-/merge_requests/5992.

@StrikerRUS
Collaborator Author

Wow, this is just awesome!

@StrikerRUS
Collaborator Author

The upstream issue was fixed in CMake 3.20.1.
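
Since only 3.20.0 is reported as broken in this thread, a CI check could in principle reject just that release rather than pinning an older one. A minimal sketch under that assumption follows. Both `parse_cmake_version` and `cmake_version_is_affected` are hypothetical helper names, not part of LightGBM's CI scripts.

```shell
# Hypothetical helper: read `cmake --version` output on stdin and print the
# dotted version from the first line, e.g. "cmake version 3.19.5" -> "3.19.5".
# A suffix such as "-rc1" is dropped.
parse_cmake_version() {
    sed -n '1s/^cmake version \([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical helper: exits 0 only for the release reported as broken in
# this thread (3.20.0); 3.20.1 carries the upstream fix.
cmake_version_is_affected() {
    local version="$1"
    [ "$version" = "3.20.0" ]
}

# Example use in a setup script:
#   if cmake_version_is_affected "$(cmake --version | parse_cmake_version)"; then
#       echo "CMake 3.20.0 detected; CUDA+Clang builds are known to fail" >&2
#   fi
```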

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023