
[TMVA/CMake] Fix CMake setup for DNN CUDA implementation #283

Closed
wants to merge 1 commit into from

Conversation

stwunsch
Contributor

  • Set the minimum CUDA version to 8.0 (it doesn't build otherwise)
  • Fix the include path of the CUDA-dependent library dnn_cuda (the previously used path ROOT_INCLUDE_DIRS is not in local scope); see the sketch below
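
A minimal sketch of what the two changes amount to (the find_package call and the exact include path are assumptions for illustration; only CUDA_FOUND, dnn_cuda, and DNN_CUDA_FILES appear verbatim in the snippets below):

find_package(CUDA 8.0)  # assumed location of the version bump; older toolkits fail to build

if(CUDA_FOUND)
  # ROOT_INCLUDE_DIRS is a global property, not a variable in this scope,
  # so pass a path that is valid locally instead.
  CUDA_INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR}/inc)
  CUDA_ADD_LIBRARY(dnn_cuda ${DNN_CUDA_FILES})
endif()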

@stwunsch
Contributor Author

stwunsch commented Oct 21, 2016

It builds fine on my machine with this CMake setup (note that CUDA 8.0 supports GCC only up to 5.3):

cmake .. \
-Dalien=OFF -Dcuda=ON \
-DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-8.0 \
-DCMAKE_CXX_COMPILER=/usr/bin/g++-4.9 \
-DCMAKE_C_COMPILER=/usr/bin/gcc-4.9

And here is a test Python file:

#!/usr/bin/env python

from ROOT import TMVA, TFile, TTree, TCut
from subprocess import call
from os.path import isfile

# Setup TMVA
TMVA.Tools.Instance()

output = TFile.Open('TMVA.root', 'RECREATE')
factory = TMVA.Factory('TMVAClassification', output,
        '!V:!Silent:Color:DrawProgressBar:Transformations=D,G:AnalysisType=Classification')

# Load data
if not isfile('tmva_class_example.root'):
    call(['curl', '-O', 'http://root.cern.ch/files/tmva_class_example.root'])

data = TFile.Open('tmva_class_example.root')
signal = data.Get('TreeS')
background = data.Get('TreeB')

dataloader = TMVA.DataLoader('dataset')
for branch in signal.GetListOfBranches():
    dataloader.AddVariable(branch.GetName())

dataloader.AddSignalTree(signal, 1.0)
dataloader.AddBackgroundTree(background, 1.0)
dataloader.PrepareTrainingAndTestTree(TCut(''),
        'TrainTestSplit_Signal=0.8:TrainTestSplit_Background=0.8:SplitMode=Random:NormMode=NumEvents:!V')

# Book methods
factory.BookMethod(dataloader, TMVA.Types.kDNN, 'DNN',
        'H:!V:VarTransform=D,G:ErrorStrategy=CROSSENTROPY:WeightInitialization=XAVIERUNIFORM:Architecture=GPU:Layout=TANH|64,TANH|64,TANH|64,LINEAR:TrainingStrategy=LearningRate=0.01,Repetitions=1,ConvergenceSteps=20,BatchSize=32,TestRepetitions=1')

# Run training, test and evaluation
factory.TrainAllMethods()
factory.TestAllMethods()
factory.EvaluateAllMethods()

@stwunsch
Contributor Author

stwunsch commented Oct 21, 2016

@simonpf How was it possible on your machine to build the CUDA DNN method with the include path of the library dnn_cuda set to ROOT_INCLUDE_DIRS, which is not populated by default? On my machines (Ubuntu 16.04 and Arch) it fails because the headers are not found.

@simonpf
Contributor

simonpf commented Oct 28, 2016

Hi Stefan,
You're right about ROOT_INCLUDE_DIRS. However, the CUDA include directories are only appended to the standard include directories, so the include directories should be set properly anyway. At least they are on my machine, as well as on the techlab machines. Or is this just a weird coincidence?
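
(For reference, FindCUDA picks up the directory-level include paths itself; the following is paraphrased from the module, not verbatim:)

# Sketch of what FindCUDA does internally: the directory's own
# include_directories() settings are collected and handed to nvcc,
# with user- and toolkit-supplied CUDA includes appended on top.
get_directory_property(CUDA_NVCC_INCLUDE_DIRECTORIES INCLUDE_DIRECTORIES)
foreach(dir ${CUDA_NVCC_INCLUDE_DIRECTORIES})
  list(APPEND CUDA_NVCC_INCLUDE_ARGS -I${dir})
endforeach()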

Concerning CUDA 8.0: I am using CUDA 7.5 on my machine (Ubuntu 15.10) and on the techlab machine (Scientific Linux CERN SLC release 6.5), so I don't think requiring CUDA >= 8.0 is a good idea.

@stwunsch
Contributor Author

Regarding the include dirs: probably it's a threading problem? It fails to find header files, so the path is not properly populated on my machine(s). Nevertheless, setting the path directly solves it.

I'm running Ubuntu 16.04 LTS and it fails with CUDA 7.5, but it runs fine with 8.0. It is true that it should be possible to compile it on CERN SLC, so I'll keep investigating what is going wrong on my machine. Still, it should also run on Ubuntu 16.04, because it's used frequently (and will be for the next years, since it's an LTS release).

@simonpf
Contributor

simonpf commented Oct 29, 2016

Could you maybe run make VERBOSE=1 and post the build command for the CUDA object files? Then we can see whether the ROOT include directories are there or not.

For CUDA 8.0: Did you try whether adding the nvcc flag -DCUDA_NVCC_FLAGS="-D_FORCE_INLINES" or something along those lines (see BVLC/caffe#4046) solves the problem?
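
Something like this, as a sketch (paths and options as in your configs above):

cmake .. -Dcuda=ON -DCUDA_NVCC_FLAGS="-D_FORCE_INLINES"
make VERBOSE=1 2>&1 | grep nvcc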

@stwunsch
Contributor Author

Regarding the include problem:

Run with these CMake/make commands on the ROOT master branch:

cmake .. \
-Dalien=OFF -Dcuda=ON \
-DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-8.0 \
-DCMAKE_CXX_COMPILER=/usr/bin/g++-4.9 \
-DCMAKE_C_COMPILER=/usr/bin/gcc-4.9

make -j10 VERBOSE=1 # Multi-thread!

Here is the error:

/usr/users/wunsch/root_master/tmva/tmva/src/DNN/Architectures/Cuda/CudaMatrix.cu:16:52: fatal error: TMVA/DNN/Architectures/Cuda/CudaMatrix.h: No such file or directory
 #include "TMVA/DNN/Architectures/Cuda/CudaMatrix.h"
                                                    ^
compilation terminated.
CMake Error at dnn_cuda_generated_CudaMatrix.cu.o.cmake:209 (message):
  Error generating
  /usr/users/wunsch/root_master/build/tmva/tmva/CMakeFiles/dnn_cuda.dir/src/DNN/Architectures/Cuda/./dnn_cuda_generated_CudaMatrix.cu.o


make[2]: *** [tmva/tmva/CMakeFiles/dnn_cuda.dir/src/DNN/Architectures/Cuda/dnn_cuda_generated_CudaMatrix.cu.o] Error 1
make[1]: *** [tmva/tmva/CMakeFiles/dnn_cuda.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [all] Error 2

And now here is the verbose make output (stdout grepped for 'nvcc'):

/usr/local/cuda-8.0/bin/nvcc -M -D__CUDACC__ /usr/users/wunsch/root_master/tmva/tmva/src/DNN/Architectures/Cuda/CudaMatrix.cu -o /usr/users/wunsch/root_master/build/tmva/tmva/CMakeFiles/dnn_cuda.dir/src/DNN/Architectures/Cuda/dnn_cuda_generated_CudaMatrix.cu.o.NVCC-depend -ccbin /usr/bin/gcc-4.9 -m64 --std c++11 -Xcompiler ,\"-pipe\",\"-m64\",\"-Wshadow\",\"-Wall\",\"-W\",\"-Woverloaded-virtual\",\"-fsigned-char\",\"-fPIC\",\"-pthread\",\"-O2\",\"-g\",\"-DNDEBUG\" -DNVCC -I/usr/local/cuda-8.0/include -I/usr/users/wunsch/root_master/build/include -I/usr/local/cuda-8.0/include

Now, if I run it with only one thread (make -j1 VERBOSE=1), everything runs fine (but the verbose output has not changed):

/usr/local/cuda-8.0/bin/nvcc -M -D__CUDACC__ /usr/users/wunsch/root_master/tmva/tmva/src/DNN/Architectures/Cuda/CudaMatrix.cu -o /usr/users/wunsch/root_master/build/tmva/tmva/CMakeFiles/dnn_cuda.dir/src/DNN/Architectures/Cuda/dnn_cuda_generated_CudaMatrix.cu.o.NVCC-depend -ccbin /usr/bin/gcc-4.9 -m64 --std c++11 -Xcompiler ,\"-pipe\",\"-m64\",\"-Wshadow\",\"-Wall\",\"-W\",\"-Woverloaded-virtual\",\"-fsigned-char\",\"-fPIC\",\"-pthread\",\"-O2\",\"-g\",\"-DNDEBUG\" -DNVCC -I/usr/local/cuda-8.0/include -I/usr/users/wunsch/root_master/build/include -I/usr/local/cuda-8.0/include

So, during the generation of the object file for CudaMatrix.cu, the include dir points only to the build include directory, but I can't find a copy/install target in CMake that copies CudaMatrix.h to that directory. Still, it seems to be a threading problem that occurs only for the CUDA library: with -j10 the compilation apparently starts before the headers have been copied to build/include, while with -j1 they are already in place.


Regarding building the method with CUDA 7.5 on Ubuntu 16.04:

Here is the CMake config:

cmake .. \
-Dalien=OFF -Dcuda=ON \
-DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-7.5 \
-DCMAKE_CXX_COMPILER=/usr/bin/g++-4.9 \
-DCMAKE_C_COMPILER=/usr/bin/gcc-4.9

The error message:

/usr/include/string.h: In function ‘void* __mempcpy_inline(void*, const void*, size_t)’:
/usr/include/string.h:652:42: error: ‘memcpy’ was not declared in this scope
   return (char *) memcpy (__dest, __src, __n) + __n;

And yes, it is fixed if you append -DCUDA_NVCC_FLAGS="-D_FORCE_INLINES" to the CMake call! CUDA 8.0 does not break with this flag either. So we could append this flag permanently, though I don't like this idea because it clutters the build system. On the other hand, every Ubuntu 16.04 LTS user will face this problem for the next years.
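
If we were to make it permanent, a sketch guarded by toolkit version could look like this (CUDA_VERSION is set by FindCUDA; taking 8.0 as the cutoff is an assumption):

# Append the glibc workaround only for toolkits that need it
# (CUDA 7.5 with the newer glibc shipped in Ubuntu 16.04).
if(CUDA_VERSION VERSION_LESS 8.0)
  list(APPEND CUDA_NVCC_FLAGS "-D_FORCE_INLINES")
endif()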

@stwunsch stwunsch changed the title [TMVA/CMake] Fix CMake setup for DNN CUDA implementation [TMVA/CMake][WIP] Fix CMake setup for DNN CUDA implementation Oct 30, 2016
@simonpf
Contributor

simonpf commented Oct 31, 2016

The relevant target is probably move_headers, defined here. Adding it as a dependency of dnn_cuda could solve the problem. I can have a look at this on Wednesday.

Concerning CUDA 7.5 and Ubuntu 16.04: I see this rather as a problem with the user's CUDA setup than as a problem of ROOT, considering that this is a common problem for users of Caffe, OpenCV, and Torch as well.

@stwunsch
Contributor Author

Unfortunately the error occurs with the default installation on Ubuntu 16.04. But still, I think you are right and it's not ROOT's problem at all.

@stwunsch
Contributor Author

stwunsch commented Nov 2, 2016

Yep, adding move_headers as a dependency of dnn_cuda solves the problem. Here is the code:

#---Handle CUDA dependent code. -----------------
if(CUDA_FOUND)
  CUDA_INCLUDE_DIRECTORIES(${ROOT_INCLUDE_DIRS})
  CUDA_ADD_LIBRARY(dnn_cuda ${DNN_CUDA_FILES})
  # Wait for the headers to be copied into build/include before nvcc runs.
  add_dependencies(dnn_cuda move_headers)
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DDNNCUDA")
  set(DNN_CUDA_LIBRARIES dnn_cuda ${CUDA_CUBLAS_LIBRARIES})
else()
  set(DNN_CUDA_LIBRARIES)
endif()

This solution seems pretty hacky to me; shouldn't a library be built against the headers in the source directory? Actually, I think I'm missing something in the build system.

@simonpf
Contributor

simonpf commented Nov 3, 2016

OK, great. Now, if you could also remove CUDA_INCLUDE_DIRECTORIES, the bug is fixed and the build system is a bit cleaner.

I am unsure about the hackiness of the solution; as I understand it, this is the way it is done in ROOT.

@stwunsch
Contributor Author

stwunsch commented Nov 3, 2016

Following the line

set_property(GLOBAL APPEND PROPERTY ROOT_INCLUDE_DIRS ${CMAKE_CURRENT_SOURCE_DIR}/inc)

the ROOT include directories point to the source directory, but now we are waiting for the move_headers target so that we can point to the binary include dir in build/.
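
(For clarity, since it is a global property and not a variable, it also has to be read back explicitly before it can be used, along these lines:)

# A plain ${ROOT_INCLUDE_DIRS} in another directory's scope expands to
# nothing; the property must first be fetched into a local variable.
get_property(incdirs GLOBAL PROPERTY ROOT_INCLUDE_DIRS)
CUDA_INCLUDE_DIRECTORIES(${incdirs})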

Nevertheless, I'll close this PR for now and ask CMake guru Pere for advice about the include directories. I think we've figured out the problems!

@stwunsch stwunsch changed the title [TMVA/CMake][WIP] Fix CMake setup for DNN CUDA implementation [TMVA/CMake] Fix CMake setup for DNN CUDA implementation Nov 3, 2016
@stwunsch
Contributor Author

stwunsch commented Nov 3, 2016

I talked to Pere; the move_headers dependency is indeed the correct way to depend on a custom target. Normally, the dependency is hidden inside ROOT_LINKER_LIBRARY, so adding it directly to the CUDA-dependent target mirrors that function (which can't be used in our case).
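
(Paraphrased, not the verbatim macro, my understanding is that ROOT_LINKER_LIBRARY contains the equivalent of:)

# Every library built through the macro waits for the headers
# to be staged into build/include before compilation starts.
add_dependencies(${library} move_headers)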

@lmoneta From my point of view, this PR is ready to merge!
