enable operator gpu unittest #3050

QiJune · 2017-07-25T07:40:35Z

No description provided.

QiJune · 2017-08-01T08:08:03Z

paddle/operators/add_op.cu

@@ -1,3 +1,4 @@
+#define EIGEN_USE_GPU


This macro has to be added; otherwise the code causes a segmentation fault.

QiJune · 2017-08-01T08:09:57Z

paddle/pybind/tensor_bind.h

  std::memcpy(dst, array.data(), sizeof(T) * array.size());
 }

+#ifndef PADDLE_ONLY_CPU
+template <typename T>
+void PyCUDATensorSetFromArray(


Partial specialization of a function template is not allowed, so I just define another function here.
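To illustrate the constraint behind this comment: C++ permits partial specialization of class templates but not of function templates, so giving the second code path its own function name is one common workaround. The sketch below is a minimal illustration under stated assumptions. `Buffer`, `SetFromArray`, and `CudaSetFromArray` are hypothetical stand-ins, not the code in tensor_bind.h, and the device copy is simulated on the host.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a tensor buffer; not Paddle's Tensor class.
struct Buffer {
  std::vector<unsigned char> bytes;
  std::string place;  // records which code path filled the buffer
};

// Host path: copy the array contents with a plain memcpy.
template <typename T>
void SetFromArray(Buffer* dst, const std::vector<T>& src) {
  dst->bytes.resize(sizeof(T) * src.size());
  if (!src.empty()) {
    std::memcpy(dst->bytes.data(), src.data(), dst->bytes.size());
  }
  dst->place = "cpu";
}

// Writing something like
//   template <typename T> void SetFromArray<T*>(...);
// would be a partial specialization of a function template, which C++
// rejects. The device path therefore gets a separate name.
// Assumption: a real version would copy to device memory here; this
// sketch copies on the host and only tags the buffer as "gpu".
template <typename T>
void CudaSetFromArray(Buffer* dst, const std::vector<T>& src) {
  dst->bytes.resize(sizeof(T) * src.size());
  if (!src.empty()) {
    std::memcpy(dst->bytes.data(), src.data(), dst->bytes.size());
  }
  dst->place = "gpu";
}
```

Overloading on a tag type or dispatching through a class template with a static member would be alternative designs; a separate function is simply the smallest change.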

QiJune · 2017-08-01T08:10:29Z

python/paddle/v2/framework/tests/op_test_util.py

+            places = []
+            places.append(core.CPUPlace())
+            if core.is_compile_gpu():
+                places.append(core.GPUPlace(0))


Run the CPU OpKernel first, and then the GPU OpKernel.

QiJune · 2017-08-01T08:11:19Z

python/paddle/v2/framework/tests/test_add_two_op.py

@@ -8,8 +8,8 @@ class TestAddOp(unittest.TestCase):

    def setUp(self):
        self.type = "add_two"
-        self.X = numpy.random.random((342, 345)).astype("float32")
-        self.Y = numpy.random.random((342, 345)).astype("float32")
+        self.X = numpy.random.random((102, 105)).astype("float32")


Sometimes there is not enough GPU memory for the unit test, so reduce the input size.
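For a rough sense of the saving, the arithmetic can be sketched as follows. This assumes each input is a dense float32 matrix at 4 bytes per element and ignores any allocator overhead. `Float32Bytes` is a hypothetical helper, not part of the test code.

```cpp
#include <cstddef>

// Assumption: float32 elements occupy exactly 4 bytes each.
constexpr std::size_t kFloat32Size = 4;

// Bytes needed for one dense float32 matrix of the given shape.
constexpr std::size_t Float32Bytes(std::size_t rows, std::size_t cols) {
  return rows * cols * kFloat32Size;
}

// Shapes taken from the diff above.
constexpr std::size_t kOldBytes = Float32Bytes(342, 345);  // 471960 bytes
constexpr std::size_t kNewBytes = Float32Bytes(102, 105);  // 42840 bytes

static_assert(kOldBytes == 471960, "old input size per tensor");
static_assert(kNewBytes == 42840, "new input size per tensor");
```

Each input tensor shrinks by roughly a factor of 11 under these assumptions.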

QiJune · 2017-08-01T08:12:41Z

cmake/flags.cmake

+        # TODO(qijun) gcc 4.9 or later versions raise SEGV due to the optimization problem.
+        # Use Debug mode instead for now.
+        if(CMAKE_CXX_COMPILER_VERSION VERSION_GREATER 4.9 OR CMAKE_CXX_COMPILER_VERSION VERSION_EQUAL 4.9) 
+            set(CMAKE_BUILD_TYPE "Debug" CACHE STRING "" FORCE)


With gcc 4.8 and nvcc 8.0, both Debug and Release builds work.
With gcc 5.4 and nvcc 8.0, the Debug build works, but the Release build causes a segmentation fault.

How about we constrain the GCC version in the Dockerfile, rather than checking here?

I know that checking in CMake provides a guarantee, but I am afraid that adding too many such constraints would complicate our build system too much.

Yes, I will constrain the GCC version in the Dockerfile.
But others may compile Paddle in a gcc 5.4 environment, in which case the unit test test_framework will fail.

gangliao · 2017-08-01T08:50:07Z

paddle/framework/detail/tensor-inl.h

@@ -62,9 +61,11 @@ inline T* Tensor::mutable_data(platform::Place place) {
    if (platform::is_cpu_place(place)) {
      holder_.reset(new PlaceholderImpl<T, platform::CPUPlace>(
          boost::get<platform::CPUPlace>(place), size));
+    } else if (platform::is_gpu_place(place)) {
+#ifdef PADDLE_ONLY_CPU
+      PADDLE_THROW("'GPUPlace' is not supported in CPU only device.");
    }


The { is outside the macro, so the } is needed inside the macro.
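The brace placement being discussed can be sketched as below. This is a minimal illustration, not the code in tensor-inl.h. `SKETCH_ONLY_CPU` is a hypothetical switch standing in for PADDLE_ONLY_CPU, and `Allocate` is an invented function that returns a string instead of allocating memory.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical switch standing in for PADDLE_ONLY_CPU. It is defined here
// so that the sketch compiles the CPU-only branch.
#define SKETCH_ONLY_CPU

inline std::string Allocate(bool on_gpu) {
  if (!on_gpu) {
    return "cpu buffer";
  } else {  // this brace is opened outside the conditional block
#ifdef SKETCH_ONLY_CPU
    throw std::runtime_error("'GPUPlace' is not supported in CPU only device.");
  }  // so it is closed inside the #ifdef branch
#else
    return "gpu buffer";
  }  // and closed again inside the #else branch
#endif
}
```

Whichever branch the preprocessor keeps, the braces stay balanced. Moving the closing brace after `#endif` would be an alternative that avoids writing it twice.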

gangliao

LGTM

QiJune added 11 commits July 25, 2017 15:39

enable operator gpu unittest

e2ba133

set default cpu place for tensor alloc

d510913

fix build error

aa5ca8a

make gpu_context inside macro

ff594fa

fix gpu build error

a71a9e6

Merge remote-tracking branch 'baidu/develop' into op_gpu_test

2ddef13

fix gpu build error

358261f

fix bug in register gpu OpKernel

4ecf68e

merge baidu/develop

4cc4217

fix build error

47d8bca

add gpu python op test

4a1f7bd

QiJune force-pushed the op_gpu_test branch 2 times, most recently from 2aae619 to 4a1f7bd Compare July 31, 2017 09:42

QiJune added 5 commits July 31, 2017 17:45

add EIGEN_USE_GPU macro to op.cu file

61f94f0

reduce gpu memory allocation in op_test

cf5ac58

remove unused codes

db4d668

pass precommit

bc7be2a

add cmake patch for gcc version larger than 4.9

edb5729

QiJune changed the title ~~[WIP]enable operator gpu unittest~~ enable operator gpu unittest Aug 1, 2017

QiJune requested review from reyoung, wangkuiyi, gangliao, hedaoyuan and jacquesqiao August 1, 2017 07:07

QiJune commented Aug 1, 2017


gangliao reviewed Aug 1, 2017


wangkuiyi approved these changes Aug 2, 2017


merge baidu/develop

81cc7a3

gangliao approved these changes Aug 2, 2017


QiJune added 2 commits August 2, 2017 16:22

Merge remote-tracking branch 'baidu/develop' into op_gpu_test

341d188

pass pre commit

043e983

QiJune merged commit 6824c09 into PaddlePaddle:develop Aug 2, 2017

heavengate pushed a commit to heavengate/Paddle that referenced this pull request Aug 16, 2021

fix_test_nms (PaddlePaddle#3050)

9db4a31
