update MKLDNN cmakes #3168

Merged
merged 3 commits into from
Aug 3, 2017

Conversation

tensor-tang
Contributor

@tensor-tang tensor-tang commented Aug 2, 2017

fix #3162

  1. move the MKLDNN and MKLML install paths into the build's third-party path
  2. disable both when building docs
  3. disable both on MacOS and Win32, which are not supported yet
  4. stop hard-coding the MKLDNN build

and disable both when building docs and on MacOS
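The platform and doc-build switches listed above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `WITH_MKLML`, `WITH_DOC`, `THIRD_PARTY_PATH`, and the two `*_INSTALL_DIR` variables are assumed names chosen for the example.

```cmake
# Hypothetical sketch; only WITH_MKLDNN is taken from the conversation,
# every other variable name here is an assumption for illustration.
cmake_minimum_required(VERSION 3.5)
project(mkldnn_options_sketch NONE)

option(WITH_MKLDNN "Compile PaddlePaddle with mkl-dnn support." OFF)
option(WITH_MKLML  "Compile PaddlePaddle with the mklml package." OFF)
option(WITH_DOC    "Build the documentation." OFF)

# Force both features off on unsupported platforms and for doc-only builds.
if(APPLE OR WIN32 OR WITH_DOC)
  if(WITH_MKLDNN OR WITH_MKLML)
    message(WARNING "MKLDNN/MKLML disabled: unsupported platform or doc build.")
  endif()
  set(WITH_MKLDNN OFF CACHE BOOL "Disabled on this configuration" FORCE)
  set(WITH_MKLML  OFF CACHE BOOL "Disabled on this configuration" FORCE)
endif()

# Keep third-party installs inside the build tree instead of a system path.
set(THIRD_PARTY_PATH "${CMAKE_BINARY_DIR}/third_party")
set(MKLDNN_INSTALL_DIR "${THIRD_PARTY_PATH}/install/mkldnn")
set(MKLML_INSTALL_DIR  "${THIRD_PARTY_PATH}/install/mklml")

message(STATUS "WITH_MKLDNN=${WITH_MKLDNN} WITH_MKLML=${WITH_MKLML}")
```

Using `set(... CACHE BOOL ... FORCE)` overrides a value the user passed with `-D`, which is why the sketch prints a warning first.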
@luotao1
Contributor

luotao1 commented Aug 2, 2017

I tested the changes on my machine, but got the following error:

/home/luotao02/Paddle/third_party/mkldnn/src/extern_mkldnn/src/cpu/gemm_convolution.cpp:26:23: fatal error: mkl_cblas.h: No such file or directory
 #include "mkl_cblas.h"
                       ^
compilation terminated.

The full log is log.txt

Also, you forgot to change the CMakeLists: use ${AVX_FOUND} to set the default ON/OFF of mkldnn and mklml.

option(WITH_MKLDNN      "Compile PaddlePaddle with mkl-dnn support."    ${AVX_FOUND})
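As a rough sketch of that suggestion, the defaults of both options can follow a detection result. `AVX_FOUND` is assumed to be set by an earlier SIMD-detection step that is not shown here, and `WITH_MKLML` is an assumed option name.

```cmake
# Hypothetical sketch; AVX_FOUND is assumed to come from an earlier
# SIMD-detection step, and WITH_MKLML is an assumed option name.
cmake_minimum_required(VERSION 3.5)
project(avx_default_sketch NONE)

# Fall back to OFF when no detection step has run.
if(NOT DEFINED AVX_FOUND)
  set(AVX_FOUND OFF)
endif()

# The third argument of option() is only the initial default value.
option(WITH_MKLDNN "Compile PaddlePaddle with mkl-dnn support." ${AVX_FOUND})
option(WITH_MKLML  "Compile PaddlePaddle with the mklml package." ${AVX_FOUND})

message(STATUS "AVX_FOUND=${AVX_FOUND} WITH_MKLDNN=${WITH_MKLDNN} WITH_MKLML=${WITH_MKLML}")
```

Note that `option()` does not overwrite a value already stored in CMakeCache.txt, so a changed default only takes effect in a fresh build directory or when the value is passed explicitly with `-D`.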

@tensor-tang
Contributor Author

tensor-tang commented Aug 2, 2017

Also, you forgot to change the CMakeLists: use ${AVX_FOUND} to set the default ON/OFF of mkldnn and mklml.

Actually, I did not forget about that. I thought you would prefer default OFF just like last time.
I can use ${AVX_FOUND} instead, will update later. Anyway, thanks for your reminder.

As for the error, I will check it ASAP; it has never shown up on my machine before.

@luotao1
Contributor

luotao1 commented Aug 2, 2017

The CMake command result is CMakeCache.txt

@tensor-tang
Contributor Author

The CMakeCache.txt makes sense to me.
Maybe we need to check your local environment carefully: some environment variables must be loaded before using MKLDNN, and that may be what causes the missing files.

@tensor-tang
Contributor Author

The last commit failed on TeamCity with a CUDA error: out of memory.

The following tests FAILED:
[20:57:53] : [Step 1/1] 34 - test_matrixCompare (OTHER_FAULT)
[20:57:53]W: [Step 1/1] Errors while running CTest

This failure was not actually caused by this commit.

@luotao1
Contributor

luotao1 commented Aug 3, 2017

@helinwang test_matrixCompare, test_NetworkCompare and test_CompareSparse fail occasionally; maybe we should refine the unit tests to reduce their memory usage.

@tensor-tang
Contributor Author

It still failed on GPU; this time the problem is accuracy.
Please help to reset the unit test, thanks~

The test that failed last time now passes:

test_matrixCompare .................. Passed 259.42 sec

But test_LayerGrad failed its accuracy check:

The following tests FAILED:
[03:48:30] : [Step 1/1] 46 - test_LayerGrad (Failed)

layer_type=recurrent useGpu=1
[03:44:52] : [Step 1/1] I8030 30:41:55.718948 27745 LayerGradUtil.cpp:703] cost 65.9891
[03:44:52] : [Step 1/1] I8030 30:41:55.758448 27745 LayerGradUtil.cpp:43] recurrent layer_0 step=1e-06 cost1=66.3733 cost2=65.6477 true_delta=0.725571 analytic_delta=0.756912 diff=-0.0414063 ***
[03:44:52] : [Step 1/1] /paddle/paddle/gserver/tests/LayerGradUtil.cpp:752: Failure
[03:44:52] : [Step 1/1] Expected: (fabs(maxDiff)) <= (epsilon), actual: 0.0414063 vs 0.02
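The numbers in the log above are consistent with a relative-difference check between the finite-difference and analytic results. The formula below is inferred from the logged values, not quoted from LayerGradUtil.cpp, so treat it as an assumption:

```latex
\begin{align*}
\text{true\_delta} &= \text{cost1} - \text{cost2} = 66.3733 - 65.6477 \approx 0.7256 \\
\text{diff} &= \frac{\text{true\_delta}}{\text{analytic\_delta}} - 1
             = \frac{0.725571}{0.756912} - 1 \approx -0.0414 \\
|\text{diff}| &\approx 0.0414 > \epsilon = 0.02
\end{align*}
```

Under that reading, the recurrent layer's gradient check misses the 0.02 tolerance by roughly a factor of two.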

@helinwang
Contributor

@luotao1 @tensor-tang Sorry about the out of memory problem, I will take a look.

Contributor

@luotao1 luotao1 left a comment

LGTM.
After creating a clean /build directory, I can compile and test successfully.

@luotao1 luotao1 merged commit ca39600 into PaddlePaddle:develop Aug 3, 2017
Successfully merging this pull request may close these issues.

Installation path and build_doc problems of Mkldnn
3 participants