enhance config.EnableMKLDNN api for mkldnn cache clear strategy #18549
Conversation
bool AnalysisPredictor::Run(const std::vector<PaddleTensor> &inputs,
                            std::vector<PaddleTensor> *output_data,
                            int batch_size) {
  paddle::platform::SetNumThreads(config_.cpu_math_library_num_threads());
#ifdef PADDLE_WITH_MKLDNN
  if (config_.use_mkldnn_) MkldnnPreRun(inputs);
#endif
Compared with #18372, the reason for not using MkldnnPostRun is: if the mkldnn_session_id is reset to 0, the unit test dev_ctx->GetShapeBlobSize() cannot get the correct shape_blob size.
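For illustration, a hypothetical MkldnnPostRun in the spirit of #18372 might look like the sketch below; the function body and the kMKLDNNSessionID_Default name are assumptions, not code from this PR:

// Hypothetical sketch, NOT part of this PR.
void AnalysisPredictor::MkldnnPostRun() {
  // Resetting the thread-local session id back to the default (0) would
  // make a later dev_ctx->GetShapeBlobSize() look up the default
  // session's blob map instead of the one populated during Run(), so the
  // unit test would read the wrong shape_blob size.
  platform::set_cur_mkldnn_session_id(platform::kMKLDNNSessionID_Default);
}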
There may be a corner case when threads are reused via a pool: in the last execution, instance X sets config.mkldnn_input_shape_cache_capacity_ > 0, so thread A sets the thread-local cache capacity, and that variable is not cleared after execution. When thread A is later reused by another instance B with config.mkldnn_input_shape_cache_capacity_ = 0, it will hit the wrong branch.
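A self-contained sketch of that corner case (cur_capacity stands in for the real thread-local state in paddle::platform; all names here are illustrative):

#include <iostream>

thread_local int cur_capacity = 0;  // survives across tasks on one pooled thread

void RunOnPooledThread(const char *instance, int configured_capacity) {
  if (configured_capacity > 0) cur_capacity = configured_capacity;
  // No PostRun-style reset happens here, so the value leaks to the next task.
  std::cout << instance << " sees capacity " << cur_capacity << "\n";
}

int main() {
  RunOnPooledThread("instance X", 10);  // prints 10, as configured
  RunOnPooledThread("instance B", 0);   // prints 10 -- stale value, wrong branch
  return 0;
}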
So I suggest doing something like the following in MkldnnPreRun:
if (config_.mkldnn_input_shape_cache_capacity_ > 0) {
VLOG(2) << "In mkldnn cache clear mode.";
platform::set_cur_mkldnn_session_id(
platform::kMKLDNNSessionID_CacheClearing);
platform::set_cur_input_shape_cache_capacity(
config_.mkldnn_input_shape_cache_capacity_);
}
// Set current_input_shape.
std::stringstream ss;
for (size_t i = 0; i < inputs.size(); ++i) {
for (size_t j = 0; j < inputs[i].shape.size(); ++j) {
ss << inputs[i].shape[j] << "-";
}
}
VLOG(2) << "Set input shape=" << ss.str();
platform::set_cur_input_shape_str(ss.str());
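As a concrete example of the key this loop builds, the standalone snippet below (the shapes are made up) prints the string that set_cur_input_shape_str would receive:

#include <iostream>
#include <sstream>
#include <vector>

int main() {
  // Mirrors the key-building loop in MkldnnPreRun above.
  std::vector<std::vector<int>> shapes = {{1, 3, 224, 224}, {1, 10}};
  std::stringstream ss;
  for (size_t i = 0; i < shapes.size(); ++i)
    for (size_t j = 0; j < shapes[i].size(); ++j) ss << shapes[i][j] << "-";
  std::cout << ss.str() << std::endl;  // prints "1-3-224-224-1-10-"
  return 0;
}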
@@ -462,7 +462,8 @@ void MKLDNNDeviceContext::SetBlob(const std::string& name,
     if (key_it == sBlob->end()) {
       // In cache clearing mode, cur_input_shape_cache_capacity defines
       // max pblob capacity
-      if ((sid == kMKLDNNSessionID_CacheClearing) &&
+      if ((static_cast<size_t>(sid) == kMKLDNNSessionID_CacheClearing) &&
           sBlob->size() &&
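To make the guarded branch concrete, here is a simplified sketch of the surrounding SetBlob logic; the map layout and the eviction line are assumptions inferred from the comment in the hunk, not verbatim source:

// sBlob is assumed to map input-shape string -> per-shape blob map.
auto key_it = sBlob->find(cur_input_shape_str);
if (key_it == sBlob->end()) {
  // In cache clearing mode, cur_input_shape_cache_capacity is the max
  // number of per-shape entries kept alive at once.
  if ((static_cast<size_t>(sid) == kMKLDNNSessionID_CacheClearing) &&
      sBlob->size() &&  // never evict from an empty map
      (sBlob->size() >=
       static_cast<size_t>(cur_input_shape_cache_capacity))) {
    sBlob->erase(sBlob->begin()->first);  // assumed: drop one stale shape entry
  }
}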
Enhance it for the case cur_input_shape_cache_capacity = 1 with sBlob.size() == 0: the added sBlob->size() check keeps an empty blob map from triggering eviction.
…#18532)
* Fix Mask RCNN predictor:
  1. refine the memory optimization algorithm to support models with the block op
  2. output diff: modify the affine channel fuse
  3. add condition_block_infer op
* add an interface for setting the TRT calib table dir
  test=develop
* add the missing files
  test=develop
test=develop
@LeoZhao-Intel @jczaja Please review!
…ddle into luotao1-enable_mkldnn_enhance
paddle/fluid/pybind/inference_api.cc
@@ -250,7 +250,8 @@ void BindAnalysisConfig(py::module *m) {
      .def("tensorrt_engine_enabled", &AnalysisConfig::tensorrt_engine_enabled)
      .def("switch_ir_debug", &AnalysisConfig::SwitchIrDebug,
           py::arg("x") = true)
-     .def("enable_mkldnn", &AnalysisConfig::EnableMKLDNN)
+     .def("enable_mkldnn", &AnalysisConfig::EnableMKLDNN,
+          py::arg("mkldnn_input_shape_cache_capacity") = 0)
      .def("mkldnn_enabled", &AnalysisConfig::mkldnn_enabled)
      .def("set_cpu_math_library_num_threads",
           &AnalysisConfig::SetCpuMathLibraryNumThreads)
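On the C++ side, the enhanced API from this PR could be used as sketched below; the model path is a placeholder and the surrounding calls are standard inference-API usage, not code from this diff:

#include "paddle/fluid/inference/api/paddle_inference_api.h"

int main() {
  // Minimal sketch, assuming the new default argument from this PR.
  paddle::AnalysisConfig config;
  config.SetModel("./model");  // placeholder path
  // A capacity > 0 switches MKL-DNN into the cache-clear strategy.
  config.EnableMKLDNN(/*mkldnn_input_shape_cache_capacity=*/10);
  auto predictor = paddle::CreatePaddlePredictor(config);
  return 0;
}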
There may be another failure in CI; see my PR #18081: https://github.com/PaddlePaddle/Paddle/pull/18081/files?file-filters%5B%5D=.py#diff-876ea1bc109973488c161a657f79812fR74. But it may be fixed in your PR.
guofei02 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. Have you already signed the CLA but the status is still pending? Let us recheck it.
EnableMKLDNN(int mkldnn_input_shape_cache_capacity = 0)
Add TEST(Analyzer_MM_DNN, mkldnn_cache_clear) with the enhanced API, and compare outputs between the no-cache strategy and the cache-clear strategy.
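A minimal sketch of that comparison idea (SetConfig, SetInput, and CompareResult are assumed helpers in the analyzer-tester style, not necessarily the merged test code):

// Hypothetical sketch of TEST(Analyzer_MM_DNN, mkldnn_cache_clear).
TEST(Analyzer_MM_DNN, mkldnn_cache_clear) {
  AnalysisConfig config;
  SetConfig(&config);                   // assumed helper
  std::vector<PaddleTensor> inputs, out_no_cache, out_cache;
  SetInput(&inputs);                    // assumed helper

  config.EnableMKLDNN();                // capacity = 0: no cache clearing
  auto p1 = CreatePaddlePredictor(config);
  p1->Run(inputs, &out_no_cache);

  config.EnableMKLDNN(/*mkldnn_input_shape_cache_capacity=*/1);
  auto p2 = CreatePaddlePredictor(config);
  p2->Run(inputs, &out_cache);

  // Outputs must match regardless of the caching strategy.
  CompareResult(out_cache, out_no_cache);  // assumed helper
}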