
create new api to indicate detect thread usage #18081

Closed
wants to merge 9 commits

Conversation

LeoZhao-Intel
Contributor

LeoZhao-Intel commented Jun 13, 2019

This PR solves two issues:

  1. When Predictor.run is called from a changing thread, it causes a memory leak because the thread id is inserted into the cache key, so the key is different on every call.
  2. For detection models, input dims are dynamic rather than fixed, which causes conv/pool/concat MKL-DNN ops to leak memory because the key is different each time.

The fix is to disable the cache in these cases: we extend EnableMKLDNN in AnalysisConfig with a parameter that controls whether the cache is used.

Related: #17611

Contributor

luotao1 left a comment


Please add the description of this PR.

File-level review threads (all outdated and resolved):
- paddle/fluid/platform/device_context.cc
- paddle/fluid/inference/api/paddle_analysis_config.h (two threads)
- paddle/fluid/inference/api/analysis_config.cc
@LeoZhao-Intel
Contributor Author

@jczaja @jianhang-liu


@luotao1
Contributor

luotao1 commented Jun 15, 2019

[19:20:47]	[100%] Linking CXX executable test_analyzer_small_dam
[19:20:48]	../../../../../third_party/install/ngraph/lib/libngraph.so: file not recognized: File truncated
[19:20:48]	collect2: error: ld returned 1 exit status
[19:20:48]	make[2]: *** [paddle/fluid/inference/tests/api/test_analyzer_mobilenet_depthwise_conv] Error 1
[19:20:48]	paddle/fluid/inference/tests/api/CMakeFiles/test_analyzer_mobilenet_depthwise_conv.dir/build.make:641: recipe for target 'paddle/fluid/inference/tests/api/test_analyzer_mobilenet_depthwise_conv' failed
[19:20:48]	make[1]: *** [paddle/fluid/inference/tests/api/CMakeFiles/test_analyzer_mobilenet_depthwise_conv.dir/all] Error 2
[19:20:48]	CMakeFiles/Makefile2:108240: recipe for target 'paddle/fluid/inference/tests/api/CMakeFiles/test_analyzer_mobilenet_depthwise_conv.dir/all' failed
[19:20:48]	make[1]: *** Waiting for unfinished jobs....

http://ci.paddlepaddle.org/viewLog.html?buildId=115069&buildTypeId=Paddle_PrCi&tab=buildLog&_focus=7893

Contributor

jczaja left a comment


These are very core changes. Have you run extensive performance tests to check that inference of our models (the C-API tests of our validation) is unharmed?

@LeoZhao-Intel
Contributor Author

LeoZhao-Intel commented Jun 18, 2019

This is just a workaround (WA) to avoid the memory leak caused by dynamic shapes and by run being called from different threads, and it does harm performance; MKL-DNN cache reuse improves performance considerably.

Perf data with reuse enabled:
(image attached in original comment)

Perf data after disabling the cache:
(image attached in original comment)

@jczaja
Contributor

jczaja commented Jun 18, 2019

@LeoZhao-Intel I just want to make sure that reuse is disabled only in situations where we expect it to be disabled. I will copy this PR branch to the internal repo and run the automatic tests; results should be available tonight.

@LeoZhao-Intel
Contributor Author

LeoZhao-Intel commented Jun 18, 2019

It should be able to pass the CI tests, but it will fail if EnableMKLDNN(1) is called in the multiple-instances case and SetMkldnnthreadid() is not set properly, so as I said before, it is just a workaround. The whole stack, including the API, needs to be refactored.

@jczaja
Contributor

jczaja commented Jun 18, 2019

@LeoZhao-Intel OK, the performance (C-API) tests passed, i.e. no visible impact. My understanding is that you are still working on this PR?

@LeoZhao-Intel
Contributor Author

LeoZhao-Intel commented Jun 18, 2019

This PR is going to be dropped since it impacts performance too much and cannot reach the target, but some of its requirements may be carried into the formal solution, e.g. supporting the cache-disabled case and using an API to indicate the scenario.

@jianhang-liu
Contributor

Thanks Leo for the big effort on this PR. The investigation in this PR has given us much insight into this issue.

@jianhang-liu
Contributor

Closed, but some code will be cloned into a new PR from Leo, similar to PR #18217.
