
Default use one thread in fluid #7208

Merged

Conversation

reyoung
Collaborator

reyoung commented Jan 4, 2018

No description provided.

tonyyang-svail left a comment

Excellent!

'speed will not be optimized if you use data parallel. It will '
'fail if this PaddlePaddle binary is compiled with OpenBlas since'
' OpenBlas does not support multi-threads.'.format(num_threads),
file=sys.stderr)
Contributor

I don't quite understand this comment. With OpenBLAS you can set the thread count via OPENBLAS_NUM_THREADS; I'm not sure whether that is related to this PR.

Collaborator Author

This PR is very simple: it just sets OMP_NUM_THREADS to 1 by default, because under data parallelism it really is not appropriate for this environment variable to be set to any other value.
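
For context, a minimal sketch of what such a default amounts to, assuming it is applied on the Python side before the native core (and hence OpenMP/MKL) initializes; the exact place in fluid may differ:

import os

# Respect an explicit user setting; otherwise default to a single OpenMP thread.
os.environ.setdefault("OMP_NUM_THREADS", "1")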

Contributor

When the number of CPU cores is greater than trainer_count, setting this to 1 leaves cores unused. Better:
OMP_NUM_THREADS = cpu_cores / trainer_count

import multiprocessing
import os

# trainer_count: number of data-parallel trainers sharing this machine.
threads = max(multiprocessing.cpu_count() // trainer_count, 1)  # at least one thread per trainer
os.environ["OMP_NUM_THREADS"] = str(threads)
os.environ["MKL_NUM_THREADS"] = str(threads)

@qingqing01
Contributor

Could you add an issue to describe the problem?

reyoung merged commit 2b259bf into PaddlePaddle:develop on Jan 4, 2018
@tensor-tang
Contributor

tensor-tang commented Jan 5, 2018

Two concerns I should point out:

  1. OMP_NUM_THREADS also impacts MKL, not only the OpenBLAS case mentioned in "OpenBLAS doesn't support multi thread application" #7234.

  2. Forcing it to 1 will limit MKLML and MKLDNN multi-threading, which means only 1 core is used for computation. I suggest calculating it instead, e.g. threads_per_trainer = available_threads / trainer_count.

@tonyyang-svail

I see. Changing OPENBLAS_NUM_THREADS is a better approach.
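
A minimal sketch of that alternative, assuming the variable is set from Python before OpenBLAS is loaded; everything except the environment variable names is illustrative:

import multiprocessing
import os

# OPENBLAS_NUM_THREADS is read by OpenBLAS itself, so an OpenBLAS build can be
# capped per trainer without also constraining MKL, which honors
# MKL_NUM_THREADS / OMP_NUM_THREADS.
per_trainer = max(multiprocessing.cpu_count() // trainer_count, 1)
os.environ.setdefault("OPENBLAS_NUM_THREADS", str(per_trainer))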
