Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use OpenMP-like synchronization patterns in Eigen thread pool #4236

Merged
merged 12 commits into from
Jun 22, 2020

Conversation

tlh20
Copy link
Contributor

@tlh20 tlh20 commented Jun 15, 2020

Description:

This PR updates the thread pool implementation to make work distribution over the Eigen thread pool more closely resemble techniques used in OpenMP. In particular:

(1) A thread entering a parallel loop works on the iterations itself, rather than requiring a thread switch to/from a thread in the pool, if called from outside the thread pool.

(2) To support #1, work items pushed to the thread pool run a loop to claim iterations from a shared counter via atomic-fetch-and-add, as opposed to having work items themselves represent individual batches of iterations. This means that any thread working on the loop can execute any batch of iterations, including having the main thread run through all of the batches itself if the loop turns out to be short-running.

(3) As with OpenMP active scheduling, the worker loop spins waiting for work prior to blocking. This avoids OS blocking / wake-up paths in workloads with series of short-running parallel sections. The default spinning duration prior to blocking is measured at around 1ms.

Performance tests on a 32-vCPU VM for CPU inference workloads show performance broadly similar to OpenMP builds, with p50 across 143 models a 12% improvement, and p80 a 28% improvement.

Motivation and Context

The PR aims to simplify the configuration of threading with ORT by providing consistent performance with OpenMP-based parallelism.

@tlh20 tlh20 requested a review from a team as a code owner June 15, 2020 13:26
@ghost
Copy link

ghost commented Jun 15, 2020

CLA assistant check
All CLA requirements met.

snnn
snnn previously approved these changes Jun 15, 2020
@yuslepukhin
Copy link
Member

cv_.notify_all();

Would it improve things somewhat if we notified after releasing the mutex?


Refers to: include/onnxruntime/core/platform/Barrier.h:41 in fe2637f. [](commit_id = fe2637f, deletion_comment = False)

yuslepukhin
yuslepukhin previously approved these changes Jun 15, 2020
Copy link
Member

@yuslepukhin yuslepukhin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

include/onnxruntime/core/platform/threadpool.h Outdated Show resolved Hide resolved
onnxruntime/core/common/threadpool.cc Outdated Show resolved Hide resolved
onnxruntime/core/common/threadpool.cc Outdated Show resolved Hide resolved
onnxruntime/core/common/threadpool.cc Outdated Show resolved Hide resolved
@tlh20 tlh20 dismissed stale reviews from yuslepukhin and snnn via c19cf71 June 16, 2020 11:22
@tlh20
Copy link
Contributor Author

tlh20 commented Jun 16, 2020

cv_.notify_all();

Would it improve things somewhat if we notified after releasing the mutex?

Refers to: include/onnxruntime/core/platform/Barrier.h:41 in fe2637f. [](commit_id = fe2637f, deletion_comment = False)

I'll leave this as-is for the moment, as per the current main branch, but it would be interesting to check how this performs on different platforms. I think some mutex + condvar implementations will identify that the waiting threads require the lock that is still held, and defer any work until the lock is released, while others benefited from avoiding the notification going to a thread that gets unblocked briefly before blocking on the lock-acquire.

@tlh20 tlh20 closed this Jun 16, 2020
@tlh20 tlh20 reopened this Jun 16, 2020
pranavsharma
pranavsharma previously approved these changes Jun 16, 2020
Copy link
Contributor

@pranavsharma pranavsharma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please change the subject of this PR as it'll show up as-is in the commit logs? Thanks!

snnn
snnn previously approved these changes Jun 16, 2020
@snnn
Copy link
Member

snnn commented Jun 16, 2020

Can you please change the subject of this PR as it'll show up as-is in the commit logs? Thanks!

The log message can be changed with merging the PR.

skottmckay
skottmckay previously approved these changes Jun 16, 2020
@tlh20 tlh20 dismissed stale reviews from skottmckay, snnn, and pranavsharma via 0eafefc June 17, 2020 08:29
@tlh20 tlh20 changed the title Tim/threading Use OpenMP-like synchronization patterns in Eigen thread pool Jun 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants