Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize decide_worker #4332

Merged
merged 11 commits into from
Dec 9, 2020
Merged

Optimize decide_worker #4332

merged 11 commits into from
Dec 9, 2020

Conversation

jakirkham
Copy link
Member

Annotate variables in decide_worker (both method and function). Deduplicate some code within the decide_worker method. Also tune the decide_worker function fast path.

Helps Cython verify that this matches the expected `return` type before
`return`ing.
This winds up being a bit faster than calling `first`.

```python
In [1]: from tlz import first

In [2]: s = {"a"}

In [3]: %timeit first(s)
76 ns ± 0.17 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: %%timeit
   ...: for e in s:
   ...:     break
   ...:
42.4 ns ± 0.429 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
```
As the case where there are idle workers and when there are not are
largely the same (except with collection of workers they draw from),
choose between the two collections at the beginning. Then spend the
remainder of the code on the selection logic.
Since we use this in a couple of cases, go ahead and assign it to a
variable initially and reuse that variable afterwards.
As this variable refers to a `WorkerState` instance, rename it to follow
the convention we have with `WorkerState` variable names.
Makes it easier for Cython to identify that this has the expected return
type when checking the value returned.

return worker
return ws
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for standardizing pronouns along the way

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's certainly made it a lot easier to understand where things can be annotated 🙂

@mrocklin mrocklin merged commit 3d53801 into dask:master Dec 9, 2020
@mrocklin
Copy link
Member

mrocklin commented Dec 9, 2020

It looks like we're going the route of doing easy optimizations before splitting things out to a cythonized class and a non-cythonized class. Is this correct?

@jakirkham
Copy link
Member Author

Yes and no. I started a refactoring branch as well. Though I think the particular transition these changes are associated with, transition_waiting_processing, is probably the hardest to refactor in a meaningful way atm. It also happens to be the 2nd slowest transition. So wanted to make a concerted effort to unblock performance for this transition before going ahead with the refactor.

@jakirkham jakirkham deleted the opt_decide_worker branch December 9, 2020 15:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants