
Add thread cache #1545

Merged
oremanj merged 5 commits into python-trio:master from the thread-cache branch on May 27, 2020

Conversation

njsmith (Member) commented May 23, 2020

On my Linux laptop, this makes await trio.to_thread.run_sync(lambda: None) about twice as fast, from ~150 µs to ~75 µs.

Closes: #6

Test program:

import trio
import time

# Number of no-op jobs per timing batch.
COUNT = 10000

async def main():
    while True:
        start = time.monotonic()
        for _ in range(COUNT):
            await trio.to_thread.run_sync(lambda: None)
        end = time.monotonic()
        # Mean round-trip latency per thread dispatch, in microseconds.
        print("{:.2f} µs/job".format((end - start) / COUNT * 1e6))

trio.run(main)

njsmith added 2 commits May 23, 2020 00:51
njsmith (Member, Author) commented May 23, 2020

Huh, the failed test turned out to be a genuine, unrelated bug – see #1546

njsmith (Member, Author) commented May 23, 2020

OK, maybe #1548 will help with the test failure...

I guess we're seeing these test failures consistently here, and only here, because open_process uses a thread, and this PR must make enough of a speed difference on FreeBSD to trigger the race condition? It's odd.

codecov (bot) commented May 23, 2020

Codecov Report

Merging #1545 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           master    #1545    +/-   ##
========================================
  Coverage   99.67%   99.68%            
========================================
  Files         108      110     +2     
  Lines       13358    13466   +108     
  Branches     1012     1024    +12     
========================================
+ Hits        13315    13423   +108     
  Misses         28       28            
  Partials       15       15            
Impacted Files                          Coverage Δ
trio/lowlevel.py                        100.00% <ø> (ø)
trio/_core/__init__.py                  100.00% <100.00%> (ø)
trio/_core/_thread_cache.py             100.00% <100.00%> (ø)
trio/_core/tests/test_thread_cache.py   100.00% <100.00%> (ø)
trio/_threads.py                        100.00% <100.00%> (ø)
trio/tests/test_threads.py              100.00% <100.00%> (ø)

njsmith (Member, Author) commented May 23, 2020

Phew, tests finally passing, and I think this is ready to review.

Random question: do you think it's more intuitive to have start_thread_soon(deliver, fn), or start_thread_soon(fn, deliver)?

oremanj (Member) commented May 26, 2020

> do you think it's more intuitive to have start_thread_soon(deliver, fn), or start_thread_soon(fn, deliver)?

My intuition favors (fn, deliver) because you run fn first and then deliver ("data flows from left to right"), but I think either way is defensible.

oremanj (Member) left a review:

Looks good except for one suggestion.

    class ThreadCache:
        def __init__(self):
            self._idle_workers = {}
            self._cache_lock = Lock()
oremanj (Member) commented:
You don't seem to use the _cache_lock anywhere. If you remove it, you could even make THREAD_CACHE be the dict directly, and move ThreadCache.start_thread_soon() into the global start_thread_soon().
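For context, here's a minimal self-contained sketch of that suggested shape: a bare module-level dict of idle workers plus a global start_thread_soon(). The names (IDLE_WORKERS, WorkerThread) are illustrative rather than the merged code, the idle-timeout logic a real cache needs is omitted, and the argument order shown is the (fn, deliver) one settled on later in this thread:

    import threading
    from queue import SimpleQueue

    from outcome import capture  # wraps a function's return value or exception

    # Module-level cache of idle workers, used as a set (values are unused).
    # dict.popitem() is atomic under the GIL, so no separate lock is needed.
    IDLE_WORKERS = {}

    class WorkerThread:
        def __init__(self):
            self._jobs = SimpleQueue()
            threading.Thread(target=self._work, daemon=True).start()

        def start_soon(self, fn, deliver):
            self._jobs.put((fn, deliver))

        def _work(self):
            while True:
                fn, deliver = self._jobs.get()
                result = capture(fn)
                # Rejoin the idle pool *before* delivering, so a job submitted
                # from inside deliver() can reuse this same thread.
                IDLE_WORKERS[self] = None
                deliver(result)

    def start_thread_soon(fn, deliver):
        try:
            worker, _ = IDLE_WORKERS.popitem()  # reuse an idle worker if any
        except KeyError:
            worker = WorkerThread()             # otherwise spawn a fresh one
        worker.start_soon(fn, deliver)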

njsmith (Member, Author) replied:
Nice catch on _cache_lock!

We currently have a test that instantiates a private ThreadCache object (test_race_between_idle_exit_and_job_assignment), and I don't see an easy way to make it work otherwise, so I'll leave the class in place for now.
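As a rough illustration of the kind of test that keeps the class useful – a hypothetical smoke test, not the actual test_race_between_idle_exit_and_job_assignment, and assuming the module path listed in the coverage report above:

    import threading

    from trio._core._thread_cache import ThreadCache

    def test_private_cache_smoke():
        # A private cache instance, independent of the global one, so a test
        # can exercise worker reuse and idle-exit timing in isolation.
        cache = ThreadCache()
        done = threading.Event()
        results = []

        def deliver(result):
            results.append(result)  # an outcome.Outcome from the worker
            done.set()

        cache.start_thread_soon(lambda: 42, deliver)
        assert done.wait(10)
        assert results[0].unwrap() == 42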

njsmith (Member, Author) commented May 27, 2020

> My intuition favors (fn, deliver) because you run fn first and then deliver ("data flows from left to right"), but I think either way is defensible.

Yeah, having slept on it I think you're right – also fn is just the "more important" function from the perspective of the caller, so having it second feels weird. Swapped them.

    @@ -93,7 +93,7 @@ def __init__(self):
             self._idle_workers = {}
             self._cache_lock = Lock()

    -    def start_thread_soon(self, deliver, fn):
    +    def start_thread_soon(self, fn, deliver):
A reviewer (Member) commented:

For what it's worth, yes, this new order seems more intuitive to me, as deliver is called after fn.
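For reference, a minimal usage sketch of the merged API as exposed via trio.lowlevel; it assumes deliver is handed an outcome.Outcome wrapping fn's result, and that the call works from a plain script without an active Trio run:

    import threading

    import trio

    done = threading.Event()

    def deliver(result):
        # 'result' is an outcome.Outcome: unwrap() returns fn's value,
        # or re-raises the exception fn raised in the worker thread.
        print("got:", result.unwrap())
        done.set()

    # fn first, then deliver: "data flows from left to right".
    trio.lowlevel.start_thread_soon(lambda: 6 * 7, deliver)
    done.wait()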

@oremanj oremanj merged commit 9f5af72 into python-trio:master May 27, 2020
@njsmith njsmith deleted the thread-cache branch May 27, 2020 07:10
Development

Successfully merging this pull request may close these issues:

Design: thread pools (#6)