Feature request: worker pool with --dist=pool #794

Closed
Garrett-R opened this issue Jul 12, 2022 · 4 comments

Comments

@Garrett-R

Garrett-R commented Jul 12, 2022

Currently, it seems pytest-xdist divvies up all the work to be done and then the work gets executed. This would be faster if the workers just grabbed tasks out of a pool instead of divvying at the start.

Example:

# test_parallel.py

import sys
from time import sleep

# hack for live prints (https://stackoverflow.com/questions/27006884)
sys.stdout = sys.stderr


def test_1():
    print('test 1 (fast)')


def test_2():
    print('test 2 (slow)')
    sleep(5)


def test_3():
    print('test 3 (fast)')


def test_4():
    print('test 4 (slow)')
    sleep(5)

Running this with

pytest -s -n 2 test_parallel.py

takes 11.8s, presumably because one worker gets unlucky and is assigned both test_2 and test_4, while the other worker finishes almost immediately and then sits idle. Note: this repro may not be reliable since there's no guaranteed grouping; try reordering the functions to reproduce it if necessary.

This is an extreme example, but I also see this in the actual codebase I work on, where one worker keeps running for a non-negligible time after the others have finished.
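To make the arithmetic behind the report concrete, here is a minimal simulation of the two strategies. It is only a sketch: the round-robin split is an assumption chosen to reproduce the unlucky grouping described above, not a description of how pytest-xdist actually assigns tests, and both function names are hypothetical.

```python
import heapq

# Durations in seconds, mirroring test_1..test_4 above (fast, slow, fast, slow).
DURATIONS = [0.0, 5.0, 0.0, 5.0]


def static_split_makespan(durations, n_workers):
    """Assign tests round-robin up front; each worker runs only its own share.

    Assumption: round-robin is used here purely to model the unlucky grouping.
    """
    totals = [0.0] * n_workers
    for i, d in enumerate(durations):
        totals[i % n_workers] += d
    return max(totals)


def shared_pool_makespan(durations, n_workers):
    """Each worker takes the next pending test as soon as it becomes free."""
    free_at = [0.0] * n_workers  # min-heap of times at which workers go idle
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)  # earliest idle worker takes the test
        heapq.heappush(free_at, start + d)
    return max(free_at)


print(static_split_makespan(DURATIONS, 2))  # 10.0
print(shared_pool_makespan(DURATIONS, 2))   # 5.0
```

Under these assumptions the up-front split needs about 10s of test time (consistent with the observed 11.8s once startup overhead is added), while a shared pool would need about 5s.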

@Garrett-R
Author

From the docs, this pool behavior is actually how I was expecting --dist load to work given it says: "Sends pending tests to any worker that is available". So perhaps this is not a feature request, but rather a bug?

@RonnyPfannschmidt
Member

There is a queue of tasks available for each worker. The scheduler was designed with network usage in mind (10-15 years ago that was a common use case for the PyPy team).

It's simply the case that bundling, back-pressure, and test cases with long run times have been integrated.

There are numerous issues on the topic already.

@Garrett-R
Author

I don't have enough context to fully follow that, although maybe you're not aiming that at me. 😆 Sorry if this is a dupe!

@amezin
Collaborator

amezin commented Jan 11, 2023

I believe this is now solved by #862 and --dist worksteal. It isn't exactly the same algorithm as described here, but it also does load [re-]balancing much more actively than the default scheduler.
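As a rough illustration of the rebalancing idea, the toy simulation below starts from an even split and lets an idle worker take part of the longest pending queue. This is a sketch under stated assumptions, not the implementation from #862: the function name, the strided initial split, and the "steal half from the longest queue" rule are all hypothetical simplifications.

```python
from collections import deque


def worksteal_makespan(durations, n_workers):
    """Simulate a toy work-stealing scheduler and return the total wall time."""
    # Assumed initial split: worker w gets every n_workers-th test.
    queues = [deque(durations[w::n_workers]) for w in range(n_workers)]
    clock = [0.0] * n_workers  # time at which each worker is next idle
    active = list(range(n_workers))

    while active:
        # The worker that becomes idle first acts next.
        w = min(active, key=lambda i: clock[i])
        if not queues[w]:
            # Out of work: look at the worker with the most pending tests.
            victim = max(range(n_workers), key=lambda i: len(queues[i]))
            if not queues[victim]:
                active.remove(w)  # nothing left anywhere, this worker is done
                continue
            # Steal half (at least one) of the victim's not-yet-started tests.
            n_steal = max(1, len(queues[victim]) // 2)
            for _ in range(n_steal):
                queues[w].appendleft(queues[victim].pop())
        clock[w] += queues[w].popleft()  # run the next test
    return max(clock)


# Same durations as test_1..test_4 in the report.
print(worksteal_makespan([0.0, 5.0, 0.0, 5.0], 2))  # 5.0
```

In this model the worker that drew the two fast tests steals one slow test, so the run takes about 5s instead of 10s. With a pytest-xdist version that includes #862, the original command would presumably become `pytest -s -n 2 --dist worksteal test_parallel.py`.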

BTW, the scheduling algorithm where workers just grab tests one by one when ready, without any order, likely wouldn't be the best for fixture reuse.
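The fixture-reuse concern can be sketched by counting how many (worker, module) pairs an assignment produces, since each worker is a separate process and has to set up a module-scoped fixture at least once for every module it touches. The test IDs and the helper name below are hypothetical, and the one-by-one assignment assumes all tests take equal time so that two workers simply alternate.

```python
# Hypothetical test IDs: two modules with three tests each.
TESTS = ["a.py::t1", "a.py::t2", "a.py::t3", "b.py::t1", "b.py::t2", "b.py::t3"]


def count_module_fixture_setups(assignment):
    """Count (worker, module) pairs; each pair implies at least one setup."""
    pairs = set()
    for worker, tests in enumerate(assignment):
        for test_id in tests:
            module = test_id.split("::")[0]
            pairs.add((worker, module))
    return len(pairs)


# Tests kept together per module: each worker sets up one module's fixture.
grouped = [TESTS[:3], TESTS[3:]]

# Workers alternately grab one test at a time from a shared pool.
one_by_one = [TESTS[0::2], TESTS[1::2]]

print(count_module_fixture_setups(grouped))     # 2
print(count_module_fixture_setups(one_by_one))  # 4
```

In this toy case, grabbing tests one at a time doubles the number of module-level setups, which is the kind of cost a pure pool scheduler would have to weigh against better load balance.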

amezin closed this as completed Jan 11, 2023