Skip to content

Commit

Permalink
pythongh-82054: allow test runner to split test_asyncio to execute in…
Browse files Browse the repository at this point in the history
… parallel by sharding. (python#103927)

This runs test_asyncio sub-tests in parallel using sharding from Cinder. This suite is typically the longest-pole in runs because it is a test package with a lot of further sub-tests otherwise run serially. By breaking out the sub-tests as independent modules we can run a lot more in parallel.

After porting we can see the direct impact on a multicore system.

Without this change:
  Running make test is 5 min 26 seconds
With this change:
  Running make test takes 3 min 39 seconds

That'll vary based on system and parallelism. On a `-j 4` run similar to what CI and buildbot systems often do, it reduced the overall test suite completion latency by 10%.

The drawbacks are that this implementation is hacky and due to the sorting of the tests it obscures when the asyncio tests occur and involves changing CPython test infrastructure but, the wall time saved it is worth it, especially in low-core count CI runs as it pulls a long tail. The win for productivity and reserved CI resource usage is significant.

Future tests that deserve to be refactored into split up suites to benefit from are test_concurrent_futures and the way the _test_multiprocessing suite gets run for all start methods. As exposed by passing the -o flag to python -m test to get a list of the 10 longest running tests.

---------

Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: Gregory P. Smith <greg@krypto.org> [Google, LLC]
(cherry picked from commit 9e011e7)
  • Loading branch information
zitterbewegung authored and vstinner committed Sep 2, 2023
1 parent 14ef8cf commit ec875d5
Showing 1 changed file with 16 additions and 3 deletions.
19 changes: 16 additions & 3 deletions Lib/test/libregrtest/runtest.py
Original file line number Diff line number Diff line change
Expand Up @@ -143,6 +143,14 @@ def __str__(self) -> str:
# set of tests that we don't want to be executed when using regrtest
NOTTESTS = set()

#If these test directories are encountered recurse into them and treat each
# test_ .py or dir as a separate test module. This can increase parallelism.
# Beware this can't generally be done for any directory with sub-tests as the
# __init__.py may do things which alter what tests are to be run.

SPLITTESTDIRS = {
"test_asyncio",
}

# Storage of uncollectable objects
FOUND_GARBAGE = []
Expand All @@ -158,16 +166,21 @@ def findtestdir(path=None):
return path or os.path.dirname(os.path.dirname(__file__)) or os.curdir


def findtests(testdir=None, stdtests=STDTESTS, nottests=NOTTESTS):
def findtests(testdir=None, stdtests=STDTESTS, nottests=NOTTESTS, *, split_test_dirs=SPLITTESTDIRS, base_mod=""):
"""Return a list of all applicable test modules."""
testdir = findtestdir(testdir)
names = os.listdir(testdir)
tests = []
others = set(stdtests) | nottests
for name in names:
mod, ext = os.path.splitext(name)
if mod[:5] == "test_" and ext in (".py", "") and mod not in others:
tests.append(mod)
if mod[:5] == "test_" and mod not in others:
if mod in split_test_dirs:
subdir = os.path.join(testdir, mod)
mod = f"{base_mod or 'test'}.{mod}"
tests.extend(findtests(subdir, [], nottests, split_test_dirs=split_test_dirs, base_mod=mod))
elif ext in (".py", ""):
tests.append(f"{base_mod}.{mod}" if base_mod else mod)
return stdtests + sorted(tests)


Expand Down

0 comments on commit ec875d5

Please sign in to comment.