In theory we should be able to use pytest-xdist to parallelize tests. In practice this doesn't work for a few reasons:
If multiple tests use the same test data, the data files can be corrupted when two or more tests try to download the same file at the same time
There could be resource contention when running locally
When running tests in parallel, it is safest to use per-test cache directories (i.e. don't use the cache_dir config setting or the PYTEST_WDL_CACHE_DIR environment variable). Alternatively, we could make test data localization thread-safe, either by routing all localization through a single daemon or by localizing all files before any tests run.
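For illustration, here is a minimal sketch of thread-safe localization for a shared cache, using a lock file so that only one xdist worker downloads a given URL while the others wait and then reuse the cached copy. This is not current pytest-wdl behavior; the function name, the hashing scheme, and the use of the third-party filelock package are all assumptions.

```python
import hashlib
import urllib.request
from pathlib import Path

from filelock import FileLock  # third-party: pip install filelock


def localize(url: str, cache_dir: Path) -> Path:
    """Download url into cache_dir exactly once, even with concurrent callers."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / hashlib.sha256(url.encode()).hexdigest()
    # Only one worker holds the lock and downloads; the rest block, then reuse the file.
    with FileLock(str(target) + ".lock"):
        if not target.exists():
            partial = target.with_suffix(".partial")
            urllib.request.urlretrieve(url, partial)
            partial.rename(target)  # atomic rename, so readers never see a partial file
    return target
```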
Regarding resource contention, there is no good solution yet; an issue has been open in pytest-xdist for years to address this. One approach is to mark tests that cannot be parallelized and then run two separate test sessions: the first excludes the non-parallelizable tests and uses xdist, the second runs only the non-parallelizable tests serially.
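As a rough sketch of the two-session approach (the serial marker name is made up here, and the test body just assumes pytest-wdl's workflow_data/workflow_runner fixtures):

```python
import pytest


@pytest.mark.serial  # register the marker in pytest.ini/setup.cfg to avoid warnings
def test_local_workflow(workflow_data, workflow_runner):
    # Runs a local (miniwdl/Cromwell) workflow, so it is excluded from the xdist session.
    ...


# Session 1 (parallel): pytest -n auto -m "not serial"
# Session 2 (serial):   pytest -m serial
```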
We could also borrow code from pytest-workflow, which implements its own parallelization strategy.
It might make the most sense to treat local test runners (miniwdl/Cromwell) as non-parallelizable, and anything remote (dxWdl, Cromwell Server) as parallelizable. Since the local tests generally run in Docker, it will be hard to run them in parallel without starving the underlying system of resources.
In pytest-workflow the tests themselves are not parallelized; running the workflows, however, is. The workflows take much more time than the tests. Maybe the pytest-workflow model could be used to achieve better wall-clock times for pytest-wdl as well?
In pytest-workflow, each workflow gets its own runner object instantiated from a Workflow class. These objects are added to a queue, which is then processed in parallel (default: 1 thread) by invoking each object's run method.
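Roughly like this (an illustrative sketch of the pattern, not pytest-workflow's actual classes):

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue


class Workflow:
    """Runner object for a single workflow."""

    def __init__(self, name: str, command: str):
        self.name = name
        self.command = command

    def run(self) -> None:
        # The real plugin would execute the workflow (e.g. via subprocess) and
        # keep its exit code and output paths around for the tests to inspect.
        print(f"running {self.name}: {self.command}")


def run_queue(workflow_queue: Queue, threads: int = 1) -> None:
    """Drain the queue and run each workflow; threads=1 matches the serial default."""
    workflows = []
    while not workflow_queue.empty():
        workflows.append(workflow_queue.get())
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for future in [pool.submit(wf.run) for wf in workflows]:
            future.result()  # re-raise any failure from the workflow run
```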
Pytest has a hook, pytest_runtestloop, which runs after collection has completed but before any test is executed. At that point, all the workflows can be started and run to completion. Once they have finished, the tests can run against the completed workflows.
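A minimal conftest.py sketch of that idea (the WORKFLOW_QUEUE global and the WORKFLOW_THREADS environment variable are assumptions, not pytest-workflow's actual implementation):

```python
# conftest.py
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical queue that collection code would populate with objects exposing run().
WORKFLOW_QUEUE = []


def pytest_runtestloop(session):
    # Start and finish every queued workflow before any test executes.
    threads = int(os.environ.get("WORKFLOW_THREADS", "1"))  # assumed knob, default 1
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for future in [pool.submit(workflow.run) for workflow in WORKFLOW_QUEUE]:
            future.result()  # surface workflow failures before the tests run
    # Returning None lets pytest's built-in runtestloop run the collected tests,
    # which now only assert against the already-completed workflow outputs.
    return None
```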