Correct Way to Setup PyTest Fixture #3540
Sorry, I haven't looked too closely at the tests in your repo, but is restructuring the tests to be more similar to distributed's an option? Specifically, using the `gen_cluster` decorator, as described in https://distributed.dask.org/en/latest/develop.html#writing-tests (that's a bit out of date, going to update it now)?

I strongly recommend using the dask.distributed testing harness if it makes sense for your situation. We test for and clean up a lot of things there. If you don't want to use `gen_cluster` (because it's async) there are also pytest fixtures for clients that are running synchronously with external processes.
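The recommendation above can be sketched roughly as follows. This is a minimal, hedged sketch assuming the `distributed` package is installed; `test_increment` and the submitted lambda are hypothetical, not from the thread:

```python
# Minimal sketch of a gen_cluster-based test, assuming distributed is
# installed. gen_cluster starts a scheduler and two in-process workers,
# runs the coroutine, and tears everything down afterwards.
from distributed.utils_test import gen_cluster

@gen_cluster(client=True)
async def test_increment(c, s, a, b):
    # c: Client, s: Scheduler, a/b: the two Worker instances
    future = c.submit(lambda x: x + 1, 10)
    assert await future == 11
```

Because `gen_cluster` owns setup and teardown, the test body only needs the async logic itself.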
I have a few questions: I can get the […]
Some marks will work (e.g. `filterwarnings`). I think `parametrize` causes issues, so the actual test is written in a closure; see distributed/distributed/tests/test_actor.py, lines 50 to 86 at commit b049bd7.
@TomAugspurger that was the missing link! I think I'm now able to transition everything over to using `gen_cluster`.
So, I was able to port all of my tests over to using `gen_cluster`.

For the first part, all of the tests pass. However, for the second part (coverage testing), all of the tests pass as well, but the tests that use `gen_cluster` do not appear to be reflected in the coverage results.
By default gen_cluster uses the Worker class, which runs things in the same
process, but in different threads. If you want to use processes then you
should add the Worker=Nanny keyword.
Maybe coverage doesn't respect code run in other threads? I'm not sure.
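The `Worker=Nanny` variant mentioned above might look like the following. This is a hedged sketch assuming `distributed` is installed; the test name and submitted lambda are made up for illustration:

```python
# Sketch, assuming distributed is installed: with Worker=Nanny, each
# worker runs in its own subprocess (supervised by a Nanny) rather than
# in a thread of the test process.
from distributed import Nanny
from distributed.utils_test import gen_cluster

@gen_cluster(client=True, Worker=Nanny)
async def test_in_subprocesses(c, s, a, b):
    # a and b are Nanny objects wrapping worker subprocesses here
    assert await c.submit(lambda x: x * 2, 21) == 42
```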
Hmmm, according to the docs for coverage.py:

> Coverage.py can measure multi-threaded programs by default. If you are using more exotic concurrency, with the multiprocessing, greenlet, eventlet, or gevent libraries, then coverage.py will get very confused. Use the --concurrency switch to properly measure programs using these libraries. Give it a value of multiprocessing, thread, greenlet, eventlet, or gevent. Values other than thread require the C extension.

So, in theory this should work out of the box with `@gen_cluster`.

Then I don't know. `gen_cluster` by default runs everything in the same process.
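For completeness: if the workers do end up in subprocesses (e.g. with `Worker=Nanny`), one way to account for that is coverage.py's multiprocessing concurrency mode. A hedged sketch of the invocation, with a placeholder test path:

```shell
# Sketch: thread-based runs are measured by default; subprocess-based
# runs need multiprocessing concurrency and per-process data files,
# which are merged afterwards with "coverage combine".
coverage run --concurrency=multiprocessing --parallel-mode -m pytest tests/
coverage combine
coverage report
```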
Okay, I ran a couple more tests and it looks like the problem is with the closure (or nested function). For some reason, […]

This works fine (i.e., coverage can easily detect the […]). But in this case, when we move the […]

Is there a more manual way that I could perform the same setup/teardown that would avoid the closure? So, it would look something like: […]
It may be worth making `gen_cluster` a proper pytest fixture or mark so that we play more nicely with parametrize and others. I'm not really familiar with how that's done though.

`gen_cluster` today takes some keywords, so I'm not sure how easy it would be to make it into a proper fixture, unfortunately. When I need to interact with parametrize and others I tend to create a scheduler/workers/client manually within the test with `async with` context managers, and I also include the cleanup context manager.
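The manual pattern described above might be sketched like this, under a few assumptions: `distributed` is installed, `clean` is its cleanup context manager from `distributed.utils_test`, `pytest-asyncio` supplies the mark, and the test body itself is invented for illustration:

```python
# Sketch: build scheduler/worker/client by hand with async-with so the
# test composes with pytest.mark.parametrize; clean() checks teardown.
import pytest
from distributed import Client, Scheduler, Worker
from distributed.utils_test import clean

@pytest.mark.asyncio
@pytest.mark.parametrize("n", [1, 4])
async def test_manual_cluster(n):
    with clean():
        async with Scheduler(port=0) as s:
            async with Worker(s.address) as w:
                async with Client(s.address, asynchronous=True) as c:
                    results = await c.gather(c.map(lambda x: x + 1, range(n)))
                    assert results == list(range(1, n + 1))
```

Exiting each `async with` block closes the corresponding component, so teardown happens even when an assertion fails mid-test.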
@mrocklin Can you point me to any good examples where you used […]?
If you grep for `@pytest.mark.asyncio`:

    @pytest.mark.asyncio
    @pytest.mark.parametrize("Worker", [Worker, Nanny])
    async def test_protocol_from_scheduler_address(Worker):
        ucp = pytest.importorskip("ucp")
        async with Scheduler(protocol="ucx") as s:
            assert s.address.startswith("ucx://")
            async with Worker(s.address) as w:
                assert w.address.startswith("ucx://")
                async with Client(s.address, asynchronous=True) as c:
                    info = c.scheduler_info()
                    assert info["address"].startswith("ucx://")
One thing that is still perplexing from my original code is that when I did:

    @pytest.fixture(scope="module")
    def dask_client():
        cluster = LocalCluster(n_workers=2, threads_per_worker=2)
        client = Client(cluster)
        yield client
        # teardown
        client.close()
        cluster.close()

    def test_some_func_1():
        ...

    def test_some_func_2():
        ...

This successfully sets up the LocalCluster and tears it down after all of the tests are run. However, the number of threads gets up to double digits (screenshot: https://user-images.githubusercontent.com/7473521/75742886-15b60680-5cdd-11ea-899c-fcfa3dca3d69.png). Is this normal? Is there something else that I need to do in order to tear down the LocalCluster properly or more elegantly?

That sounds like unexpected behavior, but also not entirely unexpected. Cleaning things up reliably is hard. You might want to take a look at the cleanup fixture, which waits until everything gets properly cleaned up. It's also included in our testing fixtures, which ensure that resources like threads/processes/file-descriptors all get cleaned up.
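Pulling in that cleanup fixture might look like this. A hedged sketch: it assumes the fixture lives in `distributed.utils_test` (as it does in recent versions), and the test name is hypothetical:

```python
# Sketch: importing the cleanup fixture makes pytest wait at teardown
# until threads, processes, and file descriptors are released.
from distributed.utils_test import cleanup  # noqa: F401

def test_uses_cleanup(cleanup):
    # any cluster/client created inside is checked for proper teardown
    ...
```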
@seanlaw your approach of defining a cluster fixture instead of using the client one is brilliant. I proposed adopting that in microsoft/LightGBM#4159, which was merged today and reduced the CI time from 20 minutes to 3 minutes. The folks at xgboost are looking into adopting it as well (dmlc/xgboost#6816). I think this approach should be in https://distributed.dask.org/en/latest/develop.html#writing-tests
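The cluster-fixture approach being praised here might be sketched as follows. A sketch assuming `dask.distributed` is installed; the fixture names and `test_square` body are illustrative, not taken from either PR:

```python
# Sketch: the expensive LocalCluster is created once per module, while
# each test gets its own fresh, cheap Client connected to it.
import pytest
from distributed import Client, LocalCluster

@pytest.fixture(scope="module")
def cluster():
    cluster = LocalCluster(n_workers=2, threads_per_worker=2)
    yield cluster
    cluster.close()

@pytest.fixture
def client(cluster):
    client = Client(cluster)
    yield client
    client.close()

def test_square(client):
    assert client.submit(lambda x: x ** 2, 3).result() == 9
```

Connecting a new Client is much cheaper than starting a LocalCluster, which is presumably where the reported CI speedup comes from.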
Currently, I have several test files being executed that all require using a dask client that is being set up via a PyTest fixture:

    @pytest.fixture(scope="module")
    def dask_client():
        cluster = LocalCluster(n_workers=2, threads_per_worker=2)
        client = Client(cluster)
        yield client
        # teardown
        client.close()
        cluster.close()

This exists at the top of each test file, and the `dask_client` is then accessed from within each test.

Based on my reading of the PyTest documentation, it is my understanding that the `dask_client` is created once at the start of the execution of the test file (with `scope="module"`), each test within the test file is executed, and then the `dask_client` is torn down before the next test file (that also requires a `dask_client`) does the same thing.

Since the `LocalCluster` is initially set up with `n_workers=2, threads_per_worker=2`, I naively expected the maximum number of cores to be 2 and the number of threads per core to also be 2. However, according to the Activity Monitor on my 13" MacBook Pro, I see the number of threads climb to 16 for one process.

Note that I don't have any other Python processes running. All of the Python processes shown in the image appear to be the result of tests starting/stopping while the `dask_client` teardown catches up. However, occasionally, by simply re-running the exact same test suite multiple times, we'll encounter a `CancelledError`.

Based on my past experience, a `CancelledError` is common when running a distributed cluster and there are differences in the Python packages installed. However, in this case, we are running a `LocalCluster`, and it appears that all of the resources are being used up and `tornado` is hanging. Again, the `CancelledError` happens sporadically when I re-run the exact same test suite multiple times.

I'm guessing that I'm doing things incorrectly or my assumptions are incorrect. Is there a correct/proper way to use a Dask `LocalCluster` with PyTest so that all tests are limited to only 2 cores and 2 threads per core (instead of getting up to 16 threads)?

Initially, a hacky way around this was to limit the total number of tests within each test file, which resulted in a test suite with many separate test files (each setting up and tearing down its own `dask_client`) but with only a handful of tests in each file. This seemed to help ensure that the number of threads being used wouldn't keep climbing. However, this solution is no longer sufficient, and I'm still seeing the same `CancelledError` as my test suite grows. I've also tried adding cluster restarts in between tests, adding a few seconds of sleep time after teardown, and setting up/tearing down the `dask_client` at the test level, but this significantly slows down the execution of the test suite.

The test suite can be found here