Can't get UMAP to be distributed with DASK #3991

Closed
Tracked by #4139
nono9212 opened this issue Jun 16, 2021 · 6 comments

@nono9212

Hi,

I am trying to get UMAP to run on two GPUs of my setup. However, despite following the doc here, I can't get the distribution part to work. I changed this line:

cluster = LocalCUDACluster(threads_per_worker=1)

to:

cluster = LocalCUDACluster(CUDA_VISIBLE_DEVICES="8,9")

But when creating the UMAP model, it loads on the first GPU (as seen with nvidia-smi):

local_model = UMAP()  # this loads the first GPU

Here is how I coded the rest:

model = cumlUMAP(
    n_neighbors=200,
    n_components=2,
    min_dist=0.05,
    verbose=True)
DISTmodel = daskUMAP(
    client=client,
    model=model
)

result = DISTmodel.fit_transform(data)

So I can't get the calculations to happen on GPUs 8 and 9.
Is that possible? I am not sure whether what I am looking for is already implemented...
Thanks for any advice!

@hcho3 added the Dask / cuml.dask and question labels on Jun 16, 2021
@viclafargue
Contributor

viclafargue commented Jun 16, 2021

Hi @nono9212, thanks for opening the issue.

Distributed UMAP works in the following way:

  1. A single-GPU UMAP model is first trained on a representative sample of the dataset.
  2. This model is then broadcast to all workers, allowing distributed inference on a larger dataset.

The single-GPU instance of UMAP is not connected to the Dask client (nor the cluster) and will always default to the first visible GPU.
To specify the GPUs you would like to use, you can either do it from the command line (CUDA_VISIBLE_DEVICES=8,9 python script.py)
or from Python code (os.environ["CUDA_VISIBLE_DEVICES"] = "8,9", set before training the local UMAP model).
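
Putting this together, a minimal sketch of the full workflow might look like the following (not taken verbatim from this thread): GPU ids 8 and 9 come from the question, and sample_df / dask_df are assumed placeholders for a representative sample and the full distributed dataset.

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "8,9"  # set before any CUDA context is created

from dask_cuda import LocalCUDACluster
from dask.distributed import Client
from cuml.manifold import UMAP as cumlUMAP        # single-GPU estimator
from cuml.dask.manifold import UMAP as daskUMAP   # multi-GPU wrapper

cluster = LocalCUDACluster()  # should pick up both visible GPUs, one worker each
client = Client(cluster)

# 1. Train the single-GPU model on a representative sample
#    (this runs on the first visible GPU, i.e. GPU 8).
local_model = cumlUMAP(n_neighbors=200, n_components=2, min_dist=0.05, verbose=True)
local_model.fit(sample_df)

# 2. Broadcast the trained model to the workers and run distributed
#    inference on the full dataset.
dist_model = daskUMAP(model=local_model, client=client)
embedding = dist_model.transform(dask_df)

The key point is that the fit happens locally on one GPU, while the transform is what actually runs across the workers.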

@nono9212
Author

Great, thanks a lot for the explanation. Is there any way of freeing the memory when not using a client?

@viclafargue
Contributor

CuPy/cuDF/Numba allocations, like other Python objects, are released by the Python garbage collector as soon as they go out of scope or are explicitly destroyed. However, the garbage collector only tracks host memory and has no direct view of the GPU memory those objects point to, so GPU memory might not be released in time for new allocations.

To force the release of the memory, you can do the following:

import gc

del obj       # obj can be the single-GPU UMAP model here, for instance
gc.collect()  # releases local (client-side) memory

You can do the same on a Dask cluster with client.run(gc.collect), which releases memory on the workers.
Hope this answers your question.

@cjnolet
Member

cjnolet commented Jul 23, 2021

@nono9212,

One thing to consider when using Dask in a GPU environment is that your client process might be sharing a GPU with one of the workers. You can get around this by explicitly setting it to use a different GPU.
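
For what it's worth, one quick way to see which device each worker got (just a sketch, reusing the client object from the snippets above) is to query the workers' environments:

def worker_devices():
    import os
    return os.environ.get("CUDA_VISIBLE_DEVICES")

print(client.run(worker_devices))  # maps each worker address to its CUDA_VISIBLE_DEVICES

Comparing that output with the GPU the client process is using shows whether they overlap.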

The Python garbage collector should free up any stray objects on the host which, if they were holding allocations to GPU memory, should also clean up the GPU memory. The thread in #4068 might be relevant to this discussion as well.

@github-actions

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@viclafargue
Contributor

Closing the issue. Please don't hesitate to re-open if needed.
