[BUG] When ASYNC is enabled GDS needs to handle cudaMalloced bounce buffers #5268

Merged

abellina
Collaborator

When the ASYNC allocator is enabled, the UCX bounce buffers are allocated directly rather than from the ASYNC pool, because memory from the async pool can't be mapped for GPUDirect RDMA.

This PR fixes GDS copies that assumed the bounce buffer was a DeviceMemoryBuffer when it could really be a CudaMemoryBuffer (e.g. straight from cudaMalloc). It also fixes a couple of leaks where .slice was used but the resulting sliced buffer was never closed.
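
For readers unfamiliar with the two failure modes, here is a minimal, self-contained sketch in Python. It is not the plugin's code: the classes only borrow the buffer type names mentioned above as simplified stand-ins, the reference-counting behaviour of `slice` is an assumption made for illustration, and `copy_assuming_pool_buffer` and `copy_via_bounce` are hypothetical helpers.

```python
class BaseDeviceMemoryBuffer:
    """Simplified stand-in for a reference-counted device buffer (assumption)."""

    def __init__(self, length, parent=None):
        self.length = length
        self._parent = parent
        self.ref_count = 1
        if parent is not None:
            parent.ref_count += 1  # a live slice keeps its parent alive

    def slice(self, offset, length):
        if offset < 0 or length < 0 or offset + length > self.length:
            raise ValueError("slice out of bounds")
        return type(self)(length, parent=self)

    def close(self):
        if self.ref_count <= 0:
            raise RuntimeError("buffer closed too many times")
        self.ref_count -= 1
        if self.ref_count == 0 and self._parent is not None:
            self._parent.close()  # release the reference the slice held

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


class DeviceMemoryBuffer(BaseDeviceMemoryBuffer):
    """Stand-in for a buffer that comes from the memory pool."""


class CudaMemoryBuffer(BaseDeviceMemoryBuffer):
    """Stand-in for a buffer allocated straight from cudaMalloc."""


def copy_assuming_pool_buffer(bounce, offset, length):
    # Failure mode 1: the copy path insists on the pool-backed type.
    if not isinstance(bounce, DeviceMemoryBuffer):
        raise TypeError("expected DeviceMemoryBuffer, got " + type(bounce).__name__)
    window = bounce.slice(offset, length)
    # Failure mode 2: `window` is never closed, so the parent is never released.
    return window.length


def copy_via_bounce(bounce, offset, length):
    # Accept any BaseDeviceMemoryBuffer and always close the slice.
    if not isinstance(bounce, BaseDeviceMemoryBuffer):
        raise TypeError("expected a device buffer")
    with bounce.slice(offset, length) as window:
        return window.length  # a real copy would read or write `window` here
```

With these stand-ins, passing a `CudaMemoryBuffer` to `copy_assuming_pool_buffer` raises a `TypeError`, while `copy_via_bounce` accepts it and leaves the parent's reference count unchanged once the slice is closed.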

…d buffers

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
@abellina abellina added the bug Something isn't working label Apr 18, 2022
@abellina abellina added this to the Apr 18 - Apr 29 milestone Apr 18, 2022
@jlowe
Member

jlowe commented Apr 18, 2022

build

@abellina abellina merged commit 8e353ff into NVIDIA:branch-22.06 Apr 18, 2022
@abellina abellina deleted the bug/gds_base_device_memory_buffer branch April 18, 2022 19:35
2 participants