[rocfft][cufft] DFT update host task to use native command #578

Open
JackAKirk wants to merge 8 commits into base: develop

@JackAKirk (Contributor) commented Oct 2, 2024

Description

Similar to #572 (see the discussion in that PR for technical details), except that this covers the FFT backends for both AMD and NVIDIA.

Update the host task implementation to use enqueue_native_command for DFT with the CUDA/HIP backends.

tests:

test_main_dft_ct_amd.txt
test_main_dft_rt_amd.txt
test_main_dft_rt_nvidia.txt
test_main_dft_ct_nvidia.txt

author: @hjabird

@JackAKirk
Contributor Author

@oneapi-src/onemkl-dft-write could you please review this?

Thanks

@Rbiessy (Contributor) left a comment

Thanks for the PR! Note that we have used this in 2 GROMACS benchmarks and seen improvements of between 2% and 7% on MI210.

src/dft/backends/rocfft/execute_helper.hpp (outdated, resolved)
remove whitespace

Co-authored-by: Romain Biessy <romain.biessy@codeplay.com>

4 participants