[QST] Can synchronized TensorCore MMA operations overlap with CUDA Core operations in a single thread? #1821

Open
phantaurus opened this issue Sep 17, 2024 · 0 comments

Comments

@phantaurus

What is your question?
Hello,

I am curious whether synchronized TensorCore operations like mma.sync.aligned.m16n8k16.row.col.f16.f16.f16.f16 can run in parallel with non-TensorCore operations such as hexp2 within the same thread, assuming there is no data dependency between them.

Given that these operations use different execution pipelines, it seems they should be able to overlap when no data dependencies exist. However, my experimental results suggest otherwise: the two do not appear to run in parallel when both are issued from the same thread.
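
For concreteness, here is a minimal sketch of how such an experiment could be set up. It is not the exact code behind the results above: the kernel name bench, the helper run, the iteration count, and the operand values are all illustrative assumptions. It times a single warp running a dependent chain of mma.sync only, a dependent chain of hexp2 only, and both chains together, and it assumes an sm_80 or newer GPU.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical microbenchmark; all names and constants are illustrative.
// kMma / kExp select which instruction chains are issued inside the loop.
template <bool kMma, bool kExp>
__global__ void bench(int iters, unsigned* sink, long long* cycles) {
    // m16n8k16 f16 fragments: A = 4 regs, B = 2 regs, C/D = 2 regs (each .f16x2).
    unsigned a0 = threadIdx.x, a1 = 1, a2 = 2, a3 = 3;
    unsigned b0 = 4, b1 = 5;
    unsigned d0 = 0, d1 = 0;                        // accumulator, reused as C and D
    __half x = __float2half(0.001f * threadIdx.x);  // independent of the MMA operands

    long long start = clock64();
    for (int i = 0; i < iters; ++i) {
        if (kMma) {
            // Dependent chain: each MMA accumulates into the previous result.
            asm volatile(
                "mma.sync.aligned.m16n8k16.row.col.f16.f16.f16.f16 "
                "{%0,%1}, {%2,%3,%4,%5}, {%6,%7}, {%0,%1};\n"
                : "+r"(d0), "+r"(d1)
                : "r"(a0), "r"(a1), "r"(a2), "r"(a3), "r"(b0), "r"(b1));
        }
        if (kExp) {
            // Second dependent chain with no data dependency on the MMA chain.
            // The value saturates quickly, but the instruction is still issued.
            x = hexp2(x);
        }
    }
    long long stop = clock64();

    // Store both results so the compiler cannot discard either chain.
    sink[threadIdx.x] = d0 ^ d1 ^ __half_as_ushort(x);
    if (threadIdx.x == 0) *cycles = stop - start;
}

// Launches one warp and returns the cycle count measured by thread 0.
template <bool kMma, bool kExp>
long long run(int iters) {
    unsigned* sink = nullptr;
    long long* cycles = nullptr;
    cudaMalloc(&sink, 32 * sizeof(unsigned));
    cudaMalloc(&cycles, sizeof(long long));
    bench<kMma, kExp><<<1, 32>>>(iters, sink, cycles);
    long long host_cycles = 0;
    cudaMemcpy(&host_cycles, cycles, sizeof(host_cycles), cudaMemcpyDeviceToHost);
    cudaFree(sink);
    cudaFree(cycles);
    return host_cycles;
}

int main() {
    const int iters = 1 << 16;
    run<true, true>(iters);  // warm-up launch, result ignored
    long long mma_only = run<true, false>(iters);
    long long exp_only = run<false, true>(iters);
    long long both     = run<true, true>(iters);
    printf("mma only : %lld cycles\n", mma_only);
    printf("exp only : %lld cycles\n", exp_only);
    printf("both     : %lld cycles\n", both);
    return 0;
}
```

Built with something like nvcc -arch=sm_80, a "both" time close to the larger of the two individual times would indicate overlap, while a time close to their sum would indicate serialization. The compiler may reorder or unroll the loop body, so it is worth checking the generated SASS (for example with cuobjdump -sass) before drawing conclusions.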

Thank you so much!
