phantaurus changed the title to "[QST] Can synchronized TensorCore MMA operations overlap with CUDA Core operations in a single thread?" on Sep 17, 2024 (the previous title misspelled "operations" as "oerations").
What is your question?
Hello,
I am curious whether warp-synchronous TensorCore instructions such as mma.sync.aligned.m16n8k16.row.col.f16.f16.f16.f16 can execute in parallel with non-TensorCore operations such as hexp2 when both are issued from the same thread and there is no data dependency between them.
Since these operations use different execution pipelines, I would expect them to overlap when no data dependency exists. However, my experimental results suggest otherwise: the two do not appear to overlap when both are issued from a single thread.
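For reference, here is a minimal sketch of how such an experiment could be set up. It is not the code behind the results mentioned above. The names `mma_m16n8k16`, `probe`, and `time_ms` are hypothetical, the operand values are arbitrary, and it assumes an sm_80 or newer GPU (for example, built with `nvcc -arch=sm_80`). A single warp runs a loop of MMAs only, of `hexp2` calls only, and of both, and the host compares the three timings.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical helper: one warp-wide m16n8k16 f16 MMA, D = A * B + D.
// Each thread holds 4 registers of A, 2 of B, and 2 of C/D (packed half2).
__device__ __forceinline__ void mma_m16n8k16(unsigned* d, const unsigned* a,
                                             const unsigned* b) {
    asm volatile(
        "mma.sync.aligned.m16n8k16.row.col.f16.f16.f16.f16 "
        "{%0, %1}, {%2, %3, %4, %5}, {%6, %7}, {%0, %1};\n"
        : "+r"(d[0]), "+r"(d[1])
        : "r"(a[0]), "r"(a[1]), "r"(a[2]), "r"(a[3]), "r"(b[0]), "r"(b[1]));
}

// The MMA chain and the exp2 chain share no data, so any serialization
// observed is not caused by a dependency between them.
template <bool DoMma, bool DoExp>
__global__ void probe(unsigned* out_mma, __half* out_exp, int iters) {
    unsigned a[4] = {0x3c003c00u, 0x3c003c00u, 0x3c003c00u, 0x3c003c00u};  // half2(1, 1)
    unsigned b[2] = {0u, 0u};
    unsigned d[2] = {0u, 0u};
    __half x = __float2half(0.01f * threadIdx.x);
    for (int i = 0; i < iters; ++i) {
        if (DoMma) mma_m16n8k16(d, a, b);
        if (DoExp) x = hexp2(__hneg(x));  // negation keeps x bounded
    }
    // Store the results so the compiler cannot discard the exp2 chain.
    out_mma[threadIdx.x] = d[0] ^ d[1];
    out_exp[threadIdx.x] = x;
}

template <bool DoMma, bool DoExp>
float time_ms(unsigned* d_mma, __half* d_exp, int iters) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    probe<DoMma, DoExp><<<1, 32>>>(d_mma, d_exp, iters);  // warm-up launch
    cudaEventRecord(start);
    probe<DoMma, DoExp><<<1, 32>>>(d_mma, d_exp, iters);  // timed launch
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const int iters = 1 << 20;
    unsigned* d_mma = nullptr;
    __half* d_exp = nullptr;
    cudaMalloc(&d_mma, 32 * sizeof(unsigned));
    cudaMalloc(&d_exp, 32 * sizeof(__half));

    float t_mma = time_ms<true, false>(d_mma, d_exp, iters);
    float t_exp = time_ms<false, true>(d_mma, d_exp, iters);
    float t_both = time_ms<true, true>(d_mma, d_exp, iters);
    std::printf("mma only: %.3f ms\nexp2 only: %.3f ms\nboth: %.3f ms\n",
                t_mma, t_exp, t_both);

    cudaFree(d_mma);
    cudaFree(d_exp);
    return 0;
}
```

If the combined time is close to the larger of the two individual times, the two chains overlap; if it is close to their sum, they are effectively serialized. With only one warp resident, this measures overlap within a single instruction stream, and the compiler is free to schedule the two chains differently than written, so it is worth checking the generated SASS (for example with `cuobjdump -sass`) before drawing conclusions.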
Thank you so much!