CUDA role in Memory Efficient Attention #689
Answered by danthe3rd
BillyGun27 asked this question in Q&A
I am wondering why we need NVIDIA GPUs with compute capability above 6.0 (P100+) for memory-efficient attention?
Answered by danthe3rd on Mar 13, 2023
Replies: 1 comment
Answer selected by BillyGun27
Hi,
Memory-efficient attention has been tested on compute capability 6.0+, but it should work on 5.0+. The kernel relies on CUTLASS, which may not work below 5.0.
If you have an older GPU, it might work if you build from source (and with some luck), but I don't recommend it, and we won't be able to support you if you run into questions or issues.
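To make the thresholds from the answer concrete, here is a small sketch of a compatibility check. The function name and the exact cutoff policy are assumptions for illustration, not part of xFormers itself; the only facts taken from the answer are that 6.0+ is tested and 5.0+ may work.

```python
# Hypothetical helper (not an xFormers API): decide whether a GPU's
# compute capability is in the tested or best-effort range for
# memory-efficient attention, per the maintainer's answer above.

def mem_efficient_attention_support(major: int, minor: int) -> str:
    """Classify a compute capability (e.g. 6, 0 for a P100).

    Returns "tested" for 6.0+, "untested" for 5.0-5.x (may work,
    since the kernel builds on CUTLASS), and "unsupported" below 5.0.
    """
    cap = (major, minor)
    if cap >= (6, 0):
        return "tested"
    if cap >= (5, 0):
        return "untested"
    return "unsupported"
```

In practice you could feed this with `torch.cuda.get_device_capability(device)`, which returns the `(major, minor)` pair for an installed GPU.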