CUDA role in Memory Efficient Attention #689
Answered by danthe3rd
BillyGun27 asked this question in Q&A
I am wondering why we need NVIDIA GPUs with compute capability above 6.0 (P100+) for memory-efficient attention?
Answered by danthe3rd on Mar 13, 2023
Replies: 1 comment
Answer selected by BillyGun27
Hi,
Memory-efficient attention has been tested on compute capability 6.0+, but it should work on 5.0+. The kernel relies on CUTLASS, which may not work below 5.0.
If you have an older GPU, it might work if you build from source (and with some luck), but I don't recommend it, and we won't be able to support you if you run into questions or issues.
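To make the thresholds from the answer concrete, here is a small sketch of a compatibility check. The function name and the exact cutoff policy are assumptions for illustration, not part of xFormers itself; the only facts taken from the answer are that 6.0+ is tested and 5.0+ may work.

```python
# Hypothetical helper (not an xFormers API): decide whether a GPU's
# compute capability is in the tested or best-effort range for
# memory-efficient attention, per the maintainer's answer above.

def mem_efficient_attention_support(major: int, minor: int) -> str:
    """Classify a compute capability (e.g. 6, 0 for a P100).

    Returns "tested" for 6.0+, "untested" for 5.0-5.x (may work,
    since the kernel builds on CUTLASS), and "unsupported" below 5.0.
    """
    cap = (major, minor)
    if cap >= (6, 0):
        return "tested"
    if cap >= (5, 0):
        return "untested"
    return "unsupported"
```

In practice you could feed this with `torch.cuda.get_device_capability(device)`, which returns the `(major, minor)` pair for an installed GPU.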