-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory indexing problem for 10k threads #1
Comments
Hello, thank you for your comment.
However, if (num_block×num_thread) is 5000, it does not work because the address exceeds 32-bit.
Because of this problem, even though I have 12GB of global memory, I couldn't get it to work at 10,000 threads. <(num_block×num_thread) = 4,000>
<(num_block×num_thread) = 5000>
|
Hello, thank you for your implementation. Its working fine for 2500 threads, however if I increase the number of threads to 10k with 40GB of GPU memory it gives the following memory related error. Could you investigate this issue with me?
The text was updated successfully, but these errors were encountered: