GPU User Experience Improvements #1283
If I understand correctly, the current proposal is to check for errors after execution. It might be worth also supporting failure at the first CUDA error rather than only checking at the end, which would make debugging much easier and faster. This could be an optional feature if there is any concern that checking the return value of every CUDA call would add an unnecessary performance tax. Happy to help with this feature by improving and merging my solution for it.
We don’t want the overhead by default; that’s what syncdebug is for. It does what you proposed and more (and even more after this PR).
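For readers following the trade-off discussed above, here is a toy model of the two strategies, written in plain Python rather than against the CUDA API. Everything in it (`SUCCESS`, `GPUCallError`, `run_fail_fast`, `run_check_at_end`, and the integer status codes) is hypothetical and only illustrates the idea; it is not how DaCe or syncdebug is implemented.

```python
from typing import Callable, List, Tuple

SUCCESS = 0  # hypothetical stand-in for a "no error" status code

Call = Tuple[str, Callable[[], int]]  # (name of the call, function returning a status)


class GPUCallError(RuntimeError):
    """Hypothetical error type raised when a call reports a failure."""


def run_fail_fast(calls: List[Call]) -> None:
    """Check every return value and stop at the first failing call."""
    for name, call in calls:
        status = call()
        if status != SUCCESS:
            # The failing call is known by name, which is what eases debugging.
            raise GPUCallError(f"{name} failed with status {status}")


def run_check_at_end(calls: List[Call]) -> None:
    """Run everything, remember the first failure, and report it once at the end."""
    first_error = SUCCESS
    for _name, call in calls:
        status = call()
        if first_error == SUCCESS and status != SUCCESS:
            first_error = status
    if first_error != SUCCESS:
        # Only the status survives; which call produced it is no longer known.
        raise GPUCallError(f"execution failed with status {first_error}")
```

The fail-fast variant pays for one comparison per call, which is the overhead the reply above prefers to keep out of the default path. The check-at-end variant is cheaper but loses the location of the failure.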
@mcopik I followed your suggestion in the latest commit.
…if extra dimensions specified
LGTM, minor comments to consider.
- `__launch_bounds__` support in map nodes: `-1` disables, `0` (default) sets it if the block size is constant, any other number sets it explicitly
- … `dace.program` … #1263)
- … `gpu_block_size` or thread-block sub maps
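
The `__launch_bounds__` rule in the list above can be sketched as a small decision function. This is only an illustration of the three cases as described, not the code generator's actual implementation: the helper name `launch_bounds_qualifier`, the use of strings to stand for symbolic (non-constant) block dimensions, and the rejection of values below `-1` are all assumptions.

```python
from typing import Sequence, Union

Dim = Union[int, str]  # int = constant dimension, str = symbolic dimension (assumption)


def launch_bounds_qualifier(setting: int, block_size: Sequence[Dim]) -> str:
    """Return the kernel qualifier implied by the setting, or '' if none is emitted."""
    if setting < -1:
        # Assumption: the description does not cover other negative values.
        raise ValueError("setting must be -1, 0, or a positive thread count")
    if setting == -1:
        return ""  # launch bounds disabled
    if setting == 0:
        # Default: emit bounds only when every block dimension is a constant.
        if all(isinstance(dim, int) for dim in block_size):
            threads = 1
            for dim in block_size:
                threads *= dim
            return f"__launch_bounds__({threads})"
        return ""
    # Any other number is used as the explicit bound.
    return f"__launch_bounds__({setting})"


print(launch_bounds_qualifier(0, [32, 4, 1]))     # constant block size
print(launch_bounds_qualifier(0, ["N", 1, 1]))    # symbolic block size
print(launch_bounds_qualifier(256, ["N", 1, 1]))  # explicit bound
```

With a constant block size the default setting derives the bound from the product of the dimensions, while a symbolic dimension leaves the kernel without a qualifier unless a bound is given explicitly.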