-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace the CUDA texture API with a device function. #4022
Conversation
Test this please |
As we previously discussed and you noted, we will iterate on this PR increasing the test tolerances where numerically reasonable. There will likely be a handful of failures in the initial CI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unit test is missing, and would be a more straight forward way to insure correctness. We are satisified with testing indirectly through the test_coulomb*.cpp tests?
Removed the std:: which appears to resolve the compilation problem and allow the device math functions to be called, at least with CUDA. |
Test this please |
Test this please |
I updated the test tolerances for a set of CUDA builds to get the tests passing. I suspect that most of the change is due to the different algorithm resulting in effectively changed referenced values. However none of the tolerances needed to be increased by much. |
Test this please |
This is what I'm getting right now on a Radeon 7 and an MI250X with ROCm 5.1.2: |
All reasonable variation again. The NiO48 electron run is bigger than most of the other tests and therefore more prone to accumulating error. |
Test this please |
At this point, 0 failures on R7, 1 failure on MI250X: |
Should be fixed now. |
Note that other build types may need a similar process, but getting the default build handled in this PR is fine with me. I didn't see anything concerning in the builds I did with CUDA but there might still be a few edge cases. |
Test this please |
R7:
MI250X:
|
I ran some short statistical tests on a NVIDIA GPU.
|
@jakurzak I updated deterministic-diamondC_2x1x1_pp-vmc_sdj_excited-1-1-totenergy tolerance. |
Test this please |
I also previously ran all the non-long tests. Didn’t see any additional failures. (Did see all the usual "not supported" failures that we have not cleaned up for legacy CUDA) |
Proposed changes
This PR removes the use of the texture API, which is deprecated in CUDA.
The tex1D() function with linear interpolation and clamping is replaced by a device function.
What type(s) of changes does this code introduce?
Removing deprecated API calls.
Replacing tex1D() with a device function.
Does this introduce a breaking change?
The change causes a small number of tests to fail by narrowly exceeding the tolerances.
This is unexpected as texture fetches are much less accurate then the code replacing them.
It is possible that the tolerances need to be adjusted - to be investigated.
What systems has this change been tested on?
Checklist