GPU result has number of walkers dependency and doesn't match CPU result #1054

Closed
yaoyi92 opened this issue Sep 8, 2018 · 5 comments

yaoyi92 commented Sep 8, 2018

Details in the post here.
https://groups.google.com/forum/#!topic/qmcpack/eOE1eIXAgaE

I'm not 100% sure whether this is a problem with my build. The ctest results are in this post:
https://groups.google.com/forum/#!topic/qmcpack/1QzNQoceNHs

jtkrogel added the bug label Sep 10, 2018

yaoyi92 commented Sep 11, 2018

This seems to be a Volta V100-specific issue. I don't see this problem when compiling and running the same simulation on a GTX 1080.

I need help from someone with Volta V100 experience.

jtkrogel commented

Something does seem to be up with Volta:
https://cdash.qmcpack.org/CDash/index.php?project=QMCPACK&date=2018-09-07

Compare "Volta-GCC-CUDA-Release" with "GCC-CUDA-Release".

prckent commented Sep 11, 2018

GCC-CUDA-Release runs on a Kepler. This is printed on roughly the third line of the output.

Updated:

Something changed between 7 and 8 August:
Working: https://cdash.qmcpack.org/CDash/testDetails.php?test=2851998&build=26579
Broken: https://cdash.qmcpack.org/CDash/testDetails.php?test=2858752&build=26628
Possibly this was an update to the script or the CUDA version. I will investigate.

prckent commented Oct 9, 2018

To keep this updated: the current belief is that this is either some kind of edge-case bug or a newly surfaced bug on Volta or with recent CUDA versions. For whatever reason, our current tests don't show the same problem. @PDoakORNL will try to chase it down.

ye-luo commented Mar 21, 2021

Some recent fixes in legacy CUDA probably addressed this issue. Open a new issue if needed.

ye-luo closed this as completed Mar 21, 2021