chgres_cube: Create new regression tests to further test regional functionality #181
Comments
Currently, the individual chgres_cube regression tests are run in sequence. This can take over 20 minutes. As the number of tests increases, I recommend that they be run in parallel, with the summary log being created after the last test completes.
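As a rough sketch of what this parallel arrangement could look like on a Slurm system such as Hera (the test names, `driver.<test>.sh`, and `summarize.sh` are illustrative placeholders, not the actual UFS_UTILS drivers), each test could be submitted as its own batch job, with the summary step held back by a job dependency until the last test finishes:

```bash
#!/bin/bash
# Sketch only: submit each chgres_cube regression test as a separate batch job,
# then generate the summary log once the last test completes.
set -eu

TESTS="c96.fv3.restart c96.fv3.nemsio c96.gfs.grib2 c96.regional"  # illustrative names
JOBIDS=""

for test in $TESTS; do
  # --parsable makes sbatch print only the numeric job id.
  id=$(sbatch --parsable "driver.${test}.sh")
  JOBIDS="${JOBIDS:+${JOBIDS}:}${id}"
done

# 'afterany' fires whether the test jobs pass or fail, so failing tests
# still appear in the summary log.
sbatch --dependency="afterany:${JOBIDS}" summarize.sh
```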
@GeorgeGayno-NOAA I currently have our new suite of regression tests running on Hera, with this parallel execution working well. The whole thing takes about 5 minutes now. @JeffBeck-NOAA has agreed to help me test additional regression test drivers on Jet and Orion. However, neither of us has access to WCOSS to test the cray and dell driver scripts. Would you be willing to take the files that we create for the systems we have access to, and engineer similar changes to the cray and dell drivers?
Yes, I can test on Cray and Dell.
@GeorgeGayno-NOAA in some testing I'm doing for the new regression tests, some of the previous regression tests are now failing, despite the executable being from the 19th and the regression tests all passing before my last commit with this same executable. The issue looks to be the snow depth and equivalent snow depth fields in the sfc file. The baseline files look to have changed on the 19th for some cases. Any idea why this might be happening?
I updated the baseline data after the last commit to develop. If your branch is up-to-date with develop, it should pass. How big are the differences? If they are small, then I consider that to be a 'pass'. Check the regression.log file and search for 'nccmp'. It will list the differences.
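For example (a minimal sketch, assuming the log file is named regression.log as described above), the nccmp reports can be pulled out of the log with grep:

```bash
# Show each nccmp line plus a few lines of context listing the differing fields.
grep -i -A 4 'nccmp' regression.log
```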
They are very small: ~1E-14. I'm just wondering why these are showing up now but weren't before. These are on the order of the errors we see when different compute node types are used on Jet, or when files are copied between systems, but it's only these two fields, which seems really odd.
Also, it's only tile 6, and only a few grid points, and not usually the same number of grid points for each field. In some cases t2m and q2m are affected as well. Note that the regional test passes, but none of the global tests pass. All of my new regression tests are unaffected. UPDATE: I've tested on Hera and the tests fail with the exact same differences, down to the number of grid points and the values of the differences. I understand that these are very minor differences, but this is not field-wide noise. These are isolated differences that I'd like to understand. The only commits to the develop branch in the past few weeks have been from my PRs, so I'm not certain at all how these differences are occurring.
@GeorgeGayno-NOAA The same deviations from the baselines are showing up when I run the regression tests with the new develop branch on both Jet and Hera. While I understand that these differences are insignificant, I feel we should be aiming for all tests to receive a "PASSED" in the regression test output. I don't know what the original source of the differences is, as they only show up in a few grid cells on tile 6 in the global tests. Should the baselines be recreated with the newest develop branch so that the tests receive a "PASSED"?
We need to get these regression tests running as part of the build, so that we can all see this happening...
I just ran the regression tests on Hera using 005f9a0 of 'develop'. All tests passed for me. So I am not sure how I can explain your result. For a future task, the regression tests should be modified so any insignificant differences are ignored. But I am not sure how to define 'insignificant'. That is a question for the software engineers. The 'nccmp' utility has an option to ignore differences under a user-specified threshold. But I am not sure of what that threshold should be. |
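As a hedged illustration of that nccmp option (the file names are made up, the 1e-10 threshold is arbitrary, and the flag names are assumptions about recent nccmp versions: -d compares variable data and -t/--tolerance sets the absolute tolerance):

```bash
# Sketch: with an absolute tolerance of 1e-10, differences at the ~1E-14 level
# discussed above would not be reported as failures.
# Flag names assumed: -d compares data, -t/--tolerance sets absolute tolerance.
nccmp -d --tolerance=1e-10 out.sfc_data.tile6.nc baseline.sfc_data.tile6.nc
```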
These differences are only happening for the global regression tests that are already in develop. All of the new regression tests I'm planning to put in a PR soon are passing, as is the regional regression test already in develop. |
I believe I've tracked this issue down to module differences. I've addressed this in Issue #234 |
…ccount for new develop/release split locations. Addresses Issue ufs-community#181.
Planned regression tests can be seen here