Update Restart Test Validation #202
Conversation
…lso add a test to compare the diagnostic output in the log for the final restart file
I should note that the output in the log is different between configurations that write NetCDF and Binary files. So comparing the log between netcdf and iobinary isn't so simple. If preferred, I could modify the script to look at the
This is great. Can you summarize what has been tested so far? Have you verified that the test suites still work? Have you tested an iobinary case? I don't think we need to worry about testing binary vs netcdf; that is likely to be problematic. The log file comparison should be adequate in that situation.
I've performed a
and the
I have also run the
I think there might be some conflicts with the report_results.csh script. That script is grepping for "test" in the results.log file, and now there is a "test" and a "test_log" pair of results, which is going to confuse the script.

The other thing is that the log comparison is probably most useful when we don't have restart files, so the log comparison should not assume there is restart data. I agree that we should instead look at the ice diagnostics in the log file to do the comparison. There are some separate challenges with doing that. First, the restart files can be bit-for-bit but the diagnostics not, for different pe counts or decomps, due to round-off differences with sums in the log diagnostics. But let's ignore that for now; I think we know how to deal with that.

So, what I think we should do is build a log comparison that is triggered if the restart files don't exist, or even if we have a set of netcdf restarts and a set of binary restarts. That log comparison should compare the ice diagnostics. Then either the netcdf restarts, binary restarts, or log comparison will be used to generate the "test" result in restart testing.

We also probably need the same approach implemented for the case comparison mechanism when we compare one case to another. I recommend a "compare" script be created that can be reused for both the restart test validation and the case compare mode, if that makes sense. I guess that might look like "comparebfb dir1 dir2" with a return string of pass, fail, and maybe other things. The comparebfb script looks at dir1 and dir2 and decides whether to do a netcdf, binary, or log file comparison. I don't know if that will work in practice, but let's build in reuse if we can.
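The comparebfb idea above might be sketched roughly like this. Everything here is an assumption for illustration (the file patterns, the dispatch order, and the simplification of the log branch to a whole-file compare); it is not the actual CICE script.

```shell
# Hypothetical sketch of the proposed comparebfb dispatch:
# look at dir1 to decide whether a netcdf, binary, or log
# comparison applies, then compare matching files bit-for-bit
# and print PASS or FAIL. File patterns are assumptions.
comparebfb() {
  dir1=$1; dir2=$2
  if ls "$dir1"/*.nc >/dev/null 2>&1; then
    pattern='*.nc'          # netcdf restart files present
  elif ls "$dir1"/restart.* >/dev/null 2>&1; then
    pattern='restart.*'     # binary restart files present
  else
    pattern='*.log'         # fall back to the log comparison
  fi
  result=PASS
  for f in "$dir1"/$pattern; do
    # FAIL if nothing matched at all or any file pair differs
    [ -e "$f" ] || { result=FAIL; break; }
    cmp -s "$f" "$dir2/$(basename "$f")" || result=FAIL
  done
  echo "$result"
}
```

A real version would filter the log files down to the ice diagnostics before comparing, as discussed above, rather than comparing them whole.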
Just FYI:
@eclare108213, I agree. If restarts exist, compare them. If not, revert to the log files. Neither is completely adequate to check full bit-for-bit of all model variables, especially if the log file does not contain much info. We can add a check that some minimum output data is in the log file for it to pass. We'll deal with the global sum bfb issue in the log files separately.
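The minimum-output check suggested above could look something like this. The function name, the choice of "istep1" as the diagnostic marker, and the default threshold are all hypothetical.

```shell
# Hypothetical minimum-content check for the log comparison:
# only allow a log-based PASS if the log actually contains a
# reasonable number of timestep diagnostic lines, so an empty
# or truncated log cannot silently pass.
log_has_min_diags() {
  logfile=$1
  min=${2:-5}                      # assumed default threshold
  count=$(grep -c 'istep1' "$logfile" || true)
  [ "$count" -ge "$min" ]          # nonzero exit means FAIL
}
```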
I'll replace
I'll update the script to look at the diagnostic output. I've been using the date-time group from the final restart file to determine which portions of the log to look at, so I'll have to change my approach in order to handle cases where there are no restart files. I'll try implementing the
I've created a new
The 4 failing tests are all log comparisons. For example,
If I manually compare the 2 log files, there are differences in
Should I change the

I still need to update the
Thanks @mattdturner. I think the implementation should be that the log files are compared only if restarts are not compared. It looks like the comparebfb script is only used for the restart test now, not yet for the compare option between different cases. It would be good to support both, but I guess that could be deferred.

With regard to the different results for the initial and log file, that is something we need to look into more. For the time being, allow the log comparison to fail if this diff is encountered. We need to decide whether that variable should be excluded or whether that diagnostic needs to be updated. Finally, it's worth adding some sort of check to the log compare so it can pass only if a reasonable set of diagnostics are being compared.

Also, can you explain what is being compared in the log files? What is this doing?

set base_out =

Is it comparing everything after the first istep1 string except min, max, sum? I don't see how that could work for the restart files, as the amount of info is different in the two log files.
I am in the process of modifying the compare option between 2 cases to use the

I'll modify the scripting to only compare the log files if restart files are not compared.
What this line does is print out every diagnostic output (i.e.,
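As a rough illustration of the extraction being discussed, something like the following could pull the diagnostics out of a log for comparison. This is a sketch under assumptions (the "istep1" marker and the min/max/sum filtering come from the discussion above; the function name and exact commands are hypothetical), not the actual set base_out line.

```shell
# Hypothetical sketch: extract the diagnostics for the last
# timestep from a log file, dropping the min/max/sum lines whose
# global sums may differ at round-off across decompositions.
extract_diags() {
  logfile=$1
  # line number of the last "istep1" occurrence in the log
  last=$(grep -n 'istep1' "$logfile" | tail -1 | cut -d: -f1)
  [ -n "$last" ] || return 1       # no timestep output: fail
  tail -n +"$last" "$logfile" | grep -v -E 'min|max|sum'
}
```

Two logs would then be compared by diffing the output of extract_diags for each, rather than the raw files.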
Sounds good Matt. Comparing the last timestep makes sense, and if there is no timestep output, the test should fail. Making the compare script work for restarts and for compare cases could be a little tricky. In one case, you have two files (or more for binary) to compare in the same directory with different filenames. In the other, you are comparing files, possibly of the same name, in two different directories. The comparisons are the same; it's really a matter of keeping track of the dirs/files, especially when you are trying to decide if there are netcdf or binary files there. If it becomes too complicated, we can defer, but I think it's worth trying to do. Thanks again!
… comparing log files, and a difference is encountered, the difference is printed to the log.
Ok, I just updated the scripts. I modified the log comparison slightly to print differences to the log file. I also modified the

A
I think this looks pretty good. It looks like you can pass in 1 directory, 2 directories, or 2 files. With 1 directory, it assumes it's a restart comparison. With 2 directories, it checks for binary, log, or netcdf comparison. With 2 files, it looks like it's assuming they are both netcdf files. Is that correct? That seems reasonable at this point, and it looks like a clean implementation with lots of reuse, which is great. @mattdturner, do you think I should try this myself before approving, or are you fairly confident we're OK?
That is correct. After testing the
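The argument handling described above (1 directory, 2 directories, or 2 files) might be dispatched roughly like this. This is a hypothetical sketch; the function name and labels are illustrative and do not come from the actual script.

```shell
# Hypothetical sketch of the argument classification described
# above: one directory means a restart comparison within it, two
# directories trigger the binary/log/netcdf check, and two files
# are assumed to be netcdf files.
classify_args() {
  if [ $# -eq 1 ] && [ -d "$1" ]; then
    echo restart
  elif [ $# -eq 2 ] && [ -d "$1" ] && [ -d "$2" ]; then
    echo directory
  elif [ $# -eq 2 ] && [ -f "$1" ] && [ -f "$2" ]; then
    echo netcdf
  else
    echo usage-error
  fi
}
```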
I think we can merge this, but won't in case we have some other commits we're trying to stage first. Will try to get this merged by the end of the week.
This PR modifies the restart test validation code to compare log file output in addition to restart files. It will also loop through all restart files for the final date-time group in binary I/O configurations.
Developer(s): Matt Turner
Please suggest code Pull Request reviewers in the column at right.
Are the code changes bit for bit, different at roundoff level, or more substantial? bfb
Please include the link to test results or paste the summary block from the bottom of the testing output below.
Does this PR create or have dependencies on Icepack or any other models? No
Is the documentation being updated with this PR? (Y/N) N
If not, does the documentation need to be updated separately at a later time? (Y/N) N
Note: "Documentation" includes information on the wiki and .rst files in doc/source/,
which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.
This addresses restart and run comparisons #164 and part of Test I/O options #136
The comparison of the logs only analyzes the output after writing the final restart file. For example, when writing NetCDF files the log includes:
When writing binary files, the log includes:
It then appends a PASS or FAIL to test_output with a tag of test_log.

The log comparison is currently hard-coded in the test_restart.script file. If we want to use it for comparing iobinary runs to netcdf output runs, we will likely need to modify it and/or move it to a new script.