-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MYSTRAN 15.0 Update #9
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Merging upstream updates
made compliant with the rest of the files as per Ceans observation
… when its name may be uninit'd
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This update contains a lot of commits, and I'll do my best to describe what has been done.
Bug fixes
Most of this update is bug fixes, and most of the bug fixes have been memory-related. As in, MYSTRAN was doing illegal memory operations that caused crashes and/or undefined behaviour. Thorough debugging with
valgrind
was key! Here's every relevant commit and the rationale behind the code changes:ELMOFF
was comparing QUAD4 areas to an uninitializedEPS1
, causing it to think positive areas were negative. The fix: initializeEPS1
to beEPSIL(1)
as everywhere else does.GPWG
had some debug prints that tried to write the value ofITABLE
before it was set. The fix: move the initialization ofITABLE
a couple lines up.ANY_U_P_OUTPUT
, a variable with all assignments and uses commented out, appeared on an IF statement, thus sometimes being read even though it was never set. The fix: comment it out of the IF statement as well.WRITE_ELEM_ENGR_FORCE
caused a buffer to be read past the part that actually held values. The fix: make the indexes consistent with the initialized part.WRITE_ELEM_STRESSES
forgot to set a value calledSTRESS_CODE
, used to indicate what's about to be written. The fix: put in a placeholder value for now -- there's other placeholder values in that same subroutine anyway.GET_KE_OFFSET
was communicating the wrong shape of BUSH element parameters (KEO_BUSH
) to a matrix operation (MATPUT
). The fix: communicate the correct shape.I2_GMN
regardless of whether it had been just deallocated or not, triggering a bad read. The fix: move the call into the conditional for whetherI2_GMN
was just deallocated.SCNUM
vector, containing subcase numbers, was being allocated with the wrong size, resulting in an overflow when it was written to. In MODES solutions, the "subcases" are the number of requested eigenvectors. The fix: setLSUB
, which determines the maximum number of subcases in the solution, as soon was we see the EIGR bulk data card "chosen" by the METHOD case control card duringLOADB
. This way, all subsequent (i.e. afterLINK0
) allocations ofSCNUM
will be the correct size. There are no writes toSCNUM
that could overflow before, since we set the size before the limit is even communicated to the rest of the program.EIGRL
cards and theOGROUT
vector. The fix: same as above, but settingLSUB
as soon as we see the rightEIGRL
card duringLOADB
.OCID
(or the continuation card is absent), that means we use the GA-GB line as the x-axis for computing offsets. However, when GA and GB coincide, that line can't be computed, but an attempt was being made to compute them anyway, resulting in some nasty memory bugs. The fix: detect zero-length BUSH elements with negativeOCID
and mark them as non-offset, thus avoiding the problematic computations.Almost all of these were detected because they manifested as (usually) silent illegal memory operations. Fixing them helps ensure MYSTRAN behavior is deterministic and prevents crashes or garbled output.
New features
This release was focused on fixing bugs. I just added a
NOCOUNTS
parameter (968909c and 88c294a) to disable those "counter" progress-indicating writes to standard output. Why? Because not all terminals can handle it (see: VSCode's debugger terminal), and neither can files. It's disabled by default, so counters are still there if you don't set it. Counters also make some runs longer, since every operation is punctuated by awrite
syscall. Being able to disable them is good if you're debugging, writing standard output to a log file, or running a large model where 10% time savings mean hours.If you see any counter getting past
NOCOUNTS
, sorry, they're really hard to find in the code! Tell us via Discord, a GitHub issue, a forum post, whatever you prefer.Build and documentation changes
I updated the build instructions (20afd36 and c53e591) to make them more consistent with the way our build script handles libraries.
Also, we bumped our SuperLU version to commit xiaoyeli/superlu@76b2c9a in order to integrate a fix for an invalid read while trying to factor a singular matrix.
Oh, and I added the
--fcheck=all
flag to enable runtime checks. This way, memory bugs are less silent -- as opposed to lurking around until someone decides to run onvalgrind
. There's a small performance hit, comparable to enablingNOCOUNTS
, but the benefits far outweigh the cost. Besides, there are more pressing bottlenecks, and one can always disable that flag by editingCMakeLists.txt
if they're so inclined. But I do not recommend that. At all.And the manual needs updating, of course. So do some other documents. That's in the works.
Other changes
BANDIT is now disabled by default (8e1050d), regardless of what solver you choose to use. Why? Because it's broken. It's written in Fortran 77 with tons of nonstandard stuff, and getting it to work would be moot: if you're running a model large enough for the banded solver to need BANDIT, you shouldn't be using the banded solver. Use SuperLU:
PARAM,SOLLIB,SPARSE
.Extricating BANDIT from the code is low-priority due to its very high difficulty-impact ratio. That means you can still enable it, but it won't work. Don't do it.
Finally, this update is a bump to the
15.0
version, so it was also set on the code (d31a1a6).A quick warning
The bump in the SuperLU version might require you do a clean build. If you get random linker errors, run a
make clean
, deletesuperlu/
,Binaries/
,CMakeCache.txt
, andCMakeFiles/
. Then, re-runcmake
with the appropriate arguments. Sorry about that, it's got to do with how CMake handles Git submodules.Results and final remarks
Phew, that was a lot of commits. Let me summarize what this update means.
All models in our current benchmark set now run without any illegal memory operations. So do other models that used to cause trouble, like
cshear.bdf
(part of the build verification suite) andlarge_shelled_beam.bdf
(user-reported, I think).That doesn't mean results are necessarily correct. Not all bugs are memory bugs! But this update means that many models that used to trigger nondeterministic behaviour and/or crashes now run to completion. This way, we can actually get the results to verify they're correct, and also work on new features unencumbered by crashes. Not bad for a month's work, eh?
Feedback is very much welcome, and there's more on the way!