-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
interface raven running raven #341
Conversation
…ravenRunningRaven
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problems found, but a few comments that could use addressing. Once we resolve those and the Windows test failures, we should be good to go.
Another general comment: did we add various parallel tests for this (shared, internalParallel, and MPI)? I feel like when we qsub the master and slave cases both especially there's a lot that could behave strangely.
from utils import utils | ||
from CodeInterfaceBaseClass import CodeInterfaceBase | ||
import DataObjects | ||
#import RavenData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this guy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover...removing
if not self.hasMethods['scalar'] and not self.hasMethods['noscalar']: | ||
raise IOError(self.printTag +' ERROR: the conversionModule "'+extModForVarsManipulationPath | ||
+'" does not contain any of the usable methods! Expected at least ' | ||
+'one of: "manipulateScalarSampledVariables" and/or "manipulateScalarSampledVariables"!') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no issue with this being an error, but would it be better as a warning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to keep this as an error since the only reason why a user should "import" that module is to implement one of those 2 methods. Otherwise the interface is not going to do anything with it.
dataObjectsToReturn[dataObjectInfo[0]] = DataObjects.returnInstance(dataObjectInfo[1],None) | ||
dataObjectsToReturn[dataObjectInfo[0]]._readMoreXML(dataObjectInfo[2]) | ||
dataObjectsToReturn[dataObjectInfo[0]].name = filename | ||
dataObjectsToReturn[dataObjectInfo[0]].loadXMLandCSV(os.path.join(workingDir,self.innerWorkingDir)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh, this is nice. I like this implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:P
splittedComp = subNode.split("@") | ||
component = splittedComp[0] | ||
attribPath = "" | ||
attribConstruct = OrderedDict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, why OrderedDict? Does the order of the attributes come into play?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the orderedDict is needed to detect if we need to change the attribute value or the node itself (see the if statement later on)
allowAddNodes.append(pathNode) | ||
else: | ||
allowAddNodes.append(None) | ||
#allowAddNodes.append(returnElement.findall(pathNode)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commented code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removing...
modifyDict['Distributions|Normal@name:Andrea|sigma'] = 2 | ||
parser.printInput(lupo,outfile="lupo1.xml") | ||
lupo2 = parser.modifyOrAdd(modifyDict,allowAdd=True) | ||
parser.printInput(lupo2,outfile="lupo2.xml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the unit test, but it seems to only apply to your user directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right....leftover... I will modify it.
for inKey, value in unstructuredInputs.items(): | ||
appendDict['inputSpaceParams'][inKey] = value[i] | ||
for outKey, value in outputs.items(): | ||
appendDict['outputSpaceParams'][outKey] = value[i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be reading this wrong, but does this always take all the results from the slave raven run, or can it take selective results, as requested by the output data object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now it takes all results.
I opened an issue to modify it later. (#363)
<batchSize>3</batchSize> | ||
<!-- | ||
<internalParallel>True</internalParallel> | ||
--> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commented xml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removing
</RunInfo> | ||
|
||
<Files> | ||
</Files> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot it.
removing...
@@ -0,0 +1,186 @@ | |||
<?xml version="1.0" ?> | |||
<Simulation verbosity="debug"> | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we usually have <TestInfo>
for code interface tests? I don't know if it makes sense to have it for a sub-input of a RRR test, but maybe it would help with clarity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that all the CodeInterface tests have a TestInfo.
But probably the sub-input TestInfo should be removed (since the real test is the master input)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks fine with my first look. i had some comments and questions.
The RAVEN interface is meant to provide the possibility to execute a RAVEN input file | ||
driving a set of SLAVE RAVEN calculations. For example, if the user wants to optimize the parameters | ||
of a surrogate model (e.g. minimizing the distance between the surrogate predictions and the real data), he | ||
can achieve this task setting up a RAVEN input file (master) that performs an optimization on the feature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"this task setting up a" -> "this task by setting up a"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed...
can achieve this task setting up a RAVEN input file (master) that performs an optimization on the feature | ||
space characterized by the surrogate model parameters, whose training and validation assessment is performed in the SLAVE | ||
RAVEN runs. | ||
// There are some limitations for this interface: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should "//" be "\\"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
completely right
\item only a maximum of two Outstreams can be collected (1 PointSet and 1 HistorySet) | ||
\end{itemize} | ||
|
||
Like for every other interface, most of the RAVEN workflow stays the same independently on the type of Model (i.e. Code) is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"independently on the type of Model (i.e. Code) is used." -> "independently of which type of Model (i.e. Code) is used."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed
</MonteCarlo> | ||
</Samplers> | ||
\end{lstlisting} | ||
In the above example, it can be inferred that each XML node (subnode) needs to be separated by a ``|'' separator. In addition, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, in XPath, / is used as a separator. Since XPath is not being followed exactly, the full syntax supported should probably be documented.
warnings.simplefilter('default',DeprecationWarning) | ||
|
||
import os | ||
from __builtin__ import any as b_any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we need to create 'b_any'? Could we just use 'any'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It prevents the need to explicitly create the list out of a generator. I don't know what the efficiency gain is in this instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>>> import __builtin__
>>> __builtin__.any is any
True
I think b_any is exactly the same as any.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only reference comes from searching it:
https://stackoverflow.com/questions/16380326/check-if-substring-is-in-a-list-of-strings
On further inspection, it appears the OP of the stack overflow question had possibly overwritten his any
and needed a replacement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they did from numpy import *
which is forbidden in the raven code standards anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And for what it is worth, the full correct way to define b_any is:
import sys
if sys.version_info.major > 2:
from builtins import any as b_any
else:
from __builtin__ import any as b_any
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed
@ Out, None. | ||
""" | ||
if os.path.basename(xmlNode.find("executable").text) != 'raven_framework': | ||
raise IOError(self.printTag+' ERROR: executable must be "raven_framework" (in whatever location)!') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite sure why this is an error, but I supposed it is easier to switch an error to something that is allowed than the other way round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to raise an error because in the raven_framework we do a lot of checks on the libraries that we do not do in Driver.py, so I thought it would be safer to force the user to use the "raven_framework" wrapper.
Do you think I should allow also to directly point to Driver.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I would rather not allow pointing directly to Driver.py. I think leaving it an error for now is fine, and if someone thinks of a good reason to use something else, we can change it at that point.
except: | ||
return failure | ||
readLines = outputToRead.readlines() | ||
if not b_any("raise" in x for x in readLines[-20:]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are we looking for that would not be found by the executable's return code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing with Josh, it looks like that the return code is masked by the jobhandler thread.
@ Out, None | ||
""" | ||
# the dirName is actually in workingDir/StepName/prefix => we need to go back 2 dirs | ||
dirName = os.path.join(currentDirName,"../../") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly this should be:
dirName = os.path.dirname(os.path.dirname(currentDirName))
``
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the separator os.path.sep
. This suggested implementation does not work (looks like)
return returnElement | ||
|
||
if __name__ == '__main__': | ||
parser = RAVENparser("/Users/alfoa/projects/raven_github/raven/tests/framework/test_redundant_inputs.xml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this part is hard coded. Should the if __name__ == '__main__'
block be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed...
framework/Models/Code.py
Outdated
if type(outputFile).__name__ == 'dict': | ||
ravenCase = True | ||
if ravenCase and self.code.__class__.__name__ != 'RAVEN': | ||
self.raiseAnError(RuntimeError, 'The return argument from "finalizeCodeOutput" must be a str containing the new output file root!') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This basically forbids any code besides RAVEN from returning a dictionary. I am trying to decided if this is good or bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I did on purpose to avoid any other code interface to return a different thing than a outputfilename root...
I thinks we should allow to do that but the system should be better designed...
inputAsString = "".join([s for s in inputAsString.strip().splitlines(True) if s.strip()]) | ||
if outfile==None: | ||
outfile =self.inputfile | ||
IOfile = open(outfile,'wb') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this is opened as binary 'wb' as opposed to 'w'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. No reason. I converted it into "w+"
Addressed Josh's and Paul's comments so far... now working on the BatchSize as we discussed in person... |
At least on my windows computer, changing bin/bash to usr/bin/bash makes the RAVEN running RAVEN test work, so I pushed bc221d9 |
Interesting, on both my PC (which is a pretty new install) and RAVENHOME (which is older) there are both /bin/bash and /usr/bin/bash. |
That is interesting. And if RAVENHOME has it, then I don't know why the windows test is failing because the only reason it failed on my windows computer was because of the /bin/bash problem. After I fixed that it worked. |
After looking closer, I noticed that "ls -l /bin/bash" finds bash, but if you look in windows explorer, there is no "C:\msys64\bin\bash.exe", but there is a "C:\msys64\usr\bin\bash.exe" So that explains what is happening. msys is doing some magic to make it look like there is a /bin/bash. |
Okay, it passed on RAVENHOME: |
@PaulTalbot-INL Have all your comments been address? If yes, can you switch it to a comment, not a request changes. If no, which still need to be addressed? (All the ones on non-stale diffs seem to have been addressed.) Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes were caught.
Thanks @PaulTalbot-INL I'll review it again and do the checklist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code reviewed and I approve.
readLines = outputToRead.readlines() | ||
for badMessage in ["Traceback ","raise","IOError"]: | ||
if not any(badMessage in x for x in readLines): | ||
failure = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If "IOError" is not in the lines, this will have failure = False (even if say Traceback or raise is found). I think you want:
foundBadMessage = False
for badMessage in ["Traceback ","raise","IOError"]:
if any(badMessage in x for x in readLines):
foundBadMessage = True
if not foundBadMessage:
failure = False
As soon as the tests finish, I'll do the checklist. |
Job Test linux on bf2e128 : invalidated by @alfoa understandable error O_o |
Basically, it is failing because conda failed to list the environment.
I suppose if we consider this a problem, I can think of two ways to fix it. The other would be to modify the line in scripts/setup_raven_libs
Update: |
Job Test linux on bf2e128 : invalidated by @joshua-cogliati-inl conda failed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve changes.
Just checking, we added a test to cover the failure case @AaronEpiney was seeing, right? The rerun with raven running raven, including the lock file and the job handler in kwargs? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved
@PaulTalbot-INL No I did not add the test yet but I am working on since I came in. Thanks |
Added test to check the correct collection of failed runs from EnsembleModel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked cfa6e32 and I approve.
The tests have passed. Is it ready for final review? |
Yes it is. |
Pull Request Description
What issue does this change request address? (Use "#" before the issue to link it, i.e., #42.)
Closes #342
What are the significant changes in functionality due to this change request?
New interface to run RAVEN running RAVEN
For Change Control Board: Change Request Review
The following review must be completed by an authorized member of the Change Control Board.
<internalParallel>
to True. Tests addedNo change
Checklist passes. By @joshua-cogliati-inl on 2017-Sept-22