Alfoa/dataobject rework finalize ensemble (#565)

* Closes #541
* Update GenericCodeInterface.py
* fixed tester (#528)
* ensemble model pb weights for variables coming from functions
* fixed single-value-duplication error for SKL ROMs (#555)
* fixed single-value-duplication error
* fixed test framework/ensembleModelTests.testEnsembleModelWith2CodesAndAliasAndOptionalOutputs
* modified order of input output to avoid regolding
* Reducing DataObject Attribute Functionality (#278)
* Enabling the data attribute tests and fixing the operators for PointSets. TODO: Break the data_attributes test down to be more granular and fix the outputPivotValue on the HistorySets.
* Splitting the test files for the DataObject attributes and correcting some malformations in the subsequent input files. TODO: Fix the attributes for the history set when operating from a Model.
* Fixing HistorySet data attribute test case to look for the correct file.
* Correcting attributions for data object tests. maljdan had only moved the files. The original tests were designed by others. TODO: verify if test results are valid or the result of incorrect gold files.
* Reducing the number of DataObjects needed in the shared suite of DataObject attribute tests.
* Regolding the DataObject HistorySet attributes files to respect the outputPivotVal specified for stories2.
* Picking up where I left off, trying to recall what modifications still need to be done to the HistorySet.
* Regolding a test case on data attributes, removing dead code from the HistorySet and updating some aspects of the PointSet.
* Removing data attribute feature set with explanation in comments. Cleaning old code.
* Regolding fixed test case.
* Reverting changes to ensemble test and accommodating unstructured inputs.
* addressed misunderstanding in HistorySet
* added HSToPSOperator PP
* added documentation for new interface
* finished new PP
* addressed first comments
* addressed Congjian's comments
* updated XSD
* moving ahead
* fixed test framework/ensembleModelTests.testEnsembleModelLinearThreadWithTimeSeries
* fixed framework/ensembleModelTests.testEnsembleModelLinearParallelWithOptimizer
* fixed framework/CodeInterfaceTests.DymolaTestTimeDepNoExecutableEnsembleModel
* fixed framework/PostProcessors/InterfacedPostProcessor.metadataUsageInInterfacePP
* fixed new test files coming from devel
* updated InterfacedPP HStoPSOperator
* fixed xsd
* added documenation for DataSet
* added conversion script from old HDF5 to new HDF5
* Update DataObjects.xsd
* remove white space
* Update database_data.tex
* Update postprocessor.tex
* removed unuseful __init__ in Melcor interface
* addressed Congjian's comments
idaholab · Feb 5, 2018 · 3067d8b
1 parent 7c58f90
commit 3067d8b

Showing 75 changed files with 3,082 additions and 1,892 deletions.
diff --git a/developer_tools/XSDSchemas/DataObjects.xsd b/developer_tools/XSDSchemas/DataObjects.xsd
@@ -3,10 +3,12 @@
 <!-- *********************************************************************** -->
 <!--                               DataObjects                               -->
 <!-- *********************************************************************** -->
+
   <xsd:complexType name="DataObjectsData">
     <xsd:sequence>
-      <xsd:element name="PointSet" type="commonDataObjectsData"  minOccurs="0" maxOccurs="unbounded"/>
-      <xsd:element name="HistorySet"    type="commonDataObjectsData"  minOccurs="0" maxOccurs="unbounded"/>
+      <xsd:element name="PointSet"   type="commonDataObjectsData"  minOccurs="0" maxOccurs="unbounded"/>
+      <xsd:element name="HistorySet" type="commonDataObjectsData"  minOccurs="0" maxOccurs="unbounded"/>
+      <xsd:element name="DataSet"    type="dataSetData"            minOccurs="0" maxOccurs="unbounded"/>
     </xsd:sequence>
   </xsd:complexType>
 
@@ -23,15 +25,15 @@
       <!-- Either inputRow or inputPivotValue  will be specified (mutually exclusive) -->
       <xsd:choice>
         <xsd:element name="inputRow"         type="xsd:integer" minOccurs="0"/>
-        <xsd:element name="inputPivotValue"  type="xsd:float"   minOccurs="0"/>
+        <!--  <xsd:element name="inputPivotValue"  type="xsd:float"   minOccurs="0"/> -->
       </xsd:choice>
       <xsd:element name="pivotParameter"   type="xsd:string"  minOccurs="0"/>
       <!-- Either operator, outputRow, or outputPivotValue will be specified
            We need a way to figure out how to do mutually exclusive events like this -->
       <xsd:choice>
         <xsd:element name="outputRow"        type="xsd:integer" minOccurs="0"/>
         <xsd:element name="operator"         type="operatorType"  minOccurs="0"/>
-        <xsd:element name="outputPivotValue" type="xsd:string"  minOccurs="0"/>
+        <!-- <xsd:element name="outputPivotValue" type="xsd:string"  minOccurs="0"/> -->
       </xsd:choice>
    </xsd:sequence>
   </xsd:complexType>
@@ -45,4 +47,23 @@
     <xsd:attribute name="name"         type="xsd:string"  use="required"/>
     <xsd:attribute name="hierarchical" type="RavenBool"/>
   </xsd:complexType>
+
+  <xsd:complexType name="dataSetData">
+      <xsd:sequence>
+          <xsd:element name="Input"   type="xsd:string"  minOccurs="0"/>
+          <xsd:element name="Output"  type="xsd:string"  minOccurs="0"/>
+          <xsd:element name="Index"   type="IndexType"   minOccurs="0" maxOccurs="unbounded"/>
+          <xsd:element name="options" type="optionsType" minOccurs="0"/>
+      </xsd:sequence>
+      <xsd:attribute name="name"         type="xsd:string"  use="required"/>
+      <xsd:attribute name="hierarchical" type="RavenBool"/>
+  </xsd:complexType>
+
+  <xsd:complexType name="IndexType">
+      <xsd:simpleContent>
+          <xsd:extension base="xsd:string">
+              <xsd:attribute name="var" type="xsd:string" use="required"/>
+          </xsd:extension>
+      </xsd:simpleContent>
+  </xsd:complexType>
 </xsd:schema>
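
Note on the schema change above: the new dataSetData type allows one Input, one Output, and any number of Index elements, each carrying a required var attribute. The following is a minimal sketch, not taken from the RAVEN code base, of how a node matching that type could be read with Python's standard xml.etree.ElementTree. The function name parse_data_set, the helper _split, and the returned dictionary layout are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Sample node mirroring the DataSet example added to database_data.tex.
SAMPLE = """
<DataSet name="aDataSet">
  <Input>pipe_Area,pipe_Dh</Input>
  <Output>pipe_Hw,pipe_Tw</Output>
  <Index var="time">pipe_Hw,pipe_Tw</Index>
</DataSet>
"""


def _split(text):
    """Turn a comma separated string (or None) into a list of names."""
    return [item.strip() for item in (text or "").split(",") if item.strip()]


def parse_data_set(xml_text):
    """Hypothetical reader for a <DataSet> node (illustration only)."""
    node = ET.fromstring(xml_text)
    if node.tag != "DataSet":
        raise ValueError("expected a <DataSet> node, got <%s>" % node.tag)
    name = node.get("name")
    if name is None:
        raise ValueError("the 'name' attribute is required")
    indexes = {}
    for index_node in node.findall("Index"):
        var = index_node.get("var")
        if var is None:
            raise ValueError("<Index> requires the 'var' attribute")
        # Map the index (dimension) name to the variables depending on it.
        indexes[var] = _split(index_node.text)
    return {
        "name": name,
        "inputs": _split(node.findtext("Input")),
        "outputs": _split(node.findtext("Output")),
        "indexes": indexes,
    }


if __name__ == "__main__":
    print(parse_data_set(SAMPLE))
```

The sketch checks only the two attributes the schema marks as required (name on the DataSet, var on each Index); it does not validate against the XSD itself.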
diff --git a/doc/user_manual/database_data.tex b/doc/user_manual/database_data.tex
@@ -8,16 +8,15 @@ \section{DataObjects}
 These interactions are made possible through a data handling system that each
 entity understands.
 %
-This system, neglecting the grammar imprecision, is called the ``DataObjects''
-system.
+This system is called the ``DataObjects'' framework.
 
 The \xmlNode{DataObjects} tag is a container of data objects of various types that can
 be constructed during the execution of a particular calculation flow.
 %
 These data objects can be used as input or output for a particular
 \textbf{Model} (see Roles' meaning in section \ref{sec:models}), etc.
 %
-Currently, RAVEN supports 4 different data types, each with a particular
+Currently, RAVEN supports 3 different data types, each with a particular
 conceptual meaning.
 %
 These data types are instantiated as sub-nodes in the \xmlNode{DataObjects} block of
@@ -35,11 +34,35 @@ \section{DataObjects}
   input domain.
   %
   It can be considered a mapping between multiple sets of parameters in the
-  input space and the resulting sets of temporal evolutions in the output
+  input space and the resulting sets of temporal evolution in the output
   space.
   %
+   \item \xmlNode{DataSet} is a generalization of the previously described DataObject, 
+   aimed to contain a mixture of data (scalars, arrays, etc.). The variables here stored
+   can be independent (i.e. scalars) or dependent (arrays) on certain dimensions (e.g. time, coordinates, etc.).
+  %
+  It can be considered a mapping between multiple sets of parameters in the
+  input space (both dependent and/or independent) and the resulting sets of evolution in the output
+  space (both dependent and/or independent).
+  %
+  \nb \textcolor{red} {\textbf{The  \xmlNode{DataSet} is currently usable in the  \xmlNode{EnsembleModel} only (see \ref{subsec:models_EnsembleModel} )}}
 \end{itemize}
 
+In summary, the DataObjects accept the following data in their input/output spaces:
+\begin{table}[h]
+\centering
+\caption{DataObjects' accepted data formats.}
+\label{DataObjectDataFormatTable}
+\begin{tabular}{|c|c|c|}
+\hline
+\textbf{DataObject}                        & \textbf{Input Space} & \textbf{Output Space} \\ \hline
+{\color[HTML]{FE0000} \textit{PointSet}}   & scalars              & scalars               \\ \hline
+{\color[HTML]{FE0000} \textit{HistorySet}} & scalars              & vectors               \\ \hline
+{\color[HTML]{FE0000} \textit{DataSet}}    & any                  & any                   \\ \hline
+\end{tabular}
+\end{table}
+
+
 As noted above, each data object represents a mapping between a set of
 parameters and the resulting outcomes.
 %
@@ -50,12 +73,13 @@ \section{DataObjects}
   <DataObjects>
     <PointSet name='***'>...</PointSet>
     <HistorySet name='***'>...</HistorySet>
+    <DataSet name='***'>...</DataSet>
   </DataObjects>
    ...
 </Simulation>
 \end{lstlisting}
 
-Independent of the type of data, the respective XML node has the following
+Independently on the type of data, the respective XML node has the following
 available attributes:
 \vspace{-5mm}
 \begin{itemize}
@@ -109,17 +133,32 @@ \section{DataObjects}
   \default{False}
 \end{itemize}
 \vspace{-5mm}
-In each XML node (e.g. \xmlNode{PointSet} or \xmlNode{HistorySet}), the user
+In each XML node (e.g. \xmlNode{PointSet}, \xmlNode{HistorySet} or  \xmlNode{DataSet}), the user
 needs to specify the following sub-nodes:
 \begin{itemize}
-  \item \xmlNode{Input}, \xmlDesc{comma separated string, required field} lists
+  \item \xmlNode{Input}, \xmlDesc{comma separated string, required field}, lists
   the input parameters to which this data is connected.
   %
-  \item \xmlNode{Output}, \xmlDesc{comma separated string, required field} lists
+  \item \xmlNode{Output}, \xmlDesc{comma separated string, required field}, lists
   the output parameters to which this data is connected.
   %
 \end{itemize}
 
+The  \xmlNode{PointSet} and  \xmlNode{HistorySet} objects are a specialization of the  \xmlNode{DataSet} where the 
+independent dimensions are defaulted to none, for the \xmlNode{PointSet}, or to the \textit{pivotParameter} (e.g. time), for the \xmlNode{HistorySet}.
+If a  \xmlNode{DataSet} needs to be constructed, an additional information (e.g. sub-node) needs to be inputted, the  \xmlNode{Index}:
+
+\begin{itemize}
+  \item \xmlNode{Index}, \xmlDesc{comma separated string, required field (if  \xmlNode{DataSet})}, lists
+  the dependent variables that depend on this index (specified through the attribute  \xmlAttr{var}).
+  This XML node requires the following attribute:
+   \begin{itemize}
+       \item \xmlAttr{var}, \xmlDesc{required string attribute}, the dimension name of this index (e.g. time)
+   \end{itemize}
+  %
+\end{itemize}
+
+
 In addition to the XML nodes \xmlNode{Input} and \xmlNode{Output} explained above, the user
 can optionally specify a XML node named  \xmlNode{options}. The  \xmlNode{options} node can
 contain the following optional XML sub-nodes:
@@ -147,41 +186,58 @@ \section{DataObjects}
        \xmlNode{outputRow} and  \xmlNode{outputPivotValue} can not be inputted (mutually exclusive).
        \\\nb This XML node is available for DataObjects of type \xmlNode{PointSet} only;
   %
-  \item \xmlNode{pivotParameter}, \xmlDesc{string, optional field} the name of
-    the parameter whose values need to be used as reference for the values
-    specified in the XML nodes \xmlNode{inputPivotValue},
-    \xmlNode{outputPivotValue}, or \xmlNode{inputPivotValue} (if inputted).
-    This field can be used, for example, if the driven code output file uses  a
-    different name for the variable ``time'' or to specify a different reference
-    parameter (e.g. PRESSURE). Default value is \xmlString{time}.
-    \\\nb The variable specified here should be monotonic; the code does not
-    check for eventual oscillation and is going to take the first occurance for
-    the values specified in the XML nodes \xmlNode{inputPivotValue},
-    \xmlNode{outputPivotValue}, and  \xmlNode{inputPivotValue};
-  %
-  \item \xmlNode{inputPivotValue}, \xmlDesc{float, optional field}, the value of the \xmlNode{pivotParameter} at which the input space needs to be retrieved
-    If this node is inputted, the node  \xmlNode{inputRow} can not be inputted (mutually exclusive).
-    %
-  \item \xmlNode{outputPivotValue}. This node can be either a float or a list of floats, depending on the type of DataObjects:
-   \begin{itemize}
-      \item if \xmlNode{HistorySet},\xmlNode{outputPivotValue}, \xmlDesc{list of floats, optional field},  list of values of the
-                          \xmlNode{pivotParameter} at which the output space needs to be retrieved;
-      \item if \xmlNode{PointSet},\xmlNode{outputPivotValue}, \xmlDesc{float, optional field},  the value of the \xmlNode{pivotParameter}
-         at which the output space needs to be retrieved. If this node is inputted, the node  \xmlNode{outputRow} can not be inputted (mutually exclusive);
-   \end{itemize}
+  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+  %%%% This feature is being disabled until the DataObjects handle data in a
+  %%%% more encapsulated fashion. When the data can handle this all internally
+  %%%% then we can re-add this feature. As of now, determining the rows
+  %%%% associated to the outputPivotValue or inputPivotValue requires knowing
+  %%%% information outside of the "value" passed into
+  %%%% DataObject.updateOutputValue or DataObject.updateInputValue, thus the
+  %%%% caller has to do this computation, but currently the caller occurs in ~50
+  %%%% different places according to my grep of "updateOutputValue"
+  %%%% -- DPM 8/29/2017
+  % \item \xmlNode{pivotParameter}, \xmlDesc{string, optional field} the name of
+  %   the parameter whose values need to be used as reference for the values
+  %   specified in the XML nodes \xmlNode{inputPivotValue},
+  %   \xmlNode{outputPivotValue}, or \xmlNode{inputPivotValue} (if inputted).
+  %   This field can be used, for example, if the driven code output file uses  a
+  %   different name for the variable ``time'' or to specify a different reference
+  %   parameter (e.g. PRESSURE). Default value is \xmlString{time}.
+  %   \\\nb The variable specified here should be monotonic; the code does not
+  %   check for eventual oscillation and is going to take the first occurance for
+  %   the values specified in the XML nodes \xmlNode{inputPivotValue},
+  %   \xmlNode{outputPivotValue}, and  \xmlNode{inputPivotValue};
+  % %
+  % \item \xmlNode{inputPivotValue}, \xmlDesc{float, optional field}, the value of the \xmlNode{pivotParameter} at which the input space needs to be retrieved
+  %   If this node is inputted, the node  \xmlNode{inputRow} can not be inputted (mutually exclusive).
+  %   %
+  % \item \xmlNode{outputPivotValue}. This node can be either a float or a list of floats, depending on the type of DataObjects:
+  %  \begin{itemize}
+  %     \item if \xmlNode{HistorySet},\xmlNode{outputPivotValue}, \xmlDesc{list of floats, optional field},  list of values of the
+  %                         \xmlNode{pivotParameter} at which the output space needs to be retrieved;
+  %     \item if \xmlNode{PointSet},\xmlNode{outputPivotValue}, \xmlDesc{float, optional field},  the value of the \xmlNode{pivotParameter}
+  %        at which the output space needs to be retrieved. If this node is inputted, the node  \xmlNode{outputRow} can not be inputted (mutually exclusive);
+  %  \end{itemize}
+  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
   %
   It needs to be noticed that if the optional nodes in the block \xmlNode{options} are not inputted, the following default are applied:
     \begin{itemize}
-       \item the Input space is retrieved from the first row in the CSVs files or HDF5 tables (if the parameters specified are not among the variables sampled by RAVEN);
+       \item the Input space (scalars) is retrieved from the first row in the CSVs files or HDF5 tables (if the parameters specified are not 
+          among the variables sampled by RAVEN); In case of the  \xmlNode{DataSet}, if any of the input space variables depend on an \xmlNode{Index}, they 
+          are going to be linked to the \xmlNode{Index} variable
        \item  the output space defaults are as follows:
        \begin{itemize}
            \item if \xmlNode{PointSet}, the output space is retrieved from the last row in the CSVs files or HDF5 tables;
            \item if \xmlNode{HistorySet}, the output space is represented by all the rows found in  the CSVs or HDF5 tables.
+           \item if \xmlNode{DataSet}, the output space of the variables that do not depends on any index is retrieved from the last row in the CSVs files or HDF5 tables; 
+           on the contrary, the output space of the variables that depends on indexes is represented by all the rows found in  the CSVs or HDF5 tables (if they match 
+           with the indexes' dimension)
         \end{itemize}
     \end{itemize}
+
 \end{itemize}
 
-\begin{lstlisting}[style=XML,morekeywords={inputTs,operator,hierarchical,name,history}]
+\begin{lstlisting}[style=XML,morekeywords={operator,hierarchical,name,var}]
   <DataObjects>
     <PointSet name='outTPS1'>
       <options>
@@ -191,31 +247,19 @@ \section{DataObjects}
       <Input>pipe_Area,pipe_Dh,Dummy1</Input>
       <Output>pipe_Hw,pipe_Tw,time</Output>
     </PointSet>
-    <PointSet name='outTPS2'>
-        <options>
-            <inputPivotValue>0.00011</inputPivotValue>
-            <pivotParameter>time</pivotParameter>
-            <outputPivotValue>0.00012345</outputPivotValue>
-        </options>
-        <Input>pipe_Area,pipe_Dh,Dummy1</Input>
-        <Output>pipe_Hw,pipe_Tw,time</Output>
-    </PointSet>
     <HistorySet name='stories1'>
         <options>
             <inputRow>1</inputRow>
+            <outputRow>-1</outputRow>
         </options>
       <Input>pipe_Area,pipe_Dh</Input>
       <Output>pipe_Hw,pipe_Tw,time</Output>
     </HistorySet>
-    <HistorySet name='stories2'>
-        <options>
-            <inputRow>-1</inputRow>
-            <pivotParameter>time</pivotParameter>
-            <outputPivotValue>0.0002 0.0003 0.0004</outputPivotValue>
-        </options>
-        <Input>pipe_Area,pipe_Dh</Input>
-        <Output>pipe_Hw,pipe_Tw,time</Output>
-    </HistorySet>
+    <DataSet name='aDataSet'>
+      <Input>pipe_Area,pipe_Dh</Input>
+      <Output>pipe_Hw,pipe_Tw</Output>
+      <Index var="time">pipe_Hw,pipe_Tw</Index>
+    </DataSet>
   </DataObjects>
 \end{lstlisting}
 

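Note on the documentation change above: the revised text states that, when no options are given, a PointSet takes its output space from the last row, a HistorySet from all rows, and a DataSet from the last row for variables that do not depend on an index and from all rows for variables that do. The rule can be sketched in Python as below. This is an illustration under stated assumptions, not RAVEN's implementation: the function name default_output_space and the table layout (variable name mapped to a list of row values) are hypothetical.

```python
def default_output_space(kind, table, indexed=()):
    """Apply the documented default output-row selection (illustration only).

    kind    -- "PointSet", "HistorySet" or "DataSet"
    table   -- dict mapping a variable name to its list of row values
    indexed -- names of the variables that depend on an Index (DataSet only)
    """
    if kind == "PointSet":
        # Scalars only: keep the last row of every variable.
        return {var: values[-1] for var, values in table.items()}
    if kind == "HistorySet":
        # Histories: keep all the rows of every variable.
        return {var: list(values) for var, values in table.items()}
    if kind == "DataSet":
        # Mixed: all rows for indexed variables, last row for the others.
        return {
            var: list(values) if var in indexed else values[-1]
            for var, values in table.items()
        }
    raise ValueError("unknown DataObject type: %s" % kind)


if __name__ == "__main__":
    rows = {"pipe_Hw": [1.0, 2.0, 3.0], "pipe_Tw": [10.0, 20.0, 30.0]}
    print(default_output_space("PointSet", rows))
    print(default_output_space("DataSet", rows, indexed={"pipe_Hw"}))
```

The sketch leaves out the documented condition that indexed variables must match the index dimension, as well as the input-space default (first row).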
diff --git a/doc/user_manual/existing_interfaces.tex b/doc/user_manual/existing_interfaces.tex
@@ -83,7 +83,7 @@ \subsection{Generic Interface}
 the GenericCode interface can be invoked using the \xmlNode{outputFile}
 node in which the output file name (CSV only) must be specified.
 For example, in the previous example, say instead of \texttt{-a gen.two} and \texttt{-o myOut}
-in the command line, the code always produce a CSV file named ``fixed\_output.csv''; 
+in the command line, the code always produce a CSV file named ``fixed\_output.csv'';
 
 Then, our example XML for the code would be
 

diff --git a/doc/user_manual/model.tex b/doc/user_manual/model.tex
@@ -1510,8 +1510,8 @@ \subsection{EnsembleModel}
         a data object defined in the \xmlNode{DataObjects} block (see
         Section~\ref{sec:DataObjects}).
         %
-        Currently, the  \xmlNode{EnsembleModel} accept ``DataObjects''  both of type
-        ``PointSet'' and ``HistorySet''.
+        Currently, the  \xmlNode{EnsembleModel} accept all ``DataObjects''  types:
+        ``PointSet'',  ``HistorySet'' and ``DataSet''.
         \nb The  \xmlNode{TargetEvaluation} is primary used for input-output identification. If the linked
         DataObject is not placed as additional output of the Step where the EnsembleModel is used, it will
         not be filled with the data coming from the calculation and it will be kept empty.
@@ -1530,7 +1530,7 @@ \subsection{EnsembleModel}
         As an example, the Output dataObjects cannot contained variables that are defined at the Ensemble model level.
         %
         The user can specify as many \xmlNode{Output} (s) as needed. The optional \xmlNode{Output}s  can be of both classes ``DataObjects'' and ``Databases'' 
-        (e.g. PointSet, HistorySet, HDF5).
+        (e.g. PointSet, HistorySet, DataSet, HDF5).
         \nb \textbf{The \xmlNode{Output} (s) here specified MUST be listed in the Step in which the EnsembleModel is used.}
     \end{itemize}
   %
@@ -1561,6 +1561,21 @@ \subsection{EnsembleModel}
         otherwise an error will be raised.
   \end{itemize}
 \end{itemize}
+
+\nb \textcolor{red} { \textbf{ It is crucial to understand that the choice of the \xmlNode{DataObject} used as
+ \newline \xmlNode{TargetEvaluation} determines how the data are going to be transferred from a model to
+  the other. If for example the chain of models is $A \rightarrow B$:}}
+\begin{itemize}
+  \item \textcolor{red} { \textbf{ If model $B$ expects as input scalars and outputs time-series, the \xmlNode{TargetEvaluation}  
+  of the  model $B$ will be a \textit{HistorySet} and the  \xmlNode{TargetEvaluation} of the model $A$ will be either 
+  a \textit{PointSet} or a \textit{DataSet} (where the output variables that need to be transferred to the model $A$ are scalars) }    }
+   \item \textcolor{red} { \textbf{ If model $B$ expects as input scalars and time-series and outputs time-series or scalars or both, the \xmlNode{TargetEvaluation}  
+  of the  model $B$ will be a \textit{DataSet} and the \newline  \xmlNode{TargetEvaluation} of the model $A$ will be either 
+  a \textit{HistorySet} or a \textit{DataSet}  }    }
+  \item \textcolor{red} { \textbf{ If both model $A$ and $B$ expect as input scalars and output scalars, the \xmlNode{TargetEvaluation}  
+  of the  both models  $A$  and $B$ will be  \textit{PointSet}s  }  }
+\end{itemize}
+
 \textbf{Example (Linear System):}
 \begin{lstlisting}[style=XML,morekeywords={subType,debug,name,class,type}]
 <Simulation>