Alfoa/dataobject rework finalize ensemble (#565)
* Closes #541

* Update GenericCodeInterface.py

* fixed tester (#528)

* ensemble model pb weights for variables coming from functions

* fixed single-value-duplication error for SKL ROMs (#555)

* fixed single-value-duplication error

* fixed test framework/ensembleModelTests.testEnsembleModelWith2CodesAndAliasAndOptionalOutputs

* modified order of input output to avoid regolding

* Reducing DataObject Attribute Functionality (#278)

* Enabling the data attribute tests and fixing the operators for PointSets. TODO: Break the data_attributes test down to be more granular and fix the outputPivotValue on the HistorySets.

* Splitting the test files for the DataObject attributes and correcting some malformations in the subsequent input files. TODO: Fix the attributes for the history set when operating from a Model.

* Fixing HistorySet data attribute test case to look for the correct file.

* Correcting attributions for data object tests. maljdan had only moved the files. The original tests were designed by others. TODO: verify if test results are valid or the result of incorrect gold files.

* Reducing the number of DataObjects needed in the shared suite of DataObject attribute tests.

* Regolding the DataObject HistorySet attributes files to respect the outputPivotVal specified for stories2.

* Picking up where I left off, trying to recall what modifications still need to be done to the HistorySet.

* Regolding a test case on data attributes, removing dead code from the HistorySet and updating some aspects of the PointSet.

* Removing data attribute feature set with explanation in comments. Cleaning old code.

* Regolding fixed test case.

* Reverting changes to ensemble test and accommodating unstructured inputs.

* addressed misunderstanding in HistorySet

* added HSToPSOperator PP

* added documentation for new interface

* finished new PP

* addressed first comments

* addressed Congjian's comments

* updated XSD

* moving ahead

* fixed test framework/ensembleModelTests.testEnsembleModelLinearThreadWithTimeSeries

* fixed framework/ensembleModelTests.testEnsembleModelLinearParallelWithOptimizer

* fixed framework/CodeInterfaceTests.DymolaTestTimeDepNoExecutableEnsembleModel

* fixed framework/PostProcessors/InterfacedPostProcessor.metadataUsageInInterfacePP

* fixed new test files coming from devel

* updated InterfacedPP HStoPSOperator

* fixed xsd

* added documentation for DataSet

* added conversion script from old HDF5 to new HDF5

* Update DataObjects.xsd

* remove white space

* Update database_data.tex

* Update postprocessor.tex

* removed unuseful __init__ in Melcor interface

* addressed Congjian's comments
alfoa authored and wangcj05 committed Feb 5, 2018
1 parent 7c58f90 commit 3067d8b
Showing 75 changed files with 3,082 additions and 1,892 deletions.
29 changes: 25 additions & 4 deletions developer_tools/XSDSchemas/DataObjects.xsd
@@ -3,10 +3,12 @@
<!-- *********************************************************************** -->
<!-- DataObjects -->
<!-- *********************************************************************** -->

<xsd:complexType name="DataObjectsData">
<xsd:sequence>
<xsd:element name="PointSet" type="commonDataObjectsData" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="HistorySet" type="commonDataObjectsData" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="PointSet" type="commonDataObjectsData" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="HistorySet" type="commonDataObjectsData" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="DataSet" type="dataSetData" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>

@@ -23,15 +25,15 @@
<!-- Either inputRow or inputPivotValue will be specified (mutually exclusive) -->
<xsd:choice>
<xsd:element name="inputRow" type="xsd:integer" minOccurs="0"/>
<xsd:element name="inputPivotValue" type="xsd:float" minOccurs="0"/>
<!-- <xsd:element name="inputPivotValue" type="xsd:float" minOccurs="0"/> -->
</xsd:choice>
<xsd:element name="pivotParameter" type="xsd:string" minOccurs="0"/>
<!-- Either operator, outputRow, or outputPivotValue will be specified
We need a way to figure out how to do mutually exclusive events like this -->
<xsd:choice>
<xsd:element name="outputRow" type="xsd:integer" minOccurs="0"/>
<xsd:element name="operator" type="operatorType" minOccurs="0"/>
<xsd:element name="outputPivotValue" type="xsd:string" minOccurs="0"/>
<!-- <xsd:element name="outputPivotValue" type="xsd:string" minOccurs="0"/> -->
</xsd:choice>
</xsd:sequence>
</xsd:complexType>
@@ -45,4 +47,23 @@
<xsd:attribute name="name" type="xsd:string" use="required"/>
<xsd:attribute name="hierarchical" type="RavenBool"/>
</xsd:complexType>

<xsd:complexType name="dataSetData">
<xsd:sequence>
<xsd:element name="Input" type="xsd:string" minOccurs="0"/>
<xsd:element name="Output" type="xsd:string" minOccurs="0"/>
<xsd:element name="Index" type="IndexType" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="options" type="optionsType" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" use="required"/>
<xsd:attribute name="hierarchical" type="RavenBool"/>
</xsd:complexType>

<xsd:complexType name="IndexType">
<xsd:simpleContent>
<xsd:extension base="xsd:string">
<xsd:attribute name="var" type="xsd:string" use="required"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
</xsd:schema>
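
Review note: the new `dataSetData` and `IndexType` definitions above can be exercised with a small structural check. The sketch below is not RAVEN code and does not perform real XSD validation; it only mirrors the constraints visible in this hunk (required `name` attribute, `Input`/`Output`/`Index`/`options` in sequence order, required `var` on each `Index`). The helper name `check_data_set` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample node mirroring the DataSet example added to database_data.tex.
SAMPLE = """
<DataSet name="aDataSet">
  <Input>pipe_Area,pipe_Dh</Input>
  <Output>pipe_Hw,pipe_Tw</Output>
  <Index var="time">pipe_Hw,pipe_Tw</Index>
</DataSet>
"""

# Children allowed by the dataSetData xsd:sequence, in declaration order.
ALLOWED_CHILDREN = ["Input", "Output", "Index", "options"]


def check_data_set(node):
    """Hypothetical helper: return a list of problems found in a <DataSet> node."""
    problems = []
    if "name" not in node.attrib:
        problems.append("missing required attribute 'name'")
    last_position = 0
    counts = {}
    for child in node:
        if child.tag not in ALLOWED_CHILDREN:
            problems.append(f"unexpected element <{child.tag}>")
            continue
        position = ALLOWED_CHILDREN.index(child.tag)
        if position < last_position:
            problems.append(f"<{child.tag}> is out of sequence order")
        last_position = position
        counts[child.tag] = counts.get(child.tag, 0) + 1
        # IndexType declares the 'var' attribute with use="required".
        if child.tag == "Index" and "var" not in child.attrib:
            problems.append("<Index> is missing required attribute 'var'")
    # Only Index has maxOccurs="unbounded"; the others default to at most one.
    for tag in ("Input", "Output", "options"):
        if counts.get(tag, 0) > 1:
            problems.append(f"<{tag}> may appear at most once")
    return problems


print(check_data_set(ET.fromstring(SAMPLE)))
```

For actual schema validation, an XSD-aware validator run against DataObjects.xsd would be the appropriate tool; this check only illustrates what the new types require.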
144 changes: 94 additions & 50 deletions doc/user_manual/database_data.tex
@@ -8,16 +8,15 @@ \section{DataObjects}
These interactions are made possible through a data handling system that each
entity understands.
%
This system, neglecting the grammar imprecision, is called the ``DataObjects''
system.
This system is called the ``DataObjects'' framework.

The \xmlNode{DataObjects} tag is a container of data objects of various types that can
be constructed during the execution of a particular calculation flow.
%
These data objects can be used as input or output for a particular
\textbf{Model} (see Roles' meaning in section \ref{sec:models}), etc.
%
Currently, RAVEN supports 4 different data types, each with a particular
Currently, RAVEN supports 3 different data types, each with a particular
conceptual meaning.
%
These data types are instantiated as sub-nodes in the \xmlNode{DataObjects} block of
@@ -35,11 +34,35 @@ \section{DataObjects}
input domain.
%
It can be considered a mapping between multiple sets of parameters in the
input space and the resulting sets of temporal evolutions in the output
input space and the resulting sets of temporal evolution in the output
space.
%
\item \xmlNode{DataSet} is a generalization of the previously described DataObject,
aimed to contain a mixture of data (scalars, arrays, etc.). The variables here stored
can be independent (i.e. scalars) or dependent (arrays) on certain dimensions (e.g. time, coordinates, etc.).
%
It can be considered a mapping between multiple sets of parameters in the
input space (both dependent and/or independent) and the resulting sets of evolution in the output
space (both dependent and/or independent).
%
\nb \textcolor{red} {\textbf{The \xmlNode{DataSet} is currently usable in the \xmlNode{EnsembleModel} only (see \ref{subsec:models_EnsembleModel} )}}
\end{itemize}

In summary, the DataObjects accept the following data in their input/output spaces:
\begin{table}[h]
\centering
\caption{DataObjects' accepted data formats.}
\label{DataObjectDataFormatTable}
\begin{tabular}{|c|c|c|}
\hline
\textbf{DataObject} & \textbf{Input Space} & \textbf{Output Space} \\ \hline
{\color[HTML]{FE0000} \textit{PointSet}} & scalars & scalars \\ \hline
{\color[HTML]{FE0000} \textit{HistorySet}} & scalars & vectors \\ \hline
{\color[HTML]{FE0000} \textit{DataSet}} & any & any \\ \hline
\end{tabular}
\end{table}


As noted above, each data object represents a mapping between a set of
parameters and the resulting outcomes.
%
@@ -50,12 +73,13 @@ \section{DataObjects}
<DataObjects>
<PointSet name='***'>...</PointSet>
<HistorySet name='***'>...</HistorySet>
<DataSet name='***'>...</DataSet>
</DataObjects>
...
</Simulation>
\end{lstlisting}

Independent of the type of data, the respective XML node has the following
Independently on the type of data, the respective XML node has the following
available attributes:
\vspace{-5mm}
\begin{itemize}
@@ -109,17 +133,32 @@ \section{DataObjects}
\default{False}
\end{itemize}
\vspace{-5mm}
In each XML node (e.g. \xmlNode{PointSet} or \xmlNode{HistorySet}), the user
In each XML node (e.g. \xmlNode{PointSet}, \xmlNode{HistorySet} or \xmlNode{DataSet}), the user
needs to specify the following sub-nodes:
\begin{itemize}
\item \xmlNode{Input}, \xmlDesc{comma separated string, required field} lists
\item \xmlNode{Input}, \xmlDesc{comma separated string, required field}, lists
the input parameters to which this data is connected.
%
\item \xmlNode{Output}, \xmlDesc{comma separated string, required field} lists
\item \xmlNode{Output}, \xmlDesc{comma separated string, required field}, lists
the output parameters to which this data is connected.
%
\end{itemize}

The \xmlNode{PointSet} and \xmlNode{HistorySet} objects are a specialization of the \xmlNode{DataSet} where the
independent dimensions are defaulted to none, for the \xmlNode{PointSet}, or to the \textit{pivotParameter} (e.g. time), for the \xmlNode{HistorySet}.
If a \xmlNode{DataSet} needs to be constructed, an additional information (e.g. sub-node) needs to be inputted, the \xmlNode{Index}:

\begin{itemize}
\item \xmlNode{Index}, \xmlDesc{comma separated string, required field (if \xmlNode{DataSet})}, lists
the dependent variables that depend on this index (specified through the attribute \xmlAttr{var}).
This XML node requires the following attribute:
\begin{itemize}
\item \xmlAttr{var}, \xmlDesc{required string attribute}, the dimension name of this index (e.g. time)
\end{itemize}
%
\end{itemize}


In addition to the XML nodes \xmlNode{Input} and \xmlNode{Output} explained above, the user
can optionally specify a XML node named \xmlNode{options}. The \xmlNode{options} node can
contain the following optional XML sub-nodes:
@@ -147,41 +186,58 @@ \section{DataObjects}
\xmlNode{outputRow} and \xmlNode{outputPivotValue} can not be inputted (mutually exclusive).
\\\nb This XML node is available for DataObjects of type \xmlNode{PointSet} only;
%
\item \xmlNode{pivotParameter}, \xmlDesc{string, optional field} the name of
the parameter whose values need to be used as reference for the values
specified in the XML nodes \xmlNode{inputPivotValue},
\xmlNode{outputPivotValue}, or \xmlNode{inputPivotValue} (if inputted).
This field can be used, for example, if the driven code output file uses a
different name for the variable ``time'' or to specify a different reference
parameter (e.g. PRESSURE). Default value is \xmlString{time}.
\\\nb The variable specified here should be monotonic; the code does not
check for eventual oscillation and is going to take the first occurance for
the values specified in the XML nodes \xmlNode{inputPivotValue},
\xmlNode{outputPivotValue}, and \xmlNode{inputPivotValue};
%
\item \xmlNode{inputPivotValue}, \xmlDesc{float, optional field}, the value of the \xmlNode{pivotParameter} at which the input space needs to be retrieved
If this node is inputted, the node \xmlNode{inputRow} can not be inputted (mutually exclusive).
%
\item \xmlNode{outputPivotValue}. This node can be either a float or a list of floats, depending on the type of DataObjects:
\begin{itemize}
\item if \xmlNode{HistorySet},\xmlNode{outputPivotValue}, \xmlDesc{list of floats, optional field}, list of values of the
\xmlNode{pivotParameter} at which the output space needs to be retrieved;
\item if \xmlNode{PointSet},\xmlNode{outputPivotValue}, \xmlDesc{float, optional field}, the value of the \xmlNode{pivotParameter}
at which the output space needs to be retrieved. If this node is inputted, the node \xmlNode{outputRow} can not be inputted (mutually exclusive);
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% This feature is being disabled until the DataObjects handle data in a
%%%% more encapsulated fashion. When the data can handle this all internally
%%%% then we can re-add this feature. As of now, determining the rows
%%%% associated to the outputPivotValue or inputPivotValue requires knowing
%%%% information outside of the "value" passed into
%%%% DataObject.updateOutputValue or DataObject.updateInputValue, thus the
%%%% caller has to do this computation, but currently the caller occurs in ~50
%%%% different places according to my grep of "updateOutputValue"
%%%% -- DPM 8/29/2017
% \item \xmlNode{pivotParameter}, \xmlDesc{string, optional field} the name of
% the parameter whose values need to be used as reference for the values
% specified in the XML nodes \xmlNode{inputPivotValue},
% \xmlNode{outputPivotValue}, or \xmlNode{inputPivotValue} (if inputted).
% This field can be used, for example, if the driven code output file uses a
% different name for the variable ``time'' or to specify a different reference
% parameter (e.g. PRESSURE). Default value is \xmlString{time}.
% \\\nb The variable specified here should be monotonic; the code does not
% check for eventual oscillation and is going to take the first occurance for
% the values specified in the XML nodes \xmlNode{inputPivotValue},
% \xmlNode{outputPivotValue}, and \xmlNode{inputPivotValue};
% %
% \item \xmlNode{inputPivotValue}, \xmlDesc{float, optional field}, the value of the \xmlNode{pivotParameter} at which the input space needs to be retrieved
% If this node is inputted, the node \xmlNode{inputRow} can not be inputted (mutually exclusive).
% %
% \item \xmlNode{outputPivotValue}. This node can be either a float or a list of floats, depending on the type of DataObjects:
% \begin{itemize}
% \item if \xmlNode{HistorySet},\xmlNode{outputPivotValue}, \xmlDesc{list of floats, optional field}, list of values of the
% \xmlNode{pivotParameter} at which the output space needs to be retrieved;
% \item if \xmlNode{PointSet},\xmlNode{outputPivotValue}, \xmlDesc{float, optional field}, the value of the \xmlNode{pivotParameter}
% at which the output space needs to be retrieved. If this node is inputted, the node \xmlNode{outputRow} can not be inputted (mutually exclusive);
% \end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
It needs to be noticed that if the optional nodes in the block \xmlNode{options} are not inputted, the following default are applied:
\begin{itemize}
\item the Input space is retrieved from the first row in the CSVs files or HDF5 tables (if the parameters specified are not among the variables sampled by RAVEN);
\item the Input space (scalars) is retrieved from the first row in the CSVs files or HDF5 tables (if the parameters specified are not
among the variables sampled by RAVEN); In case of the \xmlNode{DataSet}, if any of the input space variables depend on an \xmlNode{Index}, they
are going to be linked to the \xmlNode{Index} variable
\item the output space defaults are as follows:
\begin{itemize}
\item if \xmlNode{PointSet}, the output space is retrieved from the last row in the CSVs files or HDF5 tables;
\item if \xmlNode{HistorySet}, the output space is represented by all the rows found in the CSVs or HDF5 tables.
\item if \xmlNode{DataSet}, the output space of the variables that do not depends on any index is retrieved from the last row in the CSVs files or HDF5 tables;
on the contrary, the output space of the variables that depends on indexes is represented by all the rows found in the CSVs or HDF5 tables (if they match
with the indexes' dimension)
\end{itemize}
\end{itemize}

\end{itemize}

\begin{lstlisting}[style=XML,morekeywords={inputTs,operator,hierarchical,name,history}]
\begin{lstlisting}[style=XML,morekeywords={operator,hierarchical,name,var}]
<DataObjects>
<PointSet name='outTPS1'>
<options>
@@ -191,31 +247,19 @@ \section{DataObjects}
<Input>pipe_Area,pipe_Dh,Dummy1</Input>
<Output>pipe_Hw,pipe_Tw,time</Output>
</PointSet>
<PointSet name='outTPS2'>
<options>
<inputPivotValue>0.00011</inputPivotValue>
<pivotParameter>time</pivotParameter>
<outputPivotValue>0.00012345</outputPivotValue>
</options>
<Input>pipe_Area,pipe_Dh,Dummy1</Input>
<Output>pipe_Hw,pipe_Tw,time</Output>
</PointSet>
<HistorySet name='stories1'>
<options>
<inputRow>1</inputRow>
<outputRow>-1</outputRow>
</options>
<Input>pipe_Area,pipe_Dh</Input>
<Output>pipe_Hw,pipe_Tw,time</Output>
</HistorySet>
<HistorySet name='stories2'>
<options>
<inputRow>-1</inputRow>
<pivotParameter>time</pivotParameter>
<outputPivotValue>0.0002 0.0003 0.0004</outputPivotValue>
</options>
<Input>pipe_Area,pipe_Dh</Input>
<Output>pipe_Hw,pipe_Tw,time</Output>
</HistorySet>
<DataSet name='aDataSet'>
<Input>pipe_Area,pipe_Dh</Input>
<Output>pipe_Hw,pipe_Tw</Output>
<Index var="time">pipe_Hw,pipe_Tw</Index>
</DataSet>
</DataObjects>
\end{lstlisting}
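
Review note: the default row selection described in this manual section (used when no `options` node is given) can be illustrated with a short sketch. It is not RAVEN's implementation; the CSV values are made up, and `read_columns` and `default_selection` are hypothetical names. It assumes the input variables are not sampled by RAVEN, so they are read from the first row.

```python
import csv
import io

# Made-up CSV standing in for a code output file; the values are illustrative only.
CSV_TEXT = """time,pipe_Area,pipe_Hw,pipe_Tw
0.0,1.5,10.0,300.0
0.1,1.5,11.0,310.0
0.2,1.5,12.0,320.0
"""


def read_columns(text):
    """Parse CSV text into a dict mapping each column name to a list of floats."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return {name: [float(row[name]) for row in rows] for name in rows[0]}


def default_selection(columns, inputs, outputs, indexed=()):
    """Apply the documented defaults.

    Inputs come from the first row. Outputs listed in `indexed` keep all rows;
    every other output is taken from the last row.
    """
    selected = {name: columns[name][0] for name in inputs}
    for name in outputs:
        selected[name] = columns[name] if name in indexed else columns[name][-1]
    return selected


columns = read_columns(CSV_TEXT)
outputs = ["pipe_Hw", "pipe_Tw"]

# PointSet: scalar outputs, taken from the last row.
point_set = default_selection(columns, ["pipe_Area"], outputs)
# HistorySet: every output keeps all rows.
history_set = default_selection(columns, ["pipe_Area"], outputs, indexed=outputs)
# DataSet: only the outputs attached to an Index keep all rows.
data_set = default_selection(columns, ["pipe_Area"], outputs, indexed=["pipe_Hw"])

print(point_set)
print(history_set)
print(data_set)
```

The `indexed` argument plays the role of the variables listed under an `Index` node: those keep their full evolution, while the remaining outputs collapse to a single value.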

2 changes: 1 addition & 1 deletion doc/user_manual/existing_interfaces.tex
@@ -83,7 +83,7 @@ \subsection{Generic Interface}
the GenericCode interface can be invoked using the \xmlNode{outputFile}
node in which the output file name (CSV only) must be specified.
For example, in the previous example, say instead of \texttt{-a gen.two} and \texttt{-o myOut}
in the command line, the code always produce a CSV file named ``fixed\_output.csv'';
in the command line, the code always produce a CSV file named ``fixed\_output.csv'';

Then, our example XML for the code would be

21 changes: 18 additions & 3 deletions doc/user_manual/model.tex
@@ -1510,8 +1510,8 @@ \subsection{EnsembleModel}
a data object defined in the \xmlNode{DataObjects} block (see
Section~\ref{sec:DataObjects}).
%
Currently, the \xmlNode{EnsembleModel} accept ``DataObjects'' both of type
``PointSet'' and ``HistorySet''.
Currently, the \xmlNode{EnsembleModel} accept all ``DataObjects'' types:
``PointSet'', ``HistorySet'' and ``DataSet''.
\nb The \xmlNode{TargetEvaluation} is primary used for input-output identification. If the linked
DataObject is not placed as additional output of the Step where the EnsembleModel is used, it will
not be filled with the data coming from the calculation and it will be kept empty.
@@ -1530,7 +1530,7 @@ \subsection{EnsembleModel}
As an example, the Output dataObjects cannot contained variables that are defined at the Ensemble model level.
%
The user can specify as many \xmlNode{Output} (s) as needed. The optional \xmlNode{Output}s can be of both classes ``DataObjects'' and ``Databases''
(e.g. PointSet, HistorySet, HDF5).
(e.g. PointSet, HistorySet, DataSet, HDF5).
\nb \textbf{The \xmlNode{Output} (s) here specified MUST be listed in the Step in which the EnsembleModel is used.}
\end{itemize}
%
@@ -1561,6 +1561,21 @@ \subsection{EnsembleModel}
otherwise an error will be raised.
\end{itemize}
\end{itemize}

\nb \textcolor{red} { \textbf{ It is crucial to understand that the choice of the \xmlNode{DataObject} used as
\newline \xmlNode{TargetEvaluation} determines how the data are going to be transferred from a model to
the other. If for example the chain of models is $A \rightarrow B$:}}
\begin{itemize}
\item \textcolor{red} { \textbf{ If model $B$ expects as input scalars and outputs time-series, the \xmlNode{TargetEvaluation}
of the model $B$ will be a \textit{HistorySet} and the \xmlNode{TargetEvaluation} of the model $A$ will be either
a \textit{PointSet} or a \textit{DataSet} (where the output variables that need to be transferred to the model $A$ are scalars) } }
\item \textcolor{red} { \textbf{ If model $B$ expects as input scalars and time-series and outputs time-series or scalars or both, the \xmlNode{TargetEvaluation}
of the model $B$ will be a \textit{DataSet} and the \newline \xmlNode{TargetEvaluation} of the model $A$ will be either
a \textit{HistorySet} or a \textit{DataSet} } }
\item \textcolor{red} { \textbf{ If both model $A$ and $B$ expect as input scalars and output scalars, the \xmlNode{TargetEvaluation}
of the both models $A$ and $B$ will be \textit{PointSet}s } }
\end{itemize}
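
Review note: the three cases listed above follow from the DataObject format table added to database_data.tex (PointSet: scalars to scalars; HistorySet: scalars to vectors; DataSet: any to any). A minimal sketch of that lookup, seen from model B's side, is given below. The function name `data_object_for` and the "scalar"/"vector" labels are hypothetical and not part of RAVEN; the choice for model A still has to follow the cases in the note.

```python
def data_object_for(input_kinds, output_kinds):
    """Hypothetical helper: pick the DataObject type able to hold the given data.

    Each argument is an iterable of "scalar" and/or "vector" labels.
    """
    inputs, outputs = set(input_kinds), set(output_kinds)
    unknown = (inputs | outputs) - {"scalar", "vector"}
    if unknown:
        raise ValueError(f"unknown data kinds: {sorted(unknown)}")
    if inputs == {"scalar"} and outputs == {"scalar"}:
        return "PointSet"
    if inputs == {"scalar"} and outputs == {"vector"}:
        return "HistorySet"
    # Any other combination needs the general container.
    return "DataSet"


# The three A -> B cases from the note, as TargetEvaluation choices for model B.
print(data_object_for(["scalar"], ["vector"]))                      # HistorySet
print(data_object_for(["scalar", "vector"], ["scalar", "vector"]))  # DataSet
print(data_object_for(["scalar"], ["scalar"]))                      # PointSet
```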

\textbf{Example (Linear System):}
\begin{lstlisting}[style=XML,morekeywords={subType,debug,name,class,type}]
<Simulation>