EconomicRatio PostProcessor Enhancements (#1763)
* inherited collectOutput method has more comprehensive error/warning checking; no need to overload

* adding steVals from BasicStatistics to EconomicRatio

* update getInputSpecification

* inherited inputToInternal is more comprehensive; use it

* update initialize method to include _ste metadata and all BasicStatistics metrics

* updating _handleInput to allow BasicStatistics metrics through to EconomicRatio and not throw warnings

* updating handling of targets and run method

* update __runLocal to handle vectorVals in BasicStatistics

* __computePower copy/pasted from BasicStatistics; change to _computePower and inherit

* adding standard error calculation of percentiles

* adding threshold name to variable names for percentile, value at risk, sortinoRatio, gainLossRatio, and expectedShortfall

* adding standard error calculation of value at risk similar to percentiles

* update user manual to include requesting BasicStatistics metrics from EconomicRatio

* removing empty list as optional argument, replaced with None

* adding reference for standard error calculations

* adding docstring to constructor

* renaming tealVals to econVals since they don't come from TEAL

* adding defensive check where all values are equal

* update percentile/VaR standard error for data with pivotParameter

* fixing implementation with DataSet and golding files from tests

* remove misleading descriptions that reference TEAL

* fixing implementation for DataSet output

* fixing DataSet implementation

* adding tests for EconomicRatio PostProcessor

* adding additional samples to test
dgarrett622 authored Feb 9, 2022
1 parent 658a7b0 commit a8f779f
Showing 19 changed files with 1,584 additions and 491 deletions.
12 changes: 9 additions & 3 deletions doc/user_manual/PostProcessors/EconomicRatio.tex
@@ -2,6 +2,7 @@ \subsubsection{EconomicRatio}
\label{EconomicRatio}
The \xmlNode{EconomicRatio} post-processor provides the economic metrics from the percent change
period return of the asset or strategy that is given as an input. These metrics measure the risk-adjusted returns.
\nb Any metric from \xmlNode{BasicStatistics} may be requested from \xmlNode{EconomicRatio}.
%
\ppType{EconomicRatio}{EconomicRatio}

@@ -34,12 +35,17 @@ \subsubsection{EconomicRatio}
\itemsep0em
\item \xmlAttr{prefix}, \xmlDesc{required string attribute}, user-defined prefix for the given \textbf{metric}.
For scalar quantities, RAVEN will define a variable with name defined as: ``prefix'' + ``\_'' + ``parameter name''.
For example, if we define ``mean'' as the prefix for \textbf{expectedValue}, and parameter ``x'', then variable
``mean\_x'' will be defined by RAVEN.
For example, if we define ``sharpe'' as the prefix for \textbf{sharpeRatio}, and parameter ``x'', then variable
``sharpe\_x'' will be defined by RAVEN. For \textbf{metrics} that include a ``threshold'' parameter,
RAVEN will define a variable with name defined as: ``prefix'' + ``\_'' + ``threshold'' + ``\_'' + ``parameter name''.
For example, if we define ``VaR'' as the prefix for \textbf{valueAtRisk}, threshold ``0.05'', and parameter name ``x'',
then variable ``VaR\_0.05\_x'' will be defined by RAVEN. If we define ``glr'' as the prefix for \textbf{gainLossRatio},
``median'' as the threshold, and ``x'' as the parameter name, then the variable ``glr\_median\_x''
will be defined by RAVEN.
For matrix quantities, RAVEN will define a variable with name defined as: ``prefix'' + ``\_'' + ``target parameter name'' + ``\_'' + ``feature parameter name''.
For example, if we define ``sen'' as the prefix for \textbf{sensitivity}, target ``y'' and feature ``x'', then
variable ``sen\_y\_x'' will be defined by RAVEN.
\nb These variable will be used by RAVEN for the internal calculations. It is also accessible by the user through
\nb These variables will be used by RAVEN for the internal calculations. They are also accessible to the user through
\textbf{DataObjects} and \textbf{OutStreams}.
\end{itemize}

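The naming rules above reduce to three patterns. The following is a minimal sketch of the convention as described, with invented helper names (this is not RAVEN code):

def scalarName(prefix, target):
    # e.g. prefix 'sharpe' for sharpeRatio and target 'x' -> 'sharpe_x'
    return '_'.join([prefix, target])

def thresholdName(prefix, threshold, target):
    # e.g. prefix 'VaR' for valueAtRisk, threshold '0.05', target 'x' -> 'VaR_0.05_x'
    return '_'.join([prefix, str(threshold), target])

def matrixName(prefix, target, feature):
    # e.g. prefix 'sen' for sensitivity, target 'y', feature 'x' -> 'sen_y_x'
    return '_'.join([prefix, target, feature])

assert thresholdName('glr', 'median', 'x') == 'glr_median_x'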
6 changes: 5 additions & 1 deletion doc/user_manual/postprocessor.tex
@@ -144,7 +144,10 @@ \subsubsection{BasicStatistics}
\item \xmlAttr{prefix}, \xmlDesc{required string attribute}, user-defined prefix for the given \textbf{metric}.
For scalar quantities, RAVEN will define a variable with name defined as: ``prefix'' + ``\_'' + ``parameter name''.
For example, if we define ``mean'' as the prefix for \textbf{expectedValue}, and parameter ``x'', then variable
``mean\_x'' will be defined by RAVEN.
``mean\_x'' will be defined by RAVEN. For \textbf{percentile}, RAVEN will define a variable with name defined as:
``prefix'' + ``\_'' + ``percent'' + ``\_'' + ``parameter name''. For example, if we define ``perc'' as the prefix
for \textbf{percentile}, percent as ``10'', and parameter ``x'', then variable ``perc\_10\_x'' will
be defined by RAVEN.
For matrix quantities, RAVEN will define a variable with name defined as: ``prefix'' + ``\_'' + ``target parameter name'' + ``\_'' + ``feature parameter name''.
For example, if we define ``sen'' as the prefix for \textbf{sensitivity}, target ``y'' and feature ``x'', then
variable ``sen\_y\_x'' will be defined by RAVEN.
@@ -169,6 +172,7 @@ \subsubsection{BasicStatistics}
\item \textbf{sigma}
\item \textbf{skewness}
\item \textbf{kurtosis}
\item \textbf{percentile}
\end{itemize}
RAVEN will define a variable with name defined as: ``prefix for given \textbf{metric}'' + ``\_ste\_'' + ``parameter name'' to
store standard error of given \textbf{metric} with respect to given parameter. This information will be stored in the DataObjects,
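For percentile and its newly added standard error, the convention produces paired names: the value variable plus a companion with an extra ``ste'' segment. A small sketch with an invented helper (not RAVEN API), consistent with the collectOutput naming in the Python diff below:

def percentileNames(prefix, percent, target):
    # value: prefix_percent_target; the standard error inserts an 'ste' segment
    value = '_'.join([prefix, percent, target])        # e.g. 'perc_10_x'
    ste = '_'.join([prefix, percent, 'ste', target])   # e.g. 'perc_10_ste_x'
    return value, ste

assert percentileNames('perc', '10', 'x') == ('perc_10_x', 'perc_10_ste_x')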
143 changes: 109 additions & 34 deletions framework/Models/PostProcessors/BasicStatistics.py
@@ -14,7 +14,7 @@
"""
Created on July 10, 2013
@author: alfoa, wangc
@author: alfoa, wangc, dgarrett622
"""
#External Modules---------------------------------------------------------------
import numpy as np
@@ -23,7 +23,7 @@
from collections import OrderedDict, defaultdict
import six
import xarray as xr

import scipy.stats as stats
#External Modules End-----------------------------------------------------------

#Internal Modules---------------------------------------------------------------
@@ -53,7 +53,7 @@ class BasicStatistics(PostProcessorInterface):
'higherPartialVariance', # Statistic metric not available yet
'higherPartialSigma', # Statistic metric not available yet
'lowerPartialSigma', # Statistic metric not available yet
'lowerPartialVariance' # Statistic metric not available yet
'lowerPartialVariance' # Statistic metric not available yet
]
vectorVals = ['sensitivity',
'covariance',
@@ -67,7 +67,8 @@ class BasicStatistics(PostProcessorInterface):
'variance_ste',
'sigma_ste',
'skewness_ste',
'kurtosis_ste']
'kurtosis_ste',
'percentile_ste']

@classmethod
def getInputSpecification(cls):
@@ -265,37 +266,62 @@ def initialize(self, runInfo, inputs, initDict):
inputMetaKeys = []
outputMetaKeys = []
for metric, infos in self.toDo.items():
steMetric = metric + '_ste'
if steMetric in self.steVals:
for info in infos:
prefix = info['prefix']
for target in info['targets']:
metaVar = prefix + '_ste_' + target if not self.outputDataset else metric + '_ste'
metaDim = inputObj.getDimensions(target)
if len(metaDim[target]) == 0:
inputMetaKeys.append(metaVar)
else:
outputMetaKeys.append(metaVar)
if metric in self.scalarVals + self.vectorVals:
steMetric = metric + '_ste'
if steMetric in self.steVals:
for info in infos:
prefix = info['prefix']
for target in info['targets']:
if metric == 'percentile':
for strPercent in info['strPercent']:
metaVar = prefix + '_' + strPercent + '_ste_' + target if not self.outputDataset else metric + '_ste'
metaDim = inputObj.getDimensions(target)
if len(metaDim[target]) == 0:
inputMetaKeys.append(metaVar)
else:
outputMetaKeys.append(metaVar)
else:
metaVar = prefix + '_ste_' + target if not self.outputDataset else metric + '_ste'
metaDim = inputObj.getDimensions(target)
if len(metaDim[target]) == 0:
inputMetaKeys.append(metaVar)
else:
outputMetaKeys.append(metaVar)
metaParams = {}
if not self.outputDataset:
if len(outputMetaKeys) > 0:
metaParams = {key:[self.pivotParameter] for key in outputMetaKeys}
else:
if len(outputMetaKeys) > 0:
params = {key:[self.pivotParameter,self.steMetaIndex] for key in outputMetaKeys + inputMetaKeys}
params = {}
for key in outputMetaKeys + inputMetaKeys:
# percentile standard error has additional index
if key == 'percentile_ste':
params[key] = [self.pivotParameter, self.steMetaIndex, 'percent']
else:
params[key] = [self.pivotParameter, self.steMetaIndex]
metaParams.update(params)
elif len(inputMetaKeys) > 0:
params = {key:[self.steMetaIndex] for key in inputMetaKeys}
params = {}
for key in inputMetaKeys:
# percentile standard error has additional index
if key == 'percentile_ste':
params[key] = [self.pivotParameter, self.steMetaIndex, 'percent']
else:
params[key] = [self.pivotParameter, self.steMetaIndex]
metaParams.update(params)
metaKeys = inputMetaKeys + outputMetaKeys
self.addMetaKeys(metaKeys,metaParams)

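# Illustration (not part of the commit): for a DataSet output with a
# time-dependent percentile request, the loop above produces metaParams of
# the following shape, assuming self.pivotParameter == 'time' and
# self.steMetaIndex == 'targets' (assumed names, for illustration only):
#
#   {'percentile_ste':    ['time', 'targets', 'percent'],   # extra 'percent' index
#    'expectedValue_ste': ['time', 'targets']}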
def _handleInput(self, paramInput):
def _handleInput(self, paramInput, childVals=None):
"""
Function to handle the parsed paramInput for this class.
@ In, paramInput, ParameterInput, the already parsed input.
@ In, childVals, list, optional, quantities requested from child statistical object
@ Out, None
"""
if childVals is None:
childVals = []
self.toDo = {}
for child in paramInput.subparts:
tag = child.getName()
@@ -355,11 +381,12 @@ def _handleInput(self, paramInput):
elif tag == "multipleFeatures":
self.multipleFeatures = child.value
else:
self.raiseAWarning('Unrecognized node in BasicStatistics "',tag,'" has been ignored!')
if tag not in childVals:
self.raiseAWarning('Unrecognized node in BasicStatistics "',tag,'" has been ignored!')

assert (len(self.toDo)>0), self.raiseAnError(IOError, 'BasicStatistics needs parameters to work on! Please check input for PP: ' + self.name)

def __computePower(self, p, dataset):
def _computePower(self, p, dataset):
"""
Compute the p-th power of weights
@ In, p, int, the power
@@ -383,7 +410,7 @@ def __computeVp(self,p,weights):
@ In, weights, xarray.Dataset, probability weights of all input variables
@ Out, vp, xarray.Dataset, the sum of p-th power of weights
"""
vp = self.__computePower(p,weights)
vp = self._computePower(p,weights)
vp = vp.sum()
return vp

@@ -449,7 +476,7 @@ def _computeKurtosis(self, arrayIn, expValue, variance, pbWeight=None, dim=None)
"""
if dim is None:
dim = self.sampleTag
vr = self.__computePower(2.0, variance)
vr = self._computePower(2.0, variance)
if pbWeight is not None:
unbiasCorr = self.__computeUnbiasedCorrection(4,pbWeight) if not self.biased else 1.0
vp = 1.0/self.__computeVp(1,pbWeight)
@@ -482,7 +509,7 @@ def _computeSkewness(self, arrayIn, expValue, variance, pbWeight=None, dim=None)
"""
if dim is None:
dim = self.sampleTag
vr = self.__computePower(1.5, variance)
vr = self._computePower(1.5, variance)
if pbWeight is not None:
unbiasCorr = self.__computeUnbiasedCorrection(3,pbWeight) if not self.biased else 1.0
vp = 1.0/self.__computeVp(1,pbWeight)
@@ -590,7 +617,6 @@ def _computeWeightedPercentile(self,arrayIn,pbWeight,percent=0.5):
result = sortedWeightsAndPoints[indexL,1]
return result


def __runLocal(self, inputData):
"""
This method executes the postprocessor action. In this case, it computes all the requested statistical FOMs
@@ -730,7 +756,7 @@ def __runLocal(self, inputData):
metric = 'sigma'
if len(needed[metric]['targets'])>0:
self.raiseADebug('Starting "'+metric+'"...')
sigmaDS = self.__computePower(0.5,calculations['variance'][list(needed[metric]['targets'])])
sigmaDS = self._computePower(0.5,calculations['variance'][list(needed[metric]['targets'])])
self.calculations[metric] = sigmaDS
calculations[metric] = sigmaDS
#
@@ -806,7 +832,7 @@ def __runLocal(self, inputData):
metric = 'lowerPartialSigma'
if len(needed[metric]['targets'])>0:
self.raiseADebug('Starting "'+metric+'"...')
lpsDS = self.__computePower(0.5,calculations['lowerPartialVariance'][list(needed[metric]['targets'])])
lpsDS = self._computePower(0.5,calculations['lowerPartialVariance'][list(needed[metric]['targets'])])
calculations[metric] = lpsDS
#
# higherPartialVariance
@@ -826,17 +852,22 @@ def __runLocal(self, inputData):
metric = 'higherPartialSigma'
if len(needed[metric]['targets'])>0:
self.raiseADebug('Starting "'+metric+'"...')
hpsDS = self.__computePower(0.5,calculations['higherPartialVariance'][list(needed[metric]['targets'])])
hpsDS = self._computePower(0.5,calculations['higherPartialVariance'][list(needed[metric]['targets'])])
calculations[metric] = hpsDS

############################################################
# compute standard error for expectedValue
# Begin Standard Error Calculations
#
# Reference for standard error calculations (including percentile):
# B. Harding, C. Tremblay and D. Cousineau, "Standard errors: A review and evaluation of
# standard error estimators using Monte Carlo simulations", The Quantitative Methods of
# Psychology, Vol. 10, No. 2 (2014)
############################################################
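# Summary of the standard-error estimators implemented below, where n is the
# sample size (for weighted data, the Kish equivalent sample size
# (sum w)^2 / sum(w^2), computed as 'equivalentSamples'):
#   SE(mean)         = sigma / sqrt(n)
#   SE(variance)     = sigma^2 * sqrt(2 / (n - 1))
#   SE(sigma)        = sigma / sqrt(2 * (n - 1))
#   SE(skewness)     = sqrt(6*n*(n-1) / ((n-2)*(n+1)*(n+3)))
#   SE(kurtosis)     = 2 * SE(skewness) * sqrt((n^2 - 1) / ((n-3)*(n+5)))
#   SE(percentile p) = sqrt(p*(1-p)/n) / f(x_p), f estimated by Gaussian KDE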
metric = 'expectedValue'
if len(needed[metric]['targets'])>0:
self.raiseADebug('Starting calculation of standard error on "'+metric+'"...')
if self.pbPresent:
factor = self.__computePower(0.5,calculations['equivalentSamples'])
factor = self._computePower(0.5,calculations['equivalentSamples'])
else:
factor = np.sqrt(self.sampleSize)
calculations[metric+'_ste'] = calculations['sigma'][list(needed[metric]['targets'])]/factor
@@ -848,7 +879,7 @@ def __runLocal(self, inputData):
if self.pbPresent:
en = calculations['equivalentSamples'][varList]
factor = 2.0 /(en - 1.0)
factor = self.__computePower(0.5,factor)
factor = self._computePower(0.5,factor)
else:
factor = np.sqrt(2.0/(float(self.sampleSize) - 1.0))
calculations[metric+'_ste'] = calculations['sigma'][varList]**2 * factor
@@ -860,7 +891,7 @@ def __runLocal(self, inputData):
if self.pbPresent:
en = calculations['equivalentSamples'][varList]
factor = 2.0 * (en - 1.0)
factor = self.__computePower(0.5,factor)
factor = self._computePower(0.5,factor)
else:
factor = np.sqrt(2.0 * (float(self.sampleSize) - 1.0))
calculations[metric+'_ste'] = calculations['sigma'][varList] / factor
@@ -878,7 +909,7 @@ def __runLocal(self, inputData):
if self.pbPresent:
en = calculations['equivalentSamples'][varList]
factor = 6.*en*(en-1.)/((en-2.)*(en+1.)*(en+3.))
factor = self.__computePower(0.5,factor)
factor = self._computePower(0.5,factor)
calculations[metric+'_ste'] = xr.full_like(calculations[metric],1.0) * factor
else:
en = float(self.sampleSize)
@@ -891,8 +922,8 @@ def __runLocal(self, inputData):
varList = list(needed[metric]['targets'])
if self.pbPresent:
en = calculations['equivalentSamples'][varList]
factor1 = self.__computePower(0.5,6.*en*(en-1.)/((en-2.)*(en+1.)*(en+3.)))
factor2 = self.__computePower(0.5,(en**2-1.)/((en-3.0)*(en+5.0)))
factor1 = self._computePower(0.5,6.*en*(en-1.)/((en-2.)*(en+1.)*(en+3.)))
factor2 = self._computePower(0.5,(en**2-1.)/((en-3.0)*(en+5.0)))
factor = 2.0 * factor1 * factor2
calculations[metric+'_ste'] = xr.full_like(calculations[metric],1.0) * factor
else:
@@ -957,6 +988,46 @@ def __runLocal(self, inputData):
percentileSet = percentileSet.rename({'quantile':'percent'})
calculations[metric] = percentileSet

# because percentile is different, calculate standard error here
self.raiseADebug('Starting calculation of standard error on "'+metric+'"...')
percentileSteSet = xr.Dataset()
calculatedPercentiles = calculations[metric]
relWeight = pbWeights[list(needed[metric]['targets'])]
for target in needed[metric]['targets']:
targWeight = relWeight[target].values
en = targWeight.sum()**2/np.sum(targWeight**2)
targDa = dataSet[target]
if self.pivotParameter in targDa.sizes.keys():
percentileSte = []
for pct in percent:
subPercentileSte = []
factor = np.sqrt(pct*(1.0 - pct)/en)
for label, group in targDa.groupby(self.pivotParameter):
if group.values.min() == group.values.max():
# all values are the same
subPercentileSte.append(0.0)
else:
# get KDE
kde = stats.gaussian_kde(group.values, weights=targWeight)
val = calculatedPercentiles[target].sel(**{'percent': pct, self.pivotParameter: label}).values
subPercentileSte.append(factor/kde(val)[0])
percentileSte.append(subPercentileSte)
da = xr.DataArray(percentileSte, dims=('percent', self.pivotParameter), coords={'percent': percent, self.pivotParameter: self.pivotValue})
percentileSteSet[target] = da
else:
calcPercentiles = calculatedPercentiles[target]
if targDa.values.min() == targDa.values.max():
# distribution is a delta function, so no KDE construction
percentileSte = list(np.zeros(calcPercentiles.shape))
else:
# get KDE
kde = stats.gaussian_kde(targDa.values, weights=targWeight)
factor = np.sqrt(np.array(percent)*(1.0 - np.array(percent))/en)
percentileSte = list(factor/kde(calcPercentiles.values))
da = xr.DataArray(percentileSte, dims=('percent'), coords={'percent': percent})
percentileSteSet[target] = da
calculations[metric+'_ste'] = percentileSteSet

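# Standalone sketch of the estimator used above (illustrative, simplified):
# per Harding et al. (2014), the standard error of the p-th percentile is
# sqrt(p*(1-p)/n) / f(x_p), with f estimated by a Gaussian KDE and n the Kish
# effective sample size. For equal weights, np.percentile (unweighted) stands
# in for the weighted percentile used above.
def percentileSte(samples, weights, p):
    """
    Sketch: standard error of the p-th percentile, 0 < p < 1.
    """
    en = weights.sum()**2 / np.sum(weights**2)  # Kish effective sample size
    if samples.min() == samples.max():
        return 0.0                              # delta distribution, no KDE
    value = np.percentile(samples, 100.0 * p)   # empirical percentile
    kde = stats.gaussian_kde(samples, weights=weights)
    return np.sqrt(p * (1.0 - p) / en) / kde(value)[0]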
def startVector(metric):
"""
Common method among all metrics for establishing parameters
@@ -1198,6 +1269,10 @@ def getCovarianceSubset(desired):
varName = '_'.join([prefix,percent,target])
percentVal = float(percent)/100.
outputDict[varName] = np.atleast_1d(outputSet[metric].sel(**{'targets':target,'percent':percentVal}))
steMetric = metric + '_ste'
if steMetric in self.steVals:
metaVar = '_'.join([prefix,percent,'ste',target])
outputDict[metaVar] = np.atleast_1d(outputSet[steMetric].sel(**{'targets':target,'percent':percentVal}))
else:
#check if it was skipped for some reason
skip = self.skipped.get(metric, None)
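Two of the refactors recorded above are clearest side by side. Renaming __computePower to _computePower matters because double leading underscores trigger per-class name mangling (the method becomes _BasicStatistics__computePower), which blocks reuse from a subclass; a single underscore keeps it inheritable. Likewise, the new childVals argument of _handleInput lets a child post-processor pass its own node names through the parent parser without tripping the "unrecognized node" warning. A minimal sketch of both patterns, with invented, simplified bodies (not the actual RAVEN classes):

class BasicStatistics:
    def _computePower(self, p, x):
        # single underscore: subclasses inherit this directly
        return x ** p

    def _handleInput(self, nodes, childVals=None):
        if childVals is None:
            childVals = []
        for tag in nodes:
            if tag in ('percentile', 'expectedValue'):
                pass  # quantity handled by the parent
            elif tag not in childVals:
                print('Unrecognized node "%s" has been ignored!' % tag)

class EconomicRatio(BasicStatistics):
    econVals = ['sharpeRatio', 'valueAtRisk']  # quantities the child handles

    def _handleInput(self, nodes):
        # child quantities pass through the parent without warnings
        super()._handleInput(nodes, childVals=self.econVals)

EconomicRatio()._handleInput(['percentile', 'sharpeRatio'])  # no warning
print(EconomicRatio()._computePower(2, 3.0))                 # inherited: 9.0

This is why EconomicRatio can now request every BasicStatistics metric while keeping its own metrics warning-free.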