We currently provide Nulls.skip to allow ignoring nulls when computing reductions over iterators/vectors. But this doesn't work if you want to compute a reduction over a dimension of a matrix or higher-dimensional array.
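To make the limitation concrete, here is a minimal sketch. It assumes a current Julia 1.x session and uses `missing`/`skipmissing` from Base as stand-ins for `null`/`Nulls.skip`, so it is an analogy and not the Nulls API itself. Skipping flattens the matrix into a plain iterator, so a per-dimension reduction has to be written by hand, slice by slice:

```julia
# Assumption: `missing` and `skipmissing` (Julia 1.x Base) stand in for
# `null` and `Nulls.skip`; the matrix is made up for illustration.
A = [1 missing 3;
     4 5 missing]

# Skipping yields a flat iterator: the 2x3 shape is lost.
flat = collect(skipmissing(A))

# A whole-array reduction works fine...
total = sum(skipmissing(A))

# ...but a reduction over one dimension needs a manual loop over slices.
per_column = [sum(skipmissing(col)) for col in eachcol(A)]
```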
DataArrays does this using the skipna=true/skipnull=true argument to reduce. The limitation of that approach is that every reduction function needs to be reimplemented to support it.
In theory, one could achieve the same result using Nulls.replace, choosing the appropriate neutral value for a given reduction (e.g. 0 for sum, 1 for prod). This doesn't work currently because the returned generator is not an AbstractArray, but that could easily be fixed by using a custom type. However, it would not be ideal for this use case since 1) it's cumbersome and harder to discover, 2) it's not generic if you don't know beforehand which kind of reduction will be performed, and 3) some reduction operations do not have a neutral element that is easy to compute (e.g. mean).
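As a rough illustration of the neutral-element idea, the sketch below assumes Julia 1.x and uses `missing` with broadcasted `coalesce` from Base in place of `null` and `Nulls.replace`. It only shows the principle; it is not how Nulls.replace is implemented:

```julia
# Assumption: `missing` and `coalesce` (Julia 1.x Base) stand in for
# `null` and `Nulls.replace`; the matrix is made up for illustration.
A = [1 missing 3;
     4 5 missing]

# Replace each missing entry with the neutral element of the reduction,
# then reduce along a dimension as usual.
col_sums  = sum(coalesce.(A, 0), dims=1)    # neutral element of + is 0
row_prods = prod(coalesce.(A, 1), dims=2)   # neutral element of * is 1
```

The caller has to know the reduction in advance to pick the fill value, and no fill value works for mean: filling with 0 leaves the sum intact but still divides by the full slice length.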
I don't have a simple solution to propose. The only generic approach I can imagine is to have Nulls.skip return an object of a special type, and override Base.mapreducedim! for that type. The advantage compared with skipna/skipnull is that convenience functions like reduce, sum and mean do not have to be modified.
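A very rough sketch of the wrapper-type idea follows. Everything in it is an assumption made for illustration: `SkipNulls` is a hypothetical type name, `missing`/`skipmissing` from Julia 1.x Base stand in for `null`/`Nulls.skip`, and the method is added to `Base.mapreduce` rather than to the internal `Base.mapreducedim!` named above. Whether `sum(s; dims=1)` would forward to this method depends on the Julia version, so the example calls `mapreduce` directly.

```julia
# Hypothetical wrapper marking "skip nulls when reducing over dimensions".
struct SkipNulls{T<:AbstractArray}
    data::T
end

# One generic method covers every reduction expressible as (f, op):
# each slice along `dims` is reduced after dropping its missing entries.
function Base.mapreduce(f, op, s::SkipNulls; dims::Integer)
    mapslices(v -> mapreduce(f, op, skipmissing(v)), s.data; dims=dims)
end

A = [1 missing 3;
     4 5 missing]
s = SkipNulls(A)

col_sums = mapreduce(identity, +, s; dims=1)     # sum over rows, per column
row_max  = mapreduce(identity, max, s; dims=2)   # maximum per row
```

The point of the design is that the skipping logic lives in a single method selected by dispatch on the wrapper type, instead of in a keyword argument that each reduction function has to implement.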
This issue is related to the even more complex question of how to handle functions which require pairwise or rowwise deletion in the presence of nulls, like cov, cor and distances. Cf. this julia-stats thread.