Replacement for reducedim(X, dim, skipnull=true) #43

Open
nalimilan opened this issue Oct 8, 2017 · 2 comments

Comments

@nalimilan
Member

We currently provide Nulls.skip to allow ignoring nulls when computing reductions over iterators/vectors. But this doesn't work if you want to compute a reduction over a dimension of a matrix or higher-dimensional array.

DataArrays does this using the skipna=true/skipnull=true argument to reduce. The limitation of that approach is that every reduction function needs to be reimplemented to support it.

In theory, one could achieve the same result using Nulls.replace, choosing the appropriate neutral value for a given reduction (e.g. 0 for sum, 1 for prod). This doesn't work currently because the returned generator is not an AbstractArray, but that could easily be fixed by using a custom type. However, it would not be ideal for this use case since 1) it's cumbersome and harder to discover, 2) it's not generic if you don't know beforehand which kind of reduction will be performed, and 3) some reduction operations do not have a neutral element that is easy to compute (e.g. mean).
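
To make the neutral-value idea concrete, here is a minimal sketch. It assumes Julia 1.0 or later, where `missing` plays the role of null, and it uses broadcast `coalesce` as a stand-in for Nulls.replace (an assumption for illustration, not the API discussed above). The last lines show why this breaks down for mean.

```julia
using Statistics  # standard library, provides `mean`

# Assumption: Julia >= 1.0, with `missing` standing in for null.
X = [1 missing 3;
     4 5       missing]

# Replace missing values with the neutral element of each reduction,
# then reduce over the first dimension (i.e. per column).
col_sums  = sum(coalesce.(X, 0), dims=1)   # 0 is neutral for +
col_prods = prod(coalesce.(X, 1), dims=1)  # 1 is neutral for *

# There is no such replacement value for `mean`: filling with 0 still
# counts the filled entries in the denominator.
naive_means = mean(coalesce.(X, 0), dims=1)
true_mean_col2 = mean(skipmissing(X[:, 2]))  # skips the missing entry
```

Here the second column contains only the value 5 once the missing entry is ignored, so its mean should be 5.0, yet the zero-filled version reports 2.5.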

I don't have a simple solution to propose. The only generic approach I can imagine is to have Nulls.skip return an object of a special type, and override Base.mapreducedim! for that type. The advantage compared with skipna/skipnull is that convenience functions like reduce, sum and mean do not have to be modified.
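
As a rough illustration of the "special type plus one generic method" idea, the sketch below wraps an array in a hypothetical `SkipDims` type and defines a single hypothetical helper, `reduce_skipping`, that works for any reduction function. It assumes Julia 1.0 or later with `missing` and `skipmissing`, and it deliberately does not override Base.mapreducedim!, so it only approximates the proposal; it goes slice by slice via `mapslices`, which is likely slower than a dedicated method.

```julia
using Statistics  # standard library, provides `mean`

# Hypothetical wrapper type; not part of Nulls.jl, Missings.jl, or Base.
struct SkipDims{A<:AbstractArray}
    parent::A
end

# Hypothetical helper: apply any reduction `f` to each slice along `dims`,
# skipping missing values within that slice. Only this one method is
# needed; `sum`, `mean`, etc. are passed in unchanged.
reduce_skipping(f, w::SkipDims; dims) =
    mapslices(v -> f(skipmissing(v)), w.parent; dims=dims)

X = [1 missing 3;
     4 5       missing]

w = SkipDims(X)
col_sums  = reduce_skipping(sum,  w; dims=1)  # one value per column
col_means = reduce_skipping(mean, w; dims=1)
row_sums  = reduce_skipping(sum,  w; dims=2)  # one value per row
```

Note that a slice consisting only of missing values would make reductions such as `sum` fail on an empty iterator, so a real implementation would need to decide what to return in that case.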

This issue is related to the even more complex question of how to handle functions which require pairwise or rowwise deletion in the presence of nulls, like cov, cor and distances. Cf. this julia-stats thread.

@nalimilan
Member Author

Missings.skip now returns a special iterator type, so we could implement a special mapreducedim! method for that type.

@nalimilan
Member Author

JuliaLang/julia#28027 implements this.
