Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addition of optional center in API of moment() #161

Merged
merged 11 commits into from
Mar 26, 2020

Conversation

jvdp1
Copy link
Member

@jvdp1 jvdp1 commented Mar 22, 2020

As discussed in #153
Related to #113

Changed the API of moment() from

result = moment(array, order [, mask])
result = moment(array, order, dim [, mask])

to

result = moment(array, order [, center [, mask]])
result = moment(array, order, dim [, center[, mask]])

The default returned value before changes was central moments. Now it is raw moments, and central moments if center is provided.

Note: if dim is provided, center must be an array. Otherwise, it must be a scalar.

Comments are welcome, @fiolj @ivan-pi @certik @milancurcic

commit 155ddc2
Author: Vandenplas, Jeremie <jeremie.vandenplas@gmail.com>
Date:   Sun Mar 22 19:12:07 2020 +0100

    rawmoment: update spec

commit a365ea0
Author: Vandenplas, Jeremie <jeremie.vandenplas@gmail.com>
Date:   Sun Mar 22 18:56:46 2020 +0100

    rawmoment: small change in spec

commit d955c0c
Author: Vandenplas, Jeremie <jeremie.vandenplas@gmail.com>
Date:   Sun Mar 22 18:03:49 2020 +0100

    rawmoment: correction

commit 0480bc1
Author: Vandenplas, Jeremie <jeremie.vandenplas@gmail.com>
Date:   Sun Mar 22 17:57:24 2020 +0100

    rawmoment: small correction

commit fcc6a43
Author: Vandenplas, Jeremie <jeremie.vandenplas@gmail.com>
Date:   Sun Mar 22 17:52:30 2020 +0100

    rawmoment: cleaning

commit c63ea07
Author: Vandenplas, Jeremie <jeremie.vandenplas@gmail.com>
Date:   Sun Mar 22 15:48:40 2020 +0100

    rawmoment: progress in test files

commit ef6deec
Author: Vandenplas, Jeremie <jeremie.vandenplas@gmail.com>
Date:   Sun Mar 22 15:42:46 2020 +0100

    rawmoment: progress

commit a31654c
Author: Vandenplas, Jeremie <jeremie.vandenplas@gmail.com>
Date:   Sun Mar 22 14:03:50 2020 +0100

    rawmoment: start tests raw moment

commit 78d7fb9
Author: Vandenplas, Jeremie <jeremie.vandenplas@gmail.com>
Date:   Sun Mar 22 13:37:02 2020 +0100

    rawmoment: addition of test

commit 0e93187
Author: Vandenplas, Jeremie <jeremie.vandenplas@gmail.com>
Date:   Sun Mar 22 11:55:31 2020 +0100

    rawmoment: start to add center in test
@ivan-pi
Copy link
Member

ivan-pi commented Mar 23, 2020

I appreciate that the current iteration also supports other centers. Is there any particular reason to prefer raw vs central moments as the default option? In Julia StatsBase.jl the default are central moments. The MATLAB moment is also central.

! raw is default (currently)
raw = moment(array, order)
central = moment(array, order,center=mean(array))

! vs

! if central were default
central = moment(array, order)
raw = moment(array, order,center=0.0)

I am not sure which is more convenient. If central was default, then, for the moment along a specific dimension of a multidimensional array it would be convenient to have a zeros function:

! raw is default (currently)
raw = moment(array, order,dim=2)
central = moment(array, order,dim=2,center=mean(array,dim=2))

! if central is default
central = moment(array, order,dim=2)
raw = moment(array, order,center=zeros(size(array,1),...)) !  center is a rank >= 1 array

For the moment of a multi-dimensional array along a specific dimension the currently proposed version seems easier to use, it is however opposite from MATLAB and Julia. With central as default on the other hand, inferring the correct size of the zeros array for raw moments is error-prone.

Thoughts?

@jvdp1
Copy link
Member Author

jvdp1 commented Mar 23, 2020

Thanks @ivan-pi for the comments. I though about this issue (default: raw or central moment) and wanted to have some discussion about that. I am glad you started it.

I choose 'raw moment' as default such that center can be optional. However, I agree that other languages (Julia, Matlab, but also Scipy, Octave,R) have 'central momen' as default. Julia does not provide raw moments explicitely. So it would make sense that stdlib provides a central moment as default.

Possible solution:

A way to implement easily raw moments would be to allow a scalar center when dim is provided, in addition to the current implementation (i.e., an array for center when dim is mentioned). (Or should we also allow array center when dim is not provided? With which rank?)
With such an implemenation, a zero() function is not needed.

So, center would be still optional. But the default returned value is 'central moment'.
If center is provided, then the return value is a moment about center. A raw moment could be computed by providing center = 0. This behaviour is similar to the one of Julia.

The API would remain the same. Only the default returned value would change.

Copy link
Member

@milancurcic milancurcic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I don't have a strong preference regarding the default value for center, so I'll leave it to you and other reviewers to decide.

@ivan-pi
Copy link
Member

ivan-pi commented Mar 25, 2020

For consistency with other programming languages used in scientific computing I would prefer to see central moments as default. That way it is easier for users to move between programming languages.

If I understand correctly, for the solution you are suggesting, the interface would be the following

! existing API
result = moment(array, order [, center [, mask]])  ! center is a scalar, default is `mean(array)`
result = moment(array, order, dim [, center[, mask]]) ! center is an array, default is `mean(array,dim)`

! possible solution
result = moment(array, order, dim, center [, mask]) 
    ! center is a scalar, raw moments with `center = 0`

Would the type of center match the input array? I think it makes sense to allow integer arrays to have a real center. For real arrays presumably, it will be necessary to specify the precision as center = 0.0_sp, otherwise the compiler will complain.

The remaining option

result = moment(array, order, center [, mask]) ! center is an array, dim is not provided

does not make sense to me in the context of statistical moments. It is in contradiction with the first version above, where the rank of the array and the location of values in the array carry no significance in the moment calculation. It would however make sense for things such as image moments, in which case the center would probably be a vector of size equal to the rank of the array. Currently, I don't think these different use cases are compatible.

@jvdp1
Copy link
Member Author

jvdp1 commented Mar 25, 2020

Following the discussion with @ivan-pi

  1. Current default result is now central moment (instead of raw moment)
  2. the optional center must be a scalar if dim is not provided, OR can be a scalar or an array if dim is provided.
  3. center must be of the same type as res (so for integer arrays, center must be of type real(dp)).

So, this can be done to compute 2-th raw moment:

res = moment( x, 2, center = 0.)
res = moment( x, 2, dim = 1, center = 0.)
res = moment( x, 2, dim = 1, center = zero)   !where zero is an array of the same size as res

Tests have been implemented for both options.

@jvdp1 jvdp1 requested a review from milancurcic March 25, 2020 12:36
Copy link
Member

@certik certik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically this looks good. API wise I don't have strong feelings either way. So if there are no objections, we can merge it.

@certik
Copy link
Member

certik commented Mar 25, 2020

I restarted the builds, they are failing. Probably due to the conflicts.

@certik
Copy link
Member

certik commented Mar 25, 2020

@jvdp1 I got an email where you were asking about some CI failures due to floating point exceptions, but I can't find the comment here. If there is still an issue, let me know, I can have a look at that.

@fiolj
Copy link
Contributor

fiolj commented Mar 25, 2020

Sorry for the delay. It looks very good now. I also would prefer center for compatibility with other languages/environment. Thanks for all the work!

@milancurcic
Copy link
Member

With #163 now in master, I think this needs to be rebased to latest master, and replace calls to assert with calls to check. I used sed for this, for example:

sed -i 's/assert/check/gi' test_moment.f90
sed -i 's/assert/check/gi' test_rawmoment.f90

@jvdp1
Copy link
Member Author

jvdp1 commented Mar 25, 2020

@jvdp1 I got an email where you were asking about some CI failures due to floating point exceptions, but I can't find the comment here. If there is still an issue, let me know, I can have a look at that.

@certik thank you for proposing help. I solved this issue and deleted the comment (sorry, I should have update it now).

I will rebase the branche. It should be fine then.

@jvdp1
Copy link
Member Author

jvdp1 commented Mar 25, 2020

@certik The tests for the manual Makefiles failed, probably because they take too much time. Removing all the flags for the debug options leads to pass the tests here.

What could we do for this issue?

@fiolj
Copy link
Contributor

fiolj commented Mar 26, 2020

@certik The tests for the manual Makefiles failed, probably because they take too much time. Removing all the flags for the debug options leads to pass the tests here.

What could we do for this issue?

I think you are right, the problem is the same that has been insinuating for a time now. Compilation times are getting too long due to rank-aware routines. It also happens with cmake system but my guess is that with Makefiles are slightly longer (at least for this case) and is crossing some time threshold in the CI system.
I've done a simple test of building times in my machine, obtaining:

real	4m15.740s
user	4m2.226s
sys	0m7.115s

and then, setting MAXRANK manually to 7:

#! Ranks to be generated when templates are created
#:set MAXRANK = 7
#:if not defined('MAXRANK')
  #:if VERSION90
    #:set MAXRANK = 7
  #:else
    #:set MAXRANK = 15
  #:endif
#:endif

I've got for a new complete build:

real	0m42.008s
user	0m38.078s
sys	0m3.121s

Note: I've had to manually disable (comment out) some lines of test_mean_f03.f90 for it to compile.

Probably we should discuss the compiling times in a new issue, because it will get worse quickly when we add new routines that are made to work with all different ranks.

@jvdp1
Copy link
Member Author

jvdp1 commented Mar 26, 2020

I think you are right, the problem is the same that has been insinuating for a time now. Compilation times are getting too long due to rank-aware routines. It also happens with cmake system but my guess is that with Makefiles are slightly longer (at least for this case) and is crossing some time threshold in the CI system.

Indeed, the problem also appeared with CMake, and it is why we limited the number of ranks to 4 for the CI.
We should probably do the same for the manual Makefiles. I will have a look at this option.

@jvdp1
Copy link
Member Author

jvdp1 commented Mar 26, 2020

Seeing the report for the CI, it seems that the flag FYPPFLAGS (to limit the number of ranks to 4 for the CI) is not applied when using the Makefile.manual. I have no idea how to check it.
@certik @milancurcic Any ideas?

EDIT: The PR #164 should fix this issue.

@milancurcic
Copy link
Member

I merged #164 into master and restarted the builds here. But I forgot that we need to rebase to master here. @jvdp1 Can you do that?

@jvdp1
Copy link
Member Author

jvdp1 commented Mar 26, 2020

I rebased this PR to master and all checks have passed,

@milancurcic
Copy link
Member

Thank you, great work! Merging into master.

@milancurcic milancurcic merged commit de38699 into fortran-lang:master Mar 26, 2020
@jvdp1 jvdp1 deleted the raw_moment_dev branch March 26, 2020 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants