ENH: Numerically stable Cython functions roll_cov and roll_corr
#8326
This is incomplete, and requires making changes to
Some examples and timings that I think will show the benefits of having three different Cython implementations:
It produces the right results in the general case, and exactly
And the
Uses a Welford-like method for general computations, and includes detection of exact zeros in the denominator and of exactly identical sequences.
307781f to 4c54410
Lastly, the failing tests have to do with what I think are wrong expectations:
You may want to add the identities that you have above as tests as well.
I think tests of those identities (and more) are already in
This would force relaxing some of those tests, though, because the build is going to fail...
Which tests will fail? Note that the consistency checks only apply when x and y have the same pattern of
It's a size 3 sliding window going over two consecutive NaNs, e.g. I could put in an
Ah. Another way to deal with it is to first deal with the value exiting the window (so in this case there would be no values in the window) before dealing with the value entering the window.
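To make the "exit first, then enter" ordering concrete, here is a minimal pure-Python sketch of a rolling covariance that processes the pair leaving the window before the pair entering it. It is not the Cython code from this PR: the function name `rolling_cov`, the `ddof` parameter, and the list-based inputs are assumptions made for illustration, and a pair is treated as missing if either value is NaN.

```python
import math


def rolling_cov(x, y, window, ddof=1):
    """Rolling covariance over paired lists, skipping pairs that contain NaN."""
    n = 0                        # number of valid pairs currently in the window
    mean_x = mean_y = 0.0        # running means of the valid pairs
    comoment = 0.0               # sum of (x - mean_x) * (y - mean_y)
    out = []
    for i in range(len(x)):
        # Step 1: the pair leaving the window (if any, and if it was valid).
        j = i - window
        if j >= 0 and not (math.isnan(x[j]) or math.isnan(y[j])):
            n -= 1
            if n == 0:
                # Window is empty: reset exactly instead of dividing by zero.
                mean_x = mean_y = comoment = 0.0
            else:
                dy_old = y[j] - mean_y          # deviation from the mean before removal
                mean_x -= (x[j] - mean_x) / n
                mean_y -= dy_old / n
                comoment -= (x[j] - mean_x) * dy_old
        # Step 2: the pair entering the window (if valid).
        if not (math.isnan(x[i]) or math.isnan(y[i])):
            n += 1
            dx = x[i] - mean_x                  # deviation from the mean before insertion
            mean_x += dx / n
            mean_y += (y[i] - mean_y) / n
            comoment += dx * (y[i] - mean_y)
        out.append(comoment / (n - ddof) if n > ddof else math.nan)
    return out
```

With a window of size 3 passing over two consecutive NaNs, the removal step can empty the window before anything is added, so the reset branch is what keeps the accumulators well defined in that case.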
It looks like
Putting aside for a moment the checks for repeated values, here are a couple of questions:
My thoughts:
Perhaps it's wishful thinking, but my inclination is to try to implement
Note: As I mentioned previously, I think the code would be simpler (though perhaps not faster) if it first handled the old value (if any) exiting the window and then handled the new value (if any) entering the window, rather than trying to handle all four combinations simultaneously.
I have a sentimental attachment to the math for the case where one observation comes in and another goes out: you don't find it in the books or on Wikipedia, so I had to work it out myself. But of course this is not about sparing my feelings; it does have its merits...
The basic code for removing an observation
To add an observation requires:
If you do both in a single step, it is something like
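The snippets from the original comment did not survive here. As a stand-in, the following is a minimal Python sketch of one way to do the combined update, derived from the definition of the co-moment C = Σ(x − mean_x)(y − mean_y). It is not the PR's Cython code; `swap_update` and `direct` are hypothetical names, and the window is assumed to hold a fixed number n ≥ 1 of valid pairs.

```python
def swap_update(n, mean_x, mean_y, comoment, x_old, y_old, x_new, y_new):
    """Replace (x_old, y_old) with (x_new, y_new) in a window of n pairs."""
    dx = x_new - x_old
    dy = y_new - y_old
    comoment += dx * (y_old - mean_y)   # uses mean_y from before the swap
    mean_x += dx / n
    mean_y += dy / n
    comoment += dy * (x_new - mean_x)   # uses mean_x from after the swap
    return mean_x, mean_y, comoment


def direct(xs, ys):
    """Two-pass reference computation, used only to check the update."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    comoment = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    return mean_x, mean_y, comoment


# Window holds (1, 3), (2, 1), (4, 5); swap out (1, 3) for (7, 2).
mx, my, c = direct([1.0, 2.0, 4.0], [3.0, 1.0, 5.0])
mx, my, c = swap_update(3, mx, my, c, 1.0, 3.0, 7.0, 2.0)
ref = direct([2.0, 4.0, 7.0], [1.0, 5.0, 2.0])
```

Because n does not change in this case, the reciprocal 1/n could be computed once outside the loop in a compiled implementation, whereas separate remove and add steps divide by two different counts.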
I think the key thing here is that, when doing both together, you only have to do one division, which is the expensive operation. I just put together a
So there seems to be a 50% penalty to pay for not doing the "remove first, then add" case in a single shot. The timings I posted before show a similar slowdown for doing
The "remove first, then add" strategy also seems to be a little less numerically stable, although the practical implications of that are negligible.
One last problem of implementing
I guess I really like the idea of
I am not directly opposed to it detecting exact zeros, but I think it is worth understanding why we want that behavior, if
Lastly, I am much more open to implementing a monstrous
Hmm. Interesting.
closing pls reopen if/when updated