Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap around below #51

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

oxinabox
Copy link

@oxinabox oxinabox commented May 23, 2018

I noticed that at

if 1 <= j <= npoints-1
grid[j] += (midpoints[k]-x)*ainc*weights[i]
grid[k] += (x-midpoints[j])*ainc*weights[i]
end
end

if a point should be allocated to the lower boundary it is just discarded.
Meaning after tabulate the bottom bin is always empty.

As I read the original Jones and Lotwick paper
https://www.jstor.org/stable/pdf/2347674.pdf
there solution in the equivalent case is to wrap-around.
Make sense as the FFT-convolution will wrap around anyway

Now in most usage, the lower (and upper) bins should be sufficiently far from the edge,
so as to avoid wrap-around effects from FFT-convolution.
(I am running it to it because I am explicitly wanting wrap around effects).

Possible rather than employing wrap-around in tabulate,
the alternative might be to just throw an error in an else statement added to that if
Saying "Lower boundary too tight, please expand lower boundary"
It seem bad to just silently throw away data

This only handled the uni-variate case, if I am right it can be extended to the bivariate case

@oxinabox
Copy link
Author

Bump.
This is passing on 0.6

@simonbyrne
Copy link
Member

I'm broadly sympathetic, but why just wrap for the one step below? Since we're using midpoints, shouldn't it be half a step below, half a step above? Or should we just wrap all?

@oxinabox
Copy link
Author

I think that actually equivalent to this it comes down to how where indexing in the first place and we have this empty bin at the bottom

@oxinabox
Copy link
Author

I think I need to draw some pictures
I know that my solution stops any data being thrown away
and it feels like it is equivalent to Jones and Lotwick's
but they index differently, so wrap above to below IIRC

@oxinabox
Copy link
Author

Ok, after playing with this, and master
I've concluded the wrap-around doesn't matter as much as things that are exactly equal to the lower boundary midpoint, actually end up in the bottom bin

For mps= 1:9

Master:

julia>  KernelDensity.tabulate([1.0], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

This PR

julia> KernelDensity.tabulate([1.0], mps).density |> print
[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

So it doesn't seem right that the lower boundary just gets stiffed like that.
Hitting the upper boundary gets you in the upper bin.
Hitting the lower boundary gets you drops.
It is basically saying this is exclusive below inclusive above.

You are correct that when it comes to actual wrap-around,
it only actually ends up wrapping round from below to above.

Full details

With this PR:

julia> mps = KernelDensity.kde_range((1,9), 9)
1.0:1.0:9.0

julia> KernelDensity.tabulate([1.0], mps).density |> print
[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia> KernelDensity.tabulate([1.5], mps).density |> print
[0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia> KernelDensity.tabulate([0.5], mps).density |> print
[0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]
julia> KernelDensity.tabulate([9.0], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]

julia> KernelDensity.tabulate([9.5], mps).density |> print
ERROR: BoundsError: attempt to access 9-element StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}} at index [10]

julia>  KernelDensity.tabulate([1.01], mps).density |> print
[0.99, 0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

Current master:

julia> mps = KernelDensity.kde_range((1,9), 9)
1.0:1.0:9.0

julia>  KernelDensity.tabulate([1.0], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia>  KernelDensity.tabulate([0.5], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia>  KernelDensity.tabulate([9.0], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
julia>  KernelDensity.tabulate([1.5], mps).density |> print
[0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia>  KernelDensity.tabulate([8.5], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5]
julia>  KernelDensity.tabulate([9.5], mps).density |> print
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
julia>  KernelDensity.tabulate([1.01], mps).density |> print
[0.99, 0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0

@simonbyrne
Copy link
Member

This could probably be handled better (maybe an optional periodic argument?), but in any case it is an improvement. Would you mind adding a test or two, and perhaps comment, and then we can merge

@oxinabox
Copy link
Author

Maybe rather than wrap around I could workout how to include the lower boundary and the an error on out of bounds

@matbesancon
Copy link
Member

@oxinabox you wanted to add some modifications, some tests? Maybe we can finish this one?

@oxinabox
Copy link
Author

My thesis is due in 1 week.
So no. Not right now.

@matbesancon
Copy link
Member

no problem, just wanted to see pending PRs, good luck :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants