
Improve floating point accuracy in percentile #4617

Merged — 7 commits merged into cupy:master on Feb 14, 2021

Conversation

@wphicks (Contributor) commented on Feb 3, 2021

Improve the floating point accuracy of the linear interpolation formula used in the percentile_weightnening kernel of cupy.percentile.
Resolves #4607
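For context, the accuracy problem comes from the naive interpolation form `a + (b - a) * t`, which can round to a value slightly outside `[a, b]` near `t == 1` and break monotonicity of the percentile outputs. A minimal pure-Python sketch of the two-sided interpolation approach (illustrative only — not cupy's actual kernel code) looks like this:

```python
def lerp(a, b, t):
    """Linear interpolation between a and b at fraction t in [0, 1].

    Interpolating from the nearer endpoint keeps the result exact at
    both t == 0 and t == 1, unlike the naive a + (b - a) * t, whose
    rounding error near t == 1 can make percentile outputs
    non-monotonic.
    """
    diff = b - a
    if t < 0.5:
        # Near a: accumulate the offset from a.
        return a + diff * t
    # Near b: accumulate the (small) offset back from b instead.
    return b - diff * (1.0 - t)
```

With this form, `lerp(a, b, 1.0)` is exactly `b`, whereas `a + (b - a) * 1.0` may differ from `b` in the last bit for some inputs.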

@wphicks wphicks marked this pull request as draft February 3, 2021 19:24
@wphicks (Contributor, Author) commented on Feb 3, 2021

The fix here is relatively straightforward and exactly mirrors the fix used in numpy. The associated unit test is a little more annoying. In numpy, Hypothesis was used to create a more general test of this problem, but since it looks like we aren't using Hypothesis in cupy, I used Hypothesis to find a "magic value" that would produce the error and then just hardcoded that value into the test that I've submitted here. Does that seem reasonable or do folks have some other idea on how this might be tested?

@wphicks wphicks marked this pull request as ready for review February 3, 2021 20:03
@wphicks wphicks changed the title [WIP] Improve floating point accuracy in percentile Improve floating point accuracy in percentile Feb 3, 2021
@leofang (Member) left a comment

> The fix here is relatively straightforward and exactly mirrors the fix used in numpy.

Any chance you could give a pointer to the Numpy counterpart for the record (either issue/PR or link to actual code is fine)?

> The associated unit test is a little more annoying. In numpy, Hypothesis was used to create a more general test of this problem, but since it looks like we aren't using Hypothesis in cupy, I used Hypothesis to find a "magic value" that would produce the error and then just hardcoded that value into the test that I've submitted here. Does that seem reasonable or do folks have some other idea on how this might be tested?

Well, even Hypothesis itself is fairly new to NumPy. It landed in NumPy only about a year ago (numpy/numpy#15189), so don't feel annoyed that it's not adopted here 😄

Could you add a few more magic values here, and also test against NumPy's outputs (as done in all other tests throughout the codebase)? You could return the output in the test function, decorated by @testing.numpy_cupy_allclose(). This way we get a stronger confidence in the results.
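The mechanics of that decorator pattern can be illustrated with a small, self-contained stand-in (the `backend_allclose` name and the simplifications here are hypothetical — cupy's real `@testing.numpy_cupy_allclose()` also handles arrays, dtypes, and host/device transfers):

```python
import math

def backend_allclose(backends, rtol=1e-7):
    """Hypothetical minimal stand-in for @testing.numpy_cupy_allclose():
    run the decorated test once per backend module and require the
    returned sequences of floats to agree within rtol."""
    def deco(test):
        def wrapper():
            results = [test(xp) for xp in backends]
            ref = results[0]
            for res in results[1:]:
                assert len(res) == len(ref)
                assert all(math.isclose(x, y, rel_tol=rtol)
                           for x, y in zip(ref, res))
            return ref
        return wrapper
    return deco

# Usage sketch: with cupy installed the backends would be numpy and
# cupy; using the math module twice here just shows the mechanics.
@backend_allclose([math, math])
def test_sqrt(xp):
    return [xp.sqrt(v) for v in (1.0, 2.0, 4.0)]
```

The key design point is that the test body is written once against the `xp` parameter, and the framework supplies each backend in turn and compares the returned outputs.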

cupy/_statistics/order.py (review comment, resolved)
@wphicks (Contributor, Author) commented on Feb 3, 2021

> Any chance you could give a pointer to the Numpy counterpart for the record (either issue/PR or link to actual code is fine)?

Oh yes! Sorry about that; I included it in the issue but forgot to repost it here. The numpy PR is numpy/numpy#16273, and it addressed numpy/numpy#14685.

> Well, even Hypothesis itself is fairly new to NumPy. It landed in NumPy only about a year ago (numpy/numpy#15189) so don't feel annoyed that it's not adopted here 😄

Haha! No worries; I was more annoyed with myself that I hadn't found a more elegant test without Hypothesis...

> Could you add a few more magic values here, and also test against NumPy's outputs (as done in all other tests throughout the codebase)? You could return the output in the test function, decorated by @testing.numpy_cupy_allclose(). This way we get a stronger confidence in the results.

Absolutely! Will update momentarily.

@wphicks (Contributor, Author) commented on Feb 3, 2021

All right, got the test updated and posted some benchmark data. Let me know if I can tweak anything further or if you'd like to see something more from that test.

@asi1024 (Member) commented on Feb 8, 2021

Jenkins, test this please.

@asi1024 asi1024 added this to the v9.0.0b3 milestone Feb 8, 2021
@chainer-ci (Member) commented: Jenkins CI test (for commit 10e03d7, target branch master) failed with status FAILURE.

@asi1024 (Member) commented on Feb 9, 2021

@wphicks Could you check the test failure?

@wphicks (Contributor, Author) commented on Feb 9, 2021

Ah, interesting. It looks like numpy is violating the percentile monotonicity test in CI. I couldn't reproduce this with numpy==1.20.0, but earlier versions (which don't include numpy/numpy#16273) unsurprisingly demonstrate this issue. I wasn't immediately able to determine which numpy version is used in CI to confirm that this is definitely the problem, but it seems likely. Does someone with better familiarity with the CI system know of a good way to check that version?

In terms of a fix, we can either add a requirement for numpy>=1.20.0 to the test section of setup.py, or skip the test when we detect an old version of numpy. I'm going to push a commit that changes the test requirement, but if that seems like too heavy-handed a change, we can go with the skip instead.

@asi1024 (Member) commented on Feb 9, 2021

I see! Could you use the decorator @testing.with_requires('numpy>=1.20')?

Commits:
- "This reverts commit 55b37e9 in favor of using the testing.with_requires decorator"
- "Skip test when numpy version is old enough to violate the percentile monotonicity constraint itself"
@wphicks (Contributor, Author) commented on Feb 9, 2021

> I see! Could you use decorator @testing.with_requires('numpy>=1.20')?

Done! Thank you. I reverted the change to setup.py and added the decorator.
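For readers unfamiliar with it, the behavior of a version-gating decorator like `testing.with_requires` can be sketched with a minimal stand-in built on `unittest.skipUnless` (the name `with_requires_sketch` and the simplified `(major, minor)` version parsing are assumptions here — the real cupy decorator accepts full requirement strings like `'numpy>=1.20'`):

```python
import unittest

def with_requires_sketch(pkg, minimum):
    """Hypothetical minimal equivalent of cupy's testing.with_requires:
    skip the decorated test unless `pkg` is importable and its
    (major, minor) version is at least `minimum`."""
    try:
        mod = __import__(pkg)
        version = getattr(mod, '__version__', '0.0')
        # Crude parse: take the first two numeric components.
        ok = tuple(int(p) for p in version.split('.')[:2]) >= minimum
    except (ImportError, ValueError):
        ok = False
    # skipUnless marks the test as skipped when the condition is false.
    return unittest.skipUnless(ok, f'{pkg}>={minimum} required')
```

A test decorated with `with_requires_sketch('numpy', (1, 20))` is then skipped on environments with an older numpy instead of failing, which is exactly the effect wanted for the monotonicity test here.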

@wphicks (Contributor, Author) commented on Feb 9, 2021

Looks like GitHub Actions is having some issues right now, which seems to be responsible for the latest failure.

@jakirkham (Member) commented on Feb 9, 2021

Probably from this incident (https://www.githubstatus.com/incidents/5qdkkyg958vy). Maybe try pushing an empty commit? They say it has since been resolved.

@leofang (Member) commented on Feb 9, 2021

Power-cycle (close and reopen) should restart it.

@wphicks wphicks closed this Feb 9, 2021
@wphicks wphicks reopened this Feb 9, 2021
@leofang (Member) commented on Feb 9, 2021

Jenkins, test this please

@chainer-ci (Member) commented: Jenkins CI test (for commit 949a25c, target branch master) succeeded!

@jakirkham (Member) commented:
This shows up in the Windows CI log. Does cuTENSOR get installed on Windows typically? If so, is this just an issue of a download timing out or something?

FAILED tests/cupyx_tests/tools_tests/test_install_library.py::TestInstallLibrary::test_install_cutensor

https://ci.preferred.jp/cupy.experimental.win.cuda100/68442/

@leofang (Member) commented on Feb 10, 2021

There are still tons of errors in the Windows CI, so hopefully we can address them all before v9 is out. For now, it's expected to see failures on Windows.

For this specific case, it's tracked in #4601. The issue is that cuTENSOR for Windows is packaged as an .exe, which cannot be extracted without installing third-party tools like 7zip, so this step cannot be automated and requires user action. On conda-forge we don't have this problem because 7zip was made a dependency on Windows.

@jakirkham Perhaps it's easier if you could request the cuTENSOR team to upload a different, machine-readable format? 🙂 At least cuDNN does not have this issue.

@leofang (Member) commented on Feb 10, 2021

(btw all tests in test_order.py passed on Windows.)

@jakirkham (Member) commented:
Ok let's move the Windows discussion to that issue then 🙂

@asi1024 (Member) commented on Feb 14, 2021

The CI failure is unrelated to this PR. LGTM! Thank you for the PR!

@asi1024 asi1024 merged commit 9fed304 into cupy:master Feb 14, 2021
Labels: cat:numpy-compat (Follow the NumPy/SciPy spec changes), prio:high
Projects: None yet
Development: Successfully merging this pull request may close these issues:
- Percentile output non-monotonic for monotonically increasing percentiles (#4607)

6 participants