Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bloom filter negative expectedFpp and premature saturation of filter #1656

Closed
gissuebot opened this issue Oct 31, 2014 · 3 comments
Closed
Assignees
Labels
Milestone

Comments

@gissuebot
Copy link

gissuebot commented Oct 31, 2014

Original issue created by spie...@addthis.com on 2014-02-05 at 04:07 PM


Hi folks,

We've observed that for very large bloom filters the expected false positive probability returns a negative value and the observed false positive probability exceeds the desired false positive probability during conditions when the number of performed insertions is relatively small as compared to the number of expected insertions.

I've been able to reproduce this behavior in the attached SSCCE. The bloom filter is constructed with an expected 150M insertions and a desired fpp of 0.0001. After 75M insertions the expected false positive probability is a negative value. The observed false positive probability is around 0.0002. Informally we have been able to reproduce the negative expected false positive probability with larger desired false positive probabilities by increasing the expected number of insertions.

@gissuebot
Copy link
Author

Original comment posted by cgdecker@google.com on 2014-02-05 at 05:30 PM


(No comment entered for this change.)


Labels: Package-Hash, Type-Defect
CC: andreou@google.com

@gissuebot
Copy link
Author

Original comment posted by Maaartinus on 2014-02-07 at 08:32 PM


This seems to be related to 1119 (if not the same problem). Quiet some things still use int which gladly overflows, e.g., BloomFilterStrategies.BitArray.bitSize().

@gissuebot
Copy link
Author

Original comment posted by kak@google.com on 2014-03-28 at 07:43 PM


The overflows should be fixed by 6de004a

The other issues (wrong FPP) is being addressed in Issue #1119.

Thanks for the report, and especially thanks for the GuavaBloomTest.java...it was helpful in confirming our fix.


Status: Fixed
Owner: kak@google.com
Labels: Milestone-Release17

@cgdecker cgdecker modified the milestone: 17.0 Nov 1, 2014
panacekcz added a commit to panacekcz/guava that referenced this issue Dec 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants