-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hashing integer ranges #12226
Comments
From the definition of the |
This was changed in #6084 as a consequence of the discussion in #5778 (comment). The trouble with having arrays and ranges equal is that they must hash equal, and computing the hash of a range like an array requires hashing every single element. If they compare unequal, then ranges can hash in O(1). |
Thanks. |
So I guess |
It is unfortunate. :-\ We use run-length encoding when hashing the elements so sparse matrices don't have the same issue. It might not be that much more work to check if the elements can be lifted to a StepRange, too, but it's totally non-composable and would take much more time to check for a lifted FloatRange or LinSpace. I'm not sure there's a winning choice here since equality and hashing are so fundamental and need to be fast, but many custom arrays don't store their elements directly and instead compute them on the fly. |
I wonder if we could do some kind of clever mathematical trick here, but nothing comes to mind. Maybe something along the lines of hashing the run-length encoding of the diff of a vector? That would be fast for integer ranges, but not necessarily for floating-point ranges since the intervals are often irregular. |
I don't think we need to bend over backwards to support float ranges. They're already approximations of a theoretical range, and checking float equality is fraught with trouble in any case. It'd be interesting to experiment with RLE of the diff + first element. That would be a net gain for custom types, too: linear and piece-wise linear arrays could add fast paths (in addition to constant and piece-wise constant). |
One challenge for the diff RLE thing is overflow. How do we make |
Please close if I'm missing something obvious (it is early) but I thought that this was supposed to work.
The text was updated successfully, but these errors were encountered: