Incorrect I32Key Index Ordering #489
Comments
Thanks for the issue. Instead of
The problem is that we're using the internal representation to define the order. The internal representation of negative numbers is two's complement, in which the most significant bit is the sign bit, so lexicographically, negative numbers come after positive numbers. We can fix this by switching to another representation. Need to think about it.
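To make the effect concrete, here is a minimal sketch in plain Rust (standard library only, not the library's actual key types). The names `naive_key` and `sort_by_naive_key` are hypothetical helpers introduced for illustration; they assume the key is simply the big-endian bytes of the `i32`, as described in this issue.

```rust
/// Hypothetical helper: encode an i32 as its raw big-endian bytes,
/// which is the encoding this issue describes.
fn naive_key(n: i32) -> [u8; 4] {
    n.to_be_bytes()
}

/// Hypothetical helper: sort values by comparing their encoded bytes
/// lexicographically, the way a byte-ordered store iterates its keys.
fn sort_by_naive_key(mut values: Vec<i32>) -> Vec<i32> {
    values.sort_by_key(|n| naive_key(*n));
    values
}
```

Because every negative number has its top bit set (for example, `-1` encodes as `FF FF FF FF`), sorting `[-2, 1, -1, 0, 2]` this way places the negative values after the positive ones.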
There is no way to easily scan besides byte order. I think a new byte representation for signed ints could be interesting. I believe if we toggle the first bit on big endian, this will actually provide the ordering we want.
Is that correct? If so, we can use IntKey and UintKey as two types with different serialization/parsing logic and everyone will be happy.
Almost. It's be[0] ^= 0x80; But yes, this is a good idea, and will in fact solve the ordering issue(!). When we deserialize, we can easily fix / revert this, and restore the original value. So, this will require a deserializing So, we either deprecate raw
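The sign-bit flip discussed above can be sketched as follows. This is an illustration in plain Rust under stated assumptions, not the library's implementation: `encode_i32` and `decode_i32` are hypothetical names, and the only thing taken from the thread is the `be[0] ^= 0x80` step applied to the big-endian bytes.

```rust
/// Hypothetical encoder: big-endian bytes with the sign bit flipped,
/// so that lexicographic byte order matches signed numeric order.
fn encode_i32(n: i32) -> [u8; 4] {
    let mut be = n.to_be_bytes();
    be[0] ^= 0x80; // flip the most significant (sign) bit
    be
}

/// Hypothetical decoder: flip the sign bit back and restore the original value.
fn decode_i32(mut be: [u8; 4]) -> i32 {
    be[0] ^= 0x80; // XOR with the same mask reverts the flip
    i32::from_be_bytes(be)
}
```

With this encoding, `i32::MIN` maps to `00 00 00 00`, `-1` to `7F FF FF FF`, `0` to `80 00 00 00`, and `i32::MAX` to `FF FF FF FF`, so byte order and numeric order agree. Since the stored bytes no longer equal the raw two's complement form, reading a value back has to go through the decoder.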
I am fine deprecating in 0.11 and removing in 0.12. But I would like that to be independent of this issue.
Hmmm... we do not promise what our encoding is. But we should provide some sanctioned function to get, e.g., the u32 from IntKey. No need to call it
Right.
When using an I32Key as the primary key for an index, the ordering is done incorrectly. Instead of sorting signed integers as you would expect them to be sorted, all negative numbers are considered to be "greater" than all positive numbers.
IntKey new implementation:
The new implementation converts the i32 number to big-endian bytes, which doesn't preserve the sorting order for negative numbers.
My workaround was to restrict my range query with a max of the most negative number:
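The original snippet is not preserved here. As a rough sketch of the idea only, the following uses a standard-library `BTreeMap` keyed by raw big-endian bytes as a stand-in for the byte-ordered store; `build_store` and `non_negative_values` are hypothetical helpers, and an exclusive upper bound is assumed.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the store: keys are the raw big-endian bytes of each i32.
fn build_store(values: &[i32]) -> BTreeMap<[u8; 4], String> {
    values
        .iter()
        .map(|n| (n.to_be_bytes(), n.to_string()))
        .collect()
}

/// Range query capped at the encoding of the most negative number.
/// i32::MIN encodes as 80 00 00 00, the smallest key with the sign bit set,
/// so an exclusive upper bound there keeps only the non-negative keys.
fn non_negative_values(store: &BTreeMap<[u8; 4], String>) -> Vec<i32> {
    let min = 0i32.to_be_bytes();
    let max = i32::MIN.to_be_bytes();
    store
        .range(min..max)
        .map(|(key, _)| i32::from_be_bytes(*key))
        .collect()
}
```

In raw byte order, every negative key sorts at or above the encoding of `i32::MIN`, which is why capping the range there drops all negative values.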
This fits my use case because I don't need the negative values when querying, but it is quite counterintuitive, and it took me a long time to figure out what was going on.