-
Notifications
You must be signed in to change notification settings - Fork 826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid allocation in BigDecimal
serializer with alternative way to write bytes.
#1016
Avoid allocation in BigDecimal
serializer with alternative way to write bytes.
#1016
Conversation
int p = position; | ||
position = p + count; | ||
for (int i = count - 1; i >= 0; i--) { | ||
buffer[p++] = (byte) (bytes >> (i << 3)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing could be one less shift per iteration (untested):
for (int i = (count - 1) << 3; i >= 0; i -= 8)
buffer[p++] = (byte)(bytes >> i);
Same for ByteBufferOutput.
Reading could match, like:
for (int i = (count - 1) << 3; i > 0; i -= 8)
bytes |= (buffer[p++] & 0xFF) << i;
However, for reading what you have may be better: simpler initializer and similar operations per iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to do that, but the read side got a bit more complex:
long bytes = buffer[p] >= 0 ? 0 : ~(-1L >>> ((8-count) << 3));
for (int i = (count - 1) << 3; i >= 0; i -= 8) {
bytes |= (buffer[p++] & (long) 0xFF) << i;
}
And also much slower (while the write side stayed the same when it comes to speed):
So in 813ae4b I did something different - I unrolled the loops manually replacing it with some switches and repeated code. This got the reading/writing a little bit faster:
@@ -352,6 +352,16 @@ public void readBytes (byte[] bytes, int offset, int count) throws KryoException | |||
} | |||
} | |||
|
|||
public long readBytesAsLong (int count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is better than my initial idea of readBytes2
, readBytes3
, etc. Nice!
I don't hate the name readBytesAsLong
, but what if it was readLong(int count)
? Should we have a readInt(int count)
for 2-3 bytes?
The first line of the method should be:
if (count < 0 || count > 8) throw new IllegalArgumentException("count must be >= 0 and <= 8: " + count);
Same with other methods taking a count
parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -276,6 +276,14 @@ public void writeBytes (byte[] bytes, int offset, int count) throws KryoExceptio | |||
} | |||
} | |||
|
|||
public void writeBytesFromLong (long bytes, int count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
writeLong(long bytes, int count)
? Should we also have writeInt(int bytes, int count)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have renamed to writeLong(long bytes, int count)
and added writeInt(int bytes, int count)
. I wonder if those methods should stay where they are in the Output
class, i.e. in the section marked as // byte:
, or should I move them to sections // long:
and // int:
correspondingly (and to the same in Input
class)?
I like the switches, kudos! Last one: can I bother you to check if read/writeLong is better with it's own implementation? Calling the int method twice is probably slower, since there are more method calls and multiple |
I've tried versions wher read/writeLong has its own implementation, but the results puzzle me. The 10- and 19-digits long value, which in serialized form uses 5 and 8 bytes correspondigly, got a bit better, but at the same time the 1-2 digits values, which uses one byte got worse, even though it shouldn't. So given that the results are inconclusive, and the code gets longer, maybe it's not worth it? |
Yes, seems I spoiled line endings. To avoid making mess in git history, I've created a fresh pull request with the same changes as the ones in this pull request, but with proper line endings - it's pull request #1018, so I'll close this one. |
Sorry, I got swamped for a while there. Interesting results for the int/long methods. Maybe the JVM is optimizing when the number of the method bytecodes is below some threshold and the 8 count switch exceeds that. I suppose it's not worth doing then. We can squash commits when a PR is merged, so it's fine to make lots of a commits in the PR rather than open separate issues to keep the PR history clean. Not a big deal either way, we'll carry on in #1018. |
This is a continuation of #1014 and alternative to #1015. Same as #1015 it avoids allocation, but additionally new methods were added to the Input/Output:
Output.writeBytesFromLong(long bytes, int count)
Input.readBytesAsLong(int count)
Those methods write/read specified number of bytes using a
long
value as storage for bytes instead ofbyte[]
array, to avoid allocation of the array on the heap. Additionally they properlyrequire()
necessary number of bytes in the buffer.Thanks to those changes, the
BigDecimalSerializer
is even faster (mostly) than the version from #1015:Of course it's thus also faster (completely) than the code in the master branch:
Any suggestions for the method names (and the code) are welcome, and if this change is too intrusive I'm also happy with only the #1015 being merged.