Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid allocation in BigDecimal serializer with alternative way to write bytes. #1016

Closed

Conversation

gdela
Copy link
Contributor

@gdela gdela commented Oct 14, 2023

This is a continuation of #1014 and alternative to #1015. Same as #1015 it avoids allocation, but additionally new methods were added to the Input/Output:

  • Output.writeBytesFromLong(long bytes, int count)
  • Input.readBytesAsLong(int count)
    Those methods write/read specified number of bytes using a long value as storage for bytes instead of byte[] array, to avoid allocation of the array on the heap. Additionally they properly require() necessary number of bytes in the buffer.

Thanks to those changes, the BigDecimalSerializer is even faster (mostly) than the version from #1015:
avoid-allocation-option-B-vs-A

Of course it's thus also faster (completely) than the code in the master branch:
avoid-allocation-option-B

Any suggestions for the method names (and the code) are welcome, and if this change is too intrusive I'm also happy with only the #1015 being merged.

@gdela gdela marked this pull request as ready for review October 14, 2023 17:45
int p = position;
position = p + count;
for (int i = count - 1; i >= 0; i--) {
buffer[p++] = (byte) (bytes >> (i << 3));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing could be one less shift per iteration (untested):

for (int i = (count - 1) << 3; i >= 0; i -= 8)
	buffer[p++] = (byte)(bytes >> i);

Same for ByteBufferOutput.

Reading could match, like:

for (int i = (count - 1) << 3; i > 0; i -= 8)
	bytes |= (buffer[p++] & 0xFF) << i;

However, for reading what you have may be better: simpler initializer and similar operations per iteration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to do that, but the read side got a bit more complex:

long bytes = buffer[p] >= 0 ? 0 : ~(-1L >>> ((8-count) << 3));
for (int i = (count - 1) << 3; i >= 0; i -= 8) {
	bytes |= (buffer[p++] & (long) 0xFF) << i;
}

And also much slower (while the write side stayed the same when it comes to speed):
new-loops

So in 813ae4b I did something different - I unrolled the loops manually replacing it with some switches and repeated code. This got the reading/writing a little bit faster:

unrolled-loops

@@ -352,6 +352,16 @@ public void readBytes (byte[] bytes, int offset, int count) throws KryoException
}
}

public long readBytesAsLong (int count) {
Copy link
Member

@NathanSweet NathanSweet Oct 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better than my initial idea of readBytes2, readBytes3, etc. Nice!

I don't hate the name readBytesAsLong, but what if it was readLong(int count)? Should we have a readInt(int count) for 2-3 bytes?

The first line of the method should be:

if (count < 0 || count > 8) throw new IllegalArgumentException("count must be >= 0 and <= 8: " + count);

Same with other methods taking a count parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 029190b, and the readInt(int count) in 9698899.

@@ -276,6 +276,14 @@ public void writeBytes (byte[] bytes, int offset, int count) throws KryoExceptio
}
}

public void writeBytesFromLong (long bytes, int count) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writeLong(long bytes, int count)? Should we also have writeInt(int bytes, int count)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have renamed to writeLong(long bytes, int count) and added writeInt(int bytes, int count). I wonder if those methods should stay where they are in the Output class, i.e. in the section marked as // byte:, or should I move them to sections // long: and // int: correspondingly (and to the same in Input class)?

@NathanSweet
Copy link
Member

DefaultSerializers.java is showing as the whole file modified?

I like the switches, kudos!

Last one: can I bother you to check if read/writeLong is better with it's own implementation? Calling the int method twice is probably slower, since there are more method calls and multiple require.

@gdela
Copy link
Contributor Author

gdela commented Oct 26, 2023

I've tried versions wher read/writeLong has its own implementation, but the results puzzle me. The 10- and 19-digits long value, which in serialized form uses 5 and 8 bytes correspondigly, got a bit better, but at the same time the 1-2 digits values, which uses one byte got worse, even though it shouldn't.

Version 1:
v1-read
v1-write
v1-results

Version 2:
v2-read
v2-write
v2-results

So given that the results are inconclusive, and the code gets longer, maybe it's not worth it?

@gdela
Copy link
Contributor Author

gdela commented Oct 26, 2023

DefaultSerializers.java is showing as the whole file modified?

Yes, seems I spoiled line endings. To avoid making mess in git history, I've created a fresh pull request with the same changes as the ones in this pull request, but with proper line endings - it's pull request #1018, so I'll close this one.

@NathanSweet
Copy link
Member

Sorry, I got swamped for a while there.

Interesting results for the int/long methods. Maybe the JVM is optimizing when the number of the method bytecodes is below some threshold and the 8 count switch exceeds that. I suppose it's not worth doing then.

We can squash commits when a PR is merged, so it's fine to make lots of a commits in the PR rather than open separate issues to keep the PR history clean. Not a big deal either way, we'll carry on in #1018.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants