The BK data path is very efficient when processing large entries, and in these cases it is generally able to saturate the disk and network I/O.
By contrast, when handling a large number of very small entries, the per-entry overhead introduces several inefficiencies that make the CPU the bottleneck.
There is some low-hanging fruit to tackle to improve performance:
Reduce contention in message passing
Reduce contention in the journal & force-write queues
Improve the OrderedExecutor performance
Reduce the number of buffers allocated per entry written/read
For each entry being written in a ledger we are using 4 ByteBuf instances:
The entry payload (this gets passed in to BK client)
The checksum
The serialized `AddRequest`
The 4-byte size header
These buffers are passed to Netty, which performs a gathering writev, so no copy is needed; it does, however, still have to process every one of the buffers.
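To make the writev mechanism concrete, here is a minimal sketch using plain `java.nio` rather than Netty (the class and method names are illustrative, not BK code). A `GatheringByteChannel` mirrors the writev(2) call Netty issues under the hood: the kernel consumes all buffers in one syscall with no user-space copy, but each buffer still has to be allocated, tracked, and released individually.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class GatheringWriteDemo {

    // Write header + payload as separate buffers in a single gathering write,
    // analogous to how Netty flushes a list of ByteBufs with writev.
    static long gatherWrite(Path file, ByteBuffer header, ByteBuffer payload)
            throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            return ch.write(new ByteBuffer[] { header, payload });
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("gather", ".bin");
        ByteBuffer header = ByteBuffer.allocate(4).putInt(5);
        header.flip();
        ByteBuffer payload = ByteBuffer.wrap("hello".getBytes());
        long written = gatherWrite(tmp, header, payload);
        System.out.println("bytes written: " + written); // 4 + 5 = 9
        Files.delete(tmp);
    }
}
```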
Allocating and managing all these buffers is expensive. There is overhead in:
Reference counting
The Recycler used to obtain ByteBuf instances and return them to the pool
The ByteBuf pool arena handling allocations/deallocations
Inter-thread synchronization: these buffers are typically allocated in one thread and deallocated in a different thread
To make matters worse, while the checksum is computed only once, the AddRequest is serialized each time we write it on a connection.
E.g., with write-quorum=3 we end up using (2 * 3) + 1 = 7 ByteBuf instances per entry.
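One way to avoid the repeated serialization is to serialize once and give each connection a cheap view over the same bytes. The sketch below uses `java.nio`'s `duplicate()` (which shares backing memory but has independent position/limit) in place of Netty's retained duplicates; the names and the idea of a cached "AddRequest" buffer here are illustrative assumptions, not the BK client API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CachedRequestDemo {

    // Hypothetical one-time serialization of an "AddRequest":
    // a 4-byte length prefix followed by the body.
    static ByteBuffer serializeOnce(String payload) {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + body.length);
        buf.putInt(body.length).put(body).flip();
        return buf.asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        ByteBuffer cached = serializeOnce("entry-0001");
        int writeQuorum = 3;
        for (int i = 0; i < writeQuorum; i++) {
            // Each connection gets its own position/limit over shared bytes,
            // instead of a fresh serialization plus a new buffer.
            ByteBuffer view = cached.duplicate();
            System.out.println("connection " + i + " sends "
                    + view.remaining() + " bytes");
        }
    }
}
```

The per-connection cost drops to one small view object; the serialized bytes themselves are produced exactly once.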
Finally, while for big entries it is very important to avoid copying the payload, for small entries the overhead of maintaining the ByteBufList is greater than the cost of simply copying the payload into a single buffer.
To address that, we should:
If the entry is big -> keep using ByteBufList, with one buffer for all the headers and a second buffer referencing the payload, with no copy.
If the entry is small -> allocate a single buffer containing all the headers and copy the payload into it.
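The two-case strategy above can be sketched as follows, again with `java.nio` buffers standing in for Netty's ByteBuf/ByteBufList. The threshold value is an assumption for illustration; the actual cutoff would need to be measured.

```java
import java.nio.ByteBuffer;
import java.util.List;

public class EntryFraming {

    static final int COPY_THRESHOLD = 4096; // hypothetical cutoff

    // Returns the buffers to hand to the transport for one entry.
    static List<ByteBuffer> frame(ByteBuffer headers, ByteBuffer payload) {
        if (payload.remaining() <= COPY_THRESHOLD) {
            // Small entry: one allocation, copy headers + payload into it.
            ByteBuffer single = ByteBuffer
                    .allocate(headers.remaining() + payload.remaining());
            single.put(headers.duplicate()).put(payload.duplicate()).flip();
            return List.of(single);
        }
        // Big entry: keep two buffers, referencing the payload with no copy.
        return List.of(headers, payload);
    }

    public static void main(String[] args) {
        ByteBuffer hdr = ByteBuffer.wrap(new byte[32]);
        System.out.println(frame(hdr, ByteBuffer.wrap(new byte[100])).size());  // 1
        System.out.println(frame(hdr, ByteBuffer.wrap(new byte[8192])).size()); // 2
    }
}
```

For small entries this trades one memcpy for the refcounting, recycling, and cross-thread release overhead of the extra buffers, which is the favorable trade the proposal describes.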