Performance considerations

Here are a few performance considerations to keep in mind when making Catalyst faster (or at least, not making it slower).

Allocations

Allocations are cheap, but they aren't free. Every allocated object is eventually collected by the GC at the end of its lifetime, so the less memory we allocate, the less pressure we put on the GC. There are several ways to reduce the amount of allocated memory. Those that are meaningful for Catalyst are described below.

Zero copy with Google Protocol Buffers

Catalyst uses Google Protocol Buffers as its protocol format. This means that complex binary values, such as a hash value or a Uint256, may be passed as a ByteString. Unfortunately for performance, the only way to obtain the byte[] wrapped by a ByteString is to call .ToByteArray(), which copies the whole payload just to produce the array. There is one accessor, though, that exposes the value without any allocation: .Span, which returns a ReadOnlySpan<byte>. It gives read-only access for free, with no copying involved. The cost shows up in method signatures: instead of accepting byte[], they need to accept ReadOnlySpan<byte>.

This approach of using the data in place rather than copying it is sometimes referred to as zero-copy.
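The difference between the two accessors can be sketched as follows. This is a minimal illustration, assuming the Google.Protobuf package is referenced; ZeroCopyExample and CountZeroBytes are hypothetical names made up for this sketch, not part of Catalyst.

```csharp
using System;
using Google.Protobuf;

public static class ZeroCopyExample
{
    // Hypothetical consumer: it accepts a span, so callers never have to copy.
    public static int CountZeroBytes(ReadOnlySpan<byte> data)
    {
        var count = 0;
        foreach (var b in data)
        {
            if (b == 0)
            {
                count++;
            }
        }

        return count;
    }

    // Copies the whole payload into a new byte[] before reading it.
    public static int CountWithCopy(ByteString value) => CountZeroBytes(value.ToByteArray());

    // Reads the payload in place through the span; no array is allocated.
    public static int CountWithoutCopy(ByteString value) => CountZeroBytes(value.Span);
}
```

Both methods return the same result. Only the first one allocates a temporary array that the GC has to collect later.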

This brings us to the next point: spanification.

Spanification

Modern dotnet APIs that care about performance no longer use byte[] or ArraySegment<byte>. Instead, they use Span<byte> for synchronous flows and Memory<byte> for asynchronous flows. Span<byte> can wrap stack-allocated memory or any other memory segment. Memory<byte> can be stored on the heap, so it survives across await points, and it works with different kinds of pooling: either the one built into dotnet, based on ArrayPool<T>.Shared, or others, such as the slab-based pool used in Kestrel.
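The two flows might look like the sketch below. It uses only types from the base class library; SpanificationExample and its method names are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;
using System.IO;
using System.Threading.Tasks;

public static class SpanificationExample
{
    // Synchronous flow: a Span<byte> over stack-allocated memory, with no heap allocation.
    public static ulong RoundTripBigEndian(ulong value)
    {
        Span<byte> buffer = stackalloc byte[sizeof(ulong)];
        BinaryPrimitives.WriteUInt64BigEndian(buffer, value);
        return BinaryPrimitives.ReadUInt64BigEndian(buffer);
    }

    // Asynchronous flow: a Memory<byte> over an array rented from the shared pool.
    public static async Task<int> ReadPooledAsync(Stream stream, int size)
    {
        byte[] rented = ArrayPool<byte>.Shared.Rent(size);
        try
        {
            // The pool may hand out a larger array, so slice it to the requested size.
            Memory<byte> memory = rented.AsMemory(0, size);
            return await stream.ReadAsync(memory);
        }
        finally
        {
            // Always hand the array back, otherwise the pool cannot reuse it.
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

A Span<byte> cannot be kept in a local across an await, which is why the asynchronous method works with Memory<byte> instead.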

Because Span<> and ReadOnlySpan<> are by-ref types (ref struct), methods that use them cannot be mocked easily with NSubstitute. NSubstitute is based on Castle.Core, which captures parameters as object[]. That requires boxing, and a ref struct cannot be boxed. This can be addressed by creating an abstract class per interface (one per test solution) that delegates the span-based method to a method that accepts byte[]. That class is then used for substitution:

public abstract class FakeKeySigner : IKeySigner
{
    // Span-based interface method. It cannot be substituted directly
    // because of the by-ref semantics of ReadOnlySpan<byte>.
    ISignature IKeySigner.Sign(ReadOnlySpan<byte> data, SigningContext signingContext)
    {
        // Just delegate to the byte[] method. ToArray() copies the data,
        // which is acceptable in test code.
        return Sign(data.ToArray(), signingContext);
    }

    // byte[]-based method that can be substituted.
    public abstract ISignature Sign(byte[] data, SigningContext signingContext);
}

Benchmarking

The best way to optimize is to profile first, come up with a scenario for the improvement, and then benchmark it with BenchmarkDotNet. There are a few sample benchmarks in Catalyst. Whenever you think you have found a potential gain, profile it, write a benchmark, and rerun the benchmark after the fix. Applying changes without measuring might have the opposite effect!
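A benchmark for a copy-versus-span change could look roughly like this. It is a minimal sketch, assuming the BenchmarkDotNet package is referenced; CopyVsSpanBenchmarks and its members are hypothetical and are not one of the existing Catalyst benchmarks.

```csharp
using System;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser] // report allocated bytes per operation next to the timings
public class CopyVsSpanBenchmarks
{
    private byte[] _payload = Array.Empty<byte>();

    // Each benchmark is run once per payload size.
    [Params(32, 1024)]
    public int Size { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        _payload = new byte[Size];
        new Random(42).NextBytes(_payload); // fixed seed keeps runs comparable
    }

    // Baseline: copies the payload into a new array before reading it.
    [Benchmark(Baseline = true)]
    public int SumWithCopy() => Sum(_payload.AsSpan().ToArray());

    // Candidate: reads the payload in place.
    [Benchmark]
    public int SumWithoutCopy() => Sum(_payload);

    private static int Sum(ReadOnlySpan<byte> data)
    {
        var sum = 0;
        foreach (var b in data)
        {
            sum += b;
        }

        return sum;
    }
}
```

The class can be run from a console project built in Release mode by calling BenchmarkRunner.Run<CopyVsSpanBenchmarks>() from BenchmarkDotNet.Running. Comparing the reports from before and after a change shows whether the change actually helped.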

Sources

Some good sources on writing well-performing code in dotnet:

Books:
1. Pro .NET Memory Management by Konrad Kokosa
2. Writing High-Performance .NET Code by Ben Watson
Blogs:
1. Adam Sitnik
