Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unsafe perf quirks from System.Text.Encodings.Web #109896

Merged
merged 8 commits into from
Nov 19, 2024

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Nov 17, 2024

Contributes to #94941 effort.

Should be ~zero performance difference as the JIT is smart enough to handle the safer equivalent, i.e.:

// New code:
static bool New(Span<byte> destination)
{
    return "&gt;"u8.TryCopyTo(destination);
}

// Old code:
static bool Old(Span<byte> destination)
{
    return TryWriteBytes(destination, (byte)'&', (byte)'g', (byte)'t', (byte)';');
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool TryWriteBytes(Span<byte> span, byte a, byte b, byte c, byte d)
{
    if (span.Length >= 4)
    {
        uint abcd32;
        if (BitConverter.IsLittleEndian)
        {
            abcd32 = ((uint)d << 24) | ((uint)c << 16) | ((uint)b << 8) | a;
        }
        else
        {
            abcd32 = ((uint)a << 24) | ((uint)b << 16) | ((uint)c << 8) | d;
        }
        Unsafe.WriteUnaligned<uint>(ref MemoryMarshal.GetReference(span), abcd32);
        return true;
    }
    else
    {
        return false;
    }
}

Codegen for New and Old:

; Method Program:New(System.Span`1[ubyte]):ubyte (FullOpts)
       mov      rax, bword ptr [rcx]
       mov      ecx, dword ptr [rcx+0x08]
       xor      edx, edx
       cmp      ecx, 4
       jl       SHORT G_M2274_IG04
       mov      dword ptr [rax], 0x3B746726
       mov      edx, 1
G_M2274_IG04:  ;; offset=0x0018
       mov      eax, edx
       ret      
; Total bytes of code: 27


; Method Program:Old(System.Span`1[ubyte]):ubyte (FullOpts)
       mov      rax, bword ptr [rcx]
       mov      ecx, dword ptr [rcx+0x08]
       cmp      ecx, 4
       jge      SHORT G_M16025_IG04
       xor      eax, eax
       jmp      SHORT G_M16025_IG05
G_M16025_IG04:
       mov      dword ptr [rax], 0x3B746726
       mov      eax, 1
G_M16025_IG05:
       ret      
; Total bytes of code: 27

Same for writing chars vs "str".TryCopyTo.

PS: SpanUtility.IsValidIndex can also be just removed (propagated at all uses) so we can remove the whole file, but it's out of scope for this PR since IsValidIndex is perfectly safe.

Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-text-encodings-web
See info in area-owners.md if you want to be subscribed.

@EgorBo EgorBo force-pushed the remove-unsafe-text-encodings-web branch from c4217fe to 452857b Compare November 17, 2024 11:31
@EgorBo
Copy link
Member Author

EgorBo commented Nov 17, 2024

ugh, .NET Framework 😐

@MichalPetryka
Copy link
Contributor

How does the performance compare with Mono?

@EgorBo
Copy link
Member Author

EgorBo commented Nov 17, 2024

How does the performance compare with Mono?

This is a part of the global security initiative to improve code safety. We assume that extra nanoseconds in these APIs for mono workloads won't be a big deal. Not sure performance matters for quite old System.Text.Encodings.Web at all, perhaps, only for HtmlEncoder.

@EgorBo EgorBo force-pushed the remove-unsafe-text-encodings-web branch from 97bb412 to a58a31f Compare November 18, 2024 03:34
@EgorBo EgorBo merged commit 20f5c7e into dotnet:main Nov 19, 2024
83 checks passed
@EgorBo EgorBo deleted the remove-unsafe-text-encodings-web branch November 19, 2024 12:20
@@ -68,22 +68,22 @@ internal override int EncodeUtf8(Rune value, Span<byte> destination)
{
if (value.Value == '<')
{
if (!SpanUtility.TryWriteBytes(destination, (byte)'&', (byte)'l', (byte)'t', (byte)';')) { goto OutOfSpace; }
if (!"&lt;"u8.TryCopyTo(destination)) { goto OutOfSpace; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This library builds for .NET Framework as well. Did these changes regress performance there? Do we care?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's impossible to avoid regressions on Mono and .NET Framework during "de-unsafization", the only part where it won't happen is the tiny coreclr-specific part of the corelib 😞

""&lt;"u8.TryCopyTo" seems to be 5ns slower than the old hack on .NET Framework for me locally. But all the public APIs System.Text.Encodings.Web contains aren't fast anyway (e.g. UrlEncoder.Default.Encode takes ~ms) so hopefully it won't be noticed? I've got an impression that this package is not often used on hot path, that's why I started from it.

cc @jkotas (e.g. we recently introduced a similar regression for Mono in #108572 (comment))

@github-actions github-actions bot locked and limited conversation to collaborators Dec 22, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants