Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Fix too wide loads on arm64 for small structs #76341

Merged
merged 18 commits into from
Nov 17, 2022
Merged
33 changes: 31 additions & 2 deletions src/coreclr/jit/lower.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3777,12 +3777,41 @@ void Lowering::LowerRetStruct(GenTreeUnOp* ret)

case GT_BLK:
case GT_OBJ:
retVal->ChangeOper(GT_IND);
FALLTHROUGH;
case GT_IND:
{
// Spill to a local if sizes don't match so we can avoid the "load more than requested"
// problem, e.g. struct size is 5 and we emit "ldr x0, [x1]"
unsigned realSize = retVal->AsIndir()->Size();
CORINFO_CLASS_HANDLE structCls = comp->info.compMethodInfo->args.retTypeClass;
if (realSize == 0)
{
// We have an IND<struct> node, assuming it's used in a return statement, take target size
// from the current method's signature (it's only used to decide whether we need to spill it to
// a local or not, so the actual size won't be used).
EgorBo marked this conversation as resolved.
Show resolved Hide resolved
// TODO-ADDR: delete once "IND<struct>" nodes are no more
realSize = comp->info.compCompHnd->getClassSize(structCls);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume it is guaranteed that structCls != NO_CLASS_HANDLE here or we would have hit some BADCODE previously? (e.g. returning struct in method declared to return int)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that the current behavior tries to be careful in such cases, I've just tried and got an AccessViolationException in Main for it when I changed return type to int while was trying to return a small struct.

}

if (genTypeSize(nativeReturnType) > realSize)
{
LIR::Use retValUse(BlockRange(), &ret->gtOp1, ret);
unsigned tmpNum = BAD_VAR_NUM;
if (structCls != NO_CLASS_HANDLE)
{
tmpNum = comp->lvaGrabTemp(true DEBUGARG("mis-sized struct return"));
comp->genReturnLocal = tmpNum;
comp->lvaSetStruct(tmpNum, structCls, true);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LIR::Use retValUse(BlockRange(), &ret->gtOp1, ret);
unsigned tmpNum = BAD_VAR_NUM;
if (structCls != NO_CLASS_HANDLE)
{
tmpNum = comp->lvaGrabTemp(true DEBUGARG("mis-sized struct return"));
comp->genReturnLocal = tmpNum;
comp->lvaSetStruct(tmpNum, structCls, true);
}
LIR::Use retValUse(BlockRange(), &ret->gtOp1, ret);
unsigned tmpNum = comp->lvaGrabTemp(true DEBUGARG("mis-sized struct return"));
comp->lvaSetStruct(tmpNum, structCls, false);
  1. genReturnLocal has a very specific purpose, no reason to set it here.
  2. No unsafe value class checks needed (in principle, in practice it doesn't matter) since the address of this local won't escape.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

genReturnLocal has a very specific purpose, no reason to set it here.

Tests assert after this patch: Assertion failed 'tmp == genReturnLocal'

Copy link
Member Author

@EgorBo EgorBo Nov 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted some suggestions if you don't mind because we hit the path for non-structs as well (e.g. small type primitives) so now it's cleaner that we have a special case for that rare case with IND<struct>.

Copy link
Contributor

@SingleAccretion SingleAccretion Nov 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to handle the case with handle-less BLK nodes though.

Reproduction:

struct Sx3
{
    public byte A, B, C;
}

[MethodImpl(MethodImplOptions.NoInlining)]
private static Sx3 Problem(void* s)
{
    Sx3 a;
    Unsafe.CopyBlock(&a, s, 3);
    return a;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, reverted the revert. @SingleAccretion does it look good to you? I assume it's just your change written with my hands 😄

ReplaceWithLclVar(retValUse, tmpNum);
LowerRetSingleRegStructLclVar(ret);
break;
}

retVal->ChangeOper(GT_IND);
retVal->ChangeType(nativeReturnType);
LowerIndir(retVal->AsIndir());
break;
}

case GT_LCL_VAR:
LowerRetSingleRegStructLclVar(ret);
Expand Down
166 changes: 166 additions & 0 deletions src/tests/JIT/Regression/JitBlue/Runtime_76194/Runtime_76194.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,166 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public unsafe class Runtime_76194
{
[StructLayout(LayoutKind.Explicit, Size = 5)]
public struct Data1
{
[FieldOffset(0)] public byte Byte;
[FieldOffset(1)] public int Int;
}

[StructLayout(LayoutKind.Explicit, Size = 7)]
public struct Data2
{
[FieldOffset(0)] public short A1;
[FieldOffset(2)] public short A2;
}

[StructLayout(LayoutKind.Explicit, Size = 7)]
public struct Data3
{
[FieldOffset(0)] public int Int1;
[FieldOffset(3)] public int Int2;
}

public struct Data4
{
public short A1;
public short A2;
}

public struct Data5
{
public byte Byte;
public int Int;
}

[MethodImpl(MethodImplOptions.NoInlining)] static Data1 Read1(byte* location) => Unsafe.ReadUnaligned<Data1>(location);
[MethodImpl(MethodImplOptions.NoInlining)] static Data2 Read2(byte* location) => Unsafe.ReadUnaligned<Data2>(location);
[MethodImpl(MethodImplOptions.NoInlining)] static Data3 Read3(byte* location) => Unsafe.ReadUnaligned<Data3>(location);
[MethodImpl(MethodImplOptions.NoInlining)] static Data4 Read4(byte* location) => Unsafe.ReadUnaligned<Data4>(location);
[MethodImpl(MethodImplOptions.NoInlining)] static Data5 Read5(byte* location) => Unsafe.ReadUnaligned<Data5>(location);

[MethodImpl(MethodImplOptions.NoInlining)] static void Write1(byte* location, Data1 d) => Unsafe.WriteUnaligned(location, d);
[MethodImpl(MethodImplOptions.NoInlining)] static void Write2(byte* location, Data2 d) => Unsafe.WriteUnaligned(location, d);
[MethodImpl(MethodImplOptions.NoInlining)] static void Write3(byte* location, Data3 d) => Unsafe.WriteUnaligned(location, d);
[MethodImpl(MethodImplOptions.NoInlining)] static void Write4(byte* location, Data4 d) => Unsafe.WriteUnaligned(location, d);
[MethodImpl(MethodImplOptions.NoInlining)] static void Write5(byte* location, Data5 d) => Unsafe.WriteUnaligned(location, d);


[MethodImpl(MethodImplOptions.NoInlining)]
public static int Main()
{
if (!OperatingSystem.IsWindows() && !OperatingSystem.IsLinux())
{
return 100;
}

for (int i = 0; i < 100; i++)
{
using CrossVirtualAlloc alloc = new();
if (alloc.IsFailedToCommit)
{
Console.WriteLine("VirtualAlloc/mmap/mprotect failed, giving up on test.");
break;
}
Read1(alloc.GetPointerNearPageEndFor<Data1>());
Write1(alloc.GetPointerNearPageEndFor<Data1>(), default);
Read2(alloc.GetPointerNearPageEndFor<Data2>());
Write2(alloc.GetPointerNearPageEndFor<Data2>(), default);
Read3(alloc.GetPointerNearPageEndFor<Data3>());
Write3(alloc.GetPointerNearPageEndFor<Data3>(), default);
Read4(alloc.GetPointerNearPageEndFor<Data4>());
Write4(alloc.GetPointerNearPageEndFor<Data4>(), default);
Read5(alloc.GetPointerNearPageEndFor<Data5>());
Write5(alloc.GetPointerNearPageEndFor<Data5>(), default);
}
return 100;
}

[MethodImpl(MethodImplOptions.NoInlining)]
static T Read<T>(byte* location) => Unsafe.ReadUnaligned<T>(location);

[MethodImpl(MethodImplOptions.NoInlining)]
static void Write<T>(byte* location, T t) => Unsafe.WriteUnaligned(location, t);
}

// Cross-platform implementation of VirtualAlloc that is focused on reproducing problems
// where JIT emits code that reads/writes more than requested
public unsafe class CrossVirtualAlloc : IDisposable
{
private readonly byte* _ptr;

public CrossVirtualAlloc()
{
if (OperatingSystem.IsWindows())
{
const int MEM_COMMIT = 0x1000;
const int MEM_RESERVE = 0x2000;
const int PAGE_READWRITE = 0x04;

_ptr = VirtualAlloc(null, PageSize * 2, MEM_RESERVE, PAGE_READWRITE);
if (_ptr != null)
{
_ptr = VirtualAlloc(_ptr, PageSize, MEM_COMMIT, PAGE_READWRITE);
if (_ptr == null)
{
VirtualFree(_ptr, 0, 0);
}
}
}
else
{
const int PROT_NONE = 0x0;
const int PROT_READ = 0x1;
const int PROT_WRITE = 0x2;
const int MAP_PRIVATE = 0x02;
const int MAP_ANONYMOUS = 0x20;

_ptr = mmap(null, 2 * PageSize, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (_ptr != null && mprotect(_ptr + PageSize, PageSize, PROT_NONE) != 0)
{
munmap(_ptr, 2 * PageSize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it will result in a double free

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(But I doubt that's the cause of #78758)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see the reason now:

On success, mmap() returns a pointer to the mapped area.  On
       error, the value MAP_FAILED (that is, (void *) -1) is returned,
       and [errno](https://man7.org/linux/man-pages/man3/errno.3.html) is set to indicate the error.

So this does not correctly check that we failed to allocate the memory.

}
}
}

public bool IsFailedToCommit => _ptr == null;

public nuint PageSize => (nuint)Environment.SystemPageSize;

[MethodImpl(MethodImplOptions.NoInlining)]
public byte* GetPointerNearPageEndFor<T>() => _ptr + PageSize - Unsafe.SizeOf<T>();

public void Dispose()
{
if (OperatingSystem.IsWindows())
{
VirtualFree(_ptr, 0, 0);
}
else
{
munmap(_ptr, (nuint)Environment.SystemPageSize * 2);
}
}

[DllImport("kernel32")]
static extern byte* VirtualAlloc(byte* lpAddress, nuint dwSize, uint flAllocationType, uint flProtect);

[DllImport("kernel32")]
static extern int VirtualFree(byte* lpAddress, nuint dwSize, uint dwFreeType);

[DllImport("libc")]
static extern byte* mmap(byte* addr, nuint length, int prot, int flags, int fd, nuint offset);

[DllImport("libc")]
static extern int mprotect(byte* addr, nuint len, int prot);

[DllImport("libc")]
static extern int munmap(byte* addr, nuint length);
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<Optimize>True</Optimize>
<AllowUnsafeBlocks>True</AllowUnsafeBlocks>
</PropertyGroup>
<ItemGroup>
<Compile Include="$(MSBuildProjectName).cs" />
</ItemGroup>
</Project>