JIT: Added SVE GetFfr, SetFfr, LoadVectorFirstFaulting, GatherVectorFirstFaulting #104502

Merged
merged 47 commits into from
Jul 27, 2024
Changes from 34 commits
Commits
d0efc9e
Initial work
TIHan Jul 2, 2024
42148fd
Merge remote-tracking branch 'upstream/main' into sve-ffr-part1
TIHan Jul 2, 2024
a7773ac
FirstFaulting partially works
TIHan Jul 2, 2024
76b42bd
Added template
TIHan Jul 2, 2024
bb01e37
Trying to test first-faulting behavior
TIHan Jul 4, 2024
a602b24
Using BoundedMemory to test FirstFaulting behavior for LoadVector.
TIHan Jul 6, 2024
60d410a
Fix size in validation
TIHan Jul 6, 2024
aee87d7
Added more helper functions. Added conditional select tests for LoadV…
TIHan Jul 8, 2024
7f3bb3c
Added first-faulting behavior tests for GatherVectorFirstFaulting
TIHan Jul 8, 2024
d952ff1
Merging with main
TIHan Jul 8, 2024
3923946
Added GetFfr suffix-style APIs
TIHan Jul 8, 2024
461b6a3
Fixing GatherVector tests
TIHan Jul 8, 2024
d5b8675
Formatting
TIHan Jul 8, 2024
07833e3
Feedback
TIHan Jul 9, 2024
ce5a9bd
Merge remote-tracking branch 'upstream/main' into sve-ffr-part1
TIHan Jul 9, 2024
05fb46d
Feedback
TIHan Jul 9, 2024
c63f878
Ensure the P/Invokes are blittable
tannergooding Jul 10, 2024
a4533fe
Merging
TIHan Jul 11, 2024
72d1dea
Merge remote-tracking branch 'upstream/main' into sve-ffr-part1
TIHan Jul 12, 2024
6c28927
Fix build
TIHan Jul 13, 2024
fb2012e
Remove checking for zeroes after the fault
TIHan Jul 13, 2024
aca6759
Added GatherVectorFirstFaultingVectorBases test template, but current…
TIHan Jul 16, 2024
d781fdc
Mark GetFfr methods as side-effectful
TIHan Jul 16, 2024
a73fe35
Verifying expected fault result. Test weaks.
TIHan Jul 19, 2024
81882a4
Merging with main
TIHan Jul 19, 2024
ad5ec2e
Fix build
TIHan Jul 20, 2024
0f88d8e
Add tracking of FFR register
kunalspathak Jul 19, 2024
10cf342
Change condition for PhysReg
kunalspathak Jul 23, 2024
e7507bb
jit format
kunalspathak Jul 23, 2024
aef79cd
Fix PoisonPage configuration while creating BoundedMemory
SwapnilGaikwad Jul 23, 2024
690e7ad
Use mmap() instead of memalign() for memory allocation
SwapnilGaikwad Jul 23, 2024
b23fac7
review feedback
kunalspathak Jul 23, 2024
0c8b688
unspill for LoadVectorFirstFaulting as well
kunalspathak Jul 23, 2024
3184b77
Merging with Kunal's FFR changes
TIHan Jul 24, 2024
5bb0b3d
Show error codes on failing failure
SwapnilGaikwad Jul 24, 2024
823e847
Merging with main
TIHan Jul 26, 2024
86715e5
Feedback
TIHan Jul 26, 2024
8b0f000
Feedback
TIHan Jul 26, 2024
044dbda
Feedback
TIHan Jul 26, 2024
0655d4b
Feedback
TIHan Jul 26, 2024
9d7f22f
Handle FFR correctly
kunalspathak Jul 26, 2024
18f8f52
reuse some of the code
kunalspathak Jul 26, 2024
0755372
Handle the special effect for SetFfr
kunalspathak Jul 26, 2024
567a442
some fixes + test coverage
kunalspathak Jul 26, 2024
3ac987d
do not zero init lvaFfrRegister
kunalspathak Jul 26, 2024
e8f7fcd
reverted local change
kunalspathak Jul 26, 2024
77ec96c
fix build break
kunalspathak Jul 26, 2024
14 changes: 12 additions & 2 deletions src/coreclr/jit/codegenarmarch.cpp
@@ -1508,8 +1508,18 @@ void CodeGen::genCodeForPhysReg(GenTreePhysReg* tree)
var_types targetType = tree->TypeGet();
regNumber targetReg = tree->GetRegNum();

inst_Mov(targetType, targetReg, tree->gtSrcReg, /* canSkip */ true);
genTransferRegGCState(targetReg, tree->gtSrcReg);
#ifdef TARGET_ARM64
if (varTypeIsMask(targetType))
{
assert(tree->gtSrcReg == REG_FFR);
GetEmitter()->emitIns_R(INS_sve_rdffr, EA_SCALABLE, REG_FFR);
}
else
#endif
{
inst_Mov(targetType, targetReg, tree->gtSrcReg, /* canSkip */ true);
genTransferRegGCState(targetReg, tree->gtSrcReg);
}

genProduceReg(tree);
}
4 changes: 4 additions & 0 deletions src/coreclr/jit/compiler.h
@@ -4344,6 +4344,10 @@ class Compiler
#endif // defined(FEATURE_SIMD)

unsigned lvaGSSecurityCookie; // LclVar number
#ifdef TARGET_ARM64
unsigned lvaFfrRegister; // LclVar number
unsigned getFFRegisterVarNum();
#endif
bool lvaTempsHaveLargerOffsetThanVars();

// Returns "true" iff local variable "lclNum" is in SSA form.
9 changes: 9 additions & 0 deletions src/coreclr/jit/fgdiagnostic.cpp
@@ -3436,6 +3436,15 @@ void Compiler::fgDebugCheckFlags(GenTree* tree, BasicBlock* block)
case NI_Sve_GatherPrefetch32Bit:
case NI_Sve_GatherPrefetch64Bit:
case NI_Sve_GatherPrefetch8Bit:
case NI_Sve_SetFfr:
case NI_Sve_GetFfrByte:
case NI_Sve_GetFfrInt16:
case NI_Sve_GetFfrInt32:
case NI_Sve_GetFfrInt64:
case NI_Sve_GetFfrSByte:
case NI_Sve_GetFfrUInt16:
case NI_Sve_GetFfrUInt32:
case NI_Sve_GetFfrUInt64:
{
assert(tree->OperRequiresCallFlag(this));
expectedFlags |= GTF_GLOB_REF;
41 changes: 36 additions & 5 deletions src/coreclr/jit/gentree.cpp
@@ -7451,7 +7451,11 @@ GenTreeIntCon* Compiler::gtNewFalse()
// return a new node representing the value in a physical register
GenTree* Compiler::gtNewPhysRegNode(regNumber reg, var_types type)
{
#ifdef TARGET_ARM64
assert(genIsValidIntReg(reg) || (reg == REG_SPBASE) || (reg == REG_FFR));
#else
assert(genIsValidIntReg(reg) || (reg == REG_SPBASE));
#endif
GenTree* result = new (this, GT_PHYSREG) GenTreePhysReg(reg, type);
return result;
}
@@ -11557,6 +11561,12 @@ void Compiler::gtGetLclVarNameInfo(unsigned lclNum, const char** ilKindOut, cons
{
ilName = "GsCookie";
}
#ifdef TARGET_ARM64
else if (lclNum == lvaFfrRegister)
{
ilName = "FFReg";
}
#endif
else if (lclNum == lvaRetAddrVar)
{
ilName = "ReturnAddress";
@@ -26557,6 +26567,7 @@ bool GenTreeHWIntrinsic::OperIsMemoryLoad(GenTree** pAddr) const
case NI_Sve_LoadVectorByteZeroExtendToUInt16:
case NI_Sve_LoadVectorByteZeroExtendToUInt32:
case NI_Sve_LoadVectorByteZeroExtendToUInt64:
case NI_Sve_LoadVectorFirstFaulting:
case NI_Sve_LoadVectorInt16SignExtendToInt32:
case NI_Sve_LoadVectorInt16SignExtendToInt64:
case NI_Sve_LoadVectorInt16SignExtendToUInt32:
@@ -26583,6 +26594,7 @@ bool GenTreeHWIntrinsic::OperIsMemoryLoad(GenTree** pAddr) const

case NI_Sve_GatherVector:
case NI_Sve_GatherVectorByteZeroExtend:
case NI_Sve_GatherVectorFirstFaulting:
case NI_Sve_GatherVectorInt16SignExtend:
case NI_Sve_GatherVectorInt16WithByteOffsetsSignExtend:
case NI_Sve_GatherVectorInt32SignExtend:
@@ -26674,11 +26686,12 @@ bool GenTreeHWIntrinsic::OperIsMemoryLoad(GenTree** pAddr) const
{
#ifdef TARGET_ARM64
static_assert_no_msg(
AreContiguous(NI_Sve_GatherVector, NI_Sve_GatherVectorByteZeroExtend, NI_Sve_GatherVectorInt16SignExtend,
NI_Sve_GatherVectorInt16WithByteOffsetsSignExtend, NI_Sve_GatherVectorInt32SignExtend,
NI_Sve_GatherVectorInt32WithByteOffsetsSignExtend, NI_Sve_GatherVectorSByteSignExtend,
NI_Sve_GatherVectorUInt16WithByteOffsetsZeroExtend, NI_Sve_GatherVectorUInt16ZeroExtend,
NI_Sve_GatherVectorUInt32WithByteOffsetsZeroExtend, NI_Sve_GatherVectorUInt32ZeroExtend));
AreContiguous(NI_Sve_GatherVector, NI_Sve_GatherVectorByteZeroExtend, NI_Sve_GatherVectorFirstFaulting,
NI_Sve_GatherVectorInt16SignExtend, NI_Sve_GatherVectorInt16WithByteOffsetsSignExtend,
NI_Sve_GatherVectorInt32SignExtend, NI_Sve_GatherVectorInt32WithByteOffsetsSignExtend,
NI_Sve_GatherVectorSByteSignExtend, NI_Sve_GatherVectorUInt16WithByteOffsetsZeroExtend,
NI_Sve_GatherVectorUInt16ZeroExtend, NI_Sve_GatherVectorUInt32WithByteOffsetsZeroExtend,
NI_Sve_GatherVectorUInt32ZeroExtend));
assert(varTypeIsI(addr) || (varTypeIsSIMD(addr) && ((intrinsicId >= NI_Sve_GatherVector) &&
(intrinsicId <= NI_Sve_GatherVectorUInt32ZeroExtend))));
#else
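Note on the hunk above: the `static_assert` is what makes the range comparison in the `assert` that follows it safe. If the gather intrinsic IDs are consecutive, a check of the form `id >= first && id <= last` covers exactly that family, so adding `NI_Sve_GatherVectorFirstFaulting` only requires listing it in the right position. A minimal C++ sketch of the idea follows; `ToyIntrinsic`, `IsGather`, and this `AreContiguous` implementation are illustrative stand-ins and not the JIT's actual definitions.

```cpp
#include <cassert>

// Hypothetical stand-in for a slice of the NamedIntrinsic enum.
enum ToyIntrinsic : int
{
    Toy_GatherVector = 10,
    Toy_GatherVectorByteZeroExtend,
    Toy_GatherVectorFirstFaulting,
    Toy_GatherVectorInt16SignExtend,
    Toy_Unrelated = 42,
};

// Base case: a single value is trivially contiguous.
template <typename T>
constexpr bool AreContiguous(T)
{
    return true;
}

// Recursive case: each value must be exactly one more than its predecessor.
template <typename T, typename... Rest>
constexpr bool AreContiguous(T first, T second, Rest... rest)
{
    return (static_cast<int>(second) == static_cast<int>(first) + 1) && AreContiguous(second, rest...);
}

// Checked at compile time, so a misplaced enum entry breaks the build.
static_assert(AreContiguous(Toy_GatherVector, Toy_GatherVectorByteZeroExtend, Toy_GatherVectorFirstFaulting,
                            Toy_GatherVectorInt16SignExtend),
              "gather IDs must be consecutive");
static_assert(!AreContiguous(Toy_GatherVectorInt16SignExtend, Toy_Unrelated), "a gap must be detected");

// With contiguity guaranteed, family membership is a plain range check.
constexpr bool IsGather(ToyIntrinsic id)
{
    return (id >= Toy_GatherVector) && (id <= Toy_GatherVectorInt16SignExtend);
}
```

In this sketch, inserting a new enumerator in the middle of the family without adding it to the `static_assert` list would fail compilation, which is the property the real assert relies on.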
@@ -27096,6 +27109,15 @@ bool GenTreeHWIntrinsic::OperRequiresCallFlag() const
case NI_Sve_GatherPrefetch32Bit:
case NI_Sve_GatherPrefetch64Bit:
case NI_Sve_GatherPrefetch8Bit:
case NI_Sve_SetFfr:
case NI_Sve_GetFfrByte:
case NI_Sve_GetFfrInt16:
case NI_Sve_GetFfrInt32:
case NI_Sve_GetFfrInt64:
case NI_Sve_GetFfrSByte:
case NI_Sve_GetFfrUInt16:
case NI_Sve_GetFfrUInt32:
case NI_Sve_GetFfrUInt64:
{
return true;
}
@@ -27286,6 +27308,15 @@ void GenTreeHWIntrinsic::Initialize(NamedIntrinsic intrinsicId)
case NI_Sve_GatherPrefetch32Bit:
case NI_Sve_GatherPrefetch64Bit:
case NI_Sve_GatherPrefetch8Bit:
case NI_Sve_SetFfr:
case NI_Sve_GetFfrByte:
case NI_Sve_GetFfrInt16:
case NI_Sve_GetFfrInt32:
case NI_Sve_GetFfrInt64:
case NI_Sve_GetFfrSByte:
case NI_Sve_GetFfrUInt16:
case NI_Sve_GetFfrUInt32:
case NI_Sve_GetFfrUInt64:
{
// Mark as a call and global reference, much as is done for GT_KEEPALIVE
gtFlags |= (GTF_CALL | GTF_GLOB_REF);
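Note on the hunk above: FFR is register state that no IR operand names, which is presumably why `SetFfr` and the `GetFfr*` variants receive the same conservative flags as a call. The sketch below illustrates the effect with a toy dead-code rule. The flag values, `ToyNode`, `MakeNode`, and `CanRemove` are hypothetical and only mirror the shape of `Initialize()`; the real JIT checks are more involved.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values; the real GTF_* constants are defined by the JIT.
constexpr uint32_t TOY_GTF_CALL     = 0x1;
constexpr uint32_t TOY_GTF_GLOB_REF = 0x2;

enum class ToyOp
{
    Add,
    GetFfr,
    SetFfr
};

struct ToyNode
{
    ToyOp    op;
    uint32_t flags;
    bool     resultUsed;
};

// Mirrors the shape of Initialize(): FFR accessors touch hidden register state,
// so they get call-like flags at construction time.
inline ToyNode MakeNode(ToyOp op, bool resultUsed)
{
    uint32_t flags = 0;
    if ((op == ToyOp::GetFfr) || (op == ToyOp::SetFfr))
    {
        flags |= (TOY_GTF_CALL | TOY_GTF_GLOB_REF);
    }
    return ToyNode{op, flags, resultUsed};
}

// Toy dead-code rule: an unused node may be dropped only if it carries no side-effect flags.
inline bool CanRemove(const ToyNode& node)
{
    return !node.resultUsed && ((node.flags & (TOY_GTF_CALL | TOY_GTF_GLOB_REF)) == 0);
}
```

Under this rule an unused `Add` can be dropped, while an unused `SetFfr` or `GetFfr` is kept because removing it would change what a later first-faulting load or FFR read observes.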
30 changes: 11 additions & 19 deletions src/coreclr/jit/hwintrinsic.cpp
@@ -2217,6 +2217,7 @@ GenTree* Compiler::impHWIntrinsic(NamedIntrinsic intrinsic,
#elif defined(TARGET_ARM64)
case NI_Sve_GatherVector:
case NI_Sve_GatherVectorByteZeroExtend:
case NI_Sve_GatherVectorFirstFaulting:
case NI_Sve_GatherVectorInt16SignExtend:
case NI_Sve_GatherVectorInt16WithByteOffsetsSignExtend:
case NI_Sve_GatherVectorInt32SignExtend:
@@ -2297,10 +2298,15 @@ GenTree* Compiler::impHWIntrinsic(NamedIntrinsic intrinsic,

switch (intrinsic)
{
case NI_Sve_CreateBreakAfterMask:
case NI_Sve_CreateBreakAfterPropagateMask:
case NI_Sve_CreateBreakBeforeMask:
case NI_Sve_CreateBreakBeforePropagateMask:
{
// HWInstrinsic requires a mask for op3
convertToMaskIfNeeded(retNode->AsHWIntrinsic()->Op(3));
FALLTHROUGH;
}
case NI_Sve_CreateBreakAfterMask:
case NI_Sve_CreateBreakBeforeMask:
case NI_Sve_CreateMaskForFirstActiveElement:
case NI_Sve_CreateMaskForNextActiveElement:
case NI_Sve_GetActiveElementCount:
@@ -2310,30 +2316,16 @@ GenTree* Compiler::impHWIntrinsic(NamedIntrinsic intrinsic,
{
// HWInstrinsic requires a mask for op2
convertToMaskIfNeeded(retNode->AsHWIntrinsic()->Op(2));
break;
FALLTHROUGH;
}

default:
break;
}

switch (intrinsic)
{
case NI_Sve_CreateBreakAfterPropagateMask:
case NI_Sve_CreateBreakBeforePropagateMask:
{
// HWInstrinsic requires a mask for op3
convertToMaskIfNeeded(retNode->AsHWIntrinsic()->Op(3));
// HWInstrinsic requires a mask for op1
convertToMaskIfNeeded(retNode->AsHWIntrinsic()->Op(1));
break;
}

default:
break;
}

// HWInstrinsic requires a mask for op1
convertToMaskIfNeeded(retNode->AsHWIntrinsic()->Op(1));

if (HWIntrinsicInfo::IsMultiReg(intrinsic))
{
assert(HWIntrinsicInfo::IsExplicitMaskedOperation(retNode->AsHWIntrinsic()->GetHWIntrinsicId()));
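Note on the hunk above: as the added lines appear to read, the two separate switches are folded into one cascade, so the `...PropagateMask` intrinsics convert op3, then fall through to the op2 conversion, and every path ends by converting op1. A small C++ sketch of that control flow follows, with a made-up `ToyIntrinsic` enum and an `OperandsToConvert` helper that only records which operands would be converted.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, reduced set of intrinsics for illustration.
enum class ToyIntrinsic
{
    BreakAfterPropagateMask, // needs masks for op3, op2 and op1
    BreakAfterMask,          // needs masks for op2 and op1
    Other                    // needs a mask for op1 only
};

// Returns the operand indices that would be converted to masks, in order.
inline std::vector<int> OperandsToConvert(ToyIntrinsic id)
{
    std::vector<int> converted;
    switch (id)
    {
        case ToyIntrinsic::BreakAfterPropagateMask:
            converted.push_back(3);
            // fall through
        case ToyIntrinsic::BreakAfterMask:
            converted.push_back(2);
            // fall through
        default:
            converted.push_back(1);
            break;
    }
    return converted;
}
```

The cascade keeps each conversion in one place: a three-operand intrinsic reuses the two-operand and one-operand handling instead of repeating it in a second switch.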
74 changes: 74 additions & 0 deletions src/coreclr/jit/hwintrinsiccodegenarm64.cpp
@@ -303,6 +303,7 @@ void CodeGen::genHWIntrinsic(GenTreeHWIntrinsic* node)

emitAttr emitSize;
insOpts opt;
bool unspilledFfr = false;

if (HWIntrinsicInfo::SIMDScalar(intrin.id))
{
@@ -318,6 +319,39 @@ void CodeGen::genHWIntrinsic(GenTreeHWIntrinsic* node)
{
emitSize = EA_SCALABLE;
opt = emitter::optGetSveInsOpt(emitTypeSize(intrin.baseType));

switch (intrin.id)
{
case NI_Sve_GetFfrByte:
case NI_Sve_GetFfrInt16:
case NI_Sve_GetFfrInt32:
case NI_Sve_GetFfrInt64:
case NI_Sve_GetFfrSByte:
case NI_Sve_GetFfrUInt16:
case NI_Sve_GetFfrUInt32:
case NI_Sve_GetFfrUInt64:
{
if ((intrin.op1 != nullptr) && ((intrin.op1->gtFlags & GTF_SPILLED) != 0))
{
// If there was a op1 for this intrinsic, it means FFR is consumed here
// and we need to unspill.
unspilledFfr = true;
}
break;
}
case NI_Sve_LoadVectorFirstFaulting:
{
if ((intrin.op3 != nullptr) && ((intrin.op3->gtFlags & GTF_SPILLED) != 0))
{
// If there was a op3 for this intrinsic, it means FFR is consumed here
// and we need to unspill.
unspilledFfr = true;
}
break;
}
default:
break;
}
}
else if (intrin.category == HW_Category_Special)
{
@@ -2051,6 +2085,7 @@ void CodeGen::genHWIntrinsic(GenTreeHWIntrinsic* node)

case NI_Sve_GatherVector:
case NI_Sve_GatherVectorByteZeroExtend:
case NI_Sve_GatherVectorFirstFaulting:
case NI_Sve_GatherVectorInt16SignExtend:
case NI_Sve_GatherVectorInt16WithByteOffsetsSignExtend:
case NI_Sve_GatherVectorInt32SignExtend:
@@ -2366,6 +2401,45 @@ void CodeGen::genHWIntrinsic(GenTreeHWIntrinsic* node)
break;
}

case NI_Sve_LoadVectorFirstFaulting:
{
assert(op3Reg == REG_NA);
Contributor

Just in case, lately the check began to fail on my machine:

/home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false -p System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true HardwareIntrinsics_Arm_ro.dll 'LoadVectorFirstFaulting'
17:56:34.401 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_LoadVectorFirstFaulting_float()
Supported ISAs:
  AdvSimd:   True
  Aes:       True
  ArmBase:   True
  Crc32:     True
  Dp:        True
  Rdm:       True
  Sha1:      True
  Sha256:    True
  Sve:       True

Beginning scenario: RunBasicScenario_Load

Assert failure(PID 2387565 [0x00246e6d], Thread: 2387565 [0x246e6d]): Assertion failed 'op3Reg == REG_NA' in 'JIT.HardwareIntrinsics.Arm._Sve.Sve__Sve_LoadVectorFirstFaulting_float:RunBasicScenario_LoadFirstFaulting():this' during 'Generate code' (IL size 138; hash 0x10c6e74d; Tier0)

    File: /home/mikabl01/dotnet/runtime/src/coreclr/jit/hwintrinsiccodegenarm64.cpp:2406
    Image: /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/corerun

./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.sh: line 432: 2387565 Aborted                 (core dumped) $LAUNCHER $ExePath "${CLRTestExecutionArguments[@]}"

Member

Yes, this will be fixed, as I pointed out in #105348 (comment).

if (unspilledFfr)
{
// We have unspilled the FFR in op1Reg. Restore it back in FFR register.
GetEmitter()->emitIns_R(INS_sve_wrffr, emitSize, op1Reg, opt);
}

insScalableOpts sopt = (opt == INS_OPTS_SCALABLE_B) ? INS_SCALABLE_OPTS_NONE : INS_SCALABLE_OPTS_LSL_N;
GetEmitter()->emitIns_R_R_R_R(ins, emitSize, targetReg, op1Reg, op2Reg, REG_ZR, opt, sopt);
Member

What is op3Reg here? It should be REG_NA. Can we add an assert for it?

Contributor Author

op3Reg would be REG_NA. I'll add an assert for it.

break;
}

case NI_Sve_GetFfrByte:
case NI_Sve_GetFfrInt16:
case NI_Sve_GetFfrInt32:
case NI_Sve_GetFfrInt64:
case NI_Sve_GetFfrSByte:
case NI_Sve_GetFfrUInt16:
case NI_Sve_GetFfrUInt32:
case NI_Sve_GetFfrUInt64:
{
if (unspilledFfr)
{
// We have unspilled the FFR in op1Reg. Restore it back in FFR register.
GetEmitter()->emitIns_R(INS_sve_wrffr, emitSize, op1Reg, opt);
}

GetEmitter()->emitIns_R(ins, emitSize, targetReg, INS_OPTS_SCALABLE_B);
break;
}
case NI_Sve_SetFfr:
{
assert(targetReg == REG_NA);
GetEmitter()->emitIns_R(ins, emitSize, op1Reg, opt);
Member

Don't we need special codegen for GetFfr too otherwise it will generate RDFFR (predicated) instead of RDFFR (unpredicated)?

Contributor Author
TIHan Jul 15, 2024

Yea, we can add an optimization to use SETFFR if op1 is contained and IsAllBitsSet. I will add the opt.

Member

No, this comment is specifically for GetFfr and not for SetFfr.

Contributor Author

Ah, gotcha. I misread.

break;
}

case NI_Sve_ConditionalExtractAfterLastActiveElementScalar:
case NI_Sve_ConditionalExtractLastActiveElementScalar:
{
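Note on the codegen above: the `wrffr` and `rdffr` handling exists because first-faulting loads communicate through FFR. As a rough software model of the architectural behavior (simplified, all lanes active, fixed lane count), a fault on the first active element is reported normally, while a fault on a later element is suppressed and clears the FFR bit for that lane and all later lanes. The `ToySve` type, `kLanes`, and the use of an out-of-range vector index as a stand-in for an unmapped page are assumptions made for illustration; values in lanes past the fault should not be relied on.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

constexpr std::size_t kLanes = 4; // hypothetical fixed vector length for the model

struct ToySve
{
    std::array<bool, kLanes> ffr{}; // models the first-fault register

    // Models SetFfr with an explicit mask (a WRFFR-style write).
    void SetFfr(const std::array<bool, kLanes>& mask) { ffr = mask; }

    // Models GetFfr (an RDFFR-style read).
    std::array<bool, kLanes> GetFfr() const { return ffr; }

    // Models a first-faulting load with an all-true governing predicate.
    // Reading at or beyond memory.size() stands in for touching an unmapped page.
    std::array<int, kLanes> LoadFirstFaulting(const std::vector<int>& memory, std::size_t start)
    {
        std::array<int, kLanes> result{}; // lanes past a fault stay 0 here; real hardware gives no such promise
        for (std::size_t lane = 0; lane < kLanes; ++lane)
        {
            const std::size_t index = start + lane;
            if (index >= memory.size())
            {
                if (lane == 0)
                {
                    // A fault on the first active element is reported normally.
                    throw std::out_of_range("fault on first active element");
                }
                // A later fault is suppressed: clear FFR from this lane onward and stop.
                for (std::size_t rest = lane; rest < kLanes; ++rest)
                {
                    ffr[rest] = false;
                }
                break;
            }
            result[lane] = memory[index];
        }
        return result;
    }
};
```

In the model, the load only ever clears FFR bits, so a caller sets FFR to all-true first, performs the load, and then reads FFR to learn how many leading lanes are valid. That ordering dependency is also why the JIT has to restore a spilled FFR value before emitting an instruction that consumes it.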