[JIT] Add legacy extended EVEX encoding and EVEX.ND/NF feature to x64 emitter backend #108796

Open
wants to merge 80 commits into
base: main
Changes from 1 commit
Commits
1820567
Ruihan: POC with REX2
Ruihan-Yin Mar 25, 2024
d1afc68
resolve comments
Ruihan-Yin May 17, 2024
2335aa3
refactor register encoding for REX2
Ruihan-Yin May 20, 2024
6578c58
merge REX2 path to legacy path
Ruihan-Yin May 21, 2024
01eeb80
Enable REX2 in more instructions.
Ruihan-Yin May 30, 2024
690aee3
Avoid repeatedly estimate the size of REX2 prefix
Ruihan-Yin Jun 3, 2024
31d7fb4
Enable REX2 encoding on RI and SV path
Ruihan-Yin Jun 5, 2024
a995878
Add rex2 support to rotate and shift.
Ruihan-Yin Jun 6, 2024
74aacf6
CR session.
Ruihan-Yin Jun 7, 2024
c330927
Testing infra updates: assert REX2 is enabled.
Ruihan-Yin Jun 11, 2024
fbf20d1
revert rcl_N and rcr_N, tp and latency data for these instructions is…
Ruihan-Yin Jun 11, 2024
ea02e70
partially enable REX2 on emitOutputAM, case covered: R_AR and AR_R.
Ruihan-Yin Jun 12, 2024
c74b801
Adding unit tests.
Ruihan-Yin Jun 13, 2024
34980b4
push, pop, inc, dec, neg, not, xadd, shld, shrd, cmpxchg, setcc, bswap.
Ruihan-Yin Jun 26, 2024
2ffdbeb
bug fix for bswap
Ruihan-Yin Jun 27, 2024
3a729bb
bt
Ruihan-Yin Jun 28, 2024
d943b03
xchg, idiv
Ruihan-Yin Jul 1, 2024
c8fee9c
Make sure add REX2 prefix if register encoding for EGPRs are being ca…
Ruihan-Yin Jul 2, 2024
6ec0e97
Ensure code size is correctly computed in R_R_I path.
Ruihan-Yin Jul 8, 2024
1d01003
clean up
Ruihan-Yin Jul 9, 2024
1acc219
Change all AddSimdPrefix to AddX86Prefix
Ruihan-Yin Jul 15, 2024
87ad443
div, mulEAX
Ruihan-Yin Jul 16, 2024
bb9905a
filter out test from REX2 encoding when using ACC form.
Ruihan-Yin Jul 19, 2024
86083b2
Make sure REX prefix will not be added when emitting with REX2.
Ruihan-Yin Jul 24, 2024
dfe8760
resolve comments.
Ruihan-Yin Aug 5, 2024
64761cd
make sure the APX debug knob is only available under debug build.
Ruihan-Yin Oct 24, 2024
f1aba62
clean up some out-dated code.
Ruihan-Yin Nov 12, 2024
f5cc5a8
enable movsxd
Ruihan-Yin Nov 12, 2024
7ca8433
Enable "Call"
Ruihan-Yin Nov 13, 2024
bc4d225
Enable "JMP"
Ruihan-Yin Nov 15, 2024
deb3814
resolve merge errors
Ruihan-Yin Nov 18, 2024
0d63230
formatting
Ruihan-Yin Nov 18, 2024
13b8076
remote coredistools.dll for internal tests only
Ruihan-Yin Nov 18, 2024
42c6cfc
bug fix
Ruihan-Yin Nov 19, 2024
b1a9617
SUB reg, reg, reg
Ruihan-Yin Aug 8, 2024
ec5d5ca
enable NDD on genCodeForBinary
Ruihan-Yin Aug 28, 2024
ebeaf04
consolidate TakesLegacyPromotedEvexPrefix logics.
Ruihan-Yin Aug 30, 2024
547f01d
ensure register encoding is correct under legacy-promoted-evex encoding.
Ruihan-Yin Aug 30, 2024
3566464
Make sure the overflow check is correctly emitted.
Ruihan-Yin Sep 4, 2024
f8e9c4d
simplify the compiler setup logics.
Ruihan-Yin Sep 4, 2024
6bfd050
emitInsNddBinary
Ruihan-Yin Sep 6, 2024
4b0085d
make sure REX will not be added when EVEX presents.
Ruihan-Yin Sep 7, 2024
5701b1c
resolve comment and clean up.
Ruihan-Yin Sep 11, 2024
6d30388
enable more NDD instructions.
Ruihan-Yin Sep 13, 2024
5d3768c
bug fixes
Ruihan-Yin Sep 13, 2024
a5619e4
enable imul
Ruihan-Yin Sep 13, 2024
c71ace6
add emitter unit tests, and fix encoding error for CMOVcc
Ruihan-Yin Sep 16, 2024
ca92da9
bug fixes:
Ruihan-Yin Sep 18, 2024
5d10aef
refactor emitInsBinary
Ruihan-Yin Sep 19, 2024
5f288a6
clean up
Ruihan-Yin Sep 19, 2024
f4e96b0
clean up and refactor some code
Ruihan-Yin Sep 20, 2024
637c413
make sure the code size estimation is correct for some apx promoted i…
Ruihan-Yin Sep 25, 2024
a203a4d
add tuning knob to EVEX.ND feature.
Ruihan-Yin Sep 30, 2024
a99705a
flip the Evex.nd knob.
Ruihan-Yin Oct 1, 2024
b5fa5bf
put NDD control knob to the correct place.
Ruihan-Yin Oct 3, 2024
b69d01e
resolve merge errors
Ruihan-Yin Nov 20, 2024
52539c3
Make sure APX related knobs are defined properly across platforms
Ruihan-Yin Nov 20, 2024
25d66bf
Add Evex.nf to instrDesc
Ruihan-Yin Oct 2, 2024
a19da9e
{nf} add reg, reg
Ruihan-Yin Oct 8, 2024
2e8d714
Enable EVEX.NF in more instructions
Ruihan-Yin Oct 9, 2024
df59342
more instructions
Ruihan-Yin Oct 10, 2024
226fabb
comments.
Ruihan-Yin Oct 10, 2024
36c6631
lzcnt, tzcnt, popcnt
Ruihan-Yin Oct 10, 2024
5f8a01d
Exclude ACC form from EVEX promotion.
Ruihan-Yin Oct 15, 2024
0453630
BMI instructions.
Ruihan-Yin Oct 15, 2024
07868bc
bug fixes
Ruihan-Yin Oct 16, 2024
69f7e8b
Tweak the code size calculation to make sure REX2 and APX-EVEX are pr…
Ruihan-Yin Oct 18, 2024
1c1a894
bug fixes for stress mode
Ruihan-Yin Oct 29, 2024
1be4b12
Add idEvexNoPromotion to emitter to exclude the APX-EVEX promotion fr…
Ruihan-Yin Nov 4, 2024
bfb06c7
resolve merge error
Ruihan-Yin Nov 20, 2024
9541a99
fix merge error
Ruihan-Yin Nov 21, 2024
543d949
Revert "Add idEvexNoPromotion to emitter to exclude the APX-EVEX prom…
Ruihan-Yin Nov 21, 2024
a879019
bug fix
Ruihan-Yin Nov 22, 2024
55cbda6
introduce _no_evex suffix for some instructions for cases when LOCK w…
Ruihan-Yin Nov 22, 2024
a9a3d5c
Merge remote-tracking branch 'origin/main' into apx-evex-nf-nov
Ruihan-Yin Dec 17, 2024
0eef560
resolve merge comflict
Ruihan-Yin Dec 17, 2024
0480c02
fix merge error.
Ruihan-Yin Dec 17, 2024
48cec5f
fix comments and some checks.
Ruihan-Yin Dec 19, 2024
7171e0e
formatting
Ruihan-Yin Dec 19, 2024
5f7606c
remove unneeded env var.
Ruihan-Yin Dec 19, 2024
resolve comments
Ruihan-Yin committed Nov 19, 2024
commit d1afc68751f1ef6f5cc1bfb75971df9464f6f14e
4 changes: 4 additions & 0 deletions src/coreclr/jit/compiler.cpp
@@ -2297,6 +2297,10 @@ void Compiler::compSetProcessor()
             codeGen->GetEmitter()->SetUseEvexEncoding(true);
             // TODO-XArch-AVX512 : Revisit other flags to be set once avx512 instructions are added.
         }
+        if (canUseRex2Encoding())
+        {
+            codeGen->GetEmitter()->SetUseRex2Encoding(true);
+        }
     }
 #endif // TARGET_XARCH
 }
17 changes: 14 additions & 3 deletions src/coreclr/jit/compiler.h
@@ -9945,6 +9945,17 @@ class Compiler
         return (compOpportunisticallyDependsOn(InstructionSet_EVEX));
     }

+    //------------------------------------------------------------------------
+    // canUseRex2Encoding - Answer the question: Is Rex2 encoding supported on this target.
+    //
+    // Returns:
+    // `true` if Rex2 encoding is supported, `false` if not.
+    //
+    bool canUseRex2Encoding() const
+    {
+        return compOpportunisticallyDependsOn(InstructionSet_APX);
+    }
+
 private:
     //------------------------------------------------------------------------
     // DoJitStressEvexEncoding- Answer the question: Do we force EVEX encoding.
@@ -9990,9 +10001,9 @@ class Compiler
     bool DoJitStressRex2Encoding()
     {
 #ifdef DEBUG
-        // Using JitStressEVEXEncoding flag will force instructions which would
-        // otherwise use VEX encoding but can be EVEX encoded to use EVEX encoding
-        // This requires AVX512F, AVX512BW, AVX512CD, AVX512DQ, and AVX512VL support
+        // TODO-apx: currently to make sure APX can be encoded on non-APX machine,
+        // we don't assert APX support here, we will need to revisit
+        // this part after we have the hardware.

         return this->m_jitStressRex2Encoding;
 #endif // DEBUG
1 change: 1 addition & 0 deletions src/coreclr/jit/emit.h
@@ -470,6 +470,7 @@ class emitter
 #ifdef TARGET_XARCH
         SetUseVEXEncoding(false);
         SetUseEvexEncoding(false);
+        SetUseRex2Encoding(false);
 #endif // TARGET_XARCH

         emitDataSecCur = nullptr;
97 changes: 75 additions & 22 deletions src/coreclr/jit/emitxarch.cpp
@@ -274,17 +274,23 @@ bool emitter::IsEvexEncodableInstruction(instruction ins) const
 //
 bool emitter::IsRex2EncodableInstruction(instruction ins) const
 {
+    // TODO-apx: as we don't
+    // if(!UseRex2Encoding())
+    // {
+    // return false;
+    // }
+
     return HasRex2Encoding(ins);
 }

 //------------------------------------------------------------------------
-// IsLegacyMap1: Answer the question- Is this instruction undefined when prefixed by REX2
+// IsLegacyMap1: Answer the question- Is this instruction on legacy-map-1
 //
 // Arguments:
 // ins - The instruction to check.
 //
 // Returns:
-// `true` if ins is undefined.
+// `true` if ins is a legacy-map-1 instruction.
 //
 bool emitter::IsLegacyMap1(code_t code) const
 {
@@ -1367,7 +1373,7 @@ bool emitter::TakesRex2Prefix(const instrDesc* id) const
         return false;
     }

-    if(TakesEvexPrefix(id) || TakesVexPrefix(ins))
+    if(TakesEvexPrefix(id))
     {
         return false;
     }
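
Reviewer note on the hunk above: the change narrows the early-out in TakesRex2Prefix so that only an EVEX-encoded instruction suppresses REX2. The surrounding checks are not visible in this hunk, so the following is only a minimal sketch of the precedence it appears to imply. `InsInfo` and `takesRex2Prefix` are hypothetical names for illustration, not the JIT's `instrDesc` or emitter API, and the final "uses an extended GPR" condition is an assumption.

```cpp
#include <cassert>

// Hypothetical summary of one instruction; NOT the JIT's instrDesc.
struct InsInfo
{
    bool rex2Encodable;   // the instruction has a REX2 form at all
    bool takesEvex;       // the instruction is already being emitted with EVEX
    bool usesExtendedGpr; // some register operand is r16-r31
};

// Sketch of the prefix decision, assuming REX2 is only chosen when it is
// both legal and needed.
bool takesRex2Prefix(const InsInfo& ins)
{
    if (!ins.rex2Encodable)
    {
        return false; // no REX2 form exists for this instruction
    }

    if (ins.takesEvex)
    {
        return false; // EVEX wins; never stack REX2 on top of it
    }

    // Assumption: the prefix is only worth paying for when an extended GPR is used.
    return ins.usesExtendedGpr;
}
```

With this shape, an instruction that is REX2-encodable and touches an extended register gets the prefix unless it is already EVEX-encoded.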
@@ -1792,12 +1798,71 @@ bool emitter::HasHighSIMDReg(const instrDesc* id) const
 // true if instruction will require REX2 encoding for its register operands.
 bool emitter::HasExtendedGPReg(const instrDesc* id) const
 {
-
-    // TODO-apx:
-    // Not all instructions has 2 regs, this part needs to be updated later.
 #if defined(TARGET_AMD64)
-    if (IsExtendedGPReg(id->idReg1()) || IsExtendedGPReg(id->idReg2()))
-        return true;
+    int regCount = 0;
+
+    if(id->idHasReg1())
+    {
+        regCount++;
+    }
+
+    if(id->idHasReg2())
+    {
+        regCount++;
+    }
+
+    // TODO-apx: revisit code below, do we really have legacy map0/1 instructions taking 3/4 regs.
+    if(id->idHasReg3())
+    {
+        regCount++;
+    }
+
+    if(id->idHasReg4())
+    {
+        regCount++;
+    }
+
+    switch (regCount)
+    {
+        case 4:
+        {
+            if(IsExtendedGPReg(id->idReg4()))
+            {
+                return true;
+            }
+            FALLTHROUGH;
+        }
+
+        case 3:
+        {
+            if(IsExtendedGPReg(id->idReg3()))
+            {
+                return true;
+            }
+            FALLTHROUGH;
+        }
+
+        case 2:
+        {
+            if(IsExtendedGPReg(id->idReg2()))
+            {
+                return true;
+            }
+            FALLTHROUGH;
+        }
+
+        case 1:
+        {
+            if(IsExtendedGPReg(id->idReg1()))
+            {
+                return true;
+            }
+            FALLTHROUGH;
+        }
+
+        default:
+            return false;
+    }
 #endif
     // X86 JIT operates in 32-bit mode and hence extended reg are not available.
     return false;
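
Reviewer note on the hunk above: the new HasExtendedGPReg counts how many register operands the instruction has and then falls through a switch, testing each one. The same "does any present operand use an extended GPR" question can be sketched with a per-slot check. This is an illustrative alternative, not the code in the PR: `Reg`, `Operands`, `isExtendedGpr`, and `hasExtendedGpr` are hypothetical names, and the numbering (0-15 legacy, 16-31 extended) is a simplification of the JIT's `regNumber` enum.

```cpp
#include <array>
#include <cassert>
#include <optional>

// Hypothetical register numbering: 0-15 are the legacy GPRs,
// 16-31 stand for the APX extended GPRs r16-r31.
using Reg = int;

constexpr bool isExtendedGpr(Reg r)
{
    return r >= 16 && r <= 31;
}

// Hypothetical operand record: an empty slot means the instruction
// has no register in that position.
struct Operands
{
    std::array<std::optional<Reg>, 4> regs;
};

// Returns true if any register operand that is present is an extended GPR.
bool hasExtendedGpr(const Operands& ops)
{
    for (const auto& r : ops.regs)
    {
        if (r.has_value() && isExtendedGpr(*r))
        {
            return true;
        }
    }
    return false;
}
```

One difference worth noting: the sketch tests each slot on its own, whereas the count-and-switch form in the diff checks the first regCount slots, which gives the same answer only when the populated slots are contiguous starting at reg1.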
@@ -1864,14 +1929,14 @@ bool emitter::IsExtendedGPReg(regNumber reg) const
         return false;
     }

+    // TODO-apx: It would be better to have stress mode on LSRA to forcely allocate EGPRs,
+    // instead of stressing here.
 #if defined(DEBUG)
     if (emitComp->DoJitStressRex2Encoding())
     {
         return true;
     }
 #endif // DEBUG
-    // TODO-apx:
-    // For now keep it returning false unless stress it.
     return false;
 }

@@ -2603,18 +2668,6 @@ unsigned emitter::emitOutputRexOrSimdPrefixIfNeeded(instruction ins, BYTE* dst,
         code = code >> 2;
     }

-    // TODO-apx: need to complete the opcode check like REX here,
-    // as some of the opcode come with some prefix, we
-    // need to handle everything right here as other
-    // prefixs do.
-
-    // REX2 only supports Map0, 1 instructions, I'm not sure
-    // if there is any assumption we can make on the opcode
-    // length.
-
-    // Plus the pre-exist prefix seems only applies to SSE ins.
-    // Maybe don't need to be considered?
-
     BYTE check = (code >> 24) & 0xFF;
     if (check == 0)
     {
12 changes: 12 additions & 0 deletions src/coreclr/jit/emitxarch.h
@@ -316,6 +316,18 @@ void SetUseEvexEncoding(bool value)
         useEvexEncodings = value;
     }

+    // Is Rex2 encoding supported.
+    bool useRex2Encodings;
+    bool UseRex2Encoding() const
+    {
+        return useRex2Encodings;
+    }
+
+    void SetUseRex2Encoding(bool value)
+    {
+        useRex2Encodings = value;
+    }
+
     //------------------------------------------------------------------------
     // UseSimdEncoding: Returns true if either VEX or EVEX encoding is supported
     // contains Evex prefix.