-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISC-V] Put scalar stack args with sign extension #97662
[RISC-V] Put scalar stack args with sign extension #97662
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsAccording to Integer calling convention arguments on stack narrower than XLEN bits are stored on stack with sign extension to XLEN according to theirs type. Part of #84834
|
Formatting patch
Diff results for #97662Throughput diffsThroughput diffs for linux/arm64 ran on windows/x64MinOpts (-0.01% to +0.00%)
Throughput diffs for osx/arm64 ran on windows/x64MinOpts (-0.01% to +0.00%)
Details here |
Reproduction info.
using System.IO;
using System.Xml.Serialization;
namespace Oobe.Language.Model
{
public class LanguageInfo
{
[XmlAttribute("code")]
public string? Code { get; set; }
}
public class LanguageManger
{
public static void Main(string[] args)
{
var xs = new XmlSerializer(typeof(LanguageInfo));
System.Console.WriteLine(xs);
}
}
}
-O3 assembler of `TypeBuilder_SetMethodIL`(gdb) bt 4
#0 CQuickArrayBase<tagCOR_ILMETHOD_SECT_EH_CLAUSE_FAT>::AllocThrows (this=0x3fffffcbe0, iItems=0x3f00000000) at /usr/src/debug/coreclr-8.0.0-0.riscv64/src/coreclr/inc/corhlprpriv.h:389
#1 TypeBuilder_SetMethodIL (pModule=..., tk=0x6000001, fIsInitLocal=0x1, pBody=0xaad433f40 "\002(\002", cbBody=<optimized out>, pLocalSig=<optimized out>, sigLength=0x2, maxStackSize=<optimized out>, pExceptions=0x0, numExceptions=0x0, pTokenFixups=0xaad433fa0, numTokenFixups=<optimized out>) at /usr/src/debug/coreclr-8.0.0-0.riscv64/src/coreclr/vm/comdynamic.cpp:357
#2 0x0000003f78cf31e8 in ?? ()
(gdb) i r ra
ra 0x3ff777fab0 0x3ff777fab0 <TypeBuilder_SetMethodIL(QCall::ModuleHandle, INT32, BOOL, LPCBYTE, INT32, LPCBYTE, INT32, UINT16, ExceptionInstance*, INT32, INT32*, INT32)+264>
(gdb) disassemble
Dump of assembler code for function TypeBuilder_SetMethodIL(QCall::ModuleHandle, INT32, BOOL, LPCBYTE, INT32, LPCBYTE, INT32, UINT16, ExceptionInstance*, INT32, INT32*, INT32):
0x0000003ff777f9a8 <+0>: addi sp,sp,-1328
0x0000003ff777f9ac <+4>: sd ra,1320(sp)
0x0000003ff777f9b0 <+8>: sd s0,1312(sp)
0x0000003ff777f9b4 <+12>: sd s1,1304(sp)
0x0000003ff777f9b8 <+16>: sd s2,1296(sp)
0x0000003ff777f9bc <+20>: sd s3,1288(sp)
0x0000003ff777f9c0 <+24>: sd s4,1280(sp)
0x0000003ff777f9c4 <+28>: sd s5,1272(sp)
0x0000003ff777f9c8 <+32>: sd s6,1264(sp)
0x0000003ff777f9cc <+36>: sd s7,1256(sp)
0x0000003ff777f9d0 <+40>: sd s8,1248(sp)
0x0000003ff777f9d4 <+44>: sd s9,1240(sp)
0x0000003ff777f9d8 <+48>: sd s10,1232(sp)
0x0000003ff777f9dc <+52>: sd s11,1224(sp)
0x0000003ff777f9e0 <+56>: addi s0,sp,1328
0x0000003ff777f9e4 <+60>: auipc a0,0x46b
0x0000003ff777f9e8 <+64>: ld s9,532(a0) # 0x3ff7beabf8
0x0000003ff777f9ec <+68>: ld a0,0(s9)
0x0000003ff777f9f0 <+72>: mv s1,a7
0x0000003ff777f9f2 <+74>: mv s6,a6
0x0000003ff777f9f4 <+76>: mv s5,a5
0x0000003ff777f9f6 <+78>: mv s11,a4
0x0000003ff777f9f8 <+80>: mv s2,a3
0x0000003ff777f9fa <+82>: mv s4,a2
0x0000003ff777f9fc <+84>: mv s10,a1
0x0000003ff777f9fe <+86>: sd a0,-120(s0)
0x0000003ff777fa02 <+90>: sd zero,-1248(s0)
0x0000003ff777fa06 <+94>: sd zero,-1256(s0)
0x0000003ff777fa0a <+98>: li a0,-1
0x0000003ff777fa0c <+100>: sd a0,-1240(s0)
0x0000003ff777fa10 <+104>: sd zero,-1232(s0)
0x0000003ff777fa14 <+108>: sh zero,-1216(s0)
0x0000003ff777fa18 <+112>: sd zero,-1208(s0)
0x0000003ff777fa1c <+116>: sd zero,-1200(s0)
0x0000003ff777fa20 <+120>: auipc a0,0x46b
0x0000003ff777fa24 <+124>: addi a0,a0,-576 # 0x3ff7bea7e0
0x0000003ff777fa28 <+128>: jal ra,0x3ff76813f0 <__tls_get_addr@plt>
0x0000003ff777fa2c <+132>: ld a0,0(a0)
0x0000003ff777fa2e <+134>: ld a0,16(a0)
0x0000003ff777fa30 <+136>: sd a0,-1328(s0)
0x0000003ff777fa34 <+140>: ld s3,1536(s10)
0x0000003ff777fa38 <+144>: li a0,2
0x0000003ff777fa3a <+146>: sd s4,-1312(s0)
0x0000003ff777fa3e <+150>: bne s1,a0,0x3ff777fa50 <TypeBuilder_SetMethodIL(QCall::ModuleHandle, INT32, BOOL, LPCBYTE, INT32, LPCBYTE, INT32, UINT16, ExceptionInstance*, INT32, INT32*, INT32)+168>
0x0000003ff777fa42 <+154>: lbu a0,0(s6)
0x0000003ff777fa46 <+158>: bnez a0,0x3ff777fa50 <TypeBuilder_SetMethodIL(QCall::ModuleHandle, INT32, BOOL, LPCBYTE, INT32, LPCBYTE, INT32, UINT16, ExceptionInstance*, INT32, INT32*, INT32)+168>
0x0000003ff777fa48 <+160>: lbu a0,1(s6)
0x0000003ff777fa4c <+164>: beqz a0,0x3ff777ff04 <TypeBuilder_SetMethodIL(QCall::ModuleHandle, INT32, BOOL, LPCBYTE, INT32, LPCBYTE, INT32, UINT16, ExceptionInstance*, INT32, INT32*, INT32)+1372>
0x0000003ff777fa50 <+168>: ld a0,0(s3)
0x0000003ff777fa54 <+172>: ld a1,0(a0)
0x0000003ff777fa56 <+174>: ld a4,184(a1)
0x0000003ff777fa58 <+176>: addi a3,s0,-1260
0x0000003ff777fa5c <+180>: mv a1,s6
0x0000003ff777fa5e <+182>: mv a2,s1
0x0000003ff777fa60 <+184>: jalr a4
0x0000003ff777fa62 <+186>: bltz a0,0x3ff777ff32 <TypeBuilder_SetMethodIL(QCall::ModuleHandle, INT32, BOOL, LPCBYTE, INT32, LPCBYTE, INT32, UINT16, ExceptionInstance*, INT32, INT32*, INT32)+1418>
0x0000003ff777fa66 <+190>: lbu a1,-1271(s0)
0x0000003ff777fa6a <+194>: lw a0,-1260(s0)
0x0000003ff777fa6e <+198>: ld s8,16(s0)
0x0000003ff777fa72 <+202>: ld a2,0(s0)
0x0000003ff777fa74 <+204>: snez a3,s2
0x0000003ff777fa78 <+208>: slliw a3,a3,0x4
0x0000003ff777fa7c <+212>: sb a3,-1272(s0)
0x0000003ff777fa80 <+216>: andi a1,a1,240
0x0000003ff777fa84 <+220>: sb a1,-1271(s0)
0x0000003ff777fa88 <+224>: sh a2,-1270(s0)
0x0000003ff777fa8c <+228>: sw a0,-1264(s0)
0x0000003ff777fa90 <+232>: sw s5,-1268(s0)
0x0000003ff777fa94 <+236>: snez s7,s8
0x0000003ff777fa98 <+240>: beqz s8,0x3ff777faa2 <TypeBuilder_SetMethodIL(QCall::ModuleHandle, INT32, BOOL, LPCBYTE, INT32, LPCBYTE, INT32, UINT16, ExceptionInstance*, INT32, INT32*, INT32)+250>
0x0000003ff777fa9c <+244>: addiw s5,s5,3
0x0000003ff777fa9e <+246>: andi s5,s5,-4
0x0000003ff777faa2 <+250>: addi a0,s0,-1272
0x0000003ff777faa6 <+254>: mv a1,s7
0x0000003ff777faa8 <+256>: auipc ra,0x161
0x0000003ff777faac <+260>: jalr -1404(ra) # 0x3ff78e052c <IlmethodSize(COR_ILMETHOD_FAT*, BOOL)>
0x0000003ff777fab0 <+264>: mv s6,a0
0x0000003ff777fab2 <+266>: sd zero,-648(s0)
0x0000003ff777fab6 <+270>: sd zero,-656(s0)
0x0000003ff777faba <+274>: li a0,512
0x0000003ff777fabe <+278>: sd a0,-640(s0)
0x0000003ff777fac2 <+282>: blez s8,0x3ff777fc62 <TypeBuilder_SetMethodIL(QCall::ModuleHandle, INT32, BOOL, LPCBYTE, INT32, LPCBYTE, INT32, UINT16, ExceptionInstance*, INT32, INT32*, INT32)+698>
0x0000003ff777fac6 <+286>: li a0,24
0x0000003ff777fac8 <+288>: li a1,22
=> 0x0000003ff777faca <+290>: mul s2,s8,a0
0x0000003ff777face <+294>: bltu s8,a1,0x3ff777fae4 <TypeBuilder_SetMethodIL(QCall::ModuleHandle, INT32, BOOL, LPCBYTE, INT32, LPCBYTE, INT32, UINT16, ExceptionInstance*, INT32, INT32*, INT32)+316>
0x0000003ff777fad2 <+298>: mv a0,s2
0x0000003ff777fad4 <+300>: auipc ra,0x159
0x0000003ff777fad8 <+304>: jalr 70(ra) # 0x3ff78d8b1a <_Znam>
0x0000003ff777fadc <+308>: sd a0,-656(s0)
0x0000003ff777fae0 <+312>: sd s2,-640(s0)
0x0000003ff777fae4 <+316>: li t4,0
0x0000003ff777fae6 <+318>: li t3,0
0x0000003ff777fae8 <+320>: ld t0,8(s0)
0x0000003ff777faec <+324>: sd s2,-648(s0)
0x0000003ff777faf0 <+328>: addi t2,s0,-632
0x0000003ff777faf4 <+332>: li t1,2
...
(gdb) p &iItems
Address requested for identifier "iItems" which is in register $s8
(gdb) i r s8
s8 0x3f00000000 0x3f00000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verified.
It should not be necessary to do anything to support this requirement, because all values in the program should follow these rules and be in canonical form at all times. There are only 64 bit compare-and-branch operations, and they depend on the values they compare being correctly in canonical form. C rules on all platforms require narrower values to be extended to int size according to their sign, and lb, lbu, lh, lhu automatically do this (and in fact widen correctly to the full 64 bits). Nothing strange there. The only RISC-V specific thing is that lw should be used to load both signed and unsigned 32 bit values (using lwu only if the value is loaded and immediately cast to uint64). All arithmetic operations automatically maintain the "32-bit values are sign-extended to 64 bits" property -- the *W operations for ADDI, ADD, SUB, MUL, DIV, REM, and the shifts. Normal 64 bit AND, OR, and XOR naturally retain the correct property in the result if the operands are correct. If the correct load instructions and correct arithmetic instructions are used, then no special action is required to put function arguments into the correct form -- they already will be. |
This is about call site construction, we need
The callee can and does load int arguments shorter than 64 bits with ld, since the int argument on the stack needs to be sign-extended to 64 bits, it's legal to do so. This bug came up with making interop calls, the callee being native. |
Perhaps it is the PR title and the comment that confused me. Certainly you need to store the appropriately-extended value. So you need to use e.g. see function foo() in https://godbolt.org/z/78h8a3rnM Code change LGTM, just the comment is confusing. |
According to Integer calling convention arguments on stack narrower than XLEN bits are stored on stack with sign extension to XLEN according to theirs type.
Part of #84834
cc @clamp03 @tomeksowi @gbalykov @denis-paranichev @ashaurtaev