Support ILP32 on RV64 in psABI #381
Regardless of whether we want it, x32 is completely the wrong name for it. x32 comes from the x86, x86_64 and (awful) x64 terminology specific to x86.
I'll also note that, normally, ILP32-on-64 ABIs use ELFCLASS32, not ELFCLASS64. This is complicated on RISC-V by the fact that there is a single EM_RISCV, not separate EM_RISCV32 and EM_RISCV64.
In ILP32-on-RV64, we used ELFCLASS32 and EM_RISCV. X32 (let's temporarily call it this name) is added to distinguish between ILP32 on RV64 and ILP32 on RV32. I think this e_flag is necessary. This is the ELF header generated by ILP32-on-RV64:
I would prefer something like registering a new e_machine value.
I could not agree with you.
e_machine is a 16-bit value used as a sequence number, so we still have 2^16 - 248 = 65288 values to use. e_flags, however, is used as a bit vector: 8 bits are reserved for non-standard use and 5 bits are already used, so only 19 bits are left, and adding EF_RISCV_X32 would leave 18. So compared to e_flags, e_machine has much more room to use (waste :P).
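To make the budget comparison concrete, here is a small arithmetic sketch. The counts (248 e_machine values taken, 8 e_flags bits reserved, 5 bits in use) are simply the figures quoted in the comment above, and the constant names are made up for illustration.

```python
# Illustrative arithmetic only; all counts are the figures quoted in the thread.
E_MACHINE_BITS = 16           # e_machine is a 16-bit field used as a sequence number
E_MACHINE_TAKEN = 248         # values treated as already consumed (figure from the comment)

E_FLAGS_BITS = 32             # e_flags is a 32-bit field used as a bit vector
E_FLAGS_RESERVED_NONSTD = 8   # bits reserved for non-standard use
E_FLAGS_IN_USE = 5            # bits already assigned

e_machine_free = 2 ** E_MACHINE_BITS - E_MACHINE_TAKEN
e_flags_free = E_FLAGS_BITS - E_FLAGS_RESERVED_NONSTD - E_FLAGS_IN_USE
e_flags_free_after_x32 = e_flags_free - 1  # one more bit spent on EF_RISCV_X32

print(e_machine_free, e_flags_free, e_flags_free_after_x32)  # 65288 19 18
```

A sequence-numbered field loses one value out of tens of thousands per new ABI, while a bit vector loses one of its few remaining bits, which is the asymmetry the comment points at.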
OK, you convinced me.
Seems like it should happen here: we could send a request to registry@xinuos.com and https://groups.google.com/g/generic-abi , but I would like to pick a random value before we reach further consensus. However, I know a random value might be a bit too vague, maybe we could tentatively use
I don't have an overly strong view on EM_RISCV_X32 (though Kito makes a good point about e_flags bits being scarce), but don't see why we'd need to introduce EM_RISCV32 and EM_RISCV64 as well at this point - it feels like it would cause confusion with no real gain.
The world would have been better if EM_RISCV had been split in two like that, but it wasn't, so we need to live with that, and trying to retroactively do it is a bad idea. So I agree with Alex.
910fb7f to 4edf2f5
This feature has already been adopted by upstream open-source RTOS projects, including NuttX and RT-Thread. During the 11.7 psABI meeting, it was agreed to include N32 in the psABI specification as an experimental feature.
I don’t think it was completely agreed, but it will receive less opposition than trying to make it non-experimental.
Devboard: Due to the widespread adoption of N32 ABI, I support including N32 as an experimental feature in the psABI specification.
Has the design of the RV64 ILP32 ABI been thoroughly vetted, and have the inevitable performance wrinkles been worked out? I don't want to be an impediment to adopting an RV64 ILP32 ABI, but this seems like a surprisingly small PR to that end.
I don’t see a change regarding ELFCLASS32/64, for one. I’m also concerned about the 2 GiB restriction making this of extremely limited use.
My other big concern as a quite separate issue is with the toolchain side. Our toolchain conventions dictate that cc -mabi=ilp32[fde]? for a 64-bit-targeting compiler gets you RV32. How then would you enable this ABI instead?
Yeah. What is the exact nature of the restriction? We did design the virtual memory system so that addresses up to 4 GiB are usable. I would think that a 2 GiB relative limit in static addressing is a requirement, following the usual logic for RV64, but we should be able to provide the whole ~4 GiB heap.
It's there because the ISA and ABI like to sign-extend things and so sign-extending an address >= 2 GiB gives you something in the kernel's address space, so you'd have to make more changes to the ABI that may have a performance cost. Unlike using UXL=32 where you can actually use the full 32-bit address space, which seems like the wrong way round for the two configurations if anything. Combined that makes this quite a niche thing; you'd likely struggle to build a full distro with it, I know Debian struggled with 32-bit MIPS's (architectural) address space limitations for many years, so it would be for embedded use only, at which point why not just use RV32, that's what it's for.
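The sign-extension effect described above can be modelled in a few lines. This is only a sketch: `sext32` is a hypothetical helper, and the statement that the upper half of the 64-bit address space belongs to the kernel reflects the usual OS layout rather than anything the ISA mandates.

```python
def sext32(value):
    """Model a 32-bit pointer held sign-extended in a 64-bit register."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:          # bit 31 set: copy it into bits 63..32
        value |= 0xFFFFFFFF00000000
    return value

last_low = sext32(0x7FFFFFFF)       # just below 2 GiB: unchanged
first_high = sext32(0x80000000)     # exactly 2 GiB: lands in the top of the 64-bit space

print(hex(last_low))    # 0x7fffffff
print(hex(first_high))  # 0xffffffff80000000
```

Any 32-bit address at or above 2 GiB therefore turns into a 64-bit address in the upper half, which is the region a typical 64-bit kernel keeps for itself.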
I was conflating the UXL=32 story with the RV64 ILP32 situation. Sorry for that noise.
Well, it is still somewhat relevant. As far as I understand, i386 and x32 can both use a 4 GiB address space (provided you're using a 64-bit kernel), and neither MIPS N32 nor MIPS O32 can use a 4 GiB address space as the various untranslated/kernel windows are architectural and the 32-bit regions still present in the 64-bit address space (though it wouldn't surprise me if R6 or another revision allows turning this off given it ditched a bunch of other historical baggage, my knowledge is dated). RISC-V, as far as I know, is unusual in its 32-bit ISA+ABI having a bigger usable address than this proposed 64-bit ISA + 32-bit ABI (or even different in any respect), so it's worth drawing attention to.
The facts are:
Your question becomes: why did the above facts come about? So, let me guess a little:
RV64 ISA has a differentiated advantage over arm32, making it easier to attract customers. Customers will ask you to give the advantages of rv32 over arm32, which is very difficult. Sometimes, telling customers that your product can be upgraded to 64-bit ISA and evolve to a higher-end product form is more convincing.
The rv64ilp32 ABI would be enabled only when both -march=rv64* and -mabi=ilp32* are specified. This will not affect the current default -mabi=ilp32* usage, which stays on the rv32ilp32 ABI.
I also see value in RV64 ILP32, since the wider registers, memory accesses, etc. will offer higher performance than UXL=32. But I'd like to take a beat and see if we can somehow work around the 2 GiB heap addressing limitation. As a strawman, we could consider the idea that pointers are unsigned and zero-extended. Pointers would be loaded from memory using LWU instead of LW, which would hurt code size by a few %. Some address computations would be performed using ADD instead of ADDW, which would improve code size a little bit. Some type-punning operations would result in explicit zero-/sign-extensions, increasing dynamic instruction count somewhat. It would be possible for buggy code to generate pointers greater than 2^32, since ADD[I] and load/store offset addressing don't wrap addresses mod 2^32, but this would correspond to UB anyway. Is something like this workable?
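As a rough illustration of the strawman, the sketch below models how LW and LWU would bring a stored 32-bit pointer into a 64-bit register, and how a plain 64-bit ADD can step past 2^32. The helper names and the sample pointer values are assumptions for illustration, not part of any proposal text.

```python
import struct

MASK64 = (1 << 64) - 1

def lw(memory, offset):
    """Model LW: load 32 bits and sign-extend into a 64-bit register."""
    (value,) = struct.unpack_from("<i", memory, offset)
    return value & MASK64

def lwu(memory, offset):
    """Model LWU: load 32 bits and zero-extend into a 64-bit register."""
    (value,) = struct.unpack_from("<I", memory, offset)
    return value

def add(a, b):
    """Model the 64-bit ADD: no wrap-around at 2**32."""
    return (a + b) & MASK64

# A pointer above 2 GiB stored in memory as a 32-bit little-endian word.
memory = struct.pack("<I", 0x90000000)
signed_ptr = lw(memory, 0)     # 0xffffffff90000000
unsigned_ptr = lwu(memory, 0)  # 0x0000000090000000

# Pointer arithmetic near the top of the 4 GiB range escapes the 32-bit space.
escaped = add(0xFFFFFFF0, 0x20)  # 0x100000010, greater than 2**32

print(hex(signed_ptr), hex(unsigned_ptr), hex(escaped))
```

With zero-extended pointers the LWU result is directly usable as a 4 GiB address, at the cost of the code-size and out-of-range concerns mentioned in the comment.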
Em... There is no C.LWU for LWU, so code size is greatly affected. I made a comparison of Linux kernel code size:
Yes, and compiler work is more complex than it sounded.
I agree.
Theoretically, this is possible. But we must pay the inevitable performance cost and face complex compiler issues. So, we choose the simplest and most effective way to deal with it: Limit the addressing to 2GiB in psabi-spec to minimize the modification of the spec and compiler. Next, please let me give more explanation from the perspective of productization demand: From practice, 2GiB is enough for the embedded scenario. Most small memory devices only need no more than 1GB of physical address (e.g., rv32 Linux only supports a maximum of 1GiB physical memory), so 2GiB address space is enough. RISC-V Linux 64-bit compat mode (UXL=32) only supports 2GiB user-space address space, not 4GiB, because it's enough for the embedded scenarios in practice (e.g., k230/k230d productization). So 2GiB is enough for ILP32 in practice, as well as 64ILP32. In the end, rv64ilp32 is the supplement for rv64lp64. If you need more address space, go to rv64lp64 with a simple replacement. (ISA is the same: RV64*)
Limiting the address space to just 2G may cause Asan to not work well. Asan Back to zero-extension vs. sign-extension, I would prefer sign-extension
And one Linux-specific question for rv64ilp32 with a 2G restriction: Where is
It sounds like we might go down a path of doing the easier, and more code-size-efficient, 2 GiB thing. We could always do the 4 GiB thing later as another ABI if we absolutely need to do so.
Linux user-mode rv32ilp32 only supports 2GiB with rv64lp64 Linux kernel (compat mode) and about 2.4GiB with rv32ilp32 Linux kernel (native mode). So, how did Asan support rv32ilp32, which only has 2~2.4 GiB address space? ps:
Yes, it's the same as user-space rv32ilp32 ABI of rv64lp64 Linux kernel (compat mode with only 2GiB address space).
It has never been supported upstream, so honestly I don't know. Some T-head folks say they will upstream the support, but I haven't seen it yet. [1] https://lf-rise.atlassian.net/wiki/spaces/HOME/pages/8585550/DP_05_001+-+Address+Sanitizer
I am not familiar with the Linux kernel, so I am wondering: has compat mode been upstreamed?
Could you explain that explicitly? I believe not everyone here is familiar with the Linux kernel, including me.
Please add rv64ilp32, rv64ilp32f, rv64ilp32d in the Named ABIs section, and also add a new section `=== RV64ILP32 Calling Convention` right after `=== ILP32E Calling Convention`.
riscv-elf.adoc
Outdated
For the ILP32 ABI on RV64* ISA, the medlow allows the code to address lower 2GiB
of the RV64 address space (`0x0` ~ `0x000000007FFFFFFF`).

NOTE: Limiting the address space to lower 2GiB does not pose any issues with sign
extending addresses into the upper 32 bits of a 64-bit register.
I would prefer to just drop this limitation from the psABI. medlow is not really restricted to the lower 2 GiB if sign-extension is used; the restriction comes more from the OS implementation, which is out of scope for the psABI.
Refer to the pseudo-code in the "Linker Relaxation" chapter. If we keep the 2GiB limitation, we needn't modify any of it.
Since we use sign-extension for pointers, in theory we have the full 4 GiB of memory from the 32-bit address space point of view, even though it looks like the lower 2 GiB and highest 2 GiB from the 64-bit address space point of view.
I still don't think we should add the limitation here; it is more like an implementation limitation and should not be written down on the psABI side.
Let me say that another way: it is fine for Linux user space rv64ilp32 to be limited to the lower 2 GiB, but we should not add that limitation to the psABI spec unless ALL rv64ilp32 scenarios should have it, e.g. Linux kernel space rv64ilp32, FreeBSD user/kernel space rv64ilp32 (of course, this does not exist yet, but once we add this limitation, FreeBSD would need to take it as well), and all RTOSes with rv64ilp32.
I'm not sure how to define the 4GiB address space. Some guys say 0xffffffff80000000-0x7fffffff, but some people say 0x0-0xffffffff. But 0x0-0x7fffffff (lowest 2GiB) is determined.
The rv64ilp32 Linux kernel runs in the 2GiB address range, which doesn't need the 4GiB address space. I used the duplicated page table mapping method to make the "2GiB-4GiB" address space equal to the "-2GiB-0" address space, so sign-extension and zero-extension give the same result. That means the rv64ilp32 Linux kernel follows the rule of the lowest 2GiB address space. Making the compiler take care of sign-extended addresses would cause a performance gap and additional compiler work. As long as the address space is limited to 2GiB, both the Linux kernel and other OS kernels have a way of running.
The sign/zero-extend addressing is the pain point of 64ilp32 compared to 32ilp32 & 64lp64; any solution would pay a cost. So, the simplest solution is to limit addresses to the lowest 2GiB space. Let's start rv64ilp32 from the lowest 2GiB address space and see how to solve the upper 2GiB later. Can we support the upper 2GiB address space in the future?
Best Regards
Guo Ren
You might say you don't know how to define how much space Sv32 has, right? If that's the case, then rv64ilp32 with sign-extension follows the same logic.
The Linux kernel is just one of the users of the psABI spec, so let me reiterate: Linux user space with a lower 2 GiB limitation is fine, but I don't think the spec should take this.
And the psABI should define whether pointers use sign-extension or zero-extension; this should NOT be defined in a vague way. I also support this proposal because I think pointers with sign-extension are a reasonable way for RISC-V, and MIPS N32 goes that way as well, which means this is at least a feasible way.
And last, I am really unhappy that you guys said it's using sign-extension in the psABI meeting but still want to put this issue in a vague way here; seriously, I feel that's kinda cheating.
Some misunderstandings occur here.
There are three proposals about rv64ilp32 addressing:
1. zero-extension addressing
Address range: 0~4GiB
Because most RISC-V 32-bit ALU instructions sign-extend by default, zero-extension addressing would cause more instructions and a performance gap.
(Not recommended)
2. sign-extension addressing
Address range: -2GiB~2GiB
We recommended it in the psABI meeting, but we found it requires significant modifications to the psabi-spec:
For example, there are 15+ places of pseudo-code about address calculation:
5.2 Medium any code model
# Calculate address
lui a0, %hi(symbol)
addi a0, a0, %lo(symbol) -> addiw a0
8.4.6. Program Linkage Table
1:
auipc t2, %pcrel_hi(.got.plt)
sub t1, t1, t3 # shifted .got.plt offset + hdr size + 12 -> subw
l[w|d] t3, %pcrel_lo(1b)(t2) # _dl_runtime_resolve
addi t1, t1, -(hdr size + 12) # shifted .got.plt offset -> addiw
addi t0, t2, %pcrel_lo(1b) # &.got.plt -> addiw
srli t1, t1, log2(16/PTRSIZE) # .got.plt offset
l[w|d] t0, PTRSIZE(t0) # link map
jr t3
3. 0~2GiB addressing limitation
Address range: 0~2GiB
Reasons:
- The motivation for putting 0~2GiB addressing limitation proposal out in this pr is to minimize psabi-spec modification.
- In actual use, 0~2GiB is enough.
(Yes, this is a vague way of zero and sign extension addressing in the psabi-spec, and it permits compilers to choose to use sign or zero extension for their implementation.)
In the end:
We're okay with sign-extension addressing. (-2GiB~2GiB)
If you need a sign-extension solution, we will supplement the modification of the address calculation pseudo code in psabi-spec for rv64ilp32 (Making psabi-spec logically self-consistent).
For sign-extension, we still need to modify this part like this:
The medium low code model, or medlow, allows the code to address the whole RV32 address space or the lower 2 GiB and highest 2 GiB of the RV64 address space (64LP64 ABI: 0xFFFFFFFF7FFFF800 ~ 0xFFFFFFFFFFFFFFFF and 0x0 ~ 0x000000007FFFF7FF) (64ILP32 ABI: 0x0 ~ 0x000000007FFFFFFF and 0xFFFFFFFF80000000 ~ 0xFFFFFFFFFFFFFFFF). By using the lui and load / store instructions, when referring to an object, or addi (64LP64 ABI), or addiw (64ILP32 ABI), when calculating an address literal, for example, a 32-bit address literal can be produced.
The following instructions show how to load a value, store a value, or calculate an address in the
medlow code model.
# Load value from a symbol
lui a0, %hi(symbol)
lw a0, %lo(symbol)(a0)
# Store value to a symbol
lui a0, %hi(symbol)
sw a1, %lo(symbol)(a0)
# Calculate address
lui a0, %hi(symbol)
addi[w] a0, a0, %lo(symbol)
The ranges on RV64 with 64LP64 ABI are not 0x0 ~ 0x000000007FFFFFFF and 0xFFFFFFFF80000000 ~ 0xFFFFFFFFFFFFFFFF due to RISC-V’s sign-extension of immediates; the following code fragments show where the ranges come from:
# Largest positive number:
lui a0, 0x7ffff # a0 = 0x7ffff000
addi a0, a0, 0x7ff # a0 = a0 + 2047 = 0x000000007FFFF7FF
# Smallest negative number:
lui a0, 0x80000 # a0 = 0xffffffff80000000
addi a0, a0, -0x800 # a0 = a0 + -2048 = 0xFFFFFFFF7FFFF800
The ranges on RV64 with 64ILP32 ABI are 0x0 ~ 0x000000007FFFFFFF and 0xFFFFFFFF80000000 ~ 0xFFFFFFFFFFFFFFFF due to RISC-V’s sign-extension of immediates; the following code fragments show where the ranges come from:
# Largest positive number:
lui a0, 0x7ffff # a0 = 0x7ffff000
addiw a0, a0, 0x7ff # a0 = a0 + 2047 = 0x000000007FFFF7FF
# Smallest negative number:
lui a0, 0x80000 # a0 = 0xffffffff80000000
addiw a0, a0, -0x800 # a0 = a0 + -2048 = 0x000000007FFFF800
In the end:
If sign-extension is decided on, all address calculations for rv64 in the psabi-spec must distinguish between the lp64 and ilp32 ABIs.
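The register values in the two range fragments above can be checked with a small model of lui, addi and addiw. This is a sketch under the assumption that registers are 64 bits wide and immediates are sign-extended as in RV64I; the Python helper names merely mirror the instructions.

```python
MASK64 = (1 << 64) - 1

def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a 64-bit register image."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & MASK64

def lui(imm20):
    """lui on RV64: imm20 << 12, sign-extended from 32 bits."""
    return sext(imm20 << 12, 32)

def addi(rs1, imm12):
    """addi on RV64: full 64-bit add of a sign-extended 12-bit immediate."""
    return (rs1 + sext(imm12, 12)) & MASK64

def addiw(rs1, imm12):
    """addiw: 32-bit add, result sign-extended to 64 bits."""
    return sext(rs1 + sext(imm12, 12), 32)

# 64LP64 fragments (addi)
lp64_max_pos = addi(lui(0x7FFFF), 0x7FF)      # 0x000000007FFFF7FF
lp64_min_neg = addi(lui(0x80000), -0x800)     # 0xFFFFFFFF7FFFF800

# 64ILP32 fragments (addiw): the second result wraps back into the low 2 GiB
ilp32_first = addiw(lui(0x7FFFF), 0x7FF)      # 0x000000007FFFF7FF
ilp32_second = addiw(lui(0x80000), -0x800)    # 0x000000007FFFF800
ilp32_top_low = addiw(lui(0x80000), -1)       # 0x000000007FFFFFFF

for value in (lp64_max_pos, lp64_min_neg, ilp32_first, ilp32_second, ilp32_top_low):
    print(f"0x{value:016X}")
```

The addiw results stay within the sign-extended 32-bit range, which is why 0x000000007FFFFFFF becomes reachable only with addiw and why the two ABIs need different range statements.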
Even if you limit to 2 GiB you need to do that already, because 0x000000007FFFFFFF cannot be produced by the normal RV64 code sequences. The only restriction that would let you ignore the problem would be if you only allowed the negative/upper half of the address space to be used. So I don't buy this argument.
You pointed out the fallacy of the 0 - 2GiB limitation, and the correct name should be 0~0x7FFFF7FF (upper half of the address space). Thank you.
@kito-cheng is opposed to "limitation the upper half of the address space":
The psABI should define the pointer is use sign-extension or zero-extension, this should NOT defined in vague way.
So, we abandoned the "limitation of the upper half of the address space" proposal and returned to the sign-extension addressing proposal. These days, we will supplement the modification of the address calculation pseudo code in the psabi-spec for rv64ilp32.
We've updated PR with sign-extend addressing, corrected the naming with RV64ILP32* ABIs, and used "addiw" for address calculation.
If there is any problem, please let me know. Thanks.
Thx for mentioning. @joshua-arch1 has contributed a 64-bit Asan riscv porting, and he will continue the work of a 32-bit Asan riscv port. As far as I know, @joshua-arch1 would use a 2GiB address space layout, which leaves the target program with 1.7GiB of address space. This is also compatible with the 2.4GiB user address space for the rv32ilp32 Linux kernel.
Yes, compat mode has been upstreamed for two years.
vDSO must stay within TASK_SIZE. Yes, it also goes in the lower 2GiB. Here is the patch:
riscv-cc.adoc:
- Add ABIs and ISAs mapping description about RV64ILP32* ABIs on RV64* ISAs.
- Correct C/{Cpp} type sizes and alignments descriptions.
riscv-elf.adoc:
- Add EF_RISCV_RV64ILP32 in e_flags field.
Signed-off-by: Liao Shihua <shihua@iscas.ac.cn>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>

Add abi-rv64ilp32(f)(d)(q) calling convention sections.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Liao Shihua <shihua@iscas.ac.cn>
Signed-off-by: Jia-Wei Chen <jiawei@iscas.ac.cn>
I would suggest that RV64ILP32 continue using addi rather than addiw to minimize the impact on code generation. Otherwise, the code sequence for lui+load/store or auipc+load/store would need to be dropped since the immediate part is just sign-extended, like addi.
The trade-off is that the medlow range would become 0xFFFFFFFF7FFFF800 ~ 0xFFFFFFFFFFFFFFFF and 0x0 ~ 0x000000007FFFF7FF, which is the same as the normal RV64/medlow. We might lose a few addresses for medlow, but we can keep using lui+load/store.
For medany, we should continue using auipc + addi as well, but add a NOTE mentioning that the address space is NOT continuous in the middle. This property will remain even if we use auipc+addiw. The same issue as medlow applies here: we can't use auipc+load/store if we switch to addiw.
I believe this change would still work in most situations, such as under the lower 2G constraint in Linux user-space implementations. This approach is similar to the previous version but makes the 0xFFFFFFFF7FFFF800 ~ 0xFFFFFFFFFFFFFFFF address space usable.
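The mismatch behind this suggestion can be sketched as follows. The model assumes the usual %hi/%lo split (round to the nearest 4 KiB page, signed 12-bit remainder) and that a load adds its sign-extended immediate to the base register in full 64-bit arithmetic; the helper names and the sample symbol address are hypothetical.

```python
MASK64 = (1 << 64) - 1

def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a 64-bit register image."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & MASK64

def hi_lo(symbol):
    """Split a 32-bit address into %hi (20 bits) and a signed %lo (12 bits)."""
    hi = ((symbol + 0x800) >> 12) & 0xFFFFF
    lo = symbol - (hi << 12)
    return hi, lo

def lui(imm20):
    return sext(imm20 << 12, 32)

def addi(rs1, imm12):
    return (rs1 + sext(imm12, 12)) & MASK64

def addiw(rs1, imm12):
    return sext(rs1 + sext(imm12, 12), 32)

def load_effective_address(rs1, imm12):
    """Address used by lw/sw: 64-bit add of base and sign-extended offset."""
    return (rs1 + sext(imm12, 12)) & MASK64

symbol = 0x7FFFF800                 # hypothetical symbol just below 2 GiB
hi, lo = hi_lo(symbol)              # hi = 0x80000, lo = -0x800

via_addiw = addiw(lui(hi), lo)                  # 0x000000007FFFF800
via_addi = addi(lui(hi), lo)                    # 0xFFFFFFFF7FFFF800
via_load = load_effective_address(lui(hi), lo)  # 0xFFFFFFFF7FFFF800

print(hex(via_addiw), hex(via_addi), hex(via_load))
```

In this model lui+addi and lui+load/store always agree on the address, whereas lui+addiw can name a different location than lui+load/store for the same %hi/%lo pair, which is the reason given for staying with addi.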
Once again, I understand your concerns about performance and code size. However, I also want to avoid over-constraining RV64ILP32. I believe this approach addresses concerns from both your perspective and mine.
Lastly, my previous comment may have come across as a bit sharp, and I understand that might not have been your intent. When drafting the spec, it’s normal to leave some room for interpretation, but I’d encourage using consistent terminology and avoiding ambiguity during discussions. This can help prevent misunderstandings.
The address space of RV64ILP32* ABIs is not continuous in the middle for medium any code model.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Liao Shihua <shihua@iscas.ac.cn>
Signed-off-by: Jia-Wei Chen <jiawei@iscas.ac.cn>
This pull request adds a new e_flags X32. It occupies the sixth bit of e_flags layout.
We have initially implemented rv64 ilp32 on the gnu toolchain and kernel.
Details in this link.
@guoren83 @palmer-dabbelt @kito-cheng