Basic RISCV support #1198

citypw · 2018-07-05T04:27:32Z

Hi,

I rebased the previous PR [1] onto the latest code. Please review it.

[1] #1131

aquynh · 2018-07-05T04:33:29Z

awesome, thanks for doing this!

but this still fails on CI now?

citypw · 2018-07-05T04:35:29Z

Yeah, there are some compile errors. I'll try to fix them later.

citypw · 2018-07-05T05:22:48Z

It seems to be working now. Could you please review?

aquynh · 2018-07-05T05:31:38Z

Sure, i will do that.

aquynh · 2018-07-05T13:33:06Z

can you please use tabs for indentation in all C code?

aquynh · 2018-07-05T13:34:05Z

arch/RISCV/RISCVDisassembler.c

+
+static DecodeStatus DecodeFPR32RegisterClass(MCInst *Inst, uint64_t RegNo,
+                                             uint64_t Address,
+                                             const void *Decoder) {


please put the open bracket { of a function on a new line

aquynh · 2018-07-05T13:35:56Z

arch/RISCV/RISCVModule.c

@@ -0,0 +1,49 @@
+/* Capstone Disassembly Engine */
+/* By Nguyen Anh Quynh <aquynh@gmail.com>, 2013-2014 */


this code is not mine, please put your name here (and elsewhere) ;-)

citypw · 2018-07-05T14:14:01Z

Should be fixed now. Plz review it again.

aquynh · 2018-07-05T14:38:09Z

arch/RISCV/RISCVBaseInfo.h

+
+// RISCVII - This namespace holds all of the target specific flags that
+// instruction info tracks. All definitions must match RISCVInstrFormats.td.
+enum


we are following Linux kernel coding style, so please put { after enum, not on the next line

Ok, will do.

aquynh · 2018-07-05T14:40:28Z

arch/RISCV/RISCVBaseInfo.h

+  RISCVFPRndMode_Invalid
+};
+
+inline static StringRef roundingModeToString(RoundingMode RndMode) {


please put open bracket of a function on a new line (next line), not on the same line

looks like this file is not indented with tabs yet?

please double check indentation of all other files, too.

i see that this code is commented out, so perhaps auto-indent did not work.
but i can still see code in some other files not in proper indentation format yet.

Oops, that one was commented so the indent tool didn't work on it. Will fix it manually.

aquynh · 2018-07-05T14:42:52Z

arch/RISCV/RISCVDisassembler.c

+#define GET_SUBTARGETINFO_ENUM
+#include "RISCVGenSubtargetInfo.inc"
+
+static uint64_t


our coding style puts the function type & function name on the same line, and does not break them across 2 lines like this.
please fix this here, and in other places too.

aquynh · 2018-07-05T14:43:41Z

arch/RISCV/RISCVDisassembler.c

+	// instruction set extensions have the option of defining instructions up to
+	// 176 bits wide.
+	*Size = 4;
+	if (code_len < 4)


this { should be on the same line with the condition check, not on the next line.
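The style points raised in this review (tabs, brace placement, return type on the same line as the function name) can be shown together in one place. This is only a minimal sketch: `riscv_len_status` and `riscv_check_len` are hypothetical names made up for illustration and are not taken from the PR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only; not taken from the PR. */
enum riscv_len_status {	/* '{' stays on the same line as 'enum' */
	RISCV_LEN_OK = 0,
	RISCV_LEN_TOO_SHORT,
};

/* Return type and function name share one line; '{' goes on the next line. */
static enum riscv_len_status riscv_check_len(size_t code_len, uint16_t *size)
{
	*size = 4;
	if (code_len < 4) {	/* '{' stays on the same line as the condition */
		*size = 0;
		return RISCV_LEN_TOO_SHORT;
	}

	return RISCV_LEN_OK;
}
```

All indentation in the sketch uses tabs, as requested earlier in the review.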

citypw · 2018-07-05T15:10:52Z

Seems all fixed. Plz review it again. Thanks.

aquynh · 2018-07-06T15:37:04Z

include/capstone/riscv.h

+#define CAPSTONE_RISCV_H
+
+/* Capstone Disassembly Engine */
+/* By Nguyen Anh Quynh <aquynh@gmail.com>, 2013-2014 */


please put your name here, not mine.

aquynh · 2018-07-06T15:42:26Z

include/capstone/riscv.h

+	RISCV_INS_FCLASS_S,
+	RISCV_INS_FCVT_D_L,
+	RISCV_INS_FCVT_D_LU,
+	RISCV_INS_FCVT_D_S,


my impression is that we can map related instructions into one, like RISCV_INS_FCVT_L_D & RISCV_INS_FCVT_L_S into just RISCV_INS_FCVT_L, or even all related ones into RISCV_INS_FCVT.

correct me if i am wrong (i dont know much about RISCV), but this is how we did with other archs, like we map ADDxxx into just ADD on X86.

Is this related to opcode? I have no idea if it could be mapped like x86.

@porto703, what do you think? can we map those opcode to fewer instructions?

If the mapping that you suggest is related to the opcode, then yes, probably instructions with the same opcode can be mapped together. But I am not sure how the mapping is done in x86.

@porto703, the mapping is all in xxxMappingInsn.inc file. for X86, you can grep X86_ADD from X86MappingInsn.inc file, and get these lines:

X86MappingInsn.inc: X86_ADD16i16, X86_INS_ADD,
X86MappingInsn.inc: X86_ADD16mi, X86_INS_ADD,
X86MappingInsn.inc: X86_ADD16mi8, X86_INS_ADD,
X86MappingInsn.inc: X86_ADD16mr, X86_INS_ADD,
X86MappingInsn.inc: X86_ADD16ri, X86_INS_ADD,
X86MappingInsn.inc: X86_ADD16ri8, X86_INS_ADD,
X86MappingInsn.inc: X86_ADD16rm, X86_INS_ADD,
X86MappingInsn.inc: X86_ADD16rr, X86_INS_ADD,
...

this is how we map all X86_ADDxxx to X86_INS_ADD, which is the opcode of all ADD instructions, regardless of operand types.

what do you think, should we do the same thing for RISCV?
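For readers unfamiliar with this kind of table, the idea of collapsing many internal opcode variants into one public instruction ID can be sketched roughly as follows. This is not Capstone's actual code: every `DEMO_*` and `demo_*` name is hypothetical, the table is tiny, and the linear search is a simplification chosen to keep the example short.

```c
#include <stddef.h>

/* Hypothetical internal opcode IDs (one per operand-form variant). */
enum demo_internal_opcode {
	DEMO_ADD16ri = 1,
	DEMO_ADD16rr,
	DEMO_ADD32rr,
	DEMO_SUB32rr,
};

/* Hypothetical public instruction IDs exposed to users. */
enum demo_public_insn {
	DEMO_INS_INVALID = 0,
	DEMO_INS_ADD,
	DEMO_INS_SUB,
};

struct demo_insn_map_entry {
	unsigned int internal_id;
	unsigned int public_id;
};

/* Several internal variants share one public ID. */
static const struct demo_insn_map_entry demo_insn_map[] = {
	{ DEMO_ADD16ri, DEMO_INS_ADD },
	{ DEMO_ADD16rr, DEMO_INS_ADD },
	{ DEMO_ADD32rr, DEMO_INS_ADD },
	{ DEMO_SUB32rr, DEMO_INS_SUB },
};

/* Look up the public ID for an internal opcode; 0 means "not found". */
static unsigned int demo_map_insn(unsigned int internal_id)
{
	size_t i;

	for (i = 0; i < sizeof(demo_insn_map) / sizeof(demo_insn_map[0]); i++) {
		if (demo_insn_map[i].internal_id == internal_id)
			return demo_insn_map[i].public_id;
	}

	return DEMO_INS_INVALID;
}
```

With such a table, the question for RISCV is simply which internal variants should share a public ID, for example whether the FCVT variants listed above get one entry each or point at a common ID.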

So it seems they are mapped based on the opcode. In RISCV several instructions can be mapped into the same opcode group, but still there are other fields that are used to select the type of operation within the same opcode group. I didn't have the chance to look further into capstone to understand how the mapping was being used, and if this may fit for RISCV. So that is why I didn't include a more refined mapping into the first version that I worked on.
But at first glance, it looks to me that a similar mapping may be done based on the opcode.
Still, I would like to hear @citypw's opinion on this.

Well, I checked both the x86 and RISC-V manuals a bit. The two seem very similar, in that instructions are mapped only to a specific subgroup of an opcode. I only tested the "add*" instructions:

citypw@ac4cc0f
citypw@d8d8adb

The results of the regression test cases are the same. I don't know. Maybe we should work toward cutting the mapped instructions down to fewer ones?

In RISC-V this is not possible: the instruction itself is the only thing that tells the CPU whether an operation is 32-bit or 64-bit.
For example, lw and ld can't be merged into a single load, because the w/d suffix is the only thing that distinguishes them.
Unlike X86, where we can distinguish 32-bit from 64-bit by the register name, RISC-V only has
Xn registers.

aquynh · 2018-07-06T16:29:38Z

What matters is whether the same code can be interpreted differently depending on the mode. If not, then the modes you mentioned make no difference. For example, X86 has 3 modes (16, 32 & 64), each with a different encoding, and Arm has Arm and Thumb modes.

citypw · 2018-07-06T16:36:34Z

AFAIK, there's no separate hardware mode. This point I'll need to confirm.

neuschaefer · 2018-07-06T16:37:28Z

RISC-V also has instructions that take up two instead of four bytes (RVC), but unlike Thumb, they don't require a mode switch. They can be executed alongside four-byte instructions. (The only requirement is that the processor supports RVC.)

aquynh · 2018-07-06T16:41:14Z

Then how can we tell the next instruction is 2 bytes, or 4 bytes?

neuschaefer · 2018-07-06T16:43:17Z

It's encoded in the lower two bits (instructions are encoded in little-endian)
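As a rough illustration of that check: in the RISC-V base encoding, an instruction whose two lowest bits are not both 1 is a 16-bit (RVC) instruction, and one whose lowest bits are `11` but whose bits [4:2] are not `111` is a standard 32-bit instruction. The helper below is a sketch only; `riscv_insn_length` is a hypothetical name, not a function from this PR, and longer encodings (48 bits and up) are deliberately left unhandled.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the length in bytes of the instruction starting at code[0],
 * or 0 if the buffer is too short or the encoding is longer than
 * 32 bits (not handled in this sketch).
 */
static size_t riscv_insn_length(const uint8_t *code, size_t code_len)
{
	if (code_len < 2)
		return 0;

	/* Little-endian: the lowest bits of the instruction are in code[0]. */
	if ((code[0] & 0x03) != 0x03)
		return 2;	/* compressed (RVC) instruction */

	if ((code[0] & 0x1c) != 0x1c)
		return code_len >= 4 ? 4 : 0;	/* standard 32-bit instruction */

	return 0;	/* 48-bit or longer encodings */
}
```

Because only the first byte is inspected, no mode switch is needed: the decoder can make this decision at every instruction boundary.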

aquynh · 2018-07-06T16:46:00Z

If so we only have 1 mode (i.e one encoding scheme), thus we dont need to support cs_option()

neuschaefer · 2018-07-12T04:06:30Z

include/capstone/capstone.h

@@ -257,6 +258,7 @@ typedef struct cs_opt_skipdata {
 	// X86:     1 bytes.
 	// XCore:   2 bytes.
 	// EVM:     1 bytes.
+	// RISCV:   4 bytes.


2 bytes might be more appropriate, because of RVC

neuschaefer · 2018-07-12T04:10:21Z

RISC-V has 32-bit and 64-bit (and 128-bit) variants/modes though, which determine which instructions are valid. In some cases, such as C.JAL and C.ADDIW, the same bytes can disassemble to completely unrelated instructions, depending on the variant.

(NOTE: I don't think the code in this pull request supports RVC, and thus the C.* instructions above, yet)
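To make the C.JAL / C.ADDIW case concrete: both share the same compressed encoding slot (quadrant `01`, funct3 `001`), and only the base ISA width decides which one it is. The snippet below is a sketch under that assumption; the `demo_*` / `DEMO_*` names are hypothetical, operand decoding is omitted, and reserved encodings (such as C.ADDIW with rd = x0) are not checked.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum demo_xlen {
	DEMO_XLEN_32,
	DEMO_XLEN_64,
};

enum demo_rvc_insn {
	DEMO_RVC_OTHER = 0,
	DEMO_RVC_C_JAL,
	DEMO_RVC_C_ADDIW,
};

/* Classify a 16-bit instruction that may fall in quadrant 1, funct3 = 001. */
static enum demo_rvc_insn demo_rvc_classify(uint16_t insn, enum demo_xlen xlen)
{
	unsigned int quadrant = insn & 0x3;	/* bits [1:0] */
	unsigned int funct3 = (insn >> 13) & 0x7;	/* bits [15:13] */

	if (quadrant != 0x1 || funct3 != 0x1)
		return DEMO_RVC_OTHER;

	/* Same bits, different instruction depending on the variant. */
	return xlen == DEMO_XLEN_32 ? DEMO_RVC_C_JAL : DEMO_RVC_C_ADDIW;
}
```

For example, the halfword 0x2501 would be classified as C.JAL with `DEMO_XLEN_32` and as C.ADDIW with `DEMO_XLEN_64`, which is why a disassembler needs to know the variant up front.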

aquynh · 2018-07-14T15:45:28Z

@neuschaefer, this sounds like RISCV has the option to be initialized in 32-bit & 64-bit mode then.

XVilka · 2018-09-13T08:29:00Z

Ping? @aquynh @citypw

etherealvisage · 2018-10-26T18:45:17Z

I'm very interested in this pull request, as I'm planning on adding RISC-V support to a project that's already using capstone for x86_64 and aarch64 disassembly. Is there anything that I can do to help here, besides perhaps putting it through some more testing?

aquynh · 2018-10-27T02:21:03Z

this looks pretty good to me, but there are some open questions regarding mapping related instructions into smaller set, as discussed in this thread. let me know if you can contribute towards that.

XVilka · 2018-12-14T03:21:21Z

This one looks pretty good for inclusion into 4.0 version too.

radare · 2019-01-22T09:08:38Z

lots of conflicts, probably because of the changes in branch names, can you please retarget the PR for the 4.1 branch?

XVilka · 2019-02-05T12:10:29Z

Ping?

radare · 2019-02-15T12:40:24Z

peng?

citypw · 2019-02-15T14:37:16Z

@fanfuqiang any update?

fanfuqiang · 2019-02-15T17:09:03Z

The RISC-V port of LLVM is changing fast, especially in recent months. RISC-V 32 is stable for now.
I will make a PR based on @citypw's work and upstream LLVM RISC-V, along with the LLVM TableGen patches, in the coming days.

aquynh · 2019-02-15T17:50:39Z

In general this PR looks quite nice, just some concerns raised without feedback yet.

Please target the next branch for future PR. We hope to have this ready for v5.

radare · 2019-06-23T09:34:09Z

Isnt this merged already?

aquynh · 2019-06-23T09:54:24Z

Yes

porto703 and others added 12 commits July 5, 2018 11:40

Added RISCV dir to contain the RISCV architecture engine code. Adding the TableGen files generated from llvm-tblgen. Add Disassembler.h

c800e62

Started working on RISCVDisassembler.c - RISCV_init(), RISCVDisassembler_getInstruction, and RISCV_getInstruction

a41e6fe

Added all functions to RISCVDisassembler.c and needed modifications to RISCVGenDisassemblerTables.inc. Add and modified RISCVGenSubtargetInfo.inc. Start creation of RISCVInstPrinter.h

e52484a

Finished RISCVGenAsmWriter.inc. Finished RISCVGenRegisterInfo.inc. Minor fixes to RISCVDisassembler.c. Working on RISCVInstPrinter

5e14573

Finished RISCVInstPrinter, RISCVMapping, RISCVBaseInfo, RISCVGenInstrInfo.inc, RISCVModule.c. Working on riscv.h

29cf04b

Backport it from: porto703@0db412c

38dc6e0

All RISCV files added. Compiled correctly and initial test for ADD, ADDI, AND works properly.

a01b64e

Add refactored cs.c for RISCV

0256bdf

Testing all I instructions in test_riscv.c

fedef9a

Modify the orignal backport for RISCVGenRegisterInfo.inc, capstone.h and test_iter to work w/ the current code strcuture

d851054

Fix issue with RISCVGenRegisterInfo.inc - RISCVRegDesc[] (Excess elements in struct initializer). Added RISCV tests to test_iter.c

f766f20

fixed bug related to incorrect initialization of memory after malloc

59a3ef8

fix compile bug

b5a341c

Shawn Chang added 2 commits July 5, 2018 13:08

Fix compile errors.

af4fa0c

move riscv.h to include/capstone

62fe0cc

aquynh reviewed Jul 5, 2018

fix indentation issues

5895bcb

aquynh reviewed Jul 5, 2018

fix coding style issues

2c8fdbb

aquynh reviewed Jul 6, 2018

Fix code sytle

dda303a

remove cs_mode support for RISCV

f18781a

neuschaefer reviewed Jul 12, 2018

aquynh mentioned this pull request Jul 18, 2018

Adding support for RISCV32I architecture #1131

Closed

aquynh mentioned this pull request Dec 20, 2018

Plan for future development of Capstone #1319

Closed

fanfuqiang mentioned this pull request Feb 27, 2019

RISCV support ISRV32/ISRV64 #1401

Merged

aquynh closed this Jun 23, 2019

Milo-D mentioned this pull request Jan 6, 2021

AVR Architecture Support #1716

Open
