The original IR system was hacky and poorly designed, and it has no unit tests, so it is being redesigned. Writing a decompiler was new to me, so the code was very experimental, and I didn't have a good idea of how the whole system would work when I designed it. Now that we know more, it's time to clean up and add tests.
The new system will have two representations. The first is AtomicOps and the second is Forms. The disassembled code will be converted to AtomicOps first, then the AtomicOps will be converted to Forms.
My plan is to keep the old representation around until the new representation is completed. Then we can delete all of IR1.
Atomic Op
The AtomicOp is the smallest possible operation from the point of view of the decompiler's type system and register use system. An AtomicOp can contain small, simple expressions, for example (set! v1 (+ v0 s1)). But you can't infinitely nest expressions, and there are restrictions on what you can put where. An AtomicOp itself has no "value"; it represents a sequence of MIPS instructions that the original GOAL compiler likely emitted together.
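The "small expressions, but no infinite nesting" rule can be sketched roughly as below. This is only an illustration under assumptions: `Atom`, `SimpleExpr`, and `SetRegOp` are hypothetical names made up for this example and are not the decompiler's actual classes. The idea is that an expression's arguments are atoms rather than expressions, so nesting is ruled out by the types themselves.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: these names are illustrative only, not the real IR2 classes.

// A leaf value such as a register. It cannot contain another expression.
struct Atom {
  std::string name;  // e.g. "v0"
};

// One operator applied directly to atoms. Because the arguments are Atoms
// and not SimpleExprs, deeper nesting is impossible by construction.
struct SimpleExpr {
  std::string op;          // e.g. "+"; empty means "just the first atom"
  std::vector<Atom> args;

  std::string to_string() const {
    assert(!args.empty());
    if (op.empty()) {
      return args.front().name;
    }
    std::string out = "(" + op;
    for (const auto& a : args) {
      out += " " + a.name;
    }
    return out + ")";
  }
};

// One kind of atomic op: set a destination register to a simple expression.
// The op itself has no value; it only records what it reads and writes.
struct SetRegOp {
  Atom dst;
  SimpleExpr src;

  std::string to_string() const {
    return "(set! " + dst.name + " " + src.to_string() + ")";
  }

  // Registers read by this op (assumes every argument is a register).
  std::vector<std::string> read_regs() const {
    std::vector<std::string> regs;
    for (const auto& a : src.args) {
      regs.push_back(a.name);
    }
    return regs;
  }

  // Registers written by this op.
  std::vector<std::string> write_regs() const { return {dst.name}; }
};
```

Building `SetRegOp{Atom{"v1"}, SimpleExpr{"+", {Atom{"v0"}, Atom{"s1"}}}}` and calling `to_string()` reproduces the example above, and the read/write lists are the kind of information a register usage pass would need.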
Form
The Form is like a Lisp form. It can optionally have a value, or it can be used for side effects. It can be nested. Some forms represent control flow statements. Designing this part of the IR will be tricky, but it will be important to get right. In particular, it needs to be easy to manipulate, and it needs to be able to look back at the AtomicOps it is made up of to determine the types of registers and which variables they correspond to. Another tricky part is handling "internal substitution" on Forms that never existed as an AtomicOp (see get_consumed() for example). It may be worth having some separation between "just a blob of code" and "something that can have internal references modified in expression stacking". This is similar to the existing distinction between "set!" and "not a set!".
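One possible shape for a nestable form that can still "look back" at the ops it came from is sketched below. Everything here is an assumption for illustration: `FormNode`, its `source_op` field, and `collect_source_ops` are hypothetical and do not describe the real Form or FormElement design. A node that was synthesized during expression building, and so never existed as an AtomicOp, is marked with -1.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: not the real Form/FormElement classes.
struct FormNode {
  std::string head;                // operator name, or the text of a leaf
  std::vector<FormNode> children;  // nested forms; empty for a leaf
  int source_op = -1;              // index of the originating atomic op, -1 if none

  // Print as a Lisp-style form.
  std::string to_string() const {
    if (children.empty()) {
      return head;
    }
    std::string out = "(" + head;
    for (const auto& c : children) {
      out += " " + c.to_string();
    }
    return out + ")";
  }

  // Gather, in pre-order, the indices of all atomic ops this form was built from.
  void collect_source_ops(std::vector<int>& out) const {
    if (source_op >= 0) {
      out.push_back(source_op);
    }
    for (const auto& c : children) {
      c.collect_source_ops(out);
    }
  }
};
```

With this layout, nesting `(+ v0 s1)` (from op 0) inside a multiply (from op 1) still lets a later pass recover both op indices, which is the kind of back-reference needed to look up register types and variable names.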
Atomic Op tasks
- AtomicOp: [Decompiler - New IR] Add AtomicOp #181
- Instruction to AtomicOps (replacement of BasicOpBuilder): [Decompiler] Write IR2 to file and implement some Atomic Op conversions #187
- AtomicOp conversions
- AtomicOp: [Decompiler] Implement IR2 Type Analysis Pass #193
- AtomicOp: [Decompiler] add IR2 register usage pass #194
Form tasks
- Form and FormElement: [Decompiler] Begin ir2 form implementation #197
- (std::vector<BaseForm>) and special case size 1?
- AtomicOp to Form: [Decompiler] Begin ir2 form implementation #197
- CfgVtx to Form: [Decompiler] Begin ir2 form implementation #197
- top_level->to_form() at this point: [Decompiler] Test framework for decompiler regression tests and gcommon tests #200
- Form as needed for expression building: partially in [Decompiler] Get used variables, handle function calls better, and minor cleanup #205
- Form stacking algorithm: [Decompiler] Expression Building #211 [Decompiler] Expressions (Part 3) #213
- Make "register form" from sequence (not sure what to name this? it's a form that has the side effects of a sequence but "evaluates to" a register) [Decompiler] Expressions (Part 3) #213