[Decompiler] Replace IR #186

Closed
18 of 28 tasks
water111 opened this issue Jan 6, 2021 · 2 comments

Comments


water111 commented Jan 6, 2021

The original IR system is hacky, poorly designed, and has no unit tests, so it is being redesigned. Writing a decompiler is new to me, so the code was very experimental, and I didn't have a good idea of how the whole system would work when I designed it. Now that we know more, it's time to clean up and add tests.

The new system will have two representations. The first is AtomicOps and the second is Forms. The disassembled code will be converted to AtomicOps first, then the AtomicOps will be converted to Forms.

My plan is to keep the old representation around until the new representation is completed. Then we can delete all of IR1.

Atomic Op

The AtomicOp is the smallest possible operation from the point of view of the decompiler's type system and register use system. An AtomicOp can contain small, simple expressions, for example: (set! v1 (+ v0 s1)). But you can't infinitely nest expressions, and there are restrictions on what you can put where. An AtomicOp itself has no "value" - it represents a sequence of MIPS instructions that the original GOAL compiler likely emitted together.
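
To make the "small expression, no nesting" restriction concrete, here is a minimal sketch. It is Python purely for illustration (the decompiler itself is not written in Python), and every name in it (`SimpleExpression`, `SetAtomicOp`, `read_regs`, `write_regs`) is a hypothetical stand-in, not the actual IR2 class layout. It assumes an expression is one operator applied directly to registers or integer constants.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# An atom is a register name or an integer constant; atoms never nest.
Atom = Union[str, int]


@dataclass(frozen=True)
class SimpleExpression:
    """One operator applied directly to atoms (hypothetical name)."""
    op: str
    args: Tuple[Atom, ...]

    def __post_init__(self):
        # Enforce the nesting restriction: every argument must be an atom.
        for arg in self.args:
            if not isinstance(arg, (str, int)):
                raise TypeError(f"argument {arg!r} is not an atom")

    def to_string(self) -> str:
        return "(" + " ".join([self.op, *map(str, self.args)]) + ")"


@dataclass(frozen=True)
class SetAtomicOp:
    """Assigns a simple expression to a register; the op itself has no value."""
    dst: str
    src: SimpleExpression

    def read_regs(self) -> List[str]:
        # Registers are the string atoms; integer constants are skipped.
        return [a for a in self.src.args if isinstance(a, str)]

    def write_regs(self) -> List[str]:
        return [self.dst]

    def to_string(self) -> str:
        return f"(set! {self.dst} {self.src.to_string()})"


op = SetAtomicOp("v1", SimpleExpression("+", ("v0", "s1")))
print(op.to_string())
```

Because the op knows exactly which registers it reads and writes, type and register-use analysis can run over a flat list of these before any nesting is introduced.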

Form

The Form is like a Lisp form. It can optionally have a value, or it can be used for side effects. It can be nested. Some forms represent control flow statements. Designing this part of the IR will be tricky, but it will be important to get right. In particular, it needs to be easy to manipulate, and it needs to be able to look back at the AtomicOps it's made up of to determine the types of registers and which variables they correspond to. Another tricky part is handling "internal substitution" on Forms that never existed as AtomicOps (see get_consumed() for example). It may be worth having some separation between "just a blob of code" and "something that can have internal references modified in expression stacking", like how there is a concept of "set!" vs. "not a set!".
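
One possible shape for this, sketched in Python for illustration only: a nestable form that remembers which AtomicOps it came from, plus a substitution step of the kind expression stacking would need. The names `Form`, `source_ops`, `all_source_ops`, and `substitute` are all hypothetical, AtomicOps are referred to by plain integer index as a simplifying assumption, and this does not model get_consumed() or the real design.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Form:
    """A nestable Lisp-like form (hypothetical layout)."""
    head: str
    args: List[Union["Form", str, int]]
    # Indices of the AtomicOps this form was built from, so register types
    # and variable names can still be looked up after nesting.
    source_ops: List[int] = field(default_factory=list)

    def to_string(self) -> str:
        parts = [a.to_string() if isinstance(a, Form) else str(a) for a in self.args]
        return "(" + " ".join([self.head, *parts]) + ")"

    def all_source_ops(self) -> List[int]:
        # Collect the AtomicOp indices of this form and every nested form.
        ops = set(self.source_ops)
        for arg in self.args:
            if isinstance(arg, Form):
                ops.update(arg.all_source_ops())
        return sorted(ops)


def substitute(form: Form, reg: str, value: Form) -> Form:
    """Return a copy of `form` with each use of register `reg` replaced by `value`."""
    new_args: List[Union[Form, str, int]] = []
    for arg in form.args:
        if isinstance(arg, Form):
            new_args.append(substitute(arg, reg, value))
        elif arg == reg:
            new_args.append(value)
        else:
            new_args.append(arg)
    return Form(form.head, new_args, list(form.source_ops))


# Value of op 0, (set! v1 (+ v0 s1)), and value of op 1, (set! a0 (* v1 2)).
first = Form("+", ["v0", "s1"], source_ops=[0])
second = Form("*", ["v1", 2], source_ops=[1])

# Stack op 0's value into op 1 wherever v1 is read.
stacked = Form("set!", ["a0", substitute(second, "v1", first)], source_ops=[1])
print(stacked.to_string())
print(stacked.all_source_ops())
```

Keeping the AtomicOp indices on each nested form, rather than only on the outermost one, is what lets the stacked result still answer "which op produced this sub-expression?" after substitution.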

@water111

As we develop IR2, we'll keep IR1 around as the default. This might cause some temporary code duplication until we remove IR1, but it will be extremely helpful to be able to switch between the two. It can be toggled here: https://github.com/water111/jak-project/blob/2901f4a99e1ff3152c98ee94ee833ec36fb34182/decompiler/config/jak1_ntsc_black_label.jsonc#L64

@water111

IR2 is now used for everything except generating the all-types file, which is a separate issue.
