Some improvements to optimizer documentation #11360

Merged
merged 1 commit into from
May 19, 2021

Conversation

maurelian
Contributor

I read about halfway, and made some small tweaks to improve readability.

Will try to do the rest sometime.

@@ -65,20 +66,23 @@ A "runs" parameter of "1" will produce short but expensive code. The largest val
Opcode-Based Optimizer Module
=============================

- The opcode-based optimizer module operates on assembly. It splits the
+ The opcode-based optimizer module operates on assembly code. It splits the
Contributor Author

This paragraph was a big block. I broke it up into what seemed to me like reasonable sub paragraphs.

Modifications to storage and memory locations have to erase knowledge about
is a more complex expression which we know always evaluates to one.

Modifications to storage and memory locations must erase knowledge about
Contributor Author

I really struggled to understand this paragraph. I think it's saying something about a function that takes arguments x and y and does SSTORE(x, VAL1) and SSTORE(y, VAL2), but then I'm not sure what the optimizer does about that.

Member

@hrkrshnn hrkrshnn May 10, 2021

The optimizer tries to track memory and storage symbolically. One application of this is evaluating Keccak-256 at compile time. For example, consider the sequence:

 [PUSH 32, PUSH 0, CALLDATALOAD, PUSH 100,  DUP2, MSTORE, KECCAK256]

or the equivalent Yul

let x := calldataload(0)
mstore(x, 100)
let value := keccak256(x, 32)
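
The equivalence between the opcode sequence and the Yul snippet can be checked by tracing the stack by hand. The following is a minimal Python sketch of such a trace, not code from the compiler. The opcode semantics follow the EVM (MSTORE pops the offset and then the value, KECCAK256 pops the offset and then the length); the tuple encoding of instructions and the `trace` helper are assumptions made for illustration.

```python
# Tiny symbolic stack machine, just enough to trace the sequence above.
# The instruction encoding and the helper name are hypothetical.

def trace(program):
    stack, memory, hashes = [], {}, []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "CALLDATALOAD":
            offset = stack.pop()
            # The loaded value is unknown at compile time, so keep it symbolic.
            stack.append(("calldataload", offset))
        elif op == "DUP2":
            stack.append(stack[-2])
        elif op == "MSTORE":
            offset, value = stack.pop(), stack.pop()
            memory[offset] = value
        elif op == "KECCAK256":
            offset, length = stack.pop(), stack.pop()
            hashes.append((offset, length))
            stack.append(("keccak256", offset, length))
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack, memory, hashes


program = [
    ("PUSH", 32), ("PUSH", 0), ("CALLDATALOAD",), ("PUSH", 100),
    ("DUP2",), ("MSTORE",), ("KECCAK256",),
]
stack, memory, hashes = trace(program)
```

After the run, `memory` maps the symbolic location `calldataload(0)` to 100 and the single hash request covers 32 bytes starting at that same location, which matches `mstore(x, 100)` followed by `keccak256(x, 32)`.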

In this case, the optimizer has to 'symbolically' track the value at the memory location calldataload(0) and then realize that the Keccak-256 hash can be evaluated at compile time. This only works if no other instruction modifies memory between the mstore and the keccak256. So if an instruction writes to memory (or storage), we need to erase the knowledge of the current memory (or storage). There is, however, an exception to this erasing: when we can easily see that the instruction doesn't write to a certain location, the knowledge about that location can be kept.

For example,

let x := calldataload(0)
mstore(x, 100)
/// Current knowledge memory location x -> 100
let y := add(x, 32)
/// Does not clear the knowledge that x -> 100, since y does not write to [x, x + 32)
mstore(y, 200)

I admit that the current paragraph is easy to understand if you already know how it works and rather confusing otherwise. Do you have a suggestion on rewording it?
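
For readers who prefer code, the erase-unless-provably-disjoint rule for memory described in this comment can be sketched in Python. This is only an illustration under stated assumptions, not the compiler's implementation: a symbolic location is modelled as a hypothetical `(base, offset)` pair, two locations with different bases are treated as possibly overlapping, and `MemoryKnowledge` and `may_overlap` are names invented for the sketch.

```python
WORD = 32        # an mstore writes 32 bytes
MOD = 2**256     # EVM arithmetic wraps around at 2**256


def may_overlap(loc_a, loc_b):
    """Return True unless the two 32-byte words are provably disjoint."""
    base_a, off_a = loc_a
    base_b, off_b = loc_b
    if base_a != base_b:
        return True  # difference unknown, so be conservative
    diff = (off_a - off_b) % MOD
    # Disjoint only if the words are at least 32 bytes apart in both directions.
    return not (WORD <= diff <= MOD - WORD)


class MemoryKnowledge:
    def __init__(self):
        self.known = {}  # symbolic location -> known value

    def mstore(self, loc, value):
        # Forget every location the write might touch, then record the write.
        self.known = {
            m: v for m, v in self.known.items() if not may_overlap(loc, m)
        }
        self.known[loc] = value


mem = MemoryKnowledge()
x = ("calldataload(0)", 0)
y = ("calldataload(0)", 32)   # y := add(x, 32)
mem.mstore(x, 100)
mem.mstore(y, 200)
kept = dict(mem.known)        # knowledge about x survives the write to y

mem.mstore(("calldataload(0)", 16), 7)   # straddles both x and y
after_overlap = dict(mem.known)
```

The write to `y` keeps `x -> 100` because the two words are exactly 32 bytes apart, while the write at offset 16 overlaps both words and erases what was known about them.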

Member

@hrkrshnn hrkrshnn left a comment

Thanks for the PR!

docs/internals/optimizer.rst (outdated)
When it comes to the ASM output, one can also notice reduction of equivalent/duplicate
"code blocks" (compare the output of the flags ``--asm`` and ``--asm --optimize``). However,
Generally, the most visible difference is that constant expressions are evaluated at compile time.
When it comes to the ASM output, one can also notice a reduction of redundant
Member

Redundant might be too strong here. I think this line is referring to the Block Deduplicator. This optimizer step looks for code blocks that are the same, removes all except one, and replaces jumps to the removed blocks with jumps to the one block that remains.
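
The deduplication idea described in this comment can be sketched in a few lines of Python. This is a simplified illustration, not the compiler's actual Block Deduplicator: the block representation (a dict from label to a tuple of instruction tuples, with jumps written as `("JUMP", label)`) and the function name `deduplicate_blocks` are assumptions made for the example.

```python
def deduplicate_blocks(blocks):
    """Merge identical blocks and redirect jumps to the surviving copy."""
    blocks = dict(blocks)
    while True:
        canonical = {}  # block body -> first label seen with that body
        replaced = {}   # removed label -> surviving label
        for label, body in blocks.items():
            if body in canonical:
                replaced[label] = canonical[body]
            else:
                canonical[body] = label
        if not replaced:
            return blocks
        # Drop the duplicates and rewrite jump targets; repeat, because
        # redirected jumps can make further blocks identical.
        blocks = {
            label: tuple(
                ("JUMP", replaced.get(ins[1], ins[1])) if ins[0] == "JUMP" else ins
                for ins in body
            )
            for label, body in blocks.items()
            if label not in replaced
        }


blocks = {
    "a": (("PUSH", 1), ("JUMP", "c")),
    "b": (("PUSH", 1), ("JUMP", "d")),
    "c": (("STOP",),),
    "d": (("STOP",),),
}
result = deduplicate_blocks(blocks)
```

In this input, `d` duplicates `c`; once the jump in `b` is redirected to `c`, block `b` becomes identical to `a` and is removed in the next round, leaving only `a` and `c`.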

@hrkrshnn hrkrshnn force-pushed the patch-2 branch 2 times, most recently from efdf5fc to 65f5fa5 Compare May 18, 2021 12:08
write to a symbolic storage location ``x`` and then to the symbolic storage location ``y``. In
general, ``y`` could overwrite ``x`` if ``y`` and ``x`` are the same storage slot. However, if the
optimizer can infer that the value of the expression ``x - y`` is non-zero, this means that the
knowledge about storage at the symbolic location ``x`` can be kept.
Member

Updated this. @leonardoalt Perhaps a review of at least this part?

Contributor

Hmm I don't understand how it arrives at the implication that knowledge can be kept if the expression results in non-zero...
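
One way to read the implication, offered here as a hedged sketch rather than the maintainers' wording: two storage slots are the same slot exactly when their difference is zero, so if `x - y` is provably non-zero, a write to `y` cannot touch slot `x`. The Python below illustrates that reading; the `(base, offset)` model of a symbolic slot and the helper names are hypothetical.

```python
MOD = 2**256  # storage keys are 256-bit words


def provably_nonzero_difference(slot_a, slot_b):
    """True only when slot_a - slot_b is known to be non-zero."""
    base_a, off_a = slot_a
    base_b, off_b = slot_b
    # With different symbolic bases the difference is unknown, so no proof.
    return base_a == base_b and (off_a - off_b) % MOD != 0


def sstore(known, slot, value):
    """Return the knowledge that remains valid after writing value to slot."""
    kept = {
        s: v for s, v in known.items() if provably_nonzero_difference(s, slot)
    }
    kept[slot] = value
    return kept


x = ("calldataload(0)", 0)
y = ("calldataload(0)", 1)    # y := add(x, 1), so x - y is never zero
z = ("calldataload(32)", 0)   # unrelated symbolic slot, could equal x or y

known = sstore({}, x, 1)
known = sstore(known, y, 2)
after_y = dict(known)         # x is still known
after_z = sstore(known, z, 3) # z might alias x or y, so both are forgotten
```

Unlike memory, no 32-byte distance is needed here, because each storage slot is addressed as a whole word.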

@hrkrshnn hrkrshnn force-pushed the patch-2 branch 7 times, most recently from b794c8c to ca29092 Compare May 18, 2021 14:24
docs/internals/optimizer.rst (outdated)
mstore(x, 100)
let value := keccak256(x, 32)

In this case, the optimizer has to 'symbolically' track the value at a memory location
Member

Suggested change
- In this case, the optimizer has to 'symbolically' track the value at a memory location
+ In this case, the optimizer has to symbolically track the value at a memory location

(same sentence but no quotes above)

let value := keccak256(x, 32)

In this case, the optimizer has to 'symbolically' track the value at a memory location
``calldataload(0)`` and then realize that the Keccak-256 hash can be evaluated at compile time. This
Member

Hmm these last 2 sentences basically repeat the sentence before the example code.

docs/internals/optimizer.rst (outdated)
knowledge about storage or memory locations which may be equal to ``l``. More specifically, for
storage, the optimizer has to erase all knowledge of symbolic locations, that may be equal to ``l``
and for memory, the optimizer has to erase all knowledge of symbolic locations that may not be at
least 32 bytes away. This is done by computing the value ``sub(l, m)``. For storage, if this value
Member

What's m?

let value := keccak256(x, 32)

In this case, the optimizer tracks the value at a memory location ``calldataload(0)`` and then
realize that the Keccak-256 hash can be evaluated at compile time. This only works if there is no
Member

Suggested change
- realize that the Keccak-256 hash can be evaluated at compile time. This only works if there is no
+ realizes that the Keccak-256 hash can be evaluated at compile time. This only works if there is no

Member

Done!

Co-authored-by: Harikrishnan Mulackal <webmail.hari@gmail.com>
@hrkrshnn hrkrshnn enabled auto-merge May 19, 2021 11:14
@hrkrshnn hrkrshnn merged commit d07c85d into ethereum:develop May 19, 2021
@maurelian maurelian deleted the patch-2 branch September 7, 2021 14:32
4 participants