Some improvements to optimizer documentation #11360
@@ -65,20 +66,23 @@ A "runs" parameter of "1" will produce short but expensive code. The largest val
 Opcode-Based Optimizer Module
 =============================

-The opcode-based optimizer module operates on assembly. It splits the
+The opcode-based optimizer module operates on assembly code. It splits the
This paragraph was a big block. I broke it up into what seemed to me like reasonable sub paragraphs.
docs/internals/optimizer.rst
Outdated
Modifications to storage and memory locations have to erase knowledge about
is a more complex expression which we know always evaluates to one.

Modifications to storage and memory locations must erase knowledge about
I really struggled to understand this paragraph. I think it's saying something about a function that takes arguments x and y, and does `SSTORE(X, VAL1)` and `SSTORE(Y, VAL2)`, but then I'm not sure what the optimizer does about that.
The optimizer tries to symbolically track memory and storage. As an application, this is how Keccak-256 may be evaluated during compile time. For example, consider the sequence:
[PUSH 32, PUSH 0, CALLDATALOAD, PUSH 100, DUP2, MSTORE, KECCAK256]
or the equivalent Yul
let x := calldataload(0)
mstore(x, 100)
let value := keccak256(x, 32)
In this case, the optimizer has to 'symbolically' track the value at the memory location `calldataload(0)` and then realize that the Keccak-256 hash can be evaluated at compile time. This only works if there is no other instruction that modifies memory in between the `mstore` and `keccak256`. So if there is an instruction that writes to memory (or storage), then we need to erase the knowledge of the current memory (or storage). There is, however, an exception to this erasing, where we can easily see that the instruction doesn't write to a certain location.
For example,
let x := calldataload(0)
mstore(x, 100)
/// Current knowledge: memory location x -> 100
let y := add(x, 32)
/// Does not clear the knowledge that x -> 100, since the write to y does not touch [x, x + 32)
mstore(y, 200)
I admit that the current paragraph is easy to understand if you already know how it works and rather confusing otherwise. Do you have a suggestion on rewording it?
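A minimal Python sketch of the erasing rule described in this comment, not the compiler's actual implementation. It assumes a hypothetical `Loc` type (a named symbolic base plus a constant byte offset) and a hypothetical `MemoryKnowledge` class, and it ignores wrap-around of offsets. A write keeps only those entries that are provably at least 32 bytes away from it; everything else is erased.

```python
from dataclasses import dataclass

WORD = 32  # an mstore writes one 32-byte word


@dataclass(frozen=True)
class Loc:
    """Hypothetical symbolic location: a named base plus a constant byte offset."""
    base: str
    offset: int = 0


def provably_disjoint(a: Loc, b: Loc) -> bool:
    # Only locations that share a base can be compared; otherwise nothing is known.
    return a.base == b.base and abs(a.offset - b.offset) >= WORD


class MemoryKnowledge:
    def __init__(self):
        self.known = {}  # Loc -> value last stored there

    def mstore(self, loc: Loc, value: int) -> None:
        # Keep only entries the write provably cannot overlap, then record the write.
        self.known = {
            other: v for other, v in self.known.items()
            if provably_disjoint(other, loc)
        }
        self.known[loc] = value


x = Loc("calldataload(0)")
mem = MemoryKnowledge()
mem.mstore(x, 100)
mem.mstore(Loc("calldataload(0)", 32), 200)  # y := add(x, 32): x -> 100 survives
kept = dict(mem.known)
mem.mstore(Loc("calldataload(32)"), 7)       # unrelated base may alias x: erase
after_unknown = dict(mem.known)
```

After the second `mstore`, `kept` still maps `x` to 100, so a following `keccak256(x, 32)` could in principle be evaluated at compile time. After the third write, only the newly written location is known.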
Thanks for the PR!
docs/internals/optimizer.rst
Outdated
When it comes to the ASM output, one can also notice reduction of equivalent/duplicate
"code blocks" (compare the output of the flags ``--asm`` and ``--asm --optimize``). However,
Generally, the most visible difference is that constant expressions are evaluated at compile time.
When it comes to the ASM output, one can also notice a reduction of redundant
"Redundant" might be too strong here. I think this line is referring to the Block Deduplicator. This optimizer step looks for code blocks that are the same, removes all but one of them, and redirects jumps that targeted the removed blocks to the one block that remains.
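A toy Python model of the deduplication idea in this comment. The representation is an assumption made for illustration (a dict from hypothetical labels to tuples of instructions, with jumps written as `("JUMP", label)`); the real step works on the compiler's assembly items. The loop repeats because redirecting jumps can make further blocks identical.

```python
def deduplicate_blocks(blocks):
    """blocks: dict label -> tuple of instructions; ("JUMP", label) targets a block."""
    blocks = dict(blocks)
    while True:
        first_seen = {}  # block body -> label that is kept
        replaced = {}    # removed label -> kept label
        for label, body in blocks.items():
            if body in first_seen:
                replaced[label] = first_seen[body]
            else:
                first_seen[body] = label
        if not replaced:
            return blocks

        def retarget(ins):
            # Point jumps at the surviving copy of a removed block.
            if isinstance(ins, tuple) and ins[0] == "JUMP":
                return ("JUMP", replaced.get(ins[1], ins[1]))
            return ins

        blocks = {
            label: tuple(retarget(i) for i in body)
            for label, body in blocks.items()
            if label not in replaced
        }


program = {
    "a": ("PUSH 1", ("JUMP", "c")),
    "b": ("PUSH 1", ("JUMP", "d")),
    "c": ("STOP",),
    "d": ("STOP",),
}
result = deduplicate_blocks(program)
```

Here `d` is first merged into `c`; once the jump in `b` is redirected to `c`, `b` becomes identical to `a` and is removed in the next round.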
efdf5fc to 65f5fa5
docs/internals/optimizer.rst
Outdated
write to a symbolic storage location ``x`` and then to the symbolic storage location ``y``. In
general, ``y`` could overwrite ``x`` if ``y`` and ``x`` are the same storage slot. However, if the
optimizer can infer that the value of the expression ``x - y`` is non-zero, this means that the
knowledge about storage at the symbolic location ``x`` can be kept.
Updated this. @leonardoalt Perhaps a review of at least this part?
Hmm I don't understand how it arrives at the implication that knowledge can be kept if the expression results in non-zero...
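One way to read the implication: if ``x - y`` is provably non-zero, then ``x`` and ``y`` cannot be the same slot, so a write to ``y`` cannot change what is stored at ``x``. The Python sketch below illustrates that reasoning under simplifying assumptions. Slots are hypothetical `(base, offset)` pairs, and a difference is only known when two slots share a base; it is not the compiler's actual data structure.

```python
from typing import Optional


def constant_difference(x, y) -> Optional[int]:
    """Slots are (base, offset) pairs; the difference is known only for equal bases."""
    return x[1] - y[1] if x[0] == y[0] else None


def sstore(known, slot, value):
    kept = {}
    for other, v in known.items():
        diff = constant_difference(other, slot)
        # A known non-zero difference proves the slots differ, so `other` is untouched.
        if diff is not None and diff != 0:
            kept[other] = v
    kept[slot] = value
    return kept


x = ("calldataload(0)", 0)
y = ("calldataload(0)", 1)   # y := add(x, 1), so x - y is provably non-zero
z = ("calldataload(32)", 0)  # unrelated expression, may equal x or y

known = sstore({}, x, 1)
known = sstore(known, y, 2)
both_kept = dict(known)
known = sstore(known, z, 3)
```

Writing to `y` keeps the knowledge about `x`, while writing to `z`, whose difference from `x` and `y` is unknown, erases both.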
b794c8c to ca29092
docs/internals/optimizer.rst
Outdated
mstore(x, 100)
let value := keccak256(x, 32)

In this case, the optimizer has to 'symbolically' track the value at a memory location
-In this case, the optimizer has to 'symbolically' track the value at a memory location
+In this case, the optimizer has to symbolically track the value at a memory location
(same sentence but no quotes above)
docs/internals/optimizer.rst
Outdated
let value := keccak256(x, 32)

In this case, the optimizer has to 'symbolically' track the value at a memory location
``calldataload(0)`` and then realize that the Keccak-256 hash can be evaluated at compile time. This
Hmm these last 2 sentences basically repeat the sentence before the example code.
docs/internals/optimizer.rst
Outdated
knowledge about storage or memory locations which may be equal to ``l``. More specifically, for
storage, the optimizer has to erase all knowledge of symbolic locations, that may be equal to ``l``
and for memory, the optimizer has to erase all knowledge of symbolic locations that may not be at
least 32 bytes away. This is done by computing the value ``sub(l, m)``. For storage, if this value
What's `m`?
docs/internals/optimizer.rst
Outdated
let value := keccak256(x, 32)

In this case, the optimizer tracks the value at a memory location ``calldataload(0)`` and then
realize that the Keccak-256 hash can be evaluated at compile time. This only works if there is no
-realize that the Keccak-256 hash can be evaluated at compile time. This only works if there is no
+realizes that the Keccak-256 hash can be evaluated at compile time. This only works if there is no
Done!
Co-authored-by: Harikrishnan Mulackal <webmail.hari@gmail.com>
I read about halfway, and made some small tweaks to improve readability.
Will try to do the rest sometime.