-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Further speed improvements #8
Further speed improvements #8
Conversation
by specializing to the amount of bits rotated, the called function is a one-argument function, so elm does not have to apply the usual F and A wrappers that enable currying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my machine this now does ~16MB/s in a benchmark, and the 90mb file takes just over 5 seconds of scripting time.
State (Tuple5 (initial.a + a) (initial.b + b) (initial.c + c) (initial.d + d) (initial.e + e)) | ||
State | ||
{ a = initial.a + a | ||
, b = initial.b + b | ||
, c = initial.c + c | ||
, d = initial.d + d | ||
, e = initial.e + e | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A plain record is much faster than using the alias (which is a function internally, the record literal is really an object literal in JS)
@@ -373,24 +374,84 @@ So in the recursion, `b16` is dropped, all the others shift one position to the | |||
Then the `deltaState` is also updated with the `value`. | |||
|
|||
-} | |||
reduceWords i deltaState b16 b15 b14 b13 b12 b11 b10 b9 b8 b7 b6 b5 b4 b3 b2 b1 = | |||
if (i - blockSize) < 0 then | |||
reduceWords i ((DeltaState { a, b, c, d, e }) as deltaState) b16 b15 b14 b13 b12 b11 b10 b9 b8 b7 b6 b5 b4 b3 b2 b1 = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function now performs 8 calculations per iteration.That means fewer function calls, but diminishing returns mean that it's not worth it to go up to 16.
(shiftedA + f + e + int) | ||
|> Bitwise.shiftRightZfBy 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shiftRightZfBy 0
fixes the bug on the large file reported in #5
rotateLeftBy1 : Int -> Int | ||
rotateLeftBy1 i = | ||
-- because of how `rotateLeftBy1` is used, the `Bitwise.shiftRightZfBy 0` is not required | ||
Bitwise.or (Bitwise.shiftRightZfBy 31 i) (Bitwise.shiftLeftBy 1 i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
specializing to shift by 1 is important. This function takes only 1 argument now, an so elm does not need to apply wrappers that enable currying of this function. Much faster.
Brilliant, thanks. Love learning about elm perf from you, thanks for all the explanations |
Now published as 2.1.1 (cc @BKSpurgeon) |
and also a bug fix, see comments