Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further speed improvements #8

Merged
merged 3 commits into from
Jun 18, 2020

Conversation

folkertdev
Copy link
Contributor

and also a bug fix, see comments

by specializing to the amount of bits rotated, the called function is a one-argument function, so elm does not have to apply the usual F and A wrappers that enable currying
Copy link
Contributor Author

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my machine this now does ~16MB/s in a benchmark, and the 90mb file takes just over 5 seconds of scripting time.

Comment on lines -366 to +367
State (Tuple5 (initial.a + a) (initial.b + b) (initial.c + c) (initial.d + d) (initial.e + e))
State
{ a = initial.a + a
, b = initial.b + b
, c = initial.c + c
, d = initial.d + d
, e = initial.e + e
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A plain record is much faster than using the alias (which is a function internally, the record literal is really an object literal in JS)

@@ -373,24 +374,84 @@ So in the recursion, `b16` is dropped, all the others shift one position to the
Then the `deltaState` is also updated with the `value`.

-}
reduceWords i deltaState b16 b15 b14 b13 b12 b11 b10 b9 b8 b7 b6 b5 b4 b3 b2 b1 =
if (i - blockSize) < 0 then
reduceWords i ((DeltaState { a, b, c, d, e }) as deltaState) b16 b15 b14 b13 b12 b11 b10 b9 b8 b7 b6 b5 b4 b3 b2 b1 =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function now performs 8 calculations per iteration.That means fewer function calls, but diminishing returns mean that it's not worth it to go up to 16.

Comment on lines +444 to +445
(shiftedA + f + e + int)
|> Bitwise.shiftRightZfBy 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shiftRightZfBy 0 fixes the bug on the large file reported in #5

Comment on lines +488 to +491
rotateLeftBy1 : Int -> Int
rotateLeftBy1 i =
-- because of how `rotateLeftBy1` is used, the `Bitwise.shiftRightZfBy 0` is not required
Bitwise.or (Bitwise.shiftRightZfBy 31 i) (Bitwise.shiftLeftBy 1 i)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

specializing to shift by 1 is important. This function takes only 1 argument now, an so elm does not need to apply wrappers that enable currying of this function. Much faster.

@folkertdev folkertdev mentioned this pull request Jun 17, 2020
@TSFoster TSFoster merged commit 6a2b3ae into TSFoster:master Jun 18, 2020
@TSFoster
Copy link
Owner

Brilliant, thanks. Love learning about elm perf from you, thanks for all the explanations

@TSFoster
Copy link
Owner

Now published as 2.1.1

(cc @BKSpurgeon)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants