Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[suggestion] Reduce crc32 code size #143

Closed
puzrin opened this issue Jun 17, 2014 · 15 comments
Closed

[suggestion] Reduce crc32 code size #143

puzrin opened this issue Jun 17, 2014 · 15 comments

Comments

@puzrin
Copy link
Contributor

puzrin commented Jun 17, 2014

Current crc32 implementation contains constants table, that can be generated on the fly without noticeable delay. See implementation, used in pako https://github.com/nodeca/pako/blob/master/lib/zlib/crc32.js

Also, i'm not sure, that it's good idea (perfomance) to define table every time when function called. Probably it worth move table outside of crc32 scope (i didn't tested performance).

@dduponchel
Copy link
Collaborator

That's already in #141 :D See dduponchel/jszip@cdcdbce.
With pako's implementation I also get a speed boost between 50% and 600% 👍

@SheetJSDev
Copy link
Contributor

@dduponchel @puzrin see #142 (comment)

@dduponchel
Copy link
Collaborator

@SheetJSDev can we continue the discussion here ? An issue about crc32 seems a better place than an issue about utf8 :)

When I imported pako's functions, I added a new one to handle binary strings with a charCodeAt. My benchmark used this implementation and js-crc32 in two use cases : a 10MB binary string and a 10MB Uint8Array. In Firefox and Chrome, I have about 30 ops/sec for both libraries. I should have given more details on my test suite, sorry. I will create a jsperf test to illustrate it.

Thanks for taking the time to investigate this !

@puzrin
Copy link
Contributor Author

puzrin commented Jun 18, 2014

@SheetJSDev i don't pretend that crc32 implementation used in pako is the best for all cases. It's done exactly for pako's case - typed and native arrays. That's dictated by inflate/deflate optimizations. I just noted evident improvment in jszip. Also, in my experiments, i didn't noticed profit from loop unroll. And note, that most frequent pako's usage are inflate/deflate (with optional alder32), not gzip with crc32.

I think, if one care about speed, crc32 calculation can be omitted. Also, prior to optimize any place, it would be nice to find critical chains by profiler. It's just my suggestion - probably, all pain sit in jszip types mapper, that converts any-format-to-any-format.

Can't dig deep into jszip. Just share my pako experience, when it can help to this project.

@SheetJSDev
Copy link
Contributor

@dduponchel @puzrin from my understanding of the APPNOTE, CRC32 must be calculated on write, which is a major source of delays according to the v8 benchmark for the js-xlsx test suite. The XLSX and XLSM and XLSB files contain utf8-encoded XML files and binary files

From my testing, CRC32 from jszip is one of the most significant functions that v8 deoptimizes. v8 has some cool options like --trace_opt --trace_deopt --code_comments that indicates the nature of the deoptimization. The original implementation was deoptimizing with message tagged-to-i, which means that it tried to optimize as if it were a 32 bit integer but sometimes the value wasn't a 32 bit signed integer. As you probably guessed, it came from the table's storage method.

In exploring how to improve performance, there were a few non-obvious places for performance wins that work everywhere (node and in browser based on my tests):

  • separate functions for the different types (one for binary strings, one for buffers, one for utf8 strings)
  • for small binary strings, it's more efficient to use charCodeAt than to convert to a buffer
  • loop unrolling is significant (this was my JSPerf test: http://jsperf.com/crc32-table/2 )

@puzrin I'm slowly working my way to a new deflate implementation (https://github.com/SheetJS/js-adler32 is a translation of the same logic to the adler-32 checksum algorithm), and hopefully we find ways to improve pako and jszip along the way.

@puzrin
Copy link
Contributor Author

puzrin commented Jun 18, 2014

@SheetJSDev i'm not sure, that understand your benchmark. It seems to measure table creation speed. But this table is filled only once, on script load or on first crc32 call. That can not affect crc32 calculation speed.

@SheetJSDev
Copy link
Contributor

@puzrin I was pointing to the performance benefits of loop unrolling. Here is another example made by someone else: http://jsperf.com/loop-unrolling/

@puzrin
Copy link
Contributor Author

puzrin commented Jun 18, 2014

This example is also not related to discussed topic. I don't care about loop unrolling theory and synthetic tests. The only useful benchmark is benchmark, that compare crc32 speed in unrolled and not unrolled way.

@SheetJSDev
Copy link
Contributor

@puzrin clone https://github.com/SheetJS/js-crc32 and run make perf. There is a simple test of doing 2 operations per loop: https://github.com/SheetJS/js-crc32/blob/master/perf/bstr.js#L46-L53 so you can see for yourself :)

@SheetJSDev
Copy link
Contributor

@puzrin @dduponchel here's a jsperf: http://jsperf.com/js-crc32-arr (and check the console to confirm both give the same result). Chrome and Safari both show significant improvement, firefox shows a slight improvement, and IE shows a slight degradation (although the strange bar when run using IE9 compatibility mode makes me question the entire IE result set)

@puzrin
Copy link
Contributor Author

puzrin commented Jun 19, 2014

I see ~ equal results in Chromium & FF under Ubuntu 12.04 LTS. Seriously.

@SheetJSDev
Copy link
Contributor

@puzrin are you running Chromium 34.0.1847? Chromium is actually up to 37 now. On the chrome side, 35 is the stable version and 37 in canary. Can you update your browser and run the test?

@puzrin
Copy link
Contributor Author

puzrin commented Jun 19, 2014

Yes, Cromium 34 is mine. Can't spend much efforts to dig deep. Honestly, after spending ton's of time to play with pako, i've changed my focus on another projects.

@puzrin
Copy link
Contributor Author

puzrin commented Jun 19, 2014

If anyone interested, that's a link with my old benchmark, used when writing pako http://jsperf.com/crc32-on-array . Unlooping and adler32 were also benchmarked, but i don't remember where.

@puzrin
Copy link
Contributor Author

puzrin commented Feb 16, 2016

Let's close it as not very actual.

@puzrin puzrin closed this as completed Feb 16, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants