Runtime hardware assisted CRC32 for intel processors #4933
You can force-push to an existing PR branch; there is no need to create multiple PRs. Also, please squash your commits and make sure the coding style matches.
|
I originally pushed from the wrong branch. Sorry for the spam. |
4-space tabs: fixed |
c817acb to 8d39063
Do you have any performance comparisons for this? Perhaps a simple wrapper would do:
#if defined(__SSE4_2__) || defined(__AVX__)
#include <nmmintrin.h>
#define IM_CRC32_U8(CRC, C) (_mm_crc32_u8((CRC), (C)))
#else
// static const ImU32 GCrc32LookupTable[256] = ...
#define IM_CRC32_U8(CRC, C) (((CRC) >> 8) ^ GCrc32LookupTable[((CRC) & 0xFF) ^ (C)])
#endif |
Since the data is small, it will already be in cache after the first strlen, so the following memchr is not a big deal, and ImHashData tries to process 8 chars at a time. In my tests, strings longer than 16 chars start to win against your proposed "good compromise" approach. |
FYI, a simpler version of this was merged for #8169, which included a change of the CRC32 table so that both paths would match (which is the most important thing). As per the discussion above, it didn't seem worth doing the dual-pass ### scan for strings (which should generally be small). Thanks! |
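For readers wondering what "both paths would match" involves: the SSE4.2 crc32 instruction uses the CRC-32C (Castagnoli) polynomial, so a software lookup table has to be built from that same polynomial to give identical hashes. The following is only a minimal sketch of that idea, not the code merged for #8169; all names (`g_crc32c_table`, `build_crc32c_table`, `crc32c_sw`, `crc32c_hw`) are hypothetical, and the initial value and final inversion follow the usual CRC-32C convention.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__SSE4_2__) || defined(__AVX__)
#include <nmmintrin.h>
#define SKETCH_HAS_HW_CRC32 1
#else
#define SKETCH_HAS_HW_CRC32 0
#endif

// Hypothetical table, filled at startup instead of being a static const array.
static uint32_t g_crc32c_table[256];

// Build the lookup table from the bit-reflected CRC-32C polynomial,
// which is the polynomial the SSE4.2 crc32 instruction implements.
static void build_crc32c_table()
{
    const uint32_t poly = 0x82F63B78u;
    for (uint32_t i = 0; i < 256; i++)
    {
        uint32_t crc = i;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ poly : (crc >> 1);
        g_crc32c_table[i] = crc;
    }
}

// Software path: one table lookup per input byte.
static uint32_t crc32c_sw(const void* data, size_t size)
{
    const uint8_t* p = static_cast<const uint8_t*>(data);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < size; i++)
        crc = (crc >> 8) ^ g_crc32c_table[(crc & 0xFFu) ^ p[i]];
    return ~crc;
}

#if SKETCH_HAS_HW_CRC32
// Hardware path: one crc32 instruction per input byte.
static uint32_t crc32c_hw(const void* data, size_t size)
{
    const uint8_t* p = static_cast<const uint8_t*>(data);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < size; i++)
        crc = _mm_crc32_u8(crc, p[i]);
    return ~crc;
}
#endif
```

After calling `build_crc32c_table()` once, `crc32c_sw("123456789", 9)` should give the standard CRC-32C check value 0xE3069283, and on a build with SSE4.2 enabled `crc32c_hw` should return the same value for the same input.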
This is my second attempt to enable hardware CRC32. This time I made it with the minimal modifications required, and it is compile-time only. It should work with the Intel/Microsoft/GCC/Clang compilers on Windows/Linux.
To test it in Visual Studio, make sure you enable the AVX instruction set in the project settings.
Brief description of changes to the code:
ImHashData uses hardware CRC32 if built with an instruction set supporting at least SSE4.2 on Intel platforms; otherwise the original implementation is used.
Regards
Edit: sorry for the multiple requests posted earlier - noob git user here.
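To illustrate the compile-time selection described above, here is a rough sketch of a hash function that consumes 8 bytes at a time when SSE4.2 is enabled on a 64-bit x86 target and falls back to a plain per-byte loop otherwise. It is not the code from this PR: the names `sketch_crc32c_u8` and `sketch_hash_data` are hypothetical, the seed handling is an assumption, and the fallback here is a table-free bitwise CRC-32C so that both paths produce the same result.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if (defined(__SSE4_2__) || defined(__AVX__)) && (defined(__x86_64__) || defined(_M_X64))
#include <nmmintrin.h>
#define SKETCH_HW_CRC32 1
#else
#define SKETCH_HW_CRC32 0
#endif

// One CRC-32C step over a single byte.
static uint32_t sketch_crc32c_u8(uint32_t crc, uint8_t c)
{
#if SKETCH_HW_CRC32
    return _mm_crc32_u8(crc, c);
#else
    // Bitwise fallback using the reflected Castagnoli polynomial.
    crc ^= c;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc & 1u) ? (crc >> 1) ^ 0x82F63B78u : (crc >> 1);
    return crc;
#endif
}

// Hypothetical stand-in for a seeded data hash (assumed convention:
// start from ~seed, invert again at the end).
static uint32_t sketch_hash_data(const void* data, size_t size, uint32_t seed = 0)
{
    const uint8_t* p = static_cast<const uint8_t*>(data);
    uint32_t crc = ~seed;
#if SKETCH_HW_CRC32
    // Hardware path only: consume 8 bytes per instruction.
    while (size >= 8)
    {
        uint64_t chunk;
        std::memcpy(&chunk, p, sizeof(chunk)); // safe for unaligned input
        crc = static_cast<uint32_t>(_mm_crc32_u64(crc, chunk));
        p += 8;
        size -= 8;
    }
#endif
    // Remaining tail bytes (or the whole input on the fallback path).
    while (size-- != 0)
        crc = sketch_crc32c_u8(crc, *p++);
    return ~crc;
}
```

With a seed of 0, `sketch_hash_data("123456789", 9)` should return the standard CRC-32C check value 0xE3069283 on either path. With GCC or Clang the hardware path is selected by building with `-msse4.2`; with Visual Studio, by enabling AVX as noted above.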