Yet another xxhash addon for Node.js which can be 50 times faster than crypto MD5
IMPORTANT: xxhash-addon v2 is finally here. This is almost a complete rework of the project, with a heavy focus on performance and consistency. The FAQ has some very good info that you may not want to miss.
Platform | Build Status |
---|---|
AppVeyor (Windows - Release build) | |
Actions (Ubuntu, macOS, Windows - Release and ASan builds) | |
xxhash-addon is a native addon for Node.js (>=8.6.0) written using N-API. It 'thinly' wraps xxhash v0.8.2, which supports a new algorithm, XXH3, that has been shown to outperform its predecessor.
IMPORTANT: As of v0.8.0, XXH3 and XXH128 are considered stable. See the upstream CHANGELOG for the formal announcement. xxhash-addon v1.4.0 is the first release that ships with stable XXH3 and XXH128.
- Greatly improved performance backed by benchmarks (see charts below.)
- Better consistency and smaller code size thanks to pure C-style wrapping.
The following results are generated by running the benchmark.js file. Duration (ms) measures the time taken to digest 10GB of randomly filled buffer using the streaming methods (update() and digest()) of the hash functions.
npm run benchmark
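The measurement described above can be sketched roughly as follows. This is not the project's benchmark.js; it is a minimal illustration that times the streaming update()/digest() pair of a hasher. The helper name timeStreamingHash, the chunk size, and the much smaller input (16 MiB rather than 10GB) are all assumptions made so the sketch runs quickly, and it uses MD5 from Node's crypto module so that it runs without the native addon.

```javascript
const crypto = require('crypto');

// Hypothetical helper: feeds `totalBytes` of data to a hasher chunk by chunk
// via update(), calls digest() once, and reports the elapsed time.
// Assumes totalBytes is a multiple of chunkBytes.
function timeStreamingHash(createHasher, totalBytes, chunkBytes) {
  const chunk = crypto.randomBytes(chunkBytes); // one randomly filled chunk, reused
  const hasher = createHasher();
  const start = process.hrtime.bigint();
  for (let fed = 0; fed < totalBytes; fed += chunkBytes) {
    hasher.update(chunk);
  }
  const digest = hasher.digest();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { elapsedMs, digest };
}

// 16 MiB in 1 MiB chunks, hashed with MD5 from node:crypto.
const result = timeStreamingHash(
  () => crypto.createHash('md5'),
  16 * 1024 * 1024,
  1024 * 1024
);
console.log(`MD5: ${result.elapsedMs.toFixed(1)} ms`);
```

Any object with update() and digest() methods can be passed through the factory argument, so a hasher from xxhash-addon (constructed with a seed Buffer) should fit the same harness.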
- On an ARM MacBook Pro (16-inch, 2021): M1 Pro, 16GB of Mem, macOS Monterey 12.4, Node.js v16.15.1
Hash func | Length (bits) | Duration (ms) | Note |
---|---|---|---|
MD5 (node:crypto) | 128 | 19653 | |
SHA1 (node:crypto) | 160 | 4380 | |
BLAKE2s256 (node:crypto) | 256 | 18293 | BLAKE2s is very slow on Node.js. This does not match the xxHash benchmark. |
XXH64 (xxhash-addon) | 64 | 732 | Compiled with -O2 . |
XXH3 (xxhash-addon) | 64 | 350 | Compiled with -O2 . On ARM, XXH3 is 2 times faster than XXH64 and 50 times faster than MD5. |
- On an Intel Mac mini (2018): Core i3, 8GB of Mem, macOS Monterey 12.4, Node.js v16.15.0
Hash func | Length (bits) | Duration (ms) | Note |
---|---|---|---|
MD5 (node:crypto) | 128 | 15187 | |
SHA1 (node:crypto) | 160 | 10568 | |
BLAKE2s256 (node:crypto) | 256 | 27334 | BLAKE2s is very slow on Node.js. This does not match the xxHash benchmark. |
XXH64 (xxhash-addon) | 64 | 1038 | Compiled with -O2 . |
XXH3 (xxhash-addon) | 64 | 767 | Compiled with -O2 . Significant improvement with XXH3. Even more impressive on ARM. |
xxhash-addon exposes xxhash's API in a friendly way for downstream consumption (see the Example of Usage section).
- Covering all 4 variants of the algorithm: XXH32, XXH64, XXH3 64-bit, XXH3 128-bit.
- Supporting XXH3 secret.
- Consistently producing canonical (big-endian) form of hash values as per xxhash's recommendation.
- The addon is extensively sanity-checked against xxhash's sanity test suite to ensure that generated hashes are correct and align with xxhsum's (xxhsum is the official utility of xxhash). Check the file xxhash-addon.test.js to see how xxhash-addon is being tested.
- Being seriously checked against memory safety and UB issues with ASan and UBSan. See the CI for how this is done.
- Benchmarks are publicly available.
- Minimal dependency: the package does not depend on any other npm packages.
- TypeScript support.
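Since hash values come back as a Buffer in canonical (big-endian) form, the first byte is the most significant one. A minimal sketch of what that means for a consumer, using made-up bytes in place of a real 64-bit digest (the values below are illustrative only, not actual hash output):

```javascript
// Made-up 8 bytes standing in for a 64-bit digest Buffer (not a real hash value).
const digest = Buffer.from([0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef]);

// In big-endian form the hex string reads in the same order as the number.
const asHex = digest.toString('hex');
const asBigInt = digest.readBigUInt64BE(0);

console.log(asHex);                 // 0123456789abcdef
console.log(asBigInt.toString(16)); // 123456789abcdef (leading zero dropped)
```

Reading the same bytes with readBigUInt64LE() would give a different number, which is why a fixed canonical form helps when comparing against values printed by other tools.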
Using xxhash-addon with TypeScript is strongly recommended. Be sure to check the FAQ before using the addon.
npm install xxhash-addon
Note: This native addon requires compiling. If you do not have the Node.js build toolchain, you must install it first:
On a Windows machine
npm install --global --production windows-build-tools
On a Debian/Ubuntu machine
sudo apt-get update && sudo apt-get install python g++ make
On a RHEL/CentOS machine
If you are on RHEL 6 or 7, you would need to install GCC/G++ >= 6.3 via devtoolset-
for the module to compile. See SCL.
On a Mac
Install Xcode command line tools
const { XXHash32, XXHash3 } = require('xxhash-addon');
// Hash a string using the static one-shot method.
const salute = 'hello there';
const buf_salute = Buffer.from(salute);
console.log(XXHash32.hash(buf_salute).toString('hex'));
// Digest a byte-stream (hash a buffer piece by piece).
const hasher32 = new XXHash32(Buffer.from([0, 0, 0, 0]));
hasher32.update(buf_salute.slice(0, 3));
console.log(hasher32.digest().toString('hex'));
hasher32.update(buf_salute.slice(3));
console.log(hasher32.digest().toString('hex'));
// Reset the hasher for another hashing.
hasher32.reset();
// Using secret for XXH3
// Same constructor call syntax, but hasher switches to secret mode whenever
// it gets a buffer of at least 136 bytes.
const hasher3 = new XXHash3(require('fs').readFileSync('package-lock.json'));
- Why TypeScript?
- Short answer: for much better performance and security.
- Long answer:
Dynamic type checking is expensive enough to hurt performance. In a world without TypeScript, the streaming update() method had to check whether the buffer passed to it was an actual Node.js Buffer. Failing to detect the Buffer type might cause v8 to CHECK and crash the Node process. Such a dynamic type check could degrade the performance of xxhash-addon by 10-15% per my own benchmark on a low-end Intel Mac mini (on Apple Silicon, the difference is negligible though).
So how does TypeScript (TS) help? Static type check.
There is still a theoretical catch. TS's type system is structural, so in a corner case where you have a class that is structurally like Buffer and you pass an instance of that class to update(), the compiler will accept it even though it is not a real Buffer. This is an extreme case that should never happen in practice. Nevertheless, there are official techniques to 'force' nominal typing. Check https://www.typescriptlang.org/play#example/nominal-typing for an in-depth explanation.
If you don't use TS, then you probably want to enable the run-time type check of xxhash-addon. Uncomment the line # "defines": [ "ENABLE_RUNTIME_TYPE_CHECK" ] in binding.gyp and re-compile the addon. Use this at your own risk.
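For plain JavaScript callers who would rather not rebuild the addon, a guard can also be placed on the JavaScript side. The sketch below is not part of xxhash-addon: withBufferCheck is a hypothetical wrapper, and fakeHasher is a stand-in object with the same update()/digest()/reset() shape so the example runs without the native module. Such a guard adds its own per-call cost, which has not been measured here.

```javascript
// Hypothetical wrapper (not part of xxhash-addon): rejects non-Buffer input in
// JavaScript before it reaches the wrapped hasher's update().
function withBufferCheck(hasher) {
  return {
    update(data) {
      if (!Buffer.isBuffer(data)) {
        throw new TypeError('update() expects a Buffer');
      }
      hasher.update(data);
    },
    digest: () => hasher.digest(),
    reset: () => hasher.reset(),
  };
}

// Stand-in with the same method shape; it just concatenates what it is fed.
const chunks = [];
const fakeHasher = {
  update: (data) => { chunks.push(data); },
  digest: () => Buffer.concat(chunks),
  reset: () => { chunks.length = 0; },
};

const guarded = withBufferCheck(fakeHasher);
guarded.update(Buffer.from('hello'));

let rejected = false;
try {
  guarded.update('not a buffer'); // a string is not a Buffer
} catch (err) {
  rejected = err instanceof TypeError;
}
console.log(rejected); // true
```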
This is for people who are interested in creating a PR.
How to clone?
git clone https://github.com/ktrongnhan/xxhash-addon
cd xxhash-addon
git submodule update --init
npm install jest --save-dev
npm run debug:build
npm run benchmark
npm test
Note: debug:build compiles and links with Address Sanitizer (-fsanitize=address). npm test may not work out-of-the-box on macOS.
How to debug asan build?
You may have trouble running tests with an ASan build. Here is my snippet to get it running under lldb on macOS.
$ lldb node node_modules/jest/bin/jest.js
(lldb) env DYLD_INSERT_LIBRARIES=/Library/Developer/CommandLineTools/usr/lib/clang/13.1.6/lib/darwin/libclang_rt.asan_osx_dynamic.dylib
(lldb) env ASAN_OPTIONS=detect_leaks=1
(lldb) breakpoint set -f src/addon.c -l 100
(lldb) run
(lldb) next
OR
DYLD_INSERT_LIBRARIES=$(clang --print-file-name=libclang_rt.asan_osx_dynamic.dylib) ASAN_OPTIONS=detect_leaks=1 node node_modules/jest/bin/jest.js
Key takeaways:
- If you see an error saying ASan Interceptor is loaded too late, set the env DYLD_INSERT_LIBRARIES. You need to use the absolute path to your Node.js binary and jest.js as well. Curious why? An interesting article.
- ASan doesn't detect memory leaks on macOS by default. You may want to turn this on with the env ASAN_OPTIONS=detect_leaks=1.
If you are debugging on Linux with GCC as your default compiler, here is a helpful one-liner:
$ LD_PRELOAD=$(gcc -print-file-name=libasan.so) LSAN_OPTIONS=suppressions=suppr.lsan DEBUG=1 node node_modules/jest/bin/jest.js
How to upgrade xxHash?
Everything should be set up already. Just pull from the release branch of xxHash.
git submodule update --remote
git status
git add xxHash
git commit -m "Bump xxHash to..."
git push origin your_name/upgrade_deps
export interface XXHash {
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
}
export class XXHash32 implements XXHash {
  constructor(seed: Buffer); // Buffer must be 4 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}
export class XXHash64 implements XXHash {
  constructor(seed: Buffer); // Buffer must be 4 or 8 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}
export class XXHash3 implements XXHash {
  constructor(seed_or_secret: Buffer); // Seed: Buffer must be 4 or 8 bytes long. Secret: Buffer must be at least 136 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}
export class XXHash128 implements XXHash {
  constructor(seed_or_secret: Buffer); // Seed: Buffer must be 4 or 8 bytes long. Secret: Buffer must be at least 136 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}
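The length rules in the constructor comments above can be checked before constructing a hasher. The helper below is hypothetical (classifyXXH3Argument is not an export of xxhash-addon); it only mirrors the stated rules for XXHash3 and XXHash128. How the addon itself treats other lengths is not stated here, so rejecting them is an assumption of this sketch.

```javascript
// Hypothetical helper: classifies a Buffer by the length rules stated in the
// type declarations (seed: 4 or 8 bytes; secret: at least 136 bytes).
function classifyXXH3Argument(buf) {
  if (!Buffer.isBuffer(buf)) {
    throw new TypeError('expected a Buffer');
  }
  if (buf.length === 4 || buf.length === 8) return 'seed';
  if (buf.length >= 136) return 'secret';
  // Assumption: any other length is treated as a caller error in this sketch.
  throw new RangeError(`unsupported length: ${buf.length}`);
}

console.log(classifyXXH3Argument(Buffer.alloc(8)));   // seed
console.log(classifyXXH3Argument(Buffer.alloc(136))); // secret
```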
The project is licensed under BSD-2-Clause.