Yet another xxhash addon for Node.js which can be 50 times faster than crypto MD5

ktrongnhan/xxhash-addon

IMPORTANT: xxhash-addon v2 is finally here. It is close to a complete rework of the project, with a heavy focus on performance and consistency. The FAQ has some very good info that you may not want to miss.

Platform Build Status
AppVeyor (Windows - Release build) Build status
Actions (Ubuntu, macOS, Windows - Release and ASan builds) .github/workflows/ci.yml

Overview

xxhash-addon is a native addon for Node.js (>=8.6.0) written using N-API. It 'thinly' wraps xxhash v0.8.2, which supports a new algorithm, XXH3, that has been shown to outperform its predecessor.

IMPORTANT: As of v0.8.0, XXH3 and XXH128 are now considered stable. Rush to the upstream CHANGELOG for the formal announcement! xxhash-addon v1.4.0 is the first iteration packed with stable XXH3 and XXH128.

Why v2?

  1. Greatly improved performance backed by benchmarks (see charts below.)
  2. Better consistency and smaller code size thanks to pure C-style wrapping.

The following results were generated by running the benchmark.js file. Duration (ms) is the time taken to digest 10GB of randomly filled buffer data using the streaming methods (update() and digest()) of each hash function.

npm run benchmark
  • On an ARM MacBook Pro (16-inch, 2021): M1 Pro, 16GB of Mem, macOS Monterey 12.4, Node.js v16.15.1
| Hash func | Length (bits) | Duration (ms) | Note |
| --- | --- | --- | --- |
| MD5 (node:crypto) | 128 | 19653 | |
| SHA1 (node:crypto) | 160 | 4380 | |
| BLAKE2s256 (node:crypto) | 256 | 18293 | BLAKE2s is surprisingly slow on Node.js. This does not match the xxHash benchmark. |
| XXH64 (xxhash-addon) | 64 | 732 | Compiled with -O2. |
| XXH3 (xxhash-addon) | 64 | 350 | Compiled with -O2. On ARM, XXH3 is about 2 times faster than XXH64 and about 50 times faster than MD5. |
  • On an Intel Mac mini (2018): Core i3, 8GB of Mem, macOS Monterey 12.4, Node.js v16.15.0
| Hash func | Length (bits) | Duration (ms) | Note |
| --- | --- | --- | --- |
| MD5 (node:crypto) | 128 | 15187 | |
| SHA1 (node:crypto) | 160 | 10568 | |
| BLAKE2s256 (node:crypto) | 256 | 27334 | BLAKE2s is surprisingly slow on Node.js. This does not match the xxHash benchmark. |
| XXH64 (xxhash-addon) | 64 | 1038 | Compiled with -O2. |
| XXH3 (xxhash-addon) | 64 | 767 | Compiled with -O2. Significant improvement with XXH3. Even more impressive on ARM. |
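
For readers who want a rough feel for how such a duration can be measured, here is a minimal sketch. It is not the project's benchmark.js; the function name timeStreamingHash and the makeHasher factory are made up for illustration, and node's built-in MD5 is used as a stand-in so the snippet runs without the addon.

```javascript
const crypto = require('crypto');

// Time how long a streaming hasher takes to digest roughly `totalBytes`
// of data (rounded up to whole chunks of `chunkSize` bytes).
// `makeHasher` is a hypothetical factory returning any object that
// exposes update(Buffer) and digest().
function timeStreamingHash(makeHasher, totalBytes, chunkSize = 1 << 20) {
  const chunk = crypto.randomBytes(chunkSize); // randomly filled buffer
  const hasher = makeHasher();
  const start = process.hrtime.bigint();
  for (let fed = 0; fed < totalBytes; fed += chunkSize) {
    hasher.update(chunk);
  }
  const digest = hasher.digest();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { digest, elapsedMs };
}

// Stand-in run over 16 MiB with node's MD5. With the addon installed,
// a factory such as () => new XXHash3(Buffer.alloc(4)) could be passed instead.
const md5 = timeStreamingHash(() => crypto.createHash('md5'), 16 * 1024 * 1024);
console.log(`MD5: ${md5.elapsedMs.toFixed(1)} ms`);
```

The same chunk is reused for every update() call so that buffer allocation does not dominate the measurement; absolute numbers will differ from the tables above.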

Features

  • xxhash-addon exposes xxhash's API in a friendly way for downstream consumption (see the Example of Usage section).
  • Covering all 4 variants of the algorithm: XXH32, XXH64, XXH3 64-bit, XXH3 128-bit.
  • Supporting XXH3 secret.
  • Consistently producing canonical (big-endian) form of hash values as per xxhash's recommendation.
  • The addon is extensively sanity-checked against xxhash's sanity test suite to ensure that generated hashes are correct and align with xxhsum's (xxhsum is the official utility of xxhash). Check the file xxhash-addon.test.js to see how xxhash-addon is being tested.
  • Checked for memory-safety and undefined-behavior (UB) issues with ASan and UBSan. See the CI for how this is done.
  • Benchmarks are publicly available.
  • Minimal dependency: the package does not depend on any other npm packages.
  • TypeScript support. Using xxhash-addon with TypeScript is strongly recommended. Definitely check the FAQ before using the addon.
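
Since hash values come back as Buffers in canonical (big-endian) form, reading them as integers calls for the big-endian Buffer readers. The sketch below uses made-up placeholder bytes rather than real xxhash output, so it runs without the addon.

```javascript
// Interpret canonical (big-endian) digest bytes as unsigned integers.
// The byte values below are placeholders, not real xxhash output.
const digest32 = Buffer.from('deadbeef', 'hex');         // shaped like a 32-bit digest
const digest64 = Buffer.from('0123456789abcdef', 'hex'); // shaped like a 64-bit digest

const asUint32 = digest32.readUInt32BE(0);    // plain JS number
const asUint64 = digest64.readBigUInt64BE(0); // BigInt (Node.js >= 12)

console.log(asUint32.toString(16)); // deadbeef
console.log(asUint64.toString(16)); // 123456789abcdef (leading zero dropped)
```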

Installation

npm install xxhash-addon

Note: This native addon requires compiling at install time. If you do not have the Node.js build toolchain, you must install it first:

On a Windows machine

npm install --global --production windows-build-tools

On a Debian/Ubuntu machine

sudo apt-get update && sudo apt-get install python g++ make

On a RHEL/CentOS machine

If you are on RHEL 6 or 7, you would need to install GCC/G++ >= 6.3 via devtoolset- for the module to compile. See SCL.

On a Mac

Install Xcode command line tools

Example

const { XXHash32, XXHash3 } = require('xxhash-addon');

// Hash a string using the static one-shot method.
const salute = 'hello there';
const buf_salute = Buffer.from(salute);
console.log(XXHash32.hash(buf_salute).toString('hex'));

// Digest a byte-stream (hash a buffer piece by piece).
const hasher32 = new XXHash32(Buffer.from([0, 0, 0, 0]));
hasher32.update(buf_salute.slice(0, 3));
console.log(hasher32.digest().toString('hex'));
hasher32.update(buf_salute.slice(3));
console.log(hasher32.digest().toString('hex'));

// Reset the hasher for another hashing.
hasher32.reset();

// Using secret for XXH3
// Same constructor call syntax, but hasher switches to secret mode whenever
// it gets a buffer of at least 136 bytes.
const hasher3 = new XXHash3(require('fs').readFileSync('package-lock.json'));
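
The example above reads a file to obtain a long enough secret. As an alternative, here is a minimal sketch that generates one from random bytes. The helper name makeSecret is hypothetical and not part of the addon; the 136-byte minimum is taken from the comment in the example.

```javascript
const crypto = require('crypto');

// Minimum secret length, as stated in the example above.
const MIN_SECRET_BYTES = 136;

// Hypothetical helper: return a random buffer long enough to be
// treated as an XXH3 secret rather than a seed.
function makeSecret(size = MIN_SECRET_BYTES) {
  if (!Number.isInteger(size) || size < MIN_SECRET_BYTES) {
    throw new RangeError(`secret must be at least ${MIN_SECRET_BYTES} bytes, got ${size}`);
  }
  return crypto.randomBytes(size);
}

const secret = makeSecret();
console.log(secret.length); // 136
// With the addon installed: const hasher3 = new XXHash3(secret);
```

A freshly generated secret changes the resulting hash values, so it would need to be stored and reused wherever hashes must be reproducible.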

FAQ

  1. Why TypeScript?
  • Short answer: for much better performance and security.
  • Long answer: Dynamic type checks are expensive enough to hurt performance. Without TypeScript, the streaming update() method has to check whether the buffer passed to it is an actual Node.js Buffer. Failing to detect a non-Buffer argument might cause V8 to hit a CHECK and crash the Node.js process. Such a dynamic type check could degrade the performance of xxhash-addon by 10-15% per my own benchmark on a low-end Intel Mac mini (on Apple Silicon, the difference is negligible though).

So how does TypeScript (TS) help? Static type check.

There is still a theoretical catch. TS's type system is structural, so there is a corner case: if you have a class that is structurally like Buffer, an instance of that class passes the static check when given to update(). This is an extreme case that should never happen in practice. Nevertheless, there are official techniques to 'force' nominal typing. Check https://www.typescriptlang.org/play#example/nominal-typing for an in-depth explanation.

If you don't use TS then you probably want to enable run-time type check of xxhash-addon. Uncomment the line # "defines": [ "ENABLE_RUNTIME_TYPE_CHECK" ] in binding.gyp and re-compile the addon. Use this at your own risk.
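
For plain JavaScript callers who prefer not to recompile, a guard on the JavaScript side is another option. The sketch below is only an illustration: guardedUpdate is a hypothetical wrapper, not part of the addon, and it is not claimed to behave the same as ENABLE_RUNTIME_TYPE_CHECK. A fake hasher stands in for the addon so the snippet runs on its own.

```javascript
// Hypothetical wrapper: reject non-Buffer input before it reaches
// the native update() method.
function guardedUpdate(hasher, data) {
  if (!Buffer.isBuffer(data)) {
    throw new TypeError('update() expects a Buffer');
  }
  hasher.update(data);
}

// Stand-in hasher so the sketch runs without the addon; it only counts bytes.
const fakeHasher = { bytes: 0, update(buf) { this.bytes += buf.length; } };

guardedUpdate(fakeHasher, Buffer.from('hello there'));
console.log(fakeHasher.bytes); // 11

try {
  guardedUpdate(fakeHasher, 'hello there'); // a string, not a Buffer
} catch (err) {
  console.log(err.message); // update() expects a Buffer
}
```

Like any run-time check, this adds a small cost to every call, which is the trade-off the FAQ describes.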

Development

This is for people who are interested in creating a PR.

How to clone?

git clone https://github.com/ktrongnhan/xxhash-addon
cd xxhash-addon
git submodule update --init
npm install jest --save-dev
npm run debug:build
npm run benchmark
npm test

Note: debug:build compiles and links with Address Sanitizer (-fsanitize=address). npm test may not work out of the box on macOS.

How to debug asan build?

You may have trouble running tests with the ASan build. Here is my snippet to get it running under lldb on macOS.

$ lldb node node_modules/jest/bin/jest.js
(lldb) env DYLD_INSERT_LIBRARIES=/Library/Developer/CommandLineTools/usr/lib/clang/13.1.6/lib/darwin/libclang_rt.asan_osx_dynamic.dylib
(lldb) env ASAN_OPTIONS=detect_leaks=1
(lldb) breakpoint set -f src/addon.c -l 100
(lldb) run
(lldb) next

OR

DYLD_INSERT_LIBRARIES=$(clang --print-file-name=libclang_rt.asan_osx_dynamic.dylib) ASAN_OPTIONS=detect_leaks=1 node node_modules/jest/bin/jest.js

Key takeaways:

  • If you see an error saying ASan Interceptor is loaded too late, set the env DYLD_INSERT_LIBRARIES. You need to use absolute paths to your Node.js binary and to jest.js as well. Curious why? An interesting article.
  • ASan doesn't detect memory leaks on macOS by default. You may want to turn this on with the env ASAN_OPTIONS=detect_leaks=1.

If you are debugging on Linux with GCC as your default compiler, here is a helpful one-liner:

$ LD_PRELOAD=$(gcc -print-file-name=libasan.so) LSAN_OPTIONS=suppressions=suppr.lsan DEBUG=1 node node_modules/jest/bin/jest.js

How to upgrade xxHash?

Everything should be set up already. Just pull from the release branch of xxHash.

git submodule update --remote
git status
git add xxHash
git commit -m "Bump xxHash to..."
git push origin your_name/upgrade_deps

API reference

Streaming Interface

export interface XXHash {
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
}
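
Any object with this update()/digest() shape can be fed piece by piece. The sketch below shows the pattern with a hypothetical helper, digestInChunks, and uses node's built-in SHA-1 as a stand-in hasher so it runs without the addon; an xxhash-addon hasher could presumably be passed in the same way.

```javascript
const crypto = require('crypto');

// Feed `data` to a streaming hasher in pieces of at most `chunkSize` bytes.
// Works with any object exposing update(Buffer) and digest().
function digestInChunks(hasher, data, chunkSize) {
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    hasher.update(data.subarray(offset, offset + chunkSize));
  }
  return hasher.digest();
}

// node's SHA-1 is a stand-in with the same update()/digest() shape.
const data = Buffer.from('hello there');
const chunked = digestInChunks(crypto.createHash('sha1'), data, 3);
const oneShot = crypto.createHash('sha1').update(data).digest();
console.log(chunked.equals(oneShot)); // true
```

One difference to keep in mind: node's crypto Hash objects allow digest() only once, whereas the Example section above calls digest() more than once on the same xxhash-addon hasher.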

XXHash32

export class XXHash32 implements XXHash {
  constructor(seed: Buffer); // Buffer must be 4 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}

XXHash64

export class XXHash64 implements XXHash {
  constructor(seed: Buffer); // Buffer must be 4 or 8 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}

XXHash3

export class XXHash3 implements XXHash {
  constructor(seed_or_secret: Buffer); // Seed: Buffer must be 4 or 8 bytes long. Secret: Buffer must be at least 136 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}

XXHash128

export class XXHash128 implements XXHash {
  constructor(seed_or_secret: Buffer); // Seed: Buffer must be 4 or 8 bytes long. Secret: Buffer must be at least 136 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}

Licence

The project is licensed under BSD-2-Clause.
