-
-
Notifications
You must be signed in to change notification settings - Fork 902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: remove superfluous call to toLowerCase #677
Conversation
26d9601
to
9fd5c48
Compare
I think this verbiage is the bit you're looking for:
I.e. toString() will always return lowercase strings, so we can remove the [Edit: heh ... looks like you ninja'd me and already did so] |
Okay, ref: https://stackoverflow.com/a/19751935/148072 So I've updated the PR to remove the |
@broofa hehe, we commented at the same time. Yes, we can skip the call! 🙌 |
This change makes the bundle smaller, and the uuid generation faster. On V8 it makes Potentially we want to add |
Haha, yeah, it's a bit funny that it has been in there all the time 😂 |
That's awesome! I'm happy my tests proven something closer that made this library better! |
Apparently the Can you confirm that this issue was considered and is not relevant anymore ? // Note: Be careful editing this code! It's been tuned for performance
// and works in ways you may not expect. See https://github.com/uuidjs/uuid/pull/434
// `toLowerCase` used to fix memory issue caused by concatenation: https://bugs.chromium.org/p/v8/issues/detail?id=3175#c4 (I see this statement proven true: #267 (comment):
) |
@kkntl Good question! @LinusU , did you consider the memory consumption issue when putting up this PR? It looks like that Testing this, it does still appear to be an issue. It's worth noting that if this is a regression - as it seems to be - the impact is mitigated somewhat by #600, which short-circuits this project's use of v4() in preference for Running this script both with and without the const {v4} = require('uuid');
// Using empty `options` to prevent `uuid` from short-circuiting to
// `crypto.randomUUID()`, which isn't what we want to test
const opts = {}; // empty `options` to
// Call v4() once to prime entropy cache
v4(opts);
// Create memory snapshot here, prior to creating 10K IDs
//
const ids = [];
for (let i = 0; i < 10000; i++) ids.push(v4(opts));
// Create memory snapshot here, after creating 10K IDs, and
// use the "comparison" view in Chrome debugger to inspect
// the difference.
console.log('done'); With code on Adding |
It sounds like we should bring this back and improve the comment? |
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [uuid](https://github.com/uuidjs/uuid) | dependencies | major | [`^9.0.0` -> `^10.0.0`](https://renovatebot.com/diffs/npm/uuid/9.0.1/10.0.0) | --- ### Release Notes <details> <summary>uuidjs/uuid (uuid)</summary> ### [`v10.0.0`](https://github.com/uuidjs/uuid/blob/HEAD/CHANGELOG.md#1000-2024-06-07) [Compare Source](uuidjs/uuid@v9.0.1...v10.0.0) ##### ⚠ BREAKING CHANGES - update node support (drop node@12, node@14, add node@20) ([#​750](uuidjs/uuid#750)) ##### Features - support support rfc9562 MAX uuid (new in RFC9562) ([#​714](uuidjs/uuid#714)) ([0385cd3](uuidjs/uuid@0385cd3)) - support rfc9562 v6 uuids ([#​754](uuidjs/uuid#754)) ([c4ed13e](uuidjs/uuid@c4ed13e)) - support rfc9562 v7 uuids ([#​681](uuidjs/uuid#681)) ([db76a12](uuidjs/uuid@db76a12)) - update node support matrix (only support node 16-20) ([#​750](uuidjs/uuid#750)) ([883b163](uuidjs/uuid@883b163)) - support rfc9562 v8 uuids ([#​759](uuidjs/uuid#759)) ([35a5342](uuidjs/uuid@35a5342)) ##### Bug Fixes - revert "perf: remove superfluous call to toLowerCase ([#​677](uuidjs/uuid#677))" ([#​738](uuidjs/uuid#738)) ([e267b90](uuidjs/uuid@e267b90)) </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or rename PR to start with "rebase!". 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
From the discussion and experimentation in #676 I found that we are running
toLowerCase
on every generated id instead of once when constructingbyteToHex
.In fact,
toString(16)
seems to already return lower case, but I cannot be sure that this is guaranteed by the spec after reading it, and since it doesn't do any harm I kept it:https://tc39.es/ecma262/multipage/ecmascript-data-types-and-values.html#sec-numeric-types-number-tostring
npm run test:benchmark
is returning some strange results though:uuid.v1
is much faster, butuuid.stringify
is slower? 🤔If I benchmark this with
hyperfine
(see #676 for details) then thisunsafeStringify
is 1.66 times faster than the one currently inmain
.The only thing I can think would cause
stringify
to be slower, would be if V8 sets some kind of "is ascii" or similar flag on the string when runningtoLowerCase
, and then uses this flag to chose another regex engine 🤷♂️Indeed, adding
toLowerCase
to thevalidate
function (before passing it to an case-insensitive regex 😅) brings the speed of theuuid.stringify()
up tox 4,465,165 ops/sec ±0.12% (98 runs sampled)
again.Hyperfine seems to support this:
lc-1.js
lc-2.js
(diff)I've created a test case that shows the current code we have, the changes proposed in this PR, and a third option that adds
.toLowerCase
only in the "safe"stringify
function.https://jsbench.me/u4lcs1i0oy/1
In Safari, the solution here faster. But on V8, probably due to the quirk mentioned above, calling
toLowerCase
before passing toRegex#test
is faster, so both 1 & 3 is faster.I'm not really sure what is best to optimize for, but this PR should improve performance on every runtime for the
v1
,v3
,v4
, andv5
functions. If we on top of that want to insert a weird optimization (callingtoLowerCase
beforeRegex#test
) just to optimize on V8, that's fine with me.Even without the performance benefits, I also think that this code change makes sense. Why lowercase the result every time, instead of building it from parts with the correct case from the beginning?