buffer: Reallocate() instead of creating a new backing store #53552

mildsunrise · 2024-06-22T22:40:49Z

in Buffer::New(..., node::encoding), when the prediction returned by StringBytes::Size() doesn't match the actual size emitted by StringBytes::Write(), we currently create a second backing store and then memcpy() to it.

AFAIU, we can instead reallocate the backing store into the new size, which should contract the allocation if possible.

this is especially relevant for base64 (see discussion at #53550), which since #52428 overestimates fairly often.
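For readers following along, the two strategies described above can be sketched with plain C allocation calls. This is an illustration only: it assumes `std::malloc`/`std::realloc` as stand-ins for the backing-store allocator, and the helper names `ShrinkByCopy` and `ShrinkByRealloc` are hypothetical, not Node.js or V8 functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Strategy 1 (models "create a second backing store and memcpy() to it"):
// allocate a second block of the actual size, copy into it, free the first.
// Takes ownership of `buf`. Returns nullptr if the new allocation fails.
unsigned char* ShrinkByCopy(unsigned char* buf, std::size_t actual) {
  assert(buf != nullptr && actual > 0);
  unsigned char* out = static_cast<unsigned char*>(std::malloc(actual));
  if (out != nullptr) std::memcpy(out, buf, actual);
  std::free(buf);
  return out;
}

// Strategy 2 (models "reallocate the backing store into the new size"):
// ask the allocator to resize the existing block, which may shrink it in
// place. If realloc fails it returns nullptr and `buf` remains valid, so the
// caller still owns `buf` in that case.
unsigned char* ShrinkByRealloc(unsigned char* buf, std::size_t actual) {
  assert(buf != nullptr && actual > 0);
  return static_cast<unsigned char*>(std::realloc(buf, actual));
}
```

Both helpers leave the first `actual` bytes intact; whether the second one avoids a copy depends on the allocator, which is free to move the block.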

nodejs-github-bot · 2024-06-22T22:49:46Z

CI: https://ci.nodejs.org/job/node-test-pull-request/59931/

nodejs-github-bot · 2024-06-22T22:49:56Z

CI: https://ci.nodejs.org/job/node-test-pull-request/59932/

mildsunrise · 2024-06-22T23:32:44Z

this seems to improve performance substantially when decoding large (order of MBs or more) inputs, but not that much in smaller inputs. but even then, it simplifies the code so I think it's valuable

lemire · 2024-06-23T00:34:53Z

I thought that's what we had, and then @targos deliberately reverted it.

#52292

Note that we should be concerned by the fact that if we are off by even just 1 byte, we create a new buffer and copy. Copies are cheap and that's never going to be a massive issue, but we are potentially wasting time for no good reason. I think that @joyeecheung raised this issue in a comment to #52292.

mildsunrise · 2024-06-23T07:27:49Z

ah, dang, I did not realize Reallocate() was deprecated... sad. that'll teach me to always look at the history before trying stuff :)

mildsunrise · 2024-06-23T07:41:06Z

there is something I don't understand, though. #52234 claims there is no longer a performance benefit since we are no longer overriding the allocator's reallocation... but the performance increase I observed (in main) was too large to have been a fluke 🤔

joyeecheung · 2024-06-23T11:37:29Z

Just a guess: the performance difference may be caused by the fact that when allocating a new backing store, V8 always zero-initializes the buffer even though its content (or most of it) will be overwritten by the copy later, whereas the realloc implementation in V8 only allocates an uninitialized buffer first, copies the data and, if the new length is bigger, zero-initializes the uncopied tail.
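The difference guessed at above can be sketched as follows. This is a model of the two paths as described in the comment, not V8's actual code: `std::calloc` stands in for a zero-initializing allocation, `std::malloc` for an uninitialized one, and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// "New backing store" path: the whole block is zero-filled first, then the
// copy overwrites the first min(old_len, new_len) bytes a second time.
unsigned char* NewZeroedThenCopy(const unsigned char* src, std::size_t old_len,
                                 std::size_t new_len) {
  assert(src != nullptr && new_len > 0);
  unsigned char* dst = static_cast<unsigned char*>(std::calloc(new_len, 1));
  if (dst == nullptr) return nullptr;
  std::memcpy(dst, src, std::min(old_len, new_len));
  return dst;
}

// "Realloc" path: uninitialized block, copy, then zero only the tail that the
// copy did not cover (which is empty when shrinking).
unsigned char* UninitCopyThenZeroTail(const unsigned char* src,
                                      std::size_t old_len,
                                      std::size_t new_len) {
  assert(src != nullptr && new_len > 0);
  unsigned char* dst = static_cast<unsigned char*>(std::malloc(new_len));
  if (dst == nullptr) return nullptr;
  const std::size_t copied = std::min(old_len, new_len);
  std::memcpy(dst, src, copied);
  if (new_len > copied) std::memset(dst + copied, 0, new_len - copied);
  return dst;
}
```

Both functions produce identical contents; the second simply never writes any byte twice, which is the saving the guess refers to when shrinking.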

joyeecheung · 2024-06-23T11:43:55Z

Also, in a micro-benchmark scenario, repeatedly allocating a new backing store can trigger the GC a lot more eagerly than simply reallocating it (the latter isn’t incorporated into the GC schedule as much AFAICT), though technically that would be a pitfall of the microbenchmark itself.

mildsunrise · 2024-06-23T13:37:36Z

> Just a guess: the performance difference may be caused by the fact that when allocating a new backing store, V8 always zero-initializes the buffer even though its content (or most of it) will be overwritten by the copy later, whereas the realloc implementation in V8 only allocates an uninitialized buffer first, copies the data and, if the new length is bigger, zero-initializes the uncopied tail.

yeah, I thought about that memset but it doesn't seem to be supported by profiling...

> Also, in a micro-benchmark scenario, repeatedly allocating a new backing store can trigger the GC a lot more eagerly than simply reallocating it (the latter isn’t incorporated into the GC schedule as much AFAICT), though technically that would be a pitfall of the microbenchmark itself.

so you're saying that even if the reallocation is implemented as allocate + copy, it is still treated differently in terms of GC?

joyeecheung · 2024-06-23T14:07:25Z

Yes, because the GC scheduling code is in V8. A Realloc call from Node.js’s side doesn’t affect GC much AFAICT, whereas NewBackingStore does.

My comment in the other PR was about certain use cases where the buffer doesn’t need to be allocated through the array buffer allocator and can just be allocated normally. IIUC we sometimes prefer to allocate native memory using the array buffer allocator so that, when the amount of external memory in use is high enough, V8 performs a GC to hopefully reduce the memory footprint (and if that external memory is set up to be released when some JS values get GC’ed, the GC would do the job). However, this setup doesn’t always do more good than harm. Performance-wise, it may not be bad to manage the external memory directly using stdlib calls and use Isolate::AdjustAmountOfExternalAllocatedMemory() to tune GC instead of going through the array buffer allocator machinery. However, I don’t think this applies to Buffers, because they are Uint8Arrays now and are always backed by real array buffers anyway.

nodejs-github-bot added the buffer, c++, and needs-ci labels Jun 22, 2024

buffer: call Reallocate() instead of creating new backing store

3c2c597

mildsunrise force-pushed the buffer-new-reallocate branch from a647a40 to 3c2c597 June 22, 2024 22:43

mildsunrise added the request-ci label Jun 22, 2024

github-actions bot removed the request-ci label Jun 22, 2024

anonrig approved these changes Jun 22, 2024

mildsunrise marked this pull request as ready for review June 22, 2024 23:33

anonrig added the needs-benchmark-ci and author ready labels Jun 22, 2024

mildsunrise requested review from addaleax and targos June 23, 2024 00:17

mildsunrise closed this Jun 23, 2024
