Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream: improve performance for sync write finishes #30710

Closed
wants to merge 2 commits into from

Conversation

addaleax
Copy link
Member

Improve performance and reduce memory usage when a writable stream
is written to with the same callback (which is the most common case)
and when the write operation finishes synchronously (which is also
often the case).

                                                     confidence improvement accuracy (*)    (**)   (***)
streams/writable-manywrites.js sync='no' n=2000000                  0.99 %       ±3.20%  ±4.28%  ±5.61%
streams/writable-manywrites.js sync='yes' n=2000000        ***    710.69 %      ±19.65% ±26.47% ±35.09%

Refs: #18013
Refs: #18367

@nodejs/streams @ronag

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines

Improve performance and reduce memory usage when a writable stream
is written to with the same callback (which is the most common case)
and when the write operation finishes synchronously (which is also
often the case).

                                                         confidence improvement accuracy (*)    (**)   (***)
    streams/writable-manywrites.js sync='no' n=2000000                  0.99 %       ±3.20%  ±4.28%  ±5.61%
    streams/writable-manywrites.js sync='yes' n=2000000        ***    710.69 %      ±19.65% ±26.47% ±35.09%

Refs: nodejs#18013
Refs: nodejs#18367
@nodejs-github-bot nodejs-github-bot added the stream Issues and PRs related to the stream subsystem. label Nov 29, 2019
state.afterWriteTickInfo.count++;
} else {
state.afterWriteTickInfo = { count: 1, cb, stream, state };
process.nextTick(afterWriteTick, state.afterWriteTickInfo);
Copy link
Contributor

@mscdex mscdex Nov 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any difference in just using afterWrite directly here (process.nextTick(afterWrite, stream, ...))?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mscdex We need to allocate an object anyway so that we can modify count later, so that’s why it’s not just spreading the arguments right now

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

how did you find out about this problem? Was this tied with a specific use case? I'm almost never pass a callback to write.

@addaleax
Copy link
Member Author

how did you find out about this problem? Was this tied with a specific use case? I'm almost never pass a callback to write.

Well … Somebody asked for help privately because they were running a Node.js program, which consisted of a single synchronous loop and printed output using console.log() on each iteration, which lead to memory exhaustion after a few hours because the process.nextTick() queue was filling up 🙃

@ronag
Copy link
Member

ronag commented Nov 29, 2019

LGTM

Using the callback is very uncommon in my experience. I'm not sure the extra complexity is worth it? I'm a little worried about the maintenance cost (in general) of streams.

I would maybe instead consider looking into why console.log uses a callback and whether it's strictly necessary.

@ronag
Copy link
Member

ronag commented Nov 29, 2019

Actually looking into this further this is not an optimization just for the callback case, but in general when doing multiple write calls in the same tick.

lib/_stream_writable.js Outdated Show resolved Hide resolved
@addaleax
Copy link
Member Author

Using the callback is very uncommon in my experience. I'm not sure the extra complexity is worth it? I'm a little worried about the maintenance cost (in general) of streams.

Yeah, this also applies when no callback is passed -- that being said, I would understand if people were concerned about this working only for streams that potentially call write callbacks synchronously (although a number of builtin streams do that).

@nodejs-github-bot
Copy link
Collaborator

nodejs-github-bot commented Nov 29, 2019

CI: https://ci.nodejs.org/job/node-test-pull-request/27173/ (:white_check_mark:)

}
}
}

function afterWrite(stream, state, cb) {
function afterWriteTick({ stream, state, count, cb }) {
Copy link
Member

@ronag ronag Nov 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might clear the wrong object. I think clearing the count and cb of the passed object is safer then modifying state?

function afterWriteTick(info) {
  const { stream, state, count, cb } = info;
  info.cb = null;
  return afterWrite(stream, state, count, cb);

This would also allow reusing the object and avoiding allocations:

if (!state.afterWriteTickInfo || state.afterWriteTickInfo.cb) {
  state.afterWriteTickInfo = { stream, state, cb, count: 1 };
} else {
  state.afterWriteTickInfo.cb = cb;
  state.afterWriteTickInfo.count = 1;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it matter though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is for the row below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ronag So … the effect of setting afterWriteTickInfo to null is that the next time the code above is reached, a new process.nextTick() call with a new afterWriteTickInfo object is made. That’s always safe, right?

I think setting .cb to null would have the same effect, and .count is cleared anyway. I can do that instead, if you prefer, although it might screw with the map/hidden class of afterWriteTickInfo, as .cb is always a function right now.

Copy link
Member

@ronag ronag Nov 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more thinking of the case where you have two different cbs, e.g.

write('a', cba) // schedule tick a
write('b', cbb) // clear info a, schedule tick b

// ...

// tick a
// clear info b

// tick b
// clear nothing

The a tick will actually clear the info for the b tick.

Probably not a problem, but maybe a little weird... I don't have a strong opinion if you think it's fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might screw with the map/hidden class

Oh, I didn't know that null could cause a problems with that once it's been a function type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The a tick will actually clear the info for the b tick.

Probably not a problem, but maybe a little weird... I don't have a strong opinion if you think it's fine.

Yeah, I think that’s fine, because it would only make a difference if there’s a write('b', cbb) inside cba(), and that seems like a somewhat unlikely scenario, and even then it would only affect performance, not behaviour.

@ronag ronag mentioned this pull request Nov 30, 2019
4 tasks
@addaleax addaleax added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Nov 30, 2019
addaleax added a commit that referenced this pull request Dec 1, 2019
Improve performance and reduce memory usage when a writable stream
is written to with the same callback (which is the most common case)
and when the write operation finishes synchronously (which is also
often the case).

                                                         confidence improvement accuracy (*)    (**)   (***)
    streams/writable-manywrites.js sync='no' n=2000000                  0.99 %       ±3.20%  ±4.28%  ±5.61%
    streams/writable-manywrites.js sync='yes' n=2000000        ***    710.69 %      ±19.65% ±26.47% ±35.09%

Refs: #18013
Refs: #18367

PR-URL: #30710
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@addaleax
Copy link
Member Author

addaleax commented Dec 1, 2019

Landed in 2205f85

@addaleax addaleax closed this Dec 1, 2019
@addaleax addaleax deleted the writable-nocb branch December 1, 2019 01:14
targos pushed a commit that referenced this pull request Dec 1, 2019
Improve performance and reduce memory usage when a writable stream
is written to with the same callback (which is the most common case)
and when the write operation finishes synchronously (which is also
often the case).

                                                         confidence improvement accuracy (*)    (**)   (***)
    streams/writable-manywrites.js sync='no' n=2000000                  0.99 %       ±3.20%  ±4.28%  ±5.61%
    streams/writable-manywrites.js sync='yes' n=2000000        ***    710.69 %      ±19.65% ±26.47% ±35.09%

Refs: #18013
Refs: #18367

PR-URL: #30710
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@BridgeAR BridgeAR mentioned this pull request Dec 3, 2019
targos pushed a commit that referenced this pull request Dec 5, 2019
Improve performance and reduce memory usage when a writable stream
is written to with the same callback (which is the most common case)
and when the write operation finishes synchronously (which is also
often the case).

                                                         confidence improvement accuracy (*)    (**)   (***)
    streams/writable-manywrites.js sync='no' n=2000000                  0.99 %       ±3.20%  ±4.28%  ±5.61%
    streams/writable-manywrites.js sync='yes' n=2000000        ***    710.69 %      ±19.65% ±26.47% ±35.09%

Refs: #18013
Refs: #18367

PR-URL: #30710
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@BethGriggs BethGriggs mentioned this pull request Dec 9, 2019
MylesBorins pushed a commit that referenced this pull request Dec 17, 2019
Improve performance and reduce memory usage when a writable stream
is written to with the same callback (which is the most common case)
and when the write operation finishes synchronously (which is also
often the case).

                                                         confidence improvement accuracy (*)    (**)   (***)
    streams/writable-manywrites.js sync='no' n=2000000                  0.99 %       ±3.20%  ±4.28%  ±5.61%
    streams/writable-manywrites.js sync='yes' n=2000000        ***    710.69 %      ±19.65% ±26.47% ±35.09%

Refs: #18013
Refs: #18367

PR-URL: #30710
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@BethGriggs BethGriggs mentioned this pull request Dec 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants