fs: synchronize close with other I/O for streams #30837
LGTM makes sense as long as it's considered semver-minor.
This would become unnecessary and reverted through #29656.
lib/internal/fs/streams.js
Outdated
fs.write(this.fd, data, 0, data.length, this.pos, (er, bytes) => {
  // Return early if this stream has been destroyed. The close() call inside
  // _destroy() may cause errors when writing and we don't want to emit those.
  if (this.destroyed) return cb();
We had a long conversation about swallowing errors after destroy() here, #29197. The consensus was that we should not swallow until after 'close'.
@ronag Thinking about this more, the |
Force-pushed from 5eba809 to 2a723ac.
@ronag PTAL
SGTM with @ronag's suggestions.
LGTM
Should we apply the same fixes to |
lib/internal/fs/streams.js
Outdated
@@ -339,7 +356,17 @@ WriteStream.prototype._write = function(data, encoding, cb) {
    });
  }

  if (this.destroyed) return cb();
I think this might need to be a cb(new ERR_STREAM_DESTROYED('write'))
Why? There hasn’t been any error, has there?
The write hasn't completed. If we don't send an error here the caller would think the write has completed even though it hasn't.
See, https://github.com/nodejs/node/blob/master/lib/_stream_writable.js#L821
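The concern above can be illustrated with a small sketch. It does not use the real `fs.WriteStream`; `createSink`, `runPendingIO`, and `reportDestroyed` are hypothetical names, and the "I/O" only completes when `runPendingIO()` is called so that the ordering write, destroy, completion is explicit. With a silent `cb()` the caller cannot tell that the chunk was dropped.

```javascript
// Hypothetical stand-in for a file stream; NOT the real fs.WriteStream.
// Pending "I/O" completes only when runPendingIO() is called.
function createSink(reportDestroyed) {
  const pending = [];
  const sink = {
    destroyed: false,
    written: [],
    write(chunk, cb) {
      pending.push(() => {
        if (sink.destroyed) {
          // The chunk is dropped here. Either report that, or stay silent.
          cb(reportDestroyed ?
            new Error('stream destroyed before write finished') : null);
          return;
        }
        sink.written.push(chunk);
        cb(null);
      });
    },
    destroy() {
      sink.destroyed = true;
    },
    runPendingIO() {
      while (pending.length > 0) pending.shift()();
    },
  };
  return sink;
}

// Issue a write, destroy the sink before the write completes,
// then let the pending I/O callback run.
function demo(reportDestroyed) {
  const sink = createSink(reportDestroyed);
  let result;
  sink.write('hello', (err) => {
    result = err ? 'failed' : 'ok';
  });
  sink.destroy();
  sink.runPendingIO();
  return { result, written: sink.written };
}

const silent = demo(false);   // callback sees no error, yet nothing was written
const reported = demo(true);  // callback is told the write did not happen
console.log(silent, reported);
```

In both runs `written` stays empty; only the second variant lets the caller find out.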
I’m worried that introducing an error when there previously was none (or at least not usually) would be semver-major, and I’d prefer to keep this PR as close to just being a fix for the bug as possible.
It's only if called with a callback (which is very unusual) in which case it's actually a bug if it's not an error.
Though if you are worried about it I guess we can leave it as is. It's an unusual edge case after all. Could we at least have a separate semver-major PR for "correct" behaviour?
Since this is a correctness issue I don't think it needs to be semver-major?
Maybe it doesn’t need to be, but I still feel like these are two very different things…
- This PR addresses a race condition that can occur randomly and surprisingly. Fixing it removes unexpected errors.
- Adding an error to the callback makes behaviour more consistent, but it would add unexpected errors.
I’d rather not mix the two, and I think we’ve treated other situations where we add errors for consistency as semver-major in the past (and I don’t really see any reason not to do that here, too).
I don't think that follows, does it? Right now you'll receive an EBADF error event. While the EBADF error is, er, in error, it at least tells you that the write didn't go through.
(I suppose it could also end up writing to a different file if the fd was reopened in the mean time, which of course is - edit: a lot - worse than what this PR does.)
I guess my reasoning is mostly that this is only relevant when the stream has already been destroyed at this point, and so it should be expected that writes may not finish?
If you feel strongly, I’ll apply @ronag’s suggestion, but I’m still a bit worried about breakage.
The way I see it is that you already receive unpredictable EBADF errors now. A predictable error is better, and certainly better than silently dropping data on the floor.
Another way of looking at it: how likely is this change to break existing, functionally correct code? I expect the answer is 'close to zero' - any code that breaks was probably already broken, just not reliably so.
Does that sound reasonable?
I’ve pushed a commit with the suggestion … still feeling a bit worried about it but we’ll see if this is problematic
_write and _read can be called from 'connect' after Socket.destroy() has been called. This should be a noop. Refs: nodejs#30837
lib/internal/fs/streams.js
Outdated
@@ -339,7 +356,17 @@ WriteStream.prototype._write = function(data, encoding, cb) {
    });
  }

  if (this.destroyed) return cb();
I share @ronag's concern though: it's tantamount to silently ignoring the write request from the user.
Since this is a correctness issue I don't think it needs to be semver-major?
fs.read(this.fd, pool, pool.used, toRead, this.pos, (er, bytesRead) => {
  this[kIsPerformingIO] = false;
  // Tell ._destroy() that it's safe to close the fd now.
  if (this.destroyed) return this.emit(kIoDone, er);
This is observable when emit() is monkey-patched, which isn't entirely uncommon. Not a reason per se not to introduce this pattern (it's pretty elegant) but I thought I'd point it out anyway.
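A minimal sketch of what "observable" means here, assuming a plain `EventEmitter` rather than an fs stream. The local `kIoDone` symbol only stands in for the internal one, and `seen` is a hypothetical log: a patched `emit()` is invoked for every event name, including symbols that user code could not otherwise subscribe to.

```javascript
const { EventEmitter } = require('events');

// Stand-in for the internal symbol; user code normally has no access to it.
const kIoDone = Symbol('kIoDone');

const emitter = new EventEmitter();
const seen = [];

// Monkey-patch emit() on this instance, as some instrumentation code does.
const originalEmit = emitter.emit;
emitter.emit = function(name, ...args) {
  seen.push(name);
  return originalEmit.call(this, name, ...args);
};

emitter.emit('data', 'chunk');
emitter.emit(kIoDone); // the internal event also passes through the patch

console.log(seen.length, seen[1] === kIoDone);
```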
Just a thought. What if for whatever reason the io doesn’t complete? Do we need a timeout? Or does libuv handle that?
@ronag In that case, this PR delays the |
I would like to ask for @mcollina's take on this before merging. See, #30864 (comment).
Part of the flakiness in the parallel/test-readline-async-iterators-destroy test comes from fs streams starting `_read()` and `_destroy()` without waiting for the other to finish, which can lead to the `fs.read()` call resulting in `EBADF` if timing is bad. Fix this by synchronizing write and read operations with `close()`. Refs: nodejs#30660
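The synchronization described in the commit message can be sketched roughly as follows. This is not the code from lib/internal/fs/streams.js: the `Resource` class, its `log` array, and the manually triggered read completion are assumptions made for illustration. Only the `kIsPerformingIO` flag and the `kIoDone` event mirror the names visible in the diff above.

```javascript
const { EventEmitter } = require('events');

const kIsPerformingIO = Symbol('kIsPerformingIO');
const kIoDone = Symbol('kIoDone');

// Hypothetical resource standing in for an fs stream.
class Resource extends EventEmitter {
  constructor() {
    super();
    this.destroyed = false;
    this.closed = false;
    this.log = [];
    this[kIsPerformingIO] = false;
  }

  // Starts a "read" and returns a function that plays the role of the
  // fs.read() callback, so the caller decides when the I/O completes.
  read() {
    this[kIsPerformingIO] = true;
    return () => {
      this[kIsPerformingIO] = false;
      // Tell destroy() that it is safe to close now.
      if (this.destroyed) {
        this.emit(kIoDone);
        return;
      }
      this.log.push('data');
    };
  }

  destroy() {
    this.destroyed = true;
    if (this[kIsPerformingIO]) {
      // Defer close() until the in-flight operation has finished.
      this.once(kIoDone, () => this.close());
    } else {
      this.close();
    }
  }

  close() {
    this.closed = true;
    this.log.push('close');
  }
}

const r = new Resource();
const completeRead = r.read();
r.destroy();
const closedWhileReading = r.closed; // close is still pending here
completeRead();
const closedAfterRead = r.closed;    // close ran once the read finished
console.log(closedWhileReading, closedAfterRead, r.log);
```

Because `close()` never runs while a read is in flight, the read cannot hit an already closed descriptor, which is the `EBADF` case the commit message refers to.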
Force-pushed from 1c9c40f to b1f7bf0.
@ronag I guess we can do that, but at the same time I think we should fix this bug.
Given @mcollina's answer in the linked comment I'm not sure he would agree with this PR. Another valid (?) solution is also to just add |
@ronag Let’s wait for him to comment. In my opinion |
I mostly agree. My personal concern is whether this fix should apply everywhere (net, http, http2, quic etc...) or whether fs is somehow an edge case?
This is a bit strange for me though. Shouldn't the iterator be released once leaving the The readable stream will of course still |
I'm a bit conflicted by this change. I think we should consider making the callback of What do you think? |
LGTM
Yes, that's a bit contradictory. I guess it is "safe" to wait for I/O if we're under the assumption that it will always complete (or fail) within reasonable time, otherwise we might end up with a stuck stream without means to abort it, e.g. a socket trying to write to a server which is bugged/crashed/corrupt? It might be a case by case basis. In
I think it should be part of the official API. Not sure how that helps us here though? Also, before making it public we should probably ensure the |
I agree.
I think documenting that is enough.
I’m not sure if that’s always the case, but here it’s definitely problematic. Getting
I’d be okay with that, yes 👍
Part of the flakiness in the parallel/test-readline-async-iterators-destroy test comes from fs streams starting `_read()` and `_destroy()` without waiting for the other to finish, which can lead to the `fs.read()` call resulting in `EBADF` if timing is bad. Fix this by synchronizing write and read operations with `close()`. Refs: #30660 PR-URL: #30837 Reviewed-By: Luigi Pinca <luigipinca@gmail.com> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: Rich Trott <rtrott@gmail.com>
Landed in 8a5c7f6
Part of the flakiness in the parallel/test-readline-async-iterators-destroy test comes from fs streams starting `_read()` and `_destroy()` without waiting for the other to finish, which can lead to the `fs.read()` call resulting in `EBADF` if timing is bad.
Fix this by synchronizing write and read operations with `close()`.
Refs: #30660
/cc @ronag
Checklist
`make -j4 test` (UNIX), or `vcbuild test` (Windows) passes