Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs: added _writev (up to x50 faster writes) #2167

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 43 additions & 0 deletions lib/fs.js
Original file line number Diff line number Diff line change
Expand Up @@ -1867,6 +1867,49 @@ WriteStream.prototype._write = function(data, encoding, cb) {
};


function writev(fd, chunks, position, callback) {
function wrapper(err, written) {
// Retain a reference to chunks so that they can't be GC'ed too soon.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain why this is necessary, if you don't mind me asking?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm following the other examples in fs.js that have this pattern. I did not invent this to be honest.
My guess is that not having the chunks (that contain buffers) in this closure would GC and free the buffers before libuv has a chance to access and use them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct (afaik). Buffers allocate their own memory and delimit deallocating to v8, which doesn't take into account the event loop. Relevant: f9b7714

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I am a culprit of propagating this practice. Best I can tell it originated with e7604b1. I then furthered what had been done in commits like 7ca77ea. In retrospect I should have changed this when I introduced FSReqWrap() in 819690f.

Don't worry about changing it for this PR though. There should be a single clean commit that addresses just this issue. I'll put that change on my list.

callback(err, written || 0, chunks);
}

const req = new FSReqWrap();
req.oncomplete = wrapper;
binding.writeBuffers(fd, chunks, position, req);
};

WriteStream.prototype._writev = function(data, cb) {
if (typeof this.fd !== 'number')
return this.once('open', function() {
this._writev(data, cb);
});

const self = this;
const len = data.length;
const chunks = new Array(len);
var size = 0;

for (var i = 0; i < len; i++) {
var chunk = data[i].chunk;

chunks[i] = chunk;
size += chunk.length;
}

writev(this.fd, chunks, this.pos, function(er, bytes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a reference to chunks beyond this. In the case not all the data chunks can be flushed immediately, a reference to this object needs to be kept around until the callback is called.

/cc @bnoordhuis

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a reference inside of writev (which is still a JS function, not C-land). Scroll up to line 1873 for that reference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed that. Thanks.

if (er) {
self.destroy();
return cb(er);
}
self.bytesWritten += bytes;
cb();
});

if (this.pos !== undefined)
this.pos += size;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the write fails? Is changing pos okay even in that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't actually change this logic. It was already like this. However, I think it's a safe operation. The most important thing of using pos is that the next call starts at the right offset. If the write fails, there will be an error and the stream will be destroyed anyway.

};


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra newline

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rule is 1?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite, usually two spaces are used to separate unrelated function declarations. It is 3+ where the linter complains.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the current 2 is fine then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, 2 is fine.

WriteStream.prototype.destroy = ReadStream.prototype.destroy;
WriteStream.prototype.close = ReadStream.prototype.close;

Expand Down
55 changes: 55 additions & 0 deletions src/node_file.cc
Original file line number Diff line number Diff line change
Expand Up @@ -908,6 +908,60 @@ static void WriteBuffer(const FunctionCallbackInfo<Value>& args) {
}


// Wrapper for writev(2).
//
// bytesWritten = writev(fd, chunks, position, callback)
// 0 fd integer. file descriptor
// 1 chunks array of buffers to write
// 2 position if integer, position to write at in the file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change at in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could, but it is a literal copy of the comment above writeBuffer which I didn't author. Would you still want me to fix it, and if so, in both places?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really see how the English in this line is incorrect btw (disclaimer: not a native English speaker).

// if null, write from the current position
static void WriteBuffers(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);

CHECK(args[0]->IsInt32());
CHECK(args[1]->IsArray());

int fd = args[0]->Int32Value();
Local<Array> chunks = args[1].As<Array>();
int64_t pos = GET_OFFSET(args[2]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the comment above this function, this could be null as well right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. In fact, that's the common case as far as I can tell. GET_OFFSET returns -1 in that case.

Local<Value> req = args[3];

uint32_t chunkCount = chunks->Length();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra newlines between declarations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is more readable, personally.

uv_buf_t s_iovs[1024]; // use stack allocation when possible
uv_buf_t* iovs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC valgrind will complain when attempting any operation on an uninitialized variable. Mind just setting this = nullptr;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We immediately assign in the next statements. Would valgrind treat the current code any different from the following?

uv_buf_t* iovs = chunkCount > ARRAY_SIZE(s_iovs) ? new uv_buf_t[chunkCount] : s_iovs;

Valgrind complaining about reads, sure. But that doesn't happen here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my fail. shouldn't review while sleep deprived and at a conference. I traced the logic paths incorrectly. Your code is good.


if (chunkCount > ARRAY_SIZE(s_iovs))
iovs = new uv_buf_t[chunkCount];
else
iovs = s_iovs;

for (uint32_t i = 0; i < chunkCount; i++) {
Local<Value> chunk = chunks->Get(i);

if (!Buffer::HasInstance(chunk)) {
if (iovs != s_iovs)
delete[] iovs;
return env->ThrowTypeError("Array elements all need to be buffers");
}

iovs[i] = uv_buf_init(Buffer::Data(chunk), Buffer::Length(chunk));
}

if (req->IsObject()) {
ASYNC_CALL(write, req, fd, iovs, chunkCount, pos)
if (iovs != s_iovs)
delete[] iovs;
return;
}

SYNC_CALL(write, nullptr, fd, iovs, chunkCount, pos)
if (iovs != s_iovs)
delete[] iovs;
args.GetReturnValue().Set(SYNC_RESULT);
}


// Wrapper for write(2).
//
// bytesWritten = write(fd, string, position, enc, callback)
Expand Down Expand Up @@ -1249,6 +1303,7 @@ void InitFs(Handle<Object> target,
env->SetMethod(target, "readlink", ReadLink);
env->SetMethod(target, "unlink", Unlink);
env->SetMethod(target, "writeBuffer", WriteBuffer);
env->SetMethod(target, "writeBuffers", WriteBuffers);
env->SetMethod(target, "writeString", WriteString);

env->SetMethod(target, "chmod", Chmod);
Expand Down