Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"readable" event behaving inconsistently #18058

Closed
mkartashov opened this issue Jan 9, 2018 · 6 comments
Closed

"readable" event behaving inconsistently #18058

mkartashov opened this issue Jan 9, 2018 · 6 comments

Comments

@mkartashov
Copy link

mkartashov commented Jan 9, 2018

  • Node.js Version: 9.3.0 (and most probably earlier ones)
  • OS: Linux 64 bits
  • Scope (install, code, runtime, meta, other?): runtime
  • Module (and version) (if relevant): stream [, http]

From nodejs docs (https://nodejs.org/api/stream.html#stream_event_readable):

Note: In general, the readable.pipe() and 'data' event mechanisms are easier to understand than the 'readable' event. However, handling 'readable' might result in increased throughput.

I decided to try this increased throughput. However, I ran into an issue of compatibility of 'readable' with 'data', as well as inconsistent behaviour of 'readable' on its own, in the case of http responses.

Issue

From https://nodejs.org/api/stream.html#stream_event_readable:

The 'readable' event is emitted when there is data available to be read from the stream. In some cases, attaching a listener for the 'readable' event will cause some amount of data to be read into an internal buffer.
The 'readable' event will also be emitted once the end of the stream data has been reached but before the 'end' event is emitted.
Effectively, the 'readable' event indicates that the stream has new information: either new data is available or the end of the stream has been reached. In the former case, stream.read() will return the available data. In the latter case, stream.read() will return null.

However, it seems that if 'readable' is used in conjunction with 'data', then it emits false-positive end-of-stream events (i.e. where the stream.read() is null), along with no useful data (all useful data was consumed by 'data' handler).

If it's used on its own, without 'data' handler, it fails to detect the end-of-stream event in case when the stream is not delayed and short (i.e. when it's consumed in one go).

How to reproduce

Code

Here is a simple node snippet that creates a simple http server with a simple handler for 3 urls for different cases and then sends different requests in chain. In order not to introduce other unrelated errors (like ParseError on TCP.onread or Bad Request responses), all content lengths were carefully provided for each request/response. The code is lengthy but simple (with hooks for envvars for easy changes).

The http server serves 3 types of responses: immediate short, immediate appended and delayed appended.

const PORT = 8080;
const http = require('http');
const HelloWorld = "Hello World!";
const HelloAgainLater = "Hello again later!";
const HelloAgainNow = "Hello again now!";
const HelloServer = "Hello, server!";

// these control whether you attach a data/readable handler to the response
const DataHandler = !process.env.NO_DATA;
const ReadableHandler = !process.env.NO_READABLE

var server = http.createServer((req, res) => {
    console.log("Received", req.url);
    if(req.url == "/delay"){
        res.writeHead(200, {"Content-Length": "" + (HelloWorld.length + HelloAgainLater.length)});
        res.write(HelloWorld);
        setTimeout(() => { 
            res.end(HelloAgainLater)
        }, 1000);
    } else if(req.url == "/more"){
        res.writeHead(200, {"Content-Length": "" + (HelloWorld.length + HelloAgainNow.length)});
        res.write(HelloWorld);
        res.end(HelloAgainNow);
    } else {
        res.writeHead(200, {"Content-Length": "" + (HelloWorld.length)});
        res.write(HelloWorld);
        res.end();
    }
}).listen(PORT);


var request = function(path){
    var opts = {
        hostname: "localhost",
        port: PORT,
        path: path,
        headers: {
            "Content-Length": "" + HelloServer.length,
        }
    };

    return new Promise((resolve) => {
        console.log("Sending ", path);
        var req = http.request(opts, (response) => {
            response.on("error", (err) => {
                console.log("error", err);
            });
            if(ReadableHandler){
                response.on("readable", () => {
                    var data = response.read();
                    if(data){
                        data = data.toString();
                    }
                    console.log("readable", data);
                });
            }
            
            if(DataHandler){
                response.on("data", (data) => {
                    if(data){
                        data = data.toString();
                    }
                    console.log("data", data);
                });
            }

            response.on("end", () => {
                console.log("end\n");
                resolve();
            });
        });
        req.write(HelloServer);
        req.end();
    });
};

var r = request("/").then(() => {
    return request("/more");
}).then(() => {
    return request("/delay");
}).then(() => {
    server.close();
});

How to run

  1. For both 'data' and 'readable' events:
    $ cat test.js | docker run -i node:9.3.0

  2. For only 'data' event:
    $ cat test.js | docker run -e NO_READABLE=true -i node:9.3.0

  3. For only 'readable' event:
    $ cat test.js | docker run -e NO_DATA=true -i node:9.3.0

Outputs

Case 1 (both handlers)

Sending  /
Received /
data Hello World!
readable null
end

Sending  /more
Received /more
data Hello World!Hello again now!
readable null
end

Sending  /delay
Received /delay
data Hello World!
readable null
data Hello again later!
readable null
end

Notice the first readable null in Sending /delay that tells us the stream is finished yet it's not.

Case 2 (just 'data' event)

Received /
data Hello World!
end

Sending  /more
Received /more
data Hello World!Hello again now!
end

Sending  /delay
Received /delay
data Hello World!
data Hello again later!
end

No problems with 'data', as expected.

Case 3 (only 'readable' event)

Sending  /
Received /
readable Hello World!
end

Sending  /more
Received /more
readable Hello World!Hello again now!
end

Sending  /delay
Received /delay
readable Hello World!
readable Hello again later!
readable null
end

Notice the absence of any readable lines with null in Sending / and Sending /more.

@seeker89
Copy link

seeker89 commented Jan 9, 2018

I can confirm I am seeing the same behaviour. Which one is wrong - the docs, or the implementation ?

@mcollina
Copy link
Member

mcollina commented Feb 1, 2018

I think what you are experiencing with both handlers is a doc problem.

The docs reports:

 In flowing mode, readable.read() is called automatically until the internal buffer is fully drained.

Flowing mode happens when you are adding a on('data')  handler.

'readable' returning null does not mean the stream has ended, it means that the buffer is empty.

Would you like to send a PR with a doc update?

@mkartashov
Copy link
Author

hey @mcollina , thank you for your reply and sorry that I disappeared for 20 days 😆

I would gladly submit a PR clarifying the behaviour with both handlers! Hopefully it will be much quicker.

However what I still didn't get is that why 'readable' is not returning the last null before ending in my case? I've had a look through the IncomingMessage implementation in _http_incoming, but it seems fine and fairly standard without too many weird quirks; and I could not yet reproduce the same "no-null" behaviour with other streams (I tried fs so far though; I will try something else as well).

@mkartashov
Copy link
Author

I think I might be onto something.

If I implement my own simple stream:

const stream = require("stream");

var xx = stream.Duplex();
var things = [];

xx._read = function(){
    return xx.push(things.pop() || null );
}

xx._write = function(d){
    return things.push(d);
};


xx.on('readable', () => {
    var d = xx.read();
    console.log("readable", d && d.length);
});
xx.on('end', () => {
  console.log('end');
});

xx.write("lol", "utf8");

and run this I will get proper output:

readable 3
readable null
end

However if I remove || null then I will get only readable 3 as the only line in the output.

What I have found is that http response overrides both .read() and ._read() methods (https://github.com/nodejs/node/blob/master/lib/_http_incoming.js#L91 and https://github.com/nodejs/node/blob/master/lib/_http_incoming.js#L100), and it doesn't seem to return null from the latter.

Since IncomingMessage.read() calls the Stream.Readable.prototype.read anyway, it may be very important for Stream.read() to receive null from the Stream._read() - https://github.com/nodejs/node/blob/master/lib/_stream_readable.js#L456 .

Fine, the comment says that most of the job will be done by parserOnBody, but even there it seems to be calling stream.push only when len > 0 -
https://github.com/nodejs/node/blob/master/lib/_http_common.js#L122

I have a hunch that I'm onto something, but I can't quite prove or refute.

But it seems I decoupled my 'readable' issue from the stream module; it seems there could be a problem in http module.

Do you think you might have some ideas how to finish the case? :)

@mcollina
Copy link
Member

I've been pretty puzzled by this bug. #18939 removes those, and the tests are not failing.

Can you check if that solves your problem?

@mcollina
Copy link
Member

#18994 fixes it, unfortunately it's a semver-major change.

mcollina added a commit to mcollina/node that referenced this issue Apr 4, 2018
In Streams3 the 'readable' event/.read() method had a lower precedence
than the `'data'` event that made them impossible to use them together.
This make `.resume()` a no-op if there is a listener for the
`'readable'` event, making the stream non-flowing if there is a
`'data'`  listener.

Fixes: nodejs#18058
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants