Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request streams from http2 intermittantly not emitting end event #31309

Closed
DullReferenceException opened this issue Jan 11, 2020 · 48 comments
Closed

Comments

@DullReferenceException
Copy link

  • Version: v12.14.0
  • Platform: Linux
  • Subsystem: http2

In a nutshell, what we and others are seeing is a case where server-side code downloads bytes from a request using streaming mode, but no end event is received. The data events are firing fine, and all bytes are downloaded (a counter confirms that the number of bytes equals content-length). As you can imagine, this impacts things like middlewares which are waiting for the end event before continuing further processing. In our case, this surfaces as intermittent client-side timeouts and eventually the request is aborted.

We believe that others are seeing the same issue based on similar reports. See the following:

I have a way to reproduce... sort of. Essentially I create a secure http2 server and spam it with JSON post requests.

const { once } = require('events');
const { promises: { readFile } } = require('fs');
const { connect, createSecureServer } = require('http2');
const path = require('path');

const PORT = 8443;
const REQUEST_TIMEOUT = 10000;
const TIME_BETWEEN_REQUESTS = 100;
const REQUEST_PROCESSING_TIME = 100;

async function reproduce() {
  const server = await startServer();

  try {
    await Promise.race([
      requestSessionLoop(),
      requestSessionLoop()
    ]);
  } finally {
    server.close();
  }
}

async function startServer() {
  const [cert, key] = await Promise.all([
    readFile(path.resolve(__dirname, './certs/bugrepro.org.crt')),
    readFile(path.resolve(__dirname, './certs/bugrepro.org.key'))
  ]);

  const server = createSecureServer({ cert, key }, handleRequest);
  server
    .on('session', session => {
      console.log('Started new session', session);
    })
    .on('sessionError', err => {
      console.error('Got session error:', err.stack);
    })
  server.listen(PORT);

  await once(server, 'listening');
}

async function handleRequest(req, res) {
  const contentLength = req.headers['content-length'];
  const buffers = [];
  let bytesRead = 0;

  // Simulate processing delay
  //await delay(REQUEST_PROCESSING_TIME);

  req
    .on('data', chunk => {
      bytesRead += chunk.length;
      buffers.push(chunk);
    })
    .once('end', () => {
      try {
        const concatenated = buffers.join('');
        const body = JSON.parse(concatenated);
        console.log('Got request:', JSON.stringify(body));
        res.statusCode = 204;
        res.end();
      } catch (err) {
        res.statusCode = 500;
        res.end(err.stack);
      }
    })
    .once('aborted', () => {
      console.log(`Server received abort event from client. Bytes received: ${bytesRead} out of ${contentLength}.`);
      console.log('Request state:')
      console.dir(req);
    })
    .once('error', err => {
      console.error('Got error from request stream');
      res.statusCode = 500;
      res.end(err.stack);
    });
}

let sessionCounter = 0;

async function requestSessionLoop() {
  const ca = await readFile(path.resolve(__dirname, './certs/fakeca.crt'));
  const session = connect(`https://bugrepro.org:${PORT}`, { ca });
  const sessionId = ++sessionCounter;
  let requestCounter = 0;

  session
    .on('error', err => {
      console.error(`Received error from H2 session: ${err.stack}`);
    })
    .on('frameError', err => {
      console.error(`Received frame error from H2 session: ${err.stack}`);
    })
    .on('timeout', () => {
      console.log('Session timed out');
    })

  await once(session, 'connect');

  while (true) {
    // console.log(`Sending request #${++requestCounter}`);
    const stream = session.request({ ':method': 'POST', ':path': '/', 'content-type': 'application/json' });
    stream.setTimeout(REQUEST_TIMEOUT);
    stream.end(JSON.stringify({ sessionId, request: ++requestCounter, random: Math.random(), some: { arbitrary: 'content' } }));
    await verifyResponse(stream);
    stream.destroy();

    await delay(TIME_BETWEEN_REQUESTS);
  }
}

async function verifyResponse(req) {
  return new Promise((resolve, reject) => {
    req
      .once('error', err => {
        reject(new Error(`Received error from request stream: ${err.stack}`));
      })
      .once('frameError', err => {
        reject(new Error(`Received frame error from request stream: ${err.stack}`));
      })
      .once('aborted', () => {
        reject(new Error(`Request was aborted`));
      })
      .once('timeout', () => {
        reject(new Error(`Request timed out`));
      })
      .once('response', (headers, flags) => {
        const responseBuffer = [];
        req
          .on('data', chunk => {
            responseBuffer.push(chunk);
          })
          .once('end', () => {
            const body = responseBuffer.join('');
            const status = headers[':status'];
            if (status >= 400) {
              throw new Error(`Received unexpected HTTP error ${status} with body: ${body}`);
            } else {
              resolve();
            }
          })
      });
  });
}

function delay(ms) {
  return new Promise((resolve) => {
    setTimeout(resolve, ms);
  });
}

reproduce().catch(console.error);

Sadly, I couldn't get node-issued requests to reproduce by themselves. However, while this is running, trying a bunch of curl requests in parallel eventually gets the condition to occur:

curl --cacert ./certs/fakeca.crt https://bugrepro.org:8443/ -H 'content-type: application/json' --data '{ "fake": "json", "data": "payload" }' -m 10 --compressed --verbose
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to bugrepro.org (127.0.0.1) port 8443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: ./certs/fakeca.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Unknown (8):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Client hello (1):
* TLSv1.3 (OUT), TLS Unknown, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=Washington; L=Seattle; O=Bug Repro Fake Org; CN=bugrepro.org; emailAddress=jpage@godaddy.com
*  start date: Jan 10 22:55:45 2020 GMT
*  expire date: Feb  9 22:55:45 2020 GMT
*  common name: bugrepro.org (matched)
*  issuer: C=US; ST=Washington; L=Seattle; O=Fake CA Ltd; CN=Fake CA; emailAddress=jpage@godaddy.com
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
* Using Stream ID: 1 (easy handle 0x7fffd54d47e0)
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
> POST / HTTP/2
> Host: bugrepro.org:8443
> User-Agent: curl/7.58.0
> Accept: */*
> Accept-Encoding: deflate, gzip
> content-type: application/json
> Content-Length: 37
> 
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS Unknown, Unknown (23):
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
* We are completely uploaded and fine
* TLSv1.3 (IN), TLS Unknown, Unknown (23):
* Operation timed out after 10000 milliseconds with 0 bytes received
* Connection #0 to host bugrepro.org left intact
curl: (28) Operation timed out after 10000 milliseconds with 0 bytes received

When this timeout abort occurs, I see from the server's logs:

Server received abort event from client. Bytes received: 37 out of 37.

If it helps, here's what the console.dir of the request object shows. Maybe its state lends some clues as to what's going on:

Http2ServerRequest {
  _readableState: ReadableState {
    objectMode: false,
    highWaterMark: 16384,
    buffer: BufferList { head: null, tail: null, length: 0 },
    length: 0,
    pipes: null,
    pipesCount: 0,
    flowing: true,
    ended: false,
    endEmitted: false,
    reading: true,
    sync: false,
    needReadable: true,
    emittedReadable: false,
    readableListening: false,
    resumeScheduled: false,
    paused: false,
    emitClose: true,
    autoDestroy: false,
    destroyed: false,
    defaultEncoding: 'utf8',
    awaitDrain: 0,
    readingMore: false,
    decoder: null,
    encoding: null
  },
  readable: true,
  _events: [Object: null prototype] {
    pause: [Function: onRequestPause],
    resume: [Function: onRequestResume],
    data: [Function],
    end: [Function: bound onceWrapper] { listener: [Function] },
    error: [Function: bound onceWrapper] { listener: [Function] }
  },
  _eventsCount: 5,
  _maxListeners: undefined,
  [Symbol(state)]: { closed: false, didRead: true },
  [Symbol(headers)]: [Object: null prototype] {
    ':method': 'POST',
    ':path': '/',
    ':scheme': 'https',
    ':authority': 'bugrepro.org:8443',
    'user-agent': 'curl/7.58.0',
    accept: '*/*',
    'accept-encoding': 'deflate, gzip',
    'content-type': 'application/json',
    'content-length': '37'
  },
  [Symbol(rawHeaders)]: [
    ':method',         'POST',
    ':path',           '/',
    ':scheme',         'https',
    ':authority',      'bugrepro.org:8443',
    'user-agent',      'curl/7.58.0',
    'accept',          '*/*',
    'accept-encoding', 'deflate, gzip',
    'content-type',    'application/json',
    'content-length',  '37'
  ],
  [Symbol(trailers)]: {},
  [Symbol(rawTrailers)]: [],
  [Symbol(stream)]: ServerHttp2Stream {
    _readableState: ReadableState {
      objectMode: false,
      highWaterMark: 16384,
      buffer: [BufferList],
      length: 0,
      pipes: null,
      pipesCount: 0,
      flowing: true,
      ended: false,
      endEmitted: false,
      reading: true,
      sync: false,
      needReadable: true,
      emittedReadable: false,
      readableListening: false,
      resumeScheduled: false,
      paused: false,
      emitClose: true,
      autoDestroy: false,
      destroyed: false,
      defaultEncoding: 'utf8',
      awaitDrain: 0,
      readingMore: false,
      decoder: null,
      encoding: null
    },
    readable: true,
    _events: [Object: null prototype] {
      pause: [Function: streamOnPause],
      newListener: [Function: streamListenerAdded],
      removeListener: [Function: streamListenerRemoved],
      trailers: [Function: onStreamTrailers],
      end: [Function: onStreamEnd],
      error: [Function: onStreamError],
      aborted: [Array],
      close: [Array],
      drain: [Function: onStreamDrain],
      wantTrailers: [Function: onStreamTrailersReady],
      data: [Function: onStreamData]
    },
    _eventsCount: 11,
    _maxListeners: undefined,
    _writableState: WritableState {
      objectMode: false,
      highWaterMark: 16384,
      finalCalled: false,
      needDrain: false,
      ending: false,
      ended: false,
      finished: false,
      destroyed: false,
      decodeStrings: false,
      defaultEncoding: 'utf8',
      length: 0,
      writing: false,
      corked: 0,
      sync: true,
      bufferProcessing: false,
      onwrite: [Function: bound onwrite],
      writecb: null,
      writelen: 0,
      bufferedRequest: null,
      lastBufferedRequest: null,
      pendingcb: 0,
      prefinished: false,
      errorEmitted: false,
      emitClose: true,
      autoDestroy: false,
      bufferedRequestCount: 0,
      corkedRequestsFree: [Object]
    },
    writable: true,
    allowHalfOpen: true,
    timeout: 0,
    [Symbol(asyncId)]: 52824,
    [Symbol(kSession)]: ServerHttp2Session {
      _events: [Object: null prototype],
      _eventsCount: 6,
      _maxListeners: undefined,
      timeout: 120000,
      [Symbol(state)]: [Object],
      [Symbol(encrypted)]: true,
      [Symbol(alpnProtocol)]: 'h2',
      [Symbol(type)]: 0,
      [Symbol(proxySocket)]: null,
      [Symbol(socket)]: [TLSSocket],
      [Symbol(timeout)]: [Timeout],
      [Symbol(kHandle)]: [Http2Session],
      [Symbol(kNativeFields)]: [Uint8Array],
      [Symbol(server)]: [Http2SecureServer],
      [Symbol(local-settings)]: [Object],
      [Symbol(remote-settings)]: [Object]
    },
    [Symbol(timeout)]: null,
    [Symbol(state)]: {
      didRead: true,
      flags: 19,
      rstCode: 8,
      writeQueueSize: 0,
      trailersReady: false,
      endAfterHeaders: false
    },
    [Symbol(request)]: [Circular],
    [Symbol(proxySocket)]: null,
    [Symbol(id)]: 1,
    [Symbol(kHandle)]: Http2Stream { owner: [Circular], [Symbol(owner)]: [Circular] },
    [Symbol(protocol)]: 'https',
    [Symbol(authority)]: 'bugrepro.org:8443',
    [Symbol(response)]: Http2ServerResponse {
      _events: [Object: null prototype] {},
      _eventsCount: 0,
      _maxListeners: undefined,
      writable: true,
      [Symbol(state)]: [Object],
      [Symbol(headers)]: [Object: null prototype] {},
      [Symbol(trailers)]: [Object: null prototype] {},
      [Symbol(stream)]: [Circular]
    }
  },
  [Symbol(aborted)]: true
}

I know it's not ideal that I don't have a quick and easy way to reproduce. There's probably something key to reproducing, since we're actually able to reproduce in our application more easily, but still intermittantly. It occurs for us in Chrome, but only when a given session ID is reused, when sending a post request. We can tell from Chrome logs that indeed the request is sent fully, and it times out/aborts while waiting for server headers.

Please let me know of any more details I can provide or things I can try to capture. We are highly motivated to squash this bug, since our alternative is to re-engineer our live application back to http1, which is obviously not ideal.

@DullReferenceException
Copy link
Author

Archive to reproduce (contains the certs). You also have to point your host file such that bugrepro.org is resolved as 127.0.0.1.

http2-bug-repro.zip

@DullReferenceException
Copy link
Author

http2-bug-repro.zip

(Created prior zip incorrectly)

@DullReferenceException
Copy link
Author

DullReferenceException commented Jan 13, 2020

Updates on debugging. I was able to make this a bit easier to reproduce by adding more activity to the server:
repro.js.txt

Still, none of the node client code initiates the failure; it only seems to be Chrome and curl (using http2) that triggers the issue, so maybe there's something extra that the node clients are doing that some other H2 clients are not. Sadly, turning on NODE_DEBUG=http2 makes me unable to reproduce, so maybe the extra console I/O eliminates a race condition or something like that.

I've also captured an additional util.inspect for the Http2ServerRequest object.

This is the state of the request during the handling of the final data event:
before.txt

...and this is the state at the time of the request having been aborted:
after.txt

Doing a diff on the two, here's two takeaways I got:

  • The before readable state has reading: false; at the time of abort it's reading: true
  • The stream symbol property differs between the two snapshots. The before state has it as an Http2Stream. The after state is a ServerHttp2Stream.

@DullReferenceException
Copy link
Author

DullReferenceException commented Jan 13, 2020

Found a good lead when turning on tracing for curl requests. There's a definite difference in client-side behaviors when comparing a successful curl request versus an unsuccessful request. Here's the output of a success versus a failure:

curl-success-trace.txt
curl-failure-trace.txt

In all cases for successes, curl sends the request body immediately after sending the request headers. In the case of failures, there is additional activity between the sending of the headers and the body. I'm not sure exactly what this activity is, but it looks like a back-and-forth that's present in both cases, just occurring at a different point in time (I'd recommend you use a diff tool to see what I mean).

Seems like it relates to a new TLS session being issued between the sending of the header and body; some state must be getting lost in the process.

@ronag
Copy link
Member

ronag commented Jan 14, 2020

Can you reproduce it when not using the compat API? Just to see if we can remove one layer of complexity.

@DullReferenceException
Copy link
Author

DullReferenceException commented Jan 14, 2020

@ronag: I think it's just the compat API. Here's why. I have created a version of the repro that does not use the compat API:

repro-non-compat.js.txt

I struggled to get the same TLS sequence that got the failures to occur using this repro script. After switching back to the compat API, I could get it to reproduce. Notably, after reproducing with the compat API, I ran the non-compat API, immediately after and got the same trace pattern as with the failing pattern, and the non-compat API did not get the timeout/failure. Here's the same two same mismatching session traces when running against the non-compat API that did not have the failure:

pattern-1-success.txt
pattern-2-success.txt

@DullReferenceException
Copy link
Author

Strike that... while seeing if using the stream property of the request to read the body instead would work as a workaround, I was able to reproduce. So then I plugged in the repro code above that doesn't use the compat layer and ran the curl command in a zsh while loop, and I was able to get it to reproduce as well.

So this appears to be a problem with raw http2 streams, not just the compatibility API.

@DullReferenceException DullReferenceException changed the title Request streams from http2 intermittantly not emitting end event with compatibility API Request streams from http2 intermittantly not emitting end event Jan 15, 2020
@ronag
Copy link
Member

ronag commented Jan 15, 2020

@nodejs/http2 seems related to http2 core

@ronag
Copy link
Member

ronag commented Jan 15, 2020

@DullReferenceException: Can you see what other events you get? Does it emit e.g. 'error' or 'close'?

@DullReferenceException
Copy link
Author

It does not emit any error event. It does emit close, but only after the client aborts the request, not before.

@DullReferenceException
Copy link
Author

Since the compat wrapper isn't to blame, here's a dump of the stream's state at the point of receiving the last data chunk and after the abort event:

Request state after reading last chunk:
Http2Stream {
  id: 1,
  closed: false,
  destroyed: false,
  state: {
    state: 2,
    weight: 16,
    sumDependencyWeight: 0,
    localClose: 0,
    remoteClose: 0,
    localWindowSize: 65498
  },
  readableState: ReadableState {
    objectMode: false,
    highWaterMark: 16384,
    buffer: BufferList { head: null, tail: null, length: 0 },
    length: 0,
    pipes: null,
    pipesCount: 0,
    flowing: true,
    ended: false,
    endEmitted: false,
    reading: true,
    sync: false,
    needReadable: true,
    emittedReadable: false,
    readableListening: false,
    resumeScheduled: false,
    paused: false,
    emitClose: true,
    autoDestroy: false,
    destroyed: false,
    defaultEncoding: 'utf8',
    awaitDrain: 0,
    readingMore: false,
    decoder: null,
    encoding: null
  },
  writableState: WritableState {
    objectMode: false,
    highWaterMark: 16384,
    finalCalled: false,
    needDrain: false,
    ending: false,
    ended: false,
    finished: false,
    destroyed: false,
    decodeStrings: false,
    defaultEncoding: 'utf8',
    length: 0,
    writing: false,
    corked: 0,
    sync: true,
    bufferProcessing: false,
    onwrite: [Function: bound onwrite],
    writecb: null,
    writelen: 0,
    bufferedRequest: null,
    lastBufferedRequest: null,
    pendingcb: 0,
    prefinished: false,
    errorEmitted: false,
    emitClose: true,
    autoDestroy: false,
    bufferedRequestCount: 0,
    corkedRequestsFree: {
      next: null,
      entry: null,
      finish: [Function: bound onCorkedFinish]
    }
  }
}

Stream state after abort:
ServerHttp2Stream {
  _readableState: ReadableState {
    objectMode: false,
    highWaterMark: 16384,
    buffer: BufferList { head: null, tail: null, length: 0 },
    length: 0,
    pipes: null,
    pipesCount: 0,
    flowing: true,
    ended: false,
    endEmitted: false,
    reading: true,
    sync: false,
    needReadable: true,
    emittedReadable: false,
    readableListening: false,
    resumeScheduled: false,
    paused: false,
    emitClose: true,
    autoDestroy: false,
    destroyed: false,
    defaultEncoding: 'utf8',
    awaitDrain: 0,
    readingMore: false,
    decoder: null,
    encoding: null
  },
  readable: true,
  _events: [Object: null prototype] {
    pause: [Function: streamOnPause],
    newListener: [Function: streamListenerAdded],
    removeListener: [Function: streamListenerRemoved],
    data: [Function],
    end: [Function: bound onceWrapper] { listener: [Function] },
    error: [Function: bound onceWrapper] { listener: [Function] }
  },
  _eventsCount: 6,
  _maxListeners: undefined,
  _writableState: WritableState {
    objectMode: false,
    highWaterMark: 16384,
    finalCalled: false,
    needDrain: false,
    ending: false,
    ended: false,
    finished: false,
    destroyed: false,
    decodeStrings: false,
    defaultEncoding: 'utf8',
    length: 0,
    writing: false,
    corked: 0,
    sync: true,
    bufferProcessing: false,
    onwrite: [Function: bound onwrite],
    writecb: null,
    writelen: 0,
    bufferedRequest: null,
    lastBufferedRequest: null,
    pendingcb: 0,
    prefinished: false,
    errorEmitted: false,
    emitClose: true,
    autoDestroy: false,
    bufferedRequestCount: 0,
    corkedRequestsFree: {
      next: null,
      entry: null,
      finish: [Function: bound onCorkedFinish]
    }
  },
  writable: true,
  allowHalfOpen: true,
  timeout: 0,
  [Symbol(asyncId)]: 54782,
  [Symbol(kSession)]: ServerHttp2Session {
    _events: [Object: null prototype] {
      newListener: [Function: sessionListenerAdded],
      removeListener: [Function: sessionListenerRemoved],
      stream: [Function: sessionOnStream],
      error: [Function: sessionOnError],
      priority: [Function: sessionOnPriority],
      timeout: [Function]
    },
    _eventsCount: 6,
    _maxListeners: undefined,
    timeout: 120000,
    [Symbol(state)]: {
      destroyCode: 0,
      flags: 1,
      goawayCode: null,
      goawayLastStreamID: null,
      streams: [Map],
      pendingStreams: Set {},
      pendingAck: 0,
      writeQueueSize: 0,
      originSet: undefined
    },
    [Symbol(encrypted)]: true,
    [Symbol(alpnProtocol)]: 'h2',
    [Symbol(type)]: 0,
    [Symbol(proxySocket)]: null,
    [Symbol(socket)]: TLSSocket {
      _tlsOptions: [Object],
      _secureEstablished: true,
      _securePending: false,
      _newSessionPending: false,
      _controlReleased: true,
      _SNICallback: null,
      servername: 'bugrepro.org',
      alpnProtocol: 'h2',
      authorized: false,
      authorizationError: null,
      encrypted: true,
      _events: [Object: null prototype],
      _eventsCount: 6,
      connecting: false,
      _hadError: false,
      _parent: [Socket],
      _host: null,
      _readableState: [ReadableState],
      readable: false,
      _maxListeners: undefined,
      _writableState: [WritableState],
      writable: false,
      allowHalfOpen: false,
      _sockname: null,
      _pendingData: null,
      _pendingEncoding: '',
      server: [Http2SecureServer],
      _server: null,
      ssl: null,
      _requestCert: false,
      _rejectUnauthorized: true,
      timeout: 0,
      write: [Function: writeAfterFIN],
      [Symbol(res)]: null,
      [Symbol(asyncId)]: 54706,
      [Symbol(kHandle)]: null,
      [Symbol(lastWriteQueueSize)]: 0,
      [Symbol(timeout)]: [Timeout],
      [Symbol(kBuffer)]: null,
      [Symbol(kBufferCb)]: null,
      [Symbol(kBufferGen)]: null,
      [Symbol(kBytesRead)]: 179,
      [Symbol(kBytesWritten)]: 18,
      [Symbol(error-emitted)]: false,
      [Symbol(kSession)]: [Circular],
      [Symbol(disable-renegotiation)]: true,
      [Symbol(server)]: [Http2SecureServer]
    },
    [Symbol(timeout)]: Timeout {
      _idleTimeout: 120000,
      _idlePrev: [Timeout],
      _idleNext: [Timeout],
      _idleStart: 5189,
      _onTimeout: [Function: bound _onTimeout],
      _timerArgs: undefined,
      _repeat: null,
      _destroyed: false,
      [Symbol(refed)]: false,
      [Symbol(asyncId)]: 54764,
      [Symbol(triggerId)]: 54706
    },
    [Symbol(kHandle)]: Http2Session { fields: [Uint8Array], [Symbol(owner)]: [Circular] },
    [Symbol(kNativeFields)]: Uint8Array [ 0, 0, 0 ],
    [Symbol(server)]: Http2SecureServer {
      _contexts: [],
      requestCert: false,
      rejectUnauthorized: true,
      ALPNProtocols: [Buffer [Uint8Array]],
      pfx: undefined,
      key: [Buffer [Uint8Array]],
      passphrase: undefined,
      cert: [Buffer [Uint8Array]],
      clientCertEngine: undefined,
      ca: undefined,
      minVersion: undefined,
      maxVersion: undefined,
      secureProtocol: undefined,
      crl: undefined,
      sigalgs: undefined,
      ciphers: undefined,
      ecdhCurve: undefined,
      dhparam: undefined,
      honorCipherOrder: true,
      secureOptions: undefined,
      sessionIdContext: '81681a7a1b4c3ea5ccc041bb81bb43b3',
      _sharedCreds: [SecureContext],
      _events: [Object: null prototype],
      _eventsCount: 7,
      _maxListeners: undefined,
      _connections: 600,
      _handle: [TCP],
      _usingWorkers: false,
      _workers: [],
      _unref: false,
      allowHalfOpen: false,
      pauseOnConnect: false,
      timeout: 120000,
      _connectionKey: '6::::8443',
      [Symbol(handshake-timeout)]: 120000,
      [Symbol(snicallback)]: undefined,
      [Symbol(asyncId)]: 14,
      [Symbol(enableTrace)]: undefined,
      [Symbol(options)]: [Object]
    },
    [Symbol(local-settings)]: {
      headerTableSize: 4096,
      enablePush: true,
      initialWindowSize: 65535,
      maxFrameSize: 16384,
      maxConcurrentStreams: 4294967295,
      maxHeaderListSize: 4294967295,
      enableConnectProtocol: 0
    }
  },
  [Symbol(timeout)]: null,
  [Symbol(state)]: {
    didRead: true,
    flags: 19,
    rstCode: 8,
    writeQueueSize: 0,
    trailersReady: false,
    endAfterHeaders: false
  },
  [Symbol(request)]: null,
  [Symbol(proxySocket)]: null,
  [Symbol(id)]: 1,
  [Symbol(kHandle)]: Http2Stream { owner: [Circular], [Symbol(owner)]: [Circular] },
  [Symbol(protocol)]: 'https',
  [Symbol(authority)]: 'bugrepro.org:8443'
}

@DullReferenceException
Copy link
Author

Simplified repro script. Did away with the node client requests loop. Just running a curl command in a loop is sufficient, like:

while curl --cacert ./certs/fakeca.crt https://bugrepro.org:8443/ -H 'content-type: application/json' --data '{ "fake": "json", "data": "payload" }' -m 10; do :; done

Here's the simplified JS:

const { once } = require('events');
const { promises: { readFile } } = require('fs');
const { createSecureServer, constants: { HTTP2_HEADER_STATUS } } = require('http2');
const path = require('path');
const { inspect } = require('util');

const PORT = 8443;
const REQUEST_PROCESSING_TIME = { min: 10, max: 100 };

async function reproduce() {
  await startServer();

  console.log(`
Server has started. To reproduce, you must:

1. Have a version of curl that supports http2
2. Add a hosts file entry (typically at /etc/hosts) containing a line with: 127.0.0.1 bugrepro.org

Next, run this command, or if you don't have a shell like bash or zsh, you may have to translate to something else:

while curl --cacert ./certs/fakeca.crt https://bugrepro.org:8443/ -H 'content-type: application/json' --data '{ "fake": "json", "data": "payload" }' -m 10; do :; done

This should repeatedly send POST requests to the server until the error situation occurs, at which point the server will terminate.`);
}

async function startServer() {
  const [cert, key] = await Promise.all([
    readFile(path.resolve(__dirname, './certs/bugrepro.org.crt')),
    readFile(path.resolve(__dirname, './certs/bugrepro.org.key'))
  ]);

  const server = createSecureServer({ cert, key });
  server
    .on('sessionError', err => {
      console.error('Got session error:', err.stack);
    })
    .on('stream', handleStream)
  server.listen(PORT);

  await once(server, 'listening');

  return server;
}

let hasOutputSuccessEvents = false;

async function handleStream(stream, headers, flags) {
  await delay(randomRange(REQUEST_PROCESSING_TIME));

  const streamEvents = captureAllEvents(stream);

  const contentLength = parseInt(headers['content-length'], 10);
  const buffers = [];
  let bytesRead = 0;
  let lastChunkState;

  stream
    .on('data', chunk => {
      bytesRead += chunk.length;
      buffers.push(chunk);
      if (bytesRead === contentLength) {
        lastChunkState = inspect(stream);
      }
    })
    .once('end', () => {
      try {
        const concatenated = buffers.join('');
        JSON.parse(concatenated);
        stream.respond({
          [HTTP2_HEADER_STATUS]: 204
        });
        stream.end();
        if (!hasOutputSuccessEvents) {
          console.log('Stream event sequence on success:');
          console.dir(streamEvents);
          hasOutputSuccessEvents = true;
        }
      } catch (err) {
        stream.respond({
          [HTTP2_HEADER_STATUS]: 500
        });
        stream.end(err.stack);
      }
    })
    .once('aborted', () => {
      console.log(`Server received abort event from client. Bytes received: ${bytesRead} out of ${contentLength}.`);
      console.log('\nRequest state after reading last chunk:');
      console.log(lastChunkState);
      console.log('\nStream state after abort:')
      console.dir(stream);
      console.log('\n\nStream event sequence on success:');
      console.dir(streamEvents);
    })
    .once('error', err => {
      console.error('Got error from stream');
      stream.respond({ [HTTP2_HEADER_STATUS]: 500 });
      stream.end(err.stack);
    });
}

function captureAllEvents(emitter) {
  const events = [];
  const origEmit = emitter.emit.bind(emitter);
  let firstTimestamp = null;
  emitter.emit = (event, ...args) => {
    if (!firstTimestamp) {
      firstTimestamp = Date.now();
    }
    const timestamp = Date.now() - firstTimestamp;
    events.push({
      timestamp,
      event,
      args
    });
    origEmit(event, ...args);
  };
  return events;
}

function delay(ms) {
  return new Promise((resolve) => {
    setTimeout(resolve, ms);
  });
}

function randomRange({ min, max }) {
  const magnitude = max - min;
  return Math.random() * magnitude + min;
}

reproduce().catch(console.error);

I added some event capturing to the stream, but nothing interesting/unusual shows up.

@rahbari
Copy link

rahbari commented Jan 20, 2020

I used node 12.4 without any problem, after updating to 12.14.1 after 4 or 5 successful uploads, request stream halts till client or server timeout (no end event) after receiving all data.

@DullReferenceException
Copy link
Author

I just tried my repro on 12.4.0, and it does reproduce. So does 12.0.0.

@rahbari
Copy link

rahbari commented Jan 20, 2020

I used 12.4 for so long and didn't have any problem, this is how I read the request:

new Promise((res, rej) => {
  const a = [];
  req.on('error', e => rej(e)).on('data', c => a.push(c)).on('end', () => res(Buffer.concat(a)));
});

req is a Http2ServerRequest object from Compatibility API.
after updating to 12.14.1 it halts randomly (end is not emitted) after request is readed completely, I removed 12.14.1 and installed 12.4 again and there is no problem as of now.

@rahbari
Copy link

rahbari commented Feb 18, 2020

This issue still exists in 12.16.0 and 13.8.0. Also I tested different version of node.js 12, my problem starts to occurs from node 12.8.1. Now I'm using 12.8.0 and it has no problem.

@akc42
Copy link

akc42 commented Mar 3, 2020

I was just about to leave a similar report. I find it occurs much more frequently on slower hardware. I am running my server on a raspberry pi 3. I am using the npm 'router' module as my middleware router, but instead of using 'bodyParser' I wrote my own with a timeout on getting the 'end' event. Here is that code

    readBody(req,next) {
      const buffer = [];
      let t = 0;
      function cleanup() {
        req.off('data', gotData);
        req.off('error', anError);
        let body;
        try {
          body = JSON.parse(Buffer.concat(buffer).toString());
        } catch(e) {
          //normally after a restart
          body = {};
        }
        req.body = body;
        next();
      }
      function gotData(chunk) {
        if (t === 0) t = setTimeout(() => {
          logger('error', `Reading request body timed out for url ${req.url}`);
          t = 0;
          req.off('end', anEnd);
          anEnd();
        }, 1000); //allow a second for an end to fire
        buffer.push(chunk);
      }
      function anEnd() {
        if (t !== 0) {
          clearTimeout(t);
          t = 0;
        }
        debugapi('got a end in reading request body');
        req.off('close', aClose);
        cleanup();
      }
      function aClose() {
        debugapi('got a close in reading request body');
        req.off('end', anEnd);
        cleanup();
      }
      function anError(err) {
        debugapi('error whilst reading body o ', err);
        req.off('data', gotData);
        req.off('end', anEnd);
        req.off('close', aClose);
        next(err);
      }
      
      debugapi('setting up to read request body');
      req.once('end', anEnd);
      req.once('close', aClose);
      req.once('error', anError);
      req.on('data', gotData);
    }

On the PI it occurs frequently. On my desktop less so. I have only just got this far in tracing the issue so I don't really have the data.

Before I put the timeout in - what would happen is the request would timeout (from the client? it took two minutes) and then repeat and more often than not it would then work.

@asilvas
Copy link

asilvas commented Mar 3, 2020

@akc42 what version(s) of node were you able to confirm this issue?

@akc42
Copy link

akc42 commented Mar 3, 2020

I was on 12.15.0, I just downgraded to 12.8.0 and the problem went away. I then upgraded to 12.8.1 and the problem returned. Looks like @rahbari is correct - the problem is between those two versions. The changelog shows lots of http2 changes between those two versions.

@akc42
Copy link

akc42 commented Mar 4, 2020

I've upgraded to v13.9.0 and changed by code so that the readBody routine above does a const reqStream = req.stream; at the head of it, and then all the req.{on,once,off} calls are replaced with the reqStream equivalents

This is still failing in the same way - maybe a fraction less often (it seems to be 100% of the time when this is the first time through the routine when the client has reloaded - about 50% if the time after that) on the raspberry pi with node --inspect myapp in operation. (No breakpoints at the moment, as I am getting stream fails if I try to put something on and I just wanted to prove my new code still worked the same)

@rahbari
Copy link

rahbari commented Mar 5, 2020

@asilvas @akc42 i tested all version from 12.4 to 12.16 to find out where the problem starts. It's okay in 12.8 and it's interesting that it happens in 12.8.1 (maybe 1 of 2) so much more than 12.16. I tested on latest windows 10, new core i7 machine so I don't think it has anything to do with slow hardware.
I'm stuck with 12.8 and I don't know how others are using newer versions because it's very fundamental bug that make 12.8.1+ unusable. Some suggested to check received data against content-length but it's not good if this header is not available.

@jasnell
Copy link
Member

jasnell commented Mar 6, 2020

Hey all. Interesting issue for sure. I'll try to dig in on it later today but will need some time to review the discussion here.

@akc42
Copy link

akc42 commented Mar 12, 2020

@jasnell any progress on this?

@mrfx
Copy link

mrfx commented Mar 20, 2020

Hi, thanks for your thoughts. Since stream.on('end') is not reliable yet, my workoraund is:

const http2handler (req, res) {

  let buffer = [];
  let counter = 0;
  req.stream.on('data', ( chunk ) => {
    counter += chunk.length;
    buffer.push( chunk );
    if ( counter === Number( req.headers['content-length'] ) ) {
      const body = Buffer.concat(buffer).toString();
      const response = sampleBodyHandleFunc( body );
      res.stream.respond(headers);
      res.stream.end(response);
      return;
    }
  } );

}

const server = http2
  .createSecureServer(serverOptions, http2handler)
  .listen(8888, () => {
    console.log("http2 server started on port", 8888);
});

@FallingSnow
Copy link

Any luck on this? This is affecting code in production for me.

@jasnell
Copy link
Member

jasnell commented Apr 17, 2020

Been spending some time digging in on the http2 code again this week and will be continuing next week. This issue is on my list of ones to get to but I haven't yet been able to get to it.

clshortfuse added a commit to clshortfuse/node that referenced this issue Jun 18, 2020
Adds support for reading from a stream where the final frame is a non-empty
DATA frame with the END_STREAM flag set, instead of hanging waiting for another
frame. When writing to a stream, uses a END_STREAM flag on final DATA frame
instead of adding an empty DATA frame.

BREAKING: http2 client now expects servers to properly support END_STREAM flag

Fixes: nodejs#31309
Fixes: nodejs#33891
Refs: https://nghttp2.org/documentation/types.html#c.nghttp2_on_data_chunk_recv_callback
@niftylettuce
Copy link

Just chiming in here - is this related to the bug I discovered here? #33687

@clshortfuse
Copy link
Contributor

clshortfuse commented Jun 28, 2020

@niftylettuce 99% sure, yes. I did a Wireshark specifically with Chrome and narrowed it down to lack of END_STREAM flag support. It also can happen with Safari and curl on Mac. Unfortunately, the test file I built to properly show the issue is gone because my Macbook Pro decides it no longer wants to boot. But the good news is I no longer really need it since a PR is in place.

I suggest you use the workaround in the comment above which will "wake up" the stalled state.

@niftylettuce
Copy link

Thanks @clshortfuse - this seems Chrome/Chromium specific. Is there a patch they can do on their side too to alleviate this? I imagine people update their Chrome/Chromium version faster than Node.

@niftylettuce
Copy link

@clshortfuse Is there an example fix I could add to my Koa setup? And is there one we can add for people using Express, Fastify, etc?

@clshortfuse
Copy link
Contributor

I don't know exactly how things work on Koa or Express. I recently gave up trying to get full functionality HTTP2 out of Express and wound up writing my own framework. That's actually how I came across this issue.

You need to be able to interface with the session and send something, anything to wake it up. I'm assuming Koa and Express don't interface with the stream event and are using the the downconverted HTTP1-compatible request. With a stream, you can get the .session object and send a ping.

If the documentation here is how I assume it to be, you might be able to do .request.stream.session.ping(() => {}). But that's assuming the request or stream property hasn't been mutated (which I know Express likes to do). And I'm pretty sure, at least on Express, if you try reading from the stream directly, yourself, you're going to have to problems because the framework expects complete access. So you'll have to figure out how and when to slip the ping in,.

@niftylettuce
Copy link

@clshortfuse Hey! As a workaround in the interim, couldn't we just change this?

server.timeout = 5000; // 5s timeout vs. 120s timeout for HTTP/2 to reset

Ref: https://nodejs.org/api/http.html#http_server_timeout

@clshortfuse
Copy link
Contributor

@niftylettuce You're still going to drop a perfectly good connection. You'll send something like an error with status code 500 when they try to load a page or submit something. I'm assuming you're using koa/bodyparser, so based on the dependency tree of:

koa/bodyparser => co-body => raw-body

You'll have to modify https://github.com/stream-utils/raw-body/blob/0e1291f1d6cbc9ee8e16f893d091e417841c95e5/index.js#L240 in your node_modules and inject sending the ping.

To be honest, it might be easier to switch to spdy which I know doesn't have the issue. Or skip koa/bodyparser and read from the stream yourself, sending a ping on stall. Or, maybe, you can try your luck by blindly calling ctx.req.stream.session.ping(() => {}) if you feel it's stalled after a while.

@niftylettuce
Copy link

A bit confused on where to inject it, @clshortfuse any chance you could send me an example? Also, can you toss me an email at niftylettuce@gmail.com with your PayPal or preferred means of receiving the bug bounty award for this?

@niftylettuce
Copy link

As I mentioned over email, the examples you had shared by email @clshortfuse did not work unfortunately.

MylesBorins pushed a commit to MylesBorins/node that referenced this issue Aug 18, 2020
Adds support for reading from a stream where the final frame is a
non-empty DATA frame with the END_STREAM flag set, instead of hanging
waiting for another frame. When writing to a stream, uses a
END_STREAM flag on final DATA frame instead of adding an empty
DATA frame.

BREAKING: http2 client now expects servers to properly support
END_STREAM flag

Fixes: nodejs#31309
Fixes: nodejs#33891
Refs: https://nghttp2.org/documentation/types.html#c.nghttp2_on_data_chunk_recv_callback

PR-URL: nodejs#33875
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit to clshortfuse/node that referenced this issue Aug 20, 2020
Adds support for reading from a stream where the final frame is a
non-empty DATA frame with the END_STREAM flag set, instead of hanging
waiting for another frame. When writing to a stream, uses a
END_STREAM flag on final DATA frame instead of adding an empty
DATA frame.

BREAKING: http2 client now expects servers to properly support
END_STREAM flag

Fixes: nodejs#31309
Fixes: nodejs#33891
Refs: https://nghttp2.org/documentation/types.html#c.nghttp2_on_data_chunk_recv_callback
clshortfuse added a commit to clshortfuse/node that referenced this issue Aug 20, 2020
Adds support for reading from a stream where the final frame is a
non-empty DATA frame with the END_STREAM flag set, instead of hanging
waiting for another frame.

Fixes: nodejs#31309
Refs: https://nghttp2.org/documentation/types.html#c.nghttp2_on_data_chunk_recv_callback

PR-URL: nodejs#33875
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this issue Aug 21, 2020
Adds support for reading from a stream where the final frame is a
non-empty DATA frame with the END_STREAM flag set, instead of hanging
waiting for another frame. When writing to a stream, uses a
END_STREAM flag on final DATA frame instead of adding an empty
DATA frame.

BREAKING: http2 client now expects servers to properly support
END_STREAM flag

Fixes: #31309
Fixes: #33891
Refs: https://nghttp2.org/documentation/types.html#c.nghttp2_on_data_chunk_recv_callback

PR-URL: #33875
Backport-PR-URL: #34838
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
richardlau pushed a commit that referenced this issue Oct 7, 2020
Adds support for reading from a stream where the final frame is a
non-empty DATA frame with the END_STREAM flag set, instead of hanging
waiting for another frame.

Fixes: #31309
Refs: https://nghttp2.org/documentation/types.html#c.nghttp2_on_data_chunk_recv_callback

PR-URL: #33875
Backport-PR-URL: #34857
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
clshortfuse added a commit to clshortfuse/node that referenced this issue Oct 7, 2020
Adds support for reading from a stream where the final frame is a
non-empty DATA frame with the END_STREAM flag set, instead of hanging
waiting for another frame. When writing to a stream, uses a
END_STREAM flag on final DATA frame instead of adding an empty
DATA frame.

BREAKING: http2 client now expects servers to properly support
END_STREAM flag

Fixes: nodejs#31309
Fixes: nodejs#33891
Refs: https://nghttp2.org/documentation/types.html#c.nghttp2_on_data_chunk_recv_callback

PR-URL: nodejs#33875
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this issue Oct 13, 2020
Adds support for reading from a stream where the final frame is a
non-empty DATA frame with the END_STREAM flag set, instead of hanging
waiting for another frame. When writing to a stream, uses a
END_STREAM flag on final DATA frame instead of adding an empty
DATA frame.

BREAKING: http2 client now expects servers to properly support
END_STREAM flag

Fixes: #31309
Fixes: #33891
Refs: https://nghttp2.org/documentation/types.html#c.nghttp2_on_data_chunk_recv_callback

Backport-PR-URL: #34845
PR-URL: #33875
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this issue Nov 16, 2020
Adds support for reading from a stream where the final frame is a
non-empty DATA frame with the END_STREAM flag set, instead of hanging
waiting for another frame. When writing to a stream, uses a
END_STREAM flag on final DATA frame instead of adding an empty
DATA frame.

BREAKING: http2 client now expects servers to properly support
END_STREAM flag

Fixes: #31309
Fixes: #33891
Refs: https://nghttp2.org/documentation/types.html#c.nghttp2_on_data_chunk_recv_callback

Backport-PR-URL: #34845
PR-URL: #33875
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.