Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NodeJS app throwing Invalid method encountered error, unsure how to catch or fix #376

Closed
katelovescode opened this issue Apr 25, 2024 · 11 comments

Comments

@katelovescode
Copy link

I've got a NodeJS app running on AWS EC2 and we keep getting intermittent 502s. We're running the server using pm2, and the error log only tells us

2024-04-24T14:23:14: Error: Parse Error: Invalid method encountered
2024-04-24T14:23:14: error Command failed with exit code 1.

I'm fairly certain the error message that's being sent is this one, seemingly throwing the error from the llparse object:

n('after_start_req')

However, I'm not 100% sure how to catch and handle this error. pm2 keeps running but NGINX gets connection refused from the port the node server is meant to be running on, which makes me think the Node server has stopped as a result of this.

I'm not really sure how to even debug the source of the problem because the error message doesn't catch any specifics or trace regarding what threw it. Is there a standard way to handle this?

For now I've added this to my node server:

process.on('uncaughtExceptionMonitor', (err, origin) => {
  console.error('err', err)
  console.error('origin', origin)
});)

And we'll see if it'll catch more information than it currently is, but from reading the code in the linked file above (and digging a little into llparse/Typescript/Node errors), I can't determine what to tell the process.on to look for (i.e. in the event list there's 'exit', 'unhandledException', etc. - and I'm not sure what event the error linked above is tied to).

Just wondering if anyone has any insight.

@Dreamsorcerer
Copy link
Contributor

Dreamsorcerer commented May 17, 2024

@ShogunPanda We're getting some similar reports in aiohttp (aio-libs/aiohttp#8414), not sure if it's the same problem or not (given the lack of details here).

The problem occurs with an Upgrade message, such as:

POST / HTTP/1.1\r\nConnection: Upgrade\r\nContent-Length: 2\r\nUpgrade: h2c\r\nContent-Type: application/json\r\n\r\n{}

As far as I can tell from our parser code, this is passing the above message to llhttp_execute(), which then hits a HPE_PAUSED_UPGRADE after the headers. We then call llhttp_resume_after_upgrade() and call llhttp_execute() again with the remaining message ({}). At this point we end up with an invalid method error (which suggests that it's expecting the start of a new message, rather than the continuation of a message).

Maybe there's something wrong with our logic, but from what I can see of the specs, this should be a valid message.

@ShogunPanda
Copy link
Contributor

@Dreamsorcerer Once the connection is upgraded, the message is no longer HTTP/1.1. The reason why llhttp pauses there is because after you start the body llhttp should not be involved anymore. So you should not and can not call llhttp_execute and you can directly proceed with forwarding the rest of the data to the app.

@ShogunPanda
Copy link
Contributor

@katelovescode

By default Node handles those invalid methods with a 400 and no logging, UNLESS the user has installed a listener for clientError event.
Are you doing that and properly ending the connection as stated in https://nodejs.org/dist/latest/docs/api/http.html#event-clienterror?
Also, do you have an idea of what might be hitting the server with a wrong method?

@Dreamsorcerer
Copy link
Contributor

Dreamsorcerer commented May 18, 2024

Once the connection is upgraded, the message is no longer HTTP/1.1. The reason why llhttp pauses there is because after you start the body llhttp should not be involved anymore. So you should not and can not call llhttp_execute and you can directly proceed with forwarding the rest of the data to the app.

@ShogunPanda We don't have support for HTTP/2, so if I'm understanding the spec correctly, we just ignore the header and continue processing the HTTP/1.1 message, right? So, my expectation is that the parser should continue working if we resume parsing the message (I'm not sure why else llhttp_resume_after_upgrade() would exist?).

A server MAY ignore a received Upgrade header field if it wishes to continue using the current protocol on that connection. Upgrade cannot be used to insist on a protocol change.
https://www.rfc-editor.org/rfc/rfc9110.html#section-7.8-2

@ShogunPanda
Copy link
Contributor

@Dreamsorcerer You are right, it seems like you should be able to resume the connection. And in the tests it looks like is possible. Can you please share your code?

@Dreamsorcerer
Copy link
Contributor

I just tested with aiohttp. I can push an xfailing test, so you can verify the fix (using the CI integration we made earlier) if you like, but obviously the code is Python, so probably not something you want to dig into. I believe the message and steps described in #376 (comment) are correct. So, let me know if you can't reproduce it in llhttp directly with those steps, and I'll dig closer into the Python code.

@ShogunPanda
Copy link
Contributor

Ok, I made some C test and finally understood the rationale behind this problem.
According to the specs the upgrade request can contain a body (even if it is uncommon). That's why llhttp full parses the request, including the body (as in your case it is a POST with Content-Length: 2.
If you want to llhttp to discard the body, you would have to use a headers_complete callback which returns 2. In that case llhttp will discard the body (and the connection can switch to the new protocol).

@Dreamsorcerer
Copy link
Contributor

That's why llhttp full parses the request, including the body (as in your case it is a POST with Content-Length: 2.

Hmm, I was under the impression that it wasn't parsing the body, and that was the issue?

I think the bit I missed in the steps above, is that after receiving the HPE_PAUSED_UPGRADE, we then use llhttp_get_error_pos(), which returns the position after the headers. When we resume parsing, we then send the rest of the message from that position on (therefore sending {} again).

So, maybe our logic is wrong and we shouldn't pass the data again?

@Dreamsorcerer
Copy link
Contributor

So, maybe our logic is wrong and we shouldn't pass the data again?

No, that doesn't seem to work. The on_body callback never happens...

@ShogunPanda
Copy link
Contributor

I'm not sure why is that. I tried with the following code and it gets called right away:

#include "stdio.h"
#include "llhttp.h"
#include "string.h"

int handle_on_message_complete(llhttp_t *parser)
{
  fprintf(stdout, "Message completed!\n");
  return 0;
}

int handle_on_body(llhttp_t *parser, const char *at, unsigned long length)
{
  fprintf(stdout, "On body (%d)!\n", length);
  return 0;
}

int main()
{
  llhttp_t parser;
  llhttp_settings_t settings;

  /*Initialize user callbacks and settings */
  llhttp_settings_init(&settings);

  /*Set user callback */
  settings.on_message_complete = handle_on_message_complete;
  settings.on_body = handle_on_body;

  /*Initialize the parser in HTTP_BOTH mode, meaning that it will select between
   *HTTP_REQUEST and HTTP_RESPONSE parsing automatically while reading the first
   *input.
   */
  llhttp_init(&parser, HTTP_BOTH, &settings);

  /*Parse request! */
  const char *request = "POST / HTTP/1.1\r\nConnection: Upgrade\r\nContent-Length: 2\r\nUpgrade: h2c\r\nContent-Type: application/json\r\n\r\n{}";
  int request_len = strlen(request);

  enum llhttp_errno err = llhttp_execute(&parser, request, request_len);
  if (err == HPE_OK)
  {
    fprintf(stdout, "Successfully parsed!\n");
  }
  else
  {
    fprintf(stderr, "Parse error: %s %s\n", llhttp_errno_name(err), parser.reason);
  }
}

@Dreamsorcerer
Copy link
Contributor

If you want to llhttp to discard the body, you would have to use a headers_complete callback which returns 2.

OK, I think this was our problem. We returned 2 whenever the upgrade attribute was set on the message. I guess we should be checking the Upgrade header and only returning 2 if it's an upgrade we support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants