Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed incomplete JSON body on count request making org.elasticsearch.rest.action.RestActions#parseTopLevelQueryBuilder go into endless loop #26680

Merged
merged 1 commit into from
Sep 19, 2017

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Sep 16, 2017

Took a shot at #26083 here :)

The problem is that parser.nextToken() will return null when hitting the end a JSON string. So if the String is partly valid like an empty string:

for (XContentParser.Token token = parser.nextToken(); token != XContentParser.Token.END_OBJECT; token = parser.nextToken()) {

will loop forever because nextToken() will keep returning null.
You can see this by one core going to 100% load while a request like:

curl -XGET  -H 'Content-Type: application/json' http://localhost:9200/\*/_count -d '""'

blocks forever.

In 5.2 this seems to have been caught in the logic guessing the content type of the request and the above example returned:

{"error":{"root_cause":[{"type":"parse_exception","reason":"Failed to derive xcontent"}],"type":"parse_exception","reason":"Failed to derive xcontent"},"status":400}%

Since this is no longer the cause of the problem, I took a shot at a more appropriate error and it now returns:

"error":{"root_cause":[{"type":"parsing_exception","reason":"Failed to parse","line":1,"col":1}],"type":"parsing_exception","reason":"Failed to parse","line":1,"col":1,"caused_by":{"type":"e_o_f_exception","reason":"Unexpected end of JSON request body."}},"status":400}%

Fixes #26083

@@ -244,6 +245,9 @@ private static QueryBuilder parseTopLevelQueryBuilder(XContentParser parser) {
try {
QueryBuilder queryBuilder = null;
for (XContentParser.Token token = parser.nextToken(); token != XContentParser.Token.END_OBJECT; token = parser.nextToken()) {
if (token == null) {
throw new EOFException("Unexpected end of JSON request body.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather like to have a check outside the parsing loop that asserts that the first token the parser emmits is XContentParser.Token.START_OBJECT. I think its save to assume we are expecting full json objects. I'd also just throw a ParsingException in that case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbuescher but if we just verify that the first token is XContentParser.Token.START_OBJECT, we'll still see this infinite loop for "{" for example. Do you mean we want to accept that?
Is it really a good idea to have the behavior of org.elasticsearch.rest.action.RestActions#parseTopLevelQueryBuilder be to loop forever on part of a valid JSON request? :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll still see this infinite loop for "{" for example.

I'd expect the parser to throw an exception on this?

Is it really a good idea to have the behavior of org.elasticsearch.rest.action.RestActions#parseTopLevelQueryBuilder be to loop forever on part of a valid JSON request? :)

Of course not, and this is not what @cbuescher said.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlrx

I'd expect the parser to throw an exception on this?

Currently, it doesn't throw on half a valid JSON request that's where my confusion was coming from. Also just goes into the infinite loop then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just implemented the tests @cbuescher asked for below (good call obviously it started throwing on empty string now too :)).
The best fix I see that includes the empty string case as well would be to do this:

            if (parser.nextToken() == null) {
                return null;
            }
            for (XContentParser.Token token = parser.nextToken(); token != XContentParser.Token.END_OBJECT; token = parser.nextToken()) {
                if (token == null) {
                    throw new ParsingException(parser.getTokenLocation(), "Failed to parse");
                }

if the first token isn't a XContentParser.Token.START_OBJECT the parser throws eventually anyways and we handle the empty string case without much code, how about it? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wops my bad :) you're right, '{' dies just fine.
So basically we have return null if the first token is null (to cover the empty string case), throw if it's not null and not ContentParser.Token.START_OBJECT and leave the loop untouched? :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And because this is really a bit confusing (I was suprised by some of this myself), maybe add some of the above cases in tests. Or wdyt @tlrx

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbuescher I agree

Copy link
Member Author

@original-brownbear original-brownbear Sep 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbuescher @tlrx done :) Added all the discussed cases here https://github.com/elastic/elasticsearch/pull/26680/files#diff-2098a5df6345871e4c3fdb934d3f9f49R72 and moved checks before the loop.

@@ -66,6 +66,14 @@ public void testParseTopLevelBuilderEmptyObject() throws IOException {
}
}

public void testParseTopLevelBuilderMalformedJson() throws IOException {
String requestBody = "\"\"";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for the cases "\"someString\"" and maybe also make sure that we don't barf for the empty String, but simply return null like in the case of "{}"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extended tests to cover this :)

@cbuescher
Copy link
Member

@original-brownbear thanks, I left two small comments. I was suprised to see that we allow creating XContentParser instances from input Strings like "foo", I would have expected the Jackson JsonFactory to barf on that. I talked to @tlrx about this and unfortunately there is some open debate about whether a single string value is considered as valid JSON, otherwise I would prefer to reject request bodies of that kind altogether but that doesn't seem to be an option atm.

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change, I left two other small comments regarding the error message and the corresponding test. The rest looks good to me.

if (first == null) {
return null;
} else if (first != XContentParser.Token.START_OBJECT) {
throw new ParsingException(parser.getTokenLocation(), "Failed to parse");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make the message a little bit more descriptive, since we know whats wrong in this case?

Copy link
Member Author

@original-brownbear original-brownbear Sep 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbuescher sure, how about "Invalid JSON object, first token must be '{' but was 'VALUE_STRING'" ? (or whatever other token type it may be) I liked this better than printing the string back, which is hard to interpret for accidental string literals like in the "'{'" example if that makes sense? :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't have to be JSON, we support Yaml, Cbor and Smile, so better to say we expected the beginning of an object or something along those lines.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea that looks perfect, sec adjusting :)

try (XContentParser parser = createParser(JsonXContent.jsonXContent, requestBody)) {
ParsingException exception =
expectThrows(ParsingException.class, () -> RestActions.getQueryContent(parser));
assertEquals("Failed to parse", exception.getMessage());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the error message is different, maybe you can differentiate between the case where we detect a missing START_OBJECT and the case where the underlying Jackson parser throws.

Copy link
Member Author

@original-brownbear original-brownbear Sep 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbuescher sure already done locally (as well as the broken indentation in the test), just waiting for an answer on the above :)

@javanna
Copy link
Member

javanna commented Sep 18, 2017

Question: is this a problem with the count api only? Seems like the same could happen with any API, or do we have protection in other places already?

@cbuescher
Copy link
Member

@javanna As far as I can see the code path in question here (RestActions.getQueryContent()) is used by RestValidateQueryAction, RestCountAction (incl. the cat variant) and the RestExplainAction. In other places, e.g. for parsing in SearchSourceBuilder we already seem to have similar checks: https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/search/builder/SearchSourceBuilder.java#L962

@cbuescher
Copy link
Member

Seems like the same could happen with any API, or do we have protection in other places already?

And yes, I talked to @tlrx before and it seems we fixed some of these cases on a case by case basis, there might be other places left to check though, but this is outside of the scope of this PR I think. I would have prefered that XContent#createParser would reject these cases already but that seems not to be the case and more difficult to change.

@original-brownbear
Copy link
Member Author

@cbuescher exception message adjusted :)

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@original-brownbear thanks a lot, looks good to me now. Let's just wait for CI to pass, there were some failures earlier on that I think were unrelated but just to be sure.
Also, would you squash and shorten the commit title a little bit before merging? Since running into this bug can be quiet nasty, I think we should backport it to all active 6.x branches and the current 5.6 branch.

@original-brownbear
Copy link
Member Author

@original-brownbear
Copy link
Member Author

@cbuescher done, squashed with shorter commit message :)

@cbuescher
Copy link
Member

@original-brownbear thanks, do you want to merge and do the backport or should I do it?

@original-brownbear
Copy link
Member Author

@cbuescher I don't know how you guys are handling merging and backporting, probably safest for your repo if you do it this time around :)

@cbuescher cbuescher merged commit 2db3bcc into elastic:master Sep 19, 2017
@cbuescher
Copy link
Member

@original-brownbear thanks again for this fix, I rewrote the commit message a bit and will merge this to the active 6.x branches and 5.6 now.

@original-brownbear
Copy link
Member Author

@cbuescher thanks!

cbuescher pushed a commit that referenced this pull request Sep 19, 2017
Request bodys that only consists of a String value can lead to endless loops in the
parser of several rest requests like e.g. `_count`. Up to 5.2 this seems to have been caught 
in the logic guessing the content type of the request, but since then it causes the node to
block. This change introduces checks for receiving a valid xContent object before starting the
parsing in RestActions#parseTopLevelQueryBuilder().

Closes #26083
cbuescher pushed a commit that referenced this pull request Sep 19, 2017
Request bodys that only consists of a String value can lead to endless loops in the
parser of several rest requests like e.g. `_count`. Up to 5.2 this seems to have been caught 
in the logic guessing the content type of the request, but since then it causes the node to
block. This change introduces checks for receiving a valid xContent object before starting the
parsing in RestActions#parseTopLevelQueryBuilder().

Closes #26083
cbuescher pushed a commit that referenced this pull request Sep 19, 2017
Request bodys that only consists of a String value can lead to endless loops in the
parser of several rest requests like e.g. `_count`. Up to 5.2 this seems to have been caught
in the logic guessing the content type of the request, but since then it causes the node to
block. This change introduces checks for receiving a valid xContent object before starting the
parsing in RestActions#parseTopLevelQueryBuilder().

Closes #26083
jasontedor added a commit to droberts195/elasticsearch that referenced this pull request Sep 19, 2017
* master: (278 commits)
  Move pre-6.0 node checkpoint to SequenceNumbers
  Invalid JSON request body caused endless loop (elastic#26680)
  added comment
  fix line length violation
  Moved the check to fetch phase. This basically means that we throw a better error message instead of an AOBE and not adding more restrictions.
  inner hits: Do not allow inner hits that use _source and have a non nested object field as parent
  Separate Painless Whitelist Loading from the Painless Definition (elastic#26540)
  convert more admin requests to writeable (elastic#26566)
  Handle release of 5.6.1
  Allow `InputStreamStreamInput` array size validation where applicable (elastic#26692)
  Update global checkpoint with permit after recovery
  Filter pre-6.0 nodes for checkpoint invariants
  Skip bad request REST test on pre-6.0
  Reenable BWC tests after disabling for backport
  Add global checkpoint tracking on the primary
  [Test] Fix reference/cat/allocation/line_8 test failure
  [Docs] improved description for fs.total.available_in_bytes (elastic#26657)
  Fix discovery-file plugin to use custom config path
  fix testSniffNodes to use the new error message
  Add check for invalid index in WildcardExpressionResolver (elastic#26409)
  ...
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Sep 20, 2017
* master: (67 commits)
  Restoring from snapshot should force generation of a new history uuid (elastic#26694)
  test: Use a single primary shard so that the exception can caught in the same way
  Move pre-6.0 node checkpoint to SequenceNumbers
  Invalid JSON request body caused endless loop (elastic#26680)
  added comment
  fix line length violation
  Moved the check to fetch phase. This basically means that we throw a better error message instead of an AOBE and not adding more restrictions.
  inner hits: Do not allow inner hits that use _source and have a non nested object field as parent
  Separate Painless Whitelist Loading from the Painless Definition (elastic#26540)
  convert more admin requests to writeable (elastic#26566)
  Handle release of 5.6.1
  Allow `InputStreamStreamInput` array size validation where applicable (elastic#26692)
  Update global checkpoint with permit after recovery
  Filter pre-6.0 nodes for checkpoint invariants
  Skip bad request REST test on pre-6.0
  Reenable BWC tests after disabling for backport
  Add global checkpoint tracking on the primary
  [Test] Fix reference/cat/allocation/line_8 test failure
  [Docs] improved description for fs.total.available_in_bytes (elastic#26657)
  Fix discovery-file plugin to use custom config path
  ...
@clintongormley clintongormley changed the title #26083 Fixed incomplete JSON body on count request making org.elasticsearch.rest.action.RestActions#parseTopLevelQueryBuilder go into endless loop Fixed incomplete JSON body on count request making org.elasticsearch.rest.action.RestActions#parseTopLevelQueryBuilder go into endless loop Sep 25, 2017
@lcawl lcawl removed the v6.1.0 label Dec 12, 2017
@clintongormley clintongormley added :Search/Search Search-related issues that do not fall into other categories and removed :Query DSL labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search/Search Search-related issues that do not fall into other categories v5.6.2 v6.0.0-rc1 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants