-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ignore whitespace text nodes when parsing projects #1187
Conversation
This just ignores purely whitespace text nodes which seem to be part of the parsing in .NET Core and newer versions of System.Xml Closes dotnet#270, fixes test failures in dotnet#1004
@rainersigwald figured out the problem. The XML parser only looks at the first 4k of text to see if it's whitespace or not. Since our whitespace is 70k, it gives up and assumes it's a text node instead of ignoring it. Our call to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nits. Awesome to finally have this root-caused and fixed!
@@ -48,6 +48,13 @@ private static List<XmlElementWithLocation> GetChildElements(XmlElementWithLocat | |||
VerifyThrowProjectValidNamespace(childElement); | |||
children.Add(childElement); | |||
break; | |||
case XmlNodeType.Text: | |||
// Whitespace is read as a #text element so if the node is purely whitespace, just ignore it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's odd that we're getting a whitespace-only text node when it should come back as a Whitespace node. But the reason is that there's a limit in XmlTextReaderImpl
's determine-node-type code: if it's greater than 4kB of whitespace, it'll return as Text
.
But given that it works this way, this check sounds good.
if (!String.IsNullOrWhiteSpace(child.InnerText) && throwForInvalidNodeTypes) | ||
{ | ||
ThrowProjectInvalidChildElement(child.Name, element.Name, element.Location); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer
if (string.IsNullOrWhitespace(child.InnerText)) {
// long whitespace comes back as text, see [link]
break;
}
goto default;
and rely on the default
case, instead of having to check whether or not to throw and throwing in this case.
with a fall
@@ -48,8 +48,15 @@ private static List<XmlElementWithLocation> GetChildElements(XmlElementWithLocat | |||
VerifyThrowProjectValidNamespace(childElement); | |||
children.Add(childElement); | |||
break; | |||
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AHH more whitespace, i'll remove this. #whitespacematters
This just ignores purely whitespace text nodes. The XML reader only looks at the first 4k of text to determine if it should be ignored. Our tests are generating 70k of whitespace so the XML reader gives up after 4k and assumes its a text node.
Closes #270
Fixes test failures in #1004 when this change is ported to master.