Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use streamreader instead of string to load xml #1004

Merged
merged 11 commits into from
Oct 12, 2016

Conversation

martinscholz83
Copy link
Contributor

fixes #985

If you have URL characters in your project path like C:\Project\Project%20\proj.csproj XMLTextReader.Read() method try to unescape these characters. In result the build is failed because it doesn't find the path. Workaround is to load xml from StreamReader instead of string.

@cdmihai
Copy link
Contributor

cdmihai commented Sep 9, 2016

I wonder if the tests fail because the stream is not disposed, and thus the file handle not released. Can you try adding the stream in the using statement?

@martinscholz83 martinscholz83 force-pushed the Use-StreamReader-For-Load-XML branch 3 times, most recently from b05d91a to b2c2a24 Compare September 9, 2016 08:23
@rainersigwald
Copy link
Member

So . . . was the old way doing encoding detection based on XML magic, and the new way now isn't? Or is there another explanation for those test failures?

@martinscholz83
Copy link
Contributor Author

@rainersigwald, for now we use XmlTextReader to import the xml. XmlTextReader has the ability to read the encoding from string. But since we now read from a stream it can't read the encoding and use a default one. This is why these tests are failing.

@martinscholz83
Copy link
Contributor Author

I have added a commit to change from XmlTextReader to XmlReader. I've tested with url characters like %20 and all works fine
image

Because we now read from a stream we should use XmlReader instead of
XmlTextReader and get encoding with `GetAttribute()`.
@cdmihai
Copy link
Contributor

cdmihai commented Sep 15, 2016

@dotnet-bot test this please

@rainersigwald
Copy link
Member

Ah, nice detective work @maddin2016.

Also

Starting with the .NET Framework 2.0, we recommend that you use the System.Xml.XmlReader class instead.

😆 Well, that class was deprecated so recently!

@martinscholz83
Copy link
Contributor Author

👍 😉

In XMLReader `DtdProcessing` property is readonly and had to be set in
constructor
enable BOM for StreamReader and use this as default encoding instead of
`Default.Encoding`
change unit tests that they work with the new XMLReader
XmlReaderSettings dtdSettings = new XmlReaderSettings();
dtdSettings.DtdProcessing = DtdProcessing.Ignore;

using (var stream = new StreamReader(fullPath, true))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndyGerlicher the bool here is for BOM detection, which may be relevant in the future make-assumptions-about-content-without-explicit-setting stuff interesting. I don't think it's wrong, but I do hope we can just get to the UTF-8 everywhere model.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stumbled upon this here where encoding is added with BOM and we should be able to read this encoding instead of using a default one.

@@ -112,7 +112,7 @@ public void TestLargeElementLocationUsedLargeColumn()
}
catch (InvalidProjectFileException ex)
{
Assert.Equal(70012, ex.ColumnNumber);
Assert.Equal(1, ex.ColumnNumber);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These results are wrong, though, aren't they? In xplat we have them disabled for netcore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean ex.ColumnNumber should be 70012?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sure seems like it--we're deliberately inserting 70k spaces and no newline, so I'd expect that element to start around column 70k.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh...ok, it wasn't clear to me what column exactly means.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it's a bug?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it seems like a bug to me. But I don't know where, exactly.

These tests were disabled because of same error like in dotnet#270
jeffkl added a commit to jeffkl/msbuild that referenced this pull request Oct 12, 2016
This just ignores purely whitespace text nodes which seem to be part of the parsing in .NET Core and newer versions of System.Xml

Closes dotnet#270, fixes test failures in dotnet#1004
jeffkl added a commit that referenced this pull request Oct 12, 2016
This just ignores purely whitespace text nodes. The XML reader only looks at the first 4k of text to determine if it should be ignored. Our tests are generating 70k of whitespace so the XML reader gives up after 4k and assumes its a text node.

Closes #270

Fixes test failures in #1004 when this change is ported to master.
jeffkl added a commit that referenced this pull request Oct 12, 2016
Manual port of 96e6a30 from xplat

Fixes test issue in #1004
@rainersigwald
Copy link
Member

@dotnet-bot test this please

(the tests should fail after #1189 made the location stuff work again)

@jeffkl
Copy link
Contributor

jeffkl commented Oct 12, 2016

Actually, I think the tests were disabled in 430a051. @maddin2016 can you undo your changes to the ElementLocation_Tests.cs file? The tests should be fixed now via #1189.

@rainersigwald
Copy link
Member

@maddin2016 also revert decd351 please--the current state should fail since the column-number problem should now be fixed.

This reverts commit decd351.
Copy link
Member

@rainersigwald rainersigwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Let's see those tests pass!

@martinscholz83
Copy link
Contributor Author

🙌 🎉

@jeffkl
Copy link
Contributor

jeffkl commented Oct 12, 2016

Thanks a lot for this contribution!

@jeffkl jeffkl merged commit 33761a5 into dotnet:master Oct 12, 2016
rainersigwald added a commit to rainersigwald/msbuild that referenced this pull request Oct 31, 2016
When dotnet#1004 moved the standard XML reading approach to be stream-based
rather than file-based, the exceptions thrown on malformed XML
changed--System.Xml no longer knows the path, so it isn't included in the
XmlException. That caused MSBuild to fail to report the location of the
XML error in a nice way as it had done before.

Almost every case where we constructed a BuildEventFileInfo object already
had access to the full path, so I added a constructor that accepted that
as an argument and overrides the possibly-empty path returned from
XmlException.SourceUri.

Fixes dotnet#1286.
rainersigwald added a commit that referenced this pull request Oct 31, 2016
When #1004 moved the standard XML reading approach to be stream-based
rather than file-based, the exceptions thrown on malformed XML
changed--System.Xml no longer knows the path, so it isn't included in the
XmlException. That caused MSBuild to fail to report the location of the
XML error in a nice way as it had done before.

Almost every case where we constructed a BuildEventFileInfo object already
had access to the full path, so I added a constructor that accepted that
as an argument and overrides the possibly-empty path returned from
XmlException.SourceUri.

Added a unit test to verify that the information is indeed preserved in the exception.

Fixes #1286.
AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this pull request Nov 22, 2016
Background:
* Previous implementation on Full Framework used XmlTextReader(path). This
  contained an issue with certain characters (dotnet#985) and was fixed by using
  streams (dotnet#1004). dotnet#1004 also changed from XmlTextReader to XmlReader.
* XmlReader contains logic to normalize line endings. Internally, it sets
  the Normalize property to true and replaces (some? all?) \r\n with \n.

This change switches implementation to use XmlTextReader. This class sets
the internal Normalize to false and does not replace \r\n with \n. This
fixes dotnet#1340.

However, .NET Core does not ship with XmlTextReader, only XmlReader. dotnet#1340
still exists for .NET Core.
AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this pull request Nov 22, 2016
Background:
* Previous implementation on Full Framework used XmlTextReader(path). This
  contained an issue with certain characters (dotnet#985) and was fixed by using
  streams (dotnet#1004). dotnet#1004 also changed from XmlTextReader to XmlReader.
* XmlReader contains logic to normalize line endings. Internally, it sets
  the Normalize property to true and replaces (some? all?) \r\n with \n.

This change switches implementation to use XmlTextReader. This class sets
the internal Normalize to false and does not replace \r\n with \n. This
fixes dotnet#1340.

However, .NET Core does not ship with XmlTextReader, only XmlReader. dotnet#1340
still exists for .NET Core.
AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this pull request Nov 22, 2016
Background:
* Previous implementation on Full Framework used XmlTextReader(path). This
  contained an issue with certain characters (dotnet#985) and was fixed by using
  streams (dotnet#1004). dotnet#1004 also changed from XmlTextReader to XmlReader.
* XmlReader contains logic to normalize line endings. Internally, it sets
  the Normalize property to true and replaces (some? all?) \r\n with \n.

This change switches implementation to use XmlTextReader. This class sets
the internal Normalize to false and does not replace \r\n with \n. This
fixes dotnet#1340.

However, .NET Core does not ship with XmlTextReader, only XmlReader. dotnet#1340
still exists for .NET Core.
AndyGerlicher added a commit that referenced this pull request Nov 22, 2016
Background:
* Previous implementation on Full Framework used XmlTextReader(path). This
  contained an issue with certain characters (#985) and was fixed by using
  streams (#1004). #1004 also changed from XmlTextReader to XmlReader.
* XmlReader contains logic to normalize line endings. Internally, it sets
  the Normalize property to true and replaces (some? all?) \r\n with \n.

This change switches implementation to use XmlTextReader. This class sets
the internal Normalize to false and does not replace \r\n with \n. This
fixes #1340.

However, .NET Core does not ship with XmlTextReader, only XmlReader. #1340
still exists for .NET Core.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error building when project has percent '%' char in the path
6 participants