Fix test issue caused by change in Requests library #1762
Closed
Feature Summary and Justification
Requests recently changed the library it uses to detect content encoding. Under Python 3, if chardet, the library it previously used, is not present on the system, the new library charset_normalizer is used instead. Unfortunately, charset_normalizer has trouble detecting the content encoding when the content is small, which causes the test TestSubreddit.test_submit_image_large to fail. The solution offered here is to specify the content encoding in the affected test.

Here is an example of the failed test, which happened in #1760. That PR just happened to be the first one made after the test environment was updated, and the change suggested in it was not responsible for the test's failure.
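In outline, the fix looks like this (a sketch, not the actual patch; the payload below is a placeholder, not the real media-upload XML):

```python
import requests

# Build a mock Response the way a test might (placeholder content).
response = requests.Response()
response._content = b"<Tags><Tag>valid xml</Tag></Tags>"

# Without an explicit charset, response.text falls back to the "apparent
# encoding" guessed by chardet or charset_normalizer, which can be wrong
# for short payloads. Pinning the encoding avoids the guess entirely:
response.encoding = "UTF-8"

print(response.text)
```

With response.encoding set, Requests never consults the detection library, so the test no longer depends on which detector happens to be installed.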
In the affected test, Subreddit.submit_image is fed a mock response constructed in the test. Subreddit.submit_image calls Subreddit._submit_media, which calls Subreddit._parse_xml_response, which takes the response defined above and parses response.text as XML. Since the character encoding of response wasn't specified, Requests's text property falls back to the apparent encoding. The test environment doesn't have chardet, so charset_normalizer is called to guess the encoding, and it incorrectly guesses UTF-16BE. The response is then decoded with the wrong encoding, which xml.etree.ElementTree.XML fails to parse.

Here is a script that shows more details:
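A minimal sketch of such a script (the payload is a placeholder, so the guessed encoding may differ from the UTF-16BE seen in the real failure):

```python
import requests

# Placeholder payload; the real test uses Reddit's media-upload XML response.
payload = b"<PostResponse><Location>https://reddit.com/</Location></PostResponse>"

response = requests.Response()
response._content = payload

# No charset was specified, so Requests falls back to the apparent encoding,
# which is guessed by chardet or, in its absence, charset_normalizer.
print("apparent encoding:", response.apparent_encoding)
print("decoded text:", response.text)
```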
The last line of its output is the same bizarre Chinese string shown in the failed test.
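That Chinese-looking string is exactly what results from decoding ASCII/UTF-8 bytes as UTF-16BE: each pair of ASCII bytes fuses into a single code point, most of which land in the CJK range. A standalone illustration (with a made-up XML snippet of even byte length, since UTF-16 needs byte pairs):

```python
from xml.etree.ElementTree import XML, ParseError

# Plain ASCII XML bytes...
data = b"<Tags><Tag>Go</Tag></Tags>"

# ...decoded with the wrong encoding pair up into CJK-range code points.
garbled = data.decode("utf-16-be")
print(garbled)

# The garbled text no longer starts with "<", so XML parsing fails.
try:
    XML(garbled)
except ParseError:
    print("ParseError: the garbled text is not valid XML")

# Decoding with the correct encoding recovers the XML.
print(data.decode("utf-8"))
```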