Unnecessary double encoding of url #839
Comments
Faced the same issue!
It fails with 'Exception in thread "main" org.jsoup.HttpStatusException: HTTP error fetching URL. Status=400, URL=https://en.wikipedia.org/wiki/Adolph_St%25C3%25B6hr' (double encoding). A possible workaround is to use Apache HttpClient for loading the external data.
Same here. This changed in this version and my scraper "died".
Same. It worked well in 1.10.1.
I have the same issue.
Sorry about the issues here. This is fixed in 1.10.3 (upcoming). I fixed it back in 56a728d. Will close when 1.10.3 is released.
jsoup 1.10.3 is out now: https://jsoup.org/news/release-1.10.3
The URL http://test.com/& is not treated correctly as of version 1.10.2 (or maybe even 1.10.1).
Jsoup.connect("http://test.com/" + URLEncoder.encode("&", "UTF-8")).get();
The URL that gets passed to Jsoup is now http://test.com/%26 because the & was URL-encoded. Everything works fine in 1.9.2 because the encodeUrl(String url) method in the HttpConnection class does not modify the given URL in this example, as it contains no space. In 1.10.2 the same URL gets encoded again in the encodeUrl() method, which leads to the following URL:
http://test.com/%2526
(the percent sign of the URL passed to Jsoup is unnecessarily encoded again).
A workaround for this issue is to downgrade to 1.9.2, where the encodeUrl method was implemented differently (see below).
1.10.2:
1.9.2:
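To make the symptom reproducible without jsoup, here is a minimal sketch using only the JDK. It is not jsoup's code from either version; `DoubleEncodingDemo` and `encodeSegment` are hypothetical names chosen for illustration. It only shows that percent-encoding an already encoded segment turns `%26` into `%2526`, which matches the URLs reported in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DoubleEncodingDemo {

    // Hypothetical helper (not part of jsoup): percent-encodes a segment
    // unconditionally, without checking whether it is already encoded.
    public static String encodeSegment(String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String base = "http://test.com/";

        // What the caller does before handing the URL to the library.
        String once = encodeSegment("&");

        // What happens if the library encodes the same segment again:
        // the '%' itself becomes "%25".
        String twice = encodeSegment(once);

        System.out.println(base + once);
        System.out.println(base + twice);
    }
}
```

The same effect explains the Wikipedia URL from the comments: the segment `Adolph_St%C3%B6hr` passed through this helper a second time becomes `Adolph_St%25C3%25B6hr`.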