failed: 0 Server returned nothing (no headers, no data) #200

Closed
cornernote opened this issue Apr 19, 2015 · 8 comments · Fixed by #201 or #204

@cornernote

Some external sites are giving an error when I test from travis-ci.org:

I added them to my exclude, but I'd rather not. Any ideas as to why this happens?

@benbalter
Contributor

I am also seeing this with TechCrunch URLs (in addition to all WordPress URLs). See https://travis-ci.org/benbalter/benbalter.github.com/builds/59152824 for an example. Perhaps Travis's IP is blacklisted? Requesting the page with curl, or without a User-Agent header, returns the content as expected.

@gjtorikian
Owner

> Perhaps Travis's IP is blacklisted?

Seems not. I see curl working too, but proofer fails locally:

HTML::Proofer.new(['http://techcrunch.com/2011/01/07/twitter-informs-users-of-doj-wikileaks-court-order-didnt-have-to/']).run

Checking 1 external link...
- 
  *  External link http://techcrunch.com/2011/01/07/twitter-informs-users-of-doj-wikileaks-court-order-didnt-have-to/ failed: 0 Server returned nothing (no headers, no data)
RuntimeError: HTML-Proofer found 1 failure!

@gjtorikian
Owner

The Typhoeus User Agent is being blocked:

HTML::Proofer.new(['http://techcrunch.com/2011/01/07/twitter-informs-users-of-doj-wikileaks-court-order-didnt-have-to/'], :typhoeus => { :verbose => true, :headers => { 'User-Agent' => 'LOL' }}).run
Running ["ImageCheck", "LinkCheck", "ScriptCheck"] checks on ["http://techcrunch.com/2011/01/07/twitter-informs-users-of-doj-wikileaks-court-order-didnt-have-to/"] on *.html... 


Checking 1 external link...
Hostname was NOT found in DNS cache
  Trying 192.0.79.33...
Connected to techcrunch.com (192.0.79.33) port 80 (#0)
HEAD /2011/01/07/twitter-informs-users-of-doj-wikileaks-court-order-didnt-have-to/ HTTP/1.1
Host: techcrunch.com
Accept: */*
User-Agent: LOL

HTTP/1.1 200 OK

What are the ethics involved in my changing the User-Agent to something non-static to avoid being blacklisted again?

@gjtorikian
Owner

You can test this yourself with a simple curl call:

curl -A 'Typhoeus' -I http://techcrunch.com/2011/01/07/twitter-informs-users-of-doj-wikileaks-court-order-didnt-have-to/
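
For anyone who would rather reproduce the curl check from Ruby, here is a minimal sketch that uses only the standard library's Net::HTTP instead of Typhoeus. The helper names `build_head_request` and `head_status` are made up for illustration, and the `'Typhoeus'` string simply mirrors the `-A` value above; it is not necessarily the exact header the library sends.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a HEAD request carrying an explicit User-Agent.
# Hypothetical helper, not part of html-proofer or Typhoeus.
def build_head_request(url, user_agent)
  uri = URI(url)
  request = Net::HTTP::Head.new(uri)
  request['User-Agent'] = user_agent
  request
end

# Send the request and return the integer status code, or nil when the
# connection fails or the server closes it without answering.
# Assumption: any StandardError here is treated as "no response".
def head_status(url, user_agent)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_head_request(url, user_agent)).code.to_i
  end
rescue StandardError
  nil
end
```

Calling `head_status(url, 'Typhoeus')` and `head_status(url, 'LOL')` against the same URL would show whether the answer depends on the User-Agent alone.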

@benbalter
Contributor

> What are the ethics involved in my changing the User-Agent to something non-static to avoid being blacklisted again?

@gjtorikian Do you have any sense if we're being blocked by Travis or if we're being blocked by Drupal, WP, etc.?

@nacin do you have any idea why *.?wordpress.org would block Typhoeus (Ruby cURL library) user agents?

@benbalter
Contributor

Failing test on master:

1) Links test fails on redirects if not following
     Failure/Error: expect(proofer.failed_tests.first).to match(/failed: 301 No error/)
       expected "spec/html/proofer/fixtures/links/linkWithRedirect.html: External link http://timclem.wordpress.com/2012/03/01/mind-the-end-of-your-line/ failed: 0 Server returned nothing (no headers, no data)" to match /failed: 301 No error/
       Diff:
       @@ -1,2 +1,2 @@
       -/failed: 301 No error/
       +"spec/html/proofer/fixtures/links/linkWithRedirect.html: External link http://timclem.wordpress.com/2012/03/01/mind-the-end-of-your-line/ failed: 0 Server returned nothing (no headers, no data)"

     # ./spec/html/proofer/links_spec.rb:82:in `block (2 levels) in <top (required)>'

@benbalter
Contributor

@gjtorikian I believe the issue is somewhere in response_handler (on our end). It seems we properly handle 30x redirects when the server returns a response body, but fail on what should be a proper 30x redirect without a response body.

Example:

GET http://timclem.wordpress.com/2012/03/01/mind-the-end-of-your-line/ returns:

HTTP/1.1 301 Moved Permanently
Location: https://timclem.wordpress.com/2012/03/01/mind-the-end-of-your-line/
Content-Type: text/html
Server: nginx
X-ac: 1.dca _dca
Date: Mon, 20 Apr 2015 18:56:20 GMT
Content-Length: 178
Connection: close

<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx</center>
</body>
</html>

GET https://timclem.wordpress.com/2012/03/01/mind-the-end-of-your-line/ returns:

HTTP/1.1 301 Moved Permanently
Transfer-Encoding: Identity
Content-Type: text/html; charset=utf-8
Connection: close
Server: nginx
Location: http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/
X-ac: 1.dca _dca
Date: Mon, 20 Apr 2015 18:57:08 GMT
Vary: Cookie

Edit: My guess is this broke when WP started enforcing HTTPS, which would be a redirect to a redirect. Either way, we should properly handle contentless redirects.
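
To make the "redirect to a redirect" case concrete, here is a rough sketch of following a chain by the Location header only, so that a missing body does not matter. This is not html-proofer's response_handler. The names `resolve_redirects` and `example_chain`, the hash-based response shape, and the final 200 response are all assumptions for illustration. The fetcher is a stub, so nothing touches the network.

```ruby
require 'uri'

# Follow redirects using only the status code and Location header.
# `fetcher` is any callable that takes a URL and returns a hash with
# :code, :headers and :body (hypothetical shape for this sketch).
def resolve_redirects(url, fetcher, limit: 5)
  redirect_codes = [301, 302, 303, 307, 308]
  limit.times do
    response = fetcher.call(url)
    return [url, response] unless redirect_codes.include?(response[:code])

    location = response[:headers]['Location']
    raise ArgumentError, "redirect from #{url} has no Location header" if location.nil?

    # URI.join also resolves relative Location values against the current URL.
    url = URI.join(url, location).to_s
  end
  raise "too many redirects (limit #{limit})"
end

# Stubbed responses modelled on the two 301s above. The second has no
# body. The final 200 is assumed, since it is not shown in this thread.
def example_chain
  {
    'http://timclem.wordpress.com/2012/03/01/mind-the-end-of-your-line/' =>
      { code: 301, body: '301 Moved Permanently',
        headers: { 'Location' => 'https://timclem.wordpress.com/2012/03/01/mind-the-end-of-your-line/' } },
    'https://timclem.wordpress.com/2012/03/01/mind-the-end-of-your-line/' =>
      { code: 301, body: nil,
        headers: { 'Location' => 'http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/' } },
    'http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/' =>
      { code: 200, body: 'ok', headers: {} }
  }
end
```

With `chain = example_chain`, calling `resolve_redirects(start_url, ->(u) { chain.fetch(u) })` walks both hops without ever reading a body.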

@gjtorikian
Owner

Interesting! I'll take a look.
