
HTTPSConnectionPool(host='www.amd.com', port=443): Read timed out. (read timeout=60) #575

Closed
jpdsc opened this issue Sep 22, 2020 · 7 comments

Comments

jpdsc commented Sep 22, 2020

Hi there.
I have quite a few filters that notify me about software/firmware updates.
All of them except one are working as intended.

name: "AMD x570 Chipset Drivers"
url: "https://www.amd.com/en/support/chipsets/amd-socket-am4/x570"
ignore_connection_errors: true
ignore_http_error_codes: 4xx, 5xx
filter:
    - xpath: '(//div[contains(@class,"os-row")]//h4)[1]/a'
    - html2text: re

The os-row class is where the drivers are listed; I grab the first one in the list, which is always the chipset driver.
However, when I test the filter I get:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.8/http/client.py", line 1347, in getresponse
    response.begin()
  File "/usr/lib/python3.8/http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.8/http/client.py", line 268, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 724, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3.8/site-packages/urllib3/util/retry.py", line 403, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3.8/site-packages/urllib3/packages/six.py", line 735, in reraise
    raise value
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 428, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 335, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='www.amd.com', port=443): Read timed out. (read timeout=60)

Any idea what is causing this?

Thank you :-)

thp (Owner) commented Sep 23, 2020

https://www.amd.com/ seems to be timing out randomly for you (after one minute by default).

You could change the timeout of the job, but the question is: why does the page not respond within 60 seconds?

There's also ignore_timeout_errors to ignore such errors, although looking at the backtrace, I'm not sure if it would have worked (requests.exceptions.Timeout vs. urllib3.exceptions.ReadTimeoutError).
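For reference, a per-job timeout override would look something like this (a sketch; 120 seconds is just an example value):

```yaml
name: "AMD x570 Chipset Drivers"
url: "https://www.amd.com/en/support/chipsets/amd-socket-am4/x570"
timeout: 120              # seconds; example value (the default is 60)
ignore_timeout_errors: true
```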

jpdsc (Author) commented Sep 23, 2020

Thank you for your reply. I tried ignore_timeout_errors and timeout: 120.
Unfortunately, it didn't work, and searching on Google didn't give me much help either.

thp (Owner) commented Sep 23, 2020

Try the following patch; does it make ignore_timeout_errors work?

index c6f972d..f9b8536 100644
--- a/lib/urlwatch/jobs.py
+++ b/lib/urlwatch/jobs.py
@@ -37,7 +37,7 @@ import subprocess
 import requests
 import textwrap
 import urlwatch
-from requests.packages.urllib3.exceptions import InsecureRequestWarning
+from requests.packages.urllib3.exceptions import InsecureRequestWarning, ReadTimeoutError
 
 from .util import TrackSubClasses
 from .filters import FilterBase
@@ -340,7 +340,7 @@ class UrlJob(Job):
     def ignore_error(self, exception):
         if isinstance(exception, requests.exceptions.ConnectionError) and self.ignore_connection_errors:
             return True
-        if isinstance(exception, requests.exceptions.Timeout) and self.ignore_timeout_errors:
+        if isinstance(exception, (requests.exceptions.Timeout, ReadTimeoutError)) and self.ignore_timeout_errors:
             return True
         if isinstance(exception, requests.exceptions.TooManyRedirects) and self.ignore_too_many_redirects:
             return True

jpdsc (Author) commented Sep 23, 2020

I edited /usr/lib/python3.8/site-packages/urlwatch/jobs.py with your changes and tried the filter again.
Filter:

name: "AMD x570 Chipset Drivers"
url: "https://www.amd.com/en/support/chipsets/amd-socket-am4/x570"
ignore_connection_errors: true
ignore_http_error_codes: 4xx, 5xx
ignore_timeout_errors: true
filter:
    - xpath: '(//div[contains(@class,"os-row")]//h4)[1]/a'
    - html2text: re

No luck, same timeout error:
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='www.amd.com', port=443): Read timed out. (read timeout=60)

I also checked Pi-hole just to make sure it is not blocking anything; I don't see the query there.
Would you be able to try the filter to see if you get the same error? If not, perhaps one of the packages in my Dockerfile is outdated?

RUN set -xe \
    && apk add --no-cache ca-certificates \
                          build-base      \
                          libffi-dev      \
                          libxml2         \
                          libxml2-dev     \
                          libxslt         \
                          libxslt-dev     \
                          openssl-dev     \
                          python3         \
                          python3-dev     \
                          py3-pip         \
    && python3 -m pip install appdirs   \
                              cssselect \
                              keyring   \
                              lxml      \
                              minidb    \
                              pyyaml    \
                              requests  \
                              chump     \
                              urlwatch

jpdsc (Author) commented Sep 25, 2020

Closing this issue.

@thp, FYI: the issue was the HTTP headers.
I added:

headers:
    User-Agent: <redacted>

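For anyone hitting the same thing outside urlwatch, the workaround can be sketched with the standard library alone; the User-Agent string below is a made-up placeholder, since the value actually used here was redacted:

```python
import urllib.request

URL = "https://www.amd.com/en/support/chipsets/amd-socket-am4/x570"
# Placeholder User-Agent; the real value used in this issue was redacted.
UA = "Mozilla/5.0 (X11; Linux x86_64)"

# Build the request with an explicit User-Agent header.
req = urllib.request.Request(URL, headers={"User-Agent": UA})

# urllib capitalizes header names internally ("User-agent"), so verify it stuck.
print(req.get_header("User-agent"))

# The actual fetch is a network call, so it is left commented out here:
# with urllib.request.urlopen(req, timeout=60) as resp:
#     html = resp.read()
```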
jpdsc closed this as completed Sep 25, 2020
snowman commented Mar 14, 2021

@thp The patch you gave works for me; could you consider merging it? Thanks ❤️

Edit:

Okay, the patch is not needed, and my claim that "the patch works for me" was wrong: the website sometimes works and sometimes doesn't. Interestingly, the reason may be that the website is throttling you.

I have added the ignore_timeout_errors config option to ignore the error; previously I used ignore_connection_errors. Sorry for the mention.

neutric (Contributor) commented Jan 6, 2022

@thp, FYI: the issue was the HTTP headers. I added:

headers:
    User-Agent: <redacted>

This helped in my case.
