Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

urllib3.exceptions.ReadTimeoutError: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60) #2266

Open
kenorb opened this issue Mar 1, 2019 · 5 comments

Comments

@kenorb
Copy link

kenorb commented Mar 1, 2019

I've 12 services in my docker-compose.yml which I'm running at the same time, however sometimes they're failing with timeout when running docker-compose up.

Logs:

test_2018_04_1  | Xvfb killed (pid 51)
test_2018_04_1  | 0m0.044s 0m0.073s
test_2018_04_1  | 0m35.236s 0m17.980s
test_2018_04_1  | /opt/scripts/run_backtest.sh done
test_2018_04_1  | Exiting /opt/scripts/run_backtest.sh with 0
test_2018_04_1  | Exiting /opt/scripts/eval.sh with 0
Exception in thread Thread-37:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.6/http/client.py", line 1331, in getresponse
    response.begin()
  File "/usr/lib/python3.6/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.6/http/client.py", line 258, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 357, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 389, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 309, in _raise_timeout
    raise ReadTimeoutError(self, url, "Read timed out. (read timeout=%s)" % timeout_value)
urllib3.exceptions.ReadTimeoutError: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3/dist-packages/compose/cli/log_printer.py", line 213, in watch_events
    for event in event_stream:
  File "/usr/lib/python3/dist-packages/compose/project.py", line 429, in events
    container = Container.from_id(self.client, event['id'])
  File "/usr/lib/python3/dist-packages/compose/container.py", line 43, in from_id
    return cls(client, client.inspect_container(id), has_been_inspected=True)
  File "/usr/lib/python3/dist-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/docker/api/container.py", line 728, in inspect_container
    self._get(self._url("/containers/{0}/json", container)), True
  File "/usr/lib/python3/dist-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/docker/api/client.py", line 198, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 533, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 520, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 630, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 521, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

advanced_test_2018_08_1 exited with code 0

Here is another one:

$ docker-compose up
Removing rider_test_2018_02_1
Removing rider_test_2018_03_1
Removing rider_test_2018_04_1
Removing rider_test_2018_05_1
Removing rider_test_2018_06_1
Removing rider_test_2018_07_1
Removing rider_test_2018_08_1
Removing rider_test_2018_09_1
Removing rider_test_2018_10_1
Removing rider_test_2018_11_1
Removing rider_test_2018_12_1
Recreating 1457f8b21b88_rider_test_2018_10_1 ... done
Recreating 34ed31327751_rider_test_2018_09_1 ... 
Recreating fd54029ab487_rider_test_2018_02_1 ... 
Recreating 34ed31327751_rider_test_2018_09_1 ... done
Recreating 53d8ae2428c7_rider_test_2018_11_1 ... 
Recreating 6cb568778336_rider_test_2018_07_1 ... 
Recreating fd54029ab487_rider_test_2018_02_1 ... done
Recreating 6cb568778336_rider_test_2018_07_1 ... done
Recreating bfadf2054817_rider_test_2018_05_1 ... 
Recreating 738b58d91cd8_rider_test_2018_03_1 ... 
Recreating 567e864cdb7e_rider_test_2018_12_1 ... 
Recreating 15966599ed4e_rider_test_2018_04_1 ... 

ERROR: for rider_test_2018_01_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for d1f6883e1d63_rider_test_2018_08_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for 567e864cdb7e_rider_test_2018_12_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for 738b58d91cd8_rider_test_2018_03_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

I've a good machine, so I don't think that's the issue. Everything is on local.


$ docker-compose --version
docker-compose version 1.21.0, build unknown
$ docker --version
Docker version 18.09.2, build 6247962

$ docker info
Containers: 362
 Running: 0
 Paused: 0
 Stopped: 362
Images: 718
Server Version: 18.09.2
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9754871865f7fe2f4e74d43e2fc7ccd237edcbce
runc version: 09c8266bf2fcf9519a651b04ae54c967b9ab86ec
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.18.0-15-generic
Operating System: Ubuntu 18.10
OSType: linux
Architecture: x86_64
CPUs: 36
Total Memory: 125.6GiB
Name: Ubuntu-PC
ID: QFCG:BXDV:INJF:DB6J:T3AY:7RST:KPG6:4A3S:6WAF:V5NB:UGLN:TJQH
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support

Related:

@MattGal
Copy link

MattGal commented May 15, 2019

+1 ... seeing this on Azure VMs running Ubuntu 16.04 semi-regularly which are basically just using docker-py to pull and run with some mounted volumes, and not sure how to deal with it.

@staehle
Copy link

staehle commented Jun 3, 2019

Are your containers reading/writing lots of data to disk?
I have had this same error before, and diagnosed it to high disk I/O in docker container volumes causing lag/dropped connections in the Docker API (or something similar -- its the best I could find).
My only solution was to reduce the number of running containers with Docker volumes on a single disk

@MattGal
Copy link

MattGal commented Jun 3, 2019

I've worked around this by realizing it always happened after right after reboot, despite systemctl saying that docker was active. The workaround was literally to do "docker run hello-world" in a loop until it succeeded before starting the scripts which use docker-py :)

@Hisol
Copy link

Hisol commented Jul 1, 2019

I am seeing this behaviour on an AWS EC2 c5.4xlarge instance, that is doing nothing but run containers. I don't think it is docker isn't running yet as I am getting this problem from code running within a container that is attempting to start up another container. Seems to work without issue on my local machines Ubuntu 18.04 VM, but on the EC2 instance, it all works for a period and then I get this error.

The Container being started up is a PostgreSQL container, and it is being initialised with SQL data sets of a few 100MB from SQL dump. I get the error from a Docker-Py call to run on a container in detached mode.

@adamnovak
Copy link

adamnovak commented Aug 16, 2019

I'm seeing very similar behavior to @Hisol; on an Amazon i3.8xlarge on Ubuntu with docker.io installed, one particular attempt to create a container with a couple mounted volumes through the docker Python module (version 2.5.1) will fail with a timeout. The image has been pulled already.

The actual container creation seems to take a bit over 2 minutes. The Python code actually waits around for the full 2 minutes and change, and only then raises the error that the timeout was exceeded, after the daemon finally sends a reply. The container does end up getting created, but it stays in "created" state and is never put into "running" state because my Python code has been stopped by the timeout.

I think it might have to do with the amount of data being mounted into the container (probably a few GB at least). When I capture the bytes that the module is sending over the Docker socket for the request, and replay them later after clearing out all existing containers and after the actual data that was supposed to be getting mounted in has been deleted, I can get the container to be created promptly.

Here's what socat has to say about what is going through the Docker socket when I interpose it:

> 2019/08/16 17:12:16.033632  length=918 from=0 to=917
 50 4f 53 54 20 2f 76 31 2e 33 39 2f 63 6f 6e 74  POST /v1.39/cont
 61 69 6e 65 72 73 2f 63 72 65 61 74 65 3f 6e 61  ainers/create?na
 6d 65 3d 72 75 6e 63 68 75 6e 6b 61 6c 69 67 6e  me=runchunkalign
 6d 65 6e 74 2d 2d 68 37 31 6a 64 33 76 37 41 66  ment--h71jd3v7Af
 62 51 20 48 54 54 50 2f 31 2e 31 0d 0a           bQ HTTP/1.1..
 48 6f 73 74 3a 20 6c 6f 63 61 6c 68 6f 73 74 0d  Host: localhost.
 0a                                               .
 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 6b 65 65 70  Connection: keep
 2d 61 6c 69 76 65 0d 0a                          -alive..
 41 63 63 65 70 74 2d 45 6e 63 6f 64 69 6e 67 3a  Accept-Encoding:
 20 67 7a 69 70 2c 20 64 65 66 6c 61 74 65 0d 0a   gzip, deflate..
 41 63 63 65 70 74 3a 20 2a 2f 2a 0d 0a           Accept: */*..
 55 73 65 72 2d 41 67 65 6e 74 3a 20 64 6f 63 6b  User-Agent: dock
 65 72 2d 73 64 6b 2d 70 79 74 68 6f 6e 2f 32 2e  er-sdk-python/2.
 35 2e 31 0d 0a                                   5.1..
 43 6f 6e 74 65 6e 74 2d 54 79 70 65 3a 20 61 70  Content-Type: ap
 70 6c 69 63 61 74 69 6f 6e 2f 6a 73 6f 6e 0d 0a  plication/json..
 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a 20  Content-Length: 
 36 36 33 0d 0a                                   663..
 0d 0a                                            ..
 7b 22 56 6f 6c 75 6d 65 73 22 3a 20 7b 22 2f 64  {"Volumes": {"/d
 61 74 61 22 3a 20 7b 7d 7d 2c 20 22 54 74 79 22  ata": {}}, "Tty"
 3a 20 66 61 6c 73 65 2c 20 22 4e 65 74 77 6f 72  : false, "Networ
 6b 44 69 73 61 62 6c 65 64 22 3a 20 66 61 6c 73  kDisabled": fals
 65 2c 20 22 57 6f 72 6b 69 6e 67 44 69 72 22 3a  e, "WorkingDir":
 20 22 2f 64 61 74 61 22 2c 20 22 49 6d 61 67 65   "/data", "Image
 22 3a 20 22 71 75 61 79 2e 69 6f 2f 76 67 74 65  ": "quay.io/vgte
 61 6d 2f 76 67 3a 64 65 76 2d 76 31 2e 31 38 2e  am/vg:dev-v1.18.
 30 2d 31 30 2d 67 36 64 65 63 37 37 65 34 38 2d  0-10-g6dec77e48-
 74 33 32 35 2d 72 75 6e 22 2c 20 22 43 6d 64 22  t325-run", "Cmd"
 3a 20 5b 22 6d 61 70 22 2c 20 22 2d 66 22 2c 20  : ["map", "-f", 
 22 72 65 61 64 73 5f 63 68 75 6e 6b 5f 30 5f 30  "reads_chunk_0_0
 2e 66 71 2e 67 7a 22 2c 20 22 2d 74 22 2c 20 22  .fq.gz", "-t", "
 31 22 2c 20 22 2d 78 22 2c 20 22 67 72 61 70 68  1", "-x", "graph
 2e 76 67 2e 78 67 22 2c 20 22 2d 67 22 2c 20 22  .vg.xg", "-g", "
 67 72 61 70 68 2e 76 67 2e 67 63 73 61 22 5d 2c  graph.vg.gcsa"],
 20 22 53 74 64 69 6e 4f 6e 63 65 22 3a 20 66 61   "StdinOnce": fa
 6c 73 65 2c 20 22 48 6f 73 74 43 6f 6e 66 69 67  lse, "HostConfig
 22 3a 20 7b 22 42 69 6e 64 73 22 3a 20 5b 22 2f  ": {"Binds": ["/
 74 6d 70 2f 74 6f 69 6c 2d 34 33 38 30 39 39 36  tmp/toil-4380996
 66 2d 62 62 38 32 2d 34 30 62 34 2d 39 64 30 39  f-bb82-40b4-9d09
 2d 61 34 39 30 38 38 66 65 37 64 39 37 2d 30 36  -a49088fe7d97-06
 36 30 38 35 65 61 30 31 61 62 34 35 37 30 62 65  6085ea01ab4570be
 65 37 39 65 36 63 64 63 38 38 64 61 61 37 2f 74  e79e6cdc88daa7/t
 6d 70 55 35 31 77 49 4a 2f 66 37 37 39 30 64 62  mpU51wIJ/f7790db
 31 2d 65 39 37 38 2d 34 36 30 62 2d 39 30 33 64  1-e978-460b-903d
 2d 30 35 62 38 35 62 38 31 61 62 37 30 2f 74 58  -05b85b81ab70/tX
 5a 76 33 52 74 3a 2f 64 61 74 61 3a 72 77 22 5d  Zv3Rt:/data:rw"]
 2c 20 22 4e 65 74 77 6f 72 6b 4d 6f 64 65 22 3a  , "NetworkMode":
 20 22 64 65 66 61 75 6c 74 22 7d 2c 20 22 45 6e   "default"}, "En
 74 72 79 70 6f 69 6e 74 22 3a 20 5b 22 76 67 22  trypoint": ["vg"
 5d 2c 20 22 55 73 65 72 22 3a 20 22 31 30 30 30  ], "User": "1000
 3a 31 30 30 30 22 2c 20 22 45 6e 76 22 3a 20 5b  :1000", "Env": [
 22 4c 43 5f 41 4c 4c 3d 43 22 2c 20 22 56 47 5f  "LC_ALL=C", "VG_
 46 55 4c 4c 5f 54 52 41 43 45 42 41 43 4b 3d 31  FULL_TRACEBACK=1
 22 2c 20 22 54 4d 50 44 49 52 3d 2e 22 5d 2c 20  ", "TMPDIR=."], 
 22 41 74 74 61 63 68 53 74 64 69 6e 22 3a 20 66  "AttachStdin": f
 61 6c 73 65 2c 20 22 41 74 74 61 63 68 53 74 64  alse, "AttachStd
 65 72 72 22 3a 20 66 61 6c 73 65 2c 20 22 41 74  err": false, "At
 74 61 63 68 53 74 64 6f 75 74 22 3a 20 66 61 6c  tachStdout": fal
 73 65 2c 20 22 4f 70 65 6e 53 74 64 69 6e 22 3a  se, "OpenStdin":
 20 66 61 6c 73 65 7d                              false}
--
< 2019/08/16 17:14:24.128184  length=297 from=0 to=296
 48 54 54 50 2f 31 2e 31 20 32 30 31 20 43 72 65  HTTP/1.1 201 Cre
 61 74 65 64 0d 0a                                ated..
 41 70 69 2d 56 65 72 73 69 6f 6e 3a 20 31 2e 33  Api-Version: 1.3
 39 0d 0a                                         9..
 43 6f 6e 74 65 6e 74 2d 54 79 70 65 3a 20 61 70  Content-Type: ap
 70 6c 69 63 61 74 69 6f 6e 2f 6a 73 6f 6e 0d 0a  plication/json..
 44 6f 63 6b 65 72 2d 45 78 70 65 72 69 6d 65 6e  Docker-Experimen
 74 61 6c 3a 20 66 61 6c 73 65 0d 0a              tal: false..
 4f 73 74 79 70 65 3a 20 6c 69 6e 75 78 0d 0a     Ostype: linux..
 53 65 72 76 65 72 3a 20 44 6f 63 6b 65 72 2f 31  Server: Docker/1
 38 2e 30 39 2e 37 20 28 6c 69 6e 75 78 29 0d 0a  8.09.7 (linux)..
 44 61 74 65 3a 20 46 72 69 2c 20 31 36 20 41 75  Date: Fri, 16 Au
 67 20 32 30 31 39 20 31 37 3a 31 34 3a 32 34 20  g 2019 17:14:24 
 47 4d 54 0d 0a                                   GMT..
 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a 20  Content-Length: 
 39 30 0d 0a                                      90..
 0d 0a                                            ..
 7b 22 49 64 22 3a 22 38 36 37 39 61 33 65 31 34  {"Id":"8679a3e14
 38 63 35 38 31 62 30 32 34 34 30 34 32 35 64 65  8c581b02440425de
 63 30 31 39 30 39 38 31 39 31 65 62 61 61 64 65  c019098191ebaade
 65 32 65 66 36 61 62 63 35 65 39 33 35 33 30 66  e2ef6abc5e93530f
 38 38 39 33 36 30 62 22 2c 22 57 61 72 6e 69 6e  889360b","Warnin
 67 73 22 3a 6e 75 6c 6c 7d 0a                    gs":null}.
--
2019/08/16 17:14:24 socat[14638] E write(6, 0x5590ed5487c0, 297): Broken pipe

Note that it takes 2 minutes for the reply to come back from the daemon. I don't see the timeout exception on the client side until after the reply is printed by socat.

Given that the timeout doesn't actually seem to be any protection against a hung server, and given that operations like pulling an image or even starting a container from an image that has already been pulled can consistently, in some configurations, take more than 60 seconds, the default timeout should be raised. In fact, I think the default behavior should be to wait indefinitely as long as the connection to the daemon has not dropped, unless the user actually asks for a timeout.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants