SSL error from warcprox #2696

Closed
rebeccacremona opened this issue Jan 22, 2020 · 10 comments

Comments

@rebeccacremona
Contributor

2020-01-22 17:16:28,399 [ERROR] mitmproxy.py 380: problem processing request 'GET /en HTTP/1.1': SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:720)')
Traceback (most recent call last):
  File "/root/.local/share/virtualenvs/perma_web-kkDLCj8f/lib/python3.5/site-packages/warcprox/mitmproxy.py", line 372, in do_COMMAND
    self._connect_to_remote_server()
  File "/root/.local/share/virtualenvs/perma_web-kkDLCj8f/lib/python3.5/site-packages/warcprox/warcproxy.py", line 184, in _connect_to_remote_server
    return warcprox.mitmproxy.MitmProxyHandler._connect_to_remote_server(self)
  File "/root/.local/share/virtualenvs/perma_web-kkDLCj8f/lib/python3.5/site-packages/warcprox/mitmproxy.py", line 279, in _connect_to_remote_server
    self._remote_server_conn.sock, server_hostname=self.hostname)
  File "/usr/lib/python3.5/ssl.py", line 385, in wrap_socket
    _context=self)
  File "/usr/lib/python3.5/ssl.py", line 760, in __init__
    self.do_handshake()
  File "/usr/lib/python3.5/ssl.py", line 996, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/lib/python3.5/ssl.py", line 641, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:720)
2020-01-22 17:16:28,404 [WARNING] mitmproxy.py 499: code 500, message [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:720)
2020-01-22 17:16:32,530 [INFO] __init__.py 161: <ListenerPostfetchProcessor(RunningStats, started daemon 140606953219840)> shutting down
2020-01-22 17:16:32,533 [INFO] __init__.py 161: <WarcWriterProcessor(WarcWriterProcessor, started daemon 140606961612544)> shutting down
2020-01-22 17:16:32,626 [INFO] warcproxy.py 500: shutting down

Happens when attempting to capture anything at https://www.deutschland.de
requests has no problem; PhantomJS has no problem either, even when using warcprox_controller.proxy.ca.ca_file as its --ssl-certificates-path.

@bensteinberg
Contributor

Not sure if this will shed any light, but https://www.ssllabs.com/ssltest/analyze.html?d=www.deutschland.de

@bensteinberg
Contributor

Why is mitmproxy trying to use SSLv3? It looks like that server doesn't have it enabled at all.

@rebeccacremona
Contributor Author

Maybe connected? internetarchive/warcprox#115

@rebeccacremona
Contributor Author

rebeccacremona commented Jan 22, 2020

Script to reproduce:

import threading

import requests
import warcprox
from warcprox.controller import WarcproxController

# Configure warcprox to listen on a local port and write WARCs to the current directory.
warcprox_port = 27500
options = warcprox.Options(
    address="127.0.0.1",
    port=warcprox_port,
    max_threads=100,
    writer_threads=1,
    gzip=True,
    stats_db_file="",
    dedup_db_file="",
    directory="./"
)
warcprox_controller = WarcproxController(options)
proxy_address = "127.0.0.1:%s" % warcprox_port

# Run the proxy in a background thread.
warcprox_thread = threading.Thread(target=warcprox_controller.run_until_shutdown, name="warcprox", args=())
warcprox_thread.start()

# Fetch the failing URL through the proxy; warcprox logs the handshake error.
proxies = {'http': 'http://' + proxy_address, 'https': 'http://' + proxy_address}
r = requests.get('https://www.deutschland.de/en', verify=False, proxies=proxies)

# Shut the proxy down and wait for its thread to exit.
warcprox_controller.stop.set()
warcprox_controller.proxy.pool.shutdown(wait=False)
warcprox_thread.join()

If run in the python REPL in the Perma docker image, you get the error.

If you run it in the REPL in a fresh python:3.5.3 image with requests[security]==2.20.0 and warcprox==2.4b2, you do not.

If you run in the REPL of a Perma image built freshly just now, with no build cache, you still get the error.

Hmm.

Time to start installing pinned versions of other packages.......

@rebeccacremona
Contributor Author

rebeccacremona commented Jan 22, 2020

Nope: pip freeze > requirements.txt, and then copy that into the python:3.5.3 container and pip install -r requirements.txt and no error..... though I observe that the python 3.5.3 image is built from Debian GNU/Linux 8.9 (jessie)!!

I've tried building the Perma image from debian:stretch-backports: same error.

Next: trying a fresh python 3.5.3 image built on stretch, and/or seeing if I can get Perma running on a buster image...

@rebeccacremona
Contributor Author

rebeccacremona commented Jan 22, 2020

No error on a fresh python:3.5-stretch image (==3.5.7... 3.5.3 is not available) with requirements frozen and installed as per above. No error on a buster-based Perma image (had to change mysql-client to default-mysql-client)... but the full application isn't working yet:

selenium.common.exceptions.WebDriverException: Message: Can not connect to GhostDriver on port 47115
Shutting down browser and proxies.

#phantomjs.log
Auto configuration failed
139948485062272:error:25066067:DSO support routines:DLFCN_LOAD:could not load the shared library:dso_dlfcn.c:185:filename(libssl_conf.so): libssl_conf.so: cannot open shared object file: No such file or directory
139948485062272:error:25070067:DSO support routines:DSO_load:could not load the shared library:dso_lib.c:244:
139948485062272:error:0E07506E:configuration file routines:MODULE_LOAD_DSO:error loading dso:conf_mod.c:285:module=ssl_conf, path=ssl_conf
139948485062272:error:0E076071:configuration file routines:MODULE_RUN:unknown module name:conf_mod.c:222:module=ssl_conf

Next: either trying to work out what's up with GhostDriver in Busterland, or trying to zero in on what's wrong with the Perma stretch env......

@rebeccacremona
Contributor Author

rebeccacremona commented Jan 23, 2020

This is fantastic: the error occurs running python 3.5.3 on all the various debian stretch images I had (pulled from Docker long ago), BOTH installed via apt-get AND built via pyenv... but not python 3.5.4, 3.5.7....... as built by pyenv. It does not occur on the 3.5.3 debian jessie image. It does still occur using the latest stretch image (docker run -it debian:stretch-20191224 bash) and debian:stretch-backports.

Using python -m requests.help I have confirmed that the same OpenSSL versions are used (system and pyopenssl) regardless of python version, in a given container.
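A quick complementary check, as a rough sketch: the traceback above goes through the standard library's ssl.py, so it can help to print the OpenSSL build that the interpreter's own ssl module is linked against, next to the Python version. The helper name tls_environment is made up for this example.

```python
import platform
import ssl


def tls_environment():
    """Report the Python version and the OpenSSL build used by the stdlib ssl module."""
    return {
        "python": platform.python_version(),
        # Human-readable OpenSSL version string compiled into this interpreter.
        "openssl": ssl.OPENSSL_VERSION,
        # (major, minor, fix) as integers, convenient for comparisons.
        "openssl_info": ssl.OPENSSL_VERSION_INFO[:3],
    }


print(tls_environment())
```

Running this under each interpreter in the same container should show whether only the Python version differs while the OpenSSL build stays the same.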

This is weird!

Best solution: let's either get pyenv going, or let's get the buster upgrade working!

@rebeccacremona
Contributor Author

Other URLs:

https://www.transstudent.org/definitions
https://www.solvoyo.com/
https://content.sciendo.com/configurable/contentpage/journals$002fkbo$002f25$002f2$002farticle-p141.xml
https://www.livetimesng.com/ill-run-for-kenya-presidency-in-2021-barack-obama-declared-intention/?fbclid=IwAR2YiZH6b7rPEeHww8FTk_xU-r_cmDQdyL0TQ5DQbFYOcOQ1v3bjvsRD2jg
https://www.stockholmresilience.org/research/planetary-boundaries/planetary-boundaries/about-the-research/the-nine-planetary-boundaries.html

This accounts for a significant fraction of recent failed captures. To satisfy curiosity, can we tell what they have in common?
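One way to start comparing these sites, sketched under assumptions: connect to each host directly with the standard library and record the negotiated TLS version and cipher suite, or the handshake error. The helper names hosts_from_urls and probe_tls are made up for this example, and the sketch does not report which elliptic curve the server chose.

```python
import socket
import ssl
from urllib.parse import urlsplit


def hosts_from_urls(urls):
    """Return the unique hostnames found in urls, preserving first-seen order."""
    seen = []
    for url in urls:
        host = urlsplit(url).hostname
        if host and host not in seen:
            seen.append(host)
    return seen


def probe_tls(host, port=443, timeout=10):
    """Attempt a TLS handshake with host and summarize the outcome."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                # cipher() returns (name, protocol, secret_bits); keep the name.
                return {"host": host, "version": tls.version(), "cipher": tls.cipher()[0]}
    except (ssl.SSLError, OSError) as exc:
        # Handshake failures and network errors are reported rather than raised.
        return {"host": host, "error": str(exc)}
```

Calling probe_tls on each entry of hosts_from_urls(...) from both a failing and a working environment would show whether the failures line up with a particular protocol version or cipher family.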

@rebeccacremona
Contributor Author

Got it: we are running into https://bugs.python.org/issue29697. (Located courtesy of matrix-org/synapse#2350 (comment) and websocket-client/websocket-client#353). This is a known CPython bug that only surfaces when using OpenSSL 1.1. Patching warcprox to set context.set_ecdh_curve('secp384r1') works for 5 of the 6 URLs above, on CPython 3.5.3 and Debian Stretch.

Rather than fussing with the particular curve, I'm going to continue to try and get things working on Buster, as a better long-term strategy.
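For reference, a minimal sketch of what that workaround looks like on a standard-library SSL context. This is not the actual warcprox patch: the function name make_remote_context is made up, and disabling certificate verification is an assumption about how a capturing proxy connects upstream.

```python
import ssl


def make_remote_context(curve="secp384r1"):
    """Build a client-side SSL context pinned to a single ECDH curve."""
    context = ssl.create_default_context()
    # Assumption: the proxy does not verify upstream certificates.
    # check_hostname must be turned off before verify_mode can be CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    # Workaround: name the ECDH curve explicitly instead of relying on the default.
    context.set_ecdh_curve(curve)
    return context


context = make_remote_context()
```

Pinning one curve is a trade-off: it helps servers that want secp384r1 but can break others, which may explain why it fixed only 5 of the 6 URLs.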

@rebeccacremona
Contributor Author

Fixed by latest deployment.
