BUG, ENH: Read Data From Password-Protected URL's and allow self signed SSL certs #16910

skynss · 2017-07-13T19:55:11Z

closes read_csv from HTTPs + basic-auth + custom port throws an error (urlopen error) #16716
tests added / passed
passes git diff upstream/master --name-only -- '*.py' | flake8 --diff
whatsnew entry

gfyoung · 2017-07-13T20:02:26Z

doc/source/whatsnew/v0.21.0.txt

@@ -40,7 +40,8 @@ Other Enhancements
 - :func:`DataFrame.clip()` and :func:`Series.clip()` have gained an ``inplace`` argument. (:issue:`15388`)
 - :func:`crosstab` has gained a ``margins_name`` parameter to define the name of the row / column that will contain the totals when ``margins=True``. (:issue:`15972`)
 - :func:`Dataframe.select_dtypes` now accepts scalar values for include/exclude as well as list-like. (:issue:`16855`)
-
+- :func:`read_csv` `read_html` `read_json` `read_html`  now accept auth in url //<user>:<password>@<host>:<port>/<url-path>, or ``auth`` tuple (username, password) parameter 
+- :func:`read_csv` `read_html` `read_json` `read_html`  now accept ``verify_ssl`` False to disable https/ssl certificate verification (eg: self signed ssl certs in testing) 
 .. _whatsnew_0210.api_breaking:


Let's condense into one line:

- It is possible to read data (i.e. CSV, JSON, HTML) from a URL that is password-protected (:issue:`16716`)

Note that I also put the issue number at the end of the line.

gfyoung · 2017-07-13T20:03:58Z

pandas/io/common.py

+    compression:
+    auth: (str,str), default None. (username, password) for HTTP(s) basic auth
+    verify_ssl: Default True. If False, allow self signed and invalid SSL
+                 certificates for https



Why is the compression field empty?

The formatting for auth and verify_ssl should be patched. The general format is the following:

<var_name> : <data_type>, <defaults> <description>

Actually the compression already existed, but was not in the comments. I simply added it - to comments and left it empty because I was not too familiar to add best docs. I'll fix the rest

jreback · 2017-07-13T19:59:28Z

doc/source/whatsnew/v0.21.0.txt

@@ -40,7 +40,8 @@ Other Enhancements
 - :func:`DataFrame.clip()` and :func:`Series.clip()` have gained an ``inplace`` argument. (:issue:`15388`)
 - :func:`crosstab` has gained a ``margins_name`` parameter to define the name of the row / column that will contain the totals when ``margins=True``. (:issue:`15972`)
 - :func:`Dataframe.select_dtypes` now accepts scalar values for include/exclude as well as list-like. (:issue:`16855`)
-
+- :func:`read_csv` `read_html` `read_json` `read_html`  now accept auth in url //<user>:<password>@<host>:<port>/<url-path>, or ``auth`` tuple (username, password) parameter 


use commans in between, and you need :func: on each one

@jreback : See my comment above. Do we need this level of detail here? Perhaps a separate section describing what you can now might be good?

@jreback @gfyoung Pls let me know which is the way to go. Also, if summary here with detailed section.. where does the section belong in this doc?

@skynss : Sure thing. I would refrain from doing the detailed section right now, as I would like to see what @jreback thinks of that suggestion. If he thinks it's a good idea, the section would go right below the "Other Enhancements" title. In that section, you would provide examples of how to use your newly added functionality.

jreback · 2017-07-13T19:59:47Z

doc/source/whatsnew/v0.21.0.txt

@@ -40,7 +40,8 @@ Other Enhancements
 - :func:`DataFrame.clip()` and :func:`Series.clip()` have gained an ``inplace`` argument. (:issue:`15388`)
 - :func:`crosstab` has gained a ``margins_name`` parameter to define the name of the row / column that will contain the totals when ``margins=True``. (:issue:`15972`)
 - :func:`Dataframe.select_dtypes` now accepts scalar values for include/exclude as well as list-like. (:issue:`16855`)
-
+- :func:`read_csv` `read_html` `read_json` `read_html`  now accept auth in url //<user>:<password>@<host>:<port>/<url-path>, or ``auth`` tuple (username, password) parameter 
+- :func:`read_csv` `read_html` `read_json` `read_html`  now accept ``verify_ssl`` False to disable https/ssl certificate verification (eg: self signed ssl certs in testing) 


is this user visible?

Yes, it is. These parameters are available at the top level API.

jreback · 2017-07-13T20:00:09Z

pandas/io/common.py

@@ -186,7 +190,12 @@ def get_filepath_or_buffer(filepath_or_buffer, encoding=None,
    ----------
    filepath_or_buffer : a url, filepath (str, py.path.local or pathlib.Path),
                         or buffer
+            supports 'https://username:password@fqdn.com:port/aaa.csv'


pls clarify and add a versionadded tag (0.21.)

jreback · 2017-07-13T20:01:16Z

pandas/io/common.py

    encoding : the encoding to use to decode py3 bytes, default is 'utf-8'
+    compression:


compression: string, default None # the expl is indented on the next line auth: (string, string), default None username, password...... same for verify_ssl

add a versionadded tag

jreback · 2017-07-13T20:01:45Z

pandas/io/common.py

+    -------
+    (username, password), url_no_usrpwd : username or "", password or "", 
+         url without username or password (if it contained it )
+    """


show what this Raises

jreback · 2017-07-13T20:01:54Z

pandas/io/common.py

+    url_with_uname : a url that may or may not contain username and password
+            see section 3.1 RFC 1738 https://www.ietf.org/rfc/rfc1738.txt
+            //<user>:<password>@<host>:<port>/<url-path>
+    auth : ( username/""/None, password/"", None) tuple


same as above

jreback · 2017-07-13T20:03:32Z

pandas/tests/io/test_common.py

+                                       '',
+                                       'https://ccc.com:1010/aaa.txt'
+                                       )]:
+            un, pw, mod_url = common.split_uname_from_url(url)


does this raise at all?

This function name actually doesn't exist anymore. @skynss : could you rename this to the correct function?

jreback · 2017-07-13T20:04:36Z

pandas/tests/test_common.py

@@ -1,223 +1,262 @@
-# -*- coding: utf-8 -*-
-
+"""


what is changed in this file that is releveant to this PR?

gfyoung · 2017-07-13T20:07:18Z

pandas/io/common.py

+    -------
+    (username, password), url_no_usrpwd : username or "", password or "", 
+         url without username or password (if it contained it )
+    """


See my comment here to patch the formatting for url_with_uname.

The return format will need to be changed. The general format is this:

<var_name> : <data_type> <Description>

However, in this case, it would be preferable to describe the returned object without any naming, since this is a nested tuple object e.g.:

Returns -------- A length-two tuple containing the following: - A length-two tuple of username and password. These will be empty strings if none were extracted - The URL stripped of the username and password if provided in the URL.

gfyoung · 2017-07-13T20:08:03Z

pandas/io/common.py

+            //<user>:<password>@<host>:<port>/<url-path>
+    auth : ( username/""/None, password/"", None) tuple
+    verify_ssl: If False, SSL certificate verification is disabled.
+


See my comment here to patch the formatting for your parameters listed above.

gfyoung · 2017-07-13T20:08:48Z

pandas/io/common.py

+
+    Returns
+    -------
+    Request, kwargs to pass to urlopen. kwargs may be {} or {'context': obj }


See my comment here to patch the formatting for your return variable listed above.

gfyoung · 2017-07-13T20:14:17Z

pandas/io/parsers.py

-                 float_precision=None):
+                 float_precision=None,
+
+                 # Basic auth (http/https) (username, password)


Remove this comment. Your documentation of the parameters in the docstring should make this clear.

gfyoung · 2017-07-13T20:14:42Z

pandas/io/parsers.py

+                 # Basic auth (http/https) (username, password)
+                 auth=None,
+
+                 # skip verify self signed SSL certificates


See my comment above. You should also be able to remove this comment.

gfyoung · 2017-07-13T20:15:42Z

pandas/tests/io/test_common.py

+        """
+        Test extraction of username, pwd from url, if contained.
+        """
+        for url, uname, pwd, nurl in [('https://aaa:bbb@ccc.com:1010/aaa.txt',


You can take advantage of pytest parametrisation decorator. See here for how to do it.

…sting

gfyoung · 2017-07-14T01:03:11Z

doc/source/whatsnew/v0.21.0.txt

@@ -40,7 +40,8 @@ Other Enhancements
 - :func:`DataFrame.clip()` and :func:`Series.clip()` have gained an ``inplace`` argument. (:issue:`15388`)
 - :func:`crosstab` has gained a ``margins_name`` parameter to define the name of the row / column that will contain the totals when ``margins=True``. (:issue:`15972`)
 - :func:`Dataframe.select_dtypes` now accepts scalar values for include/exclude as well as list-like. (:issue:`16855`)
-
+- :func:`read_csv`, :func:`read_html`, :func:`read_json`, :func:`read_html`  now accept auth in url //<user>:<password>@<host>:<port>/<url-path>, or ``auth`` tuple (username, password) parameter 
+- :func:`read_csv`, :func:`read_html`, :func:`read_json`, :func:`read_html`  now accept ``verify_ssl`` False to disable https/ssl certificate verification (eg: self signed ssl certs in testing)  (:issue:`16716`)


I'm personally still in favor of providing something more general like what I had suggested before and have a section explaining what you can do now.

@jreback : Thoughts?

can you elaborate.

I was thinking that we should condense into one line:

- It is possible to read data (i.e. CSV, JSON, HTML) from a URL that is password-protected (:issue:`16716`)

In addition, we should add a section about what you can do now with password-authenticated URL's.

gfyoung · 2017-07-14T01:05:32Z

@skynss : In addition to failing your own test, you seemed to have broken several others because the file paths on some of the tests didn't work anymore. Could you run tests locally on your machine to confirm and investigate this?

gfyoung · 2017-07-14T02:17:07Z

@skynss : whoa, what happened to test_common.py ? Why did you delete everything there?

skynss · 2017-07-14T02:24:21Z

@gfyoung Sorry I had made a silly mistake. I ran the tests that I changed and they seem to work. I cannot run all tests because of my current environment (ssh does not allow for Displayable tests to run)
The issue was I had wrongfully overwritten python/tests/test_common.py with python/tests/io/test_common.py originally and submitted it - which obviously caused tests to fail. I think I have corrected the mistake. I am a bit new at this process.

gfyoung · 2017-07-14T02:29:49Z

@skynss : No worries! No one's going to get mad if you make mistakes like this, especially since you are very new to the process 😄

I would focus right now on trying to address all of our comments as well as you can and ensuring that tests do not fail. Make as many commits as you need to get that done.

gfyoung · 2017-07-19T15:02:22Z

if we used requests as an optional dep to handle url like things would pandas code be much simpler?

@jreback : Yes, absolutely. We wouldn't need to setup any authentication, as requests would handle that for us. You can look at the code examples from the issue to see the difference.

jreback · 2017-07-19T15:58:41Z

i would make this an optional dep then

gfyoung · 2017-07-19T16:01:05Z

i would make this an optional dep then

Makes sense. Probably should first clean-up the PR as is before adding this in.

skynss · 2017-07-19T21:50:41Z

@jreback @gfyoung wrt requests as optional dependency, what do we expect as end user experience?

Here are few options to and their advantages disadvantages:
Option 1: Add requests.Session() as a optional param to read_csv( url, req_session=my_session) or read_csv( url, use_requests_lib=True)- in which case, the user is knowingly opting to bypass urllib from pandas and use requests instead. If read_csv( url, username='darth', password='mtfbwy', verify_ssl=False, req_session=my_session), only if my_session is valid or use_requests_lib=True is passed, is requests library used, else urllib is used even if requests package is already installed.
Advantage: user has greater control. Passing requests session might enable power user functionality such as accessing AWS API gateway with token, and other scenarios.
Disadvantage: Existing code stays.. Add new code to optionally handle requests

Option 2: Auto-detect and use requests package if installed, and use urllib if not installed. So end user would possibly call read_csv( url, username='darth', password='mtfbwy', verify_ssl=False) and has no control over whether urllib is used or requests is used.
Advantage: read_csv api is cleaner as it doesnt have additional parameter of use_requests_lib.
Disadvantage: User may notice inconsistent behavior in different environments without warning - and might be very hard to detect the cause is the presence or absence of requests package. Example: Imagine AWS Lambda does not have requests package installed in their python runtime.. and I upload a Lambda python function with locally packaged pandas v.0.20 in single zip. AWS decides that requests is common ask and includes it in their runtime. The behavior of my lambda function might change without me having any clue of why.

Option 3: Forget adding username, password, verify_ssl to read_csv and just add req_session parameter to read_csv instead. Expect end user to know how to build a requests session with username/pwd/verify set and pass that object to read_csv.
Advantage: Cleaner and powerful read_csv.
Disadvantage: Dev now has to know about auth/ssl using requests. Possibly, if end user is going to do this, it might just be easier to just utilize requests + StringIO and bypass the http functionality in read_csv

Option 4: Stop using urllib and switch with strong dependency to requests.
Advantages: Simpler code
Disadvantages: Potential regressions. Bulkier packages going up AWS Lambda. read_csv will still need to be able to accept a req_session object because there is no way to add all potential scenarios as parameters to read_csv

Option 5: Forget requests, allow user to just pass in urllib.request.Request object and possibly other headers etc directly to read_csv.
Advantage: No regressions. No external dependency. Great for power user eg: change useragent string, etc. Use for other scenarios such as AWS API gateway authentication. Easy to code it in existing codebase.
Disadvantage: For simple use cases, a novice user has to learn urllib.

I am not happy with Option 2 as it will produce inconsistent behavior eg Issue #17019 and has greater chance of causing regression.
While Option 3 is better than released version, I might as well consider getting data separately and passing in StringIO object.

I dont recommend Option4 or Option 5 due to complexity esp in Py2/Py3 scenario.

I think Option 1 is best but it be checked in 2 phases:
Phase 1: without requests i.e this issue.
Phase 2: allow requests session as parameter. Have you folks observed lot of issues that requests might resolve? If yes, Phase 2 should be sooner rather than later. I think making the request call is probably easy to implement, but handling and testing various scenarios of response (mimetypes, gzip, multipart, redirections, chunked, various verbs, encoding) will probably need time and expertise. I'm bit too tight on bandwidth to dig deeper into this.

gfyoung · 2017-07-19T22:08:52Z

@skynss : The signature for read_csv is bloated enough as it is. Username and password should be the only way to authenticate and is indeed the easiest. Thus, I would be against option 1.

Auto-detecting would be the best way to incorporate requests behavior. I agree that inconsistencies might be a concern, but that's bound to happen no matter how much you try to standardize. That's why we're making it an optional dependency. We would also need to document that potential issue so that users are aware of this.

I agree with you on option 3: username and password are the simplest hands down.
I agree with you on option 4: we are not relying on requests beyond being an optional dependency.

Option 5 is our backup. We don't need to add requests support at the moment. You could even do that as a follow-up.

…tml.py. Common logic update. whatsnew fixed

…r 'test_html.py' with error AssertionError: Did not see expected warning of class 'InsecureRequestWarning'

jreback · 2017-07-20T10:25:06Z

the way to do this is:

if no auth is provided use current behavior
if auth is provide, import requests, if its not available, raise a nice error message that auth requires requests
use request to service the request.

the idea is that we can push as much code as possible to use requests, rather that re-invent the wheel

skynss · 2017-07-20T20:26:08Z

@jreback Why push only auth scenarios to requests if installed? Instead, if requests is installed, use it by default for all http(s) cases. If not installed, use existing codepath - as-is?

Also, instead of passing in username, pwd, verify_ssl, it is probably easier to instead just pass in optional only requests.session object which accepts not only auth but also enables a ton of other scenarios such as adding proxies, client certs, useragent changes, and custom auth scenarios such as 'Calling AWS api-gateway with cognito (just googled a sample scenario)'

Also at a quick glance, verb other than GET such as POST, etc might be automatically supported using prepared requests

Right now, I don't see how pandas returns status code, and response headers. By utilizing requests event hooks functionality built into requests, this can be handled.

import pandas as pd
from requests import Session

# req_session is optional and replaces username, password, 
# by default, no requests.session needed. Backwards compatible api.
df = pd.read_csv('https://uname:pwd@aa.com/bb.csv') # will work automatically because 
# requests will handles it if installed, else handle with existing codebase

# custom auth
s = Session()
s.auth = MyAuthProvider('secret-key') # custom auth provider supported by requests
df = pd.read_csv(url, req_session=s)

# optional advanced scenarios
s = Session()
s.auth = ('darth', 'l0rd')  # if user wants to perform basic auth Skip if url itself contains username and pwd
s.timeout = (3.05, 27)                           # if user wants to modify timeout
s.verify = False                                      # if user wants to disable ssl cert verification
s.headers.update( {'User-Agent': 'Custom user agent'} )  # extensible to set any custom header needed
s.proxies = { 'http': 'http://a.com:100'}  # if user has proxies 
s.cert = '/path/client.cert'                     # if custom cert is needed
df = pd.read_csv( 'https://aa.com/bbb.csv', req_session=s)

# support verbs other than 'GET' such as 'POST'
r = Request('POST', 'http://joker:pwd@nlp_service.api/email_sentiment_extract?out=json')
prepped = req.prepare()
prepped.body = 'from: aaa@aa.bb\nto: cc@dd.ee\nsubject:Complaint letter\n\nbody: I am feeling :(' # multiple lines
df = pd.read_json( prepped) # minor update pandas code to detect type(Request) and submit it using requests session in lieu of URL.
"""
[{
  'from': 'aaa@aa.bb',
  'to': 'cc@dd.ee',
  'email_type': 'complaint',
  'sentiment': 'unhappy',
}]
"""

# Event hooks callback (eg log http status codes)
def print_http_status(r, *args, **kwargs):
    print(r.status_code)
    print(r.headers['Content-Length'])
s = Session()
s.hooks = dict(response=print_http_status)
df = pd.read_csv( 'https://aa.com/bbb.csv', req_session=s)

It does mean that some of the work I did becomes unnecessary, but if you are ok to use requests, I think the most extensible and simplest scenario is to allow passing in optional request.session parameter.

For the above scenarios, following changes would need to be made to code checked into master:

read_* accept new optional parameter req_session, and detect if 'url' is passed or 'prepared' is.
if requests installed, use url/prepared/req_session as specified. Call session.send()
if requests not installed, don't change current behavior of read_*
Testing: Don't have to test stuff that requests already tests.

I am far from being an expert at requests library and encourage others to review and correct me.

EDIT: I think I might already have this mostly working (except for read_html)

skynss · 2017-07-21T03:03:54Z

@jreback @gfyoung : Take a look at a new branch use-requests . Its a completely different codebase which uses requests

If installed uses python-requests by default.. else use existing codebase of urlopen
only one new optional param of req_session added in lieu of username/pwd, etc
fully working
all pytest unit tests passed in pandas/tests/io - only thing remaining is to figure out how to add test case to emulate python-requests is not installed?
whatsnew updated
codebase rebased with pandas-dev/pandas master as of few mins ago.

It is probably ready to be merged in. please review. let me know if you want me to start a different pull request.

gfyoung · 2017-07-21T03:59:08Z

the idea is that we can push as much code as possible to use requests, rather that re-invent the wheel

@jreback : I wouldn't say we're re-inventing the wheel. It's rather than requests came up with a nice way to provide an even higher-level interface to Python urllib libraries. That being said, I'm perfectly fine with using requests so long as we're okay with making it an optional dependency that is required when authentication with data IO is needed.

gfyoung · 2017-07-21T04:11:44Z

Why push only auth scenarios to requests if installed? Instead, if requests is installed, use it by default for all http(s) cases. If not installed, use existing codepath - as-is?

@skynss You mentioned before that you were concerned about inconsistencies between requests and urllib. If we were to use requests (at least for now) solely for authentication, we would not have to worry about this issue. Not to mention, it will make things easier to test for the time being.

gfyoung · 2017-07-21T04:15:35Z

Also, instead of passing in username, pwd, verify_ssl, it is probably easier to instead just pass in optional only requests.session object which accepts not only auth but also enables a ton of other scenarios

@skynss : That's a bit of reversal since your comment earlier here. That being said, I'm still wary about allowing a user to pass this in.

Since it seems we are moving to incorporate requests into pandas, IMO we should keep the interface relatively agnostic to the user i.e., instead of accepting a session object, we accept an auth parameter, which is a dictionary mapping to different types of authentication that we could accept such as username / password with verify_ssl or a session object.

This interface allows us to have the flexibility of implementing whatever handling we would like in terms of authentication in the future WITHOUT impacting how the user has to interact with it. Internally, we would check if certain fields existed and utilize them for authentication (e.g. if 'session' in auth, then use the session to authenticate).

jreback · 2017-07-21T10:35:15Z

so the current behavior must work w/o requests installed at all.

I would push to have as minimal code added as possible, IOW use requests for the auth and just use the current code for existing things; that said if you want to make a mock requests internally then this might be simpler.

skynss · 2017-07-21T17:59:22Z

@gfyoung

IMO we should keep the interface relatively agnostic to the user i.e., instead of accepting a session object, we accept an auth parameter, which is a dictionary mapping to different types of authentication that we could accept such as username / password with verify_ssl or a session object.

Can you clarify?

# for basic auth
df = pd.read_csv(url, auth={ 'uname':'aa', 'pwd':'bb'})

# to bypass ssl cert verification (which is not authentication) it would be confusing to pass it into auth param
df = pd.read_csv(url, auth={ 'verify_ssl':False})

Perhaps what you meant (and what I think is good idea) was you prefer end user to not make any imports of requests but allow to pass in dict or requests.Session such as following:

# simple user never has to import requests
up = {  'auth' : ('user','pwd'), 'verify_ssl' : False } 
df = pd.read_csv( url, url_params=up)

# or power user can choose to use request.Session directly
import requests
up = requests.Session()
up.auth = ('uname','pwd')
up.header.modify( {'User-Agent' : 'My custom user agent'})
df = pd.read_csv( url, url_params=up)

# internally we check 
if type(url_params) is requests.Session:
    sess = url_params
elif type(url_params) is dict:
    sess = requests.Session()
    # then we add in auth/verify_ssl in here

@jreback

so the current behavior must work w/o requests installed at all

Agreed. Both basic-auth-https-self-signed as well as use-requests already satisfy them.

use requests for the auth and just use the current code for existing things

If we just want to use requests for auth (and I assume you meant ssl cert verification bypass too) then why are we even having a dependency on requests? First option of basic-auth-https-self-signed already does that, and its tested.

use-requests has the least amount of code change ( compared to basic-auth-https-self-signed ). It uses requests for all scenarios, only if it is installed. This functionality is passing all pytest . Also, if requests is not installed, existing functionality continues to work as is.

I got engaged as I opened an issue for basic auth + self signed certs.
Tomorrow someone might open an issue that pandas doesnt support today eg: proxies. Yet another issue might be setting custom headers, etc.
Here is a bug in released version of pandas. This does not involve https, or auth of any kind. This is supposed to work

df = pd.read_csv('http://handsome-equator.000webhostapp.com/no_auth/aaa.csv') # works fine
df = pd.read_csv('http://handsome-equator.000webhostapp.com:80/no_auth/aaa.csv') # fails with 404 if port number is added. it shouldnt. It doesnt in browser, or requests

It fails because urllib behavior. It works in use-requests if requests is installed and it is used even for existing scenarios when installed.

So, what is the direction for http(s) in pandas? Natively use urllib or move to requests? If the goal is to try to use requests the most we can, then what is the test acceptance criteria for trying to use requests most when available? use-requests passes all tests. Besides api design, what else remains?

Wish there was a bit more synchronous way to communicate, to hammer out the issues. faster

gfyoung · 2017-07-21T19:38:24Z

@skynss : You understood my point correctly! True that verify_ssl is technically not an authentication parameter, so I agree that url_params might be better then.

skynss · 2017-07-24T05:54:29Z

You understood my point correctly! True that verify_ssl is technically not an authentication parameter, so I agree that url_params might be better then.

@gfyoung Take a look at latest version of use-requests, this change is implemented. Added new tests, all pytests are passing. Should I create a pull request for that branch of fork instead of this one?

gfyoung · 2017-07-24T05:57:16Z

Should I create a pull request for that branch of fork instead of this one?

I would wait to see what @jreback has to say before you go through the effort of preparing a PR. I would try to clean up this PR to best you can for now.

jreback · 2017-07-26T10:52:50Z

@skynss you branch above looks much better than the current soln. I have a number of comments, but will comment directly on the PR.

skynss · 2017-07-26T17:31:04Z

I would wait to see what @jreback has to say before you go through the effort of preparing a PR

@skynss you branch above looks much better than the current soln. I have a number of comments, but will comment directly on the PR.

@jreback So I should go ahead and submit PR for use-requests branch? I am asking to be sure you meant use-requests branch as "your branch above"

gfyoung · 2017-07-26T17:49:04Z

So I should go ahead and submit PR for use-requests branch? I am asking to be sure you meant use-requests branch as "your branch above"

Yes, absolutely!

gfyoung · 2017-07-26T17:49:26Z

Don't close this PR yet. When we merge, we'll close whichever PR we don't want to use.

jreback · 2017-07-26T18:22:34Z

I mean use-requests (which you pointed to above). actually pls close this one when you push. That's going to be a replacement for this one.

jreback · 2017-09-23T16:56:45Z

@skynss I think would be good for 0.21.0. can you rebase / update and get this passing?

jreback · 2017-11-10T20:17:48Z

pls rebase / update & move whatsnew to 0.22.0

jreback · 2017-11-25T16:24:02Z

can you rebase

jreback · 2018-01-21T18:13:45Z

closing as stale. if you want to continue working, pls ping.

@skynss this is a nice change, but we would need to integrate to the current infrastructure.

gfyoung added Bug IO Data IO issues that don't fit into a more specific label labels Jul 13, 2017

gfyoung reviewed Jul 13, 2017

View reviewed changes

jreback requested changes Jul 13, 2017

View reviewed changes

gfyoung reviewed Jul 13, 2017

View reviewed changes

gfyoung added the Enhancement label Jul 13, 2017

gfyoung reviewed Jul 13, 2017

View reviewed changes

gfyoung changed the title ~~Basic auth https self signed~~ BUG, ENH: Read Data From Password-Protected URLBasic auth https self signed Jul 13, 2017

gfyoung changed the title ~~BUG, ENH: Read Data From Password-Protected URLBasic auth https self signed~~ BUG, ENH: Read Data From Password-Protected URL's Jul 13, 2017

Sky NSS added 8 commits July 13, 2017 17:43

Added http(s) basic auth and allow self signed ssl certs

8b6e426

Change username pwd to auth (username, password)

145c7f4

simplify and unify py2 vs py3

9473316

added temporary test script for https/basic-auth/unsigned ssl cert te…

9c7524d

…sting

Updated comments. Removed test_basic_auth_self_signed

598cf7b

Added what's new

3b454dd

Updates to comments, etc + some change in username password logic

d359b2d

fix to test case - had checked in wrong one

eb03fd3

skynss force-pushed the basic-auth-https-self-signed branch from 843f135 to eb03fd3 Compare July 14, 2017 01:02

gfyoung reviewed Jul 14, 2017

View reviewed changes

skynss changed the title ~~BUG, ENH: Read Data From Password-Protected URL's~~ BUG, ENH: Read Data From Password-Protected URL's and allow self signed SSL certs Jul 14, 2017

Fixing test_common.py which was mistakenly clobbered

cbe3f49

Sky NSS added 2 commits July 19, 2017 19:56

Lot more test cases for HTTP. However, one fails when run with test_h…

209dd58

…tml.py. Common logic update. whatsnew fixed

Updated test case to work with py3. However still fails when run afte…

f520f7b

…r 'test_html.py' with error AssertionError: Did not see expected warning of class 'InsecureRequestWarning'

skynss mentioned this pull request Jul 26, 2017

Use python-requests library, if installed, to support broad http(s) scenarios including Basic Authentication #17087

Closed

4 tasks

jreback closed this Jan 21, 2018

ocefpaf mentioned this pull request Oct 10, 2019

use requests when it is installed #28874

Closed

4 tasks

		encoding : the encoding to use to decode py3 bytes, default is 'utf-8'
		compression:

BUG, ENH: Read Data From Password-Protected URL's and allow self signed SSL certs #16910

BUG, ENH: Read Data From Password-Protected URL's and allow self signed SSL certs #16910

Conversation

skynss commented Jul 13, 2017 • edited by gfyoung Loading

gfyoung Jul 13, 2017 • edited Loading

Choose a reason for hiding this comment

gfyoung Jul 13, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gfyoung Jul 13, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gfyoung Jul 13, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gfyoung Jul 13, 2017 • edited Loading

Choose a reason for hiding this comment

gfyoung Jul 13, 2017 • edited Loading

Choose a reason for hiding this comment

gfyoung Jul 13, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gfyoung Jul 13, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gfyoung commented Jul 14, 2017

gfyoung commented Jul 14, 2017

skynss commented Jul 14, 2017

gfyoung commented Jul 14, 2017

gfyoung commented Jul 19, 2017

jreback commented Jul 19, 2017

gfyoung commented Jul 19, 2017

skynss commented Jul 19, 2017

gfyoung commented Jul 19, 2017

jreback commented Jul 20, 2017

skynss commented Jul 20, 2017 • edited Loading

skynss commented Jul 21, 2017

gfyoung commented Jul 21, 2017

gfyoung commented Jul 21, 2017

gfyoung commented Jul 21, 2017 • edited Loading

jreback commented Jul 21, 2017

skynss commented Jul 21, 2017 • edited Loading

gfyoung commented Jul 21, 2017

skynss commented Jul 24, 2017

gfyoung commented Jul 24, 2017

jreback commented Jul 26, 2017

skynss commented Jul 26, 2017

gfyoung commented Jul 26, 2017

gfyoung commented Jul 26, 2017

jreback commented Jul 26, 2017

jreback commented Sep 23, 2017

jreback commented Nov 10, 2017

jreback commented Nov 25, 2017

jreback commented Jan 21, 2018

skynss commented Jul 13, 2017 •

edited by gfyoung

Loading

gfyoung Jul 13, 2017 •

edited

Loading

gfyoung Jul 13, 2017 •

edited

Loading

gfyoung Jul 13, 2017 •

edited

Loading

gfyoung Jul 13, 2017 •

edited

Loading

gfyoung Jul 13, 2017 •

edited

Loading

gfyoung Jul 13, 2017 •

edited

Loading

gfyoung Jul 13, 2017 •

edited

Loading

gfyoung Jul 13, 2017 •

edited

Loading

skynss commented Jul 20, 2017 •

edited

Loading

gfyoung commented Jul 21, 2017 •

edited

Loading

skynss commented Jul 21, 2017 •

edited

Loading