-
-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add headers paramater to read_json and read_csv #36754
Conversation
Hello @Antetokounpo! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-10-03 17:20:18 UTC |
pandas/io/common.py
Outdated
return urllib.request.urlopen(*args, **kwargs) | ||
# Request class is only available in Python3, which | ||
# allows headers to be specified | ||
if hasattr(urllib.request, "Request"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas no longer supports Python2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove the check then.
pandas/io/common.py
Outdated
@@ -176,6 +183,7 @@ def get_filepath_or_buffer( | |||
compression: CompressionOptions = None, | |||
mode: ModeVar = None, # type: ignore[assignment] | |||
storage_options: StorageOptions = None, | |||
headers: dict = {}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we be more specific using Dict? What types are the keys and values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's HTTP Headers, so it should be str
for keys, and the type for value can vary I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add tests for the change? That is typically the first thing we look for as reviewers
Also not sure about adding this to the API - what values would a user typically provide here? Documenting that in the docstrings would help for sure
@@ -148,9 +148,9 @@ def urlopen(*args, **kwargs): | |||
Lazy-import wrapper for stdlib urlopen, as that imports a big chunk of | |||
the stdlib. | |||
""" | |||
import urllib.request | |||
from urllib.request import Request, urlopen as _urlopen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it makes the code more concise, but if you think the old way is better I can change it back.
c64d378
to
be529b1
Compare
I'm not sure we would add new arguments to the function signatures like this. There has been some updates to the original issue and I think would prefer to just document how this can be handled instead of changing these functions - can you adjust the PR accordingly? |
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this. |
Closing in favor of #37966 |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
This adds the option to specify headers when reading a csv or a json file from an URL in Python3.
Let me know if new tests are needed.