-
-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: Make most arguments for read_html and read_json keyword-ony #27573
Conversation
22fd9dc
to
5318a76
Compare
We need to deprecate the old behavior first. As written, this is an api breaking change. And we shouldn’t do this until 1.0 or later. |
and how exactly would you deprecate ? |
Should be able to use a decorator to inspect the calm, right? And if any positional or keyword args are passed in args then we warn?
… On Jul 25, 2019, at 07:44, Jeff Reback ***@***.***> wrote:
We need to deprecate the old behavior first. As written, this is an api breaking change.
And we shouldn’t do this until 1.0 or later.
and how exactly would you deprecate ?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Thank you. I also wanted to propose somewhat like this.
Or do such a decorator already exist? |
Hello @alexitkes! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-03-31 18:09:11 UTC |
b00ec4a
to
99e0e3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add tests to read_html / read_json to ensure that warnings are raised there?
d6f29f4
to
cd5d658
Compare
Lifting #27573 (comment) to the top level
What do you think about this @alexitkes? |
@TomAugspurger Thank you for review. Yes, writing argument list twice will be quite annoying, but I will need some time to find a way to avoid it, and I would like to try. However, I think it would be better to preserve the |
Why do you think we'll need that? That sounds more complicated than just detecting kwargs passed positionally. If we aren't going to need it then I'd rather not implement it. |
@TomAugspurger I have just written such a decorator and it does not look too complicated. However it is my first experience with It will not be difficult to remove |
I'm not sure without looking more closely. My thought is that all keyword
arguments being keyword-only is easier to explain to users.
…On Tue, Jul 30, 2019 at 3:10 AM Alex Itkes ***@***.***> wrote:
@TomAugspurger <https://github.com/TomAugspurger> I have just written
such a decorator and it does not look too complicated. However it is my
first experience with inspect module so possibly more tests are needed
though some tests are written and they all run successfully.
It will not be difficult to remove allowed_args argument and always set
it to list of required arguments, if you think eventually all arguments
having the default values should be keyword-only arguments. Should I do it?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#27573?email_source=notifications&email_token=AAKAOIWDW7YBWLIQNA6SISTQB7ZQNA5CNFSM4IGSQB62YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD3DFE6A#issuecomment-516313720>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAKAOIQDBCCUJD5VIZIZ5VLQB7ZQNANCNFSM4IGSQB6Q>
.
|
I have almost forgotten one thing - the |
Oh sorry. In that case we'll definitely need to have the ability to have some keyword-or-positional arguments. |
Well, it seems to work fine though I am a bit worried because it's my first experience with |
""" | ||
df = pd.DataFrame({"A": [2, 4, 6], "B": [3, 6, 9]}, index=[0, 1, 2]) | ||
with tm.assert_produces_warning(FutureWarning): | ||
assert_frame_equal(df, read_json(df.to_json(orient="split"), "split")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you minimize the amount of code in the context manager? I think just read_json(buf, "split")
.
Also, do we need to test all three of these? What differs between them?
pandas/util/_decorators.py
Outdated
def wrapper(*args, **kwargs): | ||
if isinstance(allow_args, int) and len(args) > allow_args: | ||
msg = ( | ||
"After version %s all arguments of %s " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
version -> pandas version.
Use {}
-style string templates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done it. I also added a separate function to form a better looking warning message depending on the number of arguments, but so far I only tested it manually. Is there a proper way to test how exactly does the produced warning look?
2982165
to
f737870
Compare
@WillAyd @jreback @jorisvandenbossche do you have thoughts on this (the overall goal, the implementation seems fine).
|
cd84833
to
c8284ef
Compare
pandas/tests/io/test_html.py
Outdated
|
||
def test_index(self): | ||
df1 = self.read_html(self.spam_data, ".*Water.*", index_col=0) | ||
df2 = self.read_html(self.spam_data, "Unit", index_col=0) | ||
with tm.assert_produces_warning(FutureWarning): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment for non-duplicated tests; would rather not use the deprecated behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, such amount of duplicated tests looks quite ugly, so I removed all tests with deprecated behavior except for one. Would it be enough?
46e9ceb
to
70359f6
Compare
Co-Authored-By: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-Authored-By: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-Authored-By: Joris Van den Bossche <jorisvandenbossche@gmail.com>
01ec83a
to
ed63145
Compare
@WillAyd OK, done it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be OK with moving forward here - @jreback or @TomAugspurger ?
Do we have a general policy for which arguments are keyword only? |
lgtm. on anything with lots of args (pretty much read_*), happy to only have the first arg positional (path_or_buf). |
looks good, @TomAugspurger ok here? |
Yeah
… On Apr 6, 2020, at 18:25, Jeff Reback ***@***.***> wrote:
looks good, @TomAugspurger ok here?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
thanks! |
Thanks @alexitkes ! |
As mentioned in #27544 it is better to use keyword-only arguments for functions with too many arguments. A deprecation warning will be displayed if
read_html
get more than 2 positional arguments (io
andmatch
) orread_json
get more than 1 positional argument (path_or_buf
).Three groups of tests are actually needed
deprecate_nonkeyword_arguments
decoratorrread_json
function emits aFutureWarning
whenever necessary and does not emit it whenever not.read_html
function emits aFutureWarning
whenever necessary and does not emit it whenever not.black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff