CI: Unify code_checks whitespace checking #30755

ShaharNaveh · 2020-01-06T21:20:51Z

Unify the test cases of #30467, #30708, and #30737.

alimcmaster1 · 2020-01-06T21:42:46Z

For #30737: is it possible to do this via regex, similar to the current code check for the context manager?
https://github.com/pandas-dev/pandas/blob/master/ci/code_checks.sh#L157

ShaharNaveh · 2020-01-06T22:00:40Z

> For #30737: is it possible to do this via regex, similar to the current code check for the context manager?
> https://github.com/pandas-dev/pandas/blob/master/ci/code_checks.sh#L157

Yes, for simple cases like

with pytest.raises(ValueError)

or (For false positives)

with pytest.raises(ValueError, match="foo")

But I want a "global" test case.

So even if there's an edge case, for example:

with pytest.raises(
    ValueError,
    # Some comment
)

or (For the case of false positives)

with pytest.raises(
    ValueError,
    # Some comment
    match="foo"
)
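
To make the limitation concrete, here is a minimal sketch of a single-line regex check. The pattern `BARE_RAISES` is hypothetical, written only for illustration; it is not the pattern used in ci/code_checks.sh. It catches the simple case and skips the call that has `match=`, but it misses the multi-line bare call entirely.

```python
import re

# Hypothetical pattern (an assumption, not taken from ci/code_checks.sh):
# a pytest.raises(...) call that opens and closes on one line without match=.
BARE_RAISES = re.compile(r"pytest\.raises\((?![^)\n]*match=)[^)\n]*\)")

simple_bad = "with pytest.raises(ValueError):"
simple_ok = 'with pytest.raises(ValueError, match="foo"):'
multi_line_bad = "with pytest.raises(\n    ValueError,\n    # Some comment\n):"

print(bool(BARE_RAISES.search(simple_bad)))      # True: flagged
print(bool(BARE_RAISES.search(simple_ok)))       # False: has match=
print(bool(BARE_RAISES.search(multi_line_bad)))  # False: bare, but missed
```

Extending the pattern across newlines is possible, but comments and nested parentheses inside the call make it grow quickly.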

Regex is very fast, and for a reason: it matches raw text and does not parse the code. But it's hard for me to predict future edge cases and write regexes that handle them.

So I wrote a general test case that works on the parsed source instead.

When I detect an occurrence of pytest.raises, I scan all the tokens up to the end of the logical line, and if I find match as an argument (token.NAME), I consider the logical line good.

My point is that regex is nice for "simple things", but with so many edge cases I'd like to stick to checking the parsed source, because it's hard to tell with regex what a logical line is.
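
The token scan described above could look roughly like the sketch below. It is an illustration built on the standard `tokenize` module, not the code from this PR; the function name `bare_pytest_raises` and the sample source are assumptions made for the example.

```python
import io
import token
import tokenize


def bare_pytest_raises(source):
    """Yield the line number of each pytest.raises(...) call without match."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    for i, tok in enumerate(tokens):
        # Look for the token sequence: NAME "pytest", OP ".", NAME "raises".
        if not (tok.type == token.NAME and tok.string == "pytest"):
            continue
        if [t.string for t in tokens[i + 1 : i + 3]] != [".", "raises"]:
            continue
        # Scan forward to the end of the logical line (the NEWLINE token).
        # Line breaks inside parentheses are NL tokens and do not end it.
        has_match = False
        for later in tokens[i + 3 :]:
            if later.type == token.NEWLINE:
                break
            if later.type == token.NAME and later.string == "match":
                has_match = True
                break
        if not has_match:
            yield tok.start[0]


# Hypothetical input: the two multi-line edge cases from this comment.
example = (
    "with pytest.raises(\n"
    "    ValueError,\n"
    "    # Some comment\n"
    "):\n"
    "    pass\n"
    "with pytest.raises(\n"
    "    ValueError,\n"
    "    # Some comment\n"
    '    match="foo"\n'
    "):\n"
    "    pass\n"
)
print(list(bare_pytest_raises(example)))  # [1]
```

Because comments arrive as COMMENT tokens, a comment that mentions "match" does not count as the argument. The sketch still treats any NAME token spelled `match` on the logical line as sufficient, so it is deliberately permissive.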

P.S.
If you're not convinced yet, @alimcmaster1, I have some more edge cases that I couldn't find a matching regex for. If you manage to write one, I'd be more than happy to see the solution :)

scripts/validate_unwanted_patterns.py

WillAyd

Any chance you can make the edits in place first without changing files? Typically easier to review that way

If you are both moving and modifying things, it's easier to do those separately.

datapythonista · 2020-01-07T10:41:56Z

Agree with Will. Do you mind leaving the name unchanged in this PR, even if it's not accurate, so we can review only the new changes? Then it's easy to rename in a follow-up and get that merged quickly.

Just one early comment, I'm -1 on creating abbreviations like STD. I think it's worth having proper names that can be directly understood, even if they are longer.

ShaharNaveh · 2020-01-07T14:43:54Z

> Any chance you can make the edits in place first without changing files?

Not sure what you mean by that, @WillAyd. Can you please be more specific?

datapythonista · 2020-01-07T14:44:54Z

Check my last comment

ShaharNaveh · 2020-01-07T14:46:01Z

> Check my last comment

So just change the function names back to the originals?

datapythonista · 2020-01-07T14:47:24Z

No, sorry. The file name, so the diff will show what you changed in that file.

datapythonista

Nice stuff. Some comments, but the general idea looks good.

scripts/validate_string_concatenation.py

ShaharNaveh · 2020-02-20T03:47:20Z

> This is... a lot. It'd be great if one of the upstream tools (flake8/black/...) did this, but I'm not sure it belongs here. i.e. if I had to maintain this myself, I would just learn to be OK with the downsides of regexps.

@jbrockmendel I can see flake8 maybe adopting the unconcatenated-strings check, but I can't see why a linter would adopt the rest of the test cases, since they are very specific pandas code-style nitpicks. Maybe a very opinionated linter such as pylint would use them.

(Do you by any chance know someone from flake8 we could ask whether they would use it? I would also love to see how they would do it; I tried writing these code checks with the ast module and couldn't figure it out :/)

As for the regex, I completely understand why this PR seems like overkill when a simple regex can do the trick for the most part. But isn't that exactly why we use tools like black and isort? There is no question about whether the code style is right: there is always one correct answer, and it is what the linter says. When we disagree with the linter's decision, as in psf/black#1051, we just create our own check (if it isn't upstreamed, we do it here).
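
As an illustration of such a home-grown check, here is a minimal sketch of how the unconcatenated-strings case might be detected with `tokenize`. It assumes the check means "two string literals left side by side on the same line"; the function name `strings_to_concatenate` is made up for this example and is not taken from the script in this PR.

```python
import io
import token
import tokenize


def strings_to_concatenate(source):
    """Yield (line, column) of a string literal that directly follows another
    string literal on the same line, e.g. "foo " "bar"."""
    previous = None
    for current in tokenize.generate_tokens(io.StringIO(source).readline):
        if (
            previous is not None
            and previous.type == token.STRING
            and current.type == token.STRING
            # Same physical line: the first literal ends where the second starts.
            and previous.end[0] == current.start[0]
        ):
            yield current.start
        previous = current


one_line = 'msg = "foo " "bar"\n'
split_lines = 'msg = (\n    "foo "\n    "bar"\n)\n'

print([line for line, _ in strings_to_concatenate(one_line)])     # [1]
print([line for line, _ in strings_to_concatenate(split_lines)])  # []
```

Literals split across lines are separated by an NL token, so the intentional multi-line form is not reported. Note that on Python 3.12 and later, f-strings are tokenized differently and would need extra handling.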

datapythonista · 2020-02-20T09:29:27Z

I agree having these checks in the pandas code base is not ideal. But I'd move forward with this PR, get the validation working better (we already have the validation, this is just improving and refactoring what we've got). And later on, we can consider moving this out of pandas, as we did with the docstrings.

@MomIsBestFriend can you merge master and see if the CI is green please?

jbrockmendel · 2020-02-20T14:56:41Z

I'm completely happy deferring to @datapythonista on this. Thanks @MomIsBestFriend for taking this on.

ShaharNaveh · 2020-02-22T10:16:29Z

Restarting azure

ShaharNaveh · 2020-03-07T16:46:30Z

ping @datapythonista

datapythonista

Thanks @MomIsBestFriend

@WillAyd I think your comments were addressed. Can you have a look and merge if you're happy?

WillAyd · 2020-03-07T21:12:44Z

I will relook in a few days

datapythonista · 2020-03-23T09:52:19Z

@MomIsBestFriend sorry we've been slow with this. Do you mind merging master once more please? I'll merge this once CI is green.

ShaharNaveh · 2020-03-23T09:54:06Z

@datapythonista I just merged master. I'll have to wait for another commit to be merged there before I can merge master again.

datapythonista · 2020-03-23T09:55:45Z

Oh, sorry, I thought the CI was stalled, never mind then. Ping me once it's green if I don't merge it before. Thanks!

ShaharNaveh · 2020-03-23T10:19:56Z

ping @datapythonista :)

datapythonista · 2020-03-23T10:31:36Z

Thanks @MomIsBestFriend, great job.

Can you follow up by renaming the file, please?

CI: Unify tests cases

d16e566

This was referenced Jan 6, 2020

CI: Test case for wrong placed space #30708

Closed

CI: Disallow bare pytest raise #30737

Closed

alimcmaster1 reviewed Jan 6, 2020

alimcmaster1 added the CI Continuous Integration label Jan 6, 2020

Changed functions name

6e6bb66

WillAyd requested changes Jan 7, 2020

MomIsBestFriend added 5 commits January 7, 2020 16:51

Renamed file name, back to the original

48f3e86

STY: inconsistent linebreaks

2e249a8

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

f611e26

Made the code check "ID" to be more verbose

4ec704c

as @datapythonista suggested in pandas-dev#30755 (comment)

Added ".py" in ci/code_check.sh to the script call

8b2f603

datapythonista reviewed Jan 7, 2020

MomIsBestFriend added 10 commits January 8, 2020 19:54

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

63dbeb7

Replaced the dictonary with 'globals()'

bb5b905

Changed from "id" to "validation-type"

2225175

Better explained (I hope) why this file exists

a2b60c5

Yielding the string directly instead of assigning it to a variable

590ca48

Fixed indentation in docs

ef00c68

Imporved docstring for spaces test case, as datapythonista suggested

9341623

Removed uppercase "MSG" from docs and replaced it with lowercase "msg"

c0f800d

Fixed comment placement

63b39dd

Applied datapythonista suggestions

2b61e53

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

32a715d

ShaharNaveh closed this Feb 22, 2020

ShaharNaveh reopened this Feb 22, 2020

ShaharNaveh requested a review from WillAyd February 27, 2020 17:01

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

460be1a

ShaharNaveh mentioned this pull request Feb 28, 2020

STY: spaces in wrong place #32323

Merged

MomIsBestFriend added 2 commits February 28, 2020 12:30

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

7fa0c5a

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

f7f4342

datapythonista approved these changes Mar 7, 2020

MomIsBestFriend added 3 commits March 13, 2020 12:16

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

d0e606f

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

c7d9d52

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

c672375

ShaharNaveh mentioned this pull request Mar 19, 2020

STY: Correct whitespace placement #32830

Merged

MomIsBestFriend and others added 3 commits March 20, 2020 02:59

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

0ba7539

Merge remote-tracking branch 'upstream/master' into CI-unify-boilerplate

81990ca

Merge branch 'master' into CI-unify-boilerplate

f2897f5

datapythonista merged commit c455606 into pandas-dev:master Mar 23, 2020

ShaharNaveh mentioned this pull request Mar 23, 2020

Renamed validate_string_concatenation.py to validate_unwanted_patterns.py #32926

Merged

ShaharNaveh deleted the CI-unify-boilerplate branch March 23, 2020 12:06
