Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new message called duplicate-value #5928

Merged
merged 16 commits into from
Apr 2, 2022

Conversation

ksaketou
Copy link
Contributor

Type of Changes

Type
✨ New feature

Description

This PR creates a new error message called duplicate-value which identifies duplicate values inside sets.

Its functionality is similar to the duplicate-key message which finds duplicate values in dictionaries.

Closes #5880

This commit adds some new functionality which checks for duplicate values
inside sets. For that reason, we create a new message called duplicate-value.
This commit adds new functional tests for the new message duplicate-value.
@coveralls
Copy link

coveralls commented Mar 16, 2022

Pull Request Test Coverage Report for Build 2079079540

  • 11 of 11 (100.0%) changed or added relevant lines in 1 file are covered.
  • 179 unchanged lines in 19 files lost coverage.
  • Overall coverage decreased (-0.008%) to 94.307%

Files with Coverage Reduction New Missed Lines %
pylint/checkers/design_analysis.py 1 98.9%
pylint/checkers/newstyle.py 1 96.15%
pylint/config/arguments_manager.py 1 97.62%
pylint/extensions/code_style.py 1 99.15%
pylint/extensions/comparetozero.py 1 97.06%
pylint/extensions/emptystring.py 1 96.77%
pylint/checkers/exceptions.py 3 97.7%
pylint/checkers/imports.py 4 94.39%
pylint/extensions/docparams.py 4 97.83%
pylint/extensions/typing.py 4 97.06%
Totals Coverage Status
Change from base Build 2078945924: -0.008%
Covered Lines: 15438
Relevant Lines: 16370

💛 - Coveralls

@Pierre-Sassoulas Pierre-Sassoulas added the Enhancement ✨ Improvement to a component label Mar 16, 2022
@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.14.0 milestone Mar 16, 2022
@nickdrozd
Copy link
Collaborator

Tested locally; works! Thanks @ksaketou!

@DanielNoord DanielNoord self-requested a review March 16, 2022 18:27
Copy link
Contributor

@areveny areveny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good. I think some additions would pick up many more common cases.

Also surfacing the failure on the Pylint presubmit: https://github.com/PyCQA/pylint/runs/5571708705?check_suite_focus=true

************* Module pylint.checkers.format
pylint/checkers/format.py:721:23: W0130: Duplicate value '\x0b' in set (duplicate-value)
pylint/checkers/format.py:721:23: W0130: Duplicate value '\x0c' in set (duplicate-value)

Which is this:

        unsplit_ends = {
            "\v",
            "\x0b",
            "\f",
            "\x0c",
            "\x1c",
            "\x1d",
            "\x1e",
            "\x85",
            "\u2028",
            "\u2029",
        }

https://stackoverflow.com/questions/26184100/how-does-v-differ-from-x0b-or-x0c

It seems \v == \x0b and \x0c == \f. I think we remove duplicates in our code. It can benefit clarity to explicitly write both synonym escape sequences, but that clarification is better in a comment.

pylint/checkers/base.py Outdated Show resolved Hide resolved
pylint/checkers/base.py Outdated Show resolved Hide resolved
@ksaketou
Copy link
Contributor Author

ksaketou commented Mar 20, 2022

It seems \v == \x0b and \x0c == \f. I think we remove duplicates in our code. It can benefit clarity to explicitly write both
synonym escape sequences, but that clarification is better in a comment.

So regarding that case should we modify unsplit_ends by keeping one of the two cases and referring to the other one in a comment? Which one should we keep in comments?

@DanielNoord
Copy link
Collaborator

It seems \v == \x0b and \x0c == \f. I think we remove duplicates in our code. It can benefit clarity to explicitly write both
synonym escape sequences, but that clarification is better in a comment.

So regarding that case should we modify unsplit_ends by keeping one of the two cases and referring to the other one in a comment? Which one should we keep in comments?

Let's keep the unicode one and add \v in the comments.

This commit changes the unsplit_ends set by keeping only the unicode sequences
and noting their synonyms in comments in order to avoid duplicate-value error triggering.
@DanielNoord
Copy link
Collaborator

Hi @ksaketou,

We have been working on the documentation of the messages that pylint emits. As this PR adds a new message could I ask you to rebase on main and add two examples for bad and good code like explained in #5953? That way we immediately help users understand what this new message will do 😄

@Pierre-Sassoulas
Copy link
Member

Hello @ksaketou, sorry the basic checker got split up on main in #5489 so the conflict resolution might be a little rough when rebasing on main. It's only code moving around so you can copy paste the code you added in the new files that were created.

ChangeLog Outdated Show resolved Hide resolved
@@ -66,6 +66,10 @@ New checkers

Closes #5670

* Added new message called ``duplicate-value`` which identifies duplicate values inside sets.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved to 2.14.

pylint/checkers/base.py Outdated Show resolved Hide resolved
else:
continue
if value in values:
self.add_message("duplicate-value", node=node, args=value)
Copy link
Collaborator

@DanielNoord DanielNoord Mar 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.add_message("duplicate-value", node=node, args=value)
self.add_message("duplicate-value", node=node, args=value, confidence=HIGH)

You'll probably need to import that.

@ksaketou
Copy link
Contributor Author

Hello @DanielNoord @Pierre-Sassoulas , sorry for the late reply. Unfortunately, I don't currently have enough time to expand the checker's functionality (#5928 (comment)) . However, I can work on the documentation and the other minor suggested changes above during the following days if that's okay.

@ksaketou
Copy link
Contributor Author

Quick question, should I add the documentation on a separate PR? @DanielNoord

@DanielNoord
Copy link
Collaborator

Quick question, should I add the documentation on a separate PR? @DanielNoord

No adding here is fine. It shouldn't increase the review burden significantly.

@ksaketou
Copy link
Contributor Author

ksaketou commented Mar 30, 2022

Did the refactoring affect the test execution at all? because I get errors regarding the setup.cfg.
I suppose this checker now goes into basic_checker.py?

When I add the timeout option into the markers config option the test passes, but I dont think I should modify this file.

@Pierre-Sassoulas
Copy link
Member

Did the refactoring affect the test execution at all? because I get errors regarding the setup.cfg.

No it should be the same behavior, only less code per file.

I suppose this checker now goes into basic_checker.py?

Yes, I checked the diff it was in the BasicChecker class.

When I add the timeout option into the markers config option the test passes, but I dont think I should modify this file.

I had the same issue, I just fixed it in #6054, it should work if you rebase on latest main

Copy link
Collaborator

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! Most of this looks good to go in my opinion!

ChangeLog Outdated Show resolved Hide resolved
@@ -0,0 +1,2 @@
This error message is emitted when a set contains the same value two or more times.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Information like this should probably included in the message description in its declaration.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only need to move this to pylint/checkers/base/basic_checker.py

doc/whatsnew/2.13.rst Outdated Show resolved Hide resolved
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, good job @ksaketou !

doc/data/messages/d/duplicate-value/good.py Outdated Show resolved Hide resolved
ChangeLog Outdated Show resolved Hide resolved
Co-authored-by: Daniël van Noord <13665637+DanielNoord@users.noreply.github.com>
@@ -0,0 +1,2 @@
This error message is emitted when a set contains the same value two or more times.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only need to move this to pylint/checkers/base/basic_checker.py

@Pierre-Sassoulas Pierre-Sassoulas merged commit 06bc0a6 into pylint-dev:main Apr 2, 2022
@ksaketou ksaketou deleted the duplicate-value branch April 2, 2022 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement ✨ Improvement to a component
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Check sets for duplicate entries
6 participants