Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug pylint 4206 #921

Merged
merged 61 commits into from
Apr 6, 2021
Merged

Bug pylint 4206 #921

merged 61 commits into from
Apr 6, 2021

Conversation

hippo91
Copy link
Contributor

@hippo91 hippo91 commented Mar 14, 2021

Steps

  • For new features or bug fixes, add a ChangeLog entry describing what your PR does.
  • Write a good description on what the PR does.

Description

This PR:

  1. handles simple subscript inference dealing with __class_getitem__ method. If a class implements this method then the class is subscriptable without the need that its metaclass implements __getitem__ method. In this PR only FunctionDef nodes are taken into account. Maybe it should be enlarge to other node types such as AssignName but it should be made in another PR
  2. For python39 and above, allows the collections objects implementing __class_getitem__ to be detected as subscriptable. This is done in the collections brain because the implementation is a bit tricky and without the hack in the brain it would be necessary to improve the way __class_getitem__ is inferred (see point 1).
  3. For python versions lower than 3.9, allows the typing types that are alias of collections object to be detected as subscriptable. This is done without modifying inheritance tree and without hacking the metaclass. The way chosen here is to add __class_getitem__ method on the types that are detected as subscriptable. Original way of adding __getitem__ on the metaclass is not a good idea, because the same metaclass instance is shared between subscriptable and not subscriptable types.
  4. Modifies the way collections brain hacks are registered so that pylints acceptance tests are ok.

Points 1, 2, and 3 are necessary to fix the bug specified in pylint-dev/pylint#4206 but it is not sufficient to make the acceptance test ok. That's why point 4 exists.

I would like to highlight the fact deque, defaultdict and OrderedDict are defined in the _collections module which is a C library. In order message dealing with this module (in fact its representation in the brain) to be displayed correctly (with no crash) i have specified the file path as /tmp/unknown. It is uggly but i dont know what to do otherwise. Moreover it wont impact many users, except those executing pylint against typing module (as the pylints acceptance test does).

One last thing, for python versions lower thant 3.9, the case where both typing and collections modules are imported in the same module may lead to spurious false positive. This is because parsing the typing module while give subscript ability to some objects also defined in collection module that are not subscriptable. I don't think it matters because it is not wise, IMHO, to mix those modules. Instead the user should choose between object in collection that are not subscriptable or the same in typing that are subscriptable.

Type of Changes

| ✓ | 🐛 Bug fix |

Related Issue

First step towards: pylint-dev/pylint#4206

hippo91 added 20 commits March 13, 2021 18:25
… a class may use the __class_getitem__ method of the current class instead of looking for __getitem__ in the metaclass.
…ded and thus have no metaclass but starting from python3.9 it supports subscripting thanks to the __class_getitem__ method.
…he class scope.

The brain_typing module does not add a ABCMeta_typing class thus there is no need to test it. Moreover it doesn't add neither a __getitem__ to the metaclass
… type 're.Pattern' is not an acceptable base type
… support subscripting thanks to __class_getitem__ method. However the wat it is implemented is not straigthforward and instead of complexifying the way __class_getitem__ is handled inside the getitem method of the ClassDef class, we prefer to hack a bit.
…Before python3.9 the objects of the typing module that are alias of the same objects in the collections.abc module have subscripting possibility thanks to the _GenericAlias metaclass. To mock the subscripting capability we add __class_getitem__ method on those objects.
…e which is a pure C lib. While letting those class mocks inside collections module is fair for astroid it leds to pylint acceptance tests fail.
@hippo91
Copy link
Contributor Author

hippo91 commented Mar 14, 2021

@cdce8p if you could have a look to this PR it would be cool!

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some surface comment, because to be fair I'm not understanding a lot of what is happening here 😄 At least I learned about MutableSet 😄 ! Fixing the typing acceptance fail is nice and I don't think we can avoid making special case for python 3.9. It would be the right time to release astroid 2.5.2 after we merge this, right ?

astroid/brain/brain_collections.py Outdated Show resolved Hide resolved
tests/unittest_brain.py Show resolved Hide resolved
@cdce8p
Copy link
Member

cdce8p commented Mar 14, 2021

@hippo91 I'll definitely take a look at it, hopefully later today. Thanks for taking the time to work on it!

@cdce8p
Copy link
Member

cdce8p commented Mar 15, 2021

I had some time to take a closer look and do some tests.

The good news first, I really like the use of __class_getitem__ instead of __getitem__ and the change doesn't seem to cause any regression bugs.

However, it doesn't solve the problem completely. Since you add __class_getitem__ directly to the collections.abc class (without an intermediate), those will be subscriptable as well (if the typing module is imported) although they aren't in 3.7 and 3.8. This seems to be more a less the issue I had with #916.

# a.py
import typing
from collections.abc import Iterable

var: Iterable[int]
# python38
pylint a.py
************* Module a
a.py:1:0: W0611: Unused import typing (unused-import)

# python38, without the typing import 
pylint a.py
************* Module a
a.py:3:5: E1136: Value 'Iterable' is unsubscriptable (unsubscriptable-object)

Regarding a solution

  1. Is is possible to repurpose a Call node as ClassDef? If that's the case, we could use the _alias node as an intermediate and add __class_getitem__ to it. The issue I had was that I create a new ClassDef from nothing. That caused the acceptance test to fail.
  2. If (1) is not possible, we could address this in pylint. Since this is limited to mostly the collections classes, it might be possible to add some special case handling.

--
Edit
typing.Match and typing.Pattern also seem to require some special care.

@hippo91
Copy link
Contributor Author

hippo91 commented Mar 15, 2021

@cdce8p thanks a lot for taking time to review this.
IMHO, the following code:

import typing
from collections.abc import Iterable

is a bit weird. I don't see any reason to use a container of the collections module if the typing module is already imported.
Why not use directly the corresponding container of the typing module?

@cdce8p
Copy link
Member

cdce8p commented Mar 15, 2021

@hippo91 PEP 585 deprecated the use of the the generic aliases from the typing module. As such I suspect will become quite common after a while to use the collections.abc classes.

I've been testing the approach (1) a bit today and it looks promising. I hope to be able to upload it later today or tomorrow.

@hippo91
Copy link
Contributor Author

hippo91 commented Mar 15, 2021

@cdce8p you are right concerning PEP 585. But the deprecation should start with python3.9. Isn't it? If yes can we consider that mixing typing and collections with python3.8 and lower is bad practice?

@hippo91
Copy link
Contributor Author

hippo91 commented Mar 31, 2021

@Pierre-Sassoulas @cdce8p i think it is almost good now!
With last modifications:

  • with python3.9, the generic alias test of Add generic alias test cases pylint#4239 are all ok. There is just a weird thing. If i run:
    pytest tests/test_functional.py -k generic_alias there is a failure on collections tests. Whereas if i run:
    pytest tests/test_functional.py -k generic_alias_collections, i got no errors;
  • with python3.7, the generic alias test are ok except for Pattern and Match which should need a specific brain.

@cdce8p in pylint-dev/pylint#4239 i needed to change in base.py the collections.deque and collections.defaultdict in _collections.deque and _collections.defaultdict.

With this PR all pylint's acceptance tests are fines (at least on my machine :-) ).

Moreover it seems that the problem with mixing collections and typing modules is gone away, probably thanks to the use of inference_tip function.

For example linting:

# pylint:disable=missing-module-docstring, invalid-name
import typing
import collections.abc


A = collections.abc.ByteString[int]
C = collections.abc.Iterable[int]
D = collections.abc.List[int]
B = typing.ByteString[int]
E = typing.Iterable[int]
D = typing.List[int]
H = typing.Dict[int]
peillexg@UN00308016:

gives, with python3.7:

test.py:6:4: E1136: Value 'collections.abc.ByteString' is unsubscriptable (unsubscriptable-object)
test.py:7:4: E1136: Value 'collections.abc.Iterable' is unsubscriptable (unsubscriptable-object)
test.py:8:4: E1101: Module 'collections.abc' has no 'List' member (no-member)
test.py:9:4: E1136: Value 'typing.ByteString' is unsubscriptable (unsubscriptable-object)

and with python3.9:

test.py:8:4: E1101: Module 'collections.abc' has no 'List' member (no-member)
test.py:9:4: E1136: Value 'typing.ByteString' is unsubscriptable (unsubscriptable-object)

I think it is correct.

@cdce8p
Copy link
Member

cdce8p commented Mar 31, 2021

@hippo91 Unfortunately the latest change doesn't improve much on my part. I'll need more time to figure out what exactly the issue are, but most tests aren't passing. Maybe fetch the latest changes? As for the file you posted, for py37:

D = collections.abc.List[int]  # this should be `unsubscriptable-object`, not `no-member`

and for py39:

D = collections.abc.List[int]  # no issue, just like `collections.abc.Iterable`

@hippo91
Copy link
Contributor Author

hippo91 commented Mar 31, 2021

@hippo91 Unfortunately the latest change doesn't improve much on my part. I'll need more time to figure out what exactly the issue are, but most tests aren't passing. Maybe fetch the latest changes? As for the file you posted, for py37:

D = collections.abc.List[int]  # this should be `unsubscriptable-object`, not `no-member`

and for py39:

D = collections.abc.List[int]  # no issue, just like `collections.abc.Iterable`

@cdce8p it seems that there is not List object in collections.abc.

ipython
Python 3.9.1 (default, Mar  5 2021, 20:11:01) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import collections.abc

In [2]: collections.abc.List
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-832f6e94cf81> in <module>
----> 1 collections.abc.List

AttributeError: module 'collections.abc' has no attribute 'List'

Thus i think that no-member is correct.

@cdce8p
Copy link
Member

cdce8p commented Mar 31, 2021

True, my mistake. It's a builtin list.
I'll need to go through my test results, before I say anymore. It's just that I saw a lot of red

@cdce8p
Copy link
Member

cdce8p commented Mar 31, 2021

@hippo91 I think I've found something now. The strange part was that when I ran the tests with

pytest tests/test_functional.py -k generic_alias

I saw a whole lot of errors that miraculously disappeared when I only ran a single file.

--

Try this example (Python 3.7 / 3.8)

import typing

A = list[int]  # [unsubscriptable-object]
B = typing.List[int]

The result is

test2.py:3:4: E1136: Value 'list' is unsubscriptable (unsubscriptable-object)

If you now switch lines 3 and 4, the error will disappear.

--

I'd guess the root cause here might be caching on pylints astroids part. Wouldn't it be easier to just insert intermediate classes?

--
Edit
It should be the caching done by astroid, not pylint.

Copy link
Member

@cdce8p cdce8p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the use of inference_tip! It seems this was the missing piece.

Nevertheless, I think intermediate classes solve a lot of the remaining issues ;)

astroid/brain/brain_collections.py Outdated Show resolved Hide resolved
astroid/brain/brain_collections.py Outdated Show resolved Hide resolved
astroid/brain/brain_typing.py Outdated Show resolved Hide resolved
astroid/brain/brain_typing.py Outdated Show resolved Hide resolved
astroid/brain/brain_typing.py Show resolved Hide resolved
astroid/brain/brain_typing.py Outdated Show resolved Hide resolved
astroid/scoped_nodes.py Outdated Show resolved Hide resolved
@Pierre-Sassoulas Pierre-Sassoulas added Enhancement ✨ Improvement to a component High effort 🏋 Difficult solution or problem to solve labels Apr 1, 2021
tests/unittest_brain.py Outdated Show resolved Hide resolved
tests/unittest_brain.py Outdated Show resolved Hide resolved
@hippo91
Copy link
Contributor Author

hippo91 commented Apr 5, 2021

@cdce8p i took into account your remarks. Thanks!
For the remaining problem of simultaneous import of typing and collections module, i would prefer a hack into pylint over introducing intermediate classes in the hierarchy. IMHO it is a non worthy complication to handle just what i think is a corner case.
For example in the pylint's utils module i modified the supports_getitem method, to something along those lines:

def supports_getitem(
    value: astroid.node_classes.NodeNG, node: astroid.node_classes.NodeNG
) -> bool:
    def is_typing_imported(node: astroid.node_classes.NodeNG):
        """
        Returns True if the typing module is imported
        """
        rt = node.root()
        import_nodes = rt.nodes_of_class((astroid.Import,))
        for i_node in import_nodes:
            if 'typing' in i_node.names[0]:
                return True
        return False

    def is_name_imported_from_collections(node: astroid.node_classes.NodeNG):
        """
        Returns True if the Name node is imported from collections module
        """
        cls_name = node.value
        if isinstance(cls_name, astroid.Name):
            lookup_res = cls_name.lookup(cls_name.name)
            if lookup_res:
                for l_res in lookup_res[1]:
                    if l_res.modname.startswith("collections"):
                        return True
        return False

    def is_attribute_from_collections(node: astroid.node_classes.NodeNG):
        """
        Returns True if the Attribute node is linked to collections module
        """
        cls_name = node.value
        if isinstance(cls_name, astroid.Attribute):
            for lmod in cls_name.expr.inferred():
                if lmod.name.startswith('collections'):
                    return True
        return False

    if is_typing_imported(node) and PY39_LESS and isinstance(node, astroid.Subscript):
        if is_name_imported_from_collections(node) or is_attribute_from_collections(node):
            return False
    if isinstance(value, astroid.ClassDef):
        if _supports_protocol_method(value, CLASS_GETITEM_METHOD):
            return True
        if is_class_subscriptable_pep585_with_postponed_evaluation_enabled(value, node):
            return True
    return _supports_protocol(value, _supports_getitem_protocol)

And it seems to give good results even if it should probably be improved.
@cdce8p @AWhetter @Pierre-Sassoulas what do you think about it?

@cdce8p
Copy link
Member

cdce8p commented Apr 5, 2021

@hippo91 Unfortunately your suggestion doesn't work in pylint and even introduces regression bugs. I also don't agree with your assessment that it's only a small corner case. IMO it's much more than that. In PY37 and 38 the use of collection classes is only possible in some very limited circumstances. A user, who doesn't know about that, might attempt to just replace all typing imports which will cause difficult to find bugs. This is one of the key things I believe pylint should be able to detect.

As for the intermediate classes. The implementation is much simpler in astroid as you probably think. Take a look at cdce8p/astroid@bug_squash_merged...cdce8p:type-alias-rebased. These are all changes that would be necessary on top of your last commit to fully support the collection classes without the need to change anything in pylint. If you like, I can do this as a followup after your MR is merged.

It might seem like these intermediate classes are a hack, but honestly it's it more similar to what cpython actually does. It also inserts new classes in the hierarchy with _GenericAlias.

Lastly, it would also enable the possibility to do deprecation detection for the typing aliases (for PY39+), since we would be able to easily differentiate typing aliases from collections classes.

@hippo91
Copy link
Contributor Author

hippo91 commented Apr 6, 2021

@cdce8p you convinced me! You are right concerning the fact that your implementation is not so complex and it mimics the cpython trick with _GenericAlias. My apologies for being so long to convince!
I would appreciate if you could add this as a followup as you proposed.

@cdce8p
Copy link
Member

cdce8p commented Apr 6, 2021

@hippo91 That sound like a plan! I'm ready to get this finally done 🚀
FYI: With the last commit, I've pushed some small style changes to reduce the diff for the followup MR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement ✨ Improvement to a component High effort 🏋 Difficult solution or problem to solve
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants