Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strict_optional=False makes checking scripts using xarray to fail #9437

Closed
dalonsoa opened this issue Sep 11, 2020 · 12 comments
Closed

strict_optional=False makes checking scripts using xarray to fail #9437

dalonsoa opened this issue Sep 11, 2020 · 12 comments
Labels

Comments

@dalonsoa
Copy link

🐛 Bug Report

When using mypy with strict_optional = False in the config file or using the --no-strict-optional flag in the terminal to check a file using the xarray package, it fails with an scary INTERNAL ERROR message.

This might be a bug better suited for the xarray team, but thought to check with you first.

To Reproduce

Write a python script just importing xarray:

import xarray as xr

And check it with mypy using mypy --no-strict-optional script.py

Expected Behavior

Everything works fine and I get: Success: no issues found in 1 source file . This is indeed the behavior if the --no-strict-optional flag is NOT used.

Actual Behavior

I get the following output (personal info replaced by XXXXX):

/XXXXX/venv/lib/python3.7/site-packages/xarray/core/dataarray.py:2001: error: INTERNAL ERROR -- Please try using mypy master on Github:
https://mypy.rtfd.io/en/latest/common_issues.html#using-a-development-mypy-build
If this issue continues with mypy master, please report a bug at https://github.com/python/mypy/issues
version: 0.782
/XXXXX/venv/lib/python3.7/site-packages/xarray/core/dataarray.py:2001: : note: please use --show-traceback to print a traceback when reporting a bug

I've tried also the development version of mypy, as sugested, with identical results, and also used the --show-traceback option, with results in:

/XXXXX/venv/lib/python3.7/site-packages/xarray/core/dataarray.py:2001: error: INTERNAL ERROR -- Please try using mypy master on Github:
https://mypy.rtfd.io/en/latest/common_issues.html#using-a-development-mypy-build
Please report a bug at https://github.com/python/mypy/issues
version: 0.782
Traceback (most recent call last):
  File "mypy/checker.py", line 401, in accept
  File "mypy/nodes.py", line 1129, in accept
  File "mypy/checker.py", line 3437, in visit_for_stmt
  File "mypy/checker.py", line 3470, in analyze_iterable_item_type
  File "mypy/checkexpr.py", line 2345, in check_method_call_by_name
  File "mypy/checkmember.py", line 126, in analyze_member_access
  File "mypy/checkmember.py", line 143, in _analyze_member_access
  File "mypy/checkmember.py", line 225, in analyze_instance_member_access
  File "mypy/checkmember.py", line 376, in analyze_member_var_access
  File "mypy/checkmember.py", line 578, in analyze_var
  File "mypy/meet.py", line 50, in meet_types
  File "mypy/types.py", line 794, in accept
  File "mypy/meet.py", line 497, in visit_instance
  File "mypy/meet.py", line 647, in meet
  File "mypy/meet.py", line 50, in meet_types
  File "mypy/types.py", line 1347, in accept
  File "mypy/meet.py", line 584, in visit_tuple_type
  File "mypy/typeops.py", line 44, in tuple_fallback
  File "mypy/join.py", line 531, in join_type_list
  File "mypy/join.py", line 105, in join_types
  File "mypy/types.py", line 794, in accept
  File "mypy/join.py", line 158, in visit_instance
  File "mypy/join.py", line 371, in join_instances
  File "mypy/subtypes.py", line 143, in is_subtype_ignoring_tvars
  File "mypy/subtypes.py", line 93, in is_subtype
  File "mypy/subtypes.py", line 135, in _is_subtype
  File "mypy/types.py", line 794, in accept
  File "mypy/subtypes.py", line 259, in visit_instance
  File "mypy/subtypes.py", line 541, in is_protocol_implementation
  File "mypy/subtypes.py", line 93, in is_subtype
  File "mypy/subtypes.py", line 135, in _is_subtype
  File "mypy/types.py", line 1786, in accept
  File "mypy/subtypes.py", line 459, in visit_partial_type
RuntimeError: 
/XXXXX/venv/lib/python3.7/site-packages/xarray/core/dataarray.py:2001: : note: use --pdb to drop into pdb

Your Environment

  • Mypy version used: 0.782
  • xarray version: 0.16.0
  • Mypy command-line flags: --no-strict-optional
  • Mypy configuration options from mypy.ini (and other config files): strict_optional = False has the same effect.
  • Python version used: 3.7.4
  • Operating system and version: Mac OS Catalina 10.15.6
@dalonsoa dalonsoa added the bug mypy got something wrong label Sep 11, 2020
@gvanrossum
Copy link
Member

Thanks for the report. I have reproduced this using Python 3.8, in 0.782 as well as on master.

One needs to pip install xarray to trigger this; the xarray package has a py.typed marker file (PEP 561).

I'm pasting the traceback from the master here:

/Users/guido/v38/lib/python3.8/site-packages/xarray/core/dataarray.py:2001: error: INTERNAL ERROR -- Please try using mypy master on Github:
https://mypy.rtfd.io/en/latest/common_issues.html#using-a-development-mypy-build
Please report a bug at https://github.com/python/mypy/issues
version: 0.790+dev.f743b0af0f62ce4cf8612829e50310eb0a019724
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/guido/mypy/mypy/__main__.py", line 12, in <module>
    main(None, sys.stdout, sys.stderr)
  File "/Users/guido/mypy/mypy/main.py", line 93, in main
    res = build.build(sources, options, None, flush_errors, fscache, stdout, stderr)
  File "/Users/guido/mypy/mypy/build.py", line 180, in build
    result = _build(
  File "/Users/guido/mypy/mypy/build.py", line 254, in _build
    graph = dispatch(sources, manager, stdout)
  File "/Users/guido/mypy/mypy/build.py", line 2630, in dispatch
    process_graph(graph, manager)
  File "/Users/guido/mypy/mypy/build.py", line 2953, in process_graph
    process_stale_scc(graph, scc, manager)
  File "/Users/guido/mypy/mypy/build.py", line 3051, in process_stale_scc
    graph[id].type_check_first_pass()
  File "/Users/guido/mypy/mypy/build.py", line 2111, in type_check_first_pass
    self.type_checker().check_first_pass()
  File "/Users/guido/mypy/mypy/checker.py", line 294, in check_first_pass
    self.accept(d)
  File "/Users/guido/mypy/mypy/checker.py", line 401, in accept
    stmt.accept(self)
  File "/Users/guido/mypy/mypy/nodes.py", line 940, in accept
    return visitor.visit_class_def(self)
  File "/Users/guido/mypy/mypy/checker.py", line 1723, in visit_class_def
    self.accept(defn.defs)
  File "/Users/guido/mypy/mypy/checker.py", line 401, in accept
    stmt.accept(self)
  File "/Users/guido/mypy/mypy/nodes.py", line 1005, in accept
    return visitor.visit_block(self)
  File "/Users/guido/mypy/mypy/checker.py", line 1972, in visit_block
    self.accept(s)
  File "/Users/guido/mypy/mypy/checker.py", line 401, in accept
    stmt.accept(self)
  File "/Users/guido/mypy/mypy/nodes.py", line 677, in accept
    return visitor.visit_func_def(self)
  File "/Users/guido/mypy/mypy/checker.py", line 726, in visit_func_def
    self._visit_func_def(defn)
  File "/Users/guido/mypy/mypy/checker.py", line 730, in _visit_func_def
    self.check_func_item(defn, name=defn.name)
  File "/Users/guido/mypy/mypy/checker.py", line 792, in check_func_item
    self.check_func_def(defn, typ, name)
  File "/Users/guido/mypy/mypy/checker.py", line 975, in check_func_def
    self.accept(item.body)
  File "/Users/guido/mypy/mypy/checker.py", line 401, in accept
    stmt.accept(self)
  File "/Users/guido/mypy/mypy/nodes.py", line 1005, in accept
    return visitor.visit_block(self)
  File "/Users/guido/mypy/mypy/checker.py", line 1972, in visit_block
    self.accept(s)
  File "/Users/guido/mypy/mypy/checker.py", line 401, in accept
    stmt.accept(self)
  File "/Users/guido/mypy/mypy/nodes.py", line 1196, in accept
    return visitor.visit_if_stmt(self)
  File "/Users/guido/mypy/mypy/checker.py", line 3255, in visit_if_stmt
    self.accept(b)
  File "/Users/guido/mypy/mypy/checker.py", line 401, in accept
    stmt.accept(self)
  File "/Users/guido/mypy/mypy/nodes.py", line 1005, in accept
    return visitor.visit_block(self)
  File "/Users/guido/mypy/mypy/checker.py", line 1972, in visit_block
    self.accept(s)
  File "/Users/guido/mypy/mypy/checker.py", line 401, in accept
    stmt.accept(self)
  File "/Users/guido/mypy/mypy/nodes.py", line 1130, in accept
    return visitor.visit_for_stmt(self)
  File "/Users/guido/mypy/mypy/checker.py", line 3478, in visit_for_stmt
    iterator_type, item_type = self.analyze_iterable_item_type(s.expr)
  File "/Users/guido/mypy/mypy/checker.py", line 3511, in analyze_iterable_item_type
    return iterator, echk.check_method_call_by_name(nextmethod, iterator, [], [], expr)[0]
  File "/Users/guido/mypy/mypy/checkexpr.py", line 2346, in check_method_call_by_name
    method_type = analyze_member_access(method, base_type, context, False, False, True,
  File "/Users/guido/mypy/mypy/checkmember.py", line 126, in analyze_member_access
    result = _analyze_member_access(name, typ, mx, override_info)
  File "/Users/guido/mypy/mypy/checkmember.py", line 143, in _analyze_member_access
    return analyze_instance_member_access(name, typ, mx, override_info)
  File "/Users/guido/mypy/mypy/checkmember.py", line 225, in analyze_instance_member_access
    return analyze_member_var_access(name, typ, info, mx)
  File "/Users/guido/mypy/mypy/checkmember.py", line 376, in analyze_member_var_access
    return analyze_var(name, v, itype, info, mx, implicit=implicit)
  File "/Users/guido/mypy/mypy/checkmember.py", line 578, in analyze_var
    dispatched_type = meet.meet_types(mx.original_type, itype)
  File "/Users/guido/mypy/mypy/meet.py", line 50, in meet_types
    return t.accept(TypeMeetVisitor(s))
  File "/Users/guido/mypy/mypy/types.py", line 833, in accept
    return visitor.visit_instance(self)
  File "/Users/guido/mypy/mypy/meet.py", line 497, in visit_instance
    args.append(self.meet(ta, sia))
  File "/Users/guido/mypy/mypy/meet.py", line 647, in meet
    return meet_types(s, t)
  File "/Users/guido/mypy/mypy/meet.py", line 50, in meet_types
    return t.accept(TypeMeetVisitor(s))
  File "/Users/guido/mypy/mypy/types.py", line 1386, in accept
    return visitor.visit_tuple_type(self)
  File "/Users/guido/mypy/mypy/meet.py", line 584, in visit_tuple_type
    return TupleType(items, tuple_fallback(t))
  File "/Users/guido/mypy/mypy/typeops.py", line 44, in tuple_fallback
    return Instance(info, [join_type_list(typ.items)])
  File "/Users/guido/mypy/mypy/join.py", line 531, in join_type_list
    joined = join_types(joined, t)
  File "/Users/guido/mypy/mypy/join.py", line 105, in join_types
    return t.accept(TypeJoinVisitor(s))
  File "/Users/guido/mypy/mypy/types.py", line 833, in accept
    return visitor.visit_instance(self)
  File "/Users/guido/mypy/mypy/join.py", line 158, in visit_instance
    nominal = join_instances(t, self.s)
  File "/Users/guido/mypy/mypy/join.py", line 371, in join_instances
    elif t.type.bases and is_subtype_ignoring_tvars(t, s):
  File "/Users/guido/mypy/mypy/subtypes.py", line 143, in is_subtype_ignoring_tvars
    return is_subtype(left, right, ignore_type_params=True)
  File "/Users/guido/mypy/mypy/subtypes.py", line 93, in is_subtype
    return _is_subtype(left, right,
  File "/Users/guido/mypy/mypy/subtypes.py", line 135, in _is_subtype
    return left.accept(SubtypeVisitor(orig_right,
  File "/Users/guido/mypy/mypy/types.py", line 833, in accept
    return visitor.visit_instance(self)
  File "/Users/guido/mypy/mypy/subtypes.py", line 264, in visit_instance
    if right.type.is_protocol and is_protocol_implementation(left, right):
  File "/Users/guido/mypy/mypy/subtypes.py", line 550, in is_protocol_implementation
    is_compat = is_subtype(subtype, supertype, ignore_pos_arg_names=ignore_names)
  File "/Users/guido/mypy/mypy/subtypes.py", line 93, in is_subtype
    return _is_subtype(left, right,
  File "/Users/guido/mypy/mypy/subtypes.py", line 135, in _is_subtype
    return left.accept(SubtypeVisitor(orig_right,
  File "/Users/guido/mypy/mypy/types.py", line 1825, in accept
    return visitor.visit_partial_type(self)
  File "/Users/guido/mypy/mypy/subtypes.py", line 468, in visit_partial_type
    raise RuntimeError
RuntimeError: 
/Users/guido/v38/lib/python3.8/site-packages/xarray/core/dataarray.py:2001: : note: use --pdb to drop into pdb

@gvanrossum
Copy link
Member

Maybe this is a new regression? Someone with som time on their hands could perhaps bisect to a specific commit that introduced this.

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Sep 27, 2020

I checked, since it seemed like a good use case for mypy_primer. It does appear to be a regression in mypy, but an old one: #6442. I didn't find this too enlightening (though you'll notice the stacktrace goes through tuple_fallback); maybe it would be better to bisect xarray in search of a smaller repro.

How to bisect (somewhat) easily:

I added the following to primer.py, where xarray_bug.py just contains import xarray as xr as above:

+    Project(
+        location="xarray_bug.py",
+        mypy_cmd="{mypy} --no-strict-optional --show-traceback xarray_bug.py",
+        deps=["xarray"],
+    ),

And then ran:

python3.7 -m primer -k xarray_bug --old v0.670 --new v0.700 --bisect-error visit_partial_type --debug

Edit: I added the -p flag to mypy_primer, so you no longer need to edit the script to bisect things like this.

The somewhat* is because bisecting over large ranges of (old) mypy revisions proved to be fraught, so I had to narrow it down a little with a couple other mypy_primer commands, hence the --new v0.700).

@gvanrossum gvanrossum added crash and removed bug mypy got something wrong labels Sep 28, 2020
@gvanrossum
Copy link
Member

gvanrossum commented Sep 28, 2020

Thanks; I agree #6442 doesn't look like a great starting point.

Sorry, this is one of my long rambling crash narratives. I used pdb a lot.

TL;DR: Root cause __hash__ = None # type: ignore (and some context I don't understand yet)


I reproduced this on the master of xarray, the crash line is here:
https://github.com/pydata/xarray/blob/ed1b865688b2437a434dbf5bd5e602ad714453db/xarray/core/dataarray.py#L2040

Here self.coords is a property defined on line 714. Its return type is DataArrayCoordinates which is defined in coordinates.py at line 262. It derives from Coordinates in the same file at line 33, subclassing Mapping[Hashable, "DataArray"]. Note that DataArray is the class in dataarray.py of which the crash line is a part, so there's an import cycle here (conditional on if TYPE_CHECKING).

The offending line is this:

for name, coord in self.coords.items():
    ...

I replaced it with this:

c = self.coords
for name in c:
    coord = c[name]

and then the crash moved to a later place in the same file:

for dim, coord in self.coords.items():

So it seems there's something funny about self.coords.items(). (I tried .keys() and .values() and they don't trigger the crash.)

After "fixing" the second case in a similar way, the crash moved to this line in def dot():

if isinstance(other, Dataset):

Here other has type "DataArray" (i.e. a forward reference to the containing class). Is that a clue? In all cases we seem to be having something whose type is "DataArray", a forward reference to the current class. Maybe there's something wrong with these forward references (perhaps due to that import cycle)?

After commenting out that block and the next (isinstance(other, DataArray)) the crash went away.


Now let's focus on mypy. In all cases the crash is here:

mypy/mypy/subtypes.py

Lines 466 to 468 in 4fb5a21

def visit_partial_type(self, left: PartialType) -> bool:
# This is indeterminate as we don't really know the complete type yet.
raise RuntimeError

That's this function in class SubtypeVisitor:

    def visit_partial_type(self, left: PartialType) -> bool:
        # This is indeterminate as we don't really know the complete type yet.
        raise RuntimeError

But that function wasn't actually changed by #6442, so maybe this is a dead end. (It is ancient, from 2015, commit fc72019, when partial types were introduced; the commit log claims the work is "still very incomplete".)

So what's going on?

We're in a subtype check, which is implemented using SubtypeVisitor. The visit_partial_type method there represents either a "shouldn't happen" case or a "not sure how to implement this" case. Which is it? The comment seems ambiguous to me, but I think it means we shouldn't be getting here, i.e. is_subtype() should only be called when all partial types have been replaced.

(Quick intro to partial types: these are used to avoid "Need type annotation" errors in cases like this:

a = {}
for i in range(3):
    a[i] = i

From the last line we infer that a has type Dict[int, int]. It works for some common cases only. It does similar things with variables initialized to None and later filled in.

Are partial types also involved with forward references? I dunno.)

If I simply change the function to return False the crash goes away, but clearly this isn't the right solution. :-) If I print the left and self.right with this change, I see 5 calls, left is always <partial None>, and right is mostly def () -> int (but for the final call it is None).

And... Aha! This leads here:
https://github.com/pydata/xarray/blob/ed1b865688b2437a434dbf5bd5e602ad714453db/xarray/core/dataarray.py#L942

    # mutable objects should not be hashable
    # https://github.com/python/mypy/issues/4266
    __hash__ = None  # type: ignore

So now at least I have a root cause.

This also leads to the workaround:

    __hash__: Optional[Callable[[], int]] = None  # type: ignore  # UPDATE: added type: ignore

(There are three places in xarray where this occurs, they all need to be patched. UPDATE: There are two other places where this idiom is used, but they don't cause a crash.)

UPDATE: Probably better workaround:

    __hash__: None = None  # type: ignore

Note that the comment leads to #4266, which seems relevant (it's about __hash__ = None failing in strict-optional mode) but doesn't really touch upon the crash.

Alas, there's a little more to it, because this code doesn't tickle the crash. Maybe it's the import cycle between dataarray.py and coordinates.py?

class C:
    __hash__ = None  # type: ignore

More maybe later, or someone can just come up with a fix based on this idea.

@gvanrossum
Copy link
Member

gvanrossum commented Sep 28, 2020

At least I now have a short repro.

(v38) ~/mypy$ head t.py u.py
==> t.py <==
from u import DataArrayCoordinates

class DataArray:
    __hash__ = None  # type: ignore  # UPDATE: '# type: ignore' is not needed

    def transpose(self, a: DataArrayCoordinates) -> None:
        for _ in a.items():  # <---------- CRASH HERE
            ...

==> u.py <==
from typing import Mapping, Hashable

from t import DataArray

class DataArrayCoordinates(Mapping[Hashable, "DataArray"]):
    ...
(v38) ~/mypy$ python -m mypy t.py --no-strict-optional
t.py:7: error: INTERNAL ERROR -- Please try using mypy master on Github:
https://mypy.rtfd.io/en/latest/common_issues.html#using-a-development-mypy-build
If this issue continues with mypy master, please report a bug at https://github.com/python/mypy/issues
version: 0.790+dev.0b4a2c9d8edb57e2b61219c96eefe010073a5c14
t.py:7: : note: please use --show-traceback to print a traceback when reporting a bug
(v38) ~/mypy$ 

@gvanrossum
Copy link
Member

Hmm... The short repro helped because the debug cycle is quicker, but I'm still not sure how to fix it.

What happens is that check_assignment() in checker.py at L2044 misfires somehow when checking __hash__ = None # type: ignore.

  • lvalue_type, index_lvalue, inferred = self.check_value(lvalue) on L2052 returns (None, None, <something>).
  • self.check_compatibility_all_supers(...) on L2072 returns False, because the error is suppressed.
  • Since both lvalue_type not index_lvalue are None, but inferred isn't, we enter the block at the end of the function (L2146).
  • On L2149 we call self.infer_variable_type(...) defined on L2843.
  • This calls self.infer_partial_type(...) on L2854; the definition is right below on L2871.
  • Since init_type is the mypy type representing None, we create a PartialType(None, name) on L2874.
  • And this partial type is set as the type of the variable (__hash__) (L2900).

But the partial type just stays there.

Here's something weird. The repro still crashes without the # type: ignore comment! (But only when using --no-strict-optional.)

And what's the connection with for _ in a.items():???

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Sep 28, 2020

This is great! I was able to repro^ in a single file, order doesn't matter. I think the a.items() connection might be tuples, which are relevant since the regression commit is about tuple fallbacks.

@hauntsaninja
Copy link
Collaborator

Okay, here's a shorter repro (but using lists, not tuples):

from typing import Hashable

class DataArray:
    __hash__ = None  # type: ignore

    def f(self, x: Hashable) -> None:
        reveal_type([self, x])

@hauntsaninja
Copy link
Collaborator

Seems like the trouble is when we try to join with something hashable, inside the class definition. That is, the following does not crash:

from typing import Hashable

class DataArray:
    __hash__ = None  # type: ignore

def f(y: DataArray, x: Hashable) -> None:
    reveal_type([y, x])

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Sep 28, 2020

Okay, it looks like we special case this already, but five lines too late:

if isinstance(subtype, NoneType) and isinstance(supertype, CallableType):

PR coming soon!

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Sep 28, 2020

Actually, we shouldn't just tweak that check, the problem is a little more general. This fails with no special flags:

from typing import Protocol

class Flapper(Protocol):
    def flap() -> int: ...

class Blooper:
    flap = None

    def bloop(self, x: Flapper) -> None:
        reveal_type([self, x])

You can also set flap = [] for a similar crash that doesn't involve None.

@gvanrossum
Copy link
Member

Good finds! And quite surprising, given where this started...

JukkaL pushed a commit that referenced this issue Oct 1, 2020
In particular, this affected hashables. Fixes #9437

Co-authored-by: hauntsaninja <>
scivision added a commit to space-physics/msise00 that referenced this issue Feb 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants