Cannot pickle certain type hints with Dill #179

Open
robertnishihara opened this issue Jul 22, 2016 · 5 comments

Comments

@robertnishihara

> python --version
Python 2.7.11 :: Anaconda 4.0.0 (x86_64)

Also

>>> dill.__version__
'0.2.4'

Hi, I'm trying to pickle some stuff in the typing module. I'm curious if there are fundamental limitations here or if this is out of scope for Dill. Note that I filed a similar issue with CloudPickle (cloudpipe/cloudpickle#63). The behavior there is similar though not quite the same. Thanks for your help!

from typing import List, Callable
from dill import loads, dumps

This works.

>>> List
typing.List<~T>

>>> loads(dumps(List))
typing.List<~T>

With List[int], dumps fails as follows.

>>> List[int]
typing.List<~T>[int]

>>> dumps(List[int])
PicklingError                             Traceback (most recent call last)
<ipython-input-4-f02da844db1c> in <module>()
----> 1 dumps(List[int])

/Users/rkn/anaconda/lib/python2.7/site-packages/dill/dill.pyc in dumps(obj, protocol, byref, fmode, recurse)
    190     """pickle an object to a string"""
    191     file = StringIO()
--> 192     dump(obj, file, protocol, byref, fmode, recurse)#, strictio)
    193     return file.getvalue()
    194 

/Users/rkn/anaconda/lib/python2.7/site-packages/dill/dill.pyc in dump(obj, file, protocol, byref, fmode, recurse)
    184             return
    185     # end hack
--> 186     pik.dump(obj)
    187     return
    188 

/Users/rkn/anaconda/lib/python2.7/pickle.pyc in dump(self, obj)
    222         if self.proto >= 2:
    223             self.write(PROTO + chr(self.proto))
--> 224         self.save(obj)
    225         self.write(STOP)
    226 

/Users/rkn/anaconda/lib/python2.7/pickle.pyc in save(self, obj)
    284         f = self.dispatch.get(t)
    285         if f:
--> 286             f(self, obj) # Call unbound method with explicit self
    287             return
    288 

/Users/rkn/anaconda/lib/python2.7/site-packages/dill/dill.pyc in save_type(pickler, obj)
   1100        #print ("%s\n%s" % (type(obj), obj.__name__))
   1101        #print ("%s\n%s" % (obj.__bases__, obj.__dict__))
-> 1102         StockPickler.save_global(pickler, obj)
   1103     return
   1104 

/Users/rkn/anaconda/lib/python2.7/pickle.pyc in save_global(self, obj, name, pack)
    757                 raise PicklingError(
    758                     "Can't pickle %r: it's not the same object as %s.%s" %
--> 759                     (obj, module, name))
    760 
    761         if self.proto >= 2:

PicklingError: Can't pickle typing.List<~T>[int]: it's not the same object as typing.List
@matsjoyce
Contributor

It's probably because typing generates types dynamically under the hood, so pickling by reference isn't enough. It should be reasonably easy to fix, since typing is a pure-Python module with no C involved. I'll have a look when I have a moment, if no one else has by then.
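
To illustrate why pickling by reference falls short here, the following is a minimal sketch of the identity check behind the "it's not the same object as" error. It uses only the standard library; `resolves_by_reference` and `Impostor` are hypothetical names invented for this example, and the real check in `pickle.save_global` differs in detail.

```python
import collections
import importlib


def resolves_by_reference(obj):
    """Return True if looking up obj's module and name yields obj itself."""
    module_name = getattr(obj, "__module__", None)
    name = getattr(obj, "__qualname__", None) or getattr(obj, "__name__", None)
    if not module_name or not name:
        return False
    try:
        found = importlib.import_module(module_name)
        # Walk dotted names such as "Outer.Inner" one attribute at a time.
        for part in name.split("."):
            found = getattr(found, part)
    except (ImportError, AttributeError):
        return False
    return found is obj


# A class created at runtime that reuses an existing module-level name,
# loosely mimicking how a parameterized type can share the name of its base.
Impostor = type("OrderedDict", (), {"__module__": "collections"})

print(resolves_by_reference(collections.OrderedDict))
print(resolves_by_reference(Impostor))
```

An object that fails this kind of lookup has to be pickled by value, that is, with enough information to rebuild it on load.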

@mmckerns
Member

mmckerns commented Jul 24, 2016

Maybe this is another issue where dill should deal with __qualname__.

>>> from typing import List
>>> l = List[int]
>>> l
typing.List[int]
>>> l.__class__
<class 'typing.GenericMeta'>
>>> l.__qualname__
'List'

Note that if you use dill.detect.trace(True), a dump immediately fails with a T4 and the error posted by the OP above.

@mmckerns
Member

This seems to be a general pattern for rebuilding the primary typing objects from their __origin__ and __args__:

>>> import typing
>>> import dill
>>> t = typing.Tuple[typing.Callable, typing.Any]
>>> assert t == t.__origin__[t.__args__]
>>> l = typing.List[int]
>>> assert l == l.__origin__[l.__args__]
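
The `__origin__[__args__]` identity above suggests a reduce-style recipe: store the origin and the arguments, then subscript on load. Below is a minimal sketch of that idea, not dill's implementation. `Box`, `Pair`, `reduce_alias`, and `rebuild` are hypothetical names for this example. User-defined generics are used on purpose, because on newer Python versions `typing.List[int].__origin__` is the builtin `list`, so the identity may not hold there for the `typing` aliases themselves.

```python
import operator
import typing

T = typing.TypeVar("T")
K = typing.TypeVar("K")
V = typing.TypeVar("V")


class Box(typing.Generic[T]):
    """Hypothetical single-parameter generic."""


class Pair(typing.Generic[K, V]):
    """Hypothetical two-parameter generic."""


def reduce_alias(alias):
    """Return (callable, args) that rebuilds a parameterized generic."""
    # operator.getitem(origin, args) is the same as origin[args].
    return operator.getitem, (alias.__origin__, alias.__args__)


def rebuild(alias):
    """Apply the recipe from reduce_alias, as an unpickler would on load."""
    func, args = reduce_alias(alias)
    return func(*args)


for alias in (Box[int], Pair[int, str], Box[Pair[int, str]]):
    assert rebuild(alias) == alias
```

The pair returned by `reduce_alias` has the same shape as the first two items of a `__reduce__` result, which is why the pattern could fit into a pickler's dispatch.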

Note that typed functions pickle, but lose their annotations:

>>> def doit(x: int, y: typing.List[int]) -> typing.List[int]:
...   return y + [x]
... 
>>> _doit = dill.copy(doit)
>>> doit.__annotations__
{'x': <class 'int'>, 'y': typing.List[int], 'return': typing.List[int]}
>>> _doit.__annotations__
{}
>>> 
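
One way annotations can go missing is when a function is rebuilt from its code object, since annotations live on the function rather than on the code. The sketch below assumes that mechanism purely for illustration; it is not taken from dill's source. `rebuild_function` and `rebuild_with_annotations` are hypothetical helpers, and only the standard library is used.

```python
import types


def rebuild_function(func):
    """Rebuild func from its code object; annotations are not carried over."""
    return types.FunctionType(
        func.__code__,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )


def rebuild_with_annotations(func):
    """Same as rebuild_function, but copy __annotations__ explicitly."""
    new = rebuild_function(func)
    new.__annotations__ = dict(func.__annotations__)
    return new


def doit(x: int, y: list) -> list:
    return y + [x]


print(rebuild_function(doit).__annotations__)
print(rebuild_with_annotations(doit).__annotations__ == doit.__annotations__)
```

If the copy is built along these lines, restoring `__annotations__` after construction would be enough for the simple case shown here.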

Looks like NewType also pickles, except it gets the qualname wrong:

>>> UserId = typing.NewType('UserId', int)
>>> _UserId = dill.copy(UserId)
>>> _UserId
<function new_type at 0x10261fea0>
>>> UserId
<function NewType.<locals>.new_type at 0x10261fd90>
>>> _UserId.__qualname__
'new_type'
>>> UserId.__qualname__
'NewType.<locals>.new_type'

The pattern for a Callable is something like this:

>>> q = typing.Callable[[int,str,typing.Any], typing.List[typing.Any]]
>>> assert q == q.__origin__[list(q.__args__[:-1]),q.__args__[-1]]
>>> c = typing.Callable[[int], None]
>>> assert c == c.__origin__[list(c.__args__[:-1]),c.__args__[-1]]
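
The Callable pattern above splits `__args__` into the parameter types and a trailing return type. Here is a minimal sketch of that split, which also handles the `Callable[..., R]` form. `split_callable` and `rebuild_callable` are hypothetical helper names. The sketch subscripts `typing.Callable` directly instead of `__origin__`, because on newer Python versions `__origin__` is `collections.abc.Callable` and the rebuilt object may not compare equal.

```python
import typing


def split_callable(alias):
    """Split a Callable alias into (parameters, return type)."""
    *params, result = alias.__args__
    # Callable[..., R] stores Ellipsis as its only parameter entry.
    if params == [Ellipsis]:
        return Ellipsis, result
    return params, result


def rebuild_callable(alias):
    """Rebuild a Callable alias from its split arguments."""
    params, result = split_callable(alias)
    return typing.Callable[params, result]


q = typing.Callable[[int, str, typing.Any], typing.List[typing.Any]]
c = typing.Callable[[int], None]
p = typing.Callable[..., str]

for alias in (q, c, p):
    assert rebuild_callable(alias) == alias
```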

Generic types seem to follow the same pattern, with nothing new:

>>> T = typing.TypeVar('T')
>>> dill.copy(T)
~T
>>> T
~T
>>> g = typing.Generic[T]
>>> assert g == g.__origin__[g.__args__]
>>> m = typing.Mapping[int,T]
>>> assert m == m.__origin__[m.__args__]
>>> i = typing.Iterable[T]
>>> assert i == i.__origin__[i.__args__]

I didn't check all the objects in typing, but at a glance there don't seem to be any other quirky ones that would need a different pattern from the ones above.

So the cases that need some attention appear to be:

  • most typing objects, like typing.Tuple
  • typing.Callable
  • typing.NewType
  • functions with __annotations__

@mmckerns
Member

The typing module does some weird stuff... and while the pickling of objects doesn't seem too bad, the identification of the object type is not so straightforward. Here's a brain-dump of explorations thus far:

import dill
import typing

def test_typing_basic():
    o = dill.copy(typing.Tuple)
    assert o == typing.Tuple
    t = typing.Tuple[typing.Callable, typing.Any]
    assert t == t.__origin__[t.__args__]
    ## >>> [k for k,v in vars(t).items() if v != vars(o)[k]]    
    ## ['__subclasshook__', '__args__', '__tree_hash__', '__origin__']
    l = typing.List[int]
    assert l == l.__origin__[l.__args__]
    u = typing.Union[int,str]
    assert u == u.__origin__[u.__args__]
    w = dill.copy(typing.GenericMeta)
    assert w == typing.GenericMeta
    assert isinstance(t, w)
    ### NOTE: the below are weird: isinstance raises a TypeError ###
    j = dill.copy(typing.Union)
    assert j == typing.Union
    assert repr(type(u)) == repr(j)
    # slightly less weird, but still weird
    a = dill.copy(typing.Any)
    assert a == typing.Any
    assert repr(type(a)) == repr(a)
    b = dill.copy(typing.Optional)
    assert b == typing.Optional
    assert repr(type(b)) == repr(b)
    h = dill.copy(typing.ClassVar)
    assert h == typing.ClassVar
    assert repr(type(h)) == repr(h)
    ### NOTE: more? https://docs.python.org/3/library/typing.html
    d = dill.copy(typing.NamedTuple)
    assert d == typing.NamedTuple
    assert type(d) == typing.NamedTupleMeta

def test_typing_annotations():
    def doit(x: int, y: typing.List[int]) -> typing.List[int]:
        return y + [x]

    _doit = dill.copy(doit)
    #assert sorted(doit.__annotations__.items()) == sorted(_doit.__annotations__.items())
    #assert _doit == doit

def test_typing_newtype():
    UserId = typing.NewType('UserId', int)
    _UserId = dill.copy(UserId)
    #assert _UserId.__qualname__ == UserId.__qualname__
    #assert _UserId == UserId

def test_typing_callable():
    q = typing.Callable[[int,str,typing.Any], typing.List[typing.Any]]
    assert q == q.__origin__[list(q.__args__[:-1]),q.__args__[-1]]
    c = typing.Callable[[int], None]
    assert c == c.__origin__[list(c.__args__[:-1]),c.__args__[-1]]
    p = typing.Callable[..., str]
    assert p == p.__origin__[p.__args__]
    w = dill.copy(typing.GenericMeta)
    assert w == typing.GenericMeta
    assert isinstance(q, w)

def test_typing_generic():
    T = typing.TypeVar('T')
    _T = dill.copy(T)
    #assert _T == T
    g = typing.Generic[T]
    assert g == g.__origin__[g.__args__]
    m = typing.Mapping[int,T]
    assert m == m.__origin__[m.__args__]
    i = typing.Iterable[T]
    assert i == i.__origin__[i.__args__]
    class User(object):
        pass
    U = typing.TypeVar('U', bound=User)
    _U = dill.copy(U)
    #assert _U == U
    e = typing.Type[U]
    assert e == e.__origin__[e.__args__]
    w = dill.copy(typing.GenericMeta)
    assert w == typing.GenericMeta
    assert isinstance(g, w)
    v = dill.copy(typing.TypeVar)
    assert v == typing.TypeVar
    assert isinstance(T, v)
    ### NOTE: needs work...
    s = dill.copy(typing.AnyStr)
    #assert s == typing.AnyStr
    assert isinstance(s, v)


if __name__ == '__main__':
    test_typing_basic()
    test_typing_annotations()
    test_typing_newtype()
    test_typing_callable()
    test_typing_generic()

@tvalentyn

cc: @udim FYI
