Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abstract Base Classes [Support ABC and Enums - Part 2] #580

Merged
merged 1 commit into from
Mar 26, 2023

Conversation

anivegesana
Copy link
Contributor

Adds support for pickling ABCs for Python 3.6 and higher.

Co-authored-by: David Stuebe <david@camus.energy>
@anivegesana anivegesana changed the title Abstract Base Classes [Support ABC and Enums - Part 1] Abstract Base Classes [Support ABC and Enums - Part 2] Mar 8, 2023
@anivegesana anivegesana marked this pull request as draft March 8, 2023 03:03
@anivegesana
Copy link
Contributor Author

@mmckerns I think that the cases that aren't covered in this PR are for versions of PyPy in which the _get_dump function doesn't exist. Do you think we should just ignore the coverage for this PR or try running coverage on PyPy?

@anivegesana anivegesana mentioned this pull request Mar 8, 2023
@anivegesana
Copy link
Contributor Author

anivegesana commented Mar 8, 2023

Also, @mmckerns when are we dropping Python 3.7? When that happens, I can remove the ugly hacks I used to make save_cell and save_function work and just use state setters (6th element in __reduce__) like cloudpickle does. It is more efficient and would declutter those two functions a lot. This is something that would require a lot of effort though, so I am not planning on doing it any time soon.

The problem would be that dill pickles would no longer be backward compatible. If this feature was added in dill 0.3.8, all pickles made with dill 0.3.8 will be incompatible with dill 0.3.6/7 on Python 3.7. They would however work on dill 0.3.7 on Python 3.8+ if we plan ahead and add the state setter functions in the upcoming release, even if we don't quite use them yet. I don't think that dropping 3.7 support during the summer would be a problem, however, considering Python 3.7 will reach end-of-life on June 27.

@mmckerns
Copy link
Member

mmckerns commented Mar 8, 2023

dill tends to support versions of python longer than other packages, primarily because of the storage nature of this package. I haven't figured out what the drop date will be yet.

It's a reasonable approach to figure out Python first, then PyPy later. The issue is, of course, if supporting PyPy causes changes to functions along the Python serialization path... then you might have to create several shims to maintain support for already pickled files.

@anivegesana
Copy link
Contributor Author

The issue isn't that PyPy and Python need different mechanisms for loading ABCs. The loading is the same.

Dumping differs because the location that the registered subclasses of the ABC are stored in different places. In CPython, they are stored somewhere in the C code of the abc module directly. In PyPy, each ABC has a registry that stores all registered subclasses in an attribute. This different means that there are two separate cases when dumping ABCs and there is no way to each good coverage on one version of Python. It needs to be run on both CPython and PyPy to reach full coverage.

@mmckerns
Copy link
Member

mmckerns commented Mar 8, 2023

Sure. What I meant is that if you refactor anything (i.e. any dill internal functions) that are stored along with the pickle, then you potentially get into shim territory when you start refactoring to include PyPy. Regardless, you can probably just go ahead with it.

@anivegesana anivegesana marked this pull request as ready for review March 9, 2023 02:49
@mmckerns mmckerns self-requested a review March 20, 2023 23:05
@mmckerns
Copy link
Member

Sorry, this doesn't notify me unless you also tag me. Just seeing this now. Will review.

Copy link
Member

@mmckerns mmckerns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. However, please do address the comments.

@@ -1700,6 +1700,27 @@ def _get_typedict_type(cls, clsdict, attrs, postproc_list):
# clsdict.pop('__prepare__', None)
return clsdict, attrs

def _get_typedict_abc(obj, _dict, attrs, postproc_list):
if hasattr(abc, '_get_dump'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does abc ever not have _get_dump? I believe this method exists for python 3.7 and above -- and dill supports python 3.7 and above. Haven't checked if there's older versions of 3.7 (etc) that don't have this method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't in PyPy. PyPy's implementation of abc is similar to the CPython 3.6 and older implementation because it is compiled directly into a binary by the RPython compiler and the point of PyPy is to implement Python in Python, so rewriting it in C doesn't make sense for them.

Copy link
Member

@mmckerns mmckerns Mar 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had checked pypy along with python, and found the same. pypy lags changes in python, but it seems that for the last few versions of pypy, it's there.

Python 3.7.13 (7e0ae751533460d5f89f3ac48ce366d8642d1db5, Mar 29 2022, 06:30:54)
[PyPy 7.3.9 with GCC Apple LLVM 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>> import abc
>>>> abc._get_dump
<built-in function _get_dump>
>>>> 

So, are you saying we need the code block due to older releases of pypy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. That is interesting. Yes. There are some older PyPy builds for Python 3.7 that do not have _get_dump. I guess they added a _get_dump function to make the API identical to the CPython version. Maybe we can remove this if statement?

del _dict['_abc_negative_cache']
# del _dict['_abc_negative_cache_version']
else:
del _dict['_abc_impl']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to use pop instead of del to delete, just because it's protected against the keyword not being there.

Copy link
Contributor Author

@anivegesana anivegesana Mar 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, strangely enough, del is older than pop. All modern Python implementations, including MicroPython, will have both, so this is not a concern.

I chose to use del because I already use the value of _abc_impl earlier in the function, so I know it exists. I guess switching to pop might be better for the other statements though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to leave it as is. This is what I use and it makes sense based on the return values of the different functions/operators:

  • del a[·]: when I know the key exists in the dictionary and don't need the value
  • a.pop(·): when I know the key exists in the dictionary but do need the value
  • a.pop(·, None): when I do not know if the key exists in the dictionary

Copy link
Member

@mmckerns mmckerns Mar 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I said "keyword" and meant "key". I didn't mean that for you to think I was worried that del doesn't exist, I meant it in this sense:
a.pop(·, None): when I do not know if the key exists in the dictionary
The bug encountered with del is almost always that the key doesn't exist, so if the key might not (in some rare case) exist, then pop will protect you against that.

@mmckerns mmckerns added this to the dill-0.3.7 milestone Mar 21, 2023
@mmckerns mmckerns mentioned this pull request Mar 25, 2023
@anivegesana anivegesana force-pushed the enums2 branch 2 times, most recently from 995e34c to f522c0b Compare March 25, 2023 17:07
@mmckerns
Copy link
Member

Also, note that you state "Adds support for pickling ABCs for Python 3.6 and higher". However, only 3.7+ are supported.

@mmckerns mmckerns merged commit 3c8790b into uqfoundation:master Mar 26, 2023
@anivegesana
Copy link
Contributor Author

Yeah. I just copied the commit name from the old PR, which is why it is like that. The enum PR was 2.7 and higher, but I removed the 2.7 code so we don't have to look at it. When we drop 3.7 support, we can remove the _get_dump if statement in this PR too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants