Add Pickle support #372

nada-ben-ali · 2024-12-09T09:55:10Z

I am working on caching the PDX config parsing result to avoid reparsing it each time, using Pickle for serialization.

During the pickling process, no issues were encountered. However, during the unpickling step, the following exception occurred: "maximum recursion depth exceeded" which originated in the NamedItemList class.
The recursion occurred due to infinite calls to the NamedItemList.__getattr__ method. Specifically, the code attempted to access a property that was uninitialized at the time, causing the __getattr__ to be invoked repeatedly during the reconstruction of the object.
To address this, I added the __reduce__ method which explicitly defines how an object is serialized and reconstructed during pickling and unpickling. After adding this method, the recursion issue was resolved.

After fixing the recursion issue, I got the following exception with some PDX files during the unpickling:

This issue arises because the NamedItemList assumes all its elements conform to the OdxNamed interface, including having a short_name attribute. While the PDX files were correctly parsed initially, the issue arose because TableRow objects were reconstructed in a state where their short_name attribute was not properly restored or initialized.
Thus, the solution was to add the __reduce__ method for the TableRow class, ensuring its attributes are correctly restored during unpickling.

odxtools: 8.3.4

nada-ben-ali · 2024-12-10T09:05:07Z

@andlaus I tried your suggestion, which is to copy only the dataclass fields of the objects and then pickle/unpickle the object returned by “copy_db”.
I got an exception saying that “cannot pickle ‘zlib.Decompress’ object”:

I therefore excluded .jar files from serialization by adding __getstate__ and __setstate__ methods to the Database class.
Then, I got the same exception regarding the recursion:

nada-ben-ali · 2024-12-12T17:02:10Z

@kayoub5 what do you think? Any suggestion?

kayoub5 · 2024-12-12T17:54:51Z

@kayoub5 what do you think? Any suggestion?

The approach taken by the PR looks valid at first glance.

A unit test will be needed to ensure future versions don't break pickling support.

For the suggestion from @andlaus, pickling should be in theory a faster than re-reading the pdx, and work fine on circular references of normal classes.

Trying to get the whole database to work with pickling is not the target here.
We are not pickling the whole pdx, just the request/response we are interested in (so unused inherited dops and structures are pruned from serialization)
the NamedItemList overriding the builtin list type is causing the side effects on pickling we are trying to fix here.

@nada-ben-ali do you have numbers on how much pickling is faster on a large PDX?

andlaus · 2024-12-12T18:41:57Z

(so unused inherited dops and structures are pruned from serialization)

I guess it's better to overload the __getstate__() and __setstate__() methods in classes that should be pickleable: https://docs.python.org/3/library/pickle.html#object.__getstate__

(as a general approach, a OdxDataObject class could be introduced which could handle serialization of dataclass fields. then, it must just be ensured that all dataclasses that correspond to ODX XML tags are derived from that class.)

kayoub5 · 2024-12-12T19:45:09Z

(so unused inherited dops and structures are pruned from serialization)

I guess it's better to overload the __getstate__() and __setstate__() methods in classes that should be pickleable: https://docs.python.org/3/library/pickle.html#object.__getstate__

(as a general approach, a OdxDataObject class could be introduced which could handle serialization of dataclass fields. then, it must just be ensured that all dataclasses that correspond to ODX XML tags are derived from that class.)

we tried that, doesn't work on a subclass of the builtin type list.

dataclasses are in general pickable, tablerow was the exception that we had to fix.

andlaus · 2024-12-12T20:17:18Z

dataclasses are in general pickable

sure, but you probably only want to serialize the fields of the dataclass, not also the resolved references, etc?!

kayoub5 · 2024-12-12T20:48:00Z

dataclasses are in general pickable

sure, but you probably only want to serialize the fields of the dataclass, not also the resolved references, etc?!

actually, we do, saves time on computing them again

nada-ben-ali · 2024-12-13T08:31:33Z

@nada-ben-ali do you have numbers on how much pickling is faster on a large PDX?

@kayoub5 yes, pickling has saved us a lot of time. For example, parsing a PDX file without pickling took 14 sec 296 ms, while using the pickling reduced it to just 846 ms. Similarly, with another PDX file, the time dropped from 2 sec 791 ms to 406 ms.

A unit test will be needed to ensure future versions don't break pickling support.

I agree, a unit test is crucial to guarantee that pickling support remains intact in future versions. I'll add one.

I guess it's better to overload the getstate() and setstate() methods in classes that should be pickleable: https://docs.python.org/3/library/pickle.html#object.__getstate__

@andlaus we tried that, but for list subclasses, the __setstate__() method is not invoked.

andlaus

I like this...

odxtools/nameditemlist.py

odxtools/tablerow.py

…it test

tests/test_diag_data_dictionary_spec.py

tests/test_odxtools.py

nada-ben-ali · 2024-12-18T10:03:17Z

@andlaus could you please review the changes after the updates I've made and if you approve them, then we can merge them?

andlaus

@nada-ben-ali: press the green button once you're happy with it...

odxtools/tablerow.py

nada-ben-ali added 2 commits December 9, 2024 10:52

Add Pickle support

3f7ae28

fix linting

871adc0

nada-ben-ali requested review from andlaus and kayoub5 December 9, 2024 10:10

nada-ben-ali mentioned this pull request Dec 10, 2024

Add Pickle support #371

Closed

andlaus reviewed Dec 13, 2024

View reviewed changes

odxtools/nameditemlist.py Show resolved Hide resolved

odxtools/nameditemlist.py Outdated Show resolved Hide resolved

odxtools/tablerow.py Outdated Show resolved Hide resolved

nada-ben-ali added 4 commits December 15, 2024 05:34

add missing docstrings and integrate pickle support check into the un…

06f30d1

…it test

fix linting

6e8cfcf

fix

69bfe9e

fix

fbbe5cb

nada-ben-ali requested a review from andlaus December 15, 2024 05:01

kayoub5 reviewed Dec 15, 2024

View reviewed changes

tests/test_diag_data_dictionary_spec.py Outdated Show resolved Hide resolved

kayoub5 reviewed Dec 15, 2024

View reviewed changes

tests/test_odxtools.py Outdated Show resolved Hide resolved

fix unit test

af72b9d

kayoub5 approved these changes Dec 17, 2024

View reviewed changes

andlaus approved these changes Dec 19, 2024

View reviewed changes

odxtools/tablerow.py Show resolved Hide resolved

nada-ben-ali merged commit 03a096f into mercedes-benz:main Dec 19, 2024
7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Pickle support #372

Add Pickle support #372

nada-ben-ali commented Dec 9, 2024 •

edited

Loading

nada-ben-ali commented Dec 10, 2024

nada-ben-ali commented Dec 12, 2024

kayoub5 commented Dec 12, 2024

andlaus commented Dec 12, 2024

kayoub5 commented Dec 12, 2024

andlaus commented Dec 12, 2024

kayoub5 commented Dec 12, 2024

nada-ben-ali commented Dec 13, 2024 •

edited

Loading

andlaus left a comment

nada-ben-ali commented Dec 18, 2024

andlaus left a comment

Add Pickle support #372

Add Pickle support #372

Conversation

nada-ben-ali commented Dec 9, 2024 • edited Loading

nada-ben-ali commented Dec 10, 2024

nada-ben-ali commented Dec 12, 2024

kayoub5 commented Dec 12, 2024

andlaus commented Dec 12, 2024

kayoub5 commented Dec 12, 2024

andlaus commented Dec 12, 2024

kayoub5 commented Dec 12, 2024

nada-ben-ali commented Dec 13, 2024 • edited Loading

andlaus left a comment

Choose a reason for hiding this comment

nada-ben-ali commented Dec 18, 2024

andlaus left a comment

Choose a reason for hiding this comment

nada-ben-ali commented Dec 9, 2024 •

edited

Loading

nada-ben-ali commented Dec 13, 2024 •

edited

Loading