replace _msgpack with _pyarrow #28944

Closed
endremborza opened this issue Oct 12, 2019 · 5 comments

Comments

@endremborza
Contributor

After reading about the future of read_msgpack/to_msgpack and on-the-wire transmission of pandas objects in #28388, my understanding is that there will be no fast way to do:

>>> buf = df.to_XY()
>>> type(buf)
<class 'bytes'>
>>> df2 = pd.read_XY(buf)
>>> df.equals(df2)
True

I think a lot of users would welcome XY = pyarrow (or simply arrow) as a replacement for the current msgpack functions. This could be done with a few wrappers over the pyarrow pandas serialization functions. Testing and maintaining them might seem daunting, but as far as I can see, the arrow project is quite committed to maintaining pandas compatibility.

However, if this is too much, at the very least a docs update should address this, as it is currently not easy to find the closest equivalent of the example at the top.

I would be happy to post a PR for either option, just pick one. If the first one seems viable, we should then decide how much testing is needed.

btw, my current method of replicating the above behavior is

>>> import pyarrow as pa
>>> buf = pa.serialize_pandas(df).to_pybytes()
>>> type(buf)
<class 'bytes'>
>>> df2 = pa.deserialize_pandas(buf)
>>> df.equals(df2)
True
@jreback
Contributor

jreback commented Oct 12, 2019

i believe the docs were recently updated (or at least the deprecation message)

we would take a read_arrow method; I think we have an open issue about this

@endremborza
Contributor Author

can't find the open issue, but did find the doc PR (#28494), which seems fair

@jreback
Contributor

jreback commented Oct 12, 2019

#15907 was an old PR

you're welcome to revive it

@endremborza
Contributor Author

will do, thanks

@simonjayhawkins
Member

msgpack is deprecated #30112
