read_feather method does not restore a dataframe's table-level user-defined properties (attributes or tags or metadata) #324
Comments
Not even pickle preserves these properties when using pandas. I don't think this is something we intend to fix in generality:
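A minimal sketch of the pickle behavior described above, assuming a small throwaway frame (the column name `a` and the variable `restored` are illustrative, and this is not the snippet originally posted in the thread):

```python
import pickle

import pandas as pd

# Hypothetical example frame; only the ad-hoc attribute matters here.
df = pd.DataFrame({"a": [1, 2, 3]})
df.someid = 24  # plain Python attribute on the object, not a column

# Round-trip through pickle entirely in memory.
restored = pickle.loads(pickle.dumps(df))

print(hasattr(df, "someid"))        # the original object still carries it
print(hasattr(restored, "someid"))  # the unpickled frame does not
```

The attribute lives only on the original Python object, so it is not part of what pandas serializes.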
Thank you for the speedy response (and for creating pandas!). In that case, I hope that some sort of compression support gets added to feather. (I have to store the property as a column to preserve it, so that means repeating it many, many times.)
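The column workaround mentioned above might look roughly like this. The helper names `attach_property` and `detach_property` are hypothetical, not pandas API, and the sketch assumes the frame has at least one row:

```python
import pandas as pd


def attach_property(df: pd.DataFrame, name: str, value) -> pd.DataFrame:
    """Return a copy of df with `value` repeated in a new column `name`."""
    return df.assign(**{name: value})


def detach_property(df: pd.DataFrame, name: str):
    """Return (frame without the helper column, the stored value)."""
    value = df[name].iloc[0]  # every row holds the same value
    return df.drop(columns=name), value


df = pd.DataFrame({"x": [1.0, 2.0, 3.0]})
stored = attach_property(df, "someid", 24)
# ... stored.to_feather(...) and pd.read_feather(...) would go here ...
restored, someid = detach_property(stored, "someid")
print(someid)
```

The cost is the one described in the comment: the value is repeated once per row, which is why compression matters for this approach.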
I'm waiting on the R community to get more involved; we have had compression support in Apache Arrow for a very long time.
Great, hope they catch up. (It's a big advantage to have compression, especially when you need to move around a hundred thousand dataframes with three hundred thousand rows each!)
(just fyi: some useful cross-references: One of the more comprehensive discussions I've found:
I create a property named someid (it is NOT a column) in a pandas dataframe named df and assign it a value:
df.someid = 24
I make sure the property is there:
print(df.someid)  # prints 24
I save the dataframe to a feather file:
df.to_feather("C:/pandas/DataLoss.feather")
I read the feather file back into a dataframe:
df = pd.read_feather("C:/pandas/DataLoss.feather")
I try to retrieve the property from the dataframe:
print(df.someid)
And I do not get its value (24) back. Instead I get this error message:
Thanks for any assistance or comments. (Apologies in advance if I missed something in the docs or online; I didn't see anything relevant after a reasonable search, except the unapproved use of the undocumented _metadata.)
I realize there are thorny, perhaps unresolvable, problems with dataframe table-level metadata propagation (e.g. how to handle vertical dataframe concatenations), but I think what happens (by design) should be documented. As it is, I don't know what is supposed to be happening by design here.
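For context on the `_metadata` mention, here is a minimal sketch of the subclassing hook, assuming a hypothetical subclass named `TaggedFrame`. It only shows the property following the frame through an in-memory `copy()`. It does not make the value survive a feather round trip, since `pd.read_feather` returns a plain DataFrame.

```python
import pandas as pd


class TaggedFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass carrying one extra property."""

    # Names listed here are copied along when pandas finalizes a result.
    _metadata = ["someid"]

    @property
    def _constructor(self):
        # Make pandas operations return TaggedFrame instead of DataFrame.
        return TaggedFrame


tf = TaggedFrame({"a": [1, 2, 3]})
tf.someid = 24

copied = tf.copy()
print(type(copied).__name__, copied.someid)
```

Which operations carry the property along (concatenation, for example) is exactly the open design question raised above, so this sketch makes no claim beyond `copy()`.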