ENH: categorical dataexport - graceful degradation #8633
Comments
See #7621 for the master issue. Of course, user contributions to extend support to these formats would be appreciated.
Sorry, I searched but somehow missed #7621, otherwise I would have commented there. Nevertheless, I think my point is a bit different, in the sense that I suggest having a rather simple generic fallback mechanism whenever there is no dedicated backend support. Yes, of course explicit conversion works, but since that is the natural generic approach, why not apply something like that internally as a last resort instead of throwing NotImplemented? I would love to contribute, but I barely have time to report issues.
@fkaufer well, if I am going to implement graceful degradation, then I might as well implement the actual serialization. It requires nearly the same tests and effort. NotImplemented is just an explicit stop-gap until it can work; it's how a not-implemented feature is signaled to the user. I do appreciate the issue report. Serialization was pushed to later for lack of time. So this is the same issue (I don't think degradation is worth it, and it just hides the problem from the user, which is not good).
Add support for exporting DataFrames containing categorical data. closes pandas-dev#8633 xref pandas-dev#7621
It would be great to generally apply graceful degradation for export of categorical data instead of raising exceptions.
Currently this is only the case for `to_sql` and `to_csv`, where the categories are exported, while `to_pickle` is the only option to persist categorical data.

For Stata and HDF it is:

- `to_hdf`: `NotImplementedError: cannot store a category dtype`
- `to_stata`: `ValueError: Data type category not currently understood. Please report an error to the developers.`
As long as a backend does not support categoricals or the conversion is not yet implemented, why not generally export the categories as a fallback? With the separately discussed decode method (#8628) this would be easy. If the same rigor (the backend supports the data type natively, or the export fails) were applied to CSV I/O, we could only export string dtypes to CSV.
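As a rough illustration of the explicit conversion such a fallback would perform, here is a minimal sketch. The helper name `decode_categoricals` is hypothetical and is not part of pandas, nor is it the decode method discussed in #8628; it assumes unique column names and a pandas version that provides `Series.to_numpy`.

```python
import pandas as pd


def decode_categoricals(df):
    """Return a copy of df with each categorical column replaced by its plain values.

    Hypothetical helper for illustration only; not a pandas API.
    Assumes column names are unique.
    """
    out = df.copy()
    for col in out.columns:
        if isinstance(out[col].dtype, pd.CategoricalDtype):
            # to_numpy() materializes the category values, dropping the
            # categorical dtype (codes and categories are not kept).
            out[col] = out[col].to_numpy()
    return out


# Example frame with one categorical and one integer column.
df = pd.DataFrame({"grade": pd.Categorical(["a", "b", "a"]), "n": [1, 2, 3]})
plain = decode_categoricals(df)
print(plain.dtypes)
```

The resulting frame can then be passed to whichever `to_...` function lacks categorical support. The category information itself is lost in the exported file, which is the trade-off this issue refers to as graceful degradation.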
Thinking one step further, the `to_...` functions could have an optional parameter named something like `convert_cat` with options:

The last option would probably need additional parameters to control the technical implementation (e.g. table name for mapping or suffixes as for join/merge, ...)