API: use na_value from CategoricalDtype.categories in Categorical #52687

Closed
topper-123 opened this issue Apr 16, 2023 · 2 comments

Comments

@topper-123
Contributor

Currently, the Categorical NA value is hard-coded to np.nan. I propose making CategoricalDtype.na_value take its value from CategoricalDtype.categories.dtype.na_value if the categories are backed by an ExtensionArray, and fall back to np.nan otherwise.

Various parts of the Categorical code presume that the NA sentinel value is np.nan. Those will have to be changed to use Categorical.dtype.na_value instead.
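To make the proposal concrete, here is a rough sketch of the lookup. `resolve_na_value` is a hypothetical helper written for illustration and is not part of pandas; the actual change would live inside CategoricalDtype and might look different.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionDtype


def resolve_na_value(categories_dtype):
    """Hypothetical helper (not a pandas API): choose the NA sentinel for a
    Categorical from the dtype of its categories."""
    if isinstance(categories_dtype, ExtensionDtype):
        # Extension dtypes declare their own sentinel, e.g. pd.NA for Int64.
        return categories_dtype.na_value
    # NumPy dtypes have no na_value attribute, so fall back to np.nan.
    return np.nan


# Behaviour described in this issue: the dtype reports np.nan as its sentinel.
current = pd.CategoricalDtype(categories=[1, 2]).na_value

# Proposed behaviour, sketched with the helper above.
from_numpy = resolve_na_value(np.dtype("int64"))  # NumPy-backed categories
from_masked = resolve_na_value(pd.Int64Dtype())   # extension-backed categories
```

With this, NumPy-backed categories keep np.nan, while categories with a masked dtype such as Int64 would give pd.NA.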

@topper-123 changed the title from "API: Add na_value to CategoricalDtype" to "API: use na_value from .categories in CategoricalDtype" on Apr 16, 2023
@topper-123 changed the title from "API: use na_value from .categories in CategoricalDtype" to "API: use na_value from CategoricalDtype.categories in Categorical" on Apr 16, 2023
@rhshadrach
Member

Duplicate of #50711 (in particular #50711 (comment)) I think.

@topper-123
Contributor Author

Yes, it's the same idea as the one mentioned by @jorisvandenbossche. I think that's also the final conclusion for #50711, i.e. to follow the NA behaviour of the underlying categories (np.nan for NumPy arrays, pd.NA for extension arrays).

I'm ok with closing this to avoid the duplicate.
