Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to check if an array contains categorical values? #332

Closed
juliohm opened this issue Mar 2, 2021 · 12 comments
Closed

How to check if an array contains categorical values? #332

juliohm opened this issue Mar 2, 2021 · 12 comments

Comments

@juliohm
Copy link
Contributor

juliohm commented Mar 2, 2021

I used to write v isa CategoricalArray in user code to change the behavior and noticed that in the latest release of the package a view of a CategoricalArray is no longer a CategoricalArray. Is it safe to do eltype(v) <: CategoricalValue to check the condition in code? Perhaps a user function could be added to the api with this purpose?

@bkamins
Copy link
Member

bkamins commented Mar 2, 2021

Is it safe to do eltype(v) <: CategoricalValue to check the condition in code?

It is not safe unfortunately.

Your question is in general hard that we need to resolve somehow - maybe @nalimilan has some more recent thoughts what would be best to do, as we are asked about it often.

@nalimilan
Copy link
Member

Most of the time eltype(v) <: CategoricalValue should work, but depending on what you do you will likely get failures with Array{CategoricalValue{...}} objects (e.g. if you call levels).

The most correct solution is to use v isa Union{CategoricalArray, SubArray{<:Any, <:Any, <:CategoricalArray}} (or v isa CategoricalArrays.CatArrOrSub but it's not exported so there's no guaranty it will stay). I'm not sure what we could do to improve this except exporting CatArrOrSub and documenting it -- but we would need to find a better name.

@juliohm
Copy link
Contributor Author

juliohm commented Mar 2, 2021

What about a iscategorical(v) trait function that does the above and hides from the user the internal detail? That way you can update the trait function in future releases without breaking things.

@nalimilan
Copy link
Member

Well it would be equivalent to exporting CatArrOrSub, but with the drawback that it cannot be used for dispatch.

@juliohm
Copy link
Contributor Author

juliohm commented Mar 2, 2021 via email

@nalimilan
Copy link
Member

Why would iscategorical(v) be simpler for users than v isa CatArrOrSub (especially if we find a better name)?

@juliohm
Copy link
Contributor Author

juliohm commented Mar 2, 2021

Maybe it is a matter of taste. Maybe it is because most beginners aren't even familiar with the existence of the operator isa in Julia. Writing if iscategorical(v) when iscategorical is well-documented in the package docs seems more natural to me, but of course you have the final decision. Also if for some reason the condition changes internally because the internal type is gone or split into two or more types, the complexity is hidden from the user who doesn't need to know about types. You basically shift the work to the maintainers of CategoricalArrays.jl who will be responsible to provide a working iscategorical across multiple releases for which the CatArrOrSub may not even exist. Changing CatArrOrSub is breaking so user code that does use CatArrOrSub will more likely need to change than user code that uses iscategorical.

@nalimilan
Copy link
Member

Why would changing CatArrOrSub be more breaking than changing iscategorical?

@juliohm
Copy link
Contributor Author

juliohm commented Mar 2, 2021 via email

@nalimilan
Copy link
Member

Yes that could create dispatch issues, but iscategorical couldn't be used at all for dispatch, so I'm not sure it's a strong argument. iscategorical is indeed less binding for us, but it also provides fewer possibilities to users.

@ablaom
Copy link

ablaom commented May 31, 2021

This issue, in case anyone hasn't noticed, is essentially a duplicate of #234, which is also open.

@juliohm
Copy link
Contributor Author

juliohm commented Apr 2, 2024

Closing as duplicate

@juliohm juliohm closed this as completed Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants