Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why does categorical!(df::DataFrame, ...) exist? #2192

Closed
matthieugomez opened this issue Apr 16, 2020 · 3 comments · Fixed by #2195
Closed

Why does categorical!(df::DataFrame, ...) exist? #2192

matthieugomez opened this issue Apr 16, 2020 · 3 comments · Fixed by #2195
Labels
breaking The proposed change is breaking. decision
Milestone

Comments

@matthieugomez
Copy link
Contributor

matthieugomez commented Apr 16, 2020

I don't understand why categorical(df, cols) exists.

If it is about applying a function (categorical) to multiple columns of a dataframes, there should be a way to do it for any function, not just categorical (I guess that's what transform is about?).

I'm not following the package closely, so sorry if this was discussed somewhere.

@bkamins bkamins added breaking The proposed change is breaking. decision labels Apr 16, 2020
@bkamins bkamins added this to the 1.0 milestone Apr 16, 2020
@bkamins
Copy link
Member

bkamins commented Apr 16, 2020

Thank you for raising this point (we have a similar case with allowmissing and disallowmissing). Simply in the past our API was much less flexible.

It has one minor functionality that:

  1. by default it converts string column
  2. it can take type as a column selector

But apart from this it is true that roughly (I am dropping some corner cases):

@deprecate categorical(df, cols) transform(df, cols .=> categorical .=> cols)

and

@deprecate categorical!(df, cols) transform!(df, cols .=> categorical .=> cols)

Having said this, I do not see a huge problem to keep categorical and categorical! as convenience functions. We define such functions for most common transformations, so that users have an easier life. @nalimilan - what do you think?

@nalimilan
Copy link
Member

Yes I think it makes sense to have some convenience functions. That's the clearest for allowmissing as it's a very common need, but categorical is already exported since it exists in CategoricalArrays, so adding a new method isn't very costly.

For other functions, mapcols can also be used instead of transform. I guess we could add mapcols! for in-place operation.

@matthieugomez
Copy link
Contributor Author

Thanks for the explanation! I think it'd be better to have only one way to do things, but that's your choice! I appreciate all the improvement happening these days, by the way. Thanks for your work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking The proposed change is breaking. decision
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants