Feature Suggestion - Function to Remove Specific Unused Levels #69

Closed
bschneidr opened this issue Dec 28, 2016 · 1 comment

bschneidr commented Dec 28, 2016

Sometimes you want to drop one or two specific unused factor levels without dropping all of the unused levels. For example, you might have the following factor from survey data:

    vote_intention <- factor(x = c("Democrat", "Republican"), 
                             levels = c("Democrat", "Republican", 
                                        "Independent", "Undecided"))

where you'd be interested in removing the 'Undecided' level but not the 'Independent' level, even though both are unused. In such a case, you would essentially want to use a version of fct_drop() that doesn't drop every unused level but instead allows finer control over which levels are dropped.

A simple example of such a function would be the following:

    fct_drop_specific <- function(f, l) {
      factor(x = f,
             levels = setdiff(levels(f), l),
             ordered = is.ordered(f))
    }

    fct_drop_specific(f = vote_intention, l = "Undecided")

A function that does this explicitly seems like a natural fit for forcats. Unless it is already implemented elsewhere, it could be added either as an argument to fct_drop() or as a new function in forcats.

Member

hadley commented Dec 30, 2016

What should the function do if you request to drop a level that has values? Silently ignore and only drop levels that don't have any values? Or throw an error? Or silently replace all named levels with NA?
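
To make the three options concrete, here is a minimal base R sketch of how each one could behave. The `on_used` argument and its values ("ignore", "error", "na") are hypothetical names chosen for illustration; nothing here is part of forcats, and it is only one possible reading of the question above.

```r
# Hypothetical extension of the fct_drop_specific() idea from this issue.
# `on_used` is an assumed argument, not an existing forcats parameter.
fct_drop_specific <- function(f, l, on_used = c("ignore", "error", "na")) {
  on_used <- match.arg(on_used)

  # Requested levels that still have at least one observed value
  used <- intersect(l, unique(as.character(f[!is.na(f)])))

  if (length(used) > 0) {
    if (on_used == "error") {
      stop("Levels still in use: ", paste(used, collapse = ", "))
    }
    if (on_used == "ignore") {
      # Only drop the requested levels that are actually unused
      l <- setdiff(l, used)
    }
    # on_used == "na": leave `used` in `l`, so those values become NA below
  }

  factor(x = f,
         levels = setdiff(levels(f), l),
         ordered = is.ordered(f))
}

vote <- factor(c("Democrat", "Republican", "Undecided"),
               levels = c("Democrat", "Republican",
                          "Independent", "Undecided"))

# "Undecided" has a value here, so the default keeps it and drops only
# the unused "Independent" level.
levels(fct_drop_specific(vote, c("Independent", "Undecided")))

# With on_used = "na", the "Undecided" observation becomes NA.
fct_drop_specific(vote, "Undecided", on_used = "na")
```

Defaulting to "ignore" keeps the behavior closest to fct_drop(), which never touches levels that have values; the other two modes are opt-in because they either stop execution or discard data.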
