Allow cur_svy() to work inside mutate()? #138

Closed
bschneidr opened this issue Dec 20, 2021 · 8 comments

Comments

@bschneidr
Contributor

I'm working on a pull request to try and implement the cur_svy_wts() function suggested in #136, and as a result I noticed that cur_svy() works inside summarize.tbl_svy() but not inside mutate.tbl_svy().

```r
library(srvyr)
library(survey)
data(api)

# Create a survey design object
dstrata <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

# Able to access weights in summarize
dstrata %>%
  summarize(sum_of_wts = sum(1/cur_svy()$prob))
#> # A tibble: 1 x 1
#>   sum_of_wts
#>        <dbl>
#> 1      6194.

# But not in mutate
dstrata %>%
  mutate(wts = 1/cur_svy()$prob)
#> Error: Problem with `mutate()` column `wts`.
#> i `wts = 1/cur_svy()$prob`.
#> x Survey context not set
```

Created on 2021-12-19 by the reprex package (v2.0.0)

Is there a design reason why cur_svy() isn't supported inside mutate()? If not, I'm happy to see if we can get it to work.

@gergness
Owner

Yeah, that makes sense. I just hadn’t thought of a situation where you’d want the survey context set in a mutate. Maybe it’s needed for all dplyr verbs (filtering by weights too?).

@gergness
Owner

Filtering by comparing to the mean in a single step is probably a more likely situation.

@bschneidr
Contributor Author

Thanks, Greg. That's a good common example. Another more specialized reason for filtering on the weights is doing diagnostics of nonresponse adjustments. For example, you might want to look at cases where the adjusted weights are very different from the original weights. Or you might want to look at cases with especially large or small weights so that you can trim them to try to reduce variances.

Do you think it would make sense to add context for mutate() and filter()? I don't really see a downside other than the hassle of implementation, but maybe I'm missing something. If you think it makes sense, I'm happy to put together a pull request for it.
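
For readers on a version where the survey context is not yet set inside mutate() and filter(), a possible workaround is to pull the weights out of the design first and then pass them in as an ordinary vector. This is only a sketch: it assumes that survey's weights() method applies to the design object created above (it returns 1/prob for such designs), and the names design_wts, dstrata_wts, and large_wts are made up for illustration.

```r
library(srvyr)
library(survey)
data(api)

dstrata <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

# Extract the weights outside of any dplyr verb (assumed equal to 1/prob)
design_wts <- as.numeric(weights(dstrata))

# Store them as a regular column so later verbs can refer to them
dstrata_wts <- dstrata %>%
  mutate(wts = design_wts)

# Example diagnostic: keep cases whose weight exceeds the unweighted mean weight
large_wts <- dstrata_wts %>%
  filter(wts > mean(wts))
```

Because the weights are computed before the pipeline starts, this does not depend on cur_svy() being available inside the verb.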

@gergness
Owner

Yep, makes sense! Thanks! I think it’s as simple as adding two lines like the ones here, but I may be forgetting something:
https://github.com/gergness/srvyr/blob/main/R/summarise.r#L6
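
The linked lines are not reproduced in this thread. As a rough, self-contained illustration of the general "set the context on entry, restore it on exit" pattern that such a change would likely follow, here is a toy sketch. Every name in it (ctx, set_ctx, cur_ctx, with_ctx_verb) is hypothetical and is not srvyr's actual internal code.

```r
# Toy context store; hypothetical, not srvyr's internals
ctx <- new.env()

# Replace the current context and return the previous one invisibly
set_ctx <- function(x) {
  old <- ctx$cur
  ctx$cur <- x
  invisible(old)
}

# Accessor that fails when no verb has set a context
cur_ctx <- function() {
  if (is.null(ctx$cur)) stop("Survey context not set")
  ctx$cur
}

# A verb wrapper: the first two lines of the body are the whole pattern
with_ctx_verb <- function(data, f) {
  old <- set_ctx(data)               # make `data` visible to cur_ctx()
  on.exit(set_ctx(old), add = TRUE)  # restore the previous context afterwards
  f()
}

result <- with_ctx_verb(data.frame(w = c(1, 2, 3)), function() sum(cur_ctx()$w))
```

Using on.exit() means the previous context is restored even if the function passed to the verb throws an error.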

@ray-p144

I just ran into this today. I was trying to figure out how many missing values there were for the group with non-zero weights so I tried to use cur_svy() in filter(). It didn't work, so I tried using it in mutate() and that didn't work either.

I logged on to GitHub to see if there were any open issues on it just to find that you guys are already on top of it!

@gergness
Owner

@bschneidr - did you get a chance to work on this? I'm hoping for some time to look at srvyr in the next week or two, would be nice to get this in the release. Happy to take a look myself if you haven't.

@bschneidr
Contributor Author

Yep, sorry for the delayed reply @gergness. I had to take some time off of open-source work the last few weeks but finally have time now to work on this.

@gergness
Owner

Closed via #139.
