Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inside survey_stat_factor(), "reverse" conversion of peel variable back from numeric to character #88

Merged
merged 2 commits into from
Jul 25, 2020

Conversation

bschneidr
Copy link
Contributor

This is one way to fix #78 and #74.

Inside survey_stat_factor(), after .svy$variables[[peel_var]] is converted to a character vector, the corresponding column in the output out[[peel_var]] is converted back to the original type. In conducting the conversion from numeric to character, the format() function is used to maintain the maximum possible precision when converting from numeric to character.

My only qualm with this approach is that it is possible that some desired level of numeric precision is lost when converting from numeric to character, and perhaps there is a similarly simple solution which would avoid this issue of precision.

@bschneidr bschneidr changed the title Inside survey_stat_factor(), convert peel_var in output back to its… Inside survey_stat_factor(), "reverse" conversion of peel variable back from numeric to character Jul 20, 2020
@gergness
Copy link
Owner

seems reasonable, can you add tests though?

… original type from `.svy$variables`. In conversion to character, maintain maximum possible precision.
@bschneidr
Copy link
Contributor Author

bschneidr commented Jul 21, 2020

Good call.

It turns out something is going wrong when using double numeric grouping variables with many decimal places. The precision is preserved in the output, but sometimes the estimates and standard errors come back as NA. I'm still trying to figure out why this happens. Edit: Please disregard this- I had messed something up in my local environment when I wrote that.

@codecov-commenter
Copy link

Codecov Report

Merging #88 into master will increase coverage by 0.31%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #88      +/-   ##
==========================================
+ Coverage   79.94%   80.25%   +0.31%     
==========================================
  Files          20       20              
  Lines        1022     1028       +6     
==========================================
+ Hits          817      825       +8     
+ Misses        205      203       -2     
Impacted Files Coverage Δ
R/survey_statistics.r 92.73% <100.00%> (+1.06%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6c379d9...8121cea. Read the comment docs.

@bschneidr
Copy link
Contributor Author

I've added three tests which I think should cover everything:

  1. survey_count()/survey_tally() work with an integer grouping variable
  2. survey_count()/survey_tally() work with a double grouping variable
  3. When a double grouping variable is used, the values of that variable in the output from survey_count()/survey_tally() are within .Machine$double.eps of the values in the input dataset.

With the addition of the third test, I think it should be ok to remove the warning message: warning("Coercing ", peel_name, " to character.", call. = FALSE):

warning("Coercing ", peel_name, " to character.", call. = FALSE)

@gergness gergness merged commit d53ed07 into gergness:master Jul 25, 2020
@gergness
Copy link
Owner

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

empty survey_mean/survey_total don't work if unpeeling an integer grouping var
3 participants