When taking stratified bootstrap samples from a data frame, I expected separate bootstrap samples to be taken within each stratum, so that the resulting bootstrap sample has the same number of observations in each stratum as the original data frame. However, that is not the case when using the bootstraps() function of the rsample package. When I run this code:
I was expecting to see 6 1's, 6 2's, 23 3's, and 23 4's in each of the three bootstrap samples.
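For comparison, the behaviour I expected can be sketched in base R. This is only an illustration under stated assumptions, not how rsample implements it: the data frame `dat`, the column name `stratum`, and the helper `stratified_boot()` are hypothetical names, and only the stratum sizes (6, 6, 23, 23) come from the description above.

```r
# Hypothetical data matching the stratum sizes described in this issue.
set.seed(1)
dat <- data.frame(
  stratum = rep(1:4, times = c(6, 6, 23, 23)),
  y = rnorm(58)
)

# Hypothetical helper: resample row indices with replacement separately
# inside each stratum, so every stratum keeps its original size.
stratified_boot <- function(data, strata) {
  idx_by_stratum <- split(seq_len(nrow(data)), data[[strata]])
  idx <- unlist(lapply(idx_by_stratum, function(i) {
    # Index through sample.int() to avoid the sample(x) pitfall
    # when a stratum has a single row.
    i[sample.int(length(i), replace = TRUE)]
  }), use.names = FALSE)
  data[idx, , drop = FALSE]
}

# Three bootstrap samples, then the stratum counts in each one.
boots <- lapply(1:3, function(b) stratified_boot(dat, "stratum"))
lapply(boots, function(d) table(d$stratum))
```

Because each stratum is resampled on its own, every one of the three tables shows the original stratum sizes.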
When I posted a query on Stack Overflow, joran commented that the function make_strata() by default pools strata that make up less than 15% of the total, with no way to adjust that parameter from the calling functions, such as bootstraps(). This pooling is not mentioned in the help documentation for the bootstraps() function.
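A quick arithmetic check shows why the two small strata are affected by a 15% threshold. This is a sketch only: `small_strata()` and its `pool` argument are hypothetical names used for illustration and do not reproduce the internals of make_strata(); the counts are the ones given above.

```r
# Stratum sizes from the example in this issue.
counts <- c(`1` = 6, `2` = 6, `3` = 23, `4` = 23)

# Hypothetical helper: names of the strata whose share of the total
# falls below the threshold `pool`.
small_strata <- function(counts, pool) {
  props <- counts / sum(counts)
  names(props)[props < pool]
}

# Strata 1 and 2 each hold 6 of 58 rows (about 10.3%), so both
# fall below a 15% threshold.
small_strata(counts, pool = 0.15)
```

With `pool = 0.10` the same call flags no strata, because 6/58 is slightly above 10%.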
sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base
other attached packages:
[1] rsample_0.0.4 tidyr_0.8.2
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 crayon_1.3.4 dplyr_0.8.0.1
[4] assertthat_0.2.0 R6_2.4.0 magrittr_1.5
[7] pillar_1.3.1 rlang_0.3.1 rstudioapi_0.9.0
[10] generics_0.0.2 tools_3.5.2 glue_1.3.0
[13] purrr_0.3.0 yaml_2.2.0 compiler_3.5.2
[16] pkgconfig_2.0.2 tidyselect_0.2.5 tibble_2.0.1
The PR in #149 lowers the threshold for strata pooling to 10% of the total and adds documentation to each function so that users can see more clearly what is happening with their groups.
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.