diff --git a/NEWS.md b/NEWS.md
index 1dea223b..47fd7d42 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -8,6 +8,8 @@
 
 * Escalated the deprecation of the `gather()` method for `rset` objects to a hard deprecation. Use `tidyr::pivot_longer()` instead (#257).
 
+* Changed resample "fingerprint" to hash the indices only rather than the entire resample result (including the data object). This is much faster and will still ensure the same resample for the same original data object (#259).
+
 # rsample 0.1.0
 
 * Fixed how `mc_cv()`, `initial_split()`, and `validation_split()` use the `prop` argument to first compute the assessment indices, rather than the analysis indices. This is a minor but **breaking change** in some situations; the previous implementation could cause an inconsistency in the sizes of the generated analysis and assessment sets when compared to how `prop` is documented to function (#217, @issactoast).
diff --git a/R/rset.R b/R/rset.R
index a44ec4a9..455c43ae 100644
--- a/R/rset.R
+++ b/R/rset.R
@@ -74,7 +74,8 @@ new_rset <- function(splits, ids, attrib = NULL,
     res <- add_class(res, cls = subclass)
   }
 
-  fingerprint <- rlang::hash(res)
+  fingerprint <- map(res$splits, function(x) list(x$in_id, x$out_id))
+  fingerprint <- rlang::hash(fingerprint)
   attr(res, "fingerprint") <- fingerprint
 
   res