Saved model memory issue #116
Comments
I ran it again using the most recent development version of butcher and the increase was still 32x. Using butcher::weigh on the models reveals that coefs. objects increase a lot
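A minimal sketch of that kind of per-component comparison, assuming `butcher::weigh()` returns a table with `object` and `size` columns. The `lm()` fit on `mtcars` is a stand-in chosen for illustration, not the stacked model from this thread, and it is not expected to show the same growth.

```r
library(butcher)

# Stand-in model (hypothetical; not the stacked model from the issue).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Round-trip the fit through an .Rds file.
path <- tempfile(fileext = ".Rds")
saveRDS(fit, path)
fit_reloaded <- readRDS(path)

# Size of each component before and after the round trip (in MB).
before <- weigh(fit, threshold = 0, units = "MB")
after <- weigh(fit_reloaded, threshold = 0, units = "MB")

# Line the two tables up by component name and rank by relative growth.
cmp <- merge(before, after, by = "object", suffixes = c("_before", "_after"))
cmp$ratio <- cmp$size_after / cmp$size_before
cmp <- cmp[order(cmp$ratio, decreasing = TRUE), ]
head(cmp)
```

Swapping the stand-in fit for the real model object should point at whichever components account for the growth.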
Holy smokes, that's a big increase! I was able to reproduce the issue and it seems robust to saving in a few different ways. I also see that the
A slightly more minimal example:
library(tidymodels)
library(modeldata)
library(readr)
library(lobstr)
library(stacks)
data("lending_club")
folds <- vfold_cv(lending_club, v = 5)
lr_mod <-
  logistic_reg(penalty = tune()) %>%
  set_engine("glmnet") %>%
  workflow(
    preprocessor = Class ~ funded_amnt + int_rate,
    spec = .
  ) %>%
  tune_grid(
    resamples = folds,
    control = control_stack_grid()
  )
lr_stack <- stacks() %>%
  add_candidates(lr_mod) %>%
  blend_predictions() %>%
  fit_members()
#> Warning: Predictions from 12 candidates were identical to those from existing
#> candidates and were removed from the data stack.
write_rds(lr_stack, file = "saved_mod.Rds")
saved_lr_stack <- read_rds("saved_mod.Rds")
obj_sizes(lr_stack, saved_lr_stack)
#> * 6,003,216 B
#> * 219,515,088 B
Created on 2022-04-21 by the reprex package (v2.0.1)
with a similar 30x+ increase. Will come back to this in a bit. Thanks for the issue. :)
Quick addition: this isn't just a parsnip + glmnet issue!
library(modeldata)
library(readr)
library(lobstr)
library(parsnip)
data("lending_club")
lr_mod <- logistic_reg(penalty = .5)
mod_fit <-
  lr_mod %>%
  set_engine("glmnet") %>%
  fit(Class ~ funded_amnt + int_rate, data = lending_club)
write_rds(mod_fit, file = "saved_mod.Rds")
saved_mod_fit <- read_rds("saved_mod.Rds")
obj_sizes(mod_fit, saved_mod_fit)
#> * 1,651,112 B
#> * 1,647,624 B
Created on 2022-04-21 by the reprex package (v2.0.1)
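For readers wondering how an object can measure larger at all after a write/read round trip: one general mechanism in R is that `lobstr::obj_size()` counts memory shared within an object only once, while serialization writes each reference to a shared vector out in full. The sketch below illustrates that mechanism with a made-up list. It is not confirmed as the cause of the growth reported in this thread.

```r
library(lobstr)

# A large numeric vector (runif() avoids compact ALTREP sequences such as 1:n).
x <- runif(1e6)

# Three list slots that all point at the same vector in memory.
shared <- list(a = x, b = x, c = x)

# Round-trip through serialization, as saveRDS()/readRDS() do.
reloaded <- unserialize(serialize(shared, connection = NULL))

# Measure each object on its own; the reloaded list holds three separate copies.
size_before <- obj_size(shared)
size_after <- obj_size(reloaded)
print(size_before)
print(size_after)
```

Here the reloaded list should come out at roughly three times the size of the original, because the single shared vector has become three independent ones.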
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
The problem
I'm having trouble saving a stacked xgb model object to a local folder for later use. The object somehow expands in size and does not fit into RAM when read back in a new session. I have tried butchering it, but that does not seem to help much.
I'm not sure if this is due to stacks, butcher, or simply my approach. Here is an example with a much smaller dataset than the one I'm struggling with, but the relative increase in size is of the same magnitude.
Thanks,
Reproducible example
Created on 2022-04-21 by the reprex package (v2.0.1)
Session info