patch params argument with xgboost engine in boost_tree() #787

Merged (4 commits) on Aug 17, 2022

Conversation

simonpcouch
Contributor

Closes #774, closes #459. Related to #411.

The goal of this PR is to ensure that folks can pass arguments that live in the params argument to xgb.train. The new docs section is probably the best place to start for the big picture here. :)

Some notes-to-self that helped me keep track of arguments:

  • xgb.train routes arguments from its dots to the params argument. For simplicity of our own argument routing, this PR proposes we pass non-main params arguments to the dots rather than params.
  • xgb_train previously took objective as a main argument. This argument is eventually routed to the params argument in xgb.train, so again for simplicity of our machinery, I deleted the objective argument to xgb_train so that it will be passed through dots. This change isn’t user-facing and doesn’t change the way the argument is passed in practice.
  • We’d prefer that users pass elements of the params argument directly to set_engine rather than as part of the params list so that tune machinery works “out-of-the-box.” This PR now raises a warning when users supply a non-empty params argument, but it still handles that case correctly by patching the supplied params list with the main boost_tree arguments.
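To make the last bullet concrete, here is a minimal sketch of the two ways of supplying an element of params, assuming the xgboost package is installed. The choice of `eval_metric = "mae"` and the `mtcars` data are only for illustration; the second form is the one the PR description says now triggers a warning.

```r
library(parsnip)

# Preferred: pass the element of `params` directly as an engine argument,
# so it is routed through the dots and can also be marked with tune()
spec_direct <-
  boost_tree(trees = 50) %>%
  set_mode("regression") %>%
  set_engine("xgboost", eval_metric = "mae")

# Discouraged: wrap the same element in a `params` list; per the PR
# description this should still work but is expected to raise a warning
spec_params <-
  boost_tree(trees = 50) %>%
  set_mode("regression") %>%
  set_engine("xgboost", params = list(eval_metric = "mae"))

# Fitting requires the xgboost package
fit_direct <- fit(spec_direct, mpg ~ ., data = mtcars)
fit_params <- fit(spec_params, mpg ~ ., data = mtcars)
```

Only the first form exposes `eval_metric` as its own engine argument, which is what lets the tuning example below replace the value with `tune()`.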

Here is an additional unit test checking that arguments passed via ... to xgb_train can indeed be tuned; I will PR it to extratests after this is merged:

# with this devel parsnip:
library(tidymodels)

# define base model spec
spec_base <-
  boost_tree() %>%
  set_mode("regression") %>%
  set_engine("xgboost", eval_metric = tune())

res <-
  tune_grid(
    spec_base,
    preprocessor = mpg ~ .,
    resamples = vfold_cv(mtcars, v = 6),
    param_info =
      extract_parameter_set_dials(spec_base) %>%
      update(eval_metric = new_qual_param(
        type = "character",
        values = c("rmse", "logloss"),
        label = c(eval_metric = "Evaluation Metric")
      )),
    control = control_grid(save_workflow = TRUE)
  )

res_eval_values <-
  collect_metrics(res) %>%
  pull(eval_metric)

all(c("logloss", "rmse") %in% res_eval_values)
#> [1] TRUE

Created on 2022-08-15 by the reprex package (v2.0.1)

@topepo
Member

topepo commented Aug 17, 2022

I'm ready to merge this but it breaks a package. Can you investigate and, if needed, add a PR (or issue) to https://github.com/Harrison4192/autostats/issues?

autostats

Run revdep_details(, "autostats") for more info

Newly broken

  • checking examples ... ERROR
    Running examples in ‘autostats-Ex.R’ failed
    The error most likely occurred in:
    
    > ### Name: get_params
    > ### Title: get params
    > ### Aliases: get_params get_params.xgb.Booster
    > 
    > ### ** Examples
    > 
    > 
    ...
    > iris_dummies %>%
    +   tidy_formula(target = Petal.Length) -> p_form
    > 
    > iris_dummies %>%
    +   tidy_xgboost(p_form, mtry = .5, trees = 5L, loss_reduction = 2, sample_size = .7) -> xgb
    Warning: `early_stop` was reduced to 4.
    Warning: `early_stop` was reduced to 4.
    Error in if (objective == "multi:softmax") { : argument is of length zero
    Calls: %>% ... withCallingHandlers -> %>% -> tidy_predict -> tidy_predict.xgb.Booster
    Execution halted
    

@simonpcouch
Contributor Author

Yup, sure thing.

@simonpcouch
Contributor Author

Submitted a fix!👍

@topepo topepo merged commit 6c5482a into main Aug 17, 2022
@topepo topepo deleted the boost-tree-params-774 branch August 17, 2022 23:32
@github-actions

github-actions bot commented Sep 1, 2022

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Sep 1, 2022