Pass parameters to `evaluate_plan` through a grid, rather than a series of vectors #235

AlexAxthelm · 2018-02-05T09:17:42Z

An issue that I keep running into with evaluate_plan() is that setting up incomplete multiples is kind of a pain. As an example, if I have three schools that I want to run an analysis on, I might have something along these lines:

hard_plan <- drake_plan(
  credits = check_credit_hours(school__),
  students = check_students(school__),
  grads = check_graduations(school__),
  public_funds = check_public_funding(school__)
)

evaluate_plan(
  hard_plan, 
  rules = list(school__ = c("schoolA", "schoolB", "schoolC"))
)

                 target                       command
1       credits_schoolA   check_credit_hours(schoolA)
2       credits_schoolB   check_credit_hours(schoolB)
3       credits_schoolC   check_credit_hours(schoolC)
4      students_schoolA       check_students(schoolA)
5      students_schoolB       check_students(schoolB)
6      students_schoolC       check_students(schoolC)
7         grads_schoolA    check_graduations(schoolA)
8         grads_schoolB    check_graduations(schoolB)
9         grads_schoolC    check_graduations(schoolC)
10 public_funds_schoolA check_public_funding(schoolA)
11 public_funds_schoolB check_public_funding(schoolB)
12 public_funds_schoolC check_public_funding(schoolC) ## This will throw an error

Except schoolC will throw an error on check_public_funding because it doesn't receive any public funds. So at this point, I have a few options:

make 2 drake_plans, one for schoolC, and then another for everybody else. 🙅‍♂️
Use evaluate_plan, as above, and then use something like dplyr::filter to prune away everything I don't want. Works okay for small numbers of exceptions, but doesn't scale well.
Pass a second school_type__ argument to each of my functions, which returns a NULL when appropriate. Not a perfect solution, but it (ideally) makes the drake plan easy to maintain, and if I put the return(NULL) early in the function, it's not a huge time sink overall.
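The third option could look roughly like the sketch below. The function body and the `school_type` argument name are hypothetical stand-ins; the only point is the early `return(NULL)` before any expensive work.

```r
# Hypothetical stand-in for the real check; only the early exit matters here.
check_public_funding <- function(school, school_type) {
  # Bail out before any expensive work for schools without public funding.
  if (!identical(school_type, "public")) {
    return(NULL)
  }
  # Placeholder for the real analysis (assumed, not from the original post).
  paste("public funding report for", school)
}

check_public_funding("schoolA", "public")
check_public_funding("schoolC", "nonpublic")
```

The target for schoolC still exists in the plan, but building it costs almost nothing.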

But, there isn't a great way to pass those arguments such that they match:

very_wrong <- evaluate_plan(
  better_plan,
  rules = list(
    school__ = c("schoolA", "schoolB", "schoolC"),
    school_type__ = c("public", "public", "nonpublic")
    ),
  expand = TRUE
)
print(very_wrong) # this makes each school both a public and a nonpublic, and tries to evaluate both. I could use filter, but again, won't scale well. Further, makes duplicate targets

also_wrong <- evaluate_plan(
  better_plan,
  rules = list(
    school__ = c("schoolA", "schoolB", "schoolC"),
    school_type__ = c("public", "public", "nonpublic")
  ),
  expand = FALSE
)
print(also_wrong) #This correctly matches schools and school_types, but doesn't work to actually *expand* the plan.

Ideally I would have something like this:

matched_rules = tibble::tibble( #could also define a tribble
  school__ = c("schoolA", "schoolB", "schoolC"),
  school_type__ = c("public", "public", "nonpublic")
)

working_master_plan <- evaluate_plan(
  better_plan,
  rules = matched_rules,
  expand = TRUE
)

Currently, this evaluates to the same as very_wrong above. I'm not sure if the best option here would be to change default behaviors for rectangular objects passed to rules, or maybe add a matched_arguments flag in evaluate_plan, so that it can understand that not all expansions go with each other. Also, maybe I'm on a weird edge case, and a clarification on best practices around evaluate_plan would be helpful?
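The difference between the two behaviors can be illustrated without drake at all. This is a minimal base R sketch of the two grids of wildcard values, assuming that full expansion amounts to a cross product of the rule vectors; it does not call evaluate_plan().

```r
schools <- c("schoolA", "schoolB", "schoolC")
types   <- c("public", "public", "nonpublic")

# Full cross product: every school paired with every type value (3 x 3 rows).
crossed <- expand.grid(school__ = schools, school_type__ = types,
                       stringsAsFactors = FALSE)

# Matched rows: each school keeps its own type (3 rows).
matched <- data.frame(school__ = schools, school_type__ = types,
                      stringsAsFactors = FALSE)

nrow(crossed)
nrow(matched)

# The crossed grid also repeats (school, type) pairs, because "public"
# appears twice in the type vector.
any(duplicated(crossed))
```

The matched grid is the one that keeps each school tied to its own type.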

I think this is relevant for #228 and #233.


wlandau · 2018-02-05T13:52:30Z

I see the general picture of what you're saying, and I'm trying to wrap my head around how we would solve it. It sounds like you want one wildcard for the expansion and the others to go along for the ride. How close is this to what you're after:

library(magrittr)
drake_plan(
  credits = check_credit_hours("school_", "funding_"),
  students = check_students("school_", "funding_"),
  grads = check_graduations("school_", "funding_"),
  public_funds = check_public_funding("school_", "funding_"),
  strings_in_dots = "literals"
) %>% evaluate_plan(
    wildcard = "school_",
    values = c("schoolA", "schoolB", "schoolC"),
    expand = TRUE
  ) %>%
  evaluate_plan(
    wildcard = "funding_",
    values = c("public", "public", "private"),
    expand = FALSE
  )

#>                  target                                    command
#> 1       credits_schoolA    check_credit_hours("schoolA", "public")
#> 2       credits_schoolB    check_credit_hours("schoolB", "public")
#> 3       credits_schoolC   check_credit_hours("schoolC", "private")
#> 4      students_schoolA        check_students("schoolA", "public")
#> 5      students_schoolB        check_students("schoolB", "public")
#> 6      students_schoolC       check_students("schoolC", "private")
#> 7         grads_schoolA     check_graduations("schoolA", "public")
#> 8         grads_schoolB     check_graduations("schoolB", "public")
#> 9         grads_schoolC    check_graduations("schoolC", "private")
#> 10 public_funds_schoolA  check_public_funding("schoolA", "public")
#> 11 public_funds_schoolB  check_public_funding("schoolB", "public")
#> 12 public_funds_schoolC check_public_funding("schoolC", "private")

AlexAxthelm · 2018-02-05T15:46:01Z

This is perfect. It works well for a simple one-to-one matchup between targets, like above, and more complicated many-to-one matchups can be resolved with the same pair of evaluate_plan() calls, replacing values with rules:

rules_grid <- tibble(
  school_ =  c("schoolA", "schoolB", "schoolC"),
  funding_ = c("public", "public", "private"),
) %>% 
crossing(cohort_ = c("2012", "2013", "2014", "2015")) %>%
filter(!(school_ == "schoolB" & cohort_ %in% c("2012", "2013"))) %>%
print()


drake_plan(
  credits = check_credit_hours("school_", "funding_", "cohort_"),
  students = check_students("school_", "funding_", "cohort_"),
  grads = check_graduations("school_", "funding_", "cohort_"),
  public_funds = check_public_funding("school_", "funding_", "cohort_"),
  strings_in_dots = "literals"
) %>% evaluate_plan(
    wildcard = "school_",
    values = rules_grid$school_,
    expand = TRUE
  ) %>%
  evaluate_plan(
    wildcard = "funding_",
    rules = rules_grid,
    expand = FALSE
  )

In the example above, I have schoolB reporting data for only a subset of the years, but filtering the missing years out, or constructing the rules_grid in some other way, lets me build this however I need.

Thanks! 👍

krlmlr · 2018-02-05T15:51:40Z

Do we have a "usage patterns" vignette or section where we could document this?

wlandau · 2018-02-05T17:22:30Z

I think the best practices vignette is the right place. Reopening because it's now a documentation issue.

wlandau · 2018-02-11T03:54:33Z

Thanks again @AlexAxthelm! Your example is great, and I have appended a section in the best practices vignette.

jw5 · 2018-05-27T20:44:56Z

Unfortunately, the plan generated here and documented in best practices is not a valid Drake plan as it contains duplicate target names. I took a stab at a version with unique names (appended year), but I'm not happy with the solution:

# Possible solution: #235 Modified to generate unique targets as required.

rules_grid <- tibble(
  # The schools and their funding types.
  # Note that this solution does not handle the case of a school switching type!
  school_ =  c("schoolA", "schoolB", "schoolC"),
  funding_ = c("public", "public", "private"),
) %>%
  # Generate the full cross product of (school,funding)x(years)
  crossing(cohort_ = c("2012", "2013", "2014", "2015")) %>%
  # Remove the two years school B didn't exist.
  filter(!(school_ == "schoolB" & cohort_ %in% c("2012", "2013"))) %>%
  # Confirm the correct plan template.
  print()

plan <- drake_plan(
  # Start with the four types of checks to perform
  credits = check_credit_hours("school_", "funding_", "cohort_"),
  students = check_students("school_", "funding_", "cohort_"),
  grads = check_graduations("school_", "funding_", "cohort_"),
  public_funds = check_public_funding("school_", "funding_", "cohort_"),
  strings_in_dots = "literals"
) %>% expand_plan(
  # Use a forced expansion with a target suffix defined by school_year.
  # I don't really like this solution but I couldn't think of a better one :-(
  # Note that this duplicates each target 10 times for a total of 40.
  # However, no parameter substitution is done, that is fixed in the next step.
  values = paste(rules_grid$school_, rules_grid$cohort_, sep = "_")
) %>% evaluate_plan(
  # Finally, substitute the correct parameter values into the commands.
  # Note that since each target is duplicated 10 times, they each get a full
  # complement of parameter values which are used repeatedly a total of 4 times.
  rules = rules_grid,
  expand = FALSE
)
print(plan, n = 40)

# Confirm dependencies and parameter mappings.
config <- drake_config(plan)
vis_drake_graph(config)
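For comparison, the same row-by-row substitution can be sketched in base R without drake. This is only an illustration of the intended result (one uniquely named target per template and grid row); `templates`, `grid`, and `fill_row` are hypothetical names, and the code does not reproduce how expand_plan() or evaluate_plan() work internally.

```r
# Template commands, keyed by target prefix (wildcards are plain strings).
templates <- c(
  credits      = 'check_credit_hours("school_", "funding_", "cohort_")',
  students     = 'check_students("school_", "funding_", "cohort_")',
  grads        = 'check_graduations("school_", "funding_", "cohort_")',
  public_funds = 'check_public_funding("school_", "funding_", "cohort_")'
)

# Same grid as above, built with base R: 3 schools x 4 cohorts, minus 2 rows.
schools <- data.frame(
  school_  = c("schoolA", "schoolB", "schoolC"),
  funding_ = c("public", "public", "private"),
  stringsAsFactors = FALSE
)
cohorts <- data.frame(cohort_ = c("2012", "2013", "2014", "2015"),
                      stringsAsFactors = FALSE)
grid <- merge(schools, cohorts)  # no shared columns, so this is a cross join
grid <- grid[!(grid$school_ == "schoolB" & grid$cohort_ %in% c("2012", "2013")), ]

# Substitute every wildcard in one command with the values from one grid row.
fill_row <- function(command, row) {
  for (wildcard in names(row)) {
    command <- gsub(wildcard, row[[wildcard]], command, fixed = TRUE)
  }
  command
}

# One output row per (template, grid row) pair, with a unique target name.
plan <- do.call(rbind, lapply(names(templates), function(prefix) {
  data.frame(
    target  = paste(prefix, grid$school_, grid$cohort_, sep = "_"),
    command = vapply(seq_len(nrow(grid)),
                     function(i) fill_row(templates[[prefix]], grid[i, ]),
                     character(1)),
    stringsAsFactors = FALSE
  )
}))

nrow(plan)
any(duplicated(plan$target))
```

Because the target suffix is built from the same grid row as the command, names and parameters cannot drift apart.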

jw5 · 2018-05-27T21:09:09Z

Here is an updated solution that deals with avoiding applying public only functions on private schools and allows for schools to switch from public to private at any time.

# Possible solution: #235 Modified to generate unique targets as required.
# Version two: deal with avoiding public checks on private schools.
# Note that this solution can now handle the case of a school switching type.

rules_grid <- tibble(
  # The schools and their funding types.
  school_ =  c("schoolA", "schoolB", "schoolC"),
  funding_ = c("public", "public", "private"),
) %>%
  # Generate the full cross product of (school,funding)x(years)
  crossing(cohort_ = c("2012", "2013", "2014", "2015")) %>%
  # Remove the two years school B didn't exist.
  filter(!(school_ == "schoolB" & cohort_ %in% c("2012", "2013")))
# Make schoolC switch funding each year
rules_grid$funding_[rules_grid$school_ == "schoolC"] <-
  c("public", "private", "public", "private")
# Confirm the correct plan template.
print(rules_grid)

plan_both <- drake_plan(
  # Start with the three universal types of checks to perform (public or private)
  credits = check_credit_hours("school_", "funding_", "cohort_"),
  students = check_students("school_", "funding_", "cohort_"),
  grads = check_graduations("school_", "funding_", "cohort_"),
  # Leave this for later.
  #public_funds = check_public_funding("school_", "funding_", "cohort_"),
  strings_in_dots = "literals"
) %>% expand_plan(
  # Use a forced expansion with a target suffix defined by school_year.
  # I don't really like this solution but I couldn't think of a better one :-(
  # Note that this duplicates each target 10 times for a total of 30.
  # However, no parameter substitution is done, that is fixed in the next step.
  values = paste(rules_grid$school_, rules_grid$cohort_, sep = "_")
) %>% evaluate_plan(
  # Finally, substitute the correct parameter values into the commands.
  # Note that since each target is duplicated 10 times, they each get a full
  # complement of parameter values which are used repeatedly a total of 3 times.
  rules = rules_grid,
  expand = FALSE
)
print(plan_both, n = 30)

# Next get the rules for just the public schools. Note that a school could change
# from public to private or vice versa in any year and this still works.
public_rules_grid <- rules_grid %>% filter(funding_ == "public")
print(public_rules_grid)

# Build the public only plans
plan_public <- drake_plan(
  # Include the public only checks that shouldn't be run on private schools.
  public_funds = check_public_funding("school_", "funding_", "cohort_"),
  strings_in_dots = "literals"
) %>% expand_plan(
  # Use a forced expansion with a target suffix defined by school_year.
  # I don't really like this solution but I couldn't think of a better one :-(
  # Note that this duplicates each target 8 times for a total of 8.
  # However, no parameter substitution is done, that is fixed in the next step.
  values = paste(public_rules_grid$school_, public_rules_grid$cohort_, sep = "_")
) %>% evaluate_plan(
  # Finally, substitute the correct parameter values into the commands.
  # Note that since each target is duplicated 8 times, they each get a full
  # complement of parameter values which are used a total of 1 time.
  rules = public_rules_grid,
  expand = FALSE
)
print(plan_public, n = 8)

# Combine the both and public only plans together
plan <- bind_plans(plan_both, plan_public)
# Note that no check_public_funding is ever performed on schoolC in odd years.

# Confirm dependencies and parameter mappings.
config <- drake_config(plan)
vis_drake_graph(config)

wlandau · 2018-05-27T22:59:37Z

@jw5, glad you're helping us with slick ways to generate plans.

> Unfortunately, the plan generated here and documented in best practices is not a valid Drake plan as it contains duplicate target names.

Are you talking about the plan at the end of this section? Because there, I think we're fine. Here's a reprex.

library(drake)
library(tidyverse)
#> ── Attaching packages ───────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
#> ✔ tibble  1.4.2     ✔ dplyr   0.7.5
#> ✔ tidyr   0.8.1     ✔ stringr 1.3.1
#> ✔ readr   1.1.1     ✔ forcats 0.3.0
#> ── Conflicts ──────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ tidyr::expand() masks drake::expand()
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ tidyr::gather() masks drake::gather()
#> ✖ dplyr::lag()    masks stats::lag()

# Generate the plan from the end of
# https://ropensci.github.io/drake/articles/best-practices.html#generating-workflow-plan-data-frames

rules_grid <- tibble::tibble(school_ = c("schoolA", "schoolB", "schoolC"), funding_ = c("public", 
  "public", "private"), ) %>% tidyr::crossing(cohort_ = c("2012", "2013", 
  "2014", "2015")) %>% dplyr::filter(!(school_ == "schoolB" & cohort_ %in% 
  c("2012", "2013"))) %>% print()
#> # A tibble: 10 x 3
#>    school_ funding_ cohort_
#>    <chr>   <chr>    <chr>  
#>  1 schoolA public   2012   
#>  2 schoolA public   2013   
#>  3 schoolA public   2014   
#>  4 schoolA public   2015   
#>  5 schoolB public   2014   
#>  6 schoolB public   2015   
#>  7 schoolC private  2012   
#>  8 schoolC private  2013   
#>  9 schoolC private  2014   
#> 10 schoolC private  2015

plan <- drake_plan(credits = check_credit_hours("school_", "funding_", "cohort_"), 
  students = check_students("school_", "funding_", "cohort_"), grads = check_graduations("school_", 
    "funding_", "cohort_"), public_funds = check_public_funding("school_", 
    "funding_", "cohort_"), strings_in_dots = "literals") %>% evaluate_plan(wildcard = "school_", 
  values = rules_grid$school_, expand = TRUE) %>% evaluate_plan(wildcard = "funding_", 
  rules = rules_grid, expand = FALSE)

# Do we have duplicate targets?
any(duplicated(plan$target))
#> [1] FALSE

jw5 · 2018-05-28T18:49:59Z

Well, when I cut and paste your reprex into a fresh RStudio session and source it, I get:

```R
# Do we have duplicate targets?
any(duplicated(plan$target))
[1] TRUE

drake_session()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux buster/sid

Matrix products: default
BLAS: /opt/R/3.4.4/lib/R/lib/libRblas.so
LAPACK: /opt/R/3.4.4/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] bindrcpp_0.2.2 ggplot2_2.2.1  knitr_1.20     Ecdat_0.3-1    Ecfun_0.1-7
[6] drake_5.1.2

loaded via a namespace (and not attached):
 [1] storr_1.1.3        tidyselect_0.2.4   purrr_0.2.4        listenv_0.7.0
 [5] splines_3.4.4      lattice_0.20-35    colorspace_1.3-2   testthat_2.0.0
 [9] htmltools_0.3.6    yaml_2.1.19        XML_3.98-1.11      rlang_0.2.0
[13] R.oo_1.22.0        pillar_1.2.2       glue_1.2.0         withr_2.1.2
[17] R.utils_2.6.0      CodeDepends_0.5-3  jpeg_0.1-8         bindr_0.1.1
[21] plyr_1.8.4         stringr_1.3.0      munsell_0.4.3      gtable_0.2.0
[25] R.methodsS3_1.7.1  visNetwork_2.0.3   future_1.8.1       htmlwidgets_1.2
[29] codetools_0.2-15   evaluate_0.10.1    parallel_3.4.4     Rcpp_0.12.16
[33] scales_0.5.0       backports_1.1.2    formatR_1.5        jsonlite_1.5
[37] TeachingDemos_2.10 digest_0.6.15      stringi_1.2.2      dplyr_0.7.4
[41] rprojroot_1.3-2    grid_3.4.4         tools_3.4.4        magrittr_1.5
[45] lazyeval_0.2.1     tibble_1.4.2       crayon_1.3.4       future.apply_0.2.0
[49] pkgconfig_2.0.1    MASS_7.3-50        fda_2.4.7          Matrix_1.2-14
[53] lubridate_1.7.4    assertthat_0.2.0   rstudioapi_0.7     R6_2.2.2
[57] globals_0.11.0     igraph_1.2.1       compiler_3.4.4
```

I then cleaned up the formatting, added some print statements, and nuked the cache at the start. Nuking the cache apparently broke drake_session().

```R
library(drake)
library(tidyverse)
#> ── Attaching packages ───────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
#> ✔ tibble  1.4.2     ✔ dplyr   0.7.5
#> ✔ tidyr   0.8.1     ✔ stringr 1.3.1
#> ✔ readr   1.1.1     ✔ forcats 0.3.0
#> ── Conflicts ──────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ tidyr::expand() masks drake::expand()
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ tidyr::gather() masks drake::gather()
#> ✖ dplyr::lag()    masks stats::lag()

# Added to make it more reproducible:
clean(destroy = TRUE)

# Generate the plan from the end of
# https://ropensci.github.io/drake/articles/best-practices.html#generating-workflow-plan-data-frames

rules_grid <- tibble::tibble(
  school_ = c("schoolA", "schoolB", "schoolC"),
  funding_ = c("public", "public", "private"),
) %>%
  tidyr::crossing(cohort_ = c("2012", "2013", "2014", "2015")) %>%
  dplyr::filter(!(school_ == "schoolB" & cohort_ %in% c("2012", "2013"))) %>%
  print()
#> # A tibble: 10 x 3
#>    school_ funding_ cohort_
#>    <chr>   <chr>    <chr>
#>  1 schoolA public   2012
#>  2 schoolA public   2013
#>  3 schoolA public   2014
#>  4 schoolA public   2015
#>  5 schoolB public   2014
#>  6 schoolB public   2015
#>  7 schoolC private  2012
#>  8 schoolC private  2013
#>  9 schoolC private  2014
#> 10 schoolC private  2015

plan <- drake_plan(
  credits = check_credit_hours("school_", "funding_", "cohort_"),
  students = check_students("school_", "funding_", "cohort_"),
  grads = check_graduations("school_", "funding_", "cohort_"),
  public_funds = check_public_funding("school_", "funding_", "cohort_"),
  strings_in_dots = "literals") %>%
  evaluate_plan(wildcard = "school_", values = rules_grid$school_, expand = TRUE) %>%
  # Note that technically there shouldn't be a "wildcard" as it's overridden by rules.
  evaluate_plan(wildcard = "funding_", rules = rules_grid, expand = FALSE)

# Do we have duplicate targets?
print(plan, n = 100)
print(any(duplicated(plan$target)))
#> [1] FALSE

drake_session()
```

Sourcing this in a restarted R session yields:

```R
Restarting R session...

source('~/Analyses/New/Drake/test2/reprex.R')
── Attaching packages ────────────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 2.2.1     ✔ purrr   0.2.4
✔ tibble  1.4.2     ✔ dplyr   0.7.4
✔ tidyr   0.8.0     ✔ stringr 1.3.0
✔ readr   1.1.1     ✔ forcats 0.3.0
── Conflicts ───────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::contains()    masks drake::contains()
✖ dplyr::ends_with()   masks drake::ends_with()
✖ dplyr::everything()  masks drake::everything()
✖ tidyr::expand()      masks drake::expand()
✖ dplyr::filter()      masks stats::filter()
✖ tidyr::gather()      masks drake::gather()
✖ dplyr::lag()         masks stats::lag()
✖ dplyr::matches()     masks drake::matches()
✖ dplyr::num_range()   masks drake::num_range()
✖ dplyr::one_of()      masks drake::one_of()
✖ dplyr::starts_with() masks drake::starts_with()
# A tibble: 10 x 3
   school_ funding_ cohort_
   <chr>   <chr>    <chr>
 1 schoolA public   2012
 2 schoolA public   2013
 3 schoolA public   2014
 4 schoolA public   2015
 5 schoolB public   2014
 6 schoolB public   2015
 7 schoolC private  2012
 8 schoolC private  2013
 9 schoolC private  2014
10 schoolC private  2015
# A tibble: 40 x 2
   target               command
   <chr>                <chr>
 1 credits_schoolA      "check_credit_hours(\"schoolA\", \"public\", \"2012\")"
 2 credits_schoolA      "check_credit_hours(\"schoolA\", \"public\", \"2013\")"
 3 credits_schoolA      "check_credit_hours(\"schoolA\", \"public\", \"2014\")"
 4 credits_schoolA      "check_credit_hours(\"schoolA\", \"public\", \"2015\")"
 5 credits_schoolB      "check_credit_hours(\"schoolB\", \"public\", \"2014\")"
 6 credits_schoolB      "check_credit_hours(\"schoolB\", \"public\", \"2015\")"
 7 credits_schoolC      "check_credit_hours(\"schoolC\", \"private\", \"2012\")"
 8 credits_schoolC      "check_credit_hours(\"schoolC\", \"private\", \"2013\")"
 9 credits_schoolC      "check_credit_hours(\"schoolC\", \"private\", \"2014\")"
10 credits_schoolC      "check_credit_hours(\"schoolC\", \"private\", \"2015\")"
11 students_schoolA     "check_students(\"schoolA\", \"public\", \"2012\")"
12 students_schoolA     "check_students(\"schoolA\", \"public\", \"2013\")"
13 students_schoolA     "check_students(\"schoolA\", \"public\", \"2014\")"
14 students_schoolA     "check_students(\"schoolA\", \"public\", \"2015\")"
15 students_schoolB     "check_students(\"schoolB\", \"public\", \"2014\")"
16 students_schoolB     "check_students(\"schoolB\", \"public\", \"2015\")"
17 students_schoolC     "check_students(\"schoolC\", \"private\", \"2012\")"
18 students_schoolC     "check_students(\"schoolC\", \"private\", \"2013\")"
19 students_schoolC     "check_students(\"schoolC\", \"private\", \"2014\")"
20 students_schoolC     "check_students(\"schoolC\", \"private\", \"2015\")"
21 grads_schoolA        "check_graduations(\"schoolA\", \"public\", \"2012\")"
22 grads_schoolA        "check_graduations(\"schoolA\", \"public\", \"2013\")"
23 grads_schoolA        "check_graduations(\"schoolA\", \"public\", \"2014\")"
24 grads_schoolA        "check_graduations(\"schoolA\", \"public\", \"2015\")"
25 grads_schoolB        "check_graduations(\"schoolB\", \"public\", \"2014\")"
26 grads_schoolB        "check_graduations(\"schoolB\", \"public\", \"2015\")"
27 grads_schoolC        "check_graduations(\"schoolC\", \"private\", \"2012\")"
28 grads_schoolC        "check_graduations(\"schoolC\", \"private\", \"2013\")"
29 grads_schoolC        "check_graduations(\"schoolC\", \"private\", \"2014\")"
30 grads_schoolC        "check_graduations(\"schoolC\", \"private\", \"2015\")"
31 public_funds_schoolA "check_public_funding(\"schoolA\", \"public\", \"2012\")"
32 public_funds_schoolA "check_public_funding(\"schoolA\", \"public\", \"2013\")"
33 public_funds_schoolA "check_public_funding(\"schoolA\", \"public\", \"2014\")"
34 public_funds_schoolA "check_public_funding(\"schoolA\", \"public\", \"2015\")"
35 public_funds_schoolB "check_public_funding(\"schoolB\", \"public\", \"2014\")"
36 public_funds_schoolB "check_public_funding(\"schoolB\", \"public\", \"2015\")"
37 public_funds_schoolC "check_public_funding(\"schoolC\", \"private\", \"2012\")"
38 public_funds_schoolC "check_public_funding(\"schoolC\", \"private\", \"2013\")"
39 public_funds_schoolC "check_public_funding(\"schoolC\", \"private\", \"2014\")"
40 public_funds_schoolC "check_public_funding(\"schoolC\", \"private\", \"2015\")"
[1] TRUE
Error in drake_session() : No drake::make() session detected.
```

I guess I'll do a devtools github install and see if that fixes things. Darn, I really thought I had figured out the way to reason about the various plan evaluations.

jw5 · 2018-05-28T19:14:16Z

Well, good news and bad news. I did a github install of drake and the results changed. This is sourcing the same file supplied in the last message.

```R
Restarting R session...

> library(devtools)
> install_github("ropensci/drake")
Skipping install of 'drake' from a github remote, the SHA1 (aefa7a5) has not changed since last install.
  Use `force = TRUE` to force installation
> source('~/Analyses/New/Drake/test2/reprex.R')

Attaching package: ‘drake’

The following object is masked from ‘package:devtools’:

    check

── Attaching packages ────────────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 2.2.1     ✔ purrr   0.2.4
✔ tibble  1.4.2     ✔ dplyr   0.7.5
✔ tidyr   0.8.0     ✔ stringr 1.3.0
✔ readr   1.1.1     ✔ forcats 0.3.0
── Conflicts ───────────────────────────────────────────────── tidyverse_conflicts() ──
✖ tidyr::expand() masks drake::expand()
✖ dplyr::filter() masks stats::filter()
✖ tidyr::gather() masks drake::gather()
✖ dplyr::lag()    masks stats::lag()
# A tibble: 10 x 3
   school_ funding_ cohort_
   <chr>   <chr>    <chr>
 1 schoolA public   2012
 2 schoolA public   2013
 3 schoolA public   2014
 4 schoolA public   2015
 5 schoolB public   2014
 6 schoolB public   2015
 7 schoolC private  2012
 8 schoolC private  2013
 9 schoolC private  2014
10 schoolC private  2015
# A tibble: 12 x 2
   target               command
   <chr>                <chr>
 1 credits_schoolA      "check_credit_hours(\"schoolA\", \"public\", \"2012\")"
 2 credits_schoolB      "check_credit_hours(\"schoolB\", \"public\", \"2013\")"
 3 credits_schoolC      "check_credit_hours(\"schoolC\", \"public\", \"2014\")"
 4 students_schoolA     "check_students(\"schoolA\", \"public\", \"2015\")"
 5 students_schoolB     "check_students(\"schoolB\", \"public\", \"2014\")"
 6 students_schoolC     "check_students(\"schoolC\", \"public\", \"2015\")"
 7 grads_schoolA        "check_graduations(\"schoolA\", \"private\", \"2012\")"
 8 grads_schoolB        "check_graduations(\"schoolB\", \"private\", \"2013\")"
 9 grads_schoolC        "check_graduations(\"schoolC\", \"private\", \"2014\")"
10 public_funds_schoolA "check_public_funding(\"schoolA\", \"private\", \"2015\")"
11 public_funds_schoolB "check_public_funding(\"schoolB\", \"public\", \"2012\")"
12 public_funds_schoolC "check_public_funding(\"schoolC\", \"public\", \"2013\")"
[1] FALSE
Error in drake_session() : No drake::make() session detected.
```

So, yes, there are no dups now. However, the plan is not what was intended, as you are "randomly" selecting between checks and years and funding based on "wrapping" the years and funding rather than checking each year for each school. Note how public/private is no longer fixed to the school as defined in the tibble.

It really would be nice if drake_session() didn't fail when no make session is detected but produced only a warning.

Jim

jw5 · 2018-05-28T19:34:30Z

Sorry, responded by email and now no way to fix up the formatting.

Bottom line, your best practices solution does pass the no-dups test, but yields:

# A tibble: 12 x 2
   target               command                                                   
   <chr>                <chr>                                                     
 1 credits_schoolA      "check_credit_hours(\"schoolA\", \"public\", \"2012\")"   
 2 credits_schoolB      "check_credit_hours(\"schoolB\", \"public\", \"2013\")"   
 3 credits_schoolC      "check_credit_hours(\"schoolC\", \"public\", \"2014\")"   
 4 students_schoolA     "check_students(\"schoolA\", \"public\", \"2015\")"       
 5 students_schoolB     "check_students(\"schoolB\", \"public\", \"2014\")"       
 6 students_schoolC     "check_students(\"schoolC\", \"public\", \"2015\")"       
 7 grads_schoolA        "check_graduations(\"schoolA\", \"private\", \"2012\")"   
 8 grads_schoolB        "check_graduations(\"schoolB\", \"private\", \"2013\")"   
 9 grads_schoolC        "check_graduations(\"schoolC\", \"private\", \"2014\")"   
10 public_funds_schoolA "check_public_funding(\"schoolA\", \"private\", \"2015\")"
11 public_funds_schoolB "check_public_funding(\"schoolB\", \"public\", \"2012\")" 
12 public_funds_schoolC "check_public_funding(\"schoolC\", \"public\", \"2013\")"

While I believe it should yield (from my original solution proposal):

# A tibble: 40 x 2
   target                    command                                                  
   <chr>                     <chr>                                                    
 1 credits_schoolA_2012      "check_credit_hours(\"schoolA\", \"public\", \"2012\")"  
 2 credits_schoolA_2013      "check_credit_hours(\"schoolA\", \"public\", \"2013\")"  
 3 credits_schoolA_2014      "check_credit_hours(\"schoolA\", \"public\", \"2014\")"  
 4 credits_schoolA_2015      "check_credit_hours(\"schoolA\", \"public\", \"2015\")"  
 5 credits_schoolB_2014      "check_credit_hours(\"schoolB\", \"public\", \"2014\")"  
 6 credits_schoolB_2015      "check_credit_hours(\"schoolB\", \"public\", \"2015\")"  
 7 credits_schoolC_2012      "check_credit_hours(\"schoolC\", \"private\", \"2012\")" 
 8 credits_schoolC_2013      "check_credit_hours(\"schoolC\", \"private\", \"2013\")" 
 9 credits_schoolC_2014      "check_credit_hours(\"schoolC\", \"private\", \"2014\")" 
10 credits_schoolC_2015      "check_credit_hours(\"schoolC\", \"private\", \"2015\")" 
11 students_schoolA_2012     "check_students(\"schoolA\", \"public\", \"2012\")"      
12 students_schoolA_2013     "check_students(\"schoolA\", \"public\", \"2013\")"      
13 students_schoolA_2014     "check_students(\"schoolA\", \"public\", \"2014\")"      
14 students_schoolA_2015     "check_students(\"schoolA\", \"public\", \"2015\")"      
15 students_schoolB_2014     "check_students(\"schoolB\", \"public\", \"2014\")"      
16 students_schoolB_2015     "check_students(\"schoolB\", \"public\", \"2015\")"      
17 students_schoolC_2012     "check_students(\"schoolC\", \"private\", \"2012\")"     
18 students_schoolC_2013     "check_students(\"schoolC\", \"private\", \"2013\")"     
19 students_schoolC_2014     "check_students(\"schoolC\", \"private\", \"2014\")"     
20 students_schoolC_2015     "check_students(\"schoolC\", \"private\", \"2015\")"     
21 grads_schoolA_2012        "check_graduations(\"schoolA\", \"public\", \"2012\")"   
22 grads_schoolA_2013        "check_graduations(\"schoolA\", \"public\", \"2013\")"   
23 grads_schoolA_2014        "check_graduations(\"schoolA\", \"public\", \"2014\")"   
24 grads_schoolA_2015        "check_graduations(\"schoolA\", \"public\", \"2015\")"   
25 grads_schoolB_2014        "check_graduations(\"schoolB\", \"public\", \"2014\")"   
26 grads_schoolB_2015        "check_graduations(\"schoolB\", \"public\", \"2015\")"   
27 grads_schoolC_2012        "check_graduations(\"schoolC\", \"private\", \"2012\")"  
28 grads_schoolC_2013        "check_graduations(\"schoolC\", \"private\", \"2013\")"  
29 grads_schoolC_2014        "check_graduations(\"schoolC\", \"private\", \"2014\")"  
30 grads_schoolC_2015        "check_graduations(\"schoolC\", \"private\", \"2015\")"  
31 public_funds_schoolA_2012 "check_public_funding(\"schoolA\", \"public\", \"2012\")"
32 public_funds_schoolA_2013 "check_public_funding(\"schoolA\", \"public\", \"2013\")"
33 public_funds_schoolA_2014 "check_public_funding(\"schoolA\", \"public\", \"2014\")"
34 public_funds_schoolA_2015 "check_public_funding(\"schoolA\", \"public\", \"2015\")"
35 public_funds_schoolB_2014 "check_public_funding(\"schoolB\", \"public\", \"2014\")"
36 public_funds_schoolB_2015 "check_public_funding(\"schoolB\", \"public\", \"2015\")"
37 public_funds_schoolC_2012 "check_public_funding(\"schoolC\", \"private\", \"2012\"…
38 public_funds_schoolC_2013 "check_public_funding(\"schoolC\", \"private\", \"2013\"…
39 public_funds_schoolC_2014 "check_public_funding(\"schoolC\", \"private\", \"2014\"…
40 public_funds_schoolC_2015 "check_public_funding(\"schoolC\", \"private\", \"2015\"…
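The "wrapping" in the 12-row result is consistent with ordinary vector recycling of the 10 grid values over 12 plan rows. The base R sketch below reproduces the funding and cohort columns of that table; it is a guess at the mechanism from the printed output, not a reading of drake's source.

```r
# Wildcard columns of the 10-row grid, in printed order.
funding <- c(rep("public", 6), rep("private", 4))
cohort  <- c("2012", "2013", "2014", "2015", "2014", "2015",
             "2012", "2013", "2014", "2015")

# Recycle each 10-value column over the 12 rows of the expanded plan.
rep_len(funding, 12)
rep_len(cohort, 12)
```

Rows 11 and 12 wrap around to the first two grid rows, which is why public_funds_schoolB and public_funds_schoolC end up with "public" and the cohorts 2012 and 2013.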

wlandau · 2018-05-28T22:01:33Z

In that particular example, since school C does not receive public funding, we should not actually be calling check_public_funding("schoolC"). But I do see your point about expanding each row with matching wildcards over a manual grid.

By the way, I'm wrong for a different reason: the resulting data frame should be 10 rows, not 12. In 4e4cb98, which I will push soon, I patched the issue and updated the best practices vignette. The documentation website should update next time I rebuild it.

> plan <- drake_plan(
+   credits = check_credit_hours("school_", "funding_", "cohort_"),
+   students = check_students("school_", "funding_", "cohort_"),
+   grads = check_graduations("school_", "funding_", "cohort_"),
+   public_funds = check_public_funding("school_", "funding_", "cohort_"),
+   strings_in_dots = "literals"
+ )[c(rep(1, 4), rep(2, 2), rep(3, 4)), ] %>%
+   evaluate_plan(
+     rules = rules_grid,
+     expand = FALSE,
+     always_rename = TRUE
+   )
> plan
# A tibble: 10 x 2
   target   command                                                
   <chr>    <chr>                                                  
 1 credits  "check_credit_hours(\"schoolA\", \"public\", \"2012\")"
 2 credits  "check_credit_hours(\"schoolA\", \"public\", \"2013\")"
 3 credits  "check_credit_hours(\"schoolA\", \"public\", \"2014\")"
 4 credits  "check_credit_hours(\"schoolA\", \"public\", \"2015\")"
 5 students "check_students(\"schoolB\", \"public\", \"2014\")"    
 6 students "check_students(\"schoolB\", \"public\", \"2015\")"    
 7 grads    "check_graduations(\"schoolC\", \"private\", \"2012\")"
 8 grads    "check_graduations(\"schoolC\", \"private\", \"2013\")"
 9 grads    "check_graduations(\"schoolC\", \"private\", \"2014\")"
10 grads    "check_graduations(\"schoolC\", \"private\", \"2015\")"

I do want to think about better handling of custom grids and whether we should expand every matching command over the whole grid. My mind has not been on wildcards lately, though.

jw5 · 2018-05-29T20:02:34Z

I've been trying to come up with a better paradigm for the substitution rules in evaluate_plan(). I note that you have added a new flag, always_rename, which looks promising.

It seems like the problem is made more difficult by trying to get consistent behavior for expand=T/F. So for the moment, I'll ignore it. I'm also ignoring the wildcard/value args as they are really a subset of a single rule list and could be deprecated.

When multiple parameters are substituted at the same time via rules = list(), the primary distinction (in my mind) is whether you are generating all combinations of those parameters (as currently coded with expand = T), or taking them verbatim as "rowwise" tuples of parameter values and always treating each row as a unit. The latter might be the more natural interpretation of rules = data.frame, as rows are often seen as unique observations, while the former makes sense when the list contains vectors of different lengths.
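To make the distinction concrete, here is a minimal base-R sketch (not drake code; the variable names are only illustrative). It contrasts the full set of combinations generated from a list of vectors with a data frame whose rows are taken as fixed tuples, using the school and cohort values from this thread.

```r
# A list of vectors: the "all combinations" reading crosses every value.
rules_list <- list(
  school_ = c("schoolA", "schoolB", "schoolC"),
  cohort_ = c("2012", "2013", "2014", "2015")
)
all_combinations <- expand.grid(rules_list, stringsAsFactors = FALSE)
nrow(all_combinations)  # 3 * 4 = 12

# A data frame of tuples: each row is a unit, so combinations that should
# not exist (schoolB in 2012 and 2013) are simply left out.
rowwise_tuples <- all_combinations[
  !(all_combinations$school_ == "schoolB" &
      all_combinations$cohort_ %in% c("2012", "2013")),
]
nrow(rowwise_tuples)  # 10
```

Under the rowwise reading, only the 10 rows that are actually listed would ever be substituted into a command.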

This leads to the suggestion of enhancing the expand option beyond just T/F. Currently, FALSE indicates no replication of targets and just round-robin substitution of parameters. However, the actual substitution appears to depend on both the original targets (counts, ordering, and parameter usage) and the rules parameter counts. I'm not a fan, but this may need to be kept for backward compatibility?

With expand = T and a set of rules, the current combinatorial expansion would take place.

Finally, with expand = "rowwise", each target would be expanded once for each parameter tuple defined by a row of the rules (no combinatorics, unless you did the expansion yourself when generating the rules, for example with expand.grid). Thus, if you had N targets and M rows in the rules, you would always end up with exactly N*M evaluated targets.

Note that in some sense the rowwise expansion is more fundamental than the current combinatorics as the latter can easily be replicated using the former, but not vice-versa.
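As a rough illustration of the proposed semantics, here is a base-R sketch. The function expand_rowwise() is hypothetical and is not part of drake; it only assumes a plan with target and command columns and a rules data frame whose column names are the wildcards.

```r
# Hypothetical helper, NOT a drake function: pairs every plan row with
# every rules row and substitutes the wildcards from that row.
expand_rowwise <- function(plan, rules) {
  # All (rule row, plan row) index pairs: N * M of them.
  idx <- expand.grid(
    rule = seq_len(nrow(rules)),
    target = seq_len(nrow(plan))
  )
  out <- plan[idx$target, , drop = FALSE]
  suffix <- character(nrow(out))
  for (wildcard in names(rules)) {
    values <- as.character(rules[[wildcard]][idx$rule])
    # Replace the wildcard in each command with the value from its rules row.
    out$command <- mapply(
      function(cmd, val) gsub(wildcard, val, cmd, fixed = TRUE),
      out$command, values, USE.NAMES = FALSE
    )
    suffix <- paste(suffix, values, sep = "_")
  }
  out$target <- paste0(out$target, suffix)
  rownames(out) <- NULL
  out
}

plan <- data.frame(
  target = c("credits", "students"),
  command = c("check_credit_hours(school_, cohort_)",
              "check_students(school_, cohort_)"),
  stringsAsFactors = FALSE
)
rules <- data.frame(
  school_ = c("schoolA", "schoolB", "schoolB"),
  cohort_ = c("2014", "2014", "2015"),
  stringsAsFactors = FALSE
)
result <- expand_rowwise(plan, rules)
nrow(result)  # N * M = 2 * 3 = 6
```

Passing the output of expand.grid() as rules would reproduce the combinatorial expansion, which is the sense in which the rowwise mode is the more fundamental of the two.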

================================
On a separate issue, I'm still a little concerned about the duplicate target names in your results above. I'm guessing that always_rename isn't completely implemented?

In any event, it is only evaluating credits on schoolA, students on schoolB and grads on schoolC rather than each test on each school.

I would have expected this to converge on a solution similar to my "Version two" proposal above (but without schoolC varying between public and private, as I added to the example code).

This would generate the following 36-target plan:

# A tibble: 36 x 2
   target                    command                                                  
   <chr>                     <chr>                                                    
 1 credits_schoolA_2012      "check_credit_hours(\"schoolA\", \"public\", \"2012\")"  
 2 credits_schoolA_2013      "check_credit_hours(\"schoolA\", \"public\", \"2013\")"  
 3 credits_schoolA_2014      "check_credit_hours(\"schoolA\", \"public\", \"2014\")"  
 4 credits_schoolA_2015      "check_credit_hours(\"schoolA\", \"public\", \"2015\")"  
 5 credits_schoolB_2014      "check_credit_hours(\"schoolB\", \"public\", \"2014\")"  
 6 credits_schoolB_2015      "check_credit_hours(\"schoolB\", \"public\", \"2015\")"  
 7 credits_schoolC_2012      "check_credit_hours(\"schoolC\", \"private\", \"2012\")" 
 8 credits_schoolC_2013      "check_credit_hours(\"schoolC\", \"private\", \"2013\")" 
 9 credits_schoolC_2014      "check_credit_hours(\"schoolC\", \"private\", \"2014\")" 
10 credits_schoolC_2015      "check_credit_hours(\"schoolC\", \"private\", \"2015\")" 
11 students_schoolA_2012     "check_students(\"schoolA\", \"public\", \"2012\")"      
12 students_schoolA_2013     "check_students(\"schoolA\", \"public\", \"2013\")"      
13 students_schoolA_2014     "check_students(\"schoolA\", \"public\", \"2014\")"      
14 students_schoolA_2015     "check_students(\"schoolA\", \"public\", \"2015\")"      
15 students_schoolB_2014     "check_students(\"schoolB\", \"public\", \"2014\")"      
16 students_schoolB_2015     "check_students(\"schoolB\", \"public\", \"2015\")"      
17 students_schoolC_2012     "check_students(\"schoolC\", \"private\", \"2012\")"     
18 students_schoolC_2013     "check_students(\"schoolC\", \"private\", \"2013\")"     
19 students_schoolC_2014     "check_students(\"schoolC\", \"private\", \"2014\")"     
20 students_schoolC_2015     "check_students(\"schoolC\", \"private\", \"2015\")"     
21 grads_schoolA_2012        "check_graduations(\"schoolA\", \"public\", \"2012\")"   
22 grads_schoolA_2013        "check_graduations(\"schoolA\", \"public\", \"2013\")"   
23 grads_schoolA_2014        "check_graduations(\"schoolA\", \"public\", \"2014\")"   
24 grads_schoolA_2015        "check_graduations(\"schoolA\", \"public\", \"2015\")"   
25 grads_schoolB_2014        "check_graduations(\"schoolB\", \"public\", \"2014\")"   
26 grads_schoolB_2015        "check_graduations(\"schoolB\", \"public\", \"2015\")"   
27 grads_schoolC_2012        "check_graduations(\"schoolC\", \"private\", \"2012\")"  
28 grads_schoolC_2013        "check_graduations(\"schoolC\", \"private\", \"2013\")"  
29 grads_schoolC_2014        "check_graduations(\"schoolC\", \"private\", \"2014\")"  
30 grads_schoolC_2015        "check_graduations(\"schoolC\", \"private\", \"2015\")"  
31 public_funds_schoolA_2012 "check_public_funding(\"schoolA\", \"public\", \"2012\")"
32 public_funds_schoolA_2013 "check_public_funding(\"schoolA\", \"public\", \"2013\")"
33 public_funds_schoolA_2014 "check_public_funding(\"schoolA\", \"public\", \"2014\")"
34 public_funds_schoolA_2015 "check_public_funding(\"schoolA\", \"public\", \"2015\")"
35 public_funds_schoolB_2014 "check_public_funding(\"schoolB\", \"public\", \"2014\")"
36 public_funds_schoolB_2015 "check_public_funding(\"schoolB\", \"public\", \"2015\")"

wlandau · 2018-05-31T20:24:14Z

I did some work since the last post, and those targets are no longer duplicated. Reprex:

library(drake)
library(tidyverse)
#> ── Attaching packages ──────────────────────────────────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 2.2.1     ✔ purrr   0.2.5
#> ✔ tibble  1.4.2     ✔ dplyr   0.7.5
#> ✔ tidyr   0.8.1     ✔ stringr 1.3.1
#> ✔ readr   1.1.1     ✔ forcats 0.3.0
#> ── Conflicts ─────────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ tidyr::expand() masks drake::expand()
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ tidyr::gather() masks drake::gather()
#> ✖ dplyr::lag()    masks stats::lag()
rules_grid <- tibble::tibble(
  school_ = c("schoolA", "schoolB", "schoolC"),
  funding_ = c("public", "public", "private")
) %>%
  tidyr::crossing(cohort_ = c("2012", "2013", "2014", "2015")) %>%
  dplyr::filter(!(school_ == "schoolB" & cohort_ %in% c("2012", "2013")))
plan <- drake_plan(
  credits = check_credit_hours("school_", "funding_", "cohort_"),
  students = check_students("school_", "funding_", "cohort_"),
  grads = check_graduations("school_", "funding_", "cohort_"),
  public_funds = check_public_funding("school_", "funding_", "cohort_"),
  strings_in_dots = "literals"
)[c(rep(1, 4), rep(2, 2), rep(3, 4)), ] %>%
  evaluate_plan(rules = rules_grid, expand = FALSE, always_rename = TRUE) %>%
  print
#> # A tibble: 10 x 2
#>    target                       command                                   
#>    <chr>                        <chr>                                     
#>  1 credits_schoolA_public_2012  "check_credit_hours(\"schoolA\", \"public…
#>  2 credits_schoolA_public_2013  "check_credit_hours(\"schoolA\", \"public…
#>  3 credits_schoolA_public_2014  "check_credit_hours(\"schoolA\", \"public…
#>  4 credits_schoolA_public_2015  "check_credit_hours(\"schoolA\", \"public…
#>  5 students_schoolB_public_2014 "check_students(\"schoolB\", \"public\", …
#>  6 students_schoolB_public_2015 "check_students(\"schoolB\", \"public\", …
#>  7 grads_schoolC_private_2012   "check_graduations(\"schoolC\", \"private…
#>  8 grads_schoolC_private_2013   "check_graduations(\"schoolC\", \"private…
#>  9 grads_schoolC_private_2014   "check_graduations(\"schoolC\", \"private…
#> 10 grads_schoolC_private_2015   "check_graduations(\"schoolC\", \"private…

I will need some time to think about the rest of your comments about different modes of wildcard substitution and expansion. I am planning to put this functionality in the wildcard package, which I have not updated in months.

At this point, I see wildcards as a medium-term solution. Long-term, I still prefer to move to @krlmlr's proposed DSL interface (ref: #233, #304).

jw5 · 2018-06-07T21:34:02Z

Unfortunately, the proposed solution doesn't generate the correct answer. It pseudo-randomly combines the checks with the schools and generates only 10 results.

What it should do is combine all 4 independent checks with all specified schools and years (cohorts) and generate 40 results (if you allow check_public_funding to be invoked on schoolC, or 36 if you don't).

# A tibble: 40 x 2
   target                    command                                                  
   <chr>                     <chr>                                                    
 1 credits_schoolA_2012      "check_credit_hours(\"schoolA\", \"public\", \"2012\")"  
 2 credits_schoolA_2013      "check_credit_hours(\"schoolA\", \"public\", \"2013\")"  
 3 credits_schoolA_2014      "check_credit_hours(\"schoolA\", \"public\", \"2014\")"  
 4 credits_schoolA_2015      "check_credit_hours(\"schoolA\", \"public\", \"2015\")"  
 5 credits_schoolB_2014      "check_credit_hours(\"schoolB\", \"public\", \"2014\")"  
 6 credits_schoolB_2015      "check_credit_hours(\"schoolB\", \"public\", \"2015\")"  
 7 credits_schoolC_2012      "check_credit_hours(\"schoolC\", \"private\", \"2012\")" 
 8 credits_schoolC_2013      "check_credit_hours(\"schoolC\", \"private\", \"2013\")" 
 9 credits_schoolC_2014      "check_credit_hours(\"schoolC\", \"private\", \"2014\")" 
10 credits_schoolC_2015      "check_credit_hours(\"schoolC\", \"private\", \"2015\")" 
11 students_schoolA_2012     "check_students(\"schoolA\", \"public\", \"2012\")"      
12 students_schoolA_2013     "check_students(\"schoolA\", \"public\", \"2013\")"      
13 students_schoolA_2014     "check_students(\"schoolA\", \"public\", \"2014\")"      
14 students_schoolA_2015     "check_students(\"schoolA\", \"public\", \"2015\")"      
15 students_schoolB_2014     "check_students(\"schoolB\", \"public\", \"2014\")"      
16 students_schoolB_2015     "check_students(\"schoolB\", \"public\", \"2015\")"      
17 students_schoolC_2012     "check_students(\"schoolC\", \"private\", \"2012\")"     
18 students_schoolC_2013     "check_students(\"schoolC\", \"private\", \"2013\")"     
19 students_schoolC_2014     "check_students(\"schoolC\", \"private\", \"2014\")"     
20 students_schoolC_2015     "check_students(\"schoolC\", \"private\", \"2015\")"     
21 grads_schoolA_2012        "check_graduations(\"schoolA\", \"public\", \"2012\")"   
22 grads_schoolA_2013        "check_graduations(\"schoolA\", \"public\", \"2013\")"   
23 grads_schoolA_2014        "check_graduations(\"schoolA\", \"public\", \"2014\")"   
24 grads_schoolA_2015        "check_graduations(\"schoolA\", \"public\", \"2015\")"   
25 grads_schoolB_2014        "check_graduations(\"schoolB\", \"public\", \"2014\")"   
26 grads_schoolB_2015        "check_graduations(\"schoolB\", \"public\", \"2015\")"   
27 grads_schoolC_2012        "check_graduations(\"schoolC\", \"private\", \"2012\")"  
28 grads_schoolC_2013        "check_graduations(\"schoolC\", \"private\", \"2013\")"  
29 grads_schoolC_2014        "check_graduations(\"schoolC\", \"private\", \"2014\")"  
30 grads_schoolC_2015        "check_graduations(\"schoolC\", \"private\", \"2015\")"  
31 public_funds_schoolA_2012 "check_public_funding(\"schoolA\", \"public\", \"2012\")"
32 public_funds_schoolA_2013 "check_public_funding(\"schoolA\", \"public\", \"2013\")"
33 public_funds_schoolA_2014 "check_public_funding(\"schoolA\", \"public\", \"2014\")"
34 public_funds_schoolA_2015 "check_public_funding(\"schoolA\", \"public\", \"2015\")"
35 public_funds_schoolB_2014 "check_public_funding(\"schoolB\", \"public\", \"2014\")"
36 public_funds_schoolB_2015 "check_public_funding(\"schoolB\", \"public\", \"2015\")"
37 public_funds_schoolC_2012 "check_public_funding(\"schoolC\", \"private\", \"2012\"…
38 public_funds_schoolC_2013 "check_public_funding(\"schoolC\", \"private\", \"2013\"…
39 public_funds_schoolC_2014 "check_public_funding(\"schoolC\", \"private\", \"2014\"…
40 public_funds_schoolC_2015 "check_public_funding(\"schoolC\", \"private\", \"2015\"…
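
The 40 and 36 row counts can be checked with a small base-R sketch. This is not drake code and does not use evaluate_plan(); the data frame names are illustrative, and the row order may differ from the table above.

```r
# Base-R sketch; these data frames are illustrative, not drake API.
schools <- data.frame(
  school_ = c("schoolA", "schoolB", "schoolC"),
  funding_ = c("public", "public", "private"),
  stringsAsFactors = FALSE
)
cohorts <- data.frame(
  cohort_ = c("2012", "2013", "2014", "2015"),
  stringsAsFactors = FALSE
)
# merge(..., by = NULL) returns the Cartesian product of the two data frames.
grid <- merge(schools, cohorts, by = NULL)
# schoolB has no 2012 or 2013 cohort.
grid <- grid[!(grid$school_ == "schoolB" & grid$cohort_ %in% c("2012", "2013")), ]

checks <- data.frame(
  target = c("credits", "students", "grads", "public_funds"),
  fun = c("check_credit_hours", "check_students",
          "check_graduations", "check_public_funding"),
  stringsAsFactors = FALSE
)
# Every check paired with every row of the grid.
full <- merge(checks, grid, by = NULL)
full$command <- sprintf(
  "%s(\"%s\", \"%s\", \"%s\")",
  full$fun, full$school_, full$funding_, full$cohort_
)
full$target <- paste(full$target, full$school_, full$cohort_, sep = "_")

# Drop the public-funding check for privately funded schools.
is_private_funding_check <-
  full$fun == "check_public_funding" & full$funding_ == "private"
public_only <- full[!is_private_funding_check, ]

c(all = nrow(full), public_only = nrow(public_only))  # 40 and 36
```

The grid has 10 rows (4 cohorts for schoolA, 2 for schoolB, 4 for schoolC), so 4 checks give 40 targets, and removing the 4 schoolC public-funding rows leaves 36.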

wlandau · 2018-06-12T13:05:57Z

I think the 10-row data frame is really what we are going for here. (@AlexAxthelm, do you agree?) Setting expand = FALSE in evaluate_plan() means it will not expand out to 40 (or 36) rows. If you need more expansion, consider expand_plan(), more wildcards, tidyr::crossing(), etc.

wlandau · 2018-06-19T17:54:46Z

https://github.com/tidyverse/glue may be a better solution to all this. Ref: #424.

wlandau · 2018-06-19T19:24:02Z

Coming back to #235 (comment), I thought of a much better solution to the original problem: just define a special wildcard for public schools.

library(drake)
library(magrittr)
drake_plan(
  credits = check_credit_hours(all_schools__),
  students = check_students(all_schools__),
  grads = check_graduations(all_schools__),
  public_funds = check_public_funding(public_schools__)
) %>%
  evaluate_plan(
    rules = list(
      all_schools__ =  c("schoolA", "schoolB", "schoolC"),
      public_schools__ = c("schoolA", "schoolB")
    )
  )
#> # A tibble: 11 x 2
#>    target               command                      
#>    <chr>                <chr>                        
#>  1 credits_schoolA      check_credit_hours(schoolA)  
#>  2 credits_schoolB      check_credit_hours(schoolB)  
#>  3 credits_schoolC      check_credit_hours(schoolC)  
#>  4 students_schoolA     check_students(schoolA)      
#>  5 students_schoolB     check_students(schoolB)      
#>  6 students_schoolC     check_students(schoolC)      
#>  7 grads_schoolA        check_graduations(schoolA)   
#>  8 grads_schoolB        check_graduations(schoolB)   
#>  9 grads_schoolC        check_graduations(schoolC)   
#> 10 public_funds_schoolA check_public_funding(schoolA)
#> 11 public_funds_schoolB check_public_funding(schoolB)

Without that 12th row, this is the correct answer to the question posed at the top of the thread. And it only requires one call to evaluate_plan().

wlandau · 2018-10-30T16:07:30Z

Edit: map_plan() is probably a better fit for this general situation where you want to select only certain combinations of input settings.

wlandau added the topic: api label Feb 5, 2018

AlexAxthelm closed this as completed Feb 5, 2018

wlandau added topic: documentation and removed topic: api labels Feb 5, 2018

wlandau reopened this Feb 5, 2018

wlandau added the difficulty: beginner label Feb 10, 2018

wlandau closed this as completed in 03749ae Feb 11, 2018

wlandau pushed a commit that referenced this issue May 29, 2018

Revisit #235 with a grid patch

4e4cb98

wlandau mentioned this issue Jun 19, 2018

Consider glue for wildcards #424

Closed

wlandau mentioned this issue Jun 19, 2018

Dedicated vignette on wildcard templating ropensci-books/drake#5

Closed

wlandau mentioned this issue Jun 19, 2018

Mention the improved solution to drake issue 235 ropensci-books/drake#9

Closed

wlandau mentioned this issue Jul 5, 2018

Wildcard alternative to gather/reduce_plan #376

Closed

This was referenced Jan 19, 2019

DSL based on dplyr-like verbs? #233

Closed

Add map() to the DSL #687

Merged
