new example: orderly + drake #41

Open
wlandau opened this issue Feb 5, 2020 · 5 comments
Labels
help wanted Extra attention is needed

Comments

@wlandau
Owner

wlandau commented Feb 5, 2020

According to @richfitz, orderly and drake could complement each other nicely. If we post an example here, drake users will be able to download it with drake::drake_example("orderly") and try it out for themselves.

From the docs, it looks like orderly could wrap around a drake workflow and manage multiple versions of final artifacts rendered at the end of a drake plan. Are there other obvious win-wins?

@wlandau
Owner Author

wlandau commented Feb 5, 2020

I just got started on an example in this branch. It is just like orderly's minimal example but with a drake workflow in src/example/script.R.

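library(drake)
# Draw a bar plot to a PNG file and return the file path.
save_bar_plot <- function(data, file) {
  png(file)
  par(mar = c(15, 4, 0.5, 0.5))
  barplot(data, las = 2)
  dev.off()
  file
}
# `dat` is not defined in this script; presumably orderly supplies it, as in its minimal example.
plan <- drake_plan(
  bar_data = setNames(dat$number, dat$name),
  bar_plot = save_bar_plot(bar_data, file_out("mygraph.png"))
)
make(plan)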

@wlandau
Owner Author

wlandau commented Feb 5, 2020

Should drake be declared as a package dependency in src/example/orderly.yml?

@wlandau
Owner Author

wlandau commented Feb 5, 2020

One source of friction I notice is that orderly creates and sets a new working directory for each new run, while drake expects all runs to use the same file system and the same working directory. Even if we assign a storr_rds() cache in a central location that all runs can access, it is still awkward when we declare a drake file_out() with a run-specific path. @richfitz, how would you suggest we get the most out of both tools in this situation?
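To make the central-cache idea concrete, here is a minimal sketch, not a tested orderly integration. The cache location, the `run_dir` directory standing in for an orderly run directory, and the toy `total` target are all assumptions for illustration; `new_cache()` is drake's wrapper for creating a `storr_rds()`-backed cache.

```r
library(drake)

# Assumed central cache location. tempdir() stands in for a real shared path
# that every run could reach.
cache_path <- file.path(tempdir(), "shared-drake-cache")
cache <- new_cache(path = cache_path)

# Toy plan with no file outputs, so only the cache location matters here.
plan <- drake_plan(
  total = sum(1:10)
)

# First build, from the current working directory.
make(plan, cache = cache)

# Hypothetical stand-in for a fresh per-run directory like the ones orderly creates.
run_dir <- file.path(tempdir(), "run-2")
dir.create(run_dir, showWarnings = FALSE)
old_wd <- setwd(run_dir)
make(plan, cache = cache)  # same cache, different working directory
setwd(old_wd)

result <- readd(total, cache = cache)
```

This only covers in-memory targets. A target that declares `file_out("mygraph.png")` still resolves the path relative to whichever run directory is current, which is the awkward part described above.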

@richfitz

richfitz commented Feb 6, 2020

yes, I can imagine that this is a source of friction, and tbh it's one that is fairly fundamental to orderly (minimising state between runs of an analysis). Though there are two ways to deal with this that might help:

  1. If one develops an analysis outside of orderly, interactively, and treats orderly as the final copy that will be run periodically, then you can develop in a normal working directory and move the final copy into orderly. We have projects at work that use this workflow.
  2. If one is going to develop interactively, then there is some support with orderly::orderly_test_start to do the directory set up (though sadly not change) following which one can write and work with R code almost as usual. This workflow needs work (you're editing files in a dir that is not the working directory) and we've not yet worked out a superb way of doing it. I have a few ideas for removing pain here but it's not really worked out yet.

Thanks for the example 😄

@richfitz

richfitz commented Feb 6, 2020

(minimally if the project has really simple dependencies, then with judicious use of .gitignore one could just work in the src/example directory actually)

@wlandau added the help wanted label on Jul 4, 2020