Is there still a use for drake in "hasty" mode? #6

Open
wlandau opened this issue Oct 22, 2018 · 3 comments

wlandau commented Oct 22, 2018

I came across your article recently, and I was delighted to see more work on DAG-based data processing pipelines in R. I am sorry drake's functionality was inadequate at the time. Reducing overhead is a long-term effort, and our latest attempt is a new "hasty" mode (PR, documentation). It sacrifices some of drake's core reproducibility features, but it makes data processing much faster.

library(drake)
# `plan` is assumed to be an existing drake plan.
cache <- storr::storr_environment() # in-memory cache, for slightly faster preprocessing
make(plan, cache = cache, parallelism = "hasty", jobs = 8)

wlandau changed the title from 'Is there still a use for drake in "blitz" mode?' to 'Is there still a use for drake in "hasty" mode?' on Oct 23, 2018
Owner

schnorr commented Oct 23, 2018

Hey @guilhermealles, can you give drake another try using the so-called "hasty" mode? You can use the drake-integration branch as is. Let me know if you need large traces.

Author

wlandau commented Oct 23, 2018

Thanks! Please let me know if there is anything I can do to help.

I forgot to mention that parallelism and load balancing in hasty mode are controlled by the clustermq package. So to run locally, you would call options(clustermq.scheduler = "multicore") before make(parallelism = "hasty").
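
For reference, the local setup described above might look roughly like the sketch below. It assumes a drake release that still ships the experimental "hasty" backend and that the clustermq package is installed. The two-target plan and the choice of `jobs = 2` are hypothetical and only there to make the example self-contained.

```r
library(drake)

# Assumption: this drake version still supports parallelism = "hasty",
# and clustermq is installed.
# Tell clustermq to run its workers on local cores.
options(clustermq.scheduler = "multicore")

# Hypothetical plan, for illustration only.
plan <- drake_plan(
  x = rnorm(100),
  y = mean(x)
)

# In-memory cache, as in the first comment of this thread.
cache <- storr::storr_environment()

# jobs = 2 is an arbitrary choice for a small local run.
make(plan, cache = cache, parallelism = "hasty", jobs = 2)
```

The key point is the ordering: the `clustermq.scheduler` option has to be set before `make()` is called.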

Author

wlandau commented Aug 4, 2019

Update: I am planning major improvements to cache speed: ropensci/drake#971 (comment). If all goes well, you will be able to tell drake to use write_fst() and read_fst() and not have to go through file_out() or file_in(). I will keep you posted.
