"future" parallelism takes a really long time to get going when many upstream targets are up to date #448

Closed
kendonB opened this issue Jul 3, 2018 · 2 comments

Comments

@kendonB
Contributor

kendonB commented Jul 3, 2018

I'm finally trying out "future" parallelism. I have 3 targets, each expanded with evaluate_plan() into 10000 pieces.

The main while loop takes a really long time to get through the first 20000 up-to-date targets before starting on target 20001. I believe the other types of parallelism do this work with lightly_parallelize, so it's not so bad there.

Perhaps this method could trim the queue first by subsetting it with the outdated() function?

Of course, this is another issue that'll be easier after dealing with #440.
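
The queue-trimming idea above can be sketched in base R. This is only an illustration under stated assumptions, not drake's implementation: `edges`, `trim_queue()`, and the target names are hypothetical, and the set of outdated targets is hard-coded where a real workflow would get it from outdated(config).

```r
# Toy sketch (base R only): trim a priority queue down to outdated targets.
# `edges`, `trim_queue()`, and the target names are hypothetical and are not
# part of drake's API.

# Dependency edges: `from` must be built before `to`.
edges <- data.frame(
  from = c("a1", "a2", "b1", "b2"),
  to   = c("b1", "b2", "c1", "c2"),
  stringsAsFactors = FALSE
)

# Targets that need rebuilding. In drake, outdated(config) reports these.
outdated_targets <- c("c1", "c2")

trim_queue <- function(edges, outdated_targets) {
  # Only edges between two outdated targets can still block a build.
  live <- edges[
    edges$from %in% outdated_targets & edges$to %in% outdated_targets, ,
    drop = FALSE
  ]
  # Priority key = number of outdated dependencies still to be built.
  keys <- vapply(
    outdated_targets,
    function(target) sum(live$to == target),
    integer(1)
  )
  sort(keys)
}

queue <- trim_queue(edges, outdated_targets)
print(queue)
```

With the up-to-date a and b targets dropped up front, the loop would start with c1 and c2 at key 0 instead of visiting every upstream target first.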

@wlandau
Member

wlandau commented Jul 3, 2018

#440 may speed this up to a degree, and there may be ways to locally parallelize parts of the master process. I will think about it. But for you, maybe a different tack is more appropriate. You have a ton of targets, but the structure of the dependency network is extremely simple. I think I could help you far more by adding a new "clustermq_staged" backend (described at mschubert/clustermq#86 (comment)) and a "future_lapply_staged" backend. I do not usually recommend staged parallelism, but neither do I believe that the perfect should be the enemy of the good.
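
For readers unfamiliar with staged parallelism, here is a rough base R sketch of the general idea: split the dependency graph into stages of mutually independent targets, then run each stage as one parallel batch. The helper `stage_targets()` and the example graph are hypothetical; this is not how the proposed "clustermq_staged" or "future_lapply_staged" backends are implemented.

```r
# Toy sketch of staged scheduling (base R only). `stage_targets()` and the
# example graph are hypothetical, not drake's implementation.

# Dependency edges: `from` must be built before `to`.
edges <- data.frame(
  from = c("a1", "a2", "b1", "b2"),
  to   = c("b1", "b2", "c1", "c2"),
  stringsAsFactors = FALSE
)
targets <- c("a1", "a2", "b1", "b2", "c1", "c2")

stage_targets <- function(targets, edges) {
  stages <- list()
  remaining <- targets
  while (length(remaining) > 0) {
    # A target is blocked while any of its dependencies is still remaining.
    blocked <- unique(edges$to[edges$from %in% remaining])
    ready <- setdiff(remaining, blocked)
    if (length(ready) == 0) {
      stop("dependency cycle detected")
    }
    stages[[length(stages) + 1]] <- ready
    remaining <- setdiff(remaining, ready)
  }
  stages
}

stages <- stage_targets(targets, edges)
for (stage in stages) {
  # Each stage is one batch; a parallel lapply such as
  # future.apply::future_lapply() could replace lapply() here.
  results <- lapply(stage, function(target) paste("built", target))
}
```

The trade-off is that every target in a stage must finish before the next stage starts, so one slow target can hold up the whole batch. For a graph this simple, that cost is small compared with the per-target scheduling overhead.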

@wlandau
Member

wlandau commented Jul 3, 2018

Hmm... I do not regret #452, but after looking back at the code, I strongly believe we have the same bottleneck as #435 (which #440 will solve). See below (dependencies() calls igraph::adjacent_vertices()).

drake/R/future.R

Lines 184 to 196 in adf5f87

decrease_revdep_keys <- function(worker, config, queue){
  target <- attr(worker, "target")
  if (!length(target) || is.na(target) || !is.character(target)){
    return()
  }
  revdeps <- dependencies(
    targets = target,
    config = config,
    reverse = TRUE
  ) %>%
    intersect(y = queue$list())
  queue$decrease_key(targets = revdeps)
}
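
One generic way to avoid a graph query per finished target is to compute the reverse dependencies once and look them up from a named list afterwards. The sketch below shows that idea in base R. The helpers `build_revdep_index()` and `revdeps_in_queue()` are hypothetical, and the thread does not say whether #440 takes this approach.

```r
# Toy sketch (base R only): cache reverse dependencies once so the
# scheduling loop does not query the graph for every finished target.
# `build_revdep_index()` and `revdeps_in_queue()` are hypothetical helpers.

# Dependency edges: `from` must be built before `to`.
edges <- data.frame(
  from = c("a1", "a1", "b1"),
  to   = c("b1", "b2", "c1"),
  stringsAsFactors = FALSE
)

build_revdep_index <- function(edges) {
  # Named list: target -> targets that depend on it directly.
  split(edges$to, edges$from)
}

revdeps_in_queue <- function(index, target, queued) {
  revdeps <- index[[target]]
  if (is.null(revdeps)) {
    # The target has no reverse dependencies at all.
    return(character(0))
  }
  # Keep only the reverse dependencies that are still waiting in the queue.
  intersect(revdeps, queued)
}

index <- build_revdep_index(edges)
revdeps_in_queue(index, "a1", queued = c("b2", "c1"))
```

The index is built in a single pass over the edge list, so each later lookup is a plain list access rather than a call into the graph library.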

Closing as a duplicate. Will reopen if I am wrong.
