Optionally allow clustermq workers to cache the results #531

wlandau · 2018-10-06T00:13:01Z

@kendonB, you called it in #425 (comment).

As I've mentioned in another issue, caching on the master process is going to be slow for I/O heavy jobs.

As with the future backend, we can activate the existing caching argument for the clustermq and clustermq_staged backends: i.e.

make(plan, parallelism = "clustermq", caching = "worker", jobs = 100)
make(plan, parallelism = "clustermq_staged", caching = "worker", jobs = 100)
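For context, here is a minimal sketch of what an end-to-end call might look like once the argument is active. The two-target plan is hypothetical, and the `"multicore"` clustermq scheduler is only an assumption standing in for a real cluster; with `caching = "worker"`, the workers need to be able to reach the cache themselves (for example, over a shared file system).

```r
library(drake)

# Assumption: use clustermq's local multicore scheduler in place of a real
# cluster scheduler such as SLURM or SGE.
options(clustermq.scheduler = "multicore")

# Hypothetical plan, for illustration only.
plan <- drake_plan(
  data = rnorm(1e6),
  result = mean(data)
)

# With caching = "worker", each worker stores the target it built in the
# cache itself, rather than sending the value back to the master process
# to be stored there.
make(plan, parallelism = "clustermq", caching = "worker", jobs = 2)

# Read the finished target back from the cache on the master process.
readd(result)
```

Switching to `caching = "master"` should leave the results the same and change only which process writes to the cache.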

This enhancement could make a major difference in the tools at my workplace, and I consider it the highest priority issue for drake right now. cc @huizhang-lilly.


wlandau · 2018-10-06T00:17:05Z

It would be super nice to use ZeroMQ to write the results back to the head node and write to the cache in parallel, but I am not sure exactly how. After the issues mentioned in mschubert/clustermq#99 get sorted out, make(plan, parallelism = "clustermq_staged", caching = "master") might approximate this somehow. But anyway, I think the current issue is a good start.

wlandau · 2018-10-06T01:36:39Z

Using the 531 branch. The implementation for "clustermq" parallelism is surprisingly easy. Still needs testing on a real cluster. Stay tuned for a PR.

wlandau · 2018-10-06T02:03:33Z

From #532, this issue is now solved for "clustermq" parallelism. I still want to implement it for "clustermq_staged" parallelism even though staged parallelism is almost never as good as persistent workers.

wlandau · 2018-10-06T02:52:21Z

Fixed via #532 and #533. I can't believe how easy that was. I guess the work on earlier backends paid off.

wlandau added the "difficulty: advanced", "topic: performance", and "status: priority" labels Oct 6, 2018

wlandau self-assigned this Oct 6, 2018

wlandau mentioned this issue Oct 6, 2018

Update caching options for clustermq parallelism ropensci-books/drake#38

Closed

wlandau pushed a commit that referenced this issue Oct 6, 2018

Sketch #531

ab3222a

wlandau mentioned this issue Oct 6, 2018

Caching options for clustermq parallelism #532

Merged

7 tasks

wlandau mentioned this issue Oct 6, 2018

Caching options for clustermq_staged parallelism #533

Merged

7 tasks

wlandau closed this as completed Oct 6, 2018
