This repository has been archived by the owner on May 19, 2021. It is now read-only.

R on high performance clusters #80

Open
lmullen opened this issue May 20, 2018 · 5 comments

Comments

@lmullen
Member

lmullen commented May 20, 2018

Late suggestion, but I would be interested in discussing how R can be used on high-performance clusters. In my case that's a university cluster, but I imagine people are doing this in lots of different ways. There are already well-established packages for this, such as drake and sparklyr. Perhaps an outcome of this would be a tutorial or guide to using them for some rOpenSci-specific applications.

@wlandau
Member

wlandau commented May 20, 2018

Ooh I love where this is going! For drake, I just rewrote the vignettes on HPC and timing, plus an rOpenSci tech note with an overview of some new HPC features. For tasks not in reproducible pipelines, I recommend future.batchtools. rslurm looks great for SLURM specifically, though I have not used it.

Traditional HPC can be intimidating at first, and I think more work to reduce the friction would be well spent.

@zachary-foster

I have also encountered the flowr package, although I have not used it much.

@juyeongkim

ping @sahilseth

@sahilseth

Thanks @juyeongkim, this sounds exciting. We have been developing several NGS pipelines on flowr. I guess this is a late reply, but I'm happy to discuss more and can Skype in.

@wlandau
Member

wlandau commented Jul 1, 2018

I recently encountered clustermq, and I love it. The learning curve is shallow, overhead is tiny, and it requires astonishingly little setup. So far, my experience has been totally frictionless.


5 participants