Machine learning (ML) is an important part of data science: it can be used to make predictions, classify observations, and uncover patterns in data that would otherwise be difficult to detect. ML models are becoming increasingly common in medical and pharmaceutical settings, from aiding patient diagnosis to analysing responses to treatment. {tidymodels} is a collection of R packages that supports each stage of a machine learning pipeline, including sampling data, building and fitting models, and evaluating performance. It provides a consistent, user-friendly approach to fitting machine learning models in R.
This interactive workshop will introduce some common machine learning techniques, including (but not limited to) Lasso regression, random forests, and support vector machines, and demonstrate how to fit these models in R using {tidymodels}. We'll also cover some common issues that arise in machine learning, such as imbalanced data, biases in predictive performance, parameter tuning, and model overfitting. No previous knowledge of machine learning is required, though familiarity with statistical concepts such as correlation, variability, and simple linear regression may be helpful. Attendees will also benefit from being reasonably comfortable with data wrangling using {dplyr} and {tidyr}.