This repository provides course materials for an online seminar series that introduces R programming. Each subfolder in the repository contains the materials for one course in the series, including the required code and presentation slides. The courses require R version 4.2.1 or later and RStudio, and the materials include guidance on installing the required R packages.
Instructors: Yuyan Yi, Mina Peyton
Contact: https://niaid-amp.powerappsportals.us/
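To confirm that a local installation meets the R version requirement stated above, a check along the following lines can be run in the R console. This is a minimal sketch using only base R; the variable names are illustrative and are not taken from the course materials.

```r
# Minimum R version stated in this README.
required_version <- "4.2.1"

# getRversion() returns the running R version as a comparable object.
version_ok <- getRversion() >= required_version

if (version_ok) {
  message("R ", as.character(getRversion()), " meets the ",
          required_version, " requirement.")
} else {
  message("R ", as.character(getRversion()), " is older than ",
          required_version, "; please update R before starting.")
}
```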
**Courses:**
Part 1 – Data Wrangling
This course introduces R programming with a focus on data wrangling. Participants will learn how to use RStudio, import files into R, and work with basic R syntax. The course begins with an overview of RStudio so that participants are familiar with its interface and functionality. Participants then learn how to import several file types into R, including CSV, Excel, and text files, and explore the types of variables in R, such as numeric, character, and factor variables. They will learn how to clean and transform data, handle missing values, filter and subset data, and perform data aggregation and reshaping. They will also be introduced to the concept of tidy data and how to achieve it in R. By the end of the course, participants will have a solid foundation in R programming and be able to efficiently manipulate and prepare data for further analysis and visualization.
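As a rough illustration of the kind of workflow described above, the following sketch imports a small CSV, converts a column to a factor, drops missing values, filters rows, and aggregates by group. It uses only base R and an inline, made-up dataset (`csv_text`, `scores`, and the column names are hypothetical); the actual course may use different packages and data.

```r
# Hypothetical CSV content, inlined so the example runs without external files.
csv_text <- "id,group,score
1,A,10
2,A,NA
3,B,7
4,B,9"

# Import: read.csv() accepts a `text` argument in place of a file path.
scores <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Variable types: convert the character column to a factor.
scores$group <- factor(scores$group)
str(scores)

# Missing values: keep only rows where score is not NA.
complete <- scores[!is.na(scores$score), ]

# Filter and subset: keep rows with a score above 8.
high <- subset(complete, score > 8)

# Aggregate: mean score per group.
group_means <- aggregate(score ~ group, data = complete, FUN = mean)
print(group_means)
```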
Part 2 – Data Visualization
This course provides an introduction to data visualization using R. Participants will learn to create visualizations with both base R and the ggplot2 package, covering scatter plots, line charts, bar graphs, histograms, box plots, and more. They will also be introduced to customizing and enhancing visualizations with themes, labels, annotations, and color schemes. By the end of the course, participants will be able to create insightful and visually appealing data visualizations in R that communicate data-driven insights effectively.
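For a sense of what a customized ggplot2 figure looks like, here is a minimal sketch of a scatter plot with labels, a color mapping, and a theme. It assumes ggplot2 is installed and uses the built-in `mtcars` dataset purely for illustration; it is not drawn from the course materials.

```r
library(ggplot2)

# mtcars ships with R; it is used here only as example data.
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 2) +                 # scatter plot layer
  labs(
    title = "Fuel efficiency by weight",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    color = "Cylinders"
  ) +
  theme_minimal()                        # built-in ggplot2 theme

# Printing the object renders the plot on the active graphics device.
print(p)
```

Swapping `geom_point()` for another layer such as `geom_boxplot()` or `geom_histogram()` (with suitable aesthetics) produces the other plot types listed above.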
Part 3 – Data Analysis
This course provides an introduction to data analysis using R, focusing on conducting statistical tests and fitting commonly used regression models. Participants will learn how to calculate descriptive statistics, perform hypothesis tests, and conduct inferential statistics using R. The course covers topics such as t-tests, ANOVA, chi-square tests, and correlation analysis. Participants will also learn how to fit regression models, including linear regression, logistic regression, and multiple regression, to analyze relationships between variables. Additionally, the course emphasizes interpreting the output of R's statistical functions, so that participants gain practical skills in understanding and communicating the findings of their analyses. By the end of the course, participants will have a solid foundation in data analysis using R and be able to apply these skills to real-world datasets.
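The tests and models named above map onto functions in base R's stats package. The sketch below runs a two-sample t-test, a simple linear regression, and a logistic regression on the built-in `mtcars` dataset; the choice of dataset and variables is an assumption made for illustration, not the course's own example.

```r
# Two-sample (Welch) t-test: does mean mpg differ by transmission type (am = 0/1)?
tt <- t.test(mpg ~ am, data = mtcars)
print(tt)

# Simple linear regression of fuel efficiency on weight.
fit <- lm(mpg ~ wt, data = mtcars)
print(summary(fit)$coefficients)

# Logistic regression: probability of a manual transmission as a function of weight.
logit_fit <- glm(am ~ wt, data = mtcars, family = binomial)
print(summary(logit_fit)$coefficients)
```

In each case the printed summary reports estimates alongside test statistics and p-values, which is the output the course teaches participants to interpret.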
Part 4 – Real-world Data Analysis Using R
In this course, participants will integrate the knowledge gained from the previous sessions on data wrangling, data visualization, and data analysis in a comprehensive demonstration project that analyzes real-world data. Through a step-by-step walkthrough of an example project, participants will develop a thorough understanding of the entire process of real-world data analysis, including data wrangling, visualization, extracting insights, and interpreting results, so that they can apply these skills in future projects. By the end of the course, participants will have a good foundation in the essential steps of data analysis and be equipped to analyze and interpret real-world data effectively using R.