---
title: "Introduction to R for Cancer Scientists"
---
# Introduction
This site contains the materials for an R course run by the Bioinformatics Core
at the Cancer Research UK Cambridge Institute.
April -- June 2020
### Instructors
* Chandra Chilamakuri
* Matt Eldridge
* Mark Fernandes
* Kamal Kishore
* Sergio Martinez Cuesta
* Ashley Sawle
* Rory Stark
# Description
**R** is one of the leading programming languages in **Data Science** and the
most widely used within CRUK CI for interacting with, analyzing and visualizing
cancer biology datasets.
In this training course, we aim to provide a friendly introduction to R pitched
at beginner level. It should also suit those who have been on R training courses
previously and would like a refresher or to consolidate their skills.
The course will be run over **6 weeks** with the following structure:

* **Online lesson each Monday at 11am** lasting 45 minutes via a video call in Microsoft Teams
    * the instructor will share their screen, showing an RStudio window, during this call
    * the call will be recorded for those who cannot join the meeting and so that you can replay the lesson
* More **in-depth material** covering the concepts introduced in the Monday lesson, to go through in your own time
* A **weekly assignment** consisting of exercises to practice some of the concepts covered in that week's and previous weeks' lessons
* An **online recap session each Friday at 11am** to go through the assignment and answer any questions you may have
    * again, this is expected to last around 45 minutes via a video call in Microsoft Teams

The first lesson will be on Tuesday 14 April rather than on the Easter Monday
bank holiday; subsequent lessons will take place every Monday for the following
5 weeks. Similarly, the recap for the week of 4 – 8 May will run on Thursday
7 May instead of the Friday, as 8 May is the special VE Day bank holiday.
We will use Microsoft Teams to run this course, and members of the
Bioinformatics Core will be available throughout the course for 1:1 support via
chats and calls within Teams.
# Schedule
0. [Getting set up](week0.html) (6 April) - Installing R and RStudio
1. [Introduction to R](week1.html) (14 April) - Interacting with R using RStudio and introducing objects, data types and functions
2. [Working with data](week2.html) (20 April) - Creating R scripts, working with tabular data and other types of objects in R, reading data into R
3. [Data visualization with ggplot2](week3.html) (27 April) - A common grammar to create scatter plots, bar charts, boxplots, histograms and line graphs for time series data
4. [Data manipulation using dplyr](week4.html) (4 May) - Filtering and modifying tabular data, computing summary values, faceting with ggplot2
5. [Grouping and combining data](week5.html) (11 May) - Advanced grouping and summarization operations, joining data from different tables, customizing ggplot2 plots
6. [Restructuring data for analysis](week6.html) (18 May) - The concept of 'tidy data', pivoting and separating operations, ggplot2 extras
7. **Capstone project** - Putting it all together in a typical data analysis, including:
    * reading in a data set
    * handling missing values
    * selecting and filtering subsets of interest
    * creating plots
    * generating summary statistics
    * saving data transformed into a tidy format as a CSV file for later analysis
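
The capstone steps listed above could be sketched roughly as follows. This is a minimal illustration, not the course's actual project: the toy data, the column names (`sample`, `group`, `gene_a`, `gene_b`) and the temporary file paths are all invented for the example, and it assumes the readr, tidyr, dplyr and ggplot2 packages are installed.

```r
library(readr)
library(tidyr)
library(dplyr)
library(ggplot2)

# Hypothetical toy data, invented purely for illustration (not course data).
raw <- tibble(
  sample = c("S1", "S2", "S3", "S4", "S5", "S6"),
  group  = c("control", "control", "control", "treated", "treated", "treated"),
  gene_a = c(5.1, 4.8, NA, 7.2, 6.9, 7.5),
  gene_b = c(2.0, 2.3, 2.1, NA, 1.2, 1.0)
)
input_file <- tempfile(fileext = ".csv")
write_csv(raw, input_file)

# Read in the data set: two character columns followed by two numeric columns.
measurements <- read_csv(input_file, col_types = "ccdd")

# Restructure into a tidy format (one row per sample and gene)
# and handle missing values by dropping rows without a measurement.
tidy_data <- measurements %>%
  pivot_longer(cols = c(gene_a, gene_b), names_to = "gene", values_to = "expression") %>%
  drop_na(expression)

# Select and filter a subset of interest.
gene_a_data <- tidy_data %>%
  filter(gene == "gene_a") %>%
  select(sample, group, expression)

# Create a plot; run print(expression_plot) to draw it.
expression_plot <- ggplot(gene_a_data, aes(x = group, y = expression)) +
  geom_boxplot()

# Generate summary statistics for each group and gene.
summary_stats <- tidy_data %>%
  group_by(group, gene) %>%
  summarise(
    mean_expression = mean(expression),
    n = n(),
    .groups = "drop"
  )

# Save the tidy data as a CSV file for later analysis.
output_file <- tempfile(fileext = ".csv")
write_csv(tidy_data, output_file)
```

Dropping incomplete rows is only one way of handling missing values; depending on the analysis, keeping them and passing `na.rm = TRUE` to summary functions such as `mean()` may be more appropriate.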