forked from statOmics/PSLS
-
Notifications
You must be signed in to change notification settings - Fork 0
/
04_exercises.Rmd
129 lines (84 loc) · 3.8 KB
/
04_exercises.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
---
title: "Exercises on chapter 4: Data exploration"
author: "Lieven Clement, Jeroen Gilis and Milan Malfait"
date: "statOmics, Ghent University (https://statomics.github.io)"
code_download: false
---
In this hands-on exercise session you will perform the data exploration for 4 different studies:
* [NHANES example]
* [Pertussis example]
* [Diabetes example]
* [FEV example]
---
# NHANES example
The National Health and Nutrition Examination Survey (NHANES) contains data
that has been collected since 1960. For this exercise, we will make use of the
data that was collected between 2009 and 2012, for 10.000 U.S. civilians.
The dataset contains a large number of physical, demographic, nutritional and
life-style-related parameters.
The first step in a data analysis is data exploration.
In this exercise, you will learn howto:
- import data into R
- tidy and wrangle data
- explore and visualize data
by following and interpreting the code of a worked out example.
- Preliminary: [Preliminary](./extra1_preliminary_tidyverse.html)
- Exercise: [Exercise1](./04_1_NHANES.html)
- Data path:
`https://raw.githubusercontent.com/statOmics/PSLSData/main/NHANES.csv`
---
# Pertussis example
Researchers wanted to study the immune response on pertussis.
They have set up an experiments with 40 rats.
16 rats were infected with pertussis and 24 rats received a control treatment.
Researchers measured the white blood cell concentration (WBC) in each rat (count per mm$^3$.
The data consist of two variables:
- WBC: white blood cell count (counts/mm$^3$).
- trt: treatment
- control: rat recieved control treatment
- pertussis: rat was infected with pertussis
Upon this exercise you can
- implement a good data exploration for a two group comparison in R and
- interpret the plots and results.
Files
- Exercise: [Exercise2](./04_2_pertussis.html)
- Data path:
`https://raw.githubusercontent.com/statOmics/PSLSData/main/wbcon.csv`
- Solution: [Solution2](./04_2_pertussis_sol.html)
---
# Diabetes example
The diabetes dataset consists of a small experiment with 8 patients
that were subjected to a glucose tolerance test.
Patients had to fast for eight hours before the test.
When the patients entered the hospital their baseline glucose level was measured (mmol/l).
Patients then had to drink 250 ml of a syrupy glucose solution containing 100 grams of sugar.
Two hours later, their blood glucose level was measured again.
The data consist of three variables:
- before: glucose concentration upon 8 hours of fasting (mmol/l)
- after: glucose concentration 2 hours after drinking glucose solution (mmol/l).
- patient: identifier for the patient
In this exercise, you will acquire the skills to
- recognize paired data
- conduct a data exploration in R for data from
paired experimental designs.
- interpret the results of a data exploration for paired experimental designs
Files:
- Exercise: [Exercise3](./04_3_diabetes.html)
- Data path:
`https://raw.githubusercontent.com/statOmics/PSLSData/main/diabetes.txt`
- Solution: [Solution3](./04_3_diabetes_sol.html)
---
# FEV example
The forced expiratory volume (FEV) is a measure of how
much air a person can exhale (in liters) during a forced breath. In this
dataset, the FEV of 606 children, between the ages of 6 and 17, were measured.
The dataset also provides additional information on these children:
their `age`, their `height`, their `gender` and, most importantly, whether the
child is a smoker or a non-smoker. The goal of this experiment was to find out
if smoking has an effect on the FEV of children.
In this exercise, you will learn how plots can help you to discover confounding in a real datasets.
- Exercise: [Exercise4](./04_4_FEV.html)
- Data path:
`https://raw.githubusercontent.com/statOmics/PSLSData/main/fev.txt`
- Solution: [Solution4](./04_4_FEV_sol.html)
<br/>