forked from statOmics/PSLS
-
Notifications
You must be signed in to change notification settings - Fork 0
/
08_6_rats.Rmd
170 lines (111 loc) · 4.32 KB
/
08_6_rats.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
---
title: "Exercise 8.6: Blocking on the rat diet dataset (2)"
author: "Lieven Clement and Jeroen Gilis"
date: "statOmics, Ghent University (https://statomics.github.io)"
---
# Background
Researchers are studying the impact of protein sources and protein levels in
the diet on the weight of rats. They feed the rats with diets of beef, cereal
and pork and use a low and high protein level for each diet type.
The researchers can include 60 rats in the experiment. Prior to the experiment,
the rats were divided in 10 homogeneous groups of 6 rats based on
characteristics such as initial weight, appetite, etc.
Within each group a rat is randomly assigned to a diet. The rats are fed during
a month and the weight gain in grams is recorded for each rat.
The researchers want to assess the effect of the type of diet and the protein
level on the weight of the rats.
**In contrast to the previous exercise, we perform the analysis for all three**
**diets in the dataset.**
# Experimental design
- There are three explanatory variables in the experiment: the factor diet type
with two levels (beef and cereal), factor protein level with levels
(low and high) and a group blocking factor with 10 levels.
- There are 6 treatments: beef-high, cereal-high, pork-high, beef-low,
cereal-low, pork-low protein.
- The rats are the experimental and observational units.
- The weight gain is the response variable.
- The experiment is a randomized complete block (RCB) design
Load libraries
```{r, message=FALSE, warning=FALSE}
library(tidyverse)
```
# Data import
```{r}
diet <- read.table("https://raw.githubusercontent.com/statOmics/PSLSData/main/dietRats.txt",
header = TRUE
)
head(diet)
```
# Tidy data
```{r}
diet <- diet %>%
mutate(
block = as.factor(block),
protSource = as.factor(protSource),
protLevel = as.factor(protLevel)
) %>%
mutate(protLevel = fct_relevel(protLevel, "l"))
```
# Data exploration
- Boxplot of the weight gain against protein source, protein level with coloring according to block
```{r}
```
- Lineplot of the weight gain against protein source, protein level with coloring and grouping according to block
```{r}
```
- Interpret the plots
# Multivariate linear regression analysis
## Model specification
Based on the data exploration, propose a sensible regression model to analyse
the data.
## Assumptions
Check the assumptions of the linear regression model
## Hypothesis testing
Use the `summary` function to get an initial test for the parameters in the
model.
## Interpretation of the regression parameters
To facilitate the interpretation of the different parameters in our regression
model, we can make use of the `VisualizeDesign` function of the
`ExploreModelMatrix` R package. The first argument of this function is the name
of our target dataset, the second argument is a model formula, which in this
case is specified as a `~` followed by the explanatory variables in our model.
```{r, eval = FALSE}
library(ExploreModelMatrix)
ExploreModelMatrix::VisualizeDesign(..., ~...)$plotlist
```
After seeing this, again think about the meaning of the parameters in our model.
## Testing the overall (omnibus) effect of diet
By comparing a model containing diet effects to a model that does not have
diet effects, using anova.
```{r}
```
## Assessing the interaction effect between protein source and protein level
```{r}
```
## Assessing specific contrasts
Imagine that we are interested in assessing if there is an effect of
1. protein source in the low protein diets
- cereal versus beef
- pork verus beef
- pork versus cereal
2. protein source in high protein diets
- cereal versus beef
- pork verus beef
- pork versus cereal
3. protein level (high versus low) for
- beef diets
- cereal diets
- pork diets
4. If the effect of the protein level differs between
- beef and cereal
- beef and pork
- cereal and pork diets
These effects of interest are so-called
**contrasts, i.e. linear combinations of the parameters**.
Step 1: translate these research questions into parameters of the model (or
combinations of multiple parameters).
Step 2: Assess the significance of the contrasts using the `multcomp` package.
The contrasts are given as input in the form of symbolic
descriptions to the `linfct` argument of the `glht` function.
# Conclusion
Formulate a conclusion for the different research hypotheses.