# Effect size {#sec-effect-size}
```{r include=FALSE}
source("../_startup.R")
set_chapter(22)
```
Starting with this Lesson, we focus on issues surrounding the selection of explanatory variables. Up to now, we've taken a somewhat casual view, pointing out that models with multiple explanatory variables can be made. In discussing prediction models (Lesson [-@sec-predictions]) we went so far as to say that any set of explanatory (predictor) variables is valid so long as it leads to a good prediction.
This and the next several Lessons deal with statistical modeling methods that support **intervening** in a system. Such interventions occur on both grand scales and small: changes in government policies such as funding for preschool education or subsidies for renewable energy, closing a road to redirect traffic or opening a new highway or bus line, changing the minimum wage, etc. Before making such interventions, it is wise to know what the consequences are likely to be. Figuring this out is often a matter of understanding how the system works: what causes what. Since interventions often affect many individuals, the goal may be to influence the overall trend across individuals rather than to predict how any one individual will be affected.
::: {.callout-note}
## Intervention versus prediction
The statistical thinker distinguishes between settings that call for a predictive model and settings that are fundamentally about the consequences of intervening in a system. In a predictive setting, we're interested in the outcome from a system that we do not plan to change fundamentally.
The need for predictions arises in both mundane and critical settings. For instance, an airline needs to consider the demand for a particular route so that it can plan how many aircraft, and of what type, will serve the route. To set the route schedule, the airline needs a prediction of demand, which may vary with the day of the week, time of day, season of the year, and special events (such as a massive convention or a solar eclipse). Another example: merchants and social media sites must choose which products or posts to display to a viewer. Merchants have many products, and social media has many news feeds, tweets, and competing blog entries. The people who manage these websites want to promote the products or postings most likely to prompt a viewer to respond. To identify viable products or postings, the site managers construct predictive models based on earlier viewers' choices.
In an intervention setting, there is a plan to change how the system works. Such changes can take many forms. For instance, an airline needs to set prices and the availability of different seat classes. Demand will depend on price but also on other factors, such as the requirement to make a connection through a third airport and the competition's prices. A useful model of demand versus price will include such other factors; those factors play a *causal role* in influencing demand.
:::
## Effect size: Input to output
This Lesson focuses on "**effect size**," a measure of how changing an explanatory variable will play out in the response variable. Built into the previous sentence is an assumption that the explanatory variable *causes* the response variable. In this Lesson we focus on the calculation and interpretation of effect size. Lessons [-@sec-DAGs] through [-@sec-experiment] take a detailed look at how to make responsible claims about whether a connection between variables is causal.
An intervention changes something in the world. Some examples are the budget for a program, the dose of a medicine, or the fuel flow into an engine. The thing being changed is the *input*. In response, something else in the world changes, for instance, the reading ability of students, the patient's serotonin levels (a neurotransmitter), or the power output from the engine. The thing that changes in response to the change in input is called the "output."
"**Effect size**" describes the change in the output with respect to the change in the input. We will focus here on quantitative output variables. (For categorical output variables, the methods concerning "risk" presented in Lesson [-@sec-risk] are appropriate.)
An effect size (with a quantitative output variable) takes two different forms, depending on whether the explanatory variable is quantitative or categorical. We write "the explanatory variable" because effect sizes concern the response to **changes** in a single explanatory variable, even though there may be others in the model.
### Effect size for *quantitative* explanatory variable {#sec-effect-calc}
When the explanatory variable is *quantitative*, the effect size is a **rate**. Rates are always *ratios*: the change in output divided by the change in input that caused the output to change. For instance, in the Scottish hill racing setting considered in Lesson [-@sec-R-squared] we modeled running `time` as a function of race `distance` and `climb`. Such a model will involve two effect sizes: the change in running time per unit change in distance; and the change in running time per unit change in climb.
Effect sizes typically have units. These will be the unit of the output variable divided by the unit of the explanatory variable. In the effect size of `time` with respect to `distance`, the effect-size unit will be seconds-per-kilometer. On the other hand, the effect size of `time` with respect to `climb` will have units seconds-per-meter.
Here is one way to calculate an effect size: change a single input by a known amount, measure the corresponding change in output, and take the ratio. For example:
```{r label='430-Effect-size-eI7B4e'}
race_mod <- Hill_racing |> model_train(time ~ distance + climb)
```
To calculate the effect size on `time` with respect to `distance`, we evaluate the model for two different distances, keeping `climb` at the same level for both distances.
```{r label='430-Effect-size-wJhbMh', results=asis()}
race_mod |> model_eval(distance = c(5, 10), climb = 500)
```
The output changed from 2104 seconds to 3373 seconds in response to changing the value of `distance` from 5 km to 10 km. The effect size is therefore
$$\frac{3373 - 2104}{10 - 5} = \frac{1269}{5} = 253.8\ \text{s/km}$$
To calculate the effect size on `time` with respect to `climb`, a similar calculation is done, but holding `distance` constant and using two different levels of `climb`:
```{r label='430-Effect-size-Lig3VO', results=asis()}
race_mod |> model_eval(distance = 10, climb = c(500,600))
```
The effect size is:
$$\frac{3634 - 3373}{100} = \frac{261}{100} = 2.6\ \text{s/m}$$
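The same arithmetic can be scripted. Here is a brief sketch; it assumes, for illustration, that `model_eval()` returns the model output in a column named `.output`, so check the column names in the actual output above before relying on it.
```{r}
# Sketch: effect sizes as ratios of changes, computed from model_eval() output.
# Assumes the model output is in a column named ".output" (verify above).
dist_evals <- race_mod |> model_eval(distance = c(5, 10), climb = 500)
diff(dist_evals$.output) / diff(c(5, 10))       # about 254 s/km
climb_evals <- race_mod |> model_eval(distance = 10, climb = c(500, 600))
diff(climb_evals$.output) / diff(c(500, 600))   # about 2.6 s/m
```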
To see how effect sizes might be used in practice, put yourself in the position of a member of a committee establishing a new race. The new race will have a distance of 17 km and a climb of 600 m. The anticipated winning time in the new race will be a matter of *prediction*:
```{r label='430-Effect-size-klARxv', results=asis()}
race_mod |> model_eval(distance = 17, climb = 600)
```
Note how broad the prediction interval is: from about one hour up to two hours.
Debate ensues. One faction on the committee wants to shorten the race to 15 km and 500 m climb. How much will this lower the winning time?
On its own, the -2 km change in the race `distance` will change the winning time by approximately
$$-2\ \text{km} \times 253.8\ \text{s/km} = -508\ \text{s}$$
where $253.8\ \text{s/km}$ is the effect size we calculated earlier.
The previous calculation did not consider the proposed reduction in `climb` from 600 m to 500 m. On its own, the -100 m change in race `climb` will also shorten the winning time:
$$ -100\ \text{m} \times 2.6\ \text{s/m} = -260\ \text{s}$$
Each of these two calculations of change in output looks at only a single explanatory variable, not both simultaneously. To calculate the overall change in race `time` when both `distance` and `climb` are changed, add the two changes associated with the variables separately. Thus, the overall change of the winning time will be $$(-508\ \text{s}) + (-260\ \text{s}) = -768\ \text{s} .$$
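As a quick check, the combined change can be computed directly from the two effect sizes calculated earlier:
```{r}
# Combining the two single-variable changes additively
dist_effect  <- 253.8   # s/km, from the earlier calculation
climb_effect <- 2.6     # s/m
(-2) * dist_effect + (-100) * climb_effect   # about -768 s
```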
::: {.callout-note}
## Comparing predictions?
The predicted winning race time for inputs of 17 km `distance` and 600 m `climb` was [3700 to 7100] seconds. What if we make a second prediction with the proposed changes in `distance` and `climb`, and subtract the two predictions?
```{r label='430-Effect-size-xaNcXu', results=asis()}
race_mod |> model_eval(distance = 15, climb = 500)
```
The shorter race has a predicted winning time of [2900 to 6400] seconds.
Question: How do you subtract one *interval* from another? Should we look at the worst-case difference: [(2900 - 7100) to (6400 - 3700)], that is, [-4200 to 2700] seconds? Or perhaps we should construct the difference as the change between the lower ends of the two prediction intervals up to the change between the upper ends? That would be [(2900 - 3700) to (6400 - 7100)], that is, [-800 to -700] s.
A good perspective on this question of the difference between intervals is based on the distinction between the part of `time` that is *explained* by `distance` and `climb`, and the part of `time` that remains unexplained, perhaps due to weather conditions or the rockiness of the course. If the committee decides to change the course `distance` and `climb`, that will not have any effect on the weather or course rockiness; these factors will remain random noise. The lower end of each prediction interval reflects one extreme weather/rockiness condition; the upper end reflects another extreme of weather/rockiness. Apples and oranges. The *change* in race `time` due to `distance` and `climb` should properly be calculated at the same weather/rockiness conditions. Thus, the [-800 to -700] s estimate of the change in running `time` is more appropriate. The effect-size calculation does the apples-to-apples comparison.
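For concreteness, here is the arithmetic behind the two candidate subtractions:
```{r}
# Worst-case difference of the two prediction intervals
c(2900 - 7100, 6400 - 3700)    # [-4200, 2700] s
# Matched-endpoint difference (lower minus lower, upper minus upper)
c(2900 - 3700, 6400 - 7100)    # [-800, -700] s
```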
:::
### Effect size for *categorical* explanatory variable
When an explanatory variable is categorical, the change in input must always be from one level to another. For example, an airline demand model might involve a day-of-week variable with levels "weekday" and "weekend." To calculate the effect size on demand with respect to day-of-week, all you can do is measure the corresponding change in the model output when day-of-week is changed from "weekday" to "weekend." The effect size will simply be this change in output, not a rate. Calculating a rate would mean quantifying the change in input, but weekday-to-weekend is not a number.
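To make the calculation pattern concrete, here is a hedged sketch. It assumes, purely for illustration, that `Hill_racing` includes a categorical variable `sex` with levels `"F"` and `"M"`; substitute whatever categorical variable your data actually contain.
```{r results=asis()}
# Hypothetical sketch: effect size with respect to a categorical variable.
# Assumes Hill_racing has a categorical variable `sex` with levels "F" and "M".
sex_mod <- Hill_racing |> model_train(time ~ distance + climb + sex)
sex_mod |> model_eval(distance = 10, climb = 500, sex = c("F", "M"))
# The effect size is the difference between the two model outputs, not a rate:
# the change from "F" to "M" is not a number to divide by.
```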
## Model coefficients and effect size
For simplicity in these Lessons, we emphasize models where the explanatory variables contribute *additively*, as implicit in the use of `+` in model specifications like `time ~ distance + climb`. More generally, both additive and *multiplicative* contributions can be used in models. (Similarly, it's possible to use curvy transformations of variables.) In @sec-interactions we will investigate the uses of multiplicative contributions.
In models incorporating multiplicative or curvy contributions, effect size can be calculated using the `model_eval()`-based method described in @sec-effect-calc. But for models where explanatory variables contribute additively, there is an easy shortcut for calculating effect size: *the coefficient on each explanatory variable is the effect size for that variable*.
To illustrate, look at the coefficients on the `time ~ distance + climb` model:
```{r label='430-Effect-size-vqIzZY', results=asis()}
race_mod |> conf_interval()
```
The coefficients (the `.coef` column) on `distance` and on `climb` are the same as the effect sizes we calculated using the `model_eval()` method!
Moreover, for additive models, the *confidence interval* on the coefficient is also the confidence interval on the corresponding effect size. So, when in @sec-effect-calc we said the effect size of `distance` on `time` was 253.8 s/km, a better statement would have been the interval [246 to 261] s/km.
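As a sketch of how to pull out a single effect size and its confidence interval, the coefficient report can be filtered to one row. This assumes the coefficient report names its rows in a `term` column, as suggested by the display above; verify the column names in your own output.
```{r results=asis()}
# Sketch: the distance row of the coefficient report gives the effect size
# (.coef) and its confidence interval (.lwr to .upr). Assumes a "term" column.
race_mod |> conf_interval() |> filter(term == "distance")
```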
## Interactions {#sec-interactions}
The model `time ~ distance + climb` combines the explanatory variables additively. @fig-dist-climb-additive shows the "shape" of the model graphically in two different ways: with `distance` mapped to x and `climb` mapped to color (left panel) and with `climb` mapped to x and `distance` to color (right panel). The same model function is shown in both; just the presentation is different. In both panels, the model function appears as a set of **parallel** sloped lines. This is the hallmark of an additive model. (See @fig-SAT-expend-partic for another example.)
```{r warning=FALSE}
#| label: fig-dist-climb-additive
#| fig-cap: "Two views of the additive model `time ~ distance + climb`. The lines for different colors are parallel. "
#| fig-subcap:
#| - "`distance` mapped to x"
#| - "`climb` mapped to x"
#| layout-ncol: 2
#| column: page-right
Hill_racing |> filter(climb > 100) |>
  point_plot(time ~ distance + climb, annot = "model",
             model_ink = 1)
Hill_racing |> filter(climb > 200) |>
  point_plot(time ~ climb + distance, annot = "model",
             model_ink = 1)
```
The effect size of the variable being mapped to x appears as the slope of the lines. The effect size of the variable mapped to color appears as the vertical separation between lines. @fig-dist-climb-additive shows that the effect of `distance` and the effect of `climb` do not change when the other variable changes; the lines are parallel.
In contrast, @fig-dist-climb-multiplicative gives views of the multiplicative model `time ~ distance * climb`. In @fig-dist-climb-multiplicative, the spacing between the different colored lines is not constant; the lines fan out rather than being parallel.
```{r warning=FALSE}
#| label: fig-dist-climb-multiplicative
#| fig-cap: "Two views of the **multiplicative** model `time ~ distance * climb`. The lines fan out. "
#| fig-subcap:
#| - "`distance` mapped to x"
#| - "`climb` mapped to x"
#| layout-ncol: 2
#| column: page-right
Hill_racing |> filter(climb > 100) |>
  point_plot(time ~ distance * climb, annot = "model",
             model_ink = 1)
Hill_racing |> filter(climb > 200) |>
  point_plot(time ~ climb * distance, annot = "model",
             model_ink = 1)
```
Again, the effect size of the variable mapped to color appears as the vertical spacing between the different colored lines. Now, however, that vertical spacing changes as a function of the variable mapped to x. That is, the effect size of one explanatory variable depends on the other.
The model coefficients show the contrast between additive and multiplicative models. For the additive model, there is one coefficient for each explanatory variable. That variable's coefficient captures the effect size of the variable.
```{r label='430-Effect-size-3nwfET', results=asis()}
Hill_racing |> model_train(time ~ distance + climb) |> conf_interval()
```
For the multiplicative model, there is a third coefficient. The model summary reports this as `distance:climb`. Generically, it is called the "**interaction coefficient**." The interaction coefficient quantifies how the effect of each explanatory variable depends on the other.
```{r label='430-Effect-size-5V8vzC', results=asis()}
Hill_racing |> model_train(time ~ distance * climb) |> conf_interval()
```
You can't read the effect size for an explanatory variable from a single coefficient. Instead, arithmetic is required. For instance, the effect size of `distance` is not just the quantity reported as the `.coef` on `distance`, 224 s/km. Instead, the effect size of `distance` is a function of `climb`:
$$\text{Effect size of }\mathtt{distance}: 224 + 0.044 \times \mathtt{climb}$$
$$\text{Effect size of }\mathtt{climb}: 1.78 + 0.044 \times \mathtt{distance}$$
Each of the above formulas is for an effect size: how the model output changes when the corresponding explanatory variable changes. In contrast, the model function gives `time` as a function of `distance` and `climb`. The model function is:
$$\text{Model function: }\texttt{time(distance, climb)} = \\-59.2 + 224 \times \texttt{distance} + 1.78 \times \texttt{climb} + 0.044 \times \texttt{distance} \times \texttt{climb}$$
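As a rough numerical check, the formula for the effect size of `distance` can be compared to a `model_eval()`-based calculation on the interaction model. This is a sketch using the rounded coefficients displayed above and assuming, as before, that the model output appears in a `.output` column.
```{r}
# Effect size of distance at climb = 500 m, from the rounded coefficients
224 + 0.044 * 500                       # about 246 s/km
# The same quantity estimated by evaluating the interaction model at two
# distances with climb held fixed (assumes a ".output" column; verify)
interact_mod <- Hill_racing |> model_train(time ~ distance * climb)
evals <- interact_mod |> model_eval(distance = c(5, 10), climb = 500)
diff(evals$.output) / diff(c(5, 10))
```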
::: {.callout-note}
## For the reader who has already studied calculus:
The effect sizes are the partial derivatives of the model function. The interaction coefficient is the "mixed partial derivative" of the function with respect to both `distance` and `climb`.
$$\text{Effect size of }\mathtt{distance}: \frac{\partial\ \texttt{time}}{\partial\ \texttt{distance}}$$
$$\text{Effect size of }\mathtt{climb}: \frac{\partial\ \texttt{time}}{\partial\ \texttt{climb}}$$
:::