---
title: "Lord's Paradox"
author: "Michael Clark"
date: "`r Sys.Date()`"
output:
  html_document:
    toc_float: yes
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(message=F, cache=FALSE, comment=NA, warning=F, echo=F)
```
The following is a summary of Pearl's 2014 and 2013 technical reports on some modeling situations that lead to surprising results, results that are initially at odds with our intuition.
# Lord's Paradox
## Background
Two statisticians are interested in determining whether there are differences between males and females in weight gain over the course of a semester. In Lord's original depiction there is an implication of some diet treatment, but all that can be assumed is that both groups received the 'treatment', if there was one. The two statisticians take different approaches to examining the data and come to different conclusions.
### Variables
- sex
- weight time 1
- weight time 2
- weight change
### Issues
The following graph is from Pearl 2014. The ellipses represent the scatterplots for boys and girls. The diagonal 45^o^ line represents no change from time 1 to time 2. The centers of both ellipses lie on this line, so the mean change for boys and for girls is identical and zero. The density plot in the lower left depicts the distribution of change scores centered on this zero estimate.
<img src="img/lp1.png" style="display:block; margin: 0 auto;">
## Results
Statistician 1 focuses on change scores, while statistician 2 uses an ANCOVA to examine sex differences at time 2 while adjusting for initial weight.
- t-test for group difference on change score vs. ANCOVA approach
- Statistician 1 concludes no difference in *change*
- Statistician 2 concludes a difference at time 2 when controlling for time 1
### DAG
The model can be depicted as a directed acyclic graph as follows.
```{r dag1}
library(DiagrammeR)
grViz("
digraph dag1 {
# a 'graph' statement
graph [fontsize = 10, layout=circo] #rankdir ignored for circo
# several 'node' statements
Change[shape = doublecircle, fontname = Helvetica, fontcolor='gray50', fillcolor='gray95', width=1, penwidth=0.2];
node [shape = box, fontname = Helvetica, fontcolor='gray50', style=filled, penwidth=0]
Sex[color=lightsalmon];
Initial[color=navajowhite];
Final[color=navajowhite];
# edge statements
Sex -> Initial[label='a' fontcolor='gray25' color='dodgerblue']
Sex->Final [label='b' fontcolor='gray25' color='darkred']
Initial->Final [label='c' fontcolor='gray25' color='dodgerblue']
Initial->Change [label='-1' color='gray75' fontcolor='gray25' ]
Final->Change [label='+1' color='gray75' fontcolor='gray25' ]
}
")
```
From the graph above we can define the following effects of sex, either on final weight directly or on change through its effects on initial and final weight.
- <span style="color:#1e90ff">**Indirect effect**</span>: a\*c
- <span style="color:#8b0000">**Direct effect**</span>: b
- <span style="color:#66023C">**Total effect**</span>: (a\*-1) + (a\*c\*1) + (b\*1)
Gain is completely determined as the difference between final and initial weight, so its direct effects from initial and final weight are not estimated but fixed to -1 and +1 respectively. To calculate the total effect of sex on weight gain, we sum all the paths from sex to change: sex → final → change, sex → initial → change, and sex → initial → final → change.
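To make the tracing arithmetic concrete, here is a minimal sketch with made-up path values (the numbers for a, b, and c below are arbitrary and chosen only for illustration; they are not estimates from any data).
```{r tracingSketch, echo=TRUE}
# hypothetical path values, chosen only for illustration
a = .75  # Sex -> Initial
b = .50  # Sex -> Final
c = .40  # Initial -> Final  (masking base::c is harmless; function calls still resolve)

c(indirect = a*c,                          # Sex -> Initial -> Final
  direct   = b,                            # Sex -> Final
  total    = (a*-1) + (a*c*1) + (b*1))     # sum of all paths from Sex to Change
```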
### Who is *correct*?
- Both statisticians are correct
- t-test on change = total effect
- ANCOVA = direct effect
In summary, the two statisticians are focused on different effects from the same model: one the total effect, the other the direct effect.
## Data Example
We can get an explicit sense of the results with a hands-on example. In the following we simulate data that reproduces the situation described thus far; parameters were chosen for visual and statistical effect. One thing not yet noted is that the scenario would be unlikely to occur in practice: the total effect would typically be positive, since there would be strong sex differences in initial weight and a strong positive correlation between initial and final weight, rather than values arranged so as to nullify the effect. Another issue not addressed is that the entire question of whether an effect exists hinges on p-values, and with large enough data and such simple models, practically negligible effects can be flagged as significant. Other general modeling issues are ignored as well; the goal here is conceptual simplicity.
```{r dataSetup, echo=1:8}
set.seed(1234)
N = 200
group = rep(c(0, 1), e=N/2)
initial = .75*group + rnorm(N, sd=.25)
final = .4*initial + .5*group + rnorm(N, sd=.1)
change = final-initial
df = data.frame(id=factor(1:N), group=factor(group, labels=c('Female', 'Male')), initial, final, change)
head(df)
library(dplyr)
dflong = tidyr::gather(df, key=time, value=score, initial:final) %>% arrange(id)
head(dflong)
```
```{r plotData}
library(ggplot2); library(plotly); library(dplyr)
# plot_ly(filter(dflong, group=='Female'), x=time, y=score, group=id, mode='line', showlegend=F, line=list(color='#ff5500')) %>%
# add_trace(data=filter(dflong, group=='Male'), x=time, y=score, group=id, showlegend=F, line=list(color='dodgerblue'))
coefm = coef(lm(final~initial, filter(df, group=='Male')))
coeff = coef(lm(final~initial, filter(df, group=='Female')))
g = ggplot(aes(x=initial, y=final), data=df) +
geom_abline(intercept=0, slope=1, color='gray50', alpha=.5) +
geom_point(aes(color=group), alpha=.5) +
stat_ellipse(aes(color=group), level=.999) +
# geom_smooth(aes(color=group), method='lm', se=F) +
scale_color_manual(values=c('#ff5503', 'dodgerblue')) +
geom_abline(intercept=coefm[1], slope=coefm[2], color='dodgerblue') +
geom_abline(intercept=coeff[1], slope=coeff[2], color='#ff5503') +
lazerhawk::theme_trueMinimal()
ggplotly(g)
```
<br>
In the following we'll use lavaan to estimate the full mediation model, then run separate regressions to demonstrate the t-test on change vs. the ANCOVA approach. For the mediation model, we only need to estimate the relevant effects on initial and final weight. As noted above, the t-test on change score measures the total effect of sex, while the ANCOVA measures the direct effect on final weight. It is unnecessary to distinguish them as separate modeling approaches, as they are merely standard regressions with different target variables.
```{r runModels, echo=T}
mod = "
initial ~ a*group
final ~ b*group + c*initial
# change ~ -1*initial + 1*final (implied)
# total effect
TE := (a*-1) + (a*c*1) + (b*1) # using tracing rules
"
library(lavaan)
lpmod = sem(mod, data=df)
summary(lpmod)
summary(lm(change ~ group, df)) # t-test on change scores = total effect
summary(lm(final ~ group + initial, df)) # 'ancova' uncovers direct effect etc.
```
### Aside
We can model the change score while adjusting for initial weight (and generally we should). Note that the coefficient for initial weight, `r round(coef(lm(change~group+initial, df))[3],2)`, is equivalent to the ANCOVA coefficient (`r round(coef(lm(final~group+initial, df))[3],2)`) minus 1. One way to think about this is just as we have been, but focusing on initial weight rather than sex: its indirect effect on change through final weight is its coefficient (path **c**) times +1, while its total effect adds the fixed direct path of -1, i.e. c - 1.
The change score result duplicates the ANCOVA result for the group effect. In fact, all covariate coefficients are identical whether the outcome is final weight or weight gain, as long as the baseline value is controlled for; they are the direct effects from the final-weight model, multiplied by the +1 path from final weight to change. The sketch after the next code chunk makes this explicit.
See, for example, Laird 1983.
```{r changewithAdjust}
summary(lm(change ~ group + initial, df))
```
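As a quick check on the equivalence just described, we can compare the two sets of coefficients directly, using the simulated `df` from above (a small sketch, not part of the original analysis).
```{r coefCompare, echo=TRUE}
# intercept and group coefficient are identical across the two models;
# only the initial-weight coefficient differs, by exactly -1
coef(lm(change ~ group + initial, df)) - coef(lm(final ~ group + initial, df))
```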
## Treatment with confounding
### Background
Wainer & Brown 2007 took a different interpretation of the paradox. Here we can think of a similar situation, but instead of sex differences we now have a group difference regarding whether one dines in a particular room[^wb].
### Variables
- weight time 1
- weight time 2
- weight change
- room A vs. B
### Issues
Visually we can depict the situation as before, now showing the group difference. The choice of Comic Sans in the graph is due to Wainer and Brown and should be held against them.
<img src="img/lp2.png" style="display:block; margin: 0 auto;">
- Heavier kids are more likely to sit at table B
- The two statisticians come to the same conclusions as before
### Results
The DAG makes clear the difference in the model compared to the previous scenario.
```{r dag2}
grViz("
digraph dag2 {
# a 'graph' statement
graph [fontsize = 10, layout=circo] #rankdir ignored for circo
# several 'node' statements
Change[shape = doublecircle, fontname = Helvetica, fontcolor='gray50', fillcolor='gray95', width=1, penwidth=0.2];
node [shape = box, fontname = Helvetica, fontcolor='gray50', style=filled, penwidth=0]
Group[color=lightsalmon];
Initial[color=navajowhite];
Final[color=navajowhite];
# edge statements
Initial -> Group[label='a' fontcolor='gray25' color='dodgerblue']
Group->Final [label='b' fontcolor='gray25' color='darkred']
Initial->Final [label='c' fontcolor='gray25' color='dodgerblue']
Initial->Change [label='-1' color='gray85' fontcolor='gray25' ]
Final->Change [label='+1' color='gray85' fontcolor='gray25' ]
}
")
```
<br>
- Weight time 1 is now a confounder
- Arrow *from* time 1 to 'treatment'
- Statistician 1 concludes no change
- Statistician 2 concludes a difference (seen next)
<img src="img/lp2_afteradj.png" style="display:block; margin: 0 auto;">
While Wainer and Brown again suggest that both statisticians are correct, Pearl disagrees. Statistician 1 is incorrect because they do not adjust for the confounder, which is necessary to determine causal effects.
Note that both paradox scenarios presented assume no latent confounders. If present then both statisticians are potentially wrong in both cases. As depicted however, it was not the case that two legitimate methods gave two different answers to the same research question, as Lord concluded originally.
## Birth Weight Paradox
### Background
The problem discussed thus far extends beyond adjusting for baseline scores to adjusting for any covariate, including cases where a focus on change scores isn't even possible[^lord].
Here we are concerned with the relationship of birth weight and infant mortality rate. In general, low birth weight is associated with higher likelihood of death. The paradox arises from the fact that low birth weight children born to smoking mothers have a lower mortality rate.
### Variables
- birth weight
- smoking mom
- infant mortality
- other causes
### Issues
- No difference score
- Before, focus on clash between two seemingly legitimate methods of analysis
- Now using a single standard regression approach but results seem implausible
### Results
- Low birth weight children have a higher mortality rate (roughly 100-fold higher)
- Children of smoking mothers are notably more likely to have low birth weight
- Yet low birth weight children born to smoking mothers have a *lower* mortality rate
- Conclusion: expectant mothers should start smoking?!
### Explanation
#### Collider bias (explain away effect)
The DAG for this situation is depicted as follows. Smoking does have an effect on birth weight and infant mortality, but so do a host of other variables, at least some of which are far more detrimental.
```{r dagBirthweight}
grViz("
digraph dagbw {
# a 'graph' statement
graph [fontsize = 10, layout=circo] #rankdir ignored for circo
# several 'node' statements
node [shape = box, fontname = Helvetica, fontcolor='gray50', style=filled, penwidth=0]
Other[color=navajowhite];
BW[color=lightsalmon];
Smoking[color=navajowhite];
Death[color=navajowhite];
# edge statements
edge [color='gray50']
Smoking->BW Other->BW
BW->Death Smoking->Death Other->Death
}
")
```
<br>
Pearl explains the result from two perspectives.
#### Perspective 1
What is the causal effect of birth weight on death?
- Birth weight is confounded by smoking and other causes
- Controlling just for smoking leaves other causes, resulting in bias
- In addition, because BW is a collider, within any stratum of BW the smoking status changes the probability of other causes
    - Example: for BW = 'low', comparing smoking vs. non-smoking mothers also compares a situation where other causes are rare to one where other causes are likely, leading to the paradoxical conclusion
#### Perspective 2
What is the causal effect of smoking on death?
Another perspective comes from the point of view of Lord's paradox. Here we are concerned with the effect of smoking on mortality above and beyond its effect through birth weight (i.e. the mediation context from before). Unlike before (or at least what was assumed before), here we have other confounders.
In this case, adjusting for birth weight doesn't sever all paths through the mediator; it actually opens up a new path, and the estimated effect becomes spurious.
```{r dagbw2, fig.align='center'}
grViz("
digraph dagbw2 {
# a 'graph' statement
graph [fontsize = 10, layout=circo] #rankdir ignored for circo
# several 'node' statements
node [shape = box, fontname = Helvetica, fontcolor='gray50', style=filled, penwidth=0]
Other[color=navajowhite];
Death[color=navajowhite];
BW[color=lightsalmon];
Smoking[color=navajowhite];
# edge statements
edge [color='gray50']
Smoking->BW Other->BW
Other->Death
}
", height=200)
```
<br>
Essentially we end up in the same situation. Conditioning on birth weight = 'low' does not physically hold birth weight fixed; comparing smoking vs. non-smoking within that stratum amounts to comparing infants with no other causes to those with other causes.
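A small simulation can make the collider mechanism tangible. The sketch below uses invented parameters in which smoking modestly raises mortality and 'other causes' raise it severely; marginally, infants of smokers die more often, yet within the low-birth-weight stratum smoking appears protective.
```{r colliderSketch, echo=TRUE}
# collider-bias sketch; all parameters are invented for illustration
set.seed(42)
n = 1e5
smoke = rbinom(n, 1, .3)
other = rbinom(n, 1, .05)                               # rare but severe causes
lowbw = rbinom(n, 1, plogis(-2 + 1.5*smoke + 3*other))  # both cause low birth weight
death = rbinom(n, 1, plogis(-4 + .5*smoke + lowbw + 3*other))

# overall, smoking is associated with higher mortality
tapply(death, smoke, mean)
# but among low-birth-weight infants, smokers appear to fare better
tapply(death[lowbw == 1], smoke[lowbw == 1], mean)
```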
# Simpson's Paradox
## Description
Simpson's paradox refers to a general phenomenon in which results reverse relative to what is expected. Lord's paradox can be seen as a special case; having gone through the details of that particular aspect, we can describe Simpson's paradox with a simple example.
Consider a treatment given to males and females with the following success rates:
```{r simpsonExampleFreq}
# vals = data.frame(Sex=c('Male','Female'), Control=c('234/270','55/80'), Treamtent=c('81/87','192/263'))
# vals2 = data.frame(Sex=c('Male','Female'),
# Control=round(c(234/270,55/80),2),
# Treamtent=round(c(81/87,192/263), 2))
vals =data.frame(Sex=c('Male','Female'),
Control=c('23/27','5/8'),
Treatment=c('8/9','19/26'))
vals2 = data.frame(Sex=c('Male','Female'),
Control=round(c(23/27,5/8),2),
Treatment=round(c(8/9,19/26), 2))
htmlTable::htmlTable(vals, rnames=F,
css.table='margin-left:auto; margin-right:auto; border:none;width:50%')
```
<br>
```{r simpsonExampleProps}
htmlTable::htmlTable(vals2, rnames=F,
css.table='margin-left:auto; margin-right:auto; border:none;width:50%')
```
<br>
And what are the total results across males and females?
<br>
```{r simpsonExampleAggregate}
vals = data.frame(Sex=c('All', ''),
Control=c('28/35','80%'),
Treatment=c('27/35','77%'))
htmlTable::htmlTable(vals, rnames=F,
css.table='margin-left:auto; margin-right:auto; border:none;width:50%')
```
<br>
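The reversal is plain arithmetic, which we can verify directly (a quick check using the same counts as the tables above).
```{r simpsonCheck, echo=TRUE}
# within-sex success rates: treatment is better for both males and females
round(rbind(Control   = c(Male = 23/27, Female = 5/8),
            Treatment = c(Male = 8/9,  Female = 19/26)), 2)

# aggregated over sex: control now looks better
round(c(Control = (23 + 5)/(27 + 8), Treatment = (8 + 19)/(9 + 26)), 2)
```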
So we are back to our low birth weight issue.
Pearl notes three things are required for resolving such a paradox.
- The solution must explain why the results are seen to be surprising
- The solution must identify those cases where the paradox will arise
- The solution must provide a means for making a *correct* decision
## Surprise
The surprise is as just noted: we see the subgroup proportions, yet the aggregated proportions lead to a different conclusion. The 'paradox' isn't really a paradox, since the result is just arithmetic, but the surprise it invokes tempts us to treat it as one. Our intuition tells us that, for example, a drug can't be harmful to both men and women yet good for the population as a whole. That intuition is in fact correct, but statistically the reversal can appear if we aren't applying an appropriate model.
Pearl's sure thing theorem:
> An action A that increases the probability of event B in each subpopulation must also increase the probability of B in the whole population, *provided that the action does not change the distribution of the subpopulations*.
In other words, regardless of whether some variable Z is a confounder or not, and even if we don't have the correct causal structure, such a reversal should invoke suspicion rather than surprise. In the example above, simply having adequate amounts of data would likely be enough to rule out a reversal.
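Written out, the proviso is what licenses the middle equality below: if A does not change the distribution of the subpopulations ($P(z \mid A) = P(z)$) and A increases the probability of B within every subpopulation z, then

$$
P(B \mid A) \;=\; \sum_z P(B \mid A, z)\, P(z \mid A) \;=\; \sum_z P(B \mid A, z)\, P(z) \;>\; \sum_z P(B \mid z)\, P(z) \;=\; P(B).
$$

A reversal can therefore only occur when the grouping shifts the make-up of the subpopulations, which is exactly what happens in the tables above: the control group is mostly male (27 of 35) while the treatment group is mostly female (26 of 35).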
## Reversal
In the following graphical models we have some treatment X and some recovery outcome Y, along with an additional covariate Z. In some of the graphs we also have additional latent variable(s) L[^pearlL].
```{r set1}
g1 = "
digraph a {
# a 'graph' statement
graph [layout=circo, label='a', labelloc='t'] #rankdir ignored for circo
# several 'node' statements
node [width = 0, penwidth=0, fontname = Helvetica, fontcolor='gray50']
'Recovery (Y)'[color=navajowhite];
'Gender (Z)'[color=navajowhite];
'Treatment (X)'[color=lightsalmon] ;
# edge statements
edge [color='gray50']
'Treatment (X)'->'Recovery (Y)'
'Gender (Z)' -> 'Recovery (Y)'
'Gender (Z)' -> 'Treatment (X)'
}
"
g2 = "
digraph b {
# a 'graph' statement
graph [layout=circo, label='b', labelloc='t'] #rankdir ignored for circo
# several 'node' statements
node [width = 0, penwidth=0, fontname = Helvetica, fontcolor='gray50']
'Recovery (Y)'[color=navajowhite];
'Blood Pressure (Z)'[color=navajowhite];
'Treatment (X)'[color=lightsalmon] ;
# edge statements
edge [color='gray50']
'Treatment (X)'->'Recovery (Y)'
'Treatment (X)'->'Blood Pressure (Z)'
'Blood Pressure (Z)' -> 'Recovery (Y)'
}
"
g3 = "
digraph c {
# a 'graph' statement
graph [layout=dot, label='c', labelloc='t'] #rankdir ignored for circo
# several 'node' statements
node [width = 0, penwidth=0, fontname = Helvetica, fontcolor='gray50']
'Recovery (Y)'[color=navajowhite];
'Treatment (X)'[color=lightsalmon] ;
Z[color=navajowhite];
L1[color=navajowhite];
L2[color=navajowhite];
# edge statements
edge [color='gray50']
'Treatment (X)'->'Recovery (Y)'
L1 -> 'Treatment (X)'; L1->Z
L2 -> 'Recovery (Y)'; L2->Z
}
"
g4 = "
digraph d {
# a 'graph' statement
graph [layout=circo, label='d', labelloc='t'] #rankdir ignored for circo
# several 'node' statements
node [width = 0, penwidth=0, fontname = Helvetica, fontcolor='gray50']
Y[color=navajowhite];
X[color=lightsalmon] ;
Z[color=navajowhite];
L1[color=navajowhite];
# edge statements
edge [color='gray50']
X -> Y
Z ->Y
L1 -> X; L1->Z
}
"
```
<div style='text-align:center'> Set 1 </div>
<table align='center'>
<colgroup span="4"></colgroup>
<tr>
<td>`r grViz(g1, height=200, width=200)`</td>
<td>`r grViz(g2, height=200, width=200)`</td>
</tr>
<tr>
<td>`r grViz(g3, height=200, width=200)`</td>
<td>`r grViz(g4, height=200, width=200)`</td>
</tr>
</table>
All of the set 1 graphs are situations that might invite reversal, and in fact are observationally equivalent.
In the following graphs we could have reversal in a-c, but not d-f.
```{r set2}
g1 = "
digraph a {
# a 'graph' statement
graph [layout=circo, label='a', labelloc='t'] #rankdir ignored for circo
# several 'node' statements
node [width = 0, penwidth=0, fontname = Helvetica, fontcolor='gray50']
X; Y; L; Z;
# edge statements
edge [color='gray50']
X -> Y; X-> Z;
L -> Y; L-> Z;
}
"
g2 = "
digraph b {
# a 'graph' statement
graph [layout=circo, label='b', labelloc='t'] #rankdir ignored for circo
# several 'node' statements
node [width = 0, penwidth=0, fontname = Helvetica, fontcolor='gray50']
X; Y; L; Z;
# edge statements
edge [color='gray50']
X -> Y; Z-> X;
L -> Y; L-> X;
}
"
g3 = "
digraph c {
# a 'graph' statement
graph [layout=circo, label='c', labelloc='t'] #rankdir ignored for circo
# several 'node' statements
node [width = 0, penwidth=0, fontname = Helvetica, fontcolor='gray50']
X; Y; Z;
# edge statements
edge [color='gray50']
X -> Y; X -> Z;
Y -> Z;
}
"
g4 = "
digraph d {
# a 'graph' statement
graph [layout=circo, label='d', labelloc='t'] #rankdir ignored for circo
# several 'node' statements
node [width = 0, penwidth=0, fontname = Helvetica, fontcolor='gray50']
X; Y; Z;
# edge statements
edge [color='gray50']
X -> Y; Z -> X;
}
"
g5 = "
digraph e {
# a 'graph' statement
graph [layout=circo, label='e', labelloc='t'] #rankdir ignored for circo
# several 'node' statements
node [width = 0, penwidth=0, fontname = Helvetica, fontcolor='gray50']
X; Y; Z;
# edge statements
edge [color='gray50']
X -> Y; Z -> Y;
}
"
g6 = "
digraph f {
# a 'graph' statement
graph [layout=circo, label='f', labelloc='t'] #rankdir ignored for circo
# several 'node' statements
node [width = 0, penwidth=0, fontname = Helvetica, fontcolor='gray50']
X; Y; Z;
# edge statements
edge [color='gray50']
X -> Y; X -> Z;
}
"
```
<div style='text-align:center'> Set 2</div>
<table align='center'>
<colgroup span="3"></colgroup>
<tr>
<td>`r grViz(g1, height=200, width=200)`</td>
<td>`r grViz(g2, height=200, width=200)`</td>
<td>`r grViz(g3, height=200, width=200)`</td>
</tr>
<tr>
<td>`r grViz(g4, height=200, width=200)`</td>
<td>`r grViz(g5, height=200, width=200)`</td>
<td>`r grViz(g6, height=200, width=200)`</td>
</tr>
</table>
## Decision
Pearl suggests using the back-door criterion in order to help us make a decision, summarized as follows:
- Paths between X and Y are of two kinds, causal and spurious
- Causal paths can be traced as arrows from X to Y
- Spurious paths need to be blocked by conditioning
- All paths containing an arrow *into* X are spurious
- In the case of a singleton covariate Z, we must ensure that
    - Z is not a descendant of X
    - Z blocks every path that ends with an arrow into X
- Collider variables are a special case in which they block the path when they and all their descendants are *not* conditioned on.
This leads to the following conclusions (a programmatic check follows the list):
- In Set 1 we need to condition on Z in **a** and **d**: in **a** this blocks the back-door path $X \leftarrow Z \rightarrow Y$, and in **d** it blocks $X \leftarrow L_1 \rightarrow Z \rightarrow Y$. We would not condition on Z in **b** or **c**, because in **b** there are no back-door paths, and in **c** the back-door path contains Z as a collider and is already blocked so long as Z is not conditioned on.
- When conditioning on Z is required, the Z-specific (stratified) results carry the correct information. In other cases, e.g. Set 1 graph **c** and Set 2 graphs **a** and **c**, the aggregated results are correct, because the spurious paths through Z contain Z as a collider (e.g. $X \rightarrow Z \leftarrow Y$ in Set 2 **c**) and remain blocked as long as Z is not conditioned on.
- In some cases Z alone does not provide enough information to block every back-door path, as in Set 2 **b**.
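As an aside, this back-door reasoning can be checked programmatically. The sketch below uses the dagitty package (not used elsewhere in this document, and not evaluated here in case it is not installed) to find minimal adjustment sets for Set 1 graphs **a** and **c**.
```{r dagittySketch, echo=TRUE, eval=FALSE}
library(dagitty)

# Set 1, graph a: Z confounds X and Y, so we must adjust for it
g_a = dagitty("dag { Z -> X ; Z -> Y ; X -> Y }")
adjustmentSets(g_a, exposure = "X", outcome = "Y")   # returns { Z }

# Set 1, graph c: Z is a collider between two latent causes; adjust for nothing
g_c = dagitty("dag { L1 -> X ; L1 -> Z ; L2 -> Y ; L2 -> Z ; X -> Y }")
adjustmentSets(g_c, exposure = "X", outcome = "Y")   # returns the empty set
```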
# Summary
Unfortunately, most modeling situations are much more complex than the simple scenarios depicted here. Most of the time experimentation is not an option, and the nature of the relationships among variables is ambiguous, leaving any causal explanation an impossible prospect. Even in those situations, however, thinking causally can help our general understanding, and perhaps make some of these 'surprising' situations less so.
# References
Laird, N. 1983. *Further Comparative Analyses of Pretest-Posttest Research Designs*. [link](http://www.tandfonline.com/doi/abs/10.1080/00031305.1983.10483133)
Pearl, J. 2014. *Lord's Paradox Revisited -- (Oh Lord! Kumbaya!)*. [link](http://ftp.cs.ucla.edu/pub/stat_ser/r436.pdf)
Pearl, J. 2013. *Understanding Simpson's Paradox*. [link](http://ftp.cs.ucla.edu/pub/stat_ser/r414-reprint.pdf)
Senn, S. 2006. *Change from baseline and analysis of covariance revisited*. [link](http://onlinelibrary.wiley.com/doi/10.1002/sim.2682/pdf)
[^wb]: The reasoning behind not using sex was that it is not a manipulable variable. See Holland & Rubin (1986), "No causation without manipulation."
[^lord]: Lord himself acknowledged this in determining group differences on college freshman grade point average while adjusting for 'aptitude'.
[^pearlL]: Oddly Pearl doesn't actually mention what L represents anywhere in the article.