---
title: "Base Ranker: Reporting Odds Ratio (ROR)"
author:
  - name: Nan Xiao
    url: https://nanx.me/
    affiliation: Seven Bridges
    affiliation_url: https://www.sevenbridges.com/
  - name: Soner Koc
    url: https://github.com/skoc
    affiliation: Seven Bridges
    affiliation_url: https://www.sevenbridges.com/
  - name: Kaushik Ghose
    url: https://kaushikghose.wordpress.com/
    affiliation: Seven Bridges
    affiliation_url: https://www.sevenbridges.com/
date: "`r Sys.Date()`"
output: distill::distill_article
bibliography: rankv.bib
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = TRUE, cache = TRUE)
```
Besides PRR, ROR is another commonly used metric for safety signal detection. Each vaccine-symptom pair in the VAERS database can be summarized as a $2 \times 2$ contingency table:
| Target vaccine | Target symptom | All other symptoms | Total |
| :------------- | :------------- | :----------------------- | :-------- |
| Yes | $n_{ij}$ | $n_i - n_{ij}$ | $n_i$ |
| No | $n_j - n_{ij}$ | $n - n_i - n_j + n_{ij}$ | $n - n_i$ |
| Total | $n_j$ | $n - n_j$ | $n$ |
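In the table, $n_i = \sum_j n_{ij}$ and $n_j = \sum_i n_{ij}$ are the marginal counts for vaccine $i$ and symptom $j$, and $n = \sum_{i,j} n_{ij}$ is the total count. The reporting odds ratio (ROR) [@van2002comparison] for each vaccine-symptom pair is defined as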
$$
ROR_{ij} = \frac{n_{ij}}{\tilde{E}_{ij}}
$$
where
$$
\tilde{E}_{ij} = \frac{(n_i - n_{ij})(n_j - n_{ij})}{n - n_i - n_j + n_{ij}}.
$$
Similar to PRR, a higher ROR indicates a stronger association between the vaccine and the symptom; values above 1 mean the symptom is reported disproportionately often with that vaccine.
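As a quick check of the definition, here is a minimal sketch that evaluates the ROR for a single table. The counts are hypothetical, chosen only for illustration, and are not taken from VAERS. It also confirms that $n_{ij} / \tilde{E}_{ij}$ equals the usual cross-product ratio of the four cells.

```{r}
# Hypothetical counts for one vaccine-symptom pair (not VAERS data)
ex_n_ij <- 20     # reports with both the target vaccine and the target symptom
ex_n_i  <- 100    # all reports for the target vaccine
ex_n_j  <- 200    # all reports for the target symptom
ex_n    <- 10000  # all reports in the database

# Remaining cells of the 2 x 2 table, following the layout above
ex_n10 <- ex_n_i - ex_n_ij                   # target vaccine, other symptoms
ex_n01 <- ex_n_j - ex_n_ij                   # other vaccines, target symptom
ex_n00 <- ex_n - ex_n_i - ex_n_j + ex_n_ij   # other vaccines, other symptoms

# ROR as defined above: observed count over E-tilde
ex_e_tilde <- (ex_n10 * ex_n01) / ex_n00
ex_ror <- ex_n_ij / ex_e_tilde

# The same quantity written as a cross-product ratio
ex_ror_cross <- (ex_n_ij * ex_n00) / (ex_n10 * ex_n01)

ex_ror
all.equal(ex_ror, ex_ror_cross)
```

With these toy counts the ROR is 13.5, so the odds of the symptom being reported with the target vaccine are 13.5 times the odds with all other vaccines.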
Load the packages for ROR-based signal detection and ranking:
```{r}
suppressMessages(library("PhViD"))
library("kableExtra")
```
Load the preprocessed VAERS data and transform it into the analyzable format:
```{r}
df_p <- readRDS("data-processed/df_p.rds")
df_p <- df_p[, 1:3]
df_v <- as.PhViD(df_p, MARGIN.THRES = 10)
```
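To make the input format concrete, the sketch below builds a toy data frame in the three-column layout that `as.PhViD()` is assumed to expect here (vaccine label, symptom label, pair count) and derives the margins $n_i$, $n_j$, and $n$ with base R. The labels, counts, and `toy_*` names are hypothetical; the real `df_p` is not shown in this document.

```{r}
# Hypothetical toy data: one row per vaccine-symptom pair with its count
toy_pairs <- data.frame(
  vaccine = c("V1", "V1", "V2", "V2"),
  symptom = c("S1", "S2", "S1", "S2"),
  n11     = c(20, 80, 180, 9720)
)

# Marginal counts per vaccine (n_i), per symptom (n_j), and the grand total (n)
toy_n_i <- tapply(toy_pairs$n11, toy_pairs$vaccine, sum)
toy_n_j <- tapply(toy_pairs$n11, toy_pairs$symptom, sum)
toy_n   <- sum(toy_pairs$n11)

toy_n_i
toy_n_j
toy_n
```

These margins are the quantities that a threshold such as `MARGIN.THRES = 10` would be compared against, assuming it filters out vaccines and symptoms whose marginal counts fall below the threshold.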
Calculate the ROR [@van2002comparison] and the ranking statistic, the lower bound of the two-sided 95% confidence interval of log(ROR), then sort the detected signals by that statistic in decreasing order:
```{r}
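# Rank pairs by the lower bound of the 95% CI of log(ROR) (RANKSTAT = 2)
lst_ror <- ROR(df_v, MIN.n11 = 10, DECISION = 3, RANKSTAT = 2)
# Sort the detected signals by that lower bound, strongest first
ord <- order(lst_ror$SIGNALS$`LB95(log(ROR))`, decreasing = TRUE)
df_ror <- lst_ror$SIGNALS[ord, 1:6]
row.names(df_ror) <- NULL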
```
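For intuition about the ranking statistic, the following sketch computes the lower bound of a two-sided 95% confidence interval of log(ROR) for one table by hand. It assumes the usual Wald-type standard error on the log scale, $\sqrt{1/n_{11} + 1/n_{10} + 1/n_{01} + 1/n_{00}}$. Whether `ROR()` uses exactly this formula internally is an assumption here, and the cell counts are hypothetical.

```{r}
# Hypothetical cell counts of one 2 x 2 table (not VAERS data)
wald_n11 <- 20    # target vaccine, target symptom
wald_n10 <- 80    # target vaccine, other symptoms
wald_n01 <- 180   # other vaccines, target symptom
wald_n00 <- 9720  # other vaccines, other symptoms

# Point estimate and Wald standard error on the log scale
wald_log_ror <- log((wald_n11 * wald_n00) / (wald_n10 * wald_n01))
wald_se <- sqrt(1 / wald_n11 + 1 / wald_n10 + 1 / wald_n01 + 1 / wald_n00)

# Lower bound of the two-sided 95% interval of log(ROR)
wald_lb95 <- wald_log_ror - qnorm(0.975) * wald_se

wald_lb95
```

A lower bound above 0 on the log scale corresponds to a confidence interval for the ROR that excludes 1. Ranking by the lower bound rather than the point estimate penalizes pairs with small counts, since their intervals are wider.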
View the top-ranked vaccine-adverse event pairs:
```{r}
head(df_ror) %>% kable() %>% kable_styling()
```
```{r,echo=FALSE}
saveRDS(df_ror, file = "data-processed/df_ror.rds")
```