1D Pooling simulation.Rmd

---
title: '1D Pooling: Simulation'
author: "Xianbin Cheng"
date: "December 17, 2018"
output: html_document
---

# Objective  

  1. We want to simulate one-dimensional pooling.
  2. We want to analyze the cost.
  
# Method  

1. Load necessary libraries and files.  

```{r, message=FALSE, warning = FALSE}
library(tidyverse)
library(Hmisc)
library(mc2d)
source("1D_pooling_simulation.R")
source("STD_simulation.R")
```

2. Set input parameters

  * `n` = the total number of kernels in a plate (48 by default).
  * `n_pool` = the number of pools.
  * 

```{r}
n = 48
n_pos_vals = 1:(n-1)
thresh = 20
n_iter = 1000
```

3. Generate simulated data. Calculate sensitivity, specificity, and the total number of tests needed. The function used to generate random contamination levels is the same as the one used to generate data for STD simulation. 

```{r, eval = FALSE}
# Pool by rows
set.seed(123)
pool_row = tune_1d_n_pos(n_pos_vals = n_pos_vals, n_iter = n_iter, n = n, thresh = thresh, n_pool = 6)

# Pool by columns
set.seed(123)
pool_col = tune_1d_n_pos(n_pos_vals = n_pos_vals, n_iter = n_iter, n = n, thresh = thresh, n_pool = 8)
```

```{r, echo = FALSE}
pool_row = readRDS("Pool_row_48_1000.rds")
pool_col = readRDS("Pool_col_48_1000.rds")
```

```{r}
# Pool by rows
metrics_row = do.call(what = rbind.data.frame, args = pool_row) %>%
  gather(data = ., key = "Type", value = "Value", - n_pos)
cost_row = do.call(what = rbind.data.frame, args = pool_row) %>%
  mutate(putative_pos = sensi*n_pos + (1-speci)*(n-n_pos),
         n_test_total = 6 + putative_pos,
         n_pipette = n + 6 + putative_pos)

# Pool by columns
metrics_col = do.call(what = rbind.data.frame, args = pool_col) %>%
  gather(data = ., key = "Type", value = "Value", - n_pos)
cost_col = do.call(what = rbind.data.frame, args = pool_col) %>%
  mutate(putative_pos = sensi*n_pos + (1-speci)*(n-n_pos),
         n_test_total = 8 + putative_pos,
         n_pipette = n + 8 + putative_pos)
```

# Results

1. Visualization of metrics with simulated data.

```{r}
# Pool by rows
draw_metrics(df = metrics_row, method = "median_hilow")

# Pool by columns
draw_metrics(df = metrics_col, method = "median_hilow")
```

2. Cost analysis.

  1) The total number of tests needed:

```{r}
# Pool by rows
draw_cost(df = cost_row, n = n, var = n_test_total, ylab = "The total number of tests needed")

# Pool by columns
draw_cost(df = cost_col, n = n, var = n_test_total, ylab = "The total number of tests needed")
```

  2) The number of pipetting:
  
```{r}
# Pool by rows
draw_cost(df = cost_row, n = n, var = n_pipette, ylab = "The number of transfers")

# Pool by columns
draw_cost(df = cost_col, n = n, var = n_pipette, ylab = "The number of transfers")
```