7. Differential analysis

Catalina Vallejos edited this page Jun 7, 2020 · 6 revisions

‼️ THIS WIKI IS NO LONGER MAINTAINED - PLEASE REFER TO THE VIGNETTE INSTEAD ‼️

BASiCS can perform differential expression analyses between two groups of cells, such as experimental conditions or cell types.

In Vallejos et al (2016), we introduced two types of differential testing: for mean and for over-dispersion (MCMC chains are obtained through the BASiCS_MCMC function, with Regression = FALSE). However, due to the confounding between mean and over-dispersion that is typically observed in scRNA-seq data, meaningful changes in over-dispersion could only be assessed for those genes in which the mean does not change between groups. To address this limitation, Eling et al (2017) extended the BASiCS model in order to derive a measure of residual over-dispersion that is not confounded by mean expression (MCMC chains are obtained through the BASiCS_MCMC function, with Regression = TRUE). The latter is now used as part of the BASiCS differential testing feature.

Here, we illustrate this feature using a subset of the MCMC chains that were obtained for the dataset presented by Grün et al (2014) (single cells vs pool-and-split samples). These were obtained by independently running the BASiCS_MCMC function for each group of cells, using the option Regression = TRUE.

library(BASiCS)
data(ChainSCReg)
data(ChainRNAReg)

Test <- BASiCS_TestDE(Chain1 = ChainSCReg, Chain2 = ChainRNAReg,
                      GroupLabel1 = "SC", GroupLabel2 = "PaS",
                      EpsilonM = log2(1.5), EpsilonD = log2(1.5),
                      EpsilonR = log2(1.5)/log2(exp(1)),
                      EFDR_M = 0.10, EFDR_D = 0.10,
                      Offset = TRUE, PlotOffset = FALSE, Plot = FALSE)

Here, EpsilonM sets the minimum log2 fold change (log2FC) threshold for changes in mean expression ($\mu$), EpsilonD sets the log2FC threshold for changes in over-dispersion ($\delta$), and EpsilonR sets the threshold for differences in residual over-dispersion ($\epsilon$). By default, EpsilonM = EpsilonD = log2(1.5) and EpsilonR = log2(1.5)/log2(exp(1)), which corresponds to a 50% increase in mean or over-dispersion. For more details about these default values, see page e4 in Eling et al (2017).
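To make these default values concrete, the short base R sketch below evaluates them numerically. It does not call BASiCS; the variable names (eps_m, eps_r, fold_m, fold_r) are introduced here for illustration only. It relies on the identity log2(x)/log2(exp(1)) = log(x), so the default EpsilonR is simply the natural logarithm of 1.5.

```r
# Default thresholds, written exactly as in the call to BASiCS_TestDE
eps_m <- log2(1.5)                 # EpsilonM (and EpsilonD), on the log2 scale
eps_r <- log2(1.5) / log2(exp(1))  # EpsilonR, equal to log(1.5) on the natural log scale

# Back-transform each threshold to a fold change
fold_m <- 2^eps_m    # 1.5, i.e. a 50% increase
fold_r <- exp(eps_r) # 1.5, i.e. a 50% increase

round(c(EpsilonM = eps_m, EpsilonR = eps_r), 3)
# EpsilonM ≈ 0.585, EpsilonR ≈ 0.405
```

A different minimum effect size can be expressed the same way, for example log2(2) for a two-fold change in mean expression.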

Note: EpsilonR is only required when the input BASiCS_Chain objects were generated using Regression = TRUE in the call to BASiCS_MCMC.

To adjust for differences in overall RNA content between the two groups, an internal offset correction is performed when Offset = TRUE. This is the recommended default.
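The base R sketch below illustrates the general idea behind such an offset on made-up numbers. It is not the package's internal code: the way the offset is computed here (median across draws of the ratio of total mean expression) is an assumption made for illustration, and all object names and values are hypothetical.

```r
# Toy posterior draws of mean expression (rows = MCMC draws, columns = genes).
# These values are invented for illustration only.
mu_group2 <- matrix(rep(c(10, 20, 40), each = 4), nrow = 4)
mu_group1 <- 2 * mu_group2  # group 1 has twice the overall RNA content

# Assumed offset: per-draw ratio of total expression, summarised by its median
offset_chain <- rowSums(mu_group1) / rowSums(mu_group2)
offset_est <- median(offset_chain)

# Rescale group 1 so that both groups share a comparable overall scale
mu_group1_adj <- mu_group1 / offset_est

# Per-gene log2 fold changes before and after the correction
log2fc_raw <- log2(colMeans(mu_group1) / colMeans(mu_group2))
log2fc_adj <- log2(colMeans(mu_group1_adj) / colMeans(mu_group2))
```

In this toy case every gene shows a log2FC of 1 before the correction and 0 after it, so the apparent change is explained entirely by the global difference in RNA content rather than by gene-specific regulation.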

The resulting output list can be displayed using:

head(Test$TableMean)
head(Test$TableDisp)
head(Test$TableResDisp)

Note: Test$TableResDisp is only displayed when the input BASiCS_Chain objects were generated using Regression = TRUE in the call to BASiCS_MCMC.
