Skip to content

6. HVG & LVG detection

Catalina Vallejos edited this page Jun 7, 2020 · 4 revisions

‼️ THIS WIKI IS NO LONGER MAINTAINED - PLEASE REFER TO THE VIGNETTE INSTEAD ‼️

We illustrate this analysis using a small extract from the MCMC chain obtained in Vallejos et al (2016) when analysing the single cell samples provided by Grün et al (2014). This is included within BASiCS as the ChainSC dataset.

data(ChainSC)

The following code is used to identify highly variable genes (HVG) and lowly variable genes (LVG) within these cells. The VarThreshold parameter sets a lower threshold for the proportion of variability that is assigned to the biological component (Sigma). In the examples below:

  • HVG are defined as those genes for which at least 60% of their total variability is attributed to the biological variability component.
  • LVG are defined as those genes for which at most 40% of their total variability is attributed to the biological variability component.

For each gene, these functions return posterior probabilities as a measure of HVG/LVG evidence. A cut-off value for these posterior probabilities is set by controlling EFDR (defaul option: EviThreshold defined such that EFDR = 0.10).

par(mfrow = c(2,2))
HVG <- BASiCS_DetectHVG(ChainSC, VarThreshold = 0.6, Plot = TRUE)
LVG <- BASiCS_DetectLVG(ChainSC, VarThreshold = 0.2, Plot = TRUE)

To access the results of these tests, please use.

head(HVG$Table)
head(LVG$Table)
SummarySC <- Summary(ChainSC)
plot(SummarySC, Param = "mu", Param2 = "delta", log = "xy")
with(HVG$Table[HVG$Table$HVG == TRUE,], points(Mu, Delta))
with(LVG$Table[LVG$Table$LVG == TRUE,], points(Mu, Delta))

Note: this criteria for threshold has changed with respect to the original release of BASiCS (where EviThreshold was defined such that EFDR = EFNR). However, the new choice is more stable (sometimes, it was not posible to find a threshold such that EFDR = EFNR).

Clone this wiki locally