Skip to content

Latest commit

 

History

History
172 lines (105 loc) · 6.75 KB

README.md

File metadata and controls

172 lines (105 loc) · 6.75 KB

GRUFFI

The Gruffi R package helps you (1) to identify stressed cells in single-cell RNA-seq datasets using granular funcitonal filtering, or (2) you can use it to analyze any gene set's pathway activity (GO-term or custom defined), and define a cell population based on their combined activity.

Gruffi integrates into any Seurat analysis pipelione & it comes with a graphical user interface.

Installation

You can install dependencies from CRAN, Bioconductor and GitHub via devtools:

# Install CRAN & Bioconductor dependencies
install.packages('BiocManager')
BiocManager::install("DOSE")
BiocManager::install("org.Hs.eg.db")
BiocManager::install("sparseMatrixStats")
BiocManager::install("biomaRt")


# Install custom dependencies
install.packages('devtools')
devtools::install_github(repo = "vertesy/Stringendo", upgrade = F)
devtools::install_github(repo = "vertesy/CodeAndRoll2", ref = 'v2.3.4', upgrade = F)
devtools::install_github(repo = "vertesy/ReadWriter", upgrade = F)
devtools::install_github(repo = "vertesy/MarkdownHelpers", upgrade = F)
devtools::install_github(repo = "vertesy/Markdownreports", upgrade = F)
devtools::install_github(repo = "vertesy/ggExpress", upgrade = F)
devtools::install_github(repo = "vertesy/Seurat.utils", upgrade = F)


# Install gruffi
devtools::install_github(repo = "jn-goe/gruffi", upgrade = F)

NOTE: If you type 'all' when R asks to update dependencies, you may get into installation errors / infinite loops. If updating fails, type 'no' when prompted.

Usage

image

1. Setup

Load your Seurat single-cell object of interest (in the following combined.obj), perform basic analysis steps (Integration, PCA, Clustering) and calculate 2D and 3D umaps.

library(gruffi)

combined.obj <- Seurat.utils::SetupReductionsNtoKdimensions(obj = combined.obj, nPCs = 50, dimensions=3:2, reduction="umap")
# Note that this will recalculate 2D and 3D umaps, and back them up in combined.obj@misc$reductions.backup. 
# See FAQ.md, if you don't want to overwrite the 2D UMAP.

If you want to store multiple UMAP's, you have to keep them in backup slot (in combined.obj@misc). Use combined.obj <- RecallReduction(obj = combined.obj, dim = 2, reduction="umap") to access them.

Prepare GO-terms and gene-sets

ensembl <- biomaRt::useEnsembl("ensembl", dataset = "hsapiens_gene_ensembl")

go1 <- "GO:0006096" # Glycolysis
go2 <- "GO:0034976" # ER-stress
go3 <- "GO:0042063" # Gliogenesis, negative filtering

2. Granule partitioning

Gruffi works best if you partition cells into groups of 100-200 cells. Find the corresponding clustering resolution by:

combined.obj <- aut.res.clustering(obj = combined.obj)

The optimal resolution found is stored in:

granule.res.4.gruffi <- combined.obj@misc$gruffi$'optimal.granule.res'	

Some granules have too few cells, therfore their scoring is not robust statistically. Use reassign.small.clusters to assign these cells to the nearest large-enough granule:

combined.obj <- reassign.small.clusters(combined.obj, ident = granule.res.4.gruffi) # will be stored in meta data column as "seurat_clusters.reassigned"

Above, granules with <30 cells are cell-by-cell re-assigned to a neighboring granule (by default based on Euclidean distance between the mean of cell groups in 3dim UMAP space). The reassigned granules are suffixed as :

granule.res.4.gruffi <- paste0(granule.res.4.gruffi, '.reassigned')

3. Pathway scoring

After finding the right granule resolution, first GO scores per cells, then average GO-scores per granule (normalized for respective cell numbers) are computed. Within the GO score computation an ensembl database will be loaded (that you created above as ensembl).

# Glycolytic process	GO:0006096
combined.obj <- GO_score_evaluation(obj = combined.obj, GO_term = go1, save.UMAP = TRUE, new_GO_term_computation = T, clustering = granule.res.4.gruffi, plot.each.gene = F)

# ER stress 	GO:0034976
combined.obj <- GO_score_evaluation(obj = combined.obj, GO_term = go2, save.UMAP = TRUE, new_GO_term_computation = T, clustering = granule.res.4.gruffi, plot.each.gene = F)

# Gliogenesis		GO:0042063
combined.obj <- GO_score_evaluation(obj = combined.obj, GO_term = go3, save.UMAP = TRUE, new_GO_term_computation = T, clustering = granule.res.4.gruffi, plot.each.gene = F)

These functions store the resulting scores in combined.obj@meta.data.

4. Stress filtering

We will now call a Shiny Interface to auto-estimate and/or manually adjust the stress assignment threeholds via Shiny.GO.thresh().

  • The first 2 scores are for features to filter out (e.g.: cells with high ER stress and glycolysis),
  • and the last 2 scores are for features we want to keep (e.g.: cells with high gliogenesis).

Example code for filtering cells high in glycolytic process and ER stress but low in gliogenesis:

# Create score names:
(i1 <- Stringendo::kppu(granule.res.4.gruffi, 'cl.av', go1))
(i2 <- Stringendo::kppu(granule.res.4.gruffi, 'cl.av', go2))
(i3 <- Stringendo::kppu(granule.res.4.gruffi, 'cl.av', go3))

# Call Shiny app
combined.obj <- Shiny.GO.thresh(obj = combined.obj,
                                stress.ident1 = i1,
                                stress.ident2 = i2,
                                notstress.ident3 = i3,
                                plot.cluster.shiny = "orig.ident")

"Dont forget to click the button in the app: Save New Thresholds"

The interfacte allows automatic estimation and manual adjustment of thresholds for stress scores:

ShinyInterface_Vel Org7 d90 ImpV

After pushing the Save new thresholds button in the Shiny graphical user interface, thresholds are saved in combined.obj@misc$gruffi and the stress assignment is stored as a new meta data column is.Stressed. Check results as:

Seurat.utils::clUMAP('is.Stressed', label =F)

5. Remove stressed cells

cellIDs.keep <- which_names(!combined.obj$'is.Stressed')
subset.obj <- subset(x = combined.obj, cells = cellIDs.keep)  

Seurat.utils::clUMAP('is.Stressed', label = F, obj = subset.obj)

Troubleshooting

For Errors, Problems & Questions: **please see the FAQ ** or raise an issue in the github repository. We try to answer.