FauxFlow: a graphics library and shiny tool for visualizing single cell data in a flow cytometry-like paradigm.
Nathan Siemers
(data from Villani et al. DOI: 10.1126/science.aah4573)
I have several data sets available for interrogation on the internet if you want to quickly test out functionality. They are on a small server, so performance may be limited.
http://shiny.fiveprime.org/SCHCC (Zheng et al. Liver cancer T cells)
http://shiny.fiveprime.org/SCMel (Tirosh et al. Melanoma tumor, immune, and stroma)
http://shiny.fiveprime.org/SCBlood (Villani et al. Normal Human Blood Myeloid)
Required R packages: tidyverse, ggthemes, GGally, lazyeval, shiny, shinyjs, shinythemes, shinycssloaders, viridis, MASS, dplyr, rlang
rmarkdown is needed to produce knitted reports (MS Word format), which also requires a recent release of pandoc.
The data.table package is also used during the process of data set construction.
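Before launching the tool, it can help to check which of the packages named above are missing. The following is a minimal sketch; the helper name `missing_packages` is hypothetical and not part of FauxFlow.

```r
# Packages named in this README (including the optional rmarkdown and data.table).
required <- c("tidyverse", "ggthemes", "GGally", "lazyeval", "shiny", "shinyjs",
              "shinythemes", "shinycssloaders", "viridis", "MASS", "dplyr",
              "rlang", "rmarkdown", "data.table")

# Hypothetical helper: return the subset of `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
} else {
  message("All listed packages are installed.")
}
```

Note that this only checks R packages; pandoc is a separate system dependency.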
The tool ships with the Zheng et al. single cell hepatocellular carcinoma T cell dataset (http://dx.doi.org/10.1016/j.cell.2017.05.035).
If you have a shiny server available, the tool should be ready for use with the Zheng data. If you don't have a shiny server, you can run an instance of the tool as follows:
R -e 'shiny::runApp("./", port=8888, host="0.0.0.0")'
Copy Data/defaults.R.template to Data/defaults.R.
Note: Once you create defaults.R, it is sourced automatically instead of defaults.R.template to configure the deployment.
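The copy step and the fallback described in the note can be sketched from within R as follows. Both function names are hypothetical, and the selection logic is an assumption based on the note above, not a copy of FauxFlow's own code.

```r
# Hypothetical helper: create Data/defaults.R from the template once,
# never overwriting an existing (possibly customized) defaults.R.
init_defaults <- function(data_dir = "Data") {
  template <- file.path(data_dir, "defaults.R.template")
  target   <- file.path(data_dir, "defaults.R")
  if (file.exists(target)) return(invisible(FALSE))
  stopifnot(file.exists(template))
  invisible(file.copy(template, target, overwrite = FALSE))
}

# Assumed selection logic: prefer the custom file, else fall back to the template.
defaults_file <- function(data_dir = "Data") {
  custom <- file.path(data_dir, "defaults.R")
  if (file.exists(custom)) custom else file.path(data_dir, "defaults.R.template")
}

# Usage, run from the FauxFlow directory:
# init_defaults()
# source(defaults_file())
```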
Edit Data/defaults.R:
- Fill in appropriate background information about the data set.
- List the column names that are clinical/sample annotations in sc_env$clin.cols. These columns are treated differently in some cases: Gaussian noise is added to numeric data, but not to the columns specified here.
- Write a loader (see the load_data() function).
The loader should return a single data frame combining RNA-seq and clinical information, with genes and clinical categories in columns. There's almost always some tweaking to do.
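As an illustration of the shape the loader is expected to return, here is a minimal sketch using toy data. The gene values, the annotation columns (`cell_id`, `patient`, `tissue`), and the genes-in-rows input layout are all assumptions made for the example; `sc_env` is created here only so the snippet is self-contained. A real loader would read your own files instead.

```r
# Created here only to keep the example self-contained.
sc_env <- new.env()
sc_env$clin.cols <- c("cell_id", "patient", "tissue")  # hypothetical annotation columns

load_data <- function() {
  # Toy expression matrix with genes in rows and cells in columns (assumed input layout).
  expr <- matrix(c(0.0, 5.1, 2.3,
                   3.2, 0.0, 0.0,
                   1.0, 4.4, 0.0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("CD3E", "CD8A", "CD4"),
                                 c("cell1", "cell2", "cell3")))

  # Toy clinical/sample annotations, one row per cell.
  clin <- data.frame(cell_id = c("cell1", "cell2", "cell3"),
                     patient = c("P1", "P1", "P2"),
                     tissue  = c("tumor", "blood", "tumor"),
                     stringsAsFactors = FALSE)

  # Transpose so genes become columns, then join to the annotations by cell.
  genes <- as.data.frame(t(expr))
  genes$cell_id <- rownames(genes)
  merge(clin, genes, by = "cell_id")
}

dat <- load_data()
gene_cols <- setdiff(names(dat), sc_env$clin.cols)  # everything that is not an annotation
```

Keeping the annotation names in one place (`sc_env$clin.cols`) makes it easy to separate gene columns from clinical columns later, as in the last line.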
The graphics engine for FauxFlow is contained here in the allpairs() and underlying gggpairs() functions. The engine uses GGally::ggpairs extensively, but with heavy customization of underlying plotting functions that get called for each subplot.
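To illustrate the general mechanism (not the actual allpairs() or gggpairs() code), GGally::ggpairs accepts user-supplied functions for its subplots. The sketch below passes a custom lower-panel function; the panel function name, the gene names, and the simulated data are hypothetical.

```r
library(ggplot2)
library(GGally)

# Hypothetical custom subplot: ggpairs calls it with the data and an aes() mapping
# for each pair of continuous columns, and expects a ggplot object back.
scatter_density_panel <- function(data, mapping, ...) {
  ggplot(data = data, mapping = mapping) +
    geom_point(alpha = 0.3, size = 0.5) +
    geom_density_2d(colour = "grey30")
}

# Simulated stand-in for expression data (not real measurements).
set.seed(1)
toy <- data.frame(GENE_A = rnorm(200, mean = 2),
                  GENE_B = rnorm(200, mean = 1),
                  GENE_C = rnorm(200, mean = 3))

# Use the custom function for the lower-triangle panels; other panels keep defaults.
p <- ggpairs(toy, lower = list(continuous = scatter_density_panel))
# print(p)  # render the plot matrix
```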
You should be able to update FauxFlow from the git repository without harming things, as long as you have made a custom Data/defaults.R file.
- Gating is extremely crude - you get only a single gate point for all markers you gate with, both positive and negative.
- If you haven't worked with single cell RNA data before, you need to realize that there can be significant drop-outs in the data. Gating (especially negative gating) and fractional proportions can be biased by this.
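The two caveats above can be illustrated with a small sketch of single-threshold gating. This is an assumption about the general idea, not FauxFlow's implementation; the function `gate_cells`, the gate value, and the toy data are hypothetical.

```r
# Hypothetical gating helper: one gate value is shared by every marker.
# Positive markers must exceed the gate; negative markers must not.
gate_cells <- function(df, positive = character(), negative = character(), gate = 1) {
  keep <- rep(TRUE, nrow(df))
  for (m in positive) keep <- keep & df[[m]] > gate
  for (m in negative) keep <- keep & df[[m]] <= gate
  df[keep, , drop = FALSE]
}

# Toy expression values; a zero may be true absence or a drop-out.
cells <- data.frame(CD3E = c(4.2, 3.8, 0.0, 5.0),
                    CD8A = c(3.1, 0.0, 0.0, 0.0),
                    CD4  = c(0.0, 2.9, 0.0, 0.0))

cd8 <- gate_cells(cells, positive = c("CD3E", "CD8A"), negative = "CD4", gate = 1)
fraction_cd8 <- nrow(cd8) / nrow(cells)

# Negative-only gating passes every cell with a zero, drop-outs included.
cd4_negative <- gate_cells(cells, negative = "CD4", gate = 1)
```

In this toy data, the fourth cell has a zero for CD8A: if that zero is a drop-out rather than true absence, the cell is wrongly excluded from the positive gate, and any zero for CD4 passes the negative gate regardless of its cause.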