The SummarizedExperiment (se) class offers a useful way to store multiple row
and column metadata along with the values from an experiment and is widely used
in computational biology.
Although se's can be subset with base R notation (i.e. using `[]`),
they cannot be manipulated using grammar from the tidyverse. As a
consequence, it is not possible to manipulate se's in pipelines using the
pipe operator.
This package contains a number of wrapper functions to extend the usage of se's:
- dplyr functions: to use dplyr's grammar of data manipulation
- arithmetic functions: to perform arithmetic on 2 se's
- write functions: to print the options of a se and to write se's to delimited files
As an example, compare how cleanse is used to subset rows for gene_group NOTCH and then arrange the columns by patient:

| Using native syntax | Using cleanse |
|---|---|
| `rowdata <- rowData(se)` | `se <- se %>%` |

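
The native-syntax side of this comparison can be sketched on a toy object. This is a minimal illustration, not the package's own example: the object `se`, its `counts` assay, and all values are made up, and only base `SummarizedExperiment` accessors are used.

```r
library(SummarizedExperiment)

# Toy data: 4 genes x 3 samples (names and values are hypothetical)
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))
se <- SummarizedExperiment(
  assays  = list(counts = counts),
  rowData = DataFrame(gene_group = c("NOTCH", "WNT", "NOTCH", "WNT"),
                      row.names = rownames(counts)),
  colData = DataFrame(patient = c("C", "A", "B"),
                      row.names = colnames(counts))
)

# Native syntax: pick rows via rowData, then reorder columns via colData
keep_rows <- rowData(se)$gene_group == "NOTCH"
col_order <- order(colData(se)$patient)
se_native <- se[keep_rows, col_order]

# With cleanse, the same two steps read as a pipeline (see the usage examples):
# se %>% filter(row, gene_group == "NOTCH") %>% arrange(col, patient)
```

The base version needs intermediate index vectors for each step, which is the overhead the pipeline form is meant to remove.
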
Usage information can be found by reading the vignettes: `browseVignettes("cleanse")`.

Functions that subset the se based on the rowData or colData:
- `filter()` picks rows/cols based on the se's attached rowData/colData
- `slice()` picks rows/cols by position
- `arrange()` changes the ordering of the rows/cols
- `sample_slice()` picks a random portion of rows or cols from the se

Functions that change the se's rowData or colData:
- `select()` selects variables
- `rename()` renames variables
- `mutate()` adds new variables that are functions of existing variables
- `drop_metadata()` drops all rowData and colData having only 1 unique value

Functions that perform arithmetic on se's:
- `-` subtracts values from the assays in 2 se's
- `+` adds values from the assays in 2 se's
- `/` divides values from the assays in 2 se's
- `*` multiplies values from the assays in 2 se's
- `round` rounds the assay values of a se

Functions that write a se to a file:
- `write_csv()` writes a se to csv
- `write_tsv()` writes a se to tsv
- `write_delim()` writes a se to a delimited file

```r
# Install BiocManager first if it is not already available
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("cleanse")
```

```r
library(cleanse)

# -- An example se called seq_se is provided

# Example pipe
data(seq_se)
seq_se %>%
    filter(row, gene_group == "NOTCH") %>%
    filter(col, site %in% c("brain", "skin")) %>%
    arrange(col, patient) %>%
    round(3)

# Example sampling
data(seq_se)
seq_se %>% slice_sample(row, prop = .2)

# Example arithmetic subtracting the expression values at T=0 from T=4
data(seq_se)
(filter(seq_se, col, time == 4)) - (filter(seq_se, col, time == 0))
```
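
To make concrete what assay-level subtraction means, here is a minimal sketch in plain `SummarizedExperiment` terms. It is not cleanse's implementation. It assumes the two subsets have the same dimensions and that their columns line up patient by patient, and the object `se`, the assay name `expr`, and all values are hypothetical.

```r
library(SummarizedExperiment)

# Toy se: 2 genes x 4 samples, two patients measured at time 0 and time 4
expr <- matrix(c(1, 2, 5, 7, 3, 4, 6, 10), nrow = 2,
               dimnames = list(c("gene1", "gene2"),
                               c("p1_t0", "p1_t4", "p2_t0", "p2_t4")))
se <- SummarizedExperiment(
  assays  = list(expr = expr),
  colData = DataFrame(patient = c("p1", "p1", "p2", "p2"),
                      time    = c(0, 4, 0, 4),
                      row.names = colnames(expr))
)

# Split by time point with base subsetting ($ reads from colData)
se_t0 <- se[, se$time == 0]
se_t4 <- se[, se$time == 4]

# Element-wise difference of the two assay matrices (T=4 minus T=0)
delta <- assay(se_t4, "expr") - assay(se_t0, "expr")
```

Here `delta` is a bare matrix, so the row and column metadata have to be carried along by hand if they are still needed.
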

If you encounter a clear bug, please file a minimal reproducible example on GitHub.