cleanse

Overview

The SummarizedExperiment (se) class offers a useful way to store multiple row and column metadata along with the values from an experiment and is widely used in computational biology.
Although subsetting se's is possible with base R notation (ie using []), se's cannot be manipulated using grammar from the tidyverse. As a consequence, it is not possible to manipulate se's in pipelines using the pipe operator.

This package contains a number of wrapper functions to extend the usage of se's:

dplyr functions: to use dplyr's grammar of data manipulation
arithmetic functions: to perform arithmetic on 2 se's
write functions: to print the options of a se and to write se's to delimited files

As an example, compare how cleanse is used to subset rows for gene_group NOTCH and then arrange the columns by patient

Using native syntax	Using cleanse
rowdata <- rowData(se) se <- se[rowdata$gene_group == "NOTCH", ] se <- se[, order(se$patient)]	se <- se %>% filter(row, gene_group == "NOTCH") %>% arrange(col, patient)

Usage information can be found by reading the vignettes: browseVignettes("cleanse").

Supported dplyr functions

Functions that subset the se based on the rowData or colData

filter() picks rows/cols based on the se's attached rowData/colData
slice() picks rows/cols by position
arrange() changes the ordering of the rows
sample_slice() picks a random portion of rows or cols from the se.

Functions that change the se's rowData or colData

select() selects variables
rename() renames variables
mutate() adds new variables that are functions of existing variables
drop_metadata() drops all rowData and colData having only 1 unique value

Supported arithmetic functions

- subtracts values from the assays in 2 se's
+ adds values from the assays in 2 se's
/ divides values from the assays in 2 se's
* multiplies values from the assays in 2 se's
round rounds the assay values of a se

Supported write functions

write_csv() writes a se to csv
write_tsv() writes a se to tsv
write_delim() writes a se to a delimited file

Installation

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("cleanse")

Usage

library(cleanse)

# -- An example se called seq_se is provided

# Example pipe
data(seq_se)
seq_se %>%
  filter(row, gene_group == "NOTCH") %>%
  filter(col, site %in% c("brain", "skin")) %>%
  arrange(col, patient) %>%
  round(3)

# Example sampling
data(seq_se)
seq_se %>% slice_sample(row, prop=.2)

# Example arithmetic subtracting the expression values at T=0 from T=4
data(seq_se)
(filter(seq_se, col, time == 4)) - (filter(seq_se, col, time == 0))

Getting help

If you encounter a clear bug, please file a minimal reproducible example on github.

Name		Name	Last commit message	Last commit date
Latest commit History 38 Commits
R		R
data-raw		data-raw
data		data
man		man
vignettes		vignettes
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
LICENSE.md		LICENSE.md
NAMESPACE		NAMESPACE
NEWS.md		NEWS.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

cleanse

Overview

Supported dplyr functions

Supported arithmetic functions

Supported write functions

Installation

Usage

Getting help

About

Licenses found

Releases

Packages

Languages

License

Licenses found

martijnvanattekum/cleanse

Folders and files

Latest commit

History

Repository files navigation

cleanse

Overview

Supported dplyr functions

Supported arithmetic functions

Supported write functions

Installation

Usage

Getting help

About

Topics

Resources

License

Licenses found

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages