Provides dplyr and other functions for the SummarizedExperiment class

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
martijnvanattekum/cleanse


cleanse

Overview

The SummarizedExperiment (se) class offers a useful way to store multiple row and column metadata along with the values from an experiment and is widely used in computational biology.
Although se's can be subset with base R notation (i.e., using []), they cannot be manipulated with the tidyverse grammar. As a consequence, se's cannot be processed in pipelines built with the pipe operator.

This package contains a number of wrapper functions to extend the usage of se's:

  • dplyr functions: to use dplyr's grammar of data manipulation
  • arithmetic functions: to perform arithmetic on 2 se's
  • write functions: to print the options of a se and to write se's to delimited files

As an example, compare how rows are subset for gene_group NOTCH and the columns are then arranged by patient, first with native syntax and then with cleanse:

Using native syntax:

rowdata <- rowData(se)
se <- se[rowdata$gene_group == "NOTCH", ]
se <- se[, order(se$patient)]

Using cleanse:

se <- se %>%
  filter(row, gene_group == "NOTCH") %>%
  arrange(col, patient)

Usage information can be found by reading the vignettes: browseVignettes("cleanse").

Supported dplyr functions

Functions that subset the se based on the rowData or colData

  • filter() picks rows/cols based on the se's attached rowData/colData
  • slice() picks rows/cols by position
  • arrange() changes the ordering of the rows/cols
  • sample_slice() picks a random portion of rows or cols from the se.
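
To make the subsetting operations concrete, here is a minimal sketch of what filter, slice and arrange correspond to in native SummarizedExperiment syntax. It deliberately uses only SummarizedExperiment calls, not cleanse itself. The toy object and its gene names, groups and patient numbers are made up for illustration.

```r
library(SummarizedExperiment)

# Toy se: 4 genes x 3 samples (all values and labels are illustrative).
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))
se <- SummarizedExperiment(
  assays  = list(counts = counts),
  rowData = DataFrame(gene_group = c("NOTCH", "WNT", "NOTCH", "WNT")),
  colData = DataFrame(patient = c(2, 3, 1), row.names = colnames(counts))
)

# Native counterpart of filtering rows on rowData.
notch <- se[rowData(se)$gene_group == "NOTCH", ]

# Native counterpart of slicing columns by position.
first_two <- se[, 1:2]

# Native counterpart of arranging columns by a colData variable.
by_patient <- se[, order(se$patient)]

rownames(notch)       # "gene1" "gene3"
colnames(first_two)   # "s1" "s2"
colnames(by_patient)  # "s3" "s1" "s2"
```

In each case the assay values move together with their metadata, which is the behaviour the cleanse wrappers expose through a pipe-friendly interface.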

Functions that change the se's rowData or colData

  • select() selects variables
  • rename() renames variables
  • mutate() adds new variables that are functions of existing variables
  • drop_metadata() drops all rowData and colData having only 1 unique value
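
The metadata operations above can be illustrated with their native counterparts on the rowData of a toy se. This is a sketch using SummarizedExperiment calls only. It does not show cleanse's own implementation, and the column names (gene_group, species, is_notch, group) are invented for the example.

```r
library(SummarizedExperiment)

# Toy se with one informative and one constant rowData column (illustrative).
counts <- matrix(1:8, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), c("s1", "s2")))
se <- SummarizedExperiment(
  assays  = list(counts = counts),
  rowData = DataFrame(gene_group = c("NOTCH", "WNT", "NOTCH", "WNT"),
                      species    = rep("human", 4)),
  colData = DataFrame(patient = c(1, 2), row.names = colnames(counts))
)

rd <- rowData(se)

# mutate-like step: add a variable derived from an existing one.
rd$is_notch <- rd$gene_group == "NOTCH"

# rename-like step: change a variable name.
colnames(rd)[colnames(rd) == "gene_group"] <- "group"

# drop_metadata-like step: drop variables with only 1 unique value.
keep <- vapply(as.data.frame(rd),
               function(x) length(unique(x)) > 1, logical(1))
rd <- rd[, unname(keep), drop = FALSE]

# Write the modified metadata back to the se.
rowData(se) <- rd

colnames(rowData(se))  # "group" "is_notch"
```

The same steps apply to colData by swapping rowData() for colData().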

Supported arithmetic functions

  • - subtracts values from the assays in 2 se's
  • + adds values from the assays in 2 se's
  • / divides values from the assays in 2 se's
  • * multiplies values from the assays in 2 se's
  • round rounds the assay values of a se
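
As a rough illustration of what arithmetic on 2 se's means, the sketch below subtracts the assay of one toy se from another element-wise using native SummarizedExperiment calls. It assumes both objects have identical dimensions and dimnames, and it simply keeps the metadata of the first object. How cleanse itself treats metadata or mismatched dimensions is not covered here. The helper make_se and all values are hypothetical.

```r
library(SummarizedExperiment)

# Hypothetical helper: builds a 2 x 2 se with a single assay named "expr".
make_se <- function(values) {
  m <- matrix(values, nrow = 2,
              dimnames = list(c("gene1", "gene2"), c("p1", "p2")))
  SummarizedExperiment(assays = list(expr = m))
}

t0 <- make_se(c(1, 2, 3, 4))          # expression at T=0 (made-up values)
t4 <- make_se(c(1.5, 2.5, 3.5, 5))    # expression at T=4 (made-up values)

# Element-wise subtraction of the assays; the result reuses t4's structure.
diff_se <- t4
assay(diff_se, "expr") <- assay(t4, "expr") - assay(t0, "expr")

assay(diff_se, "expr")["gene1", "p1"]  # 0.5
assay(diff_se, "expr")["gene2", "p2"]  # 1
```

Addition, division and multiplication follow the same pattern with the corresponding operator applied to the assay matrices.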

Supported write functions

  • write_csv() writes a se to csv
  • write_tsv() writes a se to tsv
  • write_delim() writes a se to a delimited file

Installation

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("cleanse")

Usage

library(cleanse)

# -- An example se called seq_se is provided

# Example pipe
data(seq_se)
seq_se %>%
  filter(row, gene_group == "NOTCH") %>%
  filter(col, site %in% c("brain", "skin")) %>%
  arrange(col, patient) %>%
  round(3)

# Example sampling
data(seq_se)
seq_se %>% slice_sample(row, prop=.2)

# Example arithmetic subtracting the expression values at T=0 from T=4
data(seq_se)
(filter(seq_se, col, time == 4)) - (filter(seq_se, col, time == 0))

Getting help

If you encounter a clear bug, please file a minimal reproducible example on GitHub.