
Source Code File Structure and Organization

Paul Hoffman edited this page Nov 6, 2018 · 6 revisions

Seurat's source code is organized into several R source files, each containing code that logically belongs together; for example, generics.R contains the generics declared by Seurat. To increase clarity, each file follows the same overall structure, with a couple of exceptions: an optional header followed by one or more code sections.

Header

The header consists of @include statements, which pull in code definitions found in other source files; @import, @importFrom, and @importClassesFrom statements, which import code from other packages for use by all code in the current file; and @useDynLib, which loads Seurat's compiled C++ code. The header ends with a spacer line denoted by #' followed by a NULL, indicating that the imports and includes apply to the entire file. Use @importFrom in the header only if the imported code is used by a majority of the code in the source file. An example header section is provided below:

#' @include generics.R
#' @importFrom Rcpp evalCpp
#' @importFrom Matrix colSums rowSums colMeans rowMeans
#' @importFrom methods setClass setOldClass setClassUnion slot
#' slot<- setMethod new signature
#' @useDynLib Seurat
#'
NULL
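As a rough illustration of how such a header can be inspected, the sketch below pulls the file names out of @include statements. IncludedFiles is a hypothetical helper written for this page, not a Seurat or roxygen2 function, and the header it parses is an abbreviated copy of the example above.

```r
# Abbreviated copy of the example header, one element per line
header <- c(
  "#' @include generics.R",
  "#' @importFrom Rcpp evalCpp",
  "#' @useDynLib Seurat",
  "#'",
  "NULL"
)

# Hypothetical helper (not part of Seurat): list the files named in
# @include statements; returns character(0) when there are none
IncludedFiles <- function(lines) {
  pattern <- "^#'\\s*@include\\s+"
  tagged <- grep(pattern, lines, value = TRUE)
  files <- trimws(sub(pattern, "", tagged))
  as.character(unlist(strsplit(files, split = "\\s+")))
}

included <- IncludedFiles(header)
print(included)
```
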

Code Sections

Each source file is broken into one or more code sections. Code is grouped into sections by type and sorted alphanumerically within each section. All code sections are denoted the same way: a hash (#) followed by 79 percent (%) symbols on the first line, a hash (#) followed by a space and the name of the section on the second line, and a hash (#) followed by 79 percent (%) symbols on the last line. An example section denotation is provided below:

#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# Functions
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
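The banner format described above can be generated rather than typed by hand. The sketch below uses a hypothetical helper, SectionBanner, that is not part of Seurat and relies only on base R.

```r
# Hypothetical helper (not part of Seurat): build the three-line banner
# for a code section, using a hash followed by `width` percent symbols
SectionBanner <- function(title, width = 79) {
  rule <- paste0("#", strrep("%", width))
  c(rule, paste("#", title), rule)
}

banner <- SectionBanner("Functions")
writeLines(banner)
```

With the default width, each rule line is 80 characters long: one hash plus 79 percent symbols.
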

The following code sections are allowed:

Section Title                        Purpose
Class definitions                    Any formal class definitions, declared with setClass, setOldClass, or setClassUnion
Functions                            Standard R functions that are exported and used by the end-user
Methods for Seurat-defined generics  S3 methods for any generic contained within generics.R
Methods for R-defined generics       S3 methods for any generic not contained within generics.R; these can be methods for generics provided by base R or another package (e.g. Matrix, ggplot2)
S4 methods                           S4 methods for any generic, either contained within generics.R or provided by base R or an external package; these can be either exported or internal
Internal                             Standard R functions that are internal to Seurat and are not to be used by the end-user
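The section titles listed above can be checked mechanically. The following is a minimal sketch, assuming a source file has already been read into a character vector of lines (for example with readLines); FindSections and the sample file contents are hypothetical and not part of Seurat.

```r
# Section titles permitted in a standard source file
allowed.sections <- c(
  "Class definitions",
  "Functions",
  "Methods for Seurat-defined generics",
  "Methods for R-defined generics",
  "S4 methods",
  "Internal"
)

# Hypothetical helper: return the title of every three-line section banner
# (rule line, "# Title" line, rule line) found in `lines`
FindSections <- function(lines) {
  n <- length(lines)
  if (n < 3) {
    return(character(0))
  }
  is.rule <- grepl("^#%{79}$", lines)
  is.title <- grepl("^# ", lines)
  idx <- which(is.rule[1:(n - 2)] & is.title[2:(n - 1)] & is.rule[3:n])
  sub("^# ", "", lines[idx + 1])
}

# Made-up file contents with one allowed and one disallowed section
rule <- paste0("#", strrep("%", 79))
example.file <- c(
  rule, "# Functions", rule,
  "",
  "MyFunction <- function(x) x",
  "",
  rule, "# Helpers", rule
)

found <- FindSections(example.file)
unknown <- setdiff(found, allowed.sections)
print(unknown)
```

Here "Helpers" is reported because it is not one of the allowed section titles.
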

All code in Functions, Methods for Seurat-defined generics, and Methods for R-defined generics should be exported. No code in Internal should be exported. Code in Class definitions may be exported with the Roxygen tag @exportClass, but this is not mandatory. Code in S4 methods may also be exported, but this is not mandatory either. See S3 vs S4 for more details about S3- and S4-style methods.
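To make the export rules concrete, here is a small sketch of a source file with a Functions section and an Internal section. CountItems and CountInternal are hypothetical names invented for this example; only the @export tag and the section banners follow the conventions described on this page.

```r
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# Functions
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

#' Count the elements of an object (hypothetical exported function)
#'
#' @param x A vector
#'
#' @return The number of elements in \code{x}
#'
#' @export
#'
CountItems <- function(x) {
  return(CountInternal(x = x))
}

#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# Internal
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

# Hypothetical internal helper: it carries no @export tag, so roxygen2
# does not write an export directive for it
CountInternal <- function(x) {
  return(length(x = x))
}

n.items <- CountItems(x = c("a", "b", "c"))
print(n.items)
```
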

Code Sorting

All code should be sorted alphanumerically by the name of the function, method, or class defined. Because not all code declarations use purely alphanumeric symbols, the following extensions to a standard alphanumeric sort are made:

  • Periods (.) rank first
    • .DollarNames ranks above as.logical
    • dim.Assay ranks above dimnames.Assay
    • subset.Seurat ranks above SubsetData.Assay
  • Symbols rank second, and symbols themselves are ranked in the order they appear on the US English QWERTY keyboard
    • $.Seurat ranks above dim.Seurat
    • $.Jackstraw ranks above %||%
  • Extract operators rank third, with single [ extract operators ranking higher than double [[ extract operators
    • [.Seurat ranks above [[.Seurat
  • Alphanumeric symbols rank last
    • dim.Assay ranks above dim.Seurat
  • If the code is an assignment declaration, it ranks after any corresponding normal functions or methods
    • Idents ranks above Idents<-
    • Idents.Seurat ranks above Idents<-

Code sorting is case-insensitive: capitals and lowercase letters carry equal weight, so the same name should not be used for two different functions that differ only in case.
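The ordering rules above can be approximated with a per-character ranking. The sketch below is one possible interpretation, not Seurat's own tooling: SortKey and SortNames are hypothetical helpers, and the exact symbol order (number row first, then the remaining punctuation keys) is an assumption, since the rules only say that symbols follow US English QWERTY keyboard order. Assignment functions such as Idents<- fall after their plain counterparts here simply because a shorter name sorts before any longer name that starts with it.

```r
# Assumed character ranking: period first, then symbols in US QWERTY order,
# then the extract operator "[", then digits and letters
char.rank <- c(
  ".",
  strsplit("`~!@#$%^&*()-_=+{]}\\|;:'\",<>/?", split = "")[[1]],
  "[",
  as.character(0:9),
  letters
)

# Hypothetical helper: build a fixed-width numeric key whose plain byte
# order matches the ranking above; lowercasing makes it case-insensitive
SortKey <- function(name) {
  chars <- strsplit(tolower(name), split = "")[[1]]
  ranks <- match(chars, char.rank, nomatch = length(char.rank) + 1L)
  paste(sprintf("%02d", ranks), collapse = "")
}

# Hypothetical helper: sort names by key; method = "radix" compares
# character keys in the C locale, so a key that is a prefix of another
# sorts first
SortNames <- function(x) {
  keys <- vapply(x, SortKey, character(1), USE.NAMES = FALSE)
  x[order(keys, method = "radix")]
}

sorted <- SortNames(c(
  "subset.Seurat", "Idents<-", "dim.Seurat", "[[.Seurat", "as.logical",
  "%||%", "Idents", "[.Seurat", ".DollarNames", "$.Seurat",
  "dimnames.Assay", "dim.Assay"
))
print(sorted)
```

For this input the result is consistent with the pairwise examples in the list above, for instance dim.Assay before dimnames.Assay and [.Seurat before [[.Seurat.
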

data.R

Unlike other source files, data.R has no code sections. This file simply contains documentation for included datasets, sorted by the name of the provided dataset.

generics.R

Unlike other source files, generics.R has no code sections. All code should be alphanumerically sorted.

visualization.R

Unlike other source files, visualization.R uses custom code sections:

Section Title                 Purpose
Heatmaps                      Plotting functions that use SingleImageMap or SingleRasterMap
Expression by identity plots  Plotting functions that use ExIPlot
Dimensional reduction plots   Plotting functions that use SingleDimPlot
Scatter plots                 Plotting functions that use SingleCorPlot
Polygon plots                 Plotting functions that use SinglePolyPlot
Other plotting functions      Plotting functions that use their own underlying plotting code
Exported utility functions    Standard R functions that do not do any plotting themselves, are exported, and used by the end-user
Seurat themes                 Themes to be added to ggplot2-based plots
Internal                      Standard R functions that are internal to Seurat and are not to be used by the end-user

Within each section, all code should be alphanumerically sorted.

File list

The following source files are present in Seurat's R directory:

File                       Contents
clustering.R               Code used in clustering datasets
data.R                     Declarations and documentation for included datasets
differential_expression.R  Code for finding differentially expressed features and any associated methods
dimensional_reduction.R    Methods for reducing dimensionality of single-cell expression data
generics.R                 Any and all declarations for generics; this file has no code sections and all generic declarations should be sorted alphabetically
integration.R              ...
objects.R                  Formal class definitions and associated functions and methods for interacting with said objects
preprocessing.R            Functions and methods leading up to dimensional reduction, clustering, and differential expression
RcppExports.R              R functions for C++ code, not to be edited by hand
utilities.R                Various tools used throughout Seurat
visualization.R            Code for plotting data, has its own custom set of code sections