This report contains a summary of cell type annotation results for library SCPCL000295. The goal of this report is to provide more detailed information about cell type annotation results as an initial evaluation of their quality and reliability.

Performing cell type annotation is an inherently challenging task with high levels of uncertainty, especially when using automated annotation methods. One way to address this is to use multiple cell type annotation approaches and compare the results, as we have begun to do in this report.

When multiple methods annotate a given cell as the same or similar cell type, this may qualitatively indicate a more robust annotation.

Note that the contents of this report will vary based on which cell type annotations are present. Further, be aware that different cell type annotation methods may assign different labels to the same or similar cell type (e.g., the different string representations B cell naive and Naive B cell), because they rely on different underlying reference datasets.

This library contains the following cell type annotations:

  • Submitter-provided cell type annotation, generated by the original lab which produced this data.
  • Annotations from SingleR, a reference-based approach (Aran et al. 2019). The BlueprintEncodeData dataset, obtained from the celldex package, was used for reference annotations.
  • Annotations from CellAssign, a marker-gene-based approach (Zhang et al. 2019). Marker genes for cell types were obtained from PanglaoDB and compiled into a reference named blood. This reference includes the following organs and tissue compartments: organ.

Cell type annotation summary

The plots and tables included here detail the results from performing cell type annotation.

Statistics

Submitter-provided cell type annotations

In this table, cells labeled “Submitter-excluded” are those for which submitters did not provide an annotation.
Annotated cell type  Number of cells  Percent of cells
celltype4            1408             25.83%
celltype3            1384             25.39%
celltype2            1353             24.82%
celltype1            1307             23.97%
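
The counts and percentages in summary tables like this one can be recomputed from per-cell labels. The following is a minimal sketch in Python, not the code used to generate this report; the function name `summarize_labels` is hypothetical, and the per-cell labels are rebuilt from the counts reported for the submitter-provided annotations.

```python
from collections import Counter


def summarize_labels(labels):
    """Return (label, count, percent) rows sorted by descending count.

    Hypothetical helper for illustration only; it does not apply any
    special ordering to an "Unknown cell type" category.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return [
        (label, n, round(100 * n / total, 2))
        for label, n in counts.most_common()
    ]


# Per-cell labels rebuilt from the reported submitter-provided counts.
labels = (
    ["celltype4"] * 1408
    + ["celltype3"] * 1384
    + ["celltype2"] * 1353
    + ["celltype1"] * 1307
)

for label, n, pct in summarize_labels(labels):
    print(f"{label}\t{n}\t{pct:.2f}%")
```

Each percentage is the label's count divided by the total number of cells in the library (5452 here), rounded to two decimal places.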

SingleR cell type annotations

In this table, cells labeled “Unknown cell type” are those which SingleR pruned due to low-quality assignments. In the processed result files, these cells are labeled NA.
Annotated cell type                              Number of cells  Percent of cells
monocyte                                         2686             49.27%
granulocyte monocyte progenitor cell             2483             45.54%
common myeloid progenitor                        105              1.93%
common lymphoid progenitor                       46               0.84%
naive B cell                                     29               0.53%
hematopoietic multipotent progenitor cell        21               0.39%
plasma cell                                      17               0.31%
natural killer cell                              15               0.28%
macrophage                                       14               0.26%
CD8-positive, alpha-beta T cell                  11               0.20%
CD4-positive, alpha-beta T cell                  7                0.13%
eosinophil                                       5                0.09%
effector memory CD8-positive, alpha-beta T cell  3                0.06%
central memory CD8-positive, alpha-beta T cell   2                0.04%
hematopoietic stem cell                          2                0.04%
memory B cell                                    2                0.04%
effector memory CD4-positive, alpha-beta T cell  1                0.02%
megakaryocyte-erythroid progenitor cell          1                0.02%
Unknown cell type                                2                0.04%

CellAssign cell type annotations

In this table, cells labeled “Unknown cell type” are those which CellAssign could not confidently assign to a label in the reference list. In the processed result files, these cells are labeled "other".
Annotated cell type               Number of cells  Percent of cells
Myeloid-derived suppressor cells  2005             36.78%
Gamma delta T cells               221              4.05%
B cells naive                     38               0.70%
Neutrophils                       32               0.59%
Dendritic cells                   31               0.57%
Eosinophils                       22               0.40%
T cells                           17               0.31%
NK cells                          16               0.29%
B cells memory                    13               0.24%
T memory cells                    8                0.15%
Megakaryocytes                    4                0.07%
Osteoclast precursor cells        3                0.06%
Plasma cells                      2                0.04%
Hematopoietic stem cells          1                0.02%
Macrophages                       1                0.02%
Mast cells                        1                0.02%
Monocytes                         1                0.02%
Nuocytes                          1                0.02%
Red pulp macrophages              1                0.02%
Reticulocytes                     1                0.02%
Unknown cell type                 3033             55.63%

UMAPs

In this section, we show UMAPs colored by clusters. Clusters were calculated using the graph-based Louvain algorithm with Jaccard weighting.

Next, we show UMAPs colored by cell types. For each cell typing method, we show a separate faceted UMAP. In each panel, cells that were assigned the given cell type label are colored, while all other cells are in grey.

For legibility, only the seven most common cell types are shown. All other cell types are grouped together and labeled “All remaining cell types” (not to be confused with “Unknown cell type”, which represents cells that could not be classified).
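
The grouping described above can be sketched as follows. This is an illustrative Python sketch, not the report's own code: the function name `lump_labels` and its parameters are hypothetical, and it assumes that “Unknown cell type” is kept as its own category and is not counted among the seven most common labels.

```python
from collections import Counter


def lump_labels(labels, n=7, other="All remaining cell types",
                unknown="Unknown cell type"):
    """Keep the n most common labels and group the rest under `other`.

    Assumption (not confirmed by the report): the `unknown` label is always
    kept separate and does not compete for one of the n slots.
    Ties are broken by first appearance, as in Counter.most_common().
    """
    counts = Counter(label for label in labels if label != unknown)
    keep = {label for label, _ in counts.most_common(n)}
    return [
        label if label in keep or label == unknown else other
        for label in labels
    ]


# Toy example with n=2: "a" and "b" are kept, "c" and "d" are grouped.
toy = ["a"] * 5 + ["b"] * 4 + ["c"] * 3 + ["d"] + ["Unknown cell type"] * 2
print(Counter(lump_labels(toy, n=2)))
```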

Cell label comparison plots

This section displays heatmaps comparing cell labels from various methods.

We use the Jaccard similarity index to display the agreement between pairs of labels assigned by different annotation methods.

The Jaccard index reflects the degree of overlap between the two labels and ranges from 0 to 1.

  • If the labels are assigned to identical sets of cells, the Jaccard index will be 1.
  • If the labels are assigned to completely non-overlapping sets of cells, the Jaccard index will be 0.

High agreement between methods qualitatively indicates higher confidence in the cell type annotation.
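
As a concrete illustration of the index described above, the following Python sketch computes the Jaccard index for every pair of labels from two annotation methods. The function names and the toy labels are hypothetical, and returning 0 when both sets are empty is an assumption made here for convenience.

```python
def jaccard(cells_a, cells_b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(cells_a), set(cells_b)
    union = a | b
    if not union:
        return 0.0  # assumption: define the index as 0 when both sets are empty
    return len(a & b) / len(union)


def jaccard_matrix(labels_x, labels_y):
    """Pairwise Jaccard index between each label in x and each label in y.

    labels_x and labels_y are per-cell labels in the same cell order.
    """
    cells_x, cells_y = {}, {}
    for i, (x, y) in enumerate(zip(labels_x, labels_y)):
        cells_x.setdefault(x, set()).add(i)
        cells_y.setdefault(y, set()).add(i)
    return {
        (x, y): jaccard(sx, sy)
        for x, sx in cells_x.items()
        for y, sy in cells_y.items()
    }


# Toy example: four cells annotated by two methods with different label names.
method_1 = ["monocyte", "monocyte", "naive B cell", "naive B cell"]
method_2 = ["Monocytes", "Unknown cell type", "B cells naive", "B cells naive"]
result = jaccard_matrix(method_1, method_2)
print(result[("naive B cell", "B cells naive")])  # identical cell sets -> 1.0
print(result[("monocyte", "Monocytes")])          # 1 shared cell of 2 -> 0.5
print(result[("monocyte", "B cells naive")])      # no shared cells -> 0.0
```

Because the index depends only on which cells carry each label, two methods can agree perfectly even when their label strings differ.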

Unsupervised clustering

Here we show the labels from unsupervised clustering compared to cell type annotations. Cluster assignment was performed using the Louvain algorithm.

Submitter-provided annotations

This section displays heatmaps comparing submitter-provided cell type annotations to those obtained from SingleR and CellAssign.

Automated annotations

This section displays a heatmap directly comparing SingleR and CellAssign cell type annotations. Note that because the two methods use different annotation references, they may use different names for similar cell types.

Quality assessments of automated annotations

SingleR annotations

To assess the quality of SingleR cell type annotations, we use the delta median statistic.

  • Delta median is calculated for each cell as the difference between the SingleR score of the annotated cell type label and the median score of the other cell type labels in the reference dataset.
  • Higher delta median values indicate higher quality cell type annotations.
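
The calculation can be sketched for a single cell as follows. This Python sketch follows the definition given above (annotated label's score minus the median score of the other labels) and is not SingleR's own code; the function name `delta_median` and the scores are hypothetical.

```python
from statistics import median


def delta_median(scores, assigned_label):
    """Delta median for one cell, following the definition in this report.

    scores: dict mapping each reference label to the cell's score for it.
    assigned_label: the label the cell was annotated with.
    """
    others = [s for label, s in scores.items() if label != assigned_label]
    if not others:
        raise ValueError("need at least one other label to take a median")
    return scores[assigned_label] - median(others)


# Hypothetical scores for one cell annotated as "monocyte".
scores = {
    "monocyte": 0.62,
    "macrophage": 0.55,
    "naive B cell": 0.20,
    "plasma cell": 0.18,
}
# Median of the other scores (0.55, 0.20, 0.18) is 0.20.
print(round(delta_median(scores, "monocyte"), 2))  # 0.42
```

A value near zero means the annotated label scored about the same as a typical alternative label, which is why larger values suggest a more distinct assignment.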

You can interpret this plot as follows:

  • Each point represents the delta median statistic of a given cell whose SingleR annotation is shown on the y-axis.
  • The point style indicates SingleR’s quality assessment of the annotation:
    • High-quality cell annotations are shown as closed points.
    • Low-quality cell annotations are shown as open points. In other sections of this report, these cells are labeled “Unknown cell type”.
    • For more information on how SingleR calculates annotation quality, please refer to the SingleR documentation.
  • Diamonds represent the median of the delta median statistic specifically among high-quality annotations for the given cell type annotation.

CellAssign annotations

To assess the quality of CellAssign cell type annotations, we consider the probability associated with the annotated cell type. These probabilities are provided directly by CellAssign:

  • CellAssign first calculates the probability of each cell being annotated as each cell type present in the reference.
  • CellAssign then annotates cells by selecting the cell type with the highest probability among all cell types considered.
  • These probabilities range from 0 to 1, with larger values indicating greater confidence in a given cell type label. We therefore expect reliable labels to have values close to 1.
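
The selection step described above can be sketched for a single cell as follows. This Python sketch illustrates picking the highest-probability label and is not CellAssign's own code; the function name `assign_cell_type` and the probabilities are hypothetical.

```python
def assign_cell_type(probabilities):
    """Return the label with the highest probability and that probability.

    probabilities: dict mapping each cell type in the reference to the
    probability that this cell belongs to it (values between 0 and 1).
    """
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Hypothetical probabilities for one cell.
cell_probs = {"Monocytes": 0.91, "Neutrophils": 0.06, "NK cells": 0.03}
label, prob = assign_cell_type(cell_probs)
print(label, prob)  # Monocytes 0.91
```

The probability returned alongside the label is the value whose distribution is summarized in the plot for each final cell type label.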

The plot below shows the distribution of CellAssign-calculated probabilities for the final cell type labels. Line segments represent individual values that comprise each distribution.

For cell types with 2 or fewer labeled cells, only the individual value line segments are shown. Line segments are also taller for any cell type label with 5 or fewer cells.

Session Info

R session information
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.1 (2023-06-16)
##  os       macOS Sonoma 14.2.1
##  system   aarch64, darwin20
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       America/New_York
##  date     2024-01-05
##  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package              * version   date (UTC) lib source
##  abind                  1.4-5     2016-07-21 [1] CRAN (R 4.3.0)
##  beachmat               2.16.0    2023-05-08 [1] Bioconductor
##  Biobase              * 2.60.0    2023-05-08 [1] Bioconductor
##  BiocGenerics         * 0.46.0    2023-06-04 [1] Bioconductor
##  BiocParallel           1.34.2    2023-05-28 [1] Bioconductor
##  bitops                 1.0-7     2021-04-24 [1] CRAN (R 4.3.0)
##  bslib                  0.5.1     2023-08-11 [1] CRAN (R 4.3.0)
##  cachem                 1.0.8     2023-05-01 [1] CRAN (R 4.3.0)
##  Cairo                  1.6-1     2023-08-18 [1] CRAN (R 4.3.0)
##  circlize               0.4.15    2022-05-10 [1] CRAN (R 4.3.0)
##  cli                    3.6.1     2023-03-23 [1] CRAN (R 4.3.0)
##  clue                   0.3-65    2023-09-23 [1] CRAN (R 4.3.1)
##  cluster                2.1.4     2022-08-22 [2] CRAN (R 4.3.1)
##  codetools              0.2-19    2023-02-01 [2] CRAN (R 4.3.1)
##  colorspace             2.1-0     2023-01-23 [1] CRAN (R 4.3.0)
##  ComplexHeatmap         2.16.0    2023-05-08 [1] Bioconductor
##  crayon                 1.5.2     2022-09-29 [1] CRAN (R 4.3.0)
##  DelayedArray           0.26.7    2023-07-30 [1] Bioconductor
##  DelayedMatrixStats     1.22.6    2023-09-03 [1] Bioconductor
##  digest                 0.6.33    2023-07-07 [1] CRAN (R 4.3.0)
##  doParallel             1.0.17    2022-02-07 [1] CRAN (R 4.3.0)
##  dplyr                  1.1.4     2023-11-17 [1] CRAN (R 4.3.1)
##  evaluate               0.23      2023-11-01 [1] CRAN (R 4.3.1)
##  fansi                  1.0.5     2023-10-08 [1] CRAN (R 4.3.1)
##  farver                 2.1.1     2022-07-06 [1] CRAN (R 4.3.0)
##  fastmap                1.1.1     2023-02-24 [1] CRAN (R 4.3.0)
##  flexmix                2.3-19    2023-03-16 [1] CRAN (R 4.3.0)
##  forcats                1.0.0     2023-01-29 [1] CRAN (R 4.3.0)
##  foreach                1.5.2     2022-02-02 [1] CRAN (R 4.3.0)
##  generics               0.1.3     2022-07-05 [1] CRAN (R 4.3.0)
##  GenomeInfoDb         * 1.36.4    2023-10-08 [1] Bioconductor
##  GenomeInfoDbData       1.2.10    2023-09-14 [1] Bioconductor
##  GenomicRanges        * 1.52.1    2023-10-08 [1] Bioconductor
##  GetoptLong             1.0.5     2020-12-15 [1] CRAN (R 4.3.0)
##  ggforce                0.4.1     2022-10-04 [1] CRAN (R 4.3.0)
##  ggplot2              * 3.4.4     2023-10-12 [1] CRAN (R 4.3.1)
##  GlobalOptions          0.1.2     2020-06-10 [1] CRAN (R 4.3.0)
##  glue                   1.6.2     2022-02-24 [1] CRAN (R 4.3.0)
##  gtable                 0.3.4     2023-08-21 [1] CRAN (R 4.3.0)
##  highr                  0.10      2022-12-22 [1] CRAN (R 4.3.0)
##  htmltools              0.5.7     2023-11-03 [1] CRAN (R 4.3.1)
##  httr                   1.4.7     2023-08-15 [1] CRAN (R 4.3.0)
##  IRanges              * 2.34.1    2023-07-02 [1] Bioconductor
##  iterators              1.0.14    2022-02-05 [1] CRAN (R 4.3.0)
##  jquerylib              0.1.4     2021-04-26 [1] CRAN (R 4.3.0)
##  jsonlite               1.8.8     2023-12-04 [1] CRAN (R 4.3.1)
##  kableExtra             1.3.4     2021-02-20 [1] CRAN (R 4.3.0)
##  knitr                  1.45      2023-10-30 [1] CRAN (R 4.3.1)
##  labeling               0.4.3     2023-08-29 [1] CRAN (R 4.3.0)
##  lattice                0.21-8    2023-04-05 [2] CRAN (R 4.3.1)
##  lifecycle              1.0.3     2022-10-07 [1] CRAN (R 4.3.0)
##  magrittr               2.0.3     2022-03-30 [1] CRAN (R 4.3.0)
##  MASS                   7.3-60    2023-05-04 [2] CRAN (R 4.3.1)
##  Matrix                 1.5-4.1   2023-05-18 [2] CRAN (R 4.3.1)
##  MatrixGenerics       * 1.12.3    2023-07-30 [1] Bioconductor
##  matrixStats          * 1.0.0     2023-06-02 [1] CRAN (R 4.3.0)
##  miQC                   1.8.0     2023-05-08 [1] Bioconductor
##  modeltools             0.2-23    2020-03-05 [1] CRAN (R 4.3.0)
##  munsell                0.5.0     2018-06-12 [1] CRAN (R 4.3.0)
##  nnet                   7.3-19    2023-05-03 [2] CRAN (R 4.3.1)
##  pillar                 1.9.0     2023-03-22 [1] CRAN (R 4.3.0)
##  pkgconfig              2.0.3     2019-09-22 [1] CRAN (R 4.3.0)
##  png                    0.1-8     2022-11-29 [1] CRAN (R 4.3.0)
##  polyclip               1.10-6    2023-09-27 [1] CRAN (R 4.3.1)
##  purrr                  1.0.2     2023-08-10 [1] CRAN (R 4.3.0)
##  R6                     2.5.1     2021-08-19 [1] CRAN (R 4.3.0)
##  RColorBrewer           1.1-3     2022-04-03 [1] CRAN (R 4.3.0)
##  Rcpp                   1.0.11    2023-07-06 [1] CRAN (R 4.3.0)
##  RCurl                  1.98-1.13 2023-11-02 [1] CRAN (R 4.3.1)
##  rjson                  0.2.21    2022-01-09 [1] CRAN (R 4.3.0)
##  rlang                  1.1.2     2023-11-04 [1] CRAN (R 4.3.1)
##  rmarkdown              2.25      2023-09-18 [1] CRAN (R 4.3.1)
##  rstudioapi             0.15.0    2023-07-07 [1] CRAN (R 4.3.0)
##  rvest                  1.0.3     2022-08-19 [1] CRAN (R 4.3.0)
##  S4Arrays               1.0.6     2023-08-30 [1] Bioconductor
##  S4Vectors            * 0.38.2    2023-09-24 [1] Bioconductor
##  sass                   0.4.7     2023-07-15 [1] CRAN (R 4.3.0)
##  scales                 1.2.1     2022-08-20 [1] CRAN (R 4.3.0)
##  scuttle                1.10.3    2023-10-15 [1] Bioconductor
##  sessioninfo            1.2.2     2021-12-06 [1] CRAN (R 4.3.0)
##  shape                  1.4.6     2021-05-19 [1] CRAN (R 4.3.0)
##  SingleCellExperiment * 1.22.0    2023-05-08 [1] Bioconductor
##  sparseMatrixStats      1.12.2    2023-07-02 [1] Bioconductor
##  stringi                1.7.12    2023-01-11 [1] CRAN (R 4.3.0)
##  stringr                1.5.1     2023-11-14 [1] CRAN (R 4.3.1)
##  SummarizedExperiment * 1.30.2    2023-06-11 [1] Bioconductor
##  svglite                2.1.2     2023-10-11 [1] CRAN (R 4.3.1)
##  systemfonts            1.0.5     2023-10-09 [1] CRAN (R 4.3.1)
##  tibble                 3.2.1     2023-03-20 [1] CRAN (R 4.3.0)
##  tidyr                  1.3.0     2023-01-24 [1] CRAN (R 4.3.0)
##  tidyselect             1.2.0     2022-10-10 [1] CRAN (R 4.3.0)
##  tweenr                 2.0.2     2022-09-06 [1] CRAN (R 4.3.0)
##  utf8                   1.2.4     2023-10-22 [1] CRAN (R 4.3.1)
##  vctrs                  0.6.4     2023-10-12 [1] CRAN (R 4.3.1)
##  viridisLite            0.4.2     2023-05-02 [1] CRAN (R 4.3.0)
##  webshot                0.5.5     2023-06-26 [1] CRAN (R 4.3.0)
##  withr                  2.5.2     2023-10-30 [1] CRAN (R 4.3.1)
##  xfun                   0.41      2023-11-01 [1] CRAN (R 4.3.1)
##  xml2                   1.3.6     2023-12-04 [1] CRAN (R 4.3.1)
##  XVector                0.40.0    2023-05-08 [1] Bioconductor
##  yaml                   2.3.7     2023-01-23 [1] CRAN (R 4.3.0)
##  zlibbioc               1.46.0    2023-05-08 [1] Bioconductor
## 
##  [1] /Users/sjspielman/Library/R/arm64/4.3/library
##  [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────
