This report contains a summary of cell type annotation results for
library SCPCL000295. The goal of this report is to provide more detailed
information about cell type annotation results as an initial evaluation
of their quality and reliability.
Performing cell type annotation is an inherently challenging task
with high levels of uncertainty, especially when using automated
annotation methods. One way to address this is to use multiple cell type
annotation approaches and compare the results, as we have begun to do in
this report.
When multiple methods annotate a given cell as the same or similar
cell type, this may qualitatively indicate a more robust annotation.
Note that the contents of this report will vary based on which cell
type annotations are present. Further, be aware that different cell type
annotation methods may assign different labels to the same or similar
cell type (e.g., the different string representations “B cell naive”
and “Naive B cell”), due to use of different underlying reference
datasets.
This library contains the following cell type
annotations:
- Submitter-provided cell type annotation, generated by the original
lab which produced this data.
- Annotations from SingleR, a reference-based approach (Aran et al.
2019). The BlueprintEncodeData dataset, obtained from the celldex
package, was used for reference annotations.
- Annotations from CellAssign, a marker-gene-based approach (Zhang et al.
2019). Marker genes for cell types were obtained from PanglaoDB and
compiled into a reference named blood. This reference includes the
following organs and tissue compartments: organ.
Cell type annotation summary
The plots and tables included here detail the results from performing
cell type annotation.
Statistics
Submitter-provided cell type annotations
In this table, cells labeled “Submitter-excluded” are those for which
submitters did not provide an annotation.
| Annotated cell type | Number of cells | Percent of cells |
|---|---|---|
| celltype1 | 5452 | 100% |
SingleR cell type annotations
In this table, cells labeled “Unknown cell type” are those which
SingleR pruned due to low-quality assignments. In the processed result
files, these cells are labeled NA.
| Annotated cell type | Number of cells | Percent of cells |
|---|---|---|
| monocyte | 2686 | 49.27% |
| granulocyte monocyte progenitor cell | 2483 | 45.54% |
| common myeloid progenitor | 105 | 1.93% |
| common lymphoid progenitor | 46 | 0.84% |
| naive B cell | 29 | 0.53% |
| hematopoietic multipotent progenitor cell | 21 | 0.39% |
| plasma cell | 17 | 0.31% |
| natural killer cell | 15 | 0.28% |
| macrophage | 14 | 0.26% |
| CD8-positive, alpha-beta T cell | 11 | 0.2% |
| CD4-positive, alpha-beta T cell | 7 | 0.13% |
| eosinophil | 5 | 0.09% |
| effector memory CD8-positive, alpha-beta T cell | 3 | 0.06% |
| central memory CD8-positive, alpha-beta T cell | 2 | 0.04% |
| hematopoietic stem cell | 2 | 0.04% |
| memory B cell | 2 | 0.04% |
| effector memory CD4-positive, alpha-beta T cell | 1 | 0.02% |
| megakaryocyte-erythroid progenitor cell | 1 | 0.02% |
| Unknown cell type | 2 | 0.04% |
CellAssign cell type annotations
In this table, cells labeled “Unknown cell type” are those which
CellAssign could not confidently assign to a label in the reference
list. In the processed result files, these cells are labeled "other".
| Annotated cell type | Number of cells | Percent of cells |
|---|---|---|
| Myeloid-derived suppressor cells | 2005 | 36.78% |
| Gamma delta T cells | 221 | 4.05% |
| B cells naive | 38 | 0.7% |
| Neutrophils | 32 | 0.59% |
| Dendritic cells | 31 | 0.57% |
| Eosinophils | 22 | 0.4% |
| T cells | 17 | 0.31% |
| NK cells | 16 | 0.29% |
| B cells memory | 13 | 0.24% |
| T memory cells | 8 | 0.15% |
| Megakaryocytes | 4 | 0.07% |
| Osteoclast precursor cells | 3 | 0.06% |
| Plasma cells | 2 | 0.04% |
| Hematopoietic stem cells | 1 | 0.02% |
| Macrophages | 1 | 0.02% |
| Mast cells | 1 | 0.02% |
| Monocytes | 1 | 0.02% |
| Nuocytes | 1 | 0.02% |
| Red pulp macrophages | 1 | 0.02% |
| Reticulocytes | 1 | 0.02% |
| Unknown cell type | 3033 | 55.63% |
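The count and percent columns in the statistics tables can be derived from a per-cell vector of labels. The following is a minimal Python sketch of that tabulation, not the code that generated this report. The function name `summarize_labels`, its parameters, and the toy input are hypothetical; placing “Unknown cell type” last mirrors the row order seen in the tables, and rounding to two decimals is an assumption.

```python
from collections import Counter


def summarize_labels(labels, unknown_label="Unknown cell type"):
    """Return (label, count, percent) rows for a list of per-cell labels.

    Rows are sorted by descending count, with the unknown label moved to
    the end (assumption: this mirrors the row order of the report tables).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    ordered = sorted(
        counts.items(),
        key=lambda item: (item[0] == unknown_label, -item[1]),
    )
    return [(label, n, round(100 * n / total, 2)) for label, n in ordered]


# Toy input for illustration only (not the real per-cell data).
toy_labels = (
    ["monocyte"] * 5 + ["naive B cell"] * 3 + ["Unknown cell type"] * 2
)
for label, n, percent in summarize_labels(toy_labels):
    print(f"{label} | {n} | {percent}%")
```

Applied to the real labels, each percent would be the label's count divided by the 5452 cells in this library.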
UMAPs
In this section, we show UMAPs colored by clusters. Clusters were
calculated using the graph-based Louvain algorithm with Jaccard
weighting.

Next, we show UMAPs colored by cell types. For each cell typing
method, we show a separate faceted UMAP. In each panel, cells that were
assigned the given cell type label are colored, while all other cells
are in grey.
For legibility, only the seven most common cell types are shown. All
other cell types are grouped together and labeled “All remaining cell
types” (not to be confused with “Unknown cell type” which represents
cells that could not be classified).
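The grouping of less common cell types described above can be sketched as follows. This is an illustrative Python example rather than the report's own implementation. The function name `lump_labels` and its parameters are hypothetical, and keeping “Unknown cell type” as its own category without counting it toward the seven is an assumption.

```python
from collections import Counter


def lump_labels(
    labels,
    n_keep=7,
    other_label="All remaining cell types",
    unknown_label="Unknown cell type",
):
    """Keep the n_keep most common labels and group the rest together.

    Assumption: the unknown label is left untouched and does not count
    toward n_keep.
    """
    counts = Counter(label for label in labels if label != unknown_label)
    keep = {label for label, _ in counts.most_common(n_keep)}
    return [
        label if (label in keep or label == unknown_label) else other_label
        for label in labels
    ]


# Toy input for illustration only.
toy_labels = ["a"] * 3 + ["b"] * 2 + ["c"] + ["Unknown cell type"] * 4
print(lump_labels(toy_labels, n_keep=2))
```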



Cell label comparison plots
This section displays heatmaps comparing cell labels from various
methods.
We use the Jaccard similarity index to display the agreement between
pairs of labels assigned by different annotation methods.
The Jaccard index reflects the degree of overlap between the two
labels and ranges from 0 to 1.
- If the labels are assigned to identical sets of cells, the Jaccard
index will be 1.
- If the labels are assigned to completely non-overlapping sets of
cells, the Jaccard index will be 0.
High agreement between methods qualitatively indicates higher
confidence in the cell type annotation.
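To make the definition concrete, the Jaccard index for a pair of labels is the number of cells carrying both labels divided by the number of cells carrying either label. The sketch below is a minimal Python illustration, not the code used to draw the heatmaps. The function names and toy labels are hypothetical, and returning 0 when both sets are empty is an assumption.

```python
def jaccard_index(cells_a, cells_b):
    """Jaccard index of two collections of cell identifiers."""
    a, b = set(cells_a), set(cells_b)
    union = a | b
    if not union:
        # Assumption: define the index as 0 when both sets are empty.
        return 0.0
    return len(a & b) / len(union)


def jaccard_matrix(labels_1, labels_2):
    """Pairwise Jaccard index between the labels of two methods.

    labels_1 and labels_2 hold one label per cell, in the same cell order.
    Returns a dict keyed by (label from method 1, label from method 2).
    """
    cells_1, cells_2 = {}, {}
    for cell, (label_1, label_2) in enumerate(zip(labels_1, labels_2)):
        cells_1.setdefault(label_1, set()).add(cell)
        cells_2.setdefault(label_2, set()).add(cell)
    return {
        (label_1, label_2): jaccard_index(set_1, set_2)
        for label_1, set_1 in cells_1.items()
        for label_2, set_2 in cells_2.items()
    }


# Toy labels for four cells (illustration only).
method_1 = ["monocyte", "monocyte", "naive B cell", "naive B cell"]
method_2 = ["Monocytes", "Monocytes", "B cells naive", "Monocytes"]
for pair, value in jaccard_matrix(method_1, method_2).items():
    print(pair, round(value, 2))
```

Each heatmap cell corresponds to one such pair of labels.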
Unsupervised clustering
Here we show the labels from unsupervised clustering compared to cell
type annotations. Cluster assignment was performed using the Louvain
algorithm.

Submitter-provided annotations
This section displays heatmaps comparing submitter-provided cell type
annotations to those obtained from SingleR and CellAssign.

Automated annotations
This section displays a heatmap directly comparing SingleR and
CellAssign cell type annotations. Note that due to different annotation
references, these methods may use different names for similar cell
types.

Quality assessments of automated annotations
SingleR annotations
To assess the quality of SingleR cell type annotations, we use the
delta median statistic.
- Delta median is calculated for each cell as the difference between
the SingleR score of the annotated cell type label and the median score
of the other cell type labels in the reference dataset.
- Higher delta median values indicate higher quality cell type
annotations.
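As a worked illustration of the delta median statistic, the Python sketch below computes it for a single cell from a set of per-label scores. It follows the wording above, taking the median over the labels other than the annotated one; that reading, the function name `delta_median`, and the toy scores are assumptions, and this is not SingleR's own implementation.

```python
from statistics import median


def delta_median(scores, assigned_label):
    """Delta median for one cell.

    scores maps each reference label to the cell's score for that label.
    Assumption: the median is taken over all labels except the assigned
    one, following the description in this report. At least two labels
    are needed.
    """
    other_scores = [
        score for label, score in scores.items() if label != assigned_label
    ]
    return scores[assigned_label] - median(other_scores)


# Toy scores for one cell (illustration only).
toy_scores = {
    "monocyte": 0.8,
    "macrophage": 0.6,
    "naive B cell": 0.2,
    "plasma cell": 0.1,
}
print(delta_median(toy_scores, "monocyte"))
```

A cell whose annotated label scores well above the other labels gets a large delta median, which is why higher values indicate a more clear-cut annotation.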
You can interpret this plot as follows:
- Each point represents the delta median statistic of a given cell
whose SingleR annotation is shown on the y-axis.
- The point style indicates SingleR’s quality assessment of the
annotation:
  - High-quality cell annotations are shown as closed points.
  - Low-quality cell annotations are shown as open points. In other
  sections of this report, these cells are labeled as “Unknown cell
  type”.
  - For more information on how SingleR calculates annotation quality,
  please refer to the SingleR documentation.
- Diamonds represent the median of the delta median statistic
specifically among high-quality annotations for the given cell type
annotation.

CellAssign annotations
To assess the quality of CellAssign cell type annotations, we consider
the probability associated with the annotated cell type. These
probabilities are provided directly by CellAssign:
- CellAssign first calculates the probability of each cell being
annotated as each cell type present in the reference.
- CellAssign then annotates cells by selecting the cell type with the
highest probability among all cell types considered.
- These probabilities range from 0 to 1, with larger values indicating
greater confidence in a given cell type label. We therefore expect
reliable labels to have values close to 1.
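The selection step described above amounts to taking, for each cell, the cell type with the largest probability and keeping that probability as the confidence value. The following is a minimal Python sketch of that step only. It does not reproduce how CellAssign estimates the probabilities; the function name `assign_labels` and the toy probabilities are hypothetical.

```python
def assign_labels(probabilities, cell_types):
    """Pick the highest-probability cell type for each cell.

    probabilities is a list of rows, one per cell, where each row holds
    one probability per entry of cell_types (same order).
    Returns a list of (label, probability) pairs, one per cell.
    """
    results = []
    for row in probabilities:
        best = max(range(len(cell_types)), key=lambda i: row[i])
        results.append((cell_types[best], row[best]))
    return results


# Toy probabilities for two cells (illustration only).
toy_cell_types = ["Monocytes", "B cells naive", "other"]
toy_probabilities = [
    [0.90, 0.05, 0.05],
    [0.20, 0.30, 0.50],
]
for label, probability in assign_labels(toy_probabilities, toy_cell_types):
    print(label, probability)
```

In this toy example the second cell's best label has a probability of only 0.5, the kind of value that would sit far from 1 in the distribution plot.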
The plot below shows the distribution of CellAssign-calculated
probabilities for the final cell type labels. Line segments represent
individual values that comprise each distribution.
For cell types with 2 or fewer labeled cells, only the individual
value line segments are shown. Line segments are also taller for any
cell type label with 5 or fewer cells.

Session Info
R session information
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.3.1 (2023-06-16)
## os macOS Sonoma 14.2.1
## system aarch64, darwin20
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/New_York
## date 2024-01-05
## pandoc 3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## abind 1.4-5 2016-07-21 [1] CRAN (R 4.3.0)
## beachmat 2.16.0 2023-05-08 [1] Bioconductor
## Biobase * 2.60.0 2023-05-08 [1] Bioconductor
## BiocGenerics * 0.46.0 2023-06-04 [1] Bioconductor
## BiocParallel 1.34.2 2023-05-28 [1] Bioconductor
## bitops 1.0-7 2021-04-24 [1] CRAN (R 4.3.0)
## bslib 0.5.1 2023-08-11 [1] CRAN (R 4.3.0)
## cachem 1.0.8 2023-05-01 [1] CRAN (R 4.3.0)
## Cairo 1.6-1 2023-08-18 [1] CRAN (R 4.3.0)
## circlize 0.4.15 2022-05-10 [1] CRAN (R 4.3.0)
## cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)
## clue 0.3-65 2023-09-23 [1] CRAN (R 4.3.1)
## cluster 2.1.4 2022-08-22 [2] CRAN (R 4.3.1)
## codetools 0.2-19 2023-02-01 [2] CRAN (R 4.3.1)
## colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.3.0)
## ComplexHeatmap 2.16.0 2023-05-08 [1] Bioconductor
## crayon 1.5.2 2022-09-29 [1] CRAN (R 4.3.0)
## DelayedArray 0.26.7 2023-07-30 [1] Bioconductor
## DelayedMatrixStats 1.22.6 2023-09-03 [1] Bioconductor
## digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)
## doParallel 1.0.17 2022-02-07 [1] CRAN (R 4.3.0)
## dplyr 1.1.4 2023-11-17 [1] CRAN (R 4.3.1)
## evaluate 0.23 2023-11-01 [1] CRAN (R 4.3.1)
## fansi 1.0.5 2023-10-08 [1] CRAN (R 4.3.1)
## farver 2.1.1 2022-07-06 [1] CRAN (R 4.3.0)
## fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)
## flexmix 2.3-19 2023-03-16 [1] CRAN (R 4.3.0)
## forcats 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
## foreach 1.5.2 2022-02-02 [1] CRAN (R 4.3.0)
## generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
## GenomeInfoDb * 1.36.4 2023-10-08 [1] Bioconductor
## GenomeInfoDbData 1.2.10 2023-09-14 [1] Bioconductor
## GenomicRanges * 1.52.1 2023-10-08 [1] Bioconductor
## GetoptLong 1.0.5 2020-12-15 [1] CRAN (R 4.3.0)
## ggforce 0.4.1 2022-10-04 [1] CRAN (R 4.3.0)
## ggplot2 * 3.4.4 2023-10-12 [1] CRAN (R 4.3.1)
## GlobalOptions 0.1.2 2020-06-10 [1] CRAN (R 4.3.0)
## glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.0)
## gtable 0.3.4 2023-08-21 [1] CRAN (R 4.3.0)
## highr 0.10 2022-12-22 [1] CRAN (R 4.3.0)
## htmltools 0.5.7 2023-11-03 [1] CRAN (R 4.3.1)
## httr 1.4.7 2023-08-15 [1] CRAN (R 4.3.0)
## IRanges * 2.34.1 2023-07-02 [1] Bioconductor
## iterators 1.0.14 2022-02-05 [1] CRAN (R 4.3.0)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.3.0)
## jsonlite 1.8.8 2023-12-04 [1] CRAN (R 4.3.1)
## kableExtra 1.3.4 2021-02-20 [1] CRAN (R 4.3.0)
## knitr 1.45 2023-10-30 [1] CRAN (R 4.3.1)
## labeling 0.4.3 2023-08-29 [1] CRAN (R 4.3.0)
## lattice 0.21-8 2023-04-05 [2] CRAN (R 4.3.1)
## lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.3.0)
## magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
## MASS 7.3-60 2023-05-04 [2] CRAN (R 4.3.1)
## Matrix 1.5-4.1 2023-05-18 [2] CRAN (R 4.3.1)
## MatrixGenerics * 1.12.3 2023-07-30 [1] Bioconductor
## matrixStats * 1.0.0 2023-06-02 [1] CRAN (R 4.3.0)
## miQC 1.8.0 2023-05-08 [1] Bioconductor
## modeltools 0.2-23 2020-03-05 [1] CRAN (R 4.3.0)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.3.0)
## nnet 7.3-19 2023-05-03 [2] CRAN (R 4.3.1)
## pillar 1.9.0 2023-03-22 [1] CRAN (R 4.3.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
## png 0.1-8 2022-11-29 [1] CRAN (R 4.3.0)
## polyclip 1.10-6 2023-09-27 [1] CRAN (R 4.3.1)
## purrr 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
## RColorBrewer 1.1-3 2022-04-03 [1] CRAN (R 4.3.0)
## Rcpp 1.0.11 2023-07-06 [1] CRAN (R 4.3.0)
## RCurl 1.98-1.13 2023-11-02 [1] CRAN (R 4.3.1)
## rjson 0.2.21 2022-01-09 [1] CRAN (R 4.3.0)
## rlang 1.1.2 2023-11-04 [1] CRAN (R 4.3.1)
## rmarkdown 2.25 2023-09-18 [1] CRAN (R 4.3.1)
## rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)
## rvest 1.0.3 2022-08-19 [1] CRAN (R 4.3.0)
## S4Arrays 1.0.6 2023-08-30 [1] Bioconductor
## S4Vectors * 0.38.2 2023-09-24 [1] Bioconductor
## sass 0.4.7 2023-07-15 [1] CRAN (R 4.3.0)
## scales 1.2.1 2022-08-20 [1] CRAN (R 4.3.0)
## scuttle 1.10.3 2023-10-15 [1] Bioconductor
## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
## shape 1.4.6 2021-05-19 [1] CRAN (R 4.3.0)
## SingleCellExperiment * 1.22.0 2023-05-08 [1] Bioconductor
## sparseMatrixStats 1.12.2 2023-07-02 [1] Bioconductor
## stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0)
## stringr 1.5.1 2023-11-14 [1] CRAN (R 4.3.1)
## SummarizedExperiment * 1.30.2 2023-06-11 [1] Bioconductor
## svglite 2.1.2 2023-10-11 [1] CRAN (R 4.3.1)
## systemfonts 1.0.5 2023-10-09 [1] CRAN (R 4.3.1)
## tibble 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
## tidyr 1.3.0 2023-01-24 [1] CRAN (R 4.3.0)
## tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.3.0)
## tweenr 2.0.2 2022-09-06 [1] CRAN (R 4.3.0)
## utf8 1.2.4 2023-10-22 [1] CRAN (R 4.3.1)
## vctrs 0.6.4 2023-10-12 [1] CRAN (R 4.3.1)
## viridisLite 0.4.2 2023-05-02 [1] CRAN (R 4.3.0)
## webshot 0.5.5 2023-06-26 [1] CRAN (R 4.3.0)
## withr 2.5.2 2023-10-30 [1] CRAN (R 4.3.1)
## xfun 0.41 2023-11-01 [1] CRAN (R 4.3.1)
## xml2 1.3.6 2023-12-04 [1] CRAN (R 4.3.1)
## XVector 0.40.0 2023-05-08 [1] Bioconductor
## yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
## zlibbioc 1.46.0 2023-05-08 [1] Bioconductor
##
## [1] /Users/sjspielman/Library/R/arm64/4.3/library
## [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
##
## ──────────────────────────────────────────────────────────────────────────────
