Update UMAP faceting code #644

Merged Jan 9, 2024 (19 commits)
Changes from 8 commits
93eed97
Update legend position and add label wrapping to the lump function. T…
sjspielman Jan 5, 2024
d319c60
Fix factor bug - after lumping, they should still be in order of freq…
sjspielman Jan 5, 2024
22a1beb
Add conditional legend placement using modulo
sjspielman Jan 5, 2024
3fa66d0
add dplyr:: for when this function is called from the supp report, wh…
sjspielman Jan 5, 2024
1e22cd2
Merge branch 'development' into sjspielman/637-faceted-umap-legend
sjspielman Jan 5, 2024
2addaad
make sure we're using three columns
sjspielman Jan 5, 2024
5b1732c
also account for unknown cell type in the refactoring
sjspielman Jan 5, 2024
dc5af08
also if 1 cell type, legend goes on the bottom. This is almost certai…
sjspielman Jan 5, 2024
ad59190
restore aspect.ratio = 1 and update legend placement to fit
sjspielman Jan 5, 2024
b00c951
Set umap height dynamically, but fix width at 8.
sjspielman Jan 5, 2024
14d1d4b
Plot height and legend tweaks after viewing all different options
sjspielman Jan 5, 2024
cf16bfb
Add comments about legend placement choices
sjspielman Jan 8, 2024
61d22c2
Styling: Align UMAPs in the center and update sizing approach to dete…
sjspielman Jan 8, 2024
2e040da
use roxygen comments and put function in a function chunk higher in rmd
sjspielman Jan 8, 2024
13dd624
fix comment typo
sjspielman Jan 8, 2024
c40a52b
Update templates/qc_report/celltypes_qc.rmd
sjspielman Jan 9, 2024
900a8d2
Update templates/qc_report/celltypes_qc.rmd
sjspielman Jan 9, 2024
16ffe3e
Legend y-positioning based on number of facets
sjspielman Jan 9, 2024
0768afe
Merge branch 'development' into sjspielman/637-faceted-umap-legend
sjspielman Jan 9, 2024
61 changes: 49 additions & 12 deletions templates/qc_report/celltypes_qc.rmd
@@ -42,6 +42,7 @@ format_celltype_n_table <- function(df) {

#' Function to lump celltype columns in an existing data frame for all of the
#' following columns, if they exist: `<singler/cellassign/submitter>_celltype_annotation`.
#' The cell types will also be renamed via wrapping at the given `wrap` level.
#' The resulting lumped column will be named:
#' `<singler/cellassign/submitter>_celltype_annotation_lumped`.
#'
@@ -50,16 +51,28 @@ format_celltype_n_table <- function(df) {
#' @param n_celltypes Number of groups to lump into, with rest put into "Other" group. Default is 7.
#'
#' @return Updated df with new column of lumped celltypes for each present method
lump_celltypes <- function(df, n_celltypes = 7) {
lump_wrap_celltypes <- function(df, n_celltypes = 7, wrap = 35) {
df <- df |>
# First, wrap labels
dplyr::mutate(
across(
ends_with("_celltype_annotation"),
\(x) forcats::fct_lump_n(x, n_celltypes, other_level = "All remaining cell types", ties.method = "first"),
\(x) stringr::str_wrap(x, wrap)
)
) |>
# Next, apply factor lumping, but ensure final order is via frequency with the "others" at the end
dplyr::mutate(
across(
ends_with("_celltype_annotation"),
\(x) {
x |>
forcats::fct_lump_n(n_celltypes, other_level = "All remaining cell types", ties.method = "first") |>
forcats::fct_infreq() |>
forcats::fct_relevel("Unknown cell type", "All remaining cell types", after = Inf)
},
.names = "{.col}_lumped"
)
)

return(df)
}

@@ -120,14 +133,20 @@ plot_umap <- function(
#' In each panel, the cell type of interest is colored and all other cells are grey.
faceted_umap <- function(umap_df,
annotation_column) {
# Find total number of cell types for determining legend placement
total_celltypes <- umap_df |>
dplyr::pull({{ annotation_column }}) |>
levels() |>
length()

# color by the annotation column but only color one cell type at a time
faceted_umap <- ggplot(
umap_df,
aes(x = UMAP1, y = UMAP2, color = {{ annotation_column }})
) +
# set points for all "other" points
geom_point(
data = select(
data = dplyr::select(
umap_df, -{{ annotation_column }}
),
color = "gray80",
@@ -136,7 +155,10 @@ faceted_umap <- function(umap_df,
) +
# set points for desired cell type
geom_point(size = 0.3, alpha = 0.5) +
facet_wrap(vars({{ annotation_column }})) +
facet_wrap(
vars({{ annotation_column }}),
ncol = 3
) +
scale_color_brewer(palette = "Dark2") +
# remove axis numbers and background grid
scale_x_continuous(labels = NULL, breaks = NULL) +
@@ -150,13 +172,28 @@ faceted_umap <- function(umap_df,
size = 1.5
)
)
) +
theme(
legend.position = c(.9, 0),
legend.justification = c("right", "bottom"),
legend.title.align = 0.5
)

# Determine legend placement based on total_celltypes
if (total_celltypes %% 3 != 0 & total_celltypes != 1) {
faceted_umap <- faceted_umap +
theme(
legend.position = c(0.95, 0),
legend.justification = c(1, -0.1),
legend.title.align = 0.5,
# use slightly smaller legend text, which helps legend fit and prevents
# long wrapped labels from bunching up
legend.text = element_text(size = rel(0.85))
)
} else {
Review comment (Member):
Can you add a comment here noting that this case is specifically for n = 2, where we have fewer than 3 columns and so need the legend on the bottom?
A comment on each of the if conditions explaining why that decision was made would also be helpful.

faceted_umap <- faceted_umap +
theme(
legend.position = "bottom"
)
}



return(faceted_umap)
}
```
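
As an illustration of the kind of per-branch commenting requested in review, here is a minimal sketch of the legend-placement decision in isolation. It follows the condition as it appears in this diff (`total_celltypes %% 3 != 0 & total_celltypes != 1`), rewritten in its equivalent "bottom first" form. The helper name `legend_placement()`, its `ncol` argument, and the returned strings are hypothetical and are not part of the PR.

```r
# Hypothetical helper (not in the PR): decide where the legend would go for
# a facet grid that always uses `ncol` columns.
legend_placement <- function(total_celltypes, ncol = 3) {
  # One facet leaves no grid to tuck a legend into, and a facet count that is
  # an exact multiple of `ncol` fills the last row completely. In both cases
  # there is no empty slot, so the legend goes below the panels.
  if (total_celltypes == 1 || total_celltypes %% ncol == 0) {
    return("bottom")
  }

  # Otherwise the last row has at least one empty slot on the right, so the
  # legend can sit inside the plot area at the bottom right.
  "inside_bottom_right"
}

# Placement for 1 through 7 cell types
vapply(1:7, legend_placement, character(1))
```

Returning a label rather than a `theme()` object keeps the decision easy to check on its own; in the function itself, each branch would add the corresponding `theme()` call.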
@@ -278,8 +315,8 @@ Clusters were calculated using the graph-based {metadata(processed_sce)$cluster_


```{r, eval = has_umap}
# Create dataset for plotting UMAPs with lumped cell types
umap_df <- lump_celltypes(celltype_df)
# Create dataset for plotting UMAPs with lumped and label-wrapped cell types
umap_df <- lump_wrap_celltypes(celltype_df)
```
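
To make the intended factor ordering concrete, here is a small sketch of the wrap, lump, and reorder sequence applied to a toy vector instead of a data frame column. The labels and counts are made up for illustration, and `n = 3` is used instead of the default of 7 so that lumping actually happens; the `forcats` and `stringr` calls are the same ones used in `lump_wrap_celltypes()`.

```r
library(forcats)
library(stringr)

# Toy annotations (made up); one label is long enough to wrap at width 35
labels <- c(
  rep("T cell", 6),
  rep("Unknown cell type", 5),
  rep("B cell", 4),
  rep("Natural killer cell with a deliberately long label", 2),
  "Monocyte"
)

# Step 1: wrap long labels so they fit in facet strips and the legend
wrapped <- str_wrap(labels, width = 35)

# Step 2: lump, order by frequency, then push the catch-all levels to the end
lumped <- wrapped |>
  fct_lump_n(3, other_level = "All remaining cell types", ties.method = "first") |>
  fct_infreq() |>
  fct_relevel("Unknown cell type", "All remaining cell types", after = Inf)

levels(lumped)
# "T cell" "B cell" "Unknown cell type" "All remaining cell types"
```

Without the final `fct_relevel()` call, "Unknown cell type" would sort by its frequency (second here) rather than sitting next to the "All remaining cell types" group at the end.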

