-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split up cell types QC report #483
Conversation
…functions are only used once, but still this makes things cleaner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few initial thoughts, some of which I can probably get over:
-
My first thought is that I don't really like using
source()
to define functions, as I think it can make for extremely confusing namespaces. But I can probably get over that. (I looked into some ideas for creating a namespace... one idea here: https://community.rstudio.com/t/is-it-dangerous-to-create-a-namespace-for-functions-in-a-script-without-a-package/91141 But this is probably too much effort for my own discomfort.)- I would try to limit the
source
d file to only the functions that are used in more than one place, and try to avoid code that is about formatting (kable & plotting) as I think the indirection makes it harder to do the live adjustments we are probably going to need during revisions. - When using sourced files, those files should contain all required
library()
imports, even if they are potentially redundant. (In this case, SingleCellExperiment, most likely). This is probably controversial, as it could result in overwriting of functions in a non-transparent way, but that is really more being mad atsource()
again.
- I would try to limit the
-
The
celltypes_report.rmd
file should probably live insideqc_report(s)
(possibly with a bit of renaming...main_qc_report.rmd
?)- This is because when we are passing files to a process, it is easier to pass the directory that contains all needed files and know that the structure will be preserved, vs passing in each of the component files that we will need, and worrying that nextflow might rename them.
- Since we need the
qc_utils
directory that should also live withinqc_report
or be just a bare file in that directory.
-
As I expect we will want to give the celltype report a dynamic name, we probably want to add a parameter with that value to the QC report and use that for the link.
Finally, in response to your last question: yes, I think we will want to expand the celltype report with a bit more explanation in a few places, but I think that can wait for separate PRs.
# kable formatting | ||
knitr::kable(align = "r") |> | ||
kableExtra::kable_styling( | ||
bootstrap_options = "striped", | ||
full_width = FALSE, | ||
position = "left" | ||
) |> | ||
kableExtra::column_spec(2, monospace = TRUE) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might leave the kable
formatting out. If we are reusing this style frequently across different table types, it might make sense as a separate function that we define at the top and reuse as needed. (I think we duplicate this a lot)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are reusing this style frequently across different table types, it might make sense as a separate function that we define at the top and reuse as needed. (I think we duplicate this a lot)
💯^💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for now I'm splitting out the kable
code into its own function, but more generally, see #371 (comment).
Ok, we're getting some stochastic stub errors; see also - #479 (comment) |
…e_functions.R. Other functions go into the file where they are used
… relative link to supp report with library id.
…d update case_when syntax
…isit more generally
… it does have a for loop...
…plify code to prepare umap df - just needs to lump cell types now
@jashapiro this should be ready for another look! I've made quite a few changes in response to review so I won't list them all here, but the commits are each very focused 💪 to serve as that "change list." Noting there's also a for loop that made it in for this round but I think it was a judiciously-used one, let me know if you have other thoughts! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine to me.
I probably would have a few more suggestions if I went through every line, but for the general task of breaking things up, it should be fine.
But, I couldn't really handle the for loop, so I made a very tidy suggestion of an alternative.
#' @param df Data frame to format | ||
#' | ||
#' @return kable table of cell type counts | ||
format_celltype_n_table <- function(df) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is probably the same format we are using everywhere, or at least many many places. I would think a more general title like format_counts_table
... but I get that this is just the first crack at this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd favor circling back when we really upgrade tables - #371
Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>
… some comment updates
…gler/cellassign comparison
Feeling good about #487 with all these green checks! |
Closes #481
This PR splits up the cell type report:
templates/celltypes_report.rmd
that contains diagnostic plots and all the heatmaps. This report is linked from the main QC report, using the relative link as it would appear in the ScPCA download.templates/qc_utils/celltype_functions.R
, and added roxygen comments.Notable places for feedback:
templates/
since there was an integration report there too, and similarly this is where I placedqc_utils/
. I am not tied to these location choices if we have other good ideas here