After the v0.5.2 release 😞, we have observed that a small number of datasets fail during post_process_sce, specifically because of ADT quality control statistics.
In certain (infrequent!) circumstances, the assumptions of cleanTagCounts(), which uses ambientContribSparse() in the absence of isotype controls, are not met (source).
From the source:
The assumption here is that of sparsity, i.e., no more than \code{prop * nrow(y)} features should be actually present in each cell with a non-zero number of molecules.
This is reasonable for most tag-based applications where we would expect only 1-2 tags (for cell hashing) or a minority of tags (for general CITE-seq) to be present per cell.
Thus, counts for all other features must be driven by ambient contamination, allowing us to estimate a scaling factor for each cell based on the ratio to the ambient profile.
So, we'll need to get a patch in for this. We have discussed taking the following strategy -
If discard has NA values, instead use zero.ambient for filtering: cells get "Keep" if zero.ambient is TRUE and "Remove" if it is FALSE.
If both discard and zero.ambient have NA values, we might just fail on the post-processing altogether? Or, open for discussion here!
Worth noting that zero.ambient is always returned by cleanTagCounts() regardless of whether any isotype controls are present, so this strategy can be used universally.
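For discussion, here is a minimal base-R sketch of the fallback described above. It is only an illustration under stated assumptions: adt_filter_labels() and its keep_when_zero_ambient argument are hypothetical names, not part of scpcaTools or DropletUtils; the inputs stand in for the discard and zero.ambient columns returned by cleanTagCounts(); a non-NA discard of TRUE is assumed to mean "Remove"; and failing when both values are NA is just one of the options raised above. The fallback direction follows the wording in this issue. It is worth checking against the DropletUtils documentation for zero.ambient before patching, since the direction may need to be flipped.

```r
# Hypothetical helper for illustration only.
# `discard` and `zero_ambient` are logical vectors with one entry per cell,
# standing in for the `discard` and `zero.ambient` columns from cleanTagCounts().
adt_filter_labels <- function(discard, zero_ambient, keep_when_zero_ambient = TRUE) {
  stopifnot(
    is.logical(discard),
    is.logical(zero_ambient),
    length(discard) == length(zero_ambient)
  )

  # Cells where neither statistic is available cannot be classified.
  # Failing here is one option; it is still open for discussion.
  if (any(is.na(discard) & is.na(zero_ambient))) {
    stop("Both `discard` and `zero.ambient` are NA for at least one cell.")
  }

  # Default case (assumption): when `discard` is available, TRUE means "Remove".
  remove <- discard

  # Fallback, applied cell by cell: where `discard` is NA, use `zero.ambient`.
  use_fallback <- is.na(discard)
  remove[use_fallback] <- if (keep_when_zero_ambient) {
    # As worded in this issue: TRUE -> "Keep", FALSE -> "Remove".
    !zero_ambient[use_fallback]
  } else {
    # Flipped mapping, in case TRUE should mean "Remove".
    zero_ambient[use_fallback]
  }

  ifelse(remove, "Remove", "Keep")
}

# A library with a mix of FALSE, TRUE, and NA in `discard`.
discard <- c(FALSE, TRUE, NA, NA)
zero_ambient <- c(FALSE, TRUE, TRUE, FALSE)
labels <- adt_filter_labels(discard, zero_ambient)
print(labels)
```

Because the fallback is applied per cell, libraries with no NA values in discard are labeled exactly as they would be from discard alone.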
So, we'll need to get a patch in for this. We have discussed taking the following strategy -
If discard has NA values, instead use zero.ambient for filtering: cells get "Keep" if zero.ambient is TRUE and "Remove" if it is FALSE.
If both discard and zero.ambient have NA values, we might just fail on the post-processing altogether? Or, open for discussion here!
Just for clarification, is this on a cell-by-cell basis? I would expect based on your description that if one discard is NA they all are, but I still might implement this so that we can accommodate NA if it occurs in just some cells. Though as I write that I'm not confident I like it.
Just for clarification, is this on a cell-by-cell basis?
Yes, it's cell-by-cell. My description may have been misleading, sorry! If one is NA, it's not a guarantee that others are also; there can be NA, FALSE, and TRUE all in one library.
CC @jashapiro @allyhawkins