Skip processing if no cells remain after empty droplet filtering #738

Merged: 9 commits into development on Apr 15, 2024

Conversation

allyhawkins
Member

Closes #682
Closes #735

This PR makes some adjustments to the workflow to account for any libraries that have no cells after removing empty droplets. I took a very similar approach to how we handle objects that have 0 cells after removing low-quality cells in the processed object.

  • In the filter_sce.R script, I added a check on the number of columns (cells) remaining after removing empty droplets. If that number is 0, an empty filtered.rds is created; otherwise, processing proceeds as normal.
  • If the filtered.rds file is empty, all additional steps are skipped. The log will include a note about any libraries that have no cells.
  • Finally, both the filtered and processed rds files are written out as empty files.
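
The empty-file approach in the list above can be sketched as follows. This is a minimal illustration in Python, not the R code in filter_sce.R or the Nextflow logic in main.nf; the function names `write_filtered` and `has_cells` are hypothetical, and pickle stands in for the rds format.

```python
import pickle
from pathlib import Path


def write_filtered(cells, out_path):
    """Write the filtered cells, or a zero-byte file if none remain.

    Returns True if any cells were written. (Hypothetical helper.)
    """
    out_path = Path(out_path)
    if len(cells) == 0:
        # Nothing survived filtering: leave an empty file as the signal.
        out_path.write_bytes(b"")
        return False
    with out_path.open("wb") as handle:
        pickle.dump(list(cells), handle)
    return True


def has_cells(path):
    """Downstream check: a zero-byte file means the library had no cells."""
    return Path(path).stat().st_size > 0
```

A downstream step would call `has_cells` on the filtered file and, when it returns False, log the library and pass the empty file through instead of processing it.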

While I was here I also accounted for the sample metadata issue and made sure we read in all columns as characters before adding them to the object.

Questions for reviewers:

  • First, do we like this approach of creating an empty file? I wonder if we should instead log an error and stop the workflow. I think it's important to flag that there is a problem, but I also don't want to kill a run of multiple libraries because one of them is poor quality...
  • Right now, we publish an empty processed file and we don't account for that in the documentation. If the file is empty, maybe we just shouldn't include it? That would have implications for the Portal itself, since we would have to make sure the script can accommodate missing rds files when zipping everything up. Alternatively, if we keep providing empty files, we should make a note in the documentation.

@allyhawkins
Member Author

The stub workflow was failing because every library was flagged as having no cells, since stub outputs are all empty files. I updated the checks so that stub files are not considered when checking file size after filtering and processing.
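
To illustrate the stub fix described above, here is a small Python sketch. It assumes the size check receives a flag saying whether the run is a stub run; `should_skip` and `stub_run` are hypothetical names, and the actual check lives in the Nextflow workflow.

```python
from pathlib import Path


def should_skip(path, stub_run=False):
    """Return True when downstream steps should be skipped for this file."""
    if stub_run:
        # Stub outputs are always zero-byte placeholders, so their size
        # says nothing about whether the library has cells.
        return False
    return Path(path).stat().st_size == 0
```

With this shape, an empty file from a real run is skipped, while the same empty file in a stub run still flows through every step.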

Member

@jashapiro left a comment

This looks good to me, but I made a suggestion that would reduce the number of lines changed in the script by just exiting rather than having an else block.

I don't think I need to see this again though.
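
The early-exit structure suggested here might look like the following. This is only a sketch in Python with a hypothetical `filter_library` function; the actual suggestion applies to the R script and is not shown in this thread.

```python
from pathlib import Path


def filter_library(n_cells, out_path):
    """Hypothetical driver showing an early exit instead of an if/else."""
    out_path = Path(out_path)
    if n_cells == 0:
        out_path.write_bytes(b"")
        return "empty"  # exit here; no else block needed

    # Normal processing stays at the top indentation level, so the
    # existing lines do not need to be re-indented inside an else.
    out_path.write_text(f"{n_cells} cells\n")
    return "processed"
```

Because the normal path is not wrapped in an else block, the diff only touches the lines of the new check.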

bin/filter_sce.R (outdated, resolved)
main.nf (outdated, resolved)
@allyhawkins merged commit ceee0dc into development on Apr 15, 2024
4 checks passed
@allyhawkins deleted the allyhawkins/skip-processing-when-no-cells branch on April 15, 2024 16:53