Use mem_max for all merging and sync changes from main #726

Merged on Mar 12, 2024 (27 commits)

Commits
22be575
remove duplicated code in cell type module
sjspielman Feb 5, 2024
b734559
Merge pull request #685 from AlexsLemonade/sjspielman/remove-main-dup…
sjspielman Feb 5, 2024
8e6b8a2
export empty file if no processed cells
allyhawkins Feb 13, 2024
b96ff26
check for file size in qc report
allyhawkins Feb 13, 2024
d221a23
skip additional processing for processed objects with 0 cells
allyhawkins Feb 13, 2024
145ce2b
add to log
allyhawkins Feb 13, 2024
d51814d
update log error
allyhawkins Feb 13, 2024
5db6a27
relocate check for file size
allyhawkins Feb 14, 2024
f925a20
Merge pull request #694 from AlexsLemonade/jashapiro/precommit-ci
allyhawkins Feb 15, 2024
e87092f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 15, 2024
7c02d7d
Update .pre-commit-config.yaml with excludes
jashapiro Feb 20, 2024
1569842
Merge pull request #696 from AlexsLemonade/allyhawkins/skip-processin…
allyhawkins Feb 20, 2024
da659bc
bump to 128 gb
allyhawkins Feb 29, 2024
557e623
try 256
allyhawkins Mar 1, 2024
16b9578
488
allyhawkins Mar 1, 2024
78f719b
Bump memory for CCDL only
jashapiro Mar 4, 2024
10095cb
Merge pull request #711 from AlexsLemonade/jashapiro/mem-bump
allyhawkins Mar 4, 2024
16277d6
set up max_mem label
allyhawkins Mar 7, 2024
f1c6628
use the correct order for deciding memory
allyhawkins Mar 7, 2024
80b60a2
Apply suggestions from code review
allyhawkins Mar 11, 2024
b400776
Merge pull request #722 from AlexsLemonade/allyhawkins/cell-assign-me…
allyhawkins Mar 11, 2024
5ce6c75
update tags to v0.7.3
allyhawkins Mar 11, 2024
b185050
Merge pull request #724 from AlexsLemonade/allyhawkins/v0.7.3
allyhawkins Mar 11, 2024
2148f71
Merge branch 'main' into allyhawkins/mem-max
allyhawkins Mar 11, 2024
054ab19
compress processed file
allyhawkins Mar 11, 2024
5a5367b
use mem_max for merging
allyhawkins Mar 11, 2024
aec4ba3
Merge branch 'development' into allyhawkins/mem-max
allyhawkins Mar 12, 2024
10 changes: 8 additions & 2 deletions bin/post_process_sce.R
@@ -279,5 +279,11 @@ if (length(reducedDimNames(processed_sce)) == 0) {
 # write out filtered SCE with additional filtering column
 readr::write_rds(sce, opt$out_filtered_sce_file, compress = "bz2")

-# write out processed SCE
-readr::write_rds(processed_sce, opt$out_processed_sce_file, compress = "bz2")
+# only write out processed SCE if > 0 cells
+if (ncol(processed_sce) > 0) {
+  # write out processed SCE
+  readr::write_rds(processed_sce, opt$out_processed_sce_file, compress = "bz2")
+} else {
+  # make an empty processed file
+  file.create(opt$out_processed_sce_file)
+}
8 changes: 7 additions & 1 deletion bin/sce_qc_report.R
@@ -150,7 +150,13 @@ if (opt$workflow_commit == "null") {
 # read sce files
 unfiltered_sce <- readr::read_rds(opt$unfiltered_sce)
 filtered_sce <- readr::read_rds(opt$filtered_sce)
-processed_sce <- readr::read_rds(opt$processed_sce)
+
+# make sure processed sce has an object, otherwise set to NULL
+if (file.size(opt$processed_sce) > 0) {
+  processed_sce <- readr::read_rds(opt$processed_sce)
+} else {
+  processed_sce <- NULL
+}

 # Compile metadata for output files
 sce_meta <- metadata(unfiltered_sce)
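The two R changes above amount to a small contract between scripts: the writer creates a zero-byte file when no cells remain, and the reader treats a zero-byte file as "no object". A minimal Python sketch of that pattern follows. It uses `pickle` as a stand-in for `readr::write_rds`/`read_rds`, and the function names and the plain list standing in for the SCE object are hypothetical, not part of the PR.

```python
import os
import pickle
import tempfile


def write_processed(cells, path):
    """Serialize `cells` if non-empty; otherwise create a zero-byte placeholder."""
    if len(cells) > 0:
        with open(path, "wb") as handle:
            pickle.dump(cells, handle)
    else:
        # an empty file signals "no processed object" to downstream steps
        open(path, "wb").close()


def read_processed(path):
    """Return the stored object, or None when the file is an empty placeholder."""
    if os.path.getsize(path) > 0:
        with open(path, "rb") as handle:
            return pickle.load(handle)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        full_path = os.path.join(tmp, "full.bin")
        empty_path = os.path.join(tmp, "empty.bin")
        write_processed(["cell_1", "cell_2"], full_path)
        write_processed([], empty_path)
        print(read_processed(full_path))
        print(read_processed(empty_path))
```

Always creating the output file, even when empty, keeps the expected output path present for the workflow while letting later steps detect the empty case with a cheap size check.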
3 changes: 3 additions & 0 deletions config/process_base.config
@@ -24,6 +24,9 @@ process {
   withLabel: mem_96 {
     memory = {check_memory(48.GB + 48.GB * task.attempt, params.max_memory)}
   }
+  withLabel: mem_max {
+    memory = {task.attempt > 1 ? params.max_memory : check_memory(96.GB, params.max_memory)}
+  }
   withLabel: cpus_2 {
     cpus = {check_cpus(2, params.max_cpus)}
   }
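The `mem_max` closure above requests 96 GB (capped at `params.max_memory`) on the first attempt and jumps straight to `params.max_memory` on any retry, whereas `mem_96` grows by 48 GB per attempt. A rough Python model of the two rules is sketched below. `check_memory` is defined elsewhere in the repository and is not shown in this diff, so its behavior here (cap the request at the maximum) is an assumption, and the 512 GB ceiling in the demo is a hypothetical value.

```python
def check_memory(requested_gb, max_gb):
    """ASSUMED behavior of the workflow helper: cap the request at the maximum."""
    return min(requested_gb, max_gb)


def mem_96(attempt, max_gb):
    """Model of the mem_96 label: 48 GB + 48 GB per attempt, capped."""
    return check_memory(48 + 48 * attempt, max_gb)


def mem_max(attempt, max_gb):
    """Model of the mem_max label: 96 GB first, then the full maximum on retry."""
    if attempt > 1:
        return max_gb
    return check_memory(96, max_gb)


if __name__ == "__main__":
    max_gb = 512  # hypothetical stand-in for params.max_memory
    for attempt in (1, 2, 3):
        print(attempt, mem_96(attempt, max_gb), mem_max(attempt, max_gb))
```

Under these assumptions, with a 512 GB ceiling `mem_96` steps through 96, 144, 192 GB while `mem_max` goes 96, 512, 512 GB, which is why the label suits processes whose memory needs are hard to predict.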
6 changes: 3 additions & 3 deletions external-instructions.md
@@ -86,12 +86,12 @@ Using the above command will run the workflow from the `main` branch of the work
 To update to the latest released version you can run `nextflow pull AlexsLemonade/scpca-nf` before the `nextflow run` command.

 To be sure that you are using a consistent version, you can specify use of a release tagged version of the workflow, set below with the `-r` flag.
-The command below will pull the `scpca-nf` workflow directly from Github using the `v0.7.2` version.
+The command below will pull the `scpca-nf` workflow directly from Github using the `v0.7.3` version.
 Released versions can be found on the [`scpca-nf` repository releases page](https://github.com/AlexsLemonade/scpca-nf/releases).

 ```sh
 nextflow run AlexsLemonade/scpca-nf \
-  -r v0.7.2 \
+  -r v0.7.3 \
   -config <path to config file> \
   -profile <name of profile>
 ```
@@ -325,7 +325,7 @@ If you will be analyzing spatial expression data, you will also need the Cell Ra

 If your compute nodes do not have internet access, you will likely have to pre-pull the required container images as well.
 When doing this, it is important to be sure that you also specify the revision (version tag) of the `scpca-nf` workflow that you are using.
-For example, if you would run `nextflow run AlexsLemonade/scpca-nf -r v0.7.2`, then you will want to set `-r v0.7.2` for `get_refs.py` as well to be sure you have the correct containers.
+For example, if you would run `nextflow run AlexsLemonade/scpca-nf -r v0.7.3`, then you will want to set `-r v0.7.3` for `get_refs.py` as well to be sure you have the correct containers.
 By default, `get_refs.py` will download files and images associated with the latest release.

 If your system uses Docker, you can add the `--docker` flag:
2 changes: 1 addition & 1 deletion internal-instructions.md
@@ -87,7 +87,7 @@ Please refer to our [`CONTRIBUTING.md`](CONTRIBUTING.md#stub-workflows) for more
 When running the workflow for a project or group of samples that is ready to be released on ScPCA portal, please use the tag for the latest release:

 ```
-nextflow run AlexsLemonade/scpca-nf -r v0.7.2 -profile ccdl,batch --project SCPCP000000
+nextflow run AlexsLemonade/scpca-nf -r v0.7.3 -profile ccdl,batch --project SCPCP000000
 ```

 ### Processing example data
19 changes: 18 additions & 1 deletion main.nf
@@ -261,8 +261,22 @@ workflow {
   all_sce_ch = sce_ch.no_genetic.mix(genetic_demux_sce.out)
   post_process_sce(all_sce_ch)

+
+  post_process_ch = post_process_sce.out
+    // only continue processing any samples with > 0 cells left after processing
+    .branch{
+      continue_processing: it[3].size() > 0
+      skip_processing: true
+    }
+
+  // send library ids in post_process_ch.skip_processing to log
+  post_process_ch.skip_processing
+    .subscribe{
+      log.error("There are no cells found in the processed object for ${it[0].library_id}.")
+    }
+
   // Cluster SCE
-  cluster_sce(post_process_sce.out)
+  cluster_sce(post_process_ch.continue_processing)

   if (params.perform_celltyping) {
     // Perform celltyping, if specified
@@ -271,6 +285,9 @@ workflow {
     annotated_celltype_ch = cluster_sce.out
   }

+  // combine back with libraries that skipped post processing
+  sce_output_ch = annotated_celltype_ch.mix(post_process_ch.skip_processing)
+
   // generate QC reports
   sce_qc_report(
     annotated_celltype_ch,
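The `main.nf` change above splits the post-processing output into libraries that continue to clustering and libraries that are skipped, logs the skipped ones, and later mixes the two groups back together. A loose Python sketch of that control flow is given below, with lists standing in for Nextflow channels. The tuple layout is an assumption: only `it[0].library_id` and `it[3]` appear in the diff, so positions 1 and 2 are placeholders, and position 3 holds a file size in bytes rather than a file. The function names are hypothetical.

```python
import logging

logger = logging.getLogger("scpca_sketch")


def branch_by_size(items):
    """Split items like `.branch`: non-empty processed file continues, the rest skip."""
    continue_processing, skip_processing = [], []
    for item in items:
        # item[3] stands in for `it[3].size()`, the processed file size in bytes
        if item[3] > 0:
            continue_processing.append(item)
        else:
            skip_processing.append(item)
    return continue_processing, skip_processing


def run_downstream(items, downstream):
    """Process non-empty libraries, log the empty ones, then recombine both groups."""
    keep, skip = branch_by_size(items)
    for item in skip:
        logger.error(
            "There are no cells found in the processed object for %s.",
            item[0]["library_id"],
        )
    processed = [downstream(item) for item in keep]
    # Nextflow's `mix` gives no ordering guarantee; list concatenation is a simplification
    return processed + skip


if __name__ == "__main__":
    libraries = [
        ({"library_id": "lib_a"}, None, None, 1024),
        ({"library_id": "lib_b"}, None, None, 0),
    ]
    print(run_downstream(libraries, lambda item: item))
```

Recombining the skipped libraries after the optional steps means every library still reaches the final reporting stage, which pairs with the zero-byte check added to `bin/sce_qc_report.R`.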
6 changes: 3 additions & 3 deletions merge.nf
@@ -29,7 +29,7 @@ if (param_error) {
 // merge individual SCE objects into one SCE object
 process merge_sce {
   container params.SCPCATOOLS_CONTAINER
-  label 'mem_32'
+  label 'mem_max'
   publishDir "${params.results_dir}/${merge_group_id}/merged"
   input:
     tuple val(merge_group_id), val(has_adt), val(library_ids), path(scpca_nf_file)
@@ -60,7 +60,7 @@ process merge_sce {
 process generate_merge_report {
   container params.SCPCATOOLS_CONTAINER
   publishDir "${params.results_dir}/${merge_group_id}/merged"
-  label 'mem_16'
+  label 'mem_max'
   input:
     tuple path(merged_sce_file), val(merge_group_id), val(has_adt)
     path(report_template)
@@ -86,7 +86,7 @@ process generate_merge_report {

 process export_anndata {
   container params.SCPCATOOLS_CONTAINER
-  label 'mem_32'
+  label 'mem_max'
   tag "${merge_group_id}"
   publishDir "${params.results_dir}/${merge_group_id}/merged", mode: 'copy'
   input:
2 changes: 1 addition & 1 deletion modules/classify-celltypes.nf
@@ -49,7 +49,7 @@ process classify_cellassign {
     mode: 'copy',
     pattern: "${cellassign_dir}"
   )
-  label 'mem_96'
+  label 'mem_max'
   label 'cpus_12'
   tag "${meta.library_id}"
   input:
2 changes: 1 addition & 1 deletion nextflow.config
@@ -5,7 +5,7 @@ manifest {
   homePage = 'https://github.com/AlexsLemonade/scpca-nf'
   mainScript = 'main.nf'
   defaultBranch = 'main'
-  version = 'v0.7.2'
+  version = 'v0.7.3'
 }

 // global parameters for workflows