feat: add performance tweaks for resource optimization in workflow rules (#153)

Co-authored-by: Max Schubach <max.schubach@bih-charite.de>
visze and Max Schubach authored Dec 18, 2024
1 parent 56b2254 commit 5ed1ef9
Showing 4 changed files with 33 additions and 2 deletions.
27 changes: 27 additions & 0 deletions docs/cluster.rst
@@ -22,6 +22,33 @@ Having 30 cores and 10GB of memory.
snakemake --sdm conda --configfile config/config.yaml -c 30 --resources mem_mb=10000 --workflow-profile profiles/default
Performance tweaks: Running specific rules with different resources
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some of the rules will benefit from multithreading or more memory. This can be specified within your profile, workflow profile, or on the command line using :code:`--set-resources RULE_NAME:RESOURCE_NAME=VALUE` or :code:`--set-threads RULE_NAME=VALUE`. Before changing resources, make sure the rule is actually part of your run by doing a dry run that lists only the executed rules: :code:`snakemake -n --quiet rules`.
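
As a minimal sketch of the profile route, the same overrides can be kept in a workflow profile instead of being typed on the command line. The directory :code:`profiles/tuned` is a hypothetical name chosen for illustration and the values are examples only; :code:`set-threads` and :code:`set-resources` are the profile counterparts of the command-line flags above.

```shell
# Write a hypothetical workflow profile (profiles/tuned is an assumed path,
# not part of the repository) carrying the same overrides as the
# --set-threads / --set-resources command-line flags.
mkdir -p profiles/tuned
cat > profiles/tuned/config.yaml <<'EOF'
set-threads:
  assignment_mapping_bwa: 30
set-resources:
  assignment_mapping_bwa:
    mem_mb: 10000
EOF

# Print the profile for review before passing it to snakemake
# via --workflow-profile profiles/tuned.
cat profiles/tuned/config.yaml
```

In practice these keys would more likely be merged into the workflow profile you already use, so that its other settings are not lost.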

Possible rules to tweak:

:Assignment:

:assignment_hybridFWRead_get_reads_by_cutadapt:
Only needed when using the linker option in the config. You can add more threads using :code:`--set-threads assignment_hybridFWRead_get_reads_by_cutadapt=4`. Default is always 1 thread.

:assignment_mapping_bbmap:
Only needed when using bbmap for mapping. Memory and threads can be optimized, e.g. via :code:`--set-threads assignment_mapping_bbmap=30 --set-resources assignment_mapping_bbmap:mem_mb=10000`. Default is 1 thread and 4GB memory, but we recommend using 30 threads and 10GB if available.

:assignment_mapping_bwa:
Only needed when using bwa for mapping. Memory and threads can be optimized, e.g. via :code:`--set-threads assignment_mapping_bwa=30 --set-resources assignment_mapping_bwa:mem_mb=10000`. Default is 1 thread, but we recommend using 30 threads and 10GB if available.

:assignment_collectBCs:
Threads can be optimized, e.g. via :code:`--set-threads assignment_collectBCs=30`. Default is 1 thread, but we recommend using 30 threads if available.

:Experiment:

:counts_onlyFW_raw_counts_by_cutadapt:
Only needed when you have only FW reads and use the adapter option. Threads can be optimized e.g. via :code:`--set-threads counts_onlyFW_raw_counts_by_cutadapt=30`. Default is 1 thread.
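
The per-rule flags above can be combined into a single invocation. The following is a sketch for an assignment run that maps with bwa, assuming the 30 cores and 10GB of the example command earlier in this section; the variable names :code:`OVERRIDES` and :code:`CMD` are illustrative only. The script prints the command rather than running it, so it can be checked first.

```shell
# Per-rule overrides for an assignment run that maps with bwa.
# --set-threads accepts several RULE=THREADS pairs after a single flag.
OVERRIDES="--set-threads assignment_mapping_bwa=30 assignment_collectBCs=30"
OVERRIDES="$OVERRIDES --set-resources assignment_mapping_bwa:mem_mb=10000"

# Base command taken from the example earlier in this section.
CMD="snakemake --sdm conda --configfile config/config.yaml -c 30"
CMD="$CMD --resources mem_mb=10000 --workflow-profile profiles/default"
CMD="$CMD $OVERRIDES"

# Print the full command for review; append -n --quiet rules for a dry run.
echo "$CMD"
```

Snakemake never gives a rule more threads than the total passed with :code:`-c`, so the per-rule values act as upper bounds.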


Running on an HPC using SLURM
-----------------------------

3 changes: 2 additions & 1 deletion workflow/rules/assignment/hybridFWRead.smk
@@ -44,6 +44,7 @@ rule assignment_hybridFWRead_get_reads_by_cutadapt:
"""
conda:
"../../envs/cutadapt.yaml"
+threads: 1
input:
lambda wc: config["assignments"][wc.assignment]["FW"],
output:
@@ -55,6 +56,6 @@ rule assignment_hybridFWRead_get_reads_by_cutadapt:
linker=lambda wc: config["assignments"][wc.assignment]["linker"],
shell:
"""
-cutadapt -a {params.linker} -G {params.linker}\
+cutadapt --cores {threads} -a {params.linker} -G {params.linker}\
-o {output.BC} -p {output.FW} <(zcat {input}) <(zcat {input}) &> {log}
"""
2 changes: 2 additions & 0 deletions workflow/rules/assignment/mapping_bwa.smk
@@ -29,6 +29,7 @@ rule assignment_mapping_bwa:
"""
conda:
"../../envs/bwa_samtools_picard_htslib.yaml"
+threads: 1
input:
reads="results/assignment/{assignment}/fastq/merge_split{split}.join.fastq.gz",
reference="results/assignment/{assignment}/reference/reference.fa",
@@ -100,6 +101,7 @@ rule assignment_collect:
"""
conda:
"../../envs/bwa_samtools_picard_htslib.yaml"
+threads: 1
input:
bams=lambda wc: expand(
"results/assignment/{{assignment}}/{mapper}/merge_split{split}.mapped.bam",
3 changes: 2 additions & 1 deletion workflow/rules/counts/counts_onlyFW.smk
@@ -36,6 +36,7 @@ rule counts_onlyFW_raw_counts_by_cutadapt:
"""
conda:
"../../envs/cutadapt.yaml"
+threads: 1
input:
lambda wc: getFW(wc.project, wc.condition, wc.replicate, wc.type),
output:
@@ -49,7 +50,7 @@ rule counts_onlyFW_raw_counts_by_cutadapt:
shell:
"""
zcat {input} | \
-cutadapt -a {params.adapter} - |
+cutadapt --cores {threads} -a {params.adapter} - |
awk 'NR%4==2 {{print $1}}' | \
sort | \
gzip -c > {output} 2> {log}